Refunds
A refund in Sill is an agent-initiated return that travels the same signed pipeline as the original purchase: a signed request_refund mandate enters the edge, the active policy evaluates it, a human reviewer clears the escalation, and the merchant’s existing processor (Stripe today; Shopify in test mode) executes the refund. The refund outcome — together with the original mandate it references — is written to the signed, Merkle-chained audit envelope. Sill never custodies funds: Stripe holds the card, performs the refund, and pays the buyer back. Sill issues the signed authorization and the audit record.
How a refund flows
A refund is a second mandate that points back at the original. The agent never chooses the refund amount freely: the amount is server-derived as min(signed cap, original settled total), and the refund settles on whatever rail settled the original.
sequenceDiagram
autonumber
participant Agent
participant Edge as Sill edge
participant Reviewer as Human reviewer
participant Origin as Sill origin
participant Processor as Stripe (or Shopify)
participant Audit as Audit envelope
Agent->>Edge: Signed refund mandate (request_refund + original_mandate_id)
Edge->>Edge: Verify signature, identity, and active policy (r07 → escalate)
Edge->>Reviewer: Pause and route to dashboard queue
Reviewer-->>Edge: approve
Edge->>Origin: Enqueue refund dispatch
Origin->>Origin: Pre-rail gates (rail selection, tenancy, refund-on-refund)
Origin->>Processor: Refund call against original charge or order
Processor-->>Origin: succeeded / failed
Origin->>Audit: Signed refund record + settlement evidence
Audit-->>Agent: Verifiable refund outcome
The refund mandate shape
A refund mandate is a normal Sill mandate with intent.action = "request_refund" and a closed, narrowly-typed intent block. The agent may not assert order details (line totals, taxes, buyer identity); those are resolved server-side from the original mandate’s own records. This is deliberate: it forecloses an “amount-smuggling” attack where the agent invents what it allegedly bought.

    {
      "envelope": { "alg": "EdDSA", "kid": "agent-key-id-…" },
      "signed": {
        "mandate_id": "mnd_01J9F4Z9R6Q7K3P8Y2N5T6V1W2",
        "principal": { "type": "human", "ref": "buyer:opaque-id" },
        "agent": { "agent_id": "agent_anthropic_claude" },
        "site": { "site_id": "01EXAMPLE00000000000000000" },
        "intent": {
          "action": "request_refund",
          "original_mandate_id": "mnd_01J9C8P7K4V2H5F6Q8R9T3N1M2",
          "reason_code": "requested_by_customer",
          "scope": "full",
          "max_amount": 1.50,
          "currency": "USD",
          "merchant": "Example Merchant"
        },
        "issued_at": "2026-06-22T14:30:00Z"
      },
      "signature": "…"
    }

Allowed reason_code values (closed set): requested_by_customer, defective, not_received, wrong_item, other.
scope is "full" in v1. A { "line_items": [...] } shape is parsed by the edge projector but the Stripe and Shopify refund executors only ship the full path today — anything else aborts with scope_unsupported before any rail call.
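The closed intent block lends itself to a mechanical check. The following is a minimal TypeScript sketch, not Sill's implementation: the field names and the reason_code set come from the example mandate above, while `checkRefundIntent`, the `IntentCheck` result type, the key allowlist, and every rejection reason other than scope_unsupported are hypothetical names chosen for illustration.

```typescript
// Closed reason_code set, as listed above.
const REASON_CODES = [
  "requested_by_customer",
  "defective",
  "not_received",
  "wrong_item",
  "other",
] as const;
type ReasonCode = (typeof REASON_CODES)[number];

// The only keys the intent block carries in the example mandate.
const INTENT_KEYS: readonly string[] = [
  "action",
  "original_mandate_id",
  "reason_code",
  "scope",
  "max_amount",
  "currency",
  "merchant",
];

// Hypothetical result type: accepted, or rejected with a reason string.
type IntentCheck = { ok: true } | { ok: false; reason: string };

function isReasonCode(value: unknown): value is ReasonCode {
  return (
    typeof value === "string" &&
    (REASON_CODES as readonly string[]).includes(value)
  );
}

function checkRefundIntent(intent: Record<string, unknown>): IntentCheck {
  // Assumed enforcement of the closed block: extra keys such as line totals,
  // taxes, or buyer identity are rejected outright.
  for (const key of Object.keys(intent)) {
    if (!INTENT_KEYS.includes(key)) {
      return { ok: false, reason: "unexpected_field" };
    }
  }
  if (intent["action"] !== "request_refund") {
    return { ok: false, reason: "wrong_action" };
  }
  if (typeof intent["original_mandate_id"] !== "string") {
    return { ok: false, reason: "missing_original_mandate_id" };
  }
  if (!isReasonCode(intent["reason_code"])) {
    return { ok: false, reason: "unknown_reason_code" };
  }
  // v1 executors ship only the full path; anything else aborts before a rail call.
  if (intent["scope"] !== "full") {
    return { ok: false, reason: "scope_unsupported" };
  }
  const cap = intent["max_amount"];
  if (typeof cap !== "number" || !Number.isFinite(cap) || cap <= 0) {
    return { ok: false, reason: "invalid_max_amount" };
  }
  return { ok: true };
}
```

Under these assumptions, an intent carrying `{ "line_items": [...] }` as its scope is rejected with scope_unsupported, matching the v1 behaviour described above.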
max_amount is a cap, never an assertion of the original total. The executor computes amount_minor = min(round(max_amount * 100), original.amount_minor) from server-side state. If the agent’s cap is too low, the refund is bounded by the cap; if the cap exceeds the settled total, the refund is bounded by the settled total.
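The cap arithmetic is small enough to show directly. This is an illustrative sketch of the formula above; the function name `deriveRefundMinor` is hypothetical, and the multiplication by 100 assumes a two-decimal currency such as USD.

```typescript
// amount_minor = min(round(max_amount * 100), original.amount_minor)
// Assumes a two-decimal currency, as the formula in the text does.
function deriveRefundMinor(
  maxAmount: number,
  originalAmountMinor: number
): number {
  const capMinor = Math.round(maxAmount * 100);
  return Math.min(capMinor, originalAmountMinor);
}

// Cap equals the settled total: the full 150 minor units are refunded.
const exact = deriveRefundMinor(1.5, 150); // 150
// Cap below the settled total: the refund is bounded by the cap.
const capped = deriveRefundMinor(1.0, 150); // 100
// Cap above the settled total: the refund is bounded by the settled total.
const bounded = deriveRefundMinor(20.0, 150); // 150
```

The second argument is always read from server-side state, so a generous cap can never inflate the refund beyond what actually settled.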
Admission and the original-order pin
Before a refund mandate is admitted, the origin checks that the referenced original_mandate_id:
- Belongs to the same site as the refund (same-site pinning, enforced in the JS match even though the SQL filter is account-wide under row-level security).
- Has a settlement-authorizing decision: either approved or escalated_approved.
- Actually settled on a refund-capable rail (charge paid for Stripe; order paid for Shopify).
If any check fails, the refund never reaches a rail call. The resulting state is a terminal failed_original_not_charged (or failed_tenancy_violation) row on refund_state, and a signed audit record describing the abort.
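The three admission checks can be sketched as a single pure function. This is an assumption-laden illustration rather than the origin's code: `OriginalRecord` and `admitRefund` are hypothetical names, and the mapping of each failed check to a terminal state (site mismatch to failed_tenancy_violation, everything else to failed_original_not_charged) is a guess based on the state names.

```typescript
// Hypothetical projection of the original mandate's server-side records.
interface OriginalRecord {
  siteId: string;
  decision: string; // "approved", "escalated_approved", or anything else
  rail: string; // "stripe", "shopify", or a rail that cannot refund
  settlementStatus: string; // "paid" once the charge or order has settled
}

type Admission =
  | { admitted: true }
  | {
      admitted: false;
      state: "failed_tenancy_violation" | "failed_original_not_charged";
    };

const SETTLEMENT_AUTHORIZING = ["approved", "escalated_approved"];
const REFUND_CAPABLE = ["stripe", "shopify"];

function admitRefund(
  refundSiteId: string,
  original: OriginalRecord | undefined
): Admission {
  // Assumption: a missing original is treated like an uncharged one.
  if (original === undefined) {
    return { admitted: false, state: "failed_original_not_charged" };
  }
  // Same-site pinning, checked in application code even though the
  // underlying query is account-wide.
  if (original.siteId !== refundSiteId) {
    return { admitted: false, state: "failed_tenancy_violation" };
  }
  const authorized = SETTLEMENT_AUTHORIZING.includes(original.decision);
  const settled =
    REFUND_CAPABLE.includes(original.rail) &&
    original.settlementStatus === "paid";
  if (!authorized || !settled) {
    return { admitted: false, state: "failed_original_not_charged" };
  }
  return { admitted: true };
}
```

Because the function returns a terminal state instead of throwing, the caller can write the refund_state row and the signed abort record without ever reaching a rail call.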
Policy and HITL
In the default policy set, a request_refund mandate hits r07 (destructive action requires human review), which escalates to the dashboard. The mandate is paused, no rail call runs, and the refund appears in the reviewer queue alongside a server-resolved view of the original order: the original audit record id, the line items at settlement, and a link to reveal the buyer block through the existing access-logged decrypt path (no PII rides inline on the queue wire).
A reviewer with the reviewer, admin, or owner user role approves or rejects. The resolution is appended to the audit envelope before the rail call runs. See Human in the loop for the full reviewer surface.
The reviewer queue for a refund escalation: the original-order panel renders server-resolved evidence; the agent’s intent carries only the closed refund triplet.
Rail selection and execution
A refund settles on the same rail that settled the original. The settlement-rail claim on the original mandate’s audit record drives dispatcher selection:
| Original rail | Refund executor | Status |
|---|---|---|
| stripe | stripe-refund-executor (calls Stripe Refunds API) | Live-mode validated at dogfood scope |
| shopify | shopify-refund-executor (calls Shopify Admin GraphQL) | Test-mode only; live-mode gate not flipped |
The closed set of refund-capable rails ({stripe, shopify}) is the single source of truth consumed by both:
- The agent card backing predicate that decides whether request_refund may be advertised at all.
- The post-commit dispatcher selection in the mandate queue worker.
Adding a future rail without wiring both paths is a compile error.
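One way to get that compile-time guarantee, assuming a TypeScript codebase, is to derive a union type from the closed set and key the executor table by it. The sketch below is illustrative only: `REFUND_CAPABLE_RAILS`, `EXECUTORS`, `canAdvertiseRequestRefund`, and `selectExecutor` are hypothetical names, and the advertising predicate's exact inputs are an assumption.

```typescript
// Single source of truth: the closed set of refund-capable rails.
const REFUND_CAPABLE_RAILS = ["stripe", "shopify"] as const;
type RefundRail = (typeof REFUND_CAPABLE_RAILS)[number];

// Record<RefundRail, ...> requires one entry per rail, so adding a rail to
// the tuple above without registering an executor here fails type-checking.
const EXECUTORS: Record<RefundRail, string> = {
  stripe: "stripe-refund-executor",
  shopify: "shopify-refund-executor",
};

function isRefundCapable(rail: string): rail is RefundRail {
  return (REFUND_CAPABLE_RAILS as readonly string[]).includes(rail);
}

// Agent-card side (assumed shape): advertise request_refund only when at
// least one of the site's enabled rails can actually refund.
function canAdvertiseRequestRefund(enabledRails: readonly string[]): boolean {
  return enabledRails.some(isRefundCapable);
}

// Dispatcher side: pick the executor from the original mandate's rail claim.
function selectExecutor(originalRail: string): string | undefined {
  return isRefundCapable(originalRail) ? EXECUTORS[originalRail] : undefined;
}
```

Both consumers go through `isRefundCapable`, so the advertised capability and the dispatcher cannot drift apart.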
Stripe refunds
The Stripe path resolves the original charge_state row by (site_id, original_mandate_id), requires the original charge to be in a paid state, and calls Stripe’s refund endpoint against the recorded stripe_charge_id (or payment_intent when the charge id is unset). The Connect account is plumbed via the Stripe-Account header from the merchant’s stored integration.
Live mode requires three independent gates: the SILL_STRIPE_LIVE_GATE_PASSED deploy-pipeline secret, the per-account stripe_mode = live row, and the merchant’s decrypted live credential. Test-mode and live-mode credentials are dual-bound — a live event signed with the test webhook secret (or vice versa) is rejected.
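The three gates and the dual binding reduce to two small predicates. This is a sketch under stated assumptions: `LiveGateInputs`, `liveModeAllowed`, and `webhookModeMatches` are hypothetical names, the deploy-pipeline secret is modelled as a boolean, and the event mode is taken from the `livemode` flag that Stripe event objects carry.

```typescript
type Mode = "test" | "live";

// Hypothetical bundle of the three independent gate inputs.
interface LiveGateInputs {
  deployGatePassed: boolean; // SILL_STRIPE_LIVE_GATE_PASSED, modelled as a boolean
  accountStripeMode: Mode; // the per-account stripe_mode row
  liveCredential: string | null; // merchant's decrypted live credential, if any
}

// Live mode is allowed only when all three gates hold at once.
function liveModeAllowed(gates: LiveGateInputs): boolean {
  return (
    gates.deployGatePassed &&
    gates.accountStripeMode === "live" &&
    gates.liveCredential !== null &&
    gates.liveCredential !== ""
  );
}

// Dual binding: an event is accepted only when the webhook secret that
// verified its signature belongs to the same mode as the event itself.
function webhookModeMatches(
  eventLivemode: boolean,
  verifiedWithSecretFor: Mode
): boolean {
  const eventMode: Mode = eventLivemode ? "live" : "test";
  return eventMode === verifiedWithSecretFor;
}
```

Keeping the gates as separate inputs means no single misconfiguration, such as a stray stripe_mode = live row, is enough to reach the live rail.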
Shopify refunds
The Shopify path mirrors the Stripe shape against shopify_order_state, resolves the order id from the original, and calls the Shopify Admin GraphQL refundCreate mutation. The live-mode gate for the Shopify rail is separate from the Stripe gate and has not been flipped, so Shopify refunds today are test-mode only. See the overview for the honest bounds.
Idempotency and the refund-on-refund exclusion
Refunds live on a rail-agnostic refund_state table with two structural protections:
- A UNIQUE (site_id, mandate_id) index makes the executor’s INSERT … ON CONFLICT DO NOTHING the idempotency lock for the refund mandate itself. Replaying the same mnd_… produces the same terminal state, never a second rail call.
- A partial UNIQUE (site_id, original_mandate_id) WHERE state IN (non-failed…) index is the refund-on-refund exclusion: one non-failed refund per settled original, in v1. A second refund attempt against an already-refunded original raises 23505 (a unique-constraint violation) and the executor surfaces already_executed evidence (with the existing stripe_refund_id echoed back). Again, no second rail call.
Both gates run before any external API call. The executor never throws past the queue worker; every failure returns structured settlement evidence and a terminal refund_state row.
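The interaction of the two indexes is easier to see in a small in-memory model. The sketch below is not the executor: it replaces the database with two maps, treats only refund_failed as a failed state (the real non-failed set is not spelled out here), and uses a synchronous `callRail` callback. `createExecutor`, `RefundRow`, and `Evidence` are hypothetical names.

```typescript
type State = "pending_stripe_call" | "refund_succeeded" | "refund_failed";

interface RefundRow {
  state: State;
  stripeRefundId: string | null;
}

interface Evidence {
  outcome: "succeeded" | "failed" | "already_executed";
  stripeRefundId: string | null;
}

interface RailResult {
  ok: boolean;
  refundId: string | null;
}

function toEvidence(row: RefundRow): Evidence {
  const outcome = row.state === "refund_succeeded" ? "succeeded" : "failed";
  return { outcome, stripeRefundId: row.stripeRefundId };
}

function createExecutor(callRail: () => RailResult) {
  // Stands in for UNIQUE (site_id, mandate_id).
  const byMandate = new Map<string, RefundRow>();
  // Stands in for the partial UNIQUE (site_id, original_mandate_id) index
  // over non-failed rows.
  const nonFailedByOriginal = new Map<string, RefundRow>();

  return function execute(
    siteId: string,
    mandateId: string,
    originalMandateId: string
  ): Evidence {
    const mandateKey = `${siteId}|${mandateId}`;
    const originalKey = `${siteId}|${originalMandateId}`;

    // Gate 1: replay of the same refund mandate returns the stored state.
    const replayed = byMandate.get(mandateKey);
    if (replayed !== undefined) return toEvidence(replayed);

    // Gate 2: a different mandate against an already-refunded original.
    const prior = nonFailedByOriginal.get(originalKey);
    if (prior !== undefined) {
      return { outcome: "already_executed", stripeRefundId: prior.stripeRefundId };
    }

    // Both gates passed: claim the row first, then make the one rail call.
    const row: RefundRow = { state: "pending_stripe_call", stripeRefundId: null };
    byMandate.set(mandateKey, row);
    nonFailedByOriginal.set(originalKey, row);

    const result = callRail();
    if (result.ok) {
      row.state = "refund_succeeded";
      row.stripeRefundId = result.refundId;
    } else {
      row.state = "refund_failed";
      // A failed row leaves the partial index, so a later retry is possible.
      nonFailedByOriginal.delete(originalKey);
    }
    return toEvidence(row);
  };
}
```

In this model a replay and a second mandate against the same original both return without invoking `callRail`, which is the property the two indexes exist to guarantee.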
Refund-state machine

    pending_local_init
     ├─→ pending_stripe_call  ─→ refund_succeeded (terminal)
     │                        ─→ refund_failed    (terminal)
     ├─→ pending_shopify_call ─→ refund_succeeded (terminal)
     │                        ─→ refund_failed    (terminal)
     ├─→ failed_tenancy_violation    (terminal, ZERO rail call)
     ├─→ failed_rate_limited         (terminal, ZERO rail call)
     ├─→ failed_original_not_charged (terminal, ZERO rail call)
     └─→ failed_scope_unsupported    (terminal, v1 ships `full` only)

Every UPDATE carries a WHERE state IN (…) predicate, so a stale writer cannot regress a terminal row.
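The guarded UPDATE can be modelled as a transition table keyed by target state. This is a minimal sketch of the idea, with the table standing in for the WHERE state IN (…) predicate; `ALLOWED_FROM` and `transition` are hypothetical names, and the allowed source states are read off the state machine above.

```typescript
type RefundState =
  | "pending_local_init"
  | "pending_stripe_call"
  | "pending_shopify_call"
  | "refund_succeeded"
  | "refund_failed"
  | "failed_tenancy_violation"
  | "failed_rate_limited"
  | "failed_original_not_charged"
  | "failed_scope_unsupported";

// For each target state, the source states an UPDATE may match.
const ALLOWED_FROM: Record<RefundState, readonly RefundState[]> = {
  pending_local_init: [], // initial state; nothing transitions into it
  pending_stripe_call: ["pending_local_init"],
  pending_shopify_call: ["pending_local_init"],
  refund_succeeded: ["pending_stripe_call", "pending_shopify_call"],
  refund_failed: ["pending_stripe_call", "pending_shopify_call"],
  failed_tenancy_violation: ["pending_local_init"],
  failed_rate_limited: ["pending_local_init"],
  failed_original_not_charged: ["pending_local_init"],
  failed_scope_unsupported: ["pending_local_init"],
};

interface RefundRow {
  state: RefundState;
}

// Returns true when the row was updated, false when the guard matched
// nothing (the equivalent of an UPDATE that affects zero rows).
function transition(row: RefundRow, to: RefundState): boolean {
  if (!ALLOWED_FROM[to].includes(row.state)) return false;
  row.state = to;
  return true;
}
```

Terminal states never appear as a source in the table, so a late writer that tries to move refund_succeeded to refund_failed simply updates nothing.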
Webhook reconciliation
Stripe also emits asynchronous events that affect refund state independently of Sill’s executor:
| Event | Handler outcome |
|---|---|
| charge.refunded | Marks the original charge_state as refunded; sets refunded: true and refunded_minor on the original audit record’s discovery_context.settlement. |
| charge.dispute.created | Marks the original charge_state as disputed; records the dispute reason and amount; logged at error severity for ops attention within Stripe’s evidence-submission window. |
Both handlers require the connected-account id on the event to match the integration on file; mismatch → critical log + 200 ACK with zero state mutation. A webhook for a charge Sill did not create is acked without writing.
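The acknowledge-without-writing behaviour can be sketched as follows. This is an illustration under assumptions, not the handler itself: `WebhookEvent` is a hypothetical, pre-extracted view of a Stripe event (the real payload nests the charge under the event's data object), `charges` stands in for charge_state, and `handleStripeEvent` and `WebhookResult` are invented names.

```typescript
// Hypothetical, flattened view of the fields the handlers care about.
interface WebhookEvent {
  type: string;
  account: string | null; // connected-account id carried on the event
  chargeId: string;
}

interface ChargeRow {
  state: "paid" | "refunded" | "disputed";
}

interface WebhookResult {
  httpStatus: 200;
  mutated: boolean;
  logLevel: "critical" | "error" | null;
}

function handleStripeEvent(
  event: WebhookEvent,
  integrationAccountId: string,
  charges: Map<string, ChargeRow>
): WebhookResult {
  // Account mismatch: critical log, 200 ACK, zero state mutation.
  if (event.account !== integrationAccountId) {
    return { httpStatus: 200, mutated: false, logLevel: "critical" };
  }
  // A charge Sill did not create is acknowledged without writing.
  const charge = charges.get(event.chargeId);
  if (charge === undefined) {
    return { httpStatus: 200, mutated: false, logLevel: null };
  }
  if (event.type === "charge.refunded") {
    charge.state = "refunded";
    return { httpStatus: 200, mutated: true, logLevel: null };
  }
  if (event.type === "charge.dispute.created") {
    charge.state = "disputed";
    // Disputes are logged at error severity for ops attention.
    return { httpStatus: 200, mutated: true, logLevel: "error" };
  }
  // Any other event type is acknowledged and ignored in this sketch.
  return { httpStatus: 200, mutated: false, logLevel: null };
}
```

Returning 200 in every branch keeps the processor from retrying events that Sill has deliberately chosen not to act on.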
What the audit envelope records
The refund audit record carries, under discovery_context.settlement, a kind: "refund" evidence block discriminated by rail:

    {
      "rail": "stripe",
      "kind": "refund",
      "outcome": "succeeded",
      "dispatched_at": "2026-06-22T14:31:08Z",
      "original_mandate_id": "mnd_01J9C8P7K4V2H5F6Q8R9T3N1M2",
      "stripe_refund_id": "re_3Ti4JaEAXJFotMa3...",
      "stripe_charge_id": "ch_3Ti4JaEAXJFotMa3...",
      "stripe_payment_intent_id": "pi_3Ti4JaEAXJFotMa3...",
      "stripe_mode": "live",
      "refund_state_id": "rfs_01J9F5...",
      "refunded_minor": 150,
      "original_total_minor": 150,
      "currency": "USD",
      "reason_code": "requested_by_customer"
    }

The record is ed25519-signed and Merkle-chained alongside every other mandate the site processes. It is independently verifiable against the public JWKS using the same JCS + detached JWS recipe the agent card and ARD catalog use.
Outcome values: succeeded, reconciled, failed, aborted, already_executed, approved_but_rail_disabled. Reason values (on a non-success outcome) include the executor’s terminal reason — scope_unsupported, original_not_charged, already_executed, rate_limited_by_self — or a Stripe / Shopify ConnectorErrorClass.
Honest bounds
- Stripe live rail. The signed-mandate → policy → refund → audit pipeline has cleared and refunded real live-mode Stripe charges end-to-end on a single Sill-controlled merchant. This is dogfood validation, not multi-merchant production refund volume.
- Shopify live rail. Not flipped. Refunds against Shopify orders run test-mode only today; the live-mode gate is a separate founder/ops decision and requires Shopify Payments live plus the corresponding processor agreement.
- Scope. Refund executors today ship the full scope only. The mandate shape accepts { "line_items": [...] } but the rail aborts with failed_scope_unsupported.
- One non-failed refund per original. Partial / multi-tranche refunds are not in v1.
- PCI posture. Refunds never see a raw PAN. Sill handles only opaque processor tokens (Stripe pm_* / charge ids / payment-intent ids). Architecture and a CI grep gate enforce this; Sill holds no PCI attestation today and claims none.
Common questions
Can an agent refund any amount it wants?
No. The agent’s max_amount is a cap. The actual refund amount is min(round(max_amount * 100), original.amount_minor), computed server-side from the original charge_state or shopify_order_state row. The agent has no way to assert a fabricated original total.
What if the same refund mandate is replayed?
The (site_id, mandate_id) unique index makes the executor’s INSERT-first idempotency lock fire: the second attempt returns the existing terminal state and produces no second rail call.
What if a second refund is requested against an already-refunded original?
The partial unique index on (site_id, original_mandate_id) raises a constraint violation; the executor surfaces already_executed evidence with the existing stripe_refund_id echoed back. No second rail call. Multi-tranche refunds are not in v1.
What happens if the original mandate was never charged?
The refund aborts with failed_original_not_charged and a signed audit record describing the abort. No rail call runs.
Can refunds skip human review?
In the default policy set, request_refund hits r07 (destructive action requires human review) and escalates. A merchant policy can be authored to approve refunds automatically under specific conditions, but the shipped default routes every refund through a reviewer.
What about disputes and chargebacks?
charge.dispute.created is handled at the webhook layer: the original charge_state is marked disputed, the dispute reason and amount are recorded on the original audit record’s settlement evidence, and the event is logged at error severity for ops attention. Sill does not generate dispute evidence on the merchant’s behalf; that is the merchant’s responsibility inside the Stripe Dashboard.
See also
- Transactional overview
- Signed mandates
- Policy engine
- Human in the loop
- Payments
- Audit envelope
- Verify a signature
- Public JWKS
- Agent card
- Concepts
- Compliance
- External: Stripe Refunds API, Stripe disputes guide, Shopify refundCreate mutation, RFC 8785 JCS, RFC 8032 ed25519, RFC 7515 JWS / RFC 8037 EdDSA