Retry AI workflow steps without duplicate side effects
Use idempotency keys and journaled receipts so retries do not duplicate emails, tickets, or payments.
Retries are necessary in production AI workflows. They are also dangerous when a step talks to Stripe, sends email, creates a ticket, or fires a webhook. This cookbook shows the pattern for retrying safely after a side effect may already have happened.
Scenario
An AI workflow classifies a support request and creates a ticket in a CRM. The CRM request succeeds, but the network connection drops before your worker sees the response. The runtime retries the step.
Without an idempotency pattern, the customer gets two tickets. With the pattern, the retry returns the original receipt.
What you build
- A workflow with retryable external side effects.
- Stable idempotency keys for each side-effect step.
- A receipt that is stored in the AGNT5 journal.
- Retry behavior that returns the original external object.
- Trace checks that prove only one side effect happened.
Workflow shape
Keep side effects small and named.

    @workflow
    async def triage_and_create_ticket(ctx: WorkflowContext, inbound: InboundRequest) -> TicketResult:
        classification = await ctx.step(classify_request, inbound)
        ticket = await ctx.step(create_crm_ticket_once, inbound.request_id, classification)
        email = await ctx.step(send_ack_email_once, inbound.request_id, ticket.id)
        return TicketResult(ticket_id=ticket.id, email_id=email.id)

create_crm_ticket_once and send_ack_email_once are the only steps that touch external systems.
Idempotency key
Base the key on the business object, not on the retry attempt.

    def crm_idempotency_key(request_id: str) -> str:
        # The same request_id yields the same key on every attempt.
        return f"crm-ticket:{request_id}"

    @function
    async def create_crm_ticket_once(
        request_id: str,
        classification: Classification,
    ) -> CrmTicket:
        return await crm.create_ticket(
            subject=classification.subject,
            priority=classification.priority,
            idempotency_key=crm_idempotency_key(request_id),
        )

If the CRM supports idempotency keys, use its native support. If it does not, keep receipts in your own database keyed by the same value: look up the key before calling the CRM, and store the receipt before the step returns.
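For a CRM without native idempotency keys, the receipt fallback described above could look roughly like the sketch below. It is only an illustration: the `receipts` table, `open_receipt_store`, and `create_ticket_with_receipt` are hypothetical names, SQLite stands in for whatever database you already run, and the `create_ticket` callable represents the real CRM call.

```python
import sqlite3
from typing import Callable


def open_receipt_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a hypothetical receipt store; the idempotency key is the primary key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS receipts ("
        "idempotency_key TEXT PRIMARY KEY, ticket_id TEXT NOT NULL)"
    )
    return conn


def create_ticket_with_receipt(
    conn: sqlite3.Connection,
    idempotency_key: str,
    create_ticket: Callable[[], str],
) -> str:
    """Return the stored ticket id if one exists, otherwise create and record it."""
    row = conn.execute(
        "SELECT ticket_id FROM receipts WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()
    if row is not None:
        return row[0]  # an earlier attempt already created the ticket

    ticket_id = create_ticket()  # the external side effect
    with conn:  # commits the receipt before we return to the caller
        conn.execute(
            "INSERT INTO receipts (idempotency_key, ticket_id) VALUES (?, ?)",
            (idempotency_key, ticket_id),
        )
    return ticket_id
```

Calling `create_ticket_with_receipt` twice with the same key invokes `create_ticket` once and returns the same ticket id both times. Note that a local receipt store narrows the duplicate window but does not close it: a crash after the CRM call and before the insert still leaves no receipt, which is why native idempotency support is preferable when the CRM offers it.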
Journaled receipt
The step should return the external receipt, not just True.

    from datetime import datetime
    from pydantic import BaseModel  # assumption: BaseModel is pydantic's

    class CrmTicket(BaseModel):
        id: str
        idempotency_key: str
        created_at: datetime

On replay, AGNT5 reads this receipt from the journal. The workflow can continue without creating the ticket again.
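To make the replay behavior concrete, here is a toy model of a journal. It is not the AGNT5 API: `ToyJournal` and its `step` method are hypothetical stand-ins that only illustrate the idea that a recorded result is returned instead of re-running the side effect.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


@dataclass
class ToyJournal:
    """Hypothetical stand-in for a durable journal: step name -> recorded result."""

    entries: dict[str, Any] = field(default_factory=dict)

    async def step(self, name: str, fn: Callable[[], Awaitable[Any]]) -> Any:
        if name in self.entries:
            return self.entries[name]  # replay: return the receipt, skip the call
        result = await fn()
        self.entries[name] = result  # record the receipt for later replays
        return result


created: list[str] = []  # tracks how many times the side effect really ran


async def create_ticket() -> dict[str, str]:
    created.append("T-1")
    return {"id": "T-1", "idempotency_key": "crm-ticket:req-1"}


async def run_twice() -> tuple[dict[str, str], dict[str, str]]:
    journal = ToyJournal()
    first = await journal.step("create_crm_ticket_once", create_ticket)
    replayed = await journal.step("create_crm_ticket_once", create_ticket)
    return first, replayed


first, replayed = asyncio.run(run_twice())
```

After this runs, `first` and `replayed` hold the same receipt and `created` contains a single entry. Because downstream steps depend on fields such as `id`, a bare boolean would not be enough for replay to continue the workflow.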
Production checks
- Inject a timeout after the CRM creates the ticket.
- Confirm the retry uses the same idempotency key.
- Confirm only one CRM ticket exists.
- Confirm the AGNT5 trace shows the failed attempt and the successful retry.
- Confirm replay returns the journaled ticket receipt.
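Several of the checks above can be rehearsed locally before touching a real CRM. The sketch below assumes a hypothetical `FakeCrm` test double that deduplicates on the idempotency key and can lose its response after creating the ticket; `create_with_retry` is an illustrative retry loop, not the AGNT5 runtime's retry logic.

```python
class FakeCrm:
    """Hypothetical CRM double that deduplicates on the idempotency key."""

    def __init__(self, fail_after_create: int = 0) -> None:
        self.tickets: dict[str, str] = {}  # idempotency_key -> ticket id
        self.seen_keys: list[str] = []  # key sent on each attempt
        self.fail_after_create = fail_after_create

    def create_ticket(self, subject: str, idempotency_key: str) -> str:
        self.seen_keys.append(idempotency_key)
        if idempotency_key not in self.tickets:
            self.tickets[idempotency_key] = f"T-{len(self.tickets) + 1}"
        if self.fail_after_create > 0:
            # Injected fault: the ticket exists, but the caller never sees it.
            self.fail_after_create -= 1
            raise TimeoutError("response lost after the ticket was created")
        return self.tickets[idempotency_key]


def create_with_retry(
    crm: FakeCrm, request_id: str, subject: str, max_attempts: int = 3
) -> str:
    key = f"crm-ticket:{request_id}"  # derived from the request, not the attempt
    for attempt in range(1, max_attempts + 1):
        try:
            return crm.create_ticket(subject, idempotency_key=key)
        except TimeoutError:
            if attempt == max_attempts:
                raise
    raise RuntimeError("max_attempts must be at least 1")


crm = FakeCrm(fail_after_create=1)
ticket_id = create_with_retry(crm, "req-1", "Refund request")
```

Here the first attempt times out after the ticket is created, the retry sends the same key, and the double ends up holding exactly one ticket. The trace and journal checks still need to be verified against a real AGNT5 run.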