May 13, 2026 · Structured output · Tools · Traces
Build a data extraction workflow
Call tools, force JSON outputs, recover from malformed responses, and inspect every extraction step.
This cookbook builds a structured extraction workflow for AI outputs that must be parsed, validated, retried, and explained.
Scenario
An analyst submits free-form notes. The workflow extracts accounts, contacts, dates, and next actions as JSON, validates the result, and stores the structured record.
What you build
- A structured-output prompt.
- A schema validator.
- A repair step for malformed JSON.
- A retry policy for transient model failures.
- A trace that shows raw and parsed outputs.
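The retry policy in the list above could look something like the following. This is a minimal sketch, not the cookbook's own implementation: `TransientModelError`, `call_with_retry`, and the `max_attempts` and `base_delay` parameters are hypothetical names, and the backoff values are illustrative.

```python
import asyncio
import random


class TransientModelError(Exception):
    """Hypothetical error type for timeouts, rate limits, and similar failures."""


async def call_with_retry(call, *, max_attempts=3, base_delay=0.5, sleep=asyncio.sleep):
    """Await call() up to max_attempts times, backing off after transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except TransientModelError:
            if attempt == max_attempts:
                raise  # Retry budget exhausted: surface the failure to the caller.
            # Exponential backoff (base_delay, 2x, 4x, ...) plus random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            await sleep(delay + random.uniform(0, delay / 2))
```

Only the transient error type is retried here. Under this design a validation failure would not be retried blindly and would go to the repair step instead.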
Workflow shape
    @workflow
    async def extract_account_update(ctx: WorkflowContext, note_id: str) -> ExtractionResult:
        # The model call and the parse are separate steps, so each output can be inspected.
        note = await ctx.step(load_note, note_id)
        raw = await ctx.step(call_extraction_agent, note.text)
        parsed = await ctx.step(parse_and_validate_update, raw)
        receipt = await ctx.step(store_update_once, note.id, parsed)
        return ExtractionResult(update_id=receipt.id)

Separating the model call from the parse step makes malformed output easy to inspect.
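To illustrate why the separation helps, here is a dependency-free sketch in which an in-memory `Trace` class stands in for the workflow trace. `Trace`, `parse_update`, and the stubbed `call_extraction_agent` are assumptions made for the example, not the framework's API. When parsing fails, the raw model output is still recorded as the output of the earlier step.

```python
import asyncio
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Trace:
    """In-memory stand-in for a workflow trace (an assumption, not a real API)."""
    entries: list[dict[str, Any]] = field(default_factory=list)

    async def step(self, fn, *args):
        entry: dict[str, Any] = {"step": fn.__name__, "input": args}
        self.entries.append(entry)
        try:
            entry["output"] = await fn(*args)
        except Exception as exc:
            entry["error"] = repr(exc)  # Keep the failure next to its input.
            raise
        return entry["output"]


async def call_extraction_agent(text: str) -> str:
    # Stubbed model call that returns truncated JSON.
    return '{"account_name": "Acme"'


async def parse_update(raw: str) -> dict:
    return json.loads(raw)


async def main() -> Trace:
    trace = Trace()
    try:
        raw = await trace.step(call_extraction_agent, "Met Acme on Tuesday.")
        await trace.step(parse_update, raw)
    except json.JSONDecodeError:
        pass  # The trace still holds the raw output that failed to parse.
    return trace


if __name__ == "__main__":
    for entry in asyncio.run(main()).entries:
        print(entry)
```

If the model call and the parse lived in one step, the trace would record only the exception, and the malformed text would be lost.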
Schema-first extraction
Define the expected output before writing the prompt.
    from datetime import date

    from pydantic import BaseModel

    class AccountUpdate(BaseModel):
        account_name: str
        contacts: list[str]
        next_action: str
        due_date: date | None  # Required field, but null is accepted.
        confidence: float

The validator should reject missing required fields and values that do not match business rules.
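The schema itself catches missing fields and wrong types. Business rules go further, and one way to express them is a plain function that runs on the parsed record. The sketch below uses a dataclass stand-in so it runs without dependencies. The specific rules (non-blank names, `confidence` in [0, 1], `due_date` not before the note date) and the names `AccountUpdateData`, `business_rule_violations`, and `noted_on` are assumptions for illustration. With Pydantic, similar rules could probably live on the model as field constraints or validators.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AccountUpdateData:
    """Plain stand-in for the parsed AccountUpdate (hypothetical)."""
    account_name: str
    contacts: list[str]
    next_action: str
    due_date: Optional[date]
    confidence: float


def business_rule_violations(update: AccountUpdateData, noted_on: date) -> list[str]:
    """Return human-readable violations; an empty list means the update passes."""
    problems = []
    if not update.account_name.strip():
        problems.append("account_name is blank")
    if any(not contact.strip() for contact in update.contacts):
        problems.append("contacts contains a blank entry")
    if not update.next_action.strip():
        problems.append("next_action is blank")
    if not 0.0 <= update.confidence <= 1.0:
        problems.append("confidence is outside [0, 1]")
    if update.due_date is not None and update.due_date < noted_on:
        problems.append("due_date is earlier than the note date")
    return problems
```

Returning every violation, instead of raising on the first one, gives a repair prompt or a reviewer the full picture in one pass.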
Malformed output recovery
If parsing fails, run a bounded repair step and keep both versions in the trace.
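    from pydantic import ValidationError

    @function
    async def parse_and_validate_update(raw: str) -> AccountUpdate:
        try:
            return AccountUpdate.model_validate_json(raw)
        except ValidationError:
            # One repair attempt only. A second ValidationError propagates
            # and fails this step before anything is stored.
            repaired = await repair_json(raw)
            return AccountUpdate.model_validate_json(repaired)

Production checks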
- Raw model output and parsed output are both trace-visible.
- Repair attempts are bounded.
- Invalid data fails before the storage step.
- The storage step is idempotent.
- Failed extractions can be converted into eval cases.
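As one illustration of the idempotent-storage check, the sketch below keys each stored update on the note ID, so a retried step cannot write a second row. It uses SQLite from the standard library. The `updates` table, the `open_store` and `store_update_once_sqlite` helpers, and the choice of note ID as the key are all assumptions, not the cookbook's actual storage layer.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per note (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE updates (note_id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    return conn


def store_update_once_sqlite(conn: sqlite3.Connection, note_id: str, payload: dict) -> bool:
    """Insert the update unless one already exists; return True only on first insert."""
    with conn:  # Commits on success, rolls back on error.
        cursor = conn.execute(
            "INSERT OR IGNORE INTO updates (note_id, body) VALUES (?, ?)",
            (note_id, json.dumps(payload, sort_keys=True)),
        )
    return cursor.rowcount == 1


if __name__ == "__main__":
    conn = open_store()
    update = {"account_name": "Acme", "next_action": "Send quote"}
    print(store_update_once_sqlite(conn, "note-1", update))  # First write.
    print(store_update_once_sqlite(conn, "note-1", update))  # Retry is a no-op.
```

Keying on the note ID means a re-extraction with different output is also ignored. If later extractions should replace earlier ones, an upsert or a versioned key would be a better fit.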