Structured Outputs Killed Half Our Error Handling

OpenAI shipped structured outputs in August 2024. With strict mode on, responses are constrained to a JSON Schema you supply. No more parsing errors. No more "the model returned markdown instead of JSON." No more regex extraction from free text.

Six months after adopting it across all our agent systems, we deleted approximately 40% of our error handling code.

What we removed

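Output parsing retries. We had retry loops that re-prompted the model when it returned malformed JSON. With structured outputs, a completed response cannot be malformed JSON. The failures that remain, a refusal or a response cut off at the token limit, are flagged explicitly in the API response, so there is nothing to guess at. These retries never fire. Deleted.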
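For readers who never had to write one, a retry loop of this kind looks roughly like the sketch below. It is illustrative, not our production code: `call_model` is a hypothetical stand-in for whatever client function returns the model's raw text, and the re-prompt wording is made up.

```python
import json
from typing import Callable, Optional


def call_with_parse_retries(
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw reply text out
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Re-prompt until the reply parses as a JSON object, or give up."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            parsed = json.loads(raw)
            if isinstance(parsed, dict):
                return parsed
            last_error = ValueError("reply was JSON but not an object")
        except json.JSONDecodeError as exc:
            last_error = exc
        # Nag the model and try again with a longer prompt.
        prompt = prompt + "\n\nReturn ONLY a valid JSON object."
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Every extra attempt here is another paid model call, which is part of why deleting this code felt good.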

Schema validation layers. We had Zod schemas that validated the shape of model output at runtime. With structured outputs, the response is constrained to the schema while it is generated, so checking its shape a second time is redundant. Deleted.

Fallback extractors. When the model returned text instead of JSON, we had regex-based extractors that attempted to parse the response anyway. Hacky, brittle, and often wrong. With structured outputs, this case doesn't exist. Deleted.
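A minimal sketch of such an extractor shows why it was brittle. The function name and the greedy "first brace to last brace" pattern are illustrative assumptions, not our exact code.

```python
import json
import re
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Best-effort: grab everything from the first "{" to the last "}" and parse it."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

It rescues a reply wrapped in a markdown fence, but a stray brace anywhere in the surrounding prose widens the match and the whole thing fails.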

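Type coercion utilities. Converting string numbers to actual numbers, parsing date strings, normalizing boolean representations. With structured outputs and proper schema definitions, numbers arrive as numbers and booleans as booleans. Dates are the exception: JSON has no date type, so they still arrive as strings and need parsing. The rest: deleted.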

What replaced it

Structured outputs solved the format problem. They didn't solve the content problem. The model always returns valid JSON now. But valid JSON with wrong values is still wrong.

Semantic validation. The JSON is valid, but is the sentiment field actually correct? Is the category assignment meaningful? Is the summary an accurate representation of the input? These require a different kind of validation, often another model call or a rule engine.
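The rule-engine flavor of this can be very small. The sketch below is an assumption-heavy illustration: the output fields (`sentiment`, `rating`, `summary`, `evidence`) and the three rules are hypothetical, chosen only to show the shape of content checks that run after the format is already guaranteed.

```python
from typing import Callable, List, Optional

# A rule takes (model_output, source_text) and returns an error message or None.
Rule = Callable[[dict, str], Optional[str]]


def summary_shorter_than_source(output: dict, source: str) -> Optional[str]:
    if len(output["summary"]) >= len(source):
        return "summary is not shorter than the source"
    return None


def evidence_is_verbatim(output: dict, source: str) -> Optional[str]:
    for quote in output["evidence"]:
        if quote not in source:
            return f"evidence not found in source: {quote!r}"
    return None


def sentiment_matches_rating(output: dict, source: str) -> Optional[str]:
    rating, sentiment = output["rating"], output["sentiment"]
    if sentiment == "positive" and rating <= 2:
        return "positive sentiment with a low rating"
    if sentiment == "negative" and rating >= 4:
        return "negative sentiment with a high rating"
    return None


RULES: List[Rule] = [
    summary_shorter_than_source,
    evidence_is_verbatim,
    sentiment_matches_rating,
]


def semantic_errors(output: dict, source: str) -> List[str]:
    """Run every rule; an empty list means no rule objected."""
    errors = []
    for rule in RULES:
        message = rule(output, source)
        if message is not None:
            errors.append(message)
    return errors
```

Rules like these only catch contradictions you thought of in advance. Whether a summary is actually faithful is the part that tends to need a second model call.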

Schema design errors. With structured outputs, your JSON Schema is your API contract with the model. A poorly designed schema produces consistently structured but consistently useless output. We now spend more time designing schemas and less time parsing output.
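Treating the schema as a contract means it can be linted like one. The sketch below assumes OpenAI's strict mode requirements as we understand them (every object sets `additionalProperties` to false and lists all of its properties under `required`) and adds one hypothetical house rule, that every field carries a `description`. `TICKET_SCHEMA` is an invented example, and the check is not exhaustive: it ignores `anyOf`, `$defs`, and type unions.

```python
from typing import List

# Hypothetical example schema for a support-ticket classifier.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "bug", "other"],
            "description": "Primary reason the customer wrote in.",
        },
        "severity": {
            "type": "integer",
            "description": "1 (cosmetic) to 4 (outage).",
        },
        "summary": {
            "type": "string",
            "description": "One sentence, no customer names.",
        },
    },
    "required": ["category", "severity", "summary"],
    "additionalProperties": False,
}


def schema_problems(schema: dict, path: str = "$") -> List[str]:
    """Walk a schema and report strict-mode and house-rule violations."""
    problems: List[str] = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub in props.items():
            if not sub.get("description"):  # house rule, not an API requirement
                problems.append(f"{path}.{name}: missing description")
            problems.extend(schema_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(schema_problems(schema["items"], f"{path}[]"))
    return problems
```

A lint like this catches structural mistakes before the first API call. It says nothing about whether the fields are the right fields, which is where the design time actually goes.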

Enum drift. You define an enum of valid categories. The model always picks from the enum. But the world changes and new categories are needed. Your structured output faithfully returns old categories because the schema doesn't include new ones. Free-text output at least let a new category show up in the data; a closed enum forces it into the nearest existing value, and nothing errors. This failure mode didn't exist with free-text output.
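One way to make drift visible, sketched under assumptions: the enum includes an escape value such as "other", and you track how often the model reaches for it. The function names, the escape value, and the 15% threshold are all hypothetical; pick numbers from your own baseline.

```python
from collections import Counter
from typing import Iterable


def other_rate(categories: Iterable[str], escape_value: str = "other") -> float:
    """Fraction of recent outputs that fell into the escape-hatch category."""
    counts = Counter(categories)
    total = sum(counts.values())
    return counts[escape_value] / total if total else 0.0


def drift_suspected(categories: Iterable[str], threshold: float = 0.15) -> bool:
    """True when the escape-hatch share exceeds the (assumed) alert threshold."""
    return other_rate(categories) > threshold
```

A rising escape-hatch share does not say which category is missing, only that it is time for a human to read a sample of those items and revisit the enum.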

Structured outputs are a genuine improvement. But they shifted our error handling from "fight the format" to "fight the semantics." Total defensive code decreased. Complexity of what remains increased.

If you're migrating to structured outputs, we help teams get the schema design right the first time.