The long-awaited feature from OpenAI, “Structured Outputs”, is broken
🌈 Abstract
The article discusses the author's experience with using OpenAI's structured outputs feature for their AI applications. The author argues that while structured outputs are touted as a solution to issues with large language models (LLMs) not conforming to expected output formats, the implementation by OpenAI has several issues and arbitrary rules that make it difficult to use in practice.
🙋 Q&A
[01] The author's experience with OpenAI's structured outputs
1. What are the key issues the author encountered when trying to use OpenAI's structured outputs feature?
- OpenAI applies arbitrary rules about what it considers a valid schema, rejecting schemas that are valid according to the JSON Schema specification
- OpenAI requires setting fields such as "additionalProperties" and "required", which the JSON Schema specification treats as optional
- OpenAI does not allow certain valid JSON Schema constructs such as "oneOf", forcing the author to use workarounds
- OpenAI has an undocumented rule that the first keys in "anyOf" objects cannot be identical, which the author had to find a solution for
2. What is the author's view on the documentation and error messages provided by OpenAI for their structured outputs feature?
- The author finds the documentation to be sparse, with many of the issues they encountered not being documented
- Error messages from OpenAI do not provide clear explanations or references to the documentation on why certain schemas are rejected
3. How does the author's experience compare to the generally positive reception of OpenAI's structured outputs feature?
- The author notes that they seem to be the only one complaining about the issues with the feature, while most others are praising it as a major advancement in AI
- The author acknowledges that their use case of allowing non-technical users to create sophisticated JSON schemas may be more complex than typical use cases
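The schema restrictions listed under question 1 can be illustrated with a small pre-processing step. This is a minimal sketch, not the author's code: the function name `normalize_schema` is hypothetical, and it assumes that the rules amount to setting "additionalProperties" to false, listing every property in "required", and replacing "oneOf" with "anyOf". The summary does not spell out the exact values OpenAI expects, and it does not describe the workaround for the identical-first-key rule in "anyOf", so that rule is not handled here.

```python
from typing import Any


def normalize_schema(schema: Any) -> Any:
    """Return a copy of a JSON Schema node adjusted to the assumed restrictions.

    Hypothetical helper: the rules applied here are assumptions drawn from
    the summary above, not a verified description of OpenAI's validator.
    """
    # Boolean schemas and other non-dict values are passed through untouched.
    if not isinstance(schema, dict):
        return schema

    out = {}
    for key, value in schema.items():
        if key in ("oneOf", "anyOf"):
            # Assumed workaround: express "oneOf" as "anyOf".
            # Assumes a node does not carry both keywords at once.
            out["anyOf"] = [normalize_schema(branch) for branch in value]
        elif key in ("properties", "$defs"):
            # These map names to sub-schemas, so recurse into the values only.
            out[key] = {name: normalize_schema(sub) for name, sub in value.items()}
        elif key == "items":
            out[key] = normalize_schema(value)
        else:
            out[key] = value

    if out.get("type") == "object" or "properties" in out:
        # Assumed rule: every property is listed in "required" and
        # additional properties are explicitly disallowed.
        out["required"] = list(out.get("properties", {}))
        out["additionalProperties"] = False
    return out


if __name__ == "__main__":
    import json

    user_schema = {
        "type": "object",
        "properties": {
            "contact": {"oneOf": [{"type": "string"}, {"type": "integer"}]},
        },
    }
    print(json.dumps(normalize_schema(user_schema), indent=2))
```

Note that both rewrites change the schema's meaning: "anyOf" accepts values that match more than one branch, whereas "oneOf" accepts only values that match exactly one, and marking every property as required removes optional fields. This is the kind of application-side change the author objects to.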
[02] The author's overall perspective on using OpenAI's structured outputs
1. Despite the issues, why is the author determined to continue using OpenAI's structured outputs?
- The author sees the potential benefits of structured outputs in improving the quality and reliability of their AI applications, and wants to move away from the "ugly" validation and retry logic they currently have
2. What is the author's main criticism of OpenAI's approach to structured outputs?
- The author argues that if OpenAI chooses to build on a common specification like JSON Schema, its implementation should align with that specification, rather than imposing arbitrary rules that force developers to change their application code
3. What does the author suggest OpenAI could do to improve the structured outputs feature?
- Clearly document all the rules and requirements for valid schemas, including explanations for why certain rules were included
- Provide better error messages that reference the documentation and explain the reasons for rejecting a schema
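To make the suggestion about error messages concrete, here is a rough client-side sketch of what a more explanatory rejection could look like. Everything in it is an assumption for illustration: `lint_schema` and the rule names in parentheses are hypothetical, and the checks mirror only the rules mentioned in this summary, not OpenAI's actual validator or documentation.

```python
from typing import Any, List


def lint_schema(schema: Any, path: str = "$") -> List[str]:
    """Collect human-readable problems for a JSON Schema node.

    Hypothetical checker: the rules and rule names are illustrative only.
    """
    problems: List[str] = []
    if not isinstance(schema, dict):
        return problems

    if "oneOf" in schema:
        problems.append(
            f"{path}: 'oneOf' is not accepted; consider 'anyOf' (rule: no-oneOf)"
        )

    if schema.get("type") == "object":
        if schema.get("additionalProperties") is not False:
            problems.append(
                f"{path}: 'additionalProperties' must be set to false "
                "(rule: closed-objects)"
            )
        required = schema.get("required", [])
        missing = [name for name in schema.get("properties", {}) if name not in required]
        if missing:
            problems.append(
                f"{path}: properties {missing} are not listed in 'required' "
                "(rule: all-required)"
            )

    # Recurse into nested schemas, extending the path so each message
    # points at the exact location of the problem.
    for name, sub in schema.get("properties", {}).items():
        problems += lint_schema(sub, f"{path}.properties.{name}")
    for key in ("anyOf", "oneOf"):
        for index, sub in enumerate(schema.get(key, [])):
            problems += lint_schema(sub, f"{path}.{key}[{index}]")
    if isinstance(schema.get("items"), dict):
        problems += lint_schema(schema["items"], f"{path}.items")
    return problems


if __name__ == "__main__":
    draft = {"type": "object", "properties": {"tags": {"oneOf": [{"type": "string"}]}}}
    for message in lint_schema(draft):
        print(message)
```

Each message names the location in the schema, the rule that was violated, and a possible fix, which is the level of detail the author asks for.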