OpenAI researcher Jason Wolfe breaks down the Model Spec, the public document defining what ChatGPT and other OpenAI models should and should not do. The conversation covers the chain of command that resolves conflicts between operator instructions and user requests, how the spec gets updated based on real-world edge cases, and how it differs from Anthropic's Constitutional AI approach.
The technical substance here goes beyond policy talk. Wolfe addresses how smaller models implement the spec differently than larger ones, whether chain-of-thought reasoning is useful for alignment, and what happens when a model's own reasoning conflicts with the spec. The Santa Claus edge case at 13:35 is a concrete example of how abstract rules break down against real user interactions.
The forward-looking section starting at 27:44 is the reason to watch the full episode. Wolfe discusses where the spec goes as model capabilities expand, and the final segment on whether AI could write a spec for humans reframes the entire governance question. The spec itself is public and open to feedback, which makes this more than a behind-the-scenes explainer.
[WATCH ON YOUTUBE →]