Ernest Ryu used ChatGPT to help crack a 42-year-old open mathematical problem. That single data point anchors this OpenAI Podcast episode, where Ryu and Sébastien Bubeck walk through how AI math capability jumped from arithmetic failures to research-level contributions, and why the Erdős problem catalog is now a serious benchmark for what comes next.
The episode is worth watching in full for the mechanistic breakdown: Bubeck explains why deep literature retrieval and genuine mathematical discovery are distinct capabilities that should not be conflated. The discussion of proof verification, the risk of models producing shallow pattern-matched proofs, and what longer inference timelines actually change for automated research are the dense sections that reward close attention.
The question the episode keeps circling is structural: what is the human role once models improve past current limits? No clean answer is given, which is honest. The advice segment at 41:19 on learning math with ChatGPT is a practical coda worth skimming separately.
[WATCH ON YOUTUBE →]