Andrej Karpathy's Autoresearch system runs autonomous agent loops that edit training code, execute fixed five-minute experiments, and commit only the changes that improve a single validation metric. This is not a demo. It is a functional research pipeline where the agent iterates without human intervention.
The piece connects this to the Ralph Wiggum iterative loop and examines multi-agent collaborative research as the logical next step. Concrete application domains include LLM training, code generation, advertising, and product experimentation. The specificity of the five-minute experiment window and the single-metric commit rule are the architectural decisions worth understanding.
What makes this worth reading in full is not the conclusion about automation but the emerging skill set it demands from humans: evaluation design and writing agent memos. Those two capabilities are where human leverage concentrates as the loop takes over everything else.
[WATCH ON YOUTUBE →]