OpenAI has released GPT-Rosalind, a large language model trained specifically on biology workflows rather than on general science. The name references Rosalind Franklin. The model departs from the broad, field-agnostic science models that Google, Meta, and others have shipped.

According to Yunyun Wang, OpenAI's Life Sciences Product Lead, the model was trained on 50 common biological workflows and the major public biological databases. It is built to handle two concrete problems: genomic and protein biochemistry datasets too large for any single researcher to work through, and the jargon barrier between biology subfields. The intended result is that a geneticist chasing a neurologically active gene no longer has to manually parse decades of neurobiology literature.

The model can suggest biological pathways, prioritize drug targets, and connect genotype to phenotype using known regulatory mechanisms. The full article is worth reading for Wang's framing of how mechanistic protein inference works inside the system, and what OpenAI's Life Sciences division is positioning as the next capability milestone.
