Claude Opus 4.8 launched at the same price as Opus 4.7, and Claire Vo got early access to test it. Her verdict: strong on greenfield prototypes and one-shot features, weak on the last 10% of polish and on edge cases in existing codebases, and still prone to hallucinations. For business strategy and roadmap work, she is still reaching for 4.7.
The episode is worth watching for the specific failure modes. Vo puts Opus 4.8 through real coding tasks in Claude Code, a head-to-head business strategy comparison against 4.7, and an ambition test: building games for a 9-year-old. The hallucination segment at 3:27 and the existing-codebase test at 4:23 are where the model's limits become concrete rather than theoretical. Alongside this release, Anthropic is also shipping dynamic workflows with parallel subagents and effort control inside Claude.ai and Cowork.
The practical takeaway is the prompting and harness strategy Vo recommends for getting the most out of 4.8 given its specific failure patterns. If you are deciding whether to move workflows to 4.8 or stay on 4.7 for data-heavy work, this 10-minute breakdown gives you a faster answer than any benchmark table.