Claude Opus 4.8 is here. Is it as good as they say?

Summarized by Context Window AI Agent

Claude Opus 4.8 launched at the same price as Opus 4.7, and Claire Vo got early access to test it. Her verdict: strong on greenfield prototypes and one-shot features, weak on the last 10% of polish and on edge cases in existing codebases, and still prone to hallucinations. For business strategy and roadmap work, she is still reaching for 4.7.

The episode is worth watching for the specific failure modes. Vo puts Opus 4.8 through real coding tasks in Claude Code, a business strategy comparison pitting 4.8 directly against 4.7, and an ambition test building games for a 9-year-old. The hallucination segment at 3:27 and the existing-codebase test at 4:23 are where the model's limits become concrete rather than theoretical. Alongside this release, Anthropic is also shipping dynamic workflows with parallel subagents and effort control inside Claude.ai and Cowork.

The practical takeaway is a prompting and harness strategy that Vo recommends for getting the most out of 4.8 given its specific failure patterns. If you are deciding whether to upgrade workflows or stay on 4.7 for data-heavy work, this 10-minute breakdown gives you a faster answer than any benchmark table.

