You must log in or register to comment.
Why do I feel like this article is a paid advertisement for Anthropic?
TL;DR
- Claude Opus 4 leads in raw performance and prompt adherence.
- It understands user intentions better, reminiscent of 3.6 Sonnet.
- High taste. The generated outputs are tasteful. Retains the Opus 3 personality to an extent.
- Though unrelated to code, the model feels nice, and I never enjoyed talking to Gemini and o3.
- Gemini 2.5 is more affordable in pricing and takes fewer API credits than Opus.
- One million context length is undefeatable for large codebase understanding.
- Opus is the slowest in time to first token. You have to be patient with the thinking mode.
And yet, it’s still dumb as hell at coding and can’t kick out a simple script that works.
Funny, that…
You need to try it to see.
I try every new release. Still bad.