Claude Opus 4.7 Is Out, and Its Vision Score Jumped from 54.5% to 98.5%

Anthropic released Claude Opus 4.7 today. Its visual acuity score rose from 54.5% to 98.5%, it resolves three times as many production tasks on Rakuten-SWE-Bench, and it has a 1M token context window. It costs the same as Opus 4.6 and is substantially more capable.

Anthropic shipped Claude Opus 4.7 today, and its visual acuity score jumped from 54.5% to 98.5%.

That’s not a rounding error. The previous flagship could not handle computer-use vision tasks reliably. This one is near-perfect on the same benchmark. That gap is the story.

Opus 4.7 is available today across Claude.ai, Anthropic’s API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure AI Foundry. Pricing is unchanged at $5 per million input tokens and $25 per million output tokens. The context window is 1 million tokens.
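To make the pricing concrete, here is a minimal Python sketch of the arithmetic using the per-million-token prices quoted above. The function name and the example token counts are hypothetical, chosen only for illustration, and the sketch ignores any discounts or surcharges a provider might apply.

```python
# List prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the article's list prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: a 200,000-token document bundle and a 4,000-token answer.
cost = estimate_cost_usd(200_000, 4_000)
print(f"Estimated cost: ${cost:.2f}")
```

Output tokens cost five times as much as input tokens, so long generated answers can dominate the bill even when the prompt is large.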

The vision improvement is tied to a resolution increase. Opus 4.7 now accepts images up to 2,576 pixels on the long edge, about 3.75 megapixels, which is more than three times the resolution prior Claude models accepted. The jump in visual acuity from 54.5% to 98.5% on computer-use benchmarks reflects that change.
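For developers preparing images, the sketch below computes target dimensions that fit within the 2,576-pixel long-edge limit while preserving aspect ratio. It assumes that limit is the only constraint. Any separate cap on total pixels or file size is not modelled, and the function name is hypothetical.

```python
MAX_LONG_EDGE_PX = 2576  # long-edge limit quoted in the article


def fit_to_long_edge(width: int, height: int,
                     max_long_edge: int = MAX_LONG_EDGE_PX) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side fits the limit.

    Images already within the limit are returned unchanged (no upscaling).
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(1, round(width * max_long_edge / long_edge))
    new_height = max(1, round(height * max_long_edge / long_edge))
    return new_width, new_height


# Hypothetical input: a 4032 x 3024 phone photo of an ID document.
print(fit_to_long_edge(4032, 3024))
```

The resulting dimensions can be passed to whatever image library the pipeline already uses for the actual resize.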

The coding and agent numbers are also substantial. On Anthropic’s internal 93-task coding benchmark, Opus 4.7 scores 13% higher than Opus 4.6. On Rakuten-SWE-Bench, a production-task benchmark built on real software engineering work, it resolves three times as many tasks. It also handles four tasks that neither Opus 4.6 nor Claude Sonnet 4.6 could solve at all.

What Changed Under the Hood

The model ships with several other changes beyond the resolution increase.

A new xhigh effort level sits between the existing high and max settings, giving developers finer-grained control over compute spend per task. Anthropic also added adaptive thinking, a feature that automatically adjusts reasoning depth based on task complexity. The intent is to avoid burning max-effort compute on simple requests.

Other improvements on Anthropic’s spec sheet: a 14% gain on complex multi-step agent workflows, 21% fewer errors on enterprise document reasoning via the OfficeQA Pro benchmark, and better finance reasoning (0.813 vs. 0.767 on the General Finance evaluation module).

There is also a tokenizer change. With the updated tokenizer, the same input produces 1.0 to 1.35 times as many tokens as before. Anthropic says net token usage on coding evaluations still improved despite that increase, but developers integrating the model should expect higher token counts on text-heavy workloads.
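A quick way to budget for the tokenizer change is to bracket an existing input-token count with the 1.0x to 1.35x range quoted above. The Python sketch below does that and prices both ends at the $5 per million input rate. The function name and the 100,000-token example are hypothetical, and actual ratios will depend on the content.

```python
# Tokenizer inflation range from the article, as integer percentages
# (100 = 1.0x, 135 = 1.35x) so the arithmetic stays exact.
RATIO_LOW_PCT = 100
RATIO_HIGH_PCT = 135
INPUT_USD_PER_MTOK = 5.00  # input price quoted in the article


def projected_input_tokens(old_tokens: int) -> tuple[int, int]:
    """Return the (low, high) projected token counts, rounding up."""
    low = -(-old_tokens * RATIO_LOW_PCT // 100)    # ceiling division
    high = -(-old_tokens * RATIO_HIGH_PCT // 100)
    return low, high


# Hypothetical prompt that used 100,000 input tokens under the old tokenizer.
low, high = projected_input_tokens(100_000)
low_cost = low * INPUT_USD_PER_MTOK / 1_000_000
high_cost = high * INPUT_USD_PER_MTOK / 1_000_000
print(f"Tokens: {low:,} to {high:,}")
print(f"Input cost: ${low_cost:.3f} to ${high_cost:.3f}")
```

Because the per-token price is unchanged, the upper end of the range translates directly into as much as 35% higher input cost for the same text.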

Safety-wise, the profile is close to Opus 4.6. Prompt injection resistance is better. Cybersecurity capabilities have been intentionally reduced compared to Claude Mythos Preview.

Why We’re Watching

The vision upgrade is the part of this release that changes what African developers can build, not just how well they can build it. Fintech companies across Nigeria, Kenya, and Uganda process enormous volumes of degraded document scans: KYC submissions on worn government IDs, utility bills photographed at odd angles, bank statements exported from low-resolution PDFs. AI vision tools fail on these constantly, which pushes document verification back onto human queues.

A jump from 54.5% to 98.5% visual acuity is not marginal. It’s the threshold where automated document processing becomes reliable enough to trust. Combine that with 21% fewer document reasoning errors, and the economics of building a compliant KYC pipeline on top of a frontier model shift meaningfully. The 1M context window has been there for a while. The vision quality to use it on real African documents was not.

The 3x production task gain on Rakuten-SWE-Bench matters separately. Synthetic benchmarks are easy to optimize for. A benchmark built on real production engineering tasks is not.

Watch the 30-day adoption numbers from agentic coding platforms. If Factory Droids report the same 10-15% task success lift Anthropic claims, this release will put measurable pressure on every competing frontier model. On the vision side, the metric to watch is whether document-heavy African fintech workflows that previously required human fallback start running straight through on Opus 4.7.

Sources