Anthropic News · 13.03.2026

Introducing Claude Opus 4.5

Opus 4.5 writes better code, leading across 7 out of 8 programming languages on SWE-bench Multilingual.
Opus 4.5 solves challenging coding problems with ease, with a 10.6% jump over Sonnet 4.5 on Aider Polyglot.
Opus 4.5 improves on frontier agentic search with a significant jump on BrowseComp-Plus.
Opus 4.5 stays on track over the long haul, earning 29% more than Sonnet 4.5 on Vending-Bench.
In our evaluation, “concerning behavior” scores measure a very wide range of misaligned behavior, including both cooperation with human misuse and undesirable actions that the model takes on its own initiative [3].
Note that this benchmark includes only very strong prompt injection attacks. It was developed and run by Gray Swan.