Anthropic raises the industry bar for intelligence

Jun 21, 2024

Anthropic has released Claude 3.5 Sonnet, and it looks very interesting. It is faster and smarter than Anthropic’s previous flagship model, Claude 3 Opus, and makes a strong bid for the top spot on the leaderboards.

Claude 3.5 Sonnet is the newest member of Anthropic’s Claude family and the first Claude 3.5 model released to the public. According to Anthropic, it raises the industry bar for intelligence.

If the benchmark results provided by Anthropic are to be believed, Claude 3.5 Sonnet is a massive improvement over Claude 3 Opus and outperforms competitors such as OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro. It also follows the recent trend of making AI models not only smarter but also faster: Anthropic says it runs at twice the speed of Claude 3 Opus.

Anthropic has also revealed that in its internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, while Claude 3 Opus solved only 38% of problems in the same test. Additionally, Anthropic reports that Claude 3.5 Sonnet is better at fixing bugs or adding new functionality to open-source projects, given a natural language description of what needs to be done.

When equipped with relevant tools, Claude 3.5 Sonnet can independently write, edit, and execute code with sophisticated reasoning and troubleshooting capabilities, says Anthropic. Claude 3.5 Sonnet is also good at translating code, which could make it useful in maintaining or migrating legacy codebases.

Anthropic has not described in detail what the coding test looked like. It has published only the results, in the Claude 3.5 Sonnet Model Card Addendum.

Written by Conrad Gray
