Reasoning Engine Benchmarks - Jan 2025

Tue, 28 Jan 2025 00:00:00 +0000

DeepSeek R1 vs Claude Sonnet 3.5

There’s been a lot of excitement around the Chinese DeepSeek R1 model that was released last week. DeepSeek R1 is a reasoning, chain-of-thought (CoT) model that’s shaken things up because of the comparatively tiny budget ($6 million, which is probably what OpenAI’s o1 costs to run its warm-up sequence) needed to train a frontier model of this sort.

(There’s also been a fair amount of dread, as US AI models are no longer the only game in town and DeepSeek was considerably cheaper than any of the US alternatives. US firms were hammered in Monday’s trading with over $100bn

Research on Andrew Sheves
