<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Research on Andrew Sheves</title><link>https://andrewsheves.com/tags/research/</link><description>Recent content in Research on Andrew Sheves</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 28 Jan 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://andrewsheves.com/tags/research/index.xml" rel="self" type="application/rss+xml"/><item><title>Reasoning Engine Benchmarks - Jan 2025</title><link>https://andrewsheves.com/2025/01/28/reasoning-engine-benchmarks-jan-2025/</link><pubDate>Tue, 28 Jan 2025 00:00:00 +0000</pubDate><guid>https://andrewsheves.com/2025/01/28/reasoning-engine-benchmarks-jan-2025/</guid><description>&lt;h2 id="deepseek-r1-vs-claude-sonnet-35"&gt;DeepSeek R1 vs Claude Sonnet 3.5&lt;/h2&gt;
&lt;p&gt;There’s been a lot of excitement around the &lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-R1"&gt;Chinese DeepSeek R1 model&lt;/a&gt; that was released last week. R1 is a chain-of-thought (CoT) reasoning model that’s shaken things up because of the comparatively tiny budget (a reported $6 million, which is probably what OpenAI’s o1 costs to run its warm-up sequence) needed to train a frontier model of this sort.&lt;/p&gt;
&lt;p&gt;(There’s also been a fair amount of dread, as US AI models are no longer the only game in town and DeepSeek was considerably cheaper than any of the US alternatives. US firms were hammered in Monday’s trading with over $100bn&lt;/p&gt;</description></item></channel></rss>