25 Mar 2025

DeepSeek Releases Advanced V3-0324 Model; OpenAI o3 Tops ARC-AGI-2 Benchmark

AI Model Releases and Updates

DeepSeek Releases V3-0324 Model with Significant Improvements: DeepSeek has released DeepSeek-V3-0324, a new version of its V3 model that shows substantial gains in reasoning and in performance across various benchmarks. The model is available on platforms such as Hugging Face and OpenRouter.

Deepseek V3 0324 is far from a minor upgrade
New deepseek v3 vs R1 (left is v3)
Deepseek releases new V3 checkpoint (V3-0324)
Deepseek v3
Deepseek-v3-0324 on Aider
New DeepSeek benchmark scores
Change log of DeepSeek-V3-0324
DeepSeek-V3-0324 HF Model Card Updated With Benchmarks
DeepSeek V3-0324 has caught up to Sonnet 3.7 in my code creativity benchmark
New deepseek v3 vs R1 (first is v3)

AI Benchmarks and Performance

OpenAI o3 Performance on ARC-AGI-2 Benchmark: The newly released ARC-AGI-2 benchmark aims to measure progress toward Artificial General Intelligence (AGI), with an emphasis on efficiency. OpenAI's o3 model leads the results, but no AI model scores above 4%, indicating significant room for improvement.

o3 scores <5% on ARC-AGI-2 (but the test looks ... harder?)
O3 (low) falls flat against ARC-AGI v2, barely scores 5% while spending $200 per task (millions of tokens per task)
Arc-AGI-2 new benchmark
OpenAI o3 is leading the newly announced ARC-AGI-2, but no AI is getting above 4%

General AI News

General AI News and Discussions: Community discussions and updates on AI models and their performance, including user experiences and technical details.

New deepseek v3 vs R1 (first is v3)