01 Mar 2025 2 min read

DeepSeek's New V3 and R1 Models: Efficiency Breakthroughs and Cost Savings

Model Releases and Updates

DeepSeek V3 and R1 Models: DeepSeek has released detailed pricing and system architecture information for its V3 and R1 models, showing significantly lower costs than competitors. Its inference system is reported to reach a theoretical cost-profit ratio of 545%.
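To make the 545% figure concrete, here is a minimal sketch of how such a ratio could be computed. It assumes "cost-profit ratio" means profit divided by cost, which the summary above does not define. The function name and the revenue and cost figures are hypothetical and chosen only so the arithmetic lands on the same percentage; they are not DeepSeek's reported numbers.

```python
def cost_profit_ratio(revenue: float, cost: float) -> float:
    """Return profit as a percentage of cost: (revenue - cost) / cost * 100.

    Assumed definition; the source does not spell out the formula.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (revenue - cost) / cost * 100.0


# Hypothetical figures for illustration only (arbitrary currency units).
example_cost = 100.0
example_revenue = 645.0

ratio = cost_profit_ratio(example_revenue, example_cost)
print(f"{ratio:.0f}%")  # prints "545%"
```

Under this definition, a 545% ratio means theoretical revenue is about 6.45 times the cost of serving it.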

TinyR1-32B: Efficient Model with SuperDistillation: Qihoo 360 has released TinyR1-32B-Preview, achieving near-R1 performance with only 5% of the parameters through a process called SuperDistillation.

TinyR1-32B-Preview: SuperDistillation Achieves Near-R1 Performance with Just 5% of Parameters
- https://huggingface.co/qihoo360/TinyR1-32B-Preview
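The "5% of the parameters" claim can be sanity-checked with a short calculation. This sketch assumes the 32B in the model name is the TinyR1 parameter count and uses DeepSeek-R1's publicly stated total of 671B parameters; neither figure appears in the summary above, and the variable names are illustrative.

```python
# Assumed parameter counts, in billions (not stated in the summary above).
tinyr1_params_b = 32.0   # taken from the "32B" in the model name
r1_params_b = 671.0      # DeepSeek-R1's publicly stated total parameter count

# Share of R1's parameters that TinyR1-32B would use under these assumptions.
share_percent = tinyr1_params_b / r1_params_b * 100.0
print(f"{share_percent:.1f}% of R1's parameters")
```

Under these assumptions the share comes out slightly below 5%, so the headline figure is a reasonable rounding.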

Phi-4-mini Bug Fixes and GGUF Conversion: llama.cpp has added support for Microsoft's Phi-4-mini model and fixed several tokenization issues, making the model easier to run locally.

Phi-4-mini Bug Fixes + GGUFs

Open-Source DeepResearch Implementation: A new open-source project called Search-R1 has been released. It reproduces the methods of DeepSeek-R1 to train LLMs that interleave reasoning with search.

The first real open source DeepResearch attempt I've seen
- https://github.com/volcengine/verl
- https://github.com/PeterGriffinJin/Search-R1/tree/main
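As a rough illustration of what "interleaving reasoning with search" can look like at inference time, the sketch below alternates between a generation step and a retrieval step until the model emits an answer. It is a toy under stated assumptions, not Search-R1's actual code: the tag names, the `interleaved_rollout` function, and both stub functions are hypothetical, and the reinforcement learning used for training is not shown.

```python
import re
from typing import Callable, Optional

# Hypothetical tag conventions used only for this sketch.
SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def interleaved_rollout(
    question: str,
    generate: Callable[[str], str],
    search: Callable[[str], str],
    max_turns: int = 4,
) -> Optional[str]:
    """Alternate generation and search until an answer tag appears."""
    context = question
    for _ in range(max_turns):
        step = generate(context)
        context += "\n" + step

        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1).strip()

        query = SEARCH_RE.search(step)
        if query:
            # Feed retrieved text back so the next step can reason over it.
            retrieved = search(query.group(1).strip())
            context += "\n<information>" + retrieved + "</information>"
    return None  # no answer within the turn budget


def toy_generate(context: str) -> str:
    """Stand-in for an LLM: search first, then answer once text is retrieved."""
    if "<information>" not in context:
        return "I need to look this up. <search>capital of France</search>"
    return "The retrieved text answers it. <answer>Paris</answer>"


def toy_search(query: str) -> str:
    """Stand-in for a search engine; always returns the same passage."""
    return "Paris is the capital of France."


result = interleaved_rollout(
    "What is the capital of France?", toy_generate, toy_search
)
print(result)  # prints "Paris"
```

In a real system the `generate` callable would wrap a trained model and `search` would wrap a retriever; the loop structure is the only part this sketch is meant to convey.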

Voice AI Breakthroughs

Sesame Real-Time Voice Chat Model: Sesame AI Labs has released a real-time, low-latency voice chat model available in three sizes, designed to create natural conversation with human-like qualities.

Platform and Feature Updates

GPT-4.5 Release and Integration: OpenAI has released GPT-4.5 with new capabilities including an effort level slider, and the model has already been integrated into Perplexity AI's platform.

Perplexity Deep Research for Enterprise Data: Perplexity AI has introduced a new enterprise feature that connects to Google Drive, OneDrive, and SharePoint, enabling comprehensive research across company files and the web.

Introducing Perplexity Deep Research for Enterprise Data

AI Capabilities and Applications

Gemini 2.0 for Music Generation: A user implemented a "reasoning" mode for music models using Gemini 2.0 Pro and reports significant improvements in AI-generated music.

Using Gemini 2.0 to implement a new reasoning or "thinking" mode for music models produces unbelievable results