DeepSeek's New V3 and R1 Models: Efficiency Breakthroughs and Cost Savings

Model Releases and Updates

DeepSeek V3 and R1 Models: DeepSeek has published detailed pricing and system architecture information for its V3 and R1 models, showing significantly lower costs than competitors. Its inference system reports a theoretical cost-profit ratio of 545%.
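
To make the 545% figure concrete, here is a minimal sketch that assumes the ratio is defined as profit divided by cost, that is, (revenue - cost) / cost. Both that definition and the dollar amounts are illustrative assumptions chosen to reproduce the headline number; they are not DeepSeek's published figures.

```python
def cost_profit_ratio(revenue: float, cost: float) -> float:
    """Return profit as a percentage of cost: (revenue - cost) * 100 / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (revenue - cost) * 100.0 / cost


# Hypothetical daily figures, picked only so the ratio comes out at 545%.
daily_cost = 100_000.0
daily_revenue = 645_000.0

ratio = cost_profit_ratio(daily_revenue, daily_cost)
print(f"{ratio:.0f}%")
```

Under this reading, a 545% ratio means theoretical revenue is about 6.45 times the cost of serving the models.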

TinyR1-32B: Efficient Model with SuperDistillation: Qihoo 360 has released TinyR1-32B-Preview, which achieves near-R1 performance with only about 5% of the parameters through a process called SuperDistillation.
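
The "5%" figure can be sanity-checked with simple arithmetic. The sketch below assumes DeepSeek-R1 has 671 billion total parameters, a number that does not appear in this summary, and reads the 32 billion from the TinyR1-32B model name.

```python
# Assumed total parameter count of DeepSeek-R1, in billions (not stated above).
r1_params_b = 671.0
# Parameter count implied by the "32B" in the TinyR1-32B name, in billions.
tiny_params_b = 32.0

fraction = tiny_params_b / r1_params_b
print(f"{fraction:.1%}")  # roughly 5% of R1's parameters
```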

Phi-4-mini Bug Fixes and GGUF Conversion: llama.cpp has added support for Microsoft's Phi-4-mini model and fixed several tokenization issues, making the model more accessible for local deployment.

Open-Source DeepResearch Implementation: A new open-source project called Search-R1 has been released. It reproduces the methods of DeepSeek-R1 to train LLMs that interleave reasoning with search.

Voice AI Breakthroughs

Sesame Real-Time Voice Chat Model: Sesame AI Labs has released a real-time, low-latency voice chat model in three sizes, designed to hold natural, human-like conversations.

Platform and Feature Updates

GPT-4.5 Release and Integration: OpenAI has released GPT-4.5 with new capabilities including an effort level slider, and the model has already been integrated into Perplexity AI's platform.

Perplexity Deep Research for Enterprise Data: Perplexity AI has introduced a new enterprise feature that connects to Google Drive, OneDrive, and SharePoint, enabling comprehensive research across company files and the web.

AI Capabilities and Applications

Gemini 2.0 for Music Generation: A user implemented a new "reasoning" mode for music models using Gemini 2.0 Pro and reported significant improvements in the AI-generated music.