DeepSeek's New V3 and R1 Models: Efficiency Breakthroughs and Cost Savings
Model Releases and Updates
DeepSeek V3 and R1 Models: DeepSeek has released detailed pricing and system architecture information for its V3 and R1 models, showing significantly lower costs than competitors. Its inference system reports a theoretical cost-profit ratio of 545% per day.
- New DeepSeek report: $1.10 per 1M output tokens for V3, $2.19 per 1M tokens for R1
- Day 6: One More Thing, DeepSeek-V3/R1 Inference System Overview
- China's DeepSeek claims theoretical cost-profit ratio of 545% per day
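For readers wondering how a figure like 545% is derived, here is a minimal sketch of the arithmetic, assuming the ratio means profit divided by cost. The two dollar amounts are the daily totals as recalled from DeepSeek's published inference overview; they do not appear in this digest, so treat them as assumptions. The function name is hypothetical.

```python
def cost_profit_ratio(daily_revenue: float, daily_cost: float) -> float:
    """Return profit as a percentage of cost (assumed definition of the ratio)."""
    if daily_cost <= 0:
        raise ValueError("daily_cost must be positive")
    return (daily_revenue - daily_cost) / daily_cost * 100.0


# Assumed inputs: daily totals as recalled from DeepSeek's overview,
# not stated anywhere in this digest.
DAILY_COST_USD = 87_072                  # GPU rental cost per day
THEORETICAL_DAILY_REVENUE_USD = 562_027  # if every token were billed at R1 pricing

ratio = cost_profit_ratio(THEORETICAL_DAILY_REVENUE_USD, DAILY_COST_USD)
print(f"Theoretical cost-profit ratio: {ratio:.0f}%")
```

The figure is theoretical because it assumes every token is billed at full R1 pricing, so actual revenue would be lower.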
TinyR1-32B: Efficient Model with SuperDistillation: Qihoo 360 has released TinyR1-32B-Preview, achieving near-R1 performance with only 5% of the parameters through a process called SuperDistillation.
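The "5% of the parameters" claim can be sanity-checked with a quick calculation. This is a rough sketch: the 32B figure comes from the model name, while the 671B total for DeepSeek-R1 is an assumption taken from its public model card, not from this digest.

```python
TINYR1_PARAMS_B = 32      # from the model name, TinyR1-32B-Preview
R1_TOTAL_PARAMS_B = 671   # assumption: DeepSeek-R1 total parameter count

# Fraction of R1's total parameters that TinyR1-32B uses.
share = TINYR1_PARAMS_B / R1_TOTAL_PARAMS_B
print(f"TinyR1-32B is about {share:.1%} the size of R1")
```

Note that R1 is a mixture-of-experts model, so this compares against its total parameter count, not the smaller number of parameters active per token.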
Phi-4-mini Bug Fixes and GGUF Conversion: llama.cpp has added support for Microsoft's Phi-4-mini model and fixed several tokenization issues, making the model easier to deploy locally.
Open-Source DeepResearch Implementation: A new open-source project called Search-R1 has been released. It reproduces the methods of DeepSeek-R1 to train LLMs that interleave reasoning with search.
Voice AI Breakthroughs
Sesame Real-Time Voice Chat Model: Sesame AI Labs has released a real-time, low-latency voice chat model available in three sizes, designed to create natural conversation with human-like qualities.
Platform and Feature Updates
GPT-4.5 Release and Integration: OpenAI has released GPT-4.5 with new capabilities including an effort level slider, and the model has already been integrated into Perplexity AI's platform.
- Which model will have an effort level slider? What kind of innovation are they trying to bring?
- GPT-4.5 is available on Perplexity now. THEY ARE FAST
- 4.5 triggered Deep Research to solve NYT word game
Perplexity Deep Research for Enterprise Data: Perplexity AI has introduced a new enterprise feature that connects to Google Drive, OneDrive, and SharePoint, enabling comprehensive research across company files and the web.
AI Capabilities and Applications
Gemini 2.0 for Music Generation: A user implemented a new "reasoning" mode for music models using Gemini 2.0 Pro, reporting significant improvements in AI-generated music.