Gemini 3.1 Pro and Mistral Voxtral Mini Debut Amid Taalas’s Breakthrough Inference Speeds
Large Language Models
Google Launches Gemini 3.1 Pro: Google has released Gemini 3.1 Pro, which it says improves on previous versions in coding and reasoning and produces fewer hallucinations. The release is accompanied by detailed benchmarks covering its performance on a range of complex tasks.
Mistral AI Releases Voxtral Mini 4B Realtime: Mistral AI has launched Voxtral Mini 4B Realtime, a compact model designed for low-latency, real-time applications. The model is available for testing on Hugging Face and in the Mistral Studio Playground.
AI Hardware & Infrastructure
Taalas Unveils ASIC-Based Inference Achieving 16,000 Tokens/Second: Taalas has introduced a hardware approach that etches LLM weights directly into silicon, bypassing conventional high-bandwidth memory (HBM) to reach 16,000 tokens per second. The company has launched a public demo running Llama 3.1 8B to showcase the throughput and power efficiency of its specialized chips.