Gemini 3 Pro Rumors Swirl as GLM 4.7-Flash Gets llama.cpp Support and GGUF Quantizations
New AI Models & Releases
Gemini 3 Pro GA Rumored to Bring Major Improvements
- Rumors suggest the general-availability (GA) release of Gemini 3 Pro will bring a performance jump large enough that some compare it to a "3.5" version. These claims remain unconfirmed, and the AI community is speculating about the model's capabilities ahead of its release.
GLM 4.7-Flash Officially Supported in llama.cpp
- GLM 4.7-Flash is now officially supported in llama.cpp, so the model can be run locally. The update matters for developers who want to run the model efficiently on their own hardware, though discussions about performance and compatibility are ongoing.
Quantized GLM 4.7-Flash GGUF Models Released
- Multiple GGUF quantizations of GLM 4.7-Flash have been released, including uploads from Unsloth and Bartowski. The different quantization levels target local inference, with varying performance trade-offs.
Mosquito: A 7.3M-Parameter Tiny Knowledge Model
- Mosquito, a compact 7.3M-parameter model for general knowledge tasks, has been released. Despite its small size, it is described as surprisingly capable, and a live demo is available on Hugging Face.
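For a sense of scale, here is a rough sketch of how much storage 7.3M parameters would need at a few common precisions. The function name and the nominal bit widths are assumptions for illustration, not details from the Mosquito release, and real model files add metadata and other overhead on top of the raw weights.

```python
# Back-of-the-envelope estimate of raw weight storage.
# Assumption: every parameter is stored at the same nominal bit width;
# real model files add metadata and may mix precisions.

MOSQUITO_PARAMS = 7_300_000  # parameter count quoted in the item above

# Nominal bit widths chosen for illustration only (hypothetical labels).
NOMINAL_BITS = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_size_mb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


def size_table(num_params: int) -> dict:
    """Map each nominal precision to its approximate size in MB."""
    return {
        name: round(weight_size_mb(num_params, bits), 2)
        for name, bits in NOMINAL_BITS.items()
    }


if __name__ == "__main__":
    for name, mb in size_table(MOSQUITO_PARAMS).items():
        print(f"{name}: ~{mb} MB")
```

Under these assumptions the weights come to roughly 14.6 MB at 16 bits per weight, which is why a model of this size can be served from a lightweight hosted demo.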
AI Audio & Multimedia
OpenAI Launches GPT Audio and GPT Audio Mini
- OpenAI has introduced two new text-to-speech models: GPT Audio (higher quality) and GPT Audio Mini (lower cost). Both are now available via OpenRouter with tiered pricing.
AI Community & Milestones
One-Year Anniversary of DeepSeek-R1
- The community is marking one year since DeepSeek-R1’s release and discussing how smaller, more efficient models have evolved over that time.