
**Kimi Linear’s O(n) Breakthrough & Google’s Gemma Removal Shake AI Landscape**

New Models & Research

Kimi Linear: First Linear Attention Mechanism Outperforming Traditional Attention (O(n²) → O(n))
The paper introduces Kimi Linear, a linear attention architecture that reports up to 6× faster decoding at 1M-token context and superior accuracy on short-context, long-context, and RL tasks. The project open-sources the KDA kernel, vLLM integration, and pre-trained and instruction-tuned checkpoints.
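The O(n²) → O(n) claim rests on the standard linear-attention reordering: swap softmax for a feature map so that (QKᵀ)V can be computed as Q(KᵀV), whose cost grows linearly with sequence length. The sketch below is a generic NumPy illustration of that trick, not the released KDA kernel, which is a more elaborate, hardware-optimized design.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: materializes an (n x n) score matrix, so cost is O(n^2).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    # Generic linear attention: apply a positive feature map phi, then
    # reassociate (Q K^T) V as Q (K^T V). The (d x d) summary K^T V does not
    # grow with sequence length, so the whole pass is O(n).
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                    # (d, d) summary of keys and values
    Z = Qf @ Kf.sum(axis=0)          # per-query normalizer
    return (Qf @ KV) / (Z[:, None] + eps)

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) * 0.1 for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```

The two functions are not numerically equivalent; the point is the change in how the matrix products are associated, which is what turns quadratic decoding cost into linear cost.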


Qwen3-Max Thinking Released in Qwen Chat
Alibaba’s Qwen3-Max Thinking is now available in Qwen Chat, offering advanced reasoning capabilities. Users note the absence of public benchmarks and the need to select tools manually.


Qwen3 VL 30B (A3B) Praised for Image-Processing Efficiency
Users highlight the Qwen3 VL 30B (A3B) model’s speed and accuracy on image-based tasks such as inventory-list generation; a Docker-deployable project demo is planned.
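For a concrete sense of the workflow described above, here is a hedged sketch of sending a shelf photo to a locally served vision model through an OpenAI-compatible endpoint (as exposed by vLLM, for example). The base URL, model ID, image URL, and prompt are illustrative assumptions, not details from the post.

```python
from openai import OpenAI

# Assumed local OpenAI-compatible server (e.g. started with `vllm serve`).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-30B-A3B-Instruct",  # assumed model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/shelf_photo.jpg"}},
            {"type": "text",
             "text": "List every item visible in the photo as a JSON array "
                     "of {name, quantity} objects."},
        ],
    }],
    temperature=0.0,
)
print(response.choices[0].message.content)
```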


AI Infrastructure & Tools

Bifrost: Open-Source LLM Gateway for AI Agents (50× Faster than LiteLLM)
Bifrost is a high-performance gateway for LLM-powered agents, featuring ultra-low overhead, adaptive load balancing, and multi-provider support. The project seeks community feedback.
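The appeal of a gateway like this is that agents keep speaking a single API while the gateway handles provider routing, load balancing, and failover behind one endpoint. Below is a minimal sketch of that pattern, assuming an OpenAI-compatible gateway URL; the port, path, and model routing are illustrative, not Bifrost's documented configuration.

```python
from openai import OpenAI

# Point an ordinary OpenAI-style client at the gateway instead of a provider.
# Which upstream provider and API key actually serve the call is decided by
# the gateway's own configuration, not by this client.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-used-by-client")

reply = client.chat.completions.create(
    model="gpt-4o-mini",  # routed by the gateway to whichever provider is configured
    messages=[{"role": "user", "content": "Summarize today's AI news in one line."}],
)
print(reply.choices[0].message.content)
```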


Controversies & Policy

Google Removes Gemma from AI Studio After Defamation Accusations
Google pulled Gemma from AI Studio after Senator Marsha Blackburn accused the model of fabricating defamatory claims about her; Gemma remains downloadable via Hugging Face for local use.


AI Linguistics & Research

Study: Polish Named Most Effective Language for AI Prompting
A study suggests Polish outperforms other languages for AI prompting, attributing the result to its directness and low ambiguity, and has sparked discussion of the methodology and its implications.