18 May 2026 2 min read

Google Releases Gemma 4 While GPT-5.5 Achieves Autonomous Protein Folding Breakthrough

Frontier Models & Hardware

GPT-5.4 and GPT-5.5 Progress and Upcoming Public Release: Cerebras has announced that GPT-5.4 and GPT-5.5 are running internally on its chips and will be released to the public in the near future. In one demonstration of its capabilities, GPT-5.5 recently spent 150 hours autonomously improving protein folding models, achieving significant performance gains on the SimplexFold model.

Gemma 4 Releases & Fine-tunes

Google Announces Gemma 4 Series with Community Fine-tunes Already Emerging: Google's Jeff Dean has announced the release of Gemma 4, a model family ranging from edge-scale versions to a 124B-parameter Mixture of Experts (MoE) model. Meanwhile, the community has released "Gemma-4-Gembrain-31B-it-uncensored-heretic," a merged fine-tune designed to improve logical reasoning and creative prose while reducing refusal rates.

Coding Agents & Tooling

SmallCode Agent Achieves 87% Benchmark Score Using 4B Model: A new coding agent named SmallCode has outperformed major agents such as Cursor while using only a local 4B-parameter model. The system relies on compound tools, improvement loops, and code graphs to stay reliable on smaller hardware.

I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how
- https://github.com/Doorman11991/smallcode
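
The "improvement loop" idea mentioned above can be illustrated with a minimal sketch: generate a candidate, check it, and feed the failure message back into the next attempt. This is not SmallCode's actual code; the names `improvement_loop`, `fake_generate`, and `check_add` are hypothetical, and the "model" is a hard-coded stand-in.

```python
from typing import Callable, Optional, Tuple


def improvement_loop(
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask for a candidate, check it, and retry with the checker's feedback."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(task, feedback)
        ok, feedback = check(candidate)
        if ok:
            return candidate, round_no
    return None, max_rounds  # gave up after max_rounds attempts


# Hypothetical stand-in for a small model: the first attempt is buggy,
# and any attempt that receives feedback returns the corrected code.
def fake_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"


# Hypothetical checker: runs the candidate and reports a failure message.
def check_add(source: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, ""
    return False, f"add(2, 3) returned {result}, expected 5"


solution, rounds = improvement_loop(fake_generate, check_add, "write add(a, b)")
```

In a real agent the checker would typically be a test suite or linter, and the round limit keeps a small model from looping indefinitely on a task it cannot solve.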

Inference & Optimization

Multi-Token Prediction (MTP) Optimizations Drive High-Speed Local Inference: New optimizations in llama.cpp aim to eliminate logit copying during prompt decoding, further improving the Multi-Token Prediction (MTP) feature. Users are reporting large speedups, such as running the Qwen 3.6 27B model at up to 65 tokens per second on mid-range workstation GPUs.
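
To show why this kind of feature speeds up decoding, here is a toy draft-and-verify step, assuming MTP here means multi-token prediction: several tokens are drafted cheaply and the main model confirms them. This is a sketch of the general technique, not llama.cpp's implementation; `speculative_step` and `toy_target` are hypothetical names, and a real engine would verify all drafted tokens in one batched forward pass rather than one call per token.

```python
from typing import Callable, List


def speculative_step(
    target_next: Callable[[List[int]], int],
    draft_tokens: List[int],
    context: List[int],
) -> List[int]:
    """Return the tokens emitted by one greedy draft-and-verify step.

    Accepts the longest draft prefix that matches the target model's own
    greedy choice. On the first mismatch the target's token is used instead.
    If every draft token matches, one extra target token is appended, so a
    step always emits at least one token.
    """
    accepted: List[int] = []
    for tok in draft_tokens:
        expected = target_next(context + accepted)
        if tok != expected:
            accepted.append(expected)  # correct the draft and stop
            return accepted
        accepted.append(tok)
    accepted.append(target_next(context + accepted))  # bonus token
    return accepted


# Hypothetical target "model": the next token is the last token plus one, mod 10.
def toy_target(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10


partly_right = speculative_step(toy_target, [4, 5, 9], [1, 2, 3])
all_right = speculative_step(toy_target, [4, 5, 6], [1, 2, 3])
```

The more draft tokens the target accepts per step, the fewer full passes are needed per generated token, which is where the throughput gain comes from.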