30 Mar 2026

Cursor and JetBrains Launch New Coding Tools as Claude Dominates Security Research

Coding & Development Tools

Cursor Composer 2 Real-Time RL: Cursor now updates Composer 2 every five hours using real-time reinforcement learning from user feedback. The short cycle lets the team ship improved model versions frequently while working to contain problems such as reward hacking.

Cursor is continually self improving Composer 2 every 5 hours in real time
- https://xcancel.com/cursor_ai/status/2037205514975629493
- https://cursor.com/blog/real-time-rl-for-composer

JetBrains Air Multi-Agent Coding: JetBrains has introduced "Air," a new tool designed for multi-agent coding workflows. The linked article asks whether it is a real step forward for AI-assisted development or just more noise in a crowded AI coding space.

JetBrains Air: The Future of Multi-Agent Coding, or Just More AI Noise?
- https://medium.com/vibecodingpub/jetbrains-air-the-future-of-multi-agent-coding-or-just-more-ai-noise-5450e648a962?sk=32f28fca4c89f98067729a5bd3c550f5

Local LLM Performance & Optimization

Kernel-Anvil Performance Boost for AMD: The kernel-anvil tool reports a 2x decode speedup for llama.cpp on AMD GPUs by auto-tuning kernels to each model's shapes. It profiles the layer shapes in a GGUF file and generates a tuned configuration without recompiling.

kernel-anvil: 2x decode speedup on AMD by auto-tuning llama.cpp kernels per model shape
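
The per-shape tuning idea can be sketched in a few lines. This is an illustrative sketch only, not kernel-anvil's code or API: the `autotune` function, the `(rows_per_block, threads)` configuration pair, and the `toy_cost` model are all hypothetical. A real tuner would time an actual GPU kernel instead of calling a made-up cost function.

```python
from typing import Callable, Dict, Iterable, Tuple

Shape = Tuple[int, int]   # (rows, cols) of a weight matrix
Config = Tuple[int, int]  # hypothetical (rows_per_block, threads) launch parameters


def autotune(
    shapes: Iterable[Shape],
    candidates: Iterable[Config],
    measure: Callable[[Shape, Config], float],
) -> Dict[Shape, Config]:
    """Pick the lowest-cost candidate config for every distinct shape."""
    candidate_list = list(candidates)
    best: Dict[Shape, Config] = {}
    for shape in sorted(set(shapes)):  # tune each distinct shape only once
        best[shape] = min(candidate_list, key=lambda cfg: measure(shape, cfg))
    return best


def toy_cost(shape: Shape, cfg: Config) -> float:
    """Made-up cost model standing in for a real kernel timing run."""
    rows, cols = shape
    rows_per_block, threads = cfg
    row_blocks = -(-rows // rows_per_block)  # ceiling division
    col_steps = -(-cols // threads)
    # Work grows with the number of block steps; oversized blocks pay a penalty.
    return row_blocks * col_steps + 0.01 * rows_per_block * threads


if __name__ == "__main__":
    layer_shapes = [(4096, 4096), (8, 64), (4096, 4096)]
    configs = [(1, 64), (4, 128), (8, 256)]
    for shape, cfg in autotune(layer_shapes, configs, toy_cost).items():
        print(shape, "->", cfg)
```

Because the result is a plain shape-to-config table, it could be saved and loaded at run time, which is one way a tool could avoid recompiling. Whether kernel-anvil stores its results this way is an assumption.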

Qwen3.5-397B Optimization on Apple Silicon: An autoresearch project ran 36 experiments to get the Qwen3.5-397B model running at 20.34 tokens per second on an M5 Max MacBook Pro. The gains came from a combination of SSD I/O improvements, temporal expert prediction, and specific quantization techniques.

Autoresearch on Qwen3.5-397B, 36 experiments to reach 20.34 tok/s on M5 Max, honest results

Voxtral-4B-TTS Quantization: Mistral’s Voxtral-4B-TTS model has been quantized to int4 by Claude, enabling near-lossless speech synthesis at 57 fps on consumer hardware such as the RTX 3090. HQQ quantization and static KV caching bring memory usage down to 3.8 GB of VRAM.

Claude quantized Voxtral-4B-TTS to int4 — 57 fps on RTX 3090, 3.8 GB VRAM, near-lossless quality
- https://github.com/TheMHD1/voxtral-int4
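
For a rough sense of scale, the weight footprint at different precisions can be estimated with simple arithmetic. The sketch below assumes exactly 4 billion parameters (inferred from the "4B" in the model name) and ignores quantization metadata such as scales and zero points. The function name is hypothetical.

```python
def weight_gigabytes(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes, ignoring metadata."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 4e9  # assumption: "4B" taken as exactly 4 billion parameters
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_gigabytes(params, bits):.1f} GB")
```

Under these assumptions, int4 weights alone come to about 2 GB, compared with about 8 GB at 16-bit precision. The reported 3.8 GB total is higher than the weight-only estimate. The source does not break the figure down, so the remainder presumably covers items such as the static KV cache, activations, and any components left unquantized.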

KV Rotation Performance Recovery: Developers found that the existing Q8 KV cache quantization in llama.cpp significantly lowers scores on the AIME25 benchmark, and that most of the loss can be recovered by rotating the values before quantizing them. The finding is documented in a comment on a recent pull request.

In the recent kv rotation PR it was found that the existing q8 kv quants tank performance on AIME25, but can be recovered mostly with rotation
- https://github.com/ggml-org/llama.cpp/pull/21038#issuecomment-4150413357
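
Why rotation can help is easiest to see on a toy block that contains one outlier. The sketch below is an illustration under stated assumptions, not the code from the pull request: it uses an orthonormal Hadamard transform as the rotation and a simple absmax int8 scheme over a 32-value block. The block contents and all function names are made up for the example.

```python
import math


def hadamard(v):
    """Orthonormal fast Walsh-Hadamard transform; len(v) must be a power of two.

    The normalized transform is its own inverse, so applying it twice
    returns the original vector (up to floating-point error).
    """
    out = list(v)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [x / norm for x in out]


def quantize_int8(v):
    """Symmetric absmax quantization of one block to int8, then dequantize."""
    scale = max(abs(x) for x in v) / 127.0
    if scale == 0.0:
        return list(v)
    return [round(x / scale) * scale for x in v]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_block():
    """Toy block: 31 small values plus a single large outlier."""
    block = [0.05 * ((i * 7) % 11 - 5) for i in range(32)]
    block[3] = 10.0
    return block


def compare(block):
    """Return (error without rotation, error with rotation)."""
    plain = quantize_int8(block)
    # Rotate, quantize in the rotated space, then rotate back.
    rotated = hadamard(quantize_int8(hadamard(block)))
    return squared_error(block, plain), squared_error(block, rotated)


if __name__ == "__main__":
    err_plain, err_rot = compare(make_block())
    print(f"squared error, plain int8:   {err_plain:.5f}")
    print(f"squared error, rotated int8: {err_rot:.5f}")
```

Without rotation, the outlier sets the quantization step for the whole block, so the small values are rounded coarsely. The rotation spreads the outlier's energy across all 32 positions, which shrinks the largest magnitude and therefore the step size. How closely this matches the rotation used in llama.cpp is an assumption.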

AI Research & Capabilities

Claude Outperforms Top Security Researchers: Security researcher Nicholas Carlini reports that Claude has found vulnerabilities in the Linux kernel, in Ghost, and in smart contracts that had escaped human detection. Carlini says Claude is now a better security researcher than he is and noted high expectations for the rumored "Mythos" model.

Nicholas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 million from exploiting smart contracts, and found vulnerabilities in Linux and Ghost

Real-World Applications

Agentic AI in Logistics: A new implementation uses the OpenAI Responses API to automate weather-based shipping decisions for supply chain management. The agent handles routine logistics planning without manual intervention, a practical example of autonomous decision-making in an industrial setting.

How We Used Agentic AI to Put Weather-Based Shipping Decisions on Autopilot