15 May 2026 2 min read

AI-Assisted Apple M5 Exploit Unveiled as Poetiq Hits New Coding Performance SOTA

AI-Assisted Security

Mythos Preview Aids Apple M5 Kernel Exploit: Researchers used the AI tool Mythos Preview to help build the first public macOS kernel memory corruption exploit on Apple M5, completing it in five days. The result suggests that AI assistance lets small teams take on exploit development that previously demanded far more time and resources.

The first public macOS kernel memory corruption exploit on Apple M5 was built with Mythos Preview's help, and it only took 5 days.
- https://blog.calif.io/p/first-public-kernel-memory-corruption?referrer=https%3A%2F%2Freddit.com

Coding and Developer Tools

Poetiq Surpasses Opus 4.7 Using Self-Optimizing Harness: Poetiq reports a new state-of-the-art result in coding performance, pairing a self-optimizing harness with Gemini 3 Flash to outperform models such as Opus 4.7. The result points to the growing potential of recursive self-improvement systems for AI coding.

New SOTA: Poetiq uses self-optimizing harness to surpass e.g. Opus 4.7 with Gemini 3 Flash
- https://poetiq.ai/posts/recursive_self_improvement_coding
- https://www.reddit.com/gallery/1tdgnux

OpenAI Expands Codex Accessibility: OpenAI has introduced a feature that lets users work with Codex from anywhere, making the AI coding assistant more flexible. The update aims to streamline developer workflows by making the tool available across different environments.

Work with Codex from anywhere | OpenAI
- https://openai.com/index/work-with-codex-from-anywhere/

AI Research and Model Training

TinyForge-Zero Achieves Major Gains via Self-Correction: A researcher has developed TinyForge-Zero, a small model trained on its own mistakes that reached 80% on HumanEval and beat GPT-3.5 on math. The experiment suggests that self-correction training loops can meaningfully improve smaller language models.

I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math

TurboQuant Study Finds FP8 Remains the Best Default for KV-Cache Quantization: A new study of TurboQuant indicates that FP8 remains the most effective default for KV-cache quantization, balancing accuracy and performance. The research sheds light on the memory-efficiency trade-offs involved in optimizing AI model inference.

A First Comprehensive Study of TurboQuant: Accuracy and Performance
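
For readers who want a feel for the memory side of that trade-off, here is a rough back-of-the-envelope sketch. It is not taken from the study: the function name and the model shape (32 layers, 8 KV heads, head dimension 128, 8,192-token context) are hypothetical values chosen for illustration, and the estimate ignores any per-block scale factors or metadata that a real FP8 scheme stores.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch_size, bytes_per_element):
    """Estimate raw KV-cache size in bytes for one decoding batch.

    The leading factor of 2 accounts for storing both keys and values.
    Quantization scale factors and allocator overhead are ignored.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_element


# Hypothetical model shape, chosen only for illustration.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192, batch_size=1)

fp16_bytes = kv_cache_bytes(**shape, bytes_per_element=2)  # 16-bit cache entries
fp8_bytes = kv_cache_bytes(**shape, bytes_per_element=1)   # 8-bit cache entries

print(f"FP16 KV cache: {fp16_bytes / 2**20:.0f} MiB")
print(f"FP8  KV cache: {fp8_bytes / 2**20:.0f} MiB")
```

Under these assumptions, moving from 16-bit to 8-bit entries halves the raw cache size. Whether accuracy holds up at that width is the question a study like this one is meant to answer.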