AI-Assisted Apple M5 Exploit Unveiled as Poetiq Hits New Coding Performance SOTA
AI-Assisted Security
Mythos Preview Facilitates Apple M5 Kernel Exploit: Researchers used the AI tool Mythos Preview to develop the first public kernel memory corruption exploit for the Apple M5 chip in just five days. The result suggests that AI-assisted research lets small teams complete complex security work that previously demanded far more time and resources.
Coding and Developer Tools
Poetiq Surpasses Opus 4.7 Using Self-Optimizing Harness: Poetiq has set a new state of the art in coding performance by pairing Gemini 3 Flash with a self-optimizing harness, outperforming models such as Opus 4.7. The result highlights the growing potential of recursive self-improvement systems in AI coding.
OpenAI Expands Codex Accessibility: OpenAI has introduced a feature that lets users work with Codex from anywhere. The update is intended to streamline developer workflows by making the AI coding assistant accessible across different environments.
AI Research and Model Training
TinyForge-Zero Achieves Major Gains via Self-Correction: A researcher has developed TinyForge-Zero, a small model trained on its own mistakes that reached 80% on HumanEval and outperformed GPT-3.5 on math tasks. The experiment suggests that self-correction training loops can meaningfully improve the capabilities of smaller language models.
- I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math
- https://github.com/ranausmanai/tinyforge-zero
- https://huggingface.co/ranausmans/tinyforge-zero-qwen25-14b-lora
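To make the idea of training on a model's own mistakes concrete, here is a minimal sketch of how failed and passing attempts at the same task could be paired into correction examples. It is not TinyForge-Zero's actual pipeline: the `Task` and `CorrectionExample` types, the `collect_corrections` helper, and the hard-coded candidate strings (standing in for model samples) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    prompt: str
    func_name: str
    tests: List[Tuple[tuple, object]]  # (args, expected result) pairs


@dataclass
class CorrectionExample:
    prompt: str
    failed_attempt: str
    fixed_attempt: str


def passes(code: str, task: Task) -> bool:
    """Run a candidate against the task's tests; any exception counts as failure.

    Note: exec() on untrusted model output needs a real sandbox in practice.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        fn = namespace[task.func_name]
        return all(fn(*args) == expected for args, expected in task.tests)
    except Exception:
        return False


def collect_corrections(task: Task, candidates: List[str]) -> List[CorrectionExample]:
    """Pair every failing candidate with the first passing one (hypothetical scheme)."""
    failed = [c for c in candidates if not passes(c, task)]
    fixed = next((c for c in candidates if passes(c, task)), None)
    if fixed is None:
        return []  # nothing to learn from if no candidate passed
    return [CorrectionExample(task.prompt, f, fixed) for f in failed]


# Hard-coded strings stand in for samples drawn from a model.
task = Task(
    prompt="Write add(a, b) that returns the sum of a and b.",
    func_name="add",
    tests=[((1, 2), 3), ((0, 0), 0)],
)
candidates = [
    "def add(a, b):\n    return a - b",  # wrong: subtracts
    "def add(a, b):\n    return a + b",  # correct
]
examples = collect_corrections(task, candidates)
```

Each resulting example holds the prompt, a failed attempt, and a passing attempt, which is one plausible shape for fine-tuning data in a self-correction loop.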
TurboQuant Study Identifies FP8 as Ideal Quantization Standard: A new study of TurboQuant indicates that FP8 remains the most effective default for KV-cache quantization, offering the best balance of accuracy and performance. The research clarifies the memory-efficiency trade-offs involved in optimizing AI models.
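For a sense of the memory at stake, the back-of-the-envelope sketch below compares KV-cache size at 16-bit and 8-bit precision. The model dimensions are hypothetical and not taken from the study, and the formula ignores any per-block scaling factors a quantized format may store alongside the values.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int,
    bytes_per_element: int,
) -> int:
    """Size of the KV cache: one key and one value vector per layer, head, and token."""
    keys_and_values = 2
    return (
        keys_and_values
        * num_layers
        * num_kv_heads
        * head_dim
        * seq_len
        * batch_size
        * bytes_per_element
    )


# Hypothetical model shape, chosen only for illustration.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192, batch_size=1)

fp16_bytes = kv_cache_bytes(**shape, bytes_per_element=2)
fp8_bytes = kv_cache_bytes(**shape, bytes_per_element=1)

print(f"FP16 KV cache: {fp16_bytes / 2**30:.2f} GiB")  # FP16 KV cache: 1.00 GiB
print(f"FP8 KV cache:  {fp8_bytes / 2**30:.2f} GiB")   # FP8 KV cache:  0.50 GiB
```

Halving the bytes per element halves the cache, so the open question a study like this addresses is how much accuracy each format gives up for that saving.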