LLaDA2.0 & Flux 2 Break Barriers: 100B Models & 24GB VRAM Accessibility

New AI Models & Releases

LLaDA2.0 (100B/16B) Released: LLaDA2.0, the latest family of diffusion language models, ships in two MoE variants optimized for practical applications: a flash version with 100B total/6B active parameters and a mini version with 16B total/1B active parameters. Both models are available on Hugging Face, with llama.cpp support in progress.
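
For readers who want to try the checkpoints, a minimal loading sketch with Hugging Face transformers might look like the following. The repo id below and the need for trust_remote_code are assumptions based on how similar diffusion LMs are published; check the model cards for the exact ids.

```python
# Sketch: loading the mini variant from Hugging Face with transformers.
# The repo id "inclusionAI/LLaDA2.0-mini" is an assumption; diffusion LMs
# typically ship custom generation code, hence trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/LLaDA2.0-mini"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the 16B/1B mini within a single-GPU budget
    device_map="auto",
    trust_remote_code=True,
)
```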


Flux 2 Now Runs on 24GB VRAM: The Flux 2 image-generation model can now run on consumer GPUs with 24GB of VRAM, significantly broadening accessibility for users without datacenter-class hardware.
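
A common recipe for fitting a pipeline this large into a 24GB budget is bfloat16 weights plus CPU offloading of idle submodules. The sketch below uses diffusers in that pattern; the repo id is an assumption, and the exact pipeline class and memory footprint depend on the release.

```python
# Sketch: running Flux 2 within a 24GB VRAM budget via diffusers.
# The repo id "black-forest-labs/FLUX.2-dev" is an assumption; verify
# against the official release.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",  # assumed repo id
    torch_dtype=torch.bfloat16,      # halves memory vs. fp32 weights
)
pipe.enable_model_cpu_offload()      # keeps idle submodules in system RAM

image = pipe(
    prompt="a lighthouse at dawn, volumetric light",
    num_inference_steps=28,
    guidance_scale=4.0,
).images[0]
image.save("flux2_sample.png")
```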


AI Tools & Libraries

Unsloth FP8 Reinforcement Learning for Local Training: Unsloth’s latest release enables FP8 reinforcement learning on local hardware in under 5GB of VRAM, supporting models such as Qwen3-4B and Qwen3-1.7B. The FP8 path improves training speed and usable context length on GPUs with native FP8 support, such as the RTX 40/50 series and H100/B200.
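
As a rough sketch of what the loading step looks like, assuming Unsloth’s usual FastLanguageModel API: the FP8 flag name below is an assumption, so consult the release notes for the exact switch.

```python
# Sketch of loading a model for FP8 RL with Unsloth. The flag name
# load_in_fp8 is an assumption; FastLanguageModel.from_pretrained with
# model_name/max_seq_length follows Unsloth's documented pattern.
# FP8 kernels need hardware support (RTX 40/50 series, H100/B200).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B",  # assumed repo id
    max_seq_length=4096,
    load_in_fp8=True,               # assumed flag enabling the FP8 path
)
# From here the model would typically be handed to an RL trainer
# (e.g. TRL's GRPOTrainer) for the reinforcement learning loop.
```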


Splintr: High-Speed BPE Tokenizer in Rust: Splintr, a new Rust-based BPE tokenizer, delivers 3-4x faster single-text encoding and 10-12x faster batch encoding than OpenAI’s tiktoken. It ships with Python bindings and a streaming decoder for real-time LLM output.
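
A hypothetical usage sketch of the Python bindings follows. Every name below (Tokenizer, encode_batch, the streaming-decoder methods) is an assumption meant only to illustrate the advertised features, not Splintr’s actual API.

```python
# Hypothetical sketch of Splintr's Python bindings; all names here are
# assumptions illustrating the advertised features, not the real API.
from splintr import Tokenizer  # assumed binding name

tok = Tokenizer("cl100k_base")            # assumed tiktoken-style vocab name
ids = tok.encode("hello world")           # single-text encode (claimed 3-4x faster)
batch = tok.encode_batch(["foo", "bar"])  # batch encode (claimed 10-12x faster)

# A streaming decoder buffers partial multi-byte UTF-8 sequences so that
# token-by-token LLM output can be printed without garbled characters.
dec = tok.stream_decoder()                # assumed streaming API
for tid in ids:
    chunk = dec.feed(tid)                 # returns complete text, or ""
    if chunk:
        print(chunk, end="", flush=True)
```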


AI Research & Policy

White House Launches "The Genesis Mission": A new federal initiative, "The Genesis Mission," aims to accelerate AI research by integrating scientific datasets to train foundation models and develop AI agents. The project has sparked discussion about open-source implications and potential regulatory shifts.


Conversational AI Improvements

TEN Turn Detection for Natural Voice AI Interactions: TEN Turn Detection, an open-source project, reduces awkward interruptions in voice AI by more reliably detecting when a user has actually finished speaking. The model is part of the TEN Framework and is available on Hugging Face.
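
Since the model is published as a standard Hugging Face checkpoint, inference can be sketched with transformers as below. The repo id and the label set ("finished"/"unfinished"/"wait") are assumptions drawn from the project description; verify them against the model card.

```python
# Sketch: classifying whether an utterance is complete with the TEN turn
# detection model. Repo id and output labels are assumptions; check the
# model card for the exact usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TEN-framework/TEN_Turn_Detection"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Feed the user's partial utterance; the model emits a short label.
messages = [{"role": "user", "content": "So what I was thinking is"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=4)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
# assumed output here: "unfinished" (the speaker has not completed the turn)
```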