
**Budget AI Powerhouse: GPT-OSS 120B Runs at 20t/s on $500 Mini PC, Qwen3-VL-30B Smashes Speed Records**

Local and Specialized AI Models

GPT-OSS 120B Running at 20t/s with an AMD M780 iGPU Mini PC: A low-cost mini PC with an AMD M780 iGPU and 96GB of DDR5 RAM ran GPT-OSS 120B at roughly 20 tokens per second. The author detailed the hardware configuration and software optimizations, showing that large models can run on affordable hardware.
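To see why a result like this is plausible on an iGPU, a rough bandwidth-bound estimate helps: during decoding, each generated token requires reading the model's active weights from memory once, so memory bandwidth caps the token rate. The sketch below is illustrative only. The function name and the example values (about 5 billion active parameters per token, 4-bit weights, 80 GB/s of memory bandwidth) are assumptions chosen for the arithmetic, not figures from the post.

```python
def decode_tokens_per_second(active_params: float,
                             bits_per_weight: float,
                             bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a memory-bandwidth-bound model.

    Assumes every generated token reads all active weights exactly once
    and ignores KV-cache traffic, compute time, and other overheads.
    """
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical inputs, not measurements from the post:
# 5e9 active parameters, 4 bits per weight, 80 GB/s memory bandwidth.
estimate = decode_tokens_per_second(5e9, 4, 80)
print(f"Upper bound: {estimate:.1f} tokens/s")
```

Real throughput lands below this ceiling because of the overheads the estimate ignores, so an observed rate in the tens of tokens per second is consistent with this kind of back-of-the-envelope bound.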

Qwen3-VL-30B-A3B with vLLM: The Qwen3-VL-30B-A3B model, served with vLLM, achieved high token throughput on an H100 PCIe GPU. The author discussed the setup and performance metrics, showcasing the model's efficiency and speed in processing images and text.

GPT-1 Thinking 2.6m: A modified GPT-1 model, enhanced with differential and sparse attention mechanisms, was fine-tuned for improved performance. The author shared benchmarks and performance metrics, demonstrating the model’s capabilities and improvements over the original GPT-1.

Research and Academic Applications

Gemini Deepthink and GPT-5-Pro in Combinatorics: Two advanced AI models, Gemini Deepthink and GPT-5-Pro, were used to solve niche problems in combinatorics. The author built an open-source archive for AI-assisted research and provided links to the archive and related papers.

AI Planning and Agentic Architecture

Brain-Inspired Agentic Architecture: A research paper from Nature discusses a brain-inspired agentic architecture aimed at improving planning with large language models (LLMs). This architecture mimics the human brain’s structure, suggesting advancements in AI’s ability to plan and make decisions more effectively.

Local Models and Accuracy

Closed Frontier vs Local Models: The post discusses the performance gap between closed frontier AI models and local consumer models, highlighting a graph of how the accuracy of various models has progressed over time. The comparison helps gauge how far local models have advanced and how accessible capable AI has become.