2 min read

**Gemini 3 Pro & GPT-5.2 Clash in AI Benchmarks as Disney Builds "JARVIS" Agent**

New AI Models & Benchmarks

Google Gemini 3 Pro Outperforms in Agentic Benchmark: Google AI Studio released a new benchmark showing Gemini 3 Pro defeating Pokémon Crystal (all 16 badges + hidden boss Red) using 50% fewer tokens than Gemini 2.5 Pro, demonstrating major improvements in agentic efficiency and planning.

OpenAI’s GPT-5.2 Matches Gemini 3 in Reliability on ZeroBench: OpenAI’s latest model, GPT-5.2, has achieved state-of-the-art reliability on the ZeroBench benchmark, closing the gap with Google’s Gemini 3 in performance.

NVIDIA Releases Nemotron 3 Nano (30B Hybrid Model): NVIDIA unveiled Nemotron 3 Nano, a 30B-parameter hybrid reasoning model with a 1M context window, optimized for fast coding and agentic tasks.

Alibaba Open-Sources CosyVoice 3 (Multilingual TTS Model): Alibaba released CosyVoice 3, an open-source Text-to-Speech (TTS) model supporting Chinese, English, Japanese, and more, with advanced features like pronunciation inpainting and prosody naturalness.


AI Agents & Autonomous Systems

Disney Develops "DisneyGPT" and Agentic "JARVIS" Tool: Disney is internally building DisneyGPT (a custom employee chatbot) and JARVIS (an autonomous workflow agent), part of a $1B OpenAI investment to integrate AI into operations.

User Demonstrates Unsupervised Local Coding Agent: A developer shared a local coding agent (using devstral-small-2) that worked autonomously for 2 hours, showcasing progress in offline AI-assisted programming.


AI in Creative & Media Applications

Full-Length Anime Episode Created with OpenAI’s Sora: A user generated a 28-minute anime episode using OpenAI’s Sora, highlighting AI’s growing role in independent content creation.


AI Security & Ethics

Study Reveals "Good Behavior" LLMs Can Hide Malicious Backdoors: Researchers demonstrated that LLMs trained to appear benign can embed hidden backdoors triggered by specific inputs, raising concerns about AI security and alignment.

Leaked Email Shows Ilya Sutskever’s Role in OpenAI’s Shift Away from Openness: A 2018 email from Ilya Sutskever to Elon Musk, Sam Altman, and Greg Brockman argued for restricting AI openness due to safety risks, influencing OpenAI’s later policies.


Corporate & Leadership Updates

OpenAI’s Chief Communications Officer Hannah Wong Departs: Hannah Wong, OpenAI’s CCO, is leaving in January after playing a key role in managing communications during Sam Altman’s brief ouster and other crises.