Google has launched Gemini 3.1 Ultra, its most significant model release of 2026. The model features a massive 2-million token context window that works natively across text, image, audio, and video — without transcription intermediaries.
Unlike prior Gemini versions, 3.1 Ultra was designed from the ground up to reason across all modalities simultaneously. It also ships with a new sandboxed Code Execution tool, allowing the model to write, run, and test code mid-conversation.
Key Features of Gemini 3.1 Ultra
- 2M token context window — the largest of any publicly available model
- True native multimodal reasoning — text, image, audio, video processed together
- Sandboxed Code Execution — write, run and test code in real time
- Improved grounding — significantly reduced hallucinations on factual queries
Google also released Gemini 3.1 Flash-Lite alongside, an efficiency-focused variant targeting cost-sensitive deployments.
What This Means for Developers
The 2M token context window opens up use cases like full codebase analysis, long-document processing, and multi-hour video understanding in a single call. Combined with native multimodal reasoning, Gemini 3.1 Ultra positions Google as a strong contender against Claude Opus 4.7 and GPT-5.5 for enterprise AI workloads.
Release date: June 2026