Google has launched Gemma 4, a groundbreaking family of open-weight AI models built on Gemini 3 research and released under the permissive Apache 2.0 license. Announced on April 1, 2026, these models prioritize efficiency, advanced reasoning, and on-device deployment across smartphones, laptops, and servers. Gemma model variants have been downloaded more than 400 million times, fueling a vibrant developer ecosystem.
Model Variants and Specs
Gemma 4 comes in four sizes tailored for diverse hardware: Effective 2B (E2B) and Effective 4B (E4B) for edge devices, a 26B Mixture-of-Experts (MoE), and a 31B dense model. The smaller models offer 128K-token context windows, while the larger ones extend to 256K, enabling them to handle long documents and extended conversations.
Key specs include native multimodal support for images, audio, and video, fluency in 140+ languages, built-in function calling, and a “thinking” mode for step-by-step reasoning.
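To illustrate the function-calling spec above, here is a minimal sketch of how a tool declaration and request might be assembled. The schema mirrors the common OpenAI-style tool format; Gemma 4's exact wire format may differ, and the tool name and request shape here are illustrative assumptions, not the official API.

```python
import json

# Hypothetical tool declaration in a common JSON-schema style.
# Consult the official Gemma docs for the exact expected format.
weather_tool = {
    "name": "get_weather",  # illustrative tool name, not a real API
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def make_request(user_text: str, tools: list) -> dict:
    """Bundle a user turn with the tools the model may call."""
    return {
        "messages": [{"role": "user", "content": user_text}],
        "tools": tools,
    }

request = make_request("What's the weather in Oslo?", [weather_tool])
print(json.dumps(request, indent=2))
```

The model would respond either with plain text or with a structured call naming one of the declared tools, which the host application then executes.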
Benchmark Performance
The 31B model ranks #3 on the Arena AI text leaderboard, outperforming models 20x its size, while the 26B MoE secures #6. The 31B also excels in coding (LiveCodeBench v6: 80.0%), math (AIME 2026: 89.2%), and reasoning (GPQA Diamond: 84.3%).
| Benchmark | 31B | 26B MoE | E4B |
|---|---|---|---|
| MMLU Pro | 85.2% | 82.6% | 69.4% |
| GPQA Diamond | 84.3% | 82.3% | 58.6% |
| LiveCodeBench | 80.0% | 77.1% | 52.0% |
Key Features and Capabilities
Gemma 4 supports agentic workflows with multi-step planning, tool interaction, and offline code generation. Multimodal features handle object detection, OCR, document parsing, and UI understanding at variable resolutions. Native system prompts enhance structured interactions.
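One step of such an agentic loop can be sketched as follows: the model emits a tool call as JSON, the host executes it, and the result is fed back as a tool message. The tool registry and the JSON shape here are illustrative assumptions, not Gemma 4's exact format.

```python
import json

# Hypothetical host-side tool registry; real agents would register
# functions such as search, code execution, or file access.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
}

def dispatch(model_output: str) -> dict:
    """Parse a JSON tool call from the model and run the named tool,
    returning a tool-result message to append to the conversation."""
    call = json.loads(model_output)
    result = TOOLS[call["name"]](call["arguments"])
    return {"role": "tool", "name": call["name"], "content": str(result)}

reply = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(reply)  # {'role': 'tool', 'name': 'add', 'content': '5'}
```

In a full multi-step plan, the host would loop: send the tool result back to the model, let it decide the next action, and stop when it produces a final text answer.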
Real-World Applications
Businesses deploy Gemma 4 for on-device assistants, enterprise automation, legal document analysis, and generative design in engineering. It powers privacy-focused mobile apps, autonomous agents, and real-time translation without cloud dependency.
Availability and Access
Gemma 4 is available now on Google Cloud (Vertex AI, Cloud Run) and Hugging Face, and the weights can be downloaded for local use. It runs on Android devices, laptops, and GPUs under the Apache 2.0 license, which requires attribution but no royalties. Check ai.google.dev/gemma for models and docs.
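For local use, a run might look like the sketch below, using the Hugging Face `transformers` pipeline API. The model ID is an assumption for illustration; check ai.google.dev/gemma for the actual repository names, and note that running it requires the weights to be downloaded.

```python
def run_gemma(prompt: str, model_id: str = "google/gemma-4-4b-it") -> str:
    """Generate a reply from a locally downloaded Gemma checkpoint.

    The model_id above is hypothetical -- substitute the real repo name.
    Requires: pip install transformers accelerate (plus the weights).
    """
    from transformers import pipeline  # lazy import: heavy dependency

    generator = pipeline("text-generation", model=model_id,
                         device_map="auto")
    messages = [{"role": "user", "content": prompt}]
    out = generator(messages, max_new_tokens=128)
    # Chat pipelines return the conversation; the last message is the reply.
    return out[0]["generated_text"][-1]["content"]
```

The same messages format works against Vertex AI-hosted endpoints, so prototypes built locally can move to the cloud with minimal changes.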