Google has launched Gemma 4, a groundbreaking family of open-weight AI models built on Gemini 3 research and released under the permissive Apache 2.0 license. Announced on April 1, 2026, these models prioritize efficiency, advanced reasoning, and on-device deployment across smartphones, laptops, and servers. Gemma model variants have been downloaded more than 400 million times, fueling a vibrant developer ecosystem.
Model Variants and Specs
Gemma 4 comes in four sizes tailored for diverse hardware: Effective 2B (E2B) and Effective 4B (E4B) for edge devices, a 26B Mixture-of-Experts (MoE), and a 31B dense model. The smaller models offer 128K-token context windows, while the larger ones extend to 256K, enabling them to handle long documents and extended conversations.
Key specs include native multimodal support for images, audio, and video, fluency in 140+ languages, built-in function calling, and a “thinking” mode for step-by-step reasoning.
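To illustrate the function-calling spec above, here is a minimal sketch of how a tool declaration and request might be assembled. The schema mirrors the common OpenAI-style tool format; Gemma 4's exact wire format may differ, and the tool name and request shape here are illustrative assumptions, not the official API.

```python
import json

# Hypothetical tool declaration in a common JSON-schema style.
# Consult the official Gemma docs for the exact expected format.
weather_tool = {
    "name": "get_weather",  # illustrative tool name, not a real API
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def make_request(user_text: str, tools: list) -> dict:
    """Bundle a user turn with the tools the model may call."""
    return {
        "messages": [{"role": "user", "content": user_text}],
        "tools": tools,
    }

request = make_request("What's the weather in Oslo?", [weather_tool])
print(json.dumps(request, indent=2))
```

The model would respond either with plain text or with a structured call naming one of the declared tools, which the host application then executes.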
Benchmark Performance
The 31B model ranks #3 on the Arena AI text leaderboard, outperforming models 20x its size, while the 26B MoE secures #6. The 31B also excels in coding (LiveCodeBench v6: 80.0%), math (AIME 2026: 89.2%), and reasoning (GPQA Diamond: 84.3%).
| Benchmark | 31B | 26B MoE | E4B |
|---|---|---|---|
| MMLU Pro | 85.2% | 82.6% | 69.4% |
| GPQA Diamond | 84.3% | 82.3% | 58.6% |
| LiveCodeBench | 80.0% | 77.1% | 52.0% |
Key Features and Capabilities
Gemma 4 supports agentic workflows with multi-step planning, tool interaction, and offline code generation. Multimodal features handle object detection, OCR, document parsing, and UI understanding at variable resolutions. Native system prompts enhance structured interactions.
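One step of such an agentic loop can be sketched as follows: the model emits a tool call as JSON, the host executes it, and the result is fed back as a tool message. The tool registry and the JSON shape here are illustrative assumptions, not Gemma 4's exact format.

```python
import json

# Hypothetical host-side tool registry; real agents would register
# functions such as search, code execution, or file access.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
}

def dispatch(model_output: str) -> dict:
    """Parse a JSON tool call from the model and run the named tool,
    returning a tool-result message to append to the conversation."""
    call = json.loads(model_output)
    result = TOOLS[call["name"]](call["arguments"])
    return {"role": "tool", "name": call["name"], "content": str(result)}

reply = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(reply)  # {'role': 'tool', 'name': 'add', 'content': '5'}
```

In a full multi-step plan, the host would loop: send the tool result back to the model, let it decide the next action, and stop when it produces a final text answer.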
Real-World Applications
Businesses deploy Gemma 4 for on-device assistants, enterprise automation, legal document analysis, and generative design in engineering. It powers privacy-focused mobile apps, autonomous agents, and real-time translation without cloud dependency.
Availability and Access
Gemma 4 is available now on Google Cloud (Vertex AI, Cloud Run) and Hugging Face, and the weights can be downloaded for local use. It runs on Android devices, laptops, and GPUs under the Apache 2.0 license, which requires attribution but no royalties. Check ai.google.dev/gemma for models and docs.
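For local use, a run might look like the sketch below, using the Hugging Face `transformers` pipeline API. The model ID is an assumption for illustration; check ai.google.dev/gemma for the actual repository names, and note that running it requires the weights to be downloaded.

```python
def run_gemma(prompt: str, model_id: str = "google/gemma-4-4b-it") -> str:
    """Generate a reply from a locally downloaded Gemma checkpoint.

    The model_id above is hypothetical -- substitute the real repo name.
    Requires: pip install transformers accelerate (plus the weights).
    """
    from transformers import pipeline  # lazy import: heavy dependency

    generator = pipeline("text-generation", model=model_id,
                         device_map="auto")
    messages = [{"role": "user", "content": prompt}]
    out = generator(messages, max_new_tokens=128)
    # Chat pipelines return the conversation; the last message is the reply.
    return out[0]["generated_text"][-1]["content"]
```

The same messages format works against Vertex AI-hosted endpoints, so prototypes built locally can move to the cloud with minimal changes.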