DeepMind's Gemma 4 runs on phones and a Nintendo Switch
Two Minute Papers covers Google DeepMind's Gemma 4, a free and open family of models whose smallest variants run on phones and even a first-generation Nintendo Switch without an internet connection.
Developers building agentic or AI-assisted apps can deploy Gemma 4 locally, on phones or low-end hardware, which removes the dependence on cloud services and the risk of losing access to a subscription.
Two Minute Papers, hosted by Dr. Károly Zsolnai-Fehér, frames Google DeepMind's Gemma 4 release as a direct answer to the risks of relying on proprietary AI subscriptions, citing reports that some Claude users lost access due to "heavy workloads" as a motivating example. Gemma 4 is described as a free and open family of models whose smallest variants require only a few gigabytes of memory, so they can run on consumer phones without an internet connection and even on a first-generation Nintendo Switch. Within days of release, community members had built offline translation apps, summarization tools, and real-time in-browser image classification demos, and fine-tuning work was publicly available.
The video highlights four surprising findings about Gemma 4. Most notably, the larger 31B parameter model is a dense model, meaning it activates all of its parameters at inference time, yet it ranked #3 among open models, outperformed some models 10 times its size, and remained competitive with some models 20 times its size on certain benchmarks. This contrasts with the mixture-of-experts (MoE) architecture that dominates many modern large models, which routes each input to only a subset of specialized sub-networks (typically 2 to 8 experts) to keep inference costs manageable. The strong performance of a dense model at this scale is presented as a genuine surprise.
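To make the dense versus MoE contrast concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not Gemma 4's code or any specific model's router: the expert count (16), the number of active experts (2), the random router scores, and all function names are assumptions chosen for the example.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def active_fraction(num_experts, k):
    """Fraction of expert parameters used per input (ignores shared layers)."""
    return k / num_experts


# Hypothetical configuration, not taken from any real model.
NUM_EXPERTS = 16
TOP_K = 2

rng = random.Random(0)  # fixed seed so the run is repeatable
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = top_k_route(scores, TOP_K)

print("experts used for this input:", sorted(routing))
print("MoE active fraction of expert parameters:", active_fraction(NUM_EXPERTS, TOP_K))
print("dense active fraction:", 1.0)
```

Under these assumed numbers, the MoE layer touches one eighth of its expert parameters per input, while a dense model such as the 31B Gemma 4 variant uses all of its parameters every time.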
Key facts
1. Gemma 4 is a free and open family of models released by Google DeepMind.
2. The smallest Gemma 4 variants require only a few gigabytes of memory and run on phones without an internet connection.
3. The 2 billion parameter Gemma 4 model runs on a first-generation Nintendo Switch.
4. Community members built offline translation, summarization, and real-time browser-based image classification apps within days of release.
5. The 31B Gemma 4 model ranked #3 among open models and beat some models 10 times its size.
6. The 31B model is a dense model, not a mixture-of-experts (MoE) architecture.
7. Fine-tuning support for Gemma 4 was already publicly available shortly after release.
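The "few gigabytes" figure can be sanity-checked with back-of-the-envelope arithmetic: weight memory is roughly the parameter count times the bytes stored per parameter. The sketch below assumes exactly 2 billion and 31 billion parameters and three common storage precisions (16, 8, and 4 bits). The source does not say which precision Gemma 4 ships in, so those precisions and the helper name `weight_memory_gib` are assumptions for illustration. The estimate covers weights only and leaves out activations and other runtime overhead.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory for model weights alone, in GiB (no activations or caches)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)


# Parameter counts taken from the summary; treated as exact for simplicity.
PARAMS_2B = 2_000_000_000
PARAMS_31B = 31_000_000_000

# Assumed storage precisions; the actual Gemma 4 formats are not stated here.
for label, params in [("2B", PARAMS_2B), ("31B", PARAMS_31B)]:
    for bits in (16, 8, 4):
        gib = weight_memory_gib(params, bits)
        print(f"{label} model at {bits}-bit: {gib:.2f} GiB")
```

On these assumptions, a 2 billion parameter model needs about 3.7 GiB for its weights at 16-bit precision and under 1 GiB at 4-bit, which is consistent with the claim that the smallest variants fit on phones.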