Ollama v0.30.6 adds Gemma 4 QAT weights and Oh My Pi integration
Ollama `v0.30.6` ships Gemma 4 QAT model variants for reduced memory usage, integrates `ollama launch omp` with the Oh My Pi AI coding agent, and improves Apple Silicon quantization via NVFP4 global scale for MLX embedding layers.
Running large Gemma 4 models locally becomes more practical with QAT variants that cut memory overhead, while the Oh My Pi integration extends Ollama's reach directly into IDE-based agentic coding workflows.
Ollama `v0.30.6` expands its model library with Gemma 4 QAT (Quantization-Aware Training) weights across five size variants: `gemma4:e2b-it-qat`, `gemma4:e4b-it-qat`, `gemma4:12b-it-qat`, `gemma4:26b-a4b-it-qat`, and `gemma4:31b-it-qat`. According to the release notes, QAT dramatically reduces memory requirements and maximizes on-device performance compared to standard weights, making the Gemma 4 family more accessible on consumer hardware.
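To make the memory claim concrete, here is a rough back-of-the-envelope sketch of weight memory at different bit widths. It rests on assumptions the release notes do not state: that the `12b` and `31b` in the tag names are parameter counts in billions, that the standard weights are stored at 16 bits, and that the QAT variants are stored at roughly 4 bits per weight. The helper `weight_memory_gib` is hypothetical and ignores the KV cache, activations, and quantization scale overhead.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed parameter counts, read off the tag names (not confirmed by the source).
assumed_sizes = {"gemma4:12b-it-qat": 12, "gemma4:31b-it-qat": 31}

for tag, params in assumed_sizes.items():
    full = weight_memory_gib(params, 16)   # assumed 16-bit baseline
    quant = weight_memory_gib(params, 4)   # assumed ~4-bit QAT weights
    print(f"{tag}: ~{full:.1f} GiB at 16-bit vs ~{quant:.1f} GiB at 4-bit")
```

Under these assumptions the weights shrink by a factor of four; actual downloads may differ depending on the real bit width and on which layers stay in higher precision.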
On the tooling side, the `ollama launch omp` command now integrates with Oh My Pi (`omp.sh`), described as an AI coding agent with IDE integration. Additionally, MLX embedding layers on Apple Silicon have been updated to use NVFP4 global scale, improving quantization accuracy for that backend.
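As background on what a global scale does, the following is a simplified simulation of NVFP4-style two-level scaling, not MLX's or Ollama's actual implementation. NVFP4 stores 4-bit E2M1 values in blocks of 16, with one FP8 (E4M3) scale per block and one higher-precision scale per tensor. The sketch assumes the global scale is chosen as `amax / (6 * 448)`, and it models only the range limit of the FP8 block scale, not its rounding. The names `round_to_e2m1` and `fake_quantize` are hypothetical.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (the sign is a separate bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
FP4_MAX = 6.0          # largest E2M1 magnitude
FP8_E4M3_MAX = 448.0   # largest finite FP8 E4M3 value
BLOCK_SIZE = 16        # NVFP4 block length


def round_to_e2m1(x: float) -> float:
    """Round x to the closest E2M1 value, saturating at +/-6."""
    magnitude = min(E2M1_GRID, key=lambda g: abs(g - abs(x)))
    return math.copysign(magnitude, x)


def fake_quantize(values, use_global_scale=True):
    """Quantize then dequantize a flat list, returning reconstructed floats."""
    tensor_amax = max((abs(v) for v in values), default=0.0)
    if use_global_scale and tensor_amax > 0:
        # Assumed convention: map the tensor maximum onto the top of the
        # combined FP4 x FP8 range.
        global_scale = tensor_amax / (FP4_MAX * FP8_E4M3_MAX)
    else:
        global_scale = 1.0

    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        block_amax = max(abs(v) for v in block)
        # The block scale is stored relative to the global scale and is
        # clamped to the FP8 range (FP8 rounding is not modeled here).
        block_scale = min(block_amax / FP4_MAX / global_scale, FP8_E4M3_MAX)
        step = block_scale * global_scale
        for v in block:
            q = round_to_e2m1(v / step) if step > 0 else 0.0
            out.append(q * step)
    return out
```

Calling `fake_quantize([10000.0] + [1.0] * 15, use_global_scale=False)` reconstructs the first element as 2688, because the block scale saturates at 448 and 6 × 448 = 2688. With the global scale enabled, the same element is recovered at approximately its original value. This illustrates the range problem that a per-tensor scale addresses.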
Key facts
- 01 Ollama v0.30.6 adds Gemma 4 QAT (Quantization-Aware Training) weights to the model library.
- 02 Five QAT model tags are available: `gemma4:e2b-it-qat`, `gemma4:e4b-it-qat`, `gemma4:12b-it-qat`, `gemma4:26b-a4b-it-qat`, and `gemma4:31b-it-qat`.
- 03 QAT weights are described as dramatically reducing memory requirements and maximizing on-device performance.
- 04 `ollama launch omp` now integrates with Oh My Pi, an AI coding agent with IDE integration.
- 05 MLX embedding layers now use NVFP4 global scale for improved quantization on Apple Silicon.
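For readers who want to try one of the QAT tags programmatically, here is a minimal sketch that targets Ollama's local REST endpoint `/api/generate`. It assumes a local Ollama server on the default port 11434 and that the chosen tag has already been downloaded (for example with `ollama pull`). The helper names `build_generate_request` and `generate` are hypothetical, and the tag used in the usage comment is simply one of the five listed above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Usage (requires a running Ollama server with the model pulled):
# print(generate("gemma4:e4b-it-qat", "Explain QAT in one sentence."))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.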
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 8, 2026 · 15:36 UTC.