Ollama v0.21.0 adds Hermes agent and Gemma 4 MLX support
Ollama `v0.21.0` ships the Hermes agent with adaptive skill creation, Gemma 4 support on Apple Silicon via MLX, and GitHub Copilot CLI integration in `ollama launch`.
Developers on Apple Silicon can now run Gemma 4 locally with MLX acceleration, while the expanded `ollama launch` ecosystem makes it easier to wire up agentic coding tools like Hermes and GitHub Copilot CLI in a single command.
Ollama `v0.21.0` centers on two headline additions: the Hermes agent and expanded MLX support for Gemma 4 on Apple Silicon. Hermes, invoked with `ollama launch hermes`, is described as an agent that learns alongside the user by automatically creating skills tailored to their workflows, with particular emphasis on research and engineering use cases. On the hardware acceleration side, the MLX backend gains a full Gemma 4 integration including a dedicated text-only MLX runtime, mixed-precision quantization, improved capability detection, and a batch of new op wrappers covering `Conv2d`, `Pad`, activation functions, trigonometric ops, masked SDPA, and RoPE-with-freqs.
The `ollama launch` command received several improvements. GitHub Copilot CLI joins the list of supported coding agent integrations configurable in a single command. The `opencode` integration now writes its configuration inline rather than to a separate file, aligning it with how other integrations are handled. A behavioral fix ensures that `ollama launch` no longer rewrites config files or prompts for confirmation when the resolved model list already matches what is saved. Previously this was triggered by pressing → on a configured multi-model integration or by passing `--model` with the current primary model. Additional fixes: `ollama launch openclaw --yes` now correctly skips the channels configuration step in non-interactive setups, the Gemma 4 `nothink` renderer with the e2b-style prompt is restored, a Gemma 4 compiler error that broke Metal builds is resolved, macOS cross-compile issues with certain Xcode versions are fixed, and deprecation warnings during `go build` are suppressed.
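The "no rewrite when nothing changed" behavior can be illustrated with a minimal sketch. This is not Ollama's implementation (Ollama is written in Go, and its config layout is not described here). The function names `resolve_models` and `save_if_changed`, the JSON layout with a `models` key, and the model names are all hypothetical, chosen only to show the idea of comparing the resolved list against the saved one before writing.

```python
import json
import tempfile
from pathlib import Path


def resolve_models(saved, requested_primary=None):
    """Return the model list a launch would use (hypothetical logic).

    With no requested primary, the saved order is kept. Otherwise the
    requested model is moved to the front of the list.
    """
    if requested_primary is None:
        return list(saved)
    rest = [m for m in saved if m != requested_primary]
    return [requested_primary] + rest


def save_if_changed(path, resolved):
    """Write the config only if the resolved list differs from the saved one.

    Returns True when the file was written, False when it was left alone.
    """
    saved = []
    if path.exists():
        saved = json.loads(path.read_text()).get("models", [])
    if saved == resolved:
        return False  # nothing changed: no rewrite, no confirmation needed
    path.write_text(json.dumps({"models": resolved}, indent=2))
    return True


if __name__ == "__main__":
    # Placeholder model names and a throwaway config path.
    cfg = Path(tempfile.mkdtemp()) / "integration.json"
    save_if_changed(cfg, ["model-a", "model-b"])
    # Requesting the current primary resolves to the same list: no write.
    print(save_if_changed(cfg, resolve_models(["model-a", "model-b"], "model-a")))
    # Requesting a different primary changes the list: the file is rewritten.
    print(save_if_changed(cfg, resolve_models(["model-a", "model-b"], "model-b")))
```

In this sketch, passing the current primary model resolves to the list already on disk, so the write is skipped. This mirrors the `--model` case described above.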
Key facts
- Hermes agent launches via `ollama launch hermes` and automatically creates skills to fit user workflows, targeting research and engineering tasks.
- Gemma 4 is now supported on Apple Silicon via the MLX backend, including a text-only MLX runtime.
- The MLX backend gains mixed-precision quantization, better capability detection, and new op wrappers: `Conv2d`, `Pad`, activations, trig, masked SDPA, and RoPE-with-freqs.
- GitHub Copilot CLI integration added to `ollama launch` alongside existing coding agent integrations.
- `ollama launch opencode` now writes config inline instead of to a separate file.
- `ollama launch` no longer rewrites config or prompts when the resolved model list already matches the saved state.