Ollama v0.30.9 fixes single-token output bug in coding agents
Ollama `v0.30.9` adds Cohere2Moe architecture support, fixes a critical bug that caused `ollama launch claude` and other coding agent use cases to output only one token, and adds a context-window overflow error.
Score breakdown
The single-token output bug fix restores correct multi-token generation for `ollama launch claude` and other coding agent workflows that were broken in prior builds.
Ollama `v0.30.9` is a focused patch release that addresses several correctness issues and adds one new architecture. The most impactful fix resolves a bug where `ollama launch claude` and other coding agent or assistant workflows produced only a single token of output, which left agentic coding use cases effectively non-functional on affected builds. A second fix corrects the LFM2 parser and renderer for cases where the model did not emit thinking.
On the new-feature side, the release adds support for the Cohere2Moe architecture, expanding the range of models Ollama can run locally. It also introduces a guardrail that returns an explicit error when a single message is larger than the current context window, giving users clearer feedback rather than silent failure or truncation. The full diff is available against `v0.30.8`.
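The idea behind the oversized-message guardrail can be sketched on the client side as well. The snippet below is a minimal illustration, not Ollama's implementation: `ContextOverflowError`, `estimate_tokens`, and `check_message_fits` are hypothetical names, and the four-characters-per-token estimate is a rough assumption that a real tokenizer would replace. The `num_ctx` parameter is named after Ollama's context-window option.

```python
class ContextOverflowError(ValueError):
    """Hypothetical error type for this sketch; not Ollama's actual error."""


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token (rounded up).
    # Real tokenizers vary by model, so treat this as a coarse estimate.
    return max(1, (len(text) + 3) // 4)


def check_message_fits(message: str, num_ctx: int) -> int:
    """Return the estimated token count, or raise if it exceeds num_ctx."""
    tokens = estimate_tokens(message)
    if tokens > num_ctx:
        raise ContextOverflowError(
            f"message needs ~{tokens} tokens but the context window is {num_ctx}"
        )
    return tokens
```

Running a check like this before sending a request lets a caller fail fast with a readable message instead of relying on the server's response.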
Key facts
- Ollama `v0.30.9` adds support for the Cohere2Moe architecture.
- A bug causing `ollama launch claude` and other coding agent/assistant use cases to output only one token has been fixed.
- The LFM2 parser/renderer is fixed for cases where thinking was not emitted.
- Ollama now returns an error when a single message exceeds the current context window.
- The release is compared against `v0.30.8` in the full changelog.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 17, 2026 · 10:39 UTC.