Nocodo targets full-stack app generation with sub-1B LLMs
Nocodo is an experimental, work-in-progress coding agent framework designed to build and deploy full-stack CRUD apps using small (<10B) and tiny (<1B) local LLMs, coordinated through a multi-agent pipeline of specialist roles.
Nocodo is notable as an attempt to push multi-agent, full-stack code generation down to sub-gigabyte models running entirely on local infrastructure, a constraint that requires deliberate architectural choices the project explicitly documents.
Nocodo is an experimental, work-in-progress GitHub project by brainless that aims to build a full coding agent pipeline capable of running on small (<10B) and tiny (<1B) LLMs entirely on local infrastructure. Its long-term vision is to allow people who can only describe what they want in plain language to get full-stack CRUD apps built, deployed, and managed — covering auth, permissions, build pipelines, and VPS or managed cloud — from a desktop app powered by sub-gigabyte models.
The architecture separates concerns into two layers. A coordination layer of higher-level agents — Product Owner, Project Manager, and Engineering Manager — handles user conversation, requirements intake, epic and task creation, and technical review before work reaches specialists. Specialist coding agents (Backend, Frontend, DB Engineer, UI Designer, and Rust Engineer) then execute tasks via a state machine that moves work through `draft → needs_technical_shaping → ready → in_progress → done`. The coordination layer is described as more mature than the coding agents, and the current focus is closing that gap.
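The task lifecycle above can be sketched as a small state machine. This is a minimal illustration, not nocodo's actual implementation: the state names come from the summary, while `TaskState`, `next_state`, `advance`, and the rule that a task may only move to the immediately following state are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class TaskState(Enum):
    """Task states in the order listed in the summary."""
    DRAFT = "draft"
    NEEDS_TECHNICAL_SHAPING = "needs_technical_shaping"
    READY = "ready"
    IN_PROGRESS = "in_progress"
    DONE = "done"


# Assumption: transitions are strictly linear, following definition order.
_ORDER = list(TaskState)


def next_state(state: TaskState) -> Optional[TaskState]:
    """Return the state that follows `state`, or None once the task is done."""
    index = _ORDER.index(state)
    return _ORDER[index + 1] if index + 1 < len(_ORDER) else None


def advance(current: TaskState, target: TaskState) -> TaskState:
    """Move to `target` only if it is the immediate successor of `current`."""
    if next_state(current) is not target:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Rejecting skipped steps keeps a task from reaching `in_progress` before it has passed technical shaping. A real pipeline might also allow backward moves, such as returning a task for reshaping, which this sketch does not model.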
To work around the limitations of tiny models, nocodo applies several reliability techniques: it extracts only fenced code blocks from model output and discards anything outside them, uses low temperature (0.1–0.2) for reproducibility, and returns the full prompt, raw model response, and extracted code on every run for transparency and debugging. The current Rust Engineer agent runs `unsloth/Qwen3.5-0.8B-GGUF` via `llama.cpp` at `localhost:8080`, with the model and endpoint overridable via `RUST_ENGINEER_MODEL` and `LLAMA_CPP_BASE_URL` environment variables. The project is licensed under AGPL-3.0 and has 141 commits at the time of publication.
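A rough sketch of the reliability techniques described above, under stated assumptions: the environment variable names and default values come from the summary, while `resolve_config`, `extract_code`, `build_result`, `RunResult`, the `http://` scheme, and the choice of 0.1 within the 0.1–0.2 range are hypothetical names and values chosen for illustration. No request is sent to the model; the sketch only covers configuration and post-processing.

```python
import os
import re
from dataclasses import dataclass

# Defaults taken from the summary; the "http://" scheme is an assumption.
DEFAULT_MODEL = "unsloth/Qwen3.5-0.8B-GGUF"
DEFAULT_BASE_URL = "http://localhost:8080"

# Built from single backticks so this example can itself sit inside a fence.
FENCE = "`" * 3
# Opening fence with optional language tag, then the body, then the closing fence.
FENCE_RE = re.compile(FENCE + r"[^\n`]*\n(.*?)" + FENCE, re.DOTALL)


@dataclass
class RunResult:
    """Everything returned per run: prompt, raw response, extracted code."""
    prompt: str
    raw_response: str
    extracted_code: str


def resolve_config(env=os.environ) -> dict:
    """Read the model and endpoint overrides, falling back to the defaults."""
    return {
        "model": env.get("RUST_ENGINEER_MODEL", DEFAULT_MODEL),
        "base_url": env.get("LLAMA_CPP_BASE_URL", DEFAULT_BASE_URL),
        "temperature": 0.1,  # low end of the 0.1-0.2 range in the summary
    }


def extract_code(raw_response: str) -> str:
    """Keep only the bodies of fenced blocks; drop all surrounding text."""
    blocks = FENCE_RE.findall(raw_response)
    return "\n".join(block.strip("\n") for block in blocks)


def build_result(prompt: str, raw_response: str) -> RunResult:
    """Bundle the prompt, raw response, and extracted code for debugging."""
    return RunResult(prompt, raw_response, extract_code(raw_response))
```

Keeping the raw response next to the extracted code makes it possible to see what the extractor threw away when a tiny model wraps its answer in chatter or leaves a fence unclosed.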
Key facts
- Targets small (<10B) and tiny (<1B) LLMs running entirely on local infrastructure
- Current Rust Engineer agent runs `unsloth/Qwen3.5-0.8B-GGUF` via `llama.cpp` at `localhost:8080`
- Multi-agent pipeline includes Product Owner, Project Manager, Engineering Manager, and specialist coding agents
- Task state machine: `draft → needs_technical_shaping → ready → in_progress → done`
- Uses low temperature (0.1–0.2) and keeps only fenced code blocks from model output, discarding everything outside them, to compensate for limited model capability
- Coordination layer is more mature than the coding agents; closing that gap is the current focus
- Long-term goal is full-stack CRUD app generation, deployment, and management from a desktop app using local models
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 17:05 UTC.