★ Rank 16 today · NEW · Jun 13, 2026 · 1 min read · Open Source

Open-source MCP server runs Qwen3 35B on rented Nosana GPUs for cheap bulk-text work

`qwen-nosana-mcp` is an open-source MCP server and CLI that lets Claude Code or Codex agents offload bulk-text tasks to a Qwen3.6 35B instance running on a rented Nosana GPU at roughly $1/hour, with no Alibaba API key, no rate limits, and no per-token billing.

r/mcp·u/Impressive-Owl3830

Composite score: 5.1 out of 10 (rank 16)

Score weights: Novelty 25%, Impact 43%, Credibility 12%, Depth 20%

Why it matters

The project offers a path to running a large open-weight model for bulk agentic coding tasks without per-token API costs, rate limits, or third-party data exposure, by pairing MCP with rented decentralized GPU compute.

Summary: our read of the original

The `qwen-nosana-mcp` project, published by u/Impressive-Owl3830 on r/mcp, provides an open-source MCP server and CLI that allows agentic coding tools like Claude Code and Codex to offload bulk-text workloads — long-document summarization, structured extraction, mass code generation, and translation of long documents — to a Qwen3.6 35B Q8_0 instance running on a Nosana NVIDIA Pro 6000 Blackwell GPU. The intended architecture positions a frontier model such as Sonnet 4.6 or GPT-5 as the "smart conductor" while Qwen3 handles the compute-heavy "cheap muscle" tasks at approximately $1/hour.
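
The conductor/muscle split can be pictured with a minimal sketch. The source does not describe how the project actually routes work, so everything here is a hypothetical illustration: the names `BULK_TASK_KINDS` and `pick_backend` are invented, and the only grounded detail is the list of bulk-text task types.

```python
# Hypothetical illustration only: the task kinds come from the summary,
# but BULK_TASK_KINDS and pick_backend are invented names, not project code.
BULK_TASK_KINDS = {
    "summarization",
    "structured_extraction",
    "code_generation",
    "translation",
}


def pick_backend(task_kind: str) -> str:
    """Return 'qwen' for bulk-text work and 'frontier' for everything else."""
    return "qwen" if task_kind in BULK_TASK_KINDS else "frontier"


if __name__ == "__main__":
    for kind in ("translation", "architecture_review"):
        print(kind, "->", pick_backend(kind))
```

In practice the frontier model would presumably make this decision itself when it chooses to call the MCP tool; the lookup table just makes the division of labor explicit.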

The project emphasizes a privacy-first design: because the model runs on a rented decentralized GPU rather than behind a third-party model API, prompts and data are sent only to that GPU. There is no Alibaba API key requirement, no TOS-based content filtering, and no per-token billing. Setup requires obtaining a Nosana API key from `deploy.nosana.com` and exporting it as `NOSANA_API_KEY` in the shell environment. Built-in cost controls include a 60-minute default timeout that caps spend at roughly $1 per deployment, a 5-minute idle auto-stop, and a hard maximum of 4 hours per deployment.
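
The cost controls translate into simple arithmetic. The sketch below is an assumption-laden illustration, not the project's code: it assumes billing is linear in time at the quoted rate of about $1/hour, that a requested timeout above the hard cap is clamped rather than rejected, and that the idle auto-stop fires exactly 5 minutes after work ends. The function names are hypothetical.

```python
# Figures from the summary; all function names are hypothetical.
HOURLY_RATE_USD = 1.0     # approximate rental rate
DEFAULT_TIMEOUT_MIN = 60  # default deployment timeout
IDLE_STOP_MIN = 5         # idle auto-stop window
HARD_CAP_MIN = 240        # 4-hour hard maximum per deployment


def billed_minutes(busy_min: float, timeout_min: float = DEFAULT_TIMEOUT_MIN) -> float:
    """Minutes billed if work takes busy_min and the instance then sits idle.

    Assumes the deployment stops at whichever limit is hit first:
    idle auto-stop, the requested timeout, or the hard cap.
    """
    if busy_min < 0 or timeout_min <= 0:
        raise ValueError("durations must be positive")
    return min(busy_min + IDLE_STOP_MIN, timeout_min, HARD_CAP_MIN)


def estimated_cost_usd(busy_min: float, timeout_min: float = DEFAULT_TIMEOUT_MIN) -> float:
    """Estimated spend in USD, assuming linear per-minute billing."""
    return round(billed_minutes(busy_min, timeout_min) / 60 * HOURLY_RATE_USD, 2)


if __name__ == "__main__":
    print(estimated_cost_usd(10))        # short job, stopped by idle auto-stop
    print(estimated_cost_usd(90))        # runs into the default 60-minute timeout
    print(estimated_cost_usd(600, 600))  # requested timeout clamped to the hard cap
```

Under these assumptions a 10-minute job costs about $0.25, a job that hits the default timeout costs about $1, and no single deployment can exceed about $4.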

Key facts

01 Routes bulk-text work from Claude Code or Codex to a self-hosted Qwen3.6 35B Q8_0 model
02 Runs on a Nosana NVIDIA Pro 6000 Blackwell GPU at approximately $1/hour
03 No Alibaba API key, no TOS-based content filtering, no rate limits, no per-token billing
04 Prompts and data stay on the rented decentralized GPU and are never sent to a third-party API
05 Built-in cost controls: 60-min default timeout (~$1 max), 5-min idle auto-stop, 4-hour hard cap
06 Intended use cases include long-document summarization, structured extraction, mass code generation, and translation
07 Setup requires a Nosana API key exported as `NOSANA_API_KEY` in the shell environment
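
A client would typically fail fast when the key is missing. The helper below is a hypothetical sketch of such a check; only the variable name `NOSANA_API_KEY` and the `deploy.nosana.com` origin come from the source, and `load_nosana_api_key` is an invented name rather than part of the project.

```python
import os


def load_nosana_api_key(env=None) -> str:
    """Return the Nosana API key from the environment, or raise if it is unset.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    This is an illustrative helper, not the project's actual startup code.
    """
    env = os.environ if env is None else env
    key = env.get("NOSANA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NOSANA_API_KEY is not set; get a key from deploy.nosana.com "
            "and export it in your shell."
        )
    return key


if __name__ == "__main__":
    try:
        load_nosana_api_key()
        print("Nosana API key found.")
    except RuntimeError as err:
        print(err)
```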

Topics

#mcp #open-source #gpu-rental #qwen #agentic-coding

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 14, 2026 · 09:08 UTC.
