Lumina: open-source, self-hosted LLM observability platform launches
u/Fantastic-Call-5702 released Lumina, a self-hosted, MIT-licensed observability platform for LLM apps that tracks token costs, agent run trajectories, TTFT, and RAG metrics, built on Go, ClickHouse, Kafka, and PostgreSQL.
Lumina gives teams a self-hosted alternative to Langfuse, Helicone, and Datadog for LLM cost and performance observability, keeping sensitive trace data on their own infrastructure rather than a third-party SaaS.
u/Fantastic-Call-5702 announced Lumina, a self-hosted, open-source observability platform built for LLM-powered applications, on r/LangChain. Its LLM-specific features include token breakdowns by model, provider, feature, and user with cost-per-call visibility; prompt-cache savings tracking for OpenAI and Anthropic caching; TTFT and tokens/sec measurements per model; side-by-side model A/B comparison; agent run trajectories showing every step, tool call, and retrieval with per-step cost; a tool catalog surfacing failure rates and error types; and RAG/retrieval metrics covering query volume, average documents returned, and latency. Threshold-based alerting covers cost, latency, error rate, and token usage, with per-feature and per-user LLM cost budgets and alert silencing.
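To make the cost features above concrete, here is a minimal sketch of how per-call cost, prompt-cache savings, and a per-feature budget check could be computed. It does not show Lumina's actual implementation or schema: the `LLMCall` record, the `PRICES` table, the model name, and the prices are all hypothetical, and the sketch ignores any separate cache-write pricing a provider may apply.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical USD prices per million tokens; NOT real provider pricing.
PRICES = {
    "model-a": {"input": 2.00, "cached_input": 0.50, "output": 8.00},
}


@dataclass
class LLMCall:
    """Hypothetical trace record for one LLM call."""
    model: str
    feature: str
    user: str
    input_tokens: int    # all prompt tokens, including cached ones
    cached_tokens: int   # portion of input_tokens read from the prompt cache
    output_tokens: int


def call_cost(call):
    """Return (cost, cache_savings) in USD for a single call."""
    price = PRICES[call.model]
    uncached = call.input_tokens - call.cached_tokens
    cost = (
        uncached * price["input"]
        + call.cached_tokens * price["cached_input"]
        + call.output_tokens * price["output"]
    ) / 1_000_000
    # Savings: what the cached tokens would have cost at the full input price.
    savings = call.cached_tokens * (price["input"] - price["cached_input"]) / 1_000_000
    return cost, savings


def cost_by(calls, key):
    """Sum cost per value of `key` ('model', 'feature', or 'user')."""
    totals = defaultdict(float)
    for call in calls:
        totals[getattr(call, key)] += call_cost(call)[0]
    return dict(totals)


def over_budget(totals, budgets):
    """Return the keys whose spend exceeds their budget (no budget = no alert)."""
    return sorted(k for k, spent in totals.items() if spent > budgets.get(k, float("inf")))
```

Grouping by `"feature"` or `"user"` and passing the totals to `over_budget` mirrors the per-feature and per-user budgets described in the announcement, under the assumptions stated above.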
Beyond LLM observability, Lumina includes general-purpose telemetry described as "a lightweight SigNoz": HTTP traces with a waterfall view, a live-tail log explorer, a metrics explorer, exception grouping with stack traces, a service map, and multi-turn session views.
The backend is written in Go and handles ingestion and workers; ClickHouse powers analytics; Kafka handles buffering; and PostgreSQL stores metadata. The dashboard is built in Next.js and runs on `http://localhost:9191` after a one-command `make start` setup. A Python SDK provides zero-config instrumentation: calling `lumina.init(api_key="...")` automatically traces OpenAI, Anthropic, and LiteLLM calls. Full OpenTelemetry support is also included.
The project is available at `https://github.com/lumina-gen/lumina-core` under the MIT license, and the author is actively seeking feedback on OTEL ingestion, the Python SDK, missing features relative to Langfuse, Helicone, or Datadog, and the Go + ClickHouse + Kafka architecture choices.
Key facts
- Lumina is a self-hosted, open-source LLM observability platform released under the MIT license.
- Tracks token costs broken down by model, provider, feature, and user, including prompt-cache savings for OpenAI and Anthropic caching.
- Measures time-to-first-token (TTFT) and tokens/sec per model, with side-by-side A/B model comparison.
- Agent run trajectories show every step, tool call, and retrieval with per-step cost; a tool catalog surfaces failure rates and error types.
- RAG/retrieval metrics cover query volume, average docs returned, and latency.
- Stack: Go backend, ClickHouse for analytics, Kafka for buffering, PostgreSQL for metadata, Next.js dashboard.
- Python SDK offers zero-config instrumentation via `lumina.init(api_key="...")`, with full OpenTelemetry support; setup is a single `make start` command.
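
For readers unfamiliar with the TTFT and tokens/sec metrics listed above, the following sketch shows one common way to derive them from a streamed response. It is not taken from Lumina's code: the function name `measure_stream` is hypothetical, each streamed chunk is assumed to count as one token, and tokens/sec is defined here as total tokens divided by total elapsed time, which is only one of several definitions in use.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report TTFT and throughput.

    Assumption: every item yielded by `chunks` is one token.
    """
    start = clock()
    first: Optional[float] = None
    last = start
    count = 0
    for _ in chunks:
        last = clock()          # arrival time of this chunk
        if first is None:
            first = last        # arrival time of the first chunk
        count += 1

    elapsed = last - start
    return {
        "tokens": count,
        # Time-to-first-token in seconds; None if the stream was empty.
        "ttft_s": None if first is None else first - start,
        # Tokens over total elapsed time; 0.0 if nothing was measured.
        "tokens_per_s": count / elapsed if elapsed > 0 else 0.0,
    }
```

Passing a custom `clock` makes the function easy to exercise without a real model; in practice `chunks` would be the iterator returned by a provider's streaming API.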
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 13, 2026 · 08:58 UTC.