You can't improve what you can't see. Agent observability means: every call captured, every trace reconstructable, every dollar attributed, every regression noticed. This page covers what to log, where to send it, and which tools to pick.
Per LLM call:
Token counts (including cache_read_input_tokens, cache_creation_input_tokens)
Per agent session:
Log token counts + cost to your own database (Postgres in our case) on every LLM call. See src/services/pipeline/cost-tracker.ts for the pricing map and calculateCost() helper. Pros: zero vendor cost, full control. Cons: you build your own dashboards.
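As a rough illustration of what a pricing map and a calculateCost() helper can look like, here is a minimal sketch. It is not the contents of cost-tracker.ts: the model names, the prices, and the choice to return null for unknown models are all assumptions made for the example. Only the argument order (model, input tokens, output tokens) follows the call shown on this page.

```typescript
// Hypothetical per-model pricing in USD per million tokens.
// The model names and numbers are placeholders, not real provider prices.
type ModelPricing = {
  inputPerMTok: number;
  outputPerMTok: number;
};

const PRICING: Record<string, ModelPricing> = {
  "example-small": { inputPerMTok: 1, outputPerMTok: 5 },
  "example-large": { inputPerMTok: 3, outputPerMTok: 15 },
};

// Returns the cost of one call in USD, or null when the model is missing
// from the map, so unpriced models show up as gaps rather than as free calls.
function calculateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | null {
  const pricing = PRICING[model];
  if (!pricing) return null;
  return (
    (inputTokens * pricing.inputPerMTok + outputTokens * pricing.outputPerMTok) /
    1_000_000
  );
}
```

This sketch ignores cache reads and cache writes. If your provider bills those at different rates, the helper would need extra arguments and extra pricing fields.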
Open-source LLM observability. Self-hosted or SaaS. Captures traces, prompts, token counts, costs, user feedback; supports eval runs and prompt versioning. Integrates via SDK in minutes.
Proxy-based logging. You route requests through Helicone's proxy; every call is logged transparently with cost/cache headers. Minimal code change.
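The "minimal code change" of a proxy setup usually comes down to swapping the client's base URL and adding one auth header. The sketch below shows that idea in a provider-neutral way. ClientConfig, viaLoggingProxy, the proxy URL, and the header name are all hypothetical placeholders; take the real endpoint and header from the proxy vendor's documentation.

```typescript
// Minimal description of how an LLM client reaches its API.
type ClientConfig = {
  baseUrl: string;
  headers: Record<string, string>;
};

// Returns a new config that sends the same requests through a logging proxy.
// The proxy URL and the auth header are supplied by the caller; nothing here
// is specific to a particular vendor.
function viaLoggingProxy(
  direct: ClientConfig,
  proxyBaseUrl: string,
  proxyAuthHeader: { name: string; value: string },
): ClientConfig {
  return {
    baseUrl: proxyBaseUrl,
    headers: { ...direct.headers, [proxyAuthHeader.name]: proxyAuthHeader.value },
  };
}

// Usage with placeholder values.
const direct: ClientConfig = {
  baseUrl: "https://api.provider.example",
  headers: { "x-api-key": "PROVIDER_KEY" },
};
const proxied = viaLoggingProxy(direct, "https://proxy.example", {
  name: "x-proxy-auth",
  value: "Bearer PROXY_KEY",
});
```

The provider key still travels with every request, so the proxy sits on the critical path: its latency and availability become part of yours.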
Focused on prompt versioning + evals + cost tracking. Strong if you iterate on prompts frequently.
OpenTelemetry-based observability from the Pydantic team. Pairs naturally with Pydantic AI. Good fit for Python-heavy stacks.
LangChain's first-party observability. Strong if you use LangChain / LangGraph.
Provider dashboards show per-API-key usage and costs. Enough for small projects; not rich enough for prompt-level debugging.
If you're starting from nothing, log every LLM call to a database you own:
// After every LLM call (startMs is Date.now() captured just before the request)
await db.insert(llmCalls).values({
  sessionId,
  model: response.model,
  inputTokens: response.usage.input_tokens,
  outputTokens: response.usage.output_tokens,
  cacheReadTokens: response.usage.cache_read_input_tokens ?? 0,
  cacheWriteTokens: response.usage.cache_creation_input_tokens ?? 0,
  latencyMs: Date.now() - startMs,
  toolUse: response.stop_reason === 'tool_use',
  error: null, // success path; no error to record
  cost: calculateCost(response.model, response.usage.input_tokens, response.usage.output_tokens),
});

That's it. Three queries give you: daily cost, slow calls, error rate. Add dashboards, alerts, and prompt capture as you grow.
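The three queries mentioned above can be sketched as in-memory aggregations over the logged rows. In practice they would more likely be SQL against the table; this version only shows the logic. The LlmCallRow type and its createdAt field are assumptions, since the insert shown here does not set a timestamp explicitly.

```typescript
// Subset of the logged columns, plus a hypothetical createdAt timestamp.
type LlmCallRow = {
  createdAt: Date;
  latencyMs: number;
  error: string | null;
  cost: number;
};

// Daily cost: sum of cost grouped by UTC calendar day (YYYY-MM-DD).
function dailyCost(rows: LlmCallRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const day = row.createdAt.toISOString().slice(0, 10);
    totals.set(day, (totals.get(day) ?? 0) + row.cost);
  }
  return totals;
}

// Slow calls: rows whose latency exceeds the threshold, slowest first.
function slowCalls(rows: LlmCallRow[], thresholdMs: number): LlmCallRow[] {
  return rows
    .filter((r) => r.latencyMs > thresholdMs)
    .sort((a, b) => b.latencyMs - a.latencyMs);
}

// Error rate: fraction of rows with a non-null error; 0 for empty input.
function errorRate(rows: LlmCallRow[]): number {
  if (rows.length === 0) return 0;
  return rows.filter((r) => r.error !== null).length / rows.length;
}
```

The error rate is only meaningful if failed calls are also written, with a non-null error value, for example from a catch block around the request.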