Models (Unified)

Last verified: 2026-06-12 · next review in 24 days

All model IDs and context windows verified against vendor docs on the lastVerified date above. Legacy models that are still available but superseded live in their own table at the bottom.

Anthropic (current)

Model	API ID	Context	Max output	Best for
Claude Opus 4.8	`claude-opus-4-8`	1M tokens	128k	Most complex reasoning + agentic coding; default for hardest tasks
Claude Sonnet 4.6	`claude-sonnet-4-6`	1M tokens	64k	Best speed/quality balance; default for most agent workloads
Claude Haiku 4.5	`claude-haiku-4-5-20251001`	200k tokens	64k	Cheap, fast — classification, routing, high-volume tasks

Claude Sonnet 4.6 + Haiku 4.5 support extended thinking; Opus 4.8 uses adaptive thinking (always on) and does not offer extended thinking. All support vision, 1M tokens on Opus/Sonnet, and the Message Batches API for 50% off. On Opus 4.8 the temperature / top_p / top_k params are deprecated (a non-default value returns a 400) — steer behavior through prompting instead.

Aliases (claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5) auto-resolve to the latest snapshot. From the 4.6 generation onward the dateless IDs are themselves pinned snapshots, not evergreen pointers; pin dated IDs (claude-haiku-4-5-20251001) for reproducibility.

A higher tier, Claude Fable 5 (claude-fable-5, 1M context, 128k output, $10/$50 per MTok), is generally available as of June 2026 for the most demanding reasoning + long-horizon agentic work.

Source: platform.claude.com/docs/en/about-claude/models/overview.

OpenAI

OpenAI's model lineup moves fastest. As of 2026-06-12, the docs index lives at developers.openai.com/api/docs/models (the old platform.openai.com/docs/models now 301-redirects there) — visit it for the authoritative current list. The current frontier model is GPT-5.5 (gpt-5.5, 1M context), with GPT-5.4 (gpt-5.4) and GPT-5.4 mini (gpt-5.4-mini, 400k context) below it. In general, expect:

A flagship reasoning model (most capable, highest cost, slowest) — currently GPT-5.5
A mid-tier general-purpose chat model
A small / mini / nano tier for classification + routing
A realtime model family (GPT-Realtime-2 / 1.5 / mini) for voice / streaming use cases

Google (Gemini)

Model	Best for
Gemini 3.1 Pro	Complex reasoning + coding; flagship
Gemini 3.5 Flash	Most intelligent Flash; 1M context
Gemini 3 Flash	Fast general-purpose (preview)
Gemini 3.1 Flash-Lite	Highest throughput
Gemini 2.5 Pro / Flash	Legacy, still available

All Gemini 3 models carry a 1M-token context window. Source: ai.google.dev/gemini-api/docs/models.

Open-source / local

Model family	Provider	Notes
Llama (code variants)	Meta	Run locally via Ollama / vLLM; varies by size
Codestral	Mistral	Open-source, code-focused
DeepSeek	DeepSeek	Strong SWE-bench performance, low cost
Qwen Coder	Alibaba	Code-focused, multiple sizes

Run locally via Ollama, vLLM, or llama.cpp. Latency and capability ceiling are lower than frontier models, but data stays on your machine.

Anthropic legacy (still available)

These models continue to work but are superseded — consider migrating:

Model	API ID	Status
Claude Opus 4.7	`claude-opus-4-7`	Superseded by Opus 4.8
Claude Opus 4.6	`claude-opus-4-6`	Superseded by Opus 4.8
Claude Sonnet 4.5	`claude-sonnet-4-5-20250929`	Superseded by Sonnet 4.6
Claude Opus 4.5	`claude-opus-4-5-20251101`	Legacy
Claude Opus 4.1	`claude-opus-4-1-20250805`	Deprecated, retires 2026-08-05
Claude Sonnet 4	`claude-sonnet-4-20250514`	Deprecated, retires 2026-06-15
Claude Opus 4	`claude-opus-4-20250514`	Deprecated, retires 2026-06-15
Claude Haiku 3	`claude-3-haiku-20240307`	Retired 2026-04-20

Always check model deprecations before pinning an older ID.

Picking a model for a coding agent

Rule of thumb, optimized for cost/quality:

Default workhorse → Claude Sonnet 4.6 (agentic coding sweet spot) or OpenAI current flagship (GPT-5.5)
Hardest tasks / long-horizon agents → Claude Opus 4.8
Classification, routing, summarization → Claude Haiku 4.5 or Gemini 3.5 Flash
Local / privacy-first → DeepSeek Coder or Qwen Coder via Ollama

See Cost Optimization § Model Routing for the split-model pattern.

On this page

Models (Unified)

Anthropic (current)

OpenAI

Google (Gemini)

Open-source / local

Anthropic legacy (still available)

Picking a model for a coding agent

Further reading

On this page

On this page

Models (Unified)

Anthropic (current)

OpenAI

Google (Gemini)

Open-source / local

Anthropic legacy (still available)

Picking a model for a coding agent

Further reading

On this page