Apr 22, 2026 · 1 min read · New Models & Releases

OpenAI launches GPT-Image-2 with top Arena leaderboard scores

OpenAI released GPT-Image-2 across ChatGPT, Codex, and its API, claiming the #1 spot on all Image Arena leaderboards with a +242 Elo lead on text-to-image over the next competitor.

Source: Latent Space

Composite

7.8 out of 10

Novelty · 25%
Impact · 43%
Credibility · 12%
Depth · 20%

Weights applied.

Why it matters

Developers building agentic coding pipelines should note that GPT-Image-2's strong UI mockup and diagram generation makes it a practical front-end for code agents like Codex — generate a visual spec, then let an agent implement it.
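The mockup-then-implement workflow described above can be sketched as a two-stage pipeline. This is a minimal illustration, not OpenAI's or Codex's actual interface: `ImageGenerator`, `CodeAgent`, `SpecResult`, and `mockup_to_code` are hypothetical names, and the two callables stand in for whatever image model and code agent a team actually wires up.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: neither signature is taken from a real SDK.
ImageGenerator = Callable[[str], bytes]   # prompt -> encoded image bytes
CodeAgent = Callable[[bytes, str], str]   # (image, instructions) -> source code


@dataclass
class SpecResult:
    prompt: str    # prompt sent to the image model
    mockup: bytes  # generated visual spec
    code: str      # implementation produced by the code agent


def mockup_to_code(
    brief: str,
    generate_image: ImageGenerator,
    implement: CodeAgent,
) -> SpecResult:
    """Generate a visual spec from a brief, then hand it to a code agent."""
    prompt = f"UI mockup: {brief}"
    mockup = generate_image(prompt)
    if not mockup:
        # Fail early rather than asking the agent to implement nothing.
        raise ValueError("image generator returned no data")
    code = implement(mockup, f"Implement this mockup. Brief: {brief}")
    return SpecResult(prompt=prompt, mockup=mockup, code=code)
```

Passing the two stages in as callables keeps the sketch independent of any particular vendor API, so each stage can be stubbed in tests or swapped for a different model.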

Summary: our read of the original

OpenAI launched GPT-Image-2, making it available across ChatGPT, Codex, and the API with both thinking and non-thinking variants. The model had previously appeared as a stealth entry on Arena before its official release. Key capabilities highlighted include stronger text rendering, layout fidelity, editing, multilingual support, and a "thinking" mode for images. When paired with a thinking model, GPT-Image-2 can search the web, generate multiple candidates, self-check outputs, and produce artifacts such as slides, infographics, diagrams, UI mockups, and QR codes. Downstream integrations at launch include Figma, Canva, Firefly, fal, and Hermes Agent.

Arena benchmarks show GPT-Image-2 at #1 across all Image Arena leaderboards, with scores of 1512 on text-to-image, 1513 on single-image edit, and 1464 on multi-image edit, and a +242 Elo lead over the next model on text-to-image. Independent reactions characterized it as a more usable model for UI mockups, documentation, productivity visuals, and reference-driven design, not merely a prettier art generator. One systems implication drew particular attention: image generation is becoming a front-end for coding agents, since a UI spec generated as an image can serve as a visual reference for Codex or another code agent to implement against.
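To put the +242 Elo lead in perspective, the gap can be converted into an expected head-to-head preference rate. The sketch below assumes the textbook Elo logistic curve with a scale of 400; the summary does not describe how Arena fits its ratings or handles ties, so treat the result as a rough reading rather than Arena's own figure. The function name `elo_expected_score` is illustrative.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard Elo curve.

    Assumption: ratings follow the classic formula 1 / (1 + 10^(-gap/scale)).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    gap = 242  # text-to-image lead reported in the summary
    print(f"{elo_expected_score(gap):.1%}")  # prints 80.1%
```

Under that assumption, a 242-point gap means the leading model would be preferred in roughly four out of five pairwise comparisons against the runner-up.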

The article also covers Hugging Face's release of `ml-intern`, an open-source agent that automates the post-training research loop — reading papers, collecting datasets, launching training jobs, evaluating runs, and iterating on failures. Reported results include improving GPQA scientific reasoning from 10% to 32% in under 10 hours on Qwen3-1.7B, and a healthcare setup that reportedly beat Codex on HealthBench by 60%. Separately, `DSPy 3.2` shipped with RLM improvements, optimizer chaining, and LiteLLM decoupling, reflecting a broader trend of agent runtime harnesses becoming first-class engineering artifacts.

Key facts

01 GPT-Image-2 launched across ChatGPT, Codex, and the API with both thinking and non-thinking variants.
02 Arena ranks GPT-Image-2 #1 across all Image Arena leaderboards: 1512 text-to-image, 1513 single-image edit, 1464 multi-image edit.
03 GPT-Image-2 holds a +242 Elo lead over the next model on text-to-image.
04 Downstream integrations at launch include Figma, Canva, Firefly, fal, and Hermes Agent.
05 Hugging Face released `ml-intern`, an open-source agent automating the post-training research loop end-to-end.
06 `ml-intern` reportedly improved GPQA scientific reasoning from 10% to 32% in under 10 hours on Qwen3-1.7B.
07 Cursor reportedly secured a $10B contract and a right to acquire for $60B with xAI.

Topics

#model-release #image-generation #openai #api-launch #benchmarks

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Apr 23, 2026 · 11:04 UTC.
