Apr 16, 2026 · 1 min read · Research Papers

COEVO unifies RTL correctness and PPA optimization in a single evolutionary loop

COEVO, a co-evolutionary framework, jointly optimizes functional correctness and power-performance-area (PPA) metrics in LLM-based RTL generation, achieving 97.5% Pass@1 on VerilogEval 2.0 and best PPA on 43 of 49 synthesizable designs.

arXiv · Heng Ping, Peiyu Zhang, Shixuan Li

Composite score: 6.1 out of 10

Weights: Novelty 25% · Impact 43% · Credibility 12% · Depth 20%

Why it matters

Developers and researchers using LLM-based RTL generation can optimize functional correctness and hardware efficiency jointly, without discarding partially correct designs, which allows broader exploration of the correctness-PPA trade-off space.

Summary: our read of the original

Existing LLM-based RTL generation methods universally decouple functional correctness from power-performance-area (PPA) optimization, whether through sequential multi-agent pipelines, evolutionary search with binary correctness gates, or hierarchical reward dependencies. This decoupling systematically discards partially correct but architecturally promising candidates. These methods also reduce the multi-objective PPA space to a single scalar fitness, obscuring trade-offs among area, delay, and power.

COEVO is a co-evolutionary framework that unifies correctness and PPA optimization within a single evolutionary loop. Correctness is formulated as a continuous co-optimization dimension alongside area, delay, and power, enabled by an enhanced testbench that provides fine-grained scoring and diagnostic feedback. An adaptive correctness gate with annealing allows PPA-promising but partially correct candidates to guide the search toward jointly optimal solutions. To preserve the full PPA trade-off structure, COEVO employs four-dimensional Pareto-based non-dominated sorting with configurable intra-level sorting, replacing scalar fitness without manual weight tuning.
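To make the idea of Pareto-based non-dominated sorting concrete, here is a minimal sketch. It is not the authors' implementation: it assumes the four dimensions are correctness (higher is better) plus area, delay, and power (lower is better), the candidate values are made up for illustration, and the configurable intra-level sorting mentioned above is omitted.

```python
from typing import List, Sequence

# Each candidate is (correctness, area, delay, power).
# Assumption: correctness is maximized; area, delay, and power are minimized.
Candidate = Sequence[float]


def to_minimization(c: Candidate) -> tuple:
    """Negate correctness so that all four objectives are minimized."""
    correctness, area, delay, power = c
    return (-correctness, area, delay, power)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    ma, mb = to_minimization(a), to_minimization(b)
    no_worse = all(x <= y for x, y in zip(ma, mb))
    strictly_better = any(x < y for x, y in zip(ma, mb))
    return no_worse and strictly_better


def non_dominated_sort(pop: List[Candidate]) -> List[List[int]]:
    """Return Pareto levels as lists of indices into pop (level 0 is the best front)."""
    remaining = set(range(len(pop)))
    levels: List[List[int]] = []
    while remaining:
        # A candidate is in the current front if nothing still remaining dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)
        )
        levels.append(front)
        remaining -= set(front)
    return levels


# Hypothetical population, values chosen only for illustration.
population = [
    (1.00, 120.0, 2.0, 0.50),  # fully correct, mid PPA
    (0.80, 90.0, 1.5, 0.40),   # partially correct, better PPA
    (1.00, 150.0, 2.5, 0.60),  # fully correct, but worse PPA than index 0
    (0.60, 95.0, 1.6, 0.45),   # worse than index 1 on every objective
]
levels = non_dominated_sort(population)
```

In this toy population the partially correct candidate at index 1 shares the first front with the fully correct candidate at index 0, because neither dominates the other. A binary correctness gate would have discarded it before its PPA was ever compared.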

On VerilogEval 2.0 and RTLLM 2.0, COEVO achieves 97.5% and 94.5% Pass@1 with GPT-5.4-mini, surpassing all agentic baselines across four LLM backbones, while attaining the best PPA on 43 out of 49 synthesizable RTLLM designs.
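For readers unfamiliar with the metric, Pass@1 is commonly computed with the unbiased pass@k estimator used in code-generation benchmarks. The sketch below shows that estimator; the summary does not say how many samples per problem the paper draws or whether it uses this exact protocol, so the sample counts here are hypothetical.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: List[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-problem outcomes: (samples drawn, samples passing the testbench).
results = [(10, 10), (10, 5), (10, 0)]
score = benchmark_pass_at_k(results, k=1)
```

For k = 1 the estimator reduces to the fraction of correct samples per problem, averaged over the benchmark.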

Key facts

01 COEVO treats correctness as a continuous co-optimization dimension rather than a binary gate, allowing partially correct candidates to contribute to PPA-optimal solutions
02 The framework uses four-dimensional Pareto-based non-dominated sorting to preserve area, delay, and power trade-offs without manual weight tuning
03 An adaptive correctness gate with annealing enables PPA-promising but partially correct designs to guide the evolutionary search
04 COEVO achieves 97.5% Pass@1 on VerilogEval 2.0 and 94.5% Pass@1 on RTLLM 2.0 with GPT-5.4-mini
05 The framework attains best PPA on 43 out of 49 synthesizable RTLLM designs, surpassing agentic baselines across four LLM backbones
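One way to picture a correctness gate with annealing is a minimum correctness score that tightens as the search progresses. The sketch below is an assumption-heavy stand-in: the summary does not describe the paper's schedule or its adaptive rule, so the fixed linear schedule, the `start` and `end` values, and the function names are all hypothetical.

```python
def gate_threshold(generation: int, total_generations: int,
                   start: float = 0.5, end: float = 1.0) -> float:
    """Linearly anneal the minimum correctness score from start to end.

    Assumption: a fixed linear schedule stands in for the paper's adaptive gate.
    """
    if total_generations <= 1:
        return end
    frac = generation / (total_generations - 1)
    frac = min(max(frac, 0.0), 1.0)  # clamp to [0, 1]
    return start + (end - start) * frac


def passes_gate(correctness: float, generation: int, total_generations: int) -> bool:
    """True if a candidate's correctness score clears the current threshold."""
    return correctness >= gate_threshold(generation, total_generations)


# A candidate scoring 0.8 on the testbench is kept early in the search
# and filtered out once the threshold has annealed to full correctness.
early = passes_gate(0.8, generation=0, total_generations=11)
late = passes_gate(0.8, generation=10, total_generations=11)
```

Under this schedule a partially correct design can influence early generations, while only fully correct designs survive at the end.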

Topics

#code-generation #benchmarks #agent-framework #llm-optimization #hardware-design

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Apr 20, 2026 · 00:31 UTC.
