COEVO unifies RTL correctness and PPA optimization in a single evolutionary loop
COEVO, a co-evolutionary framework, jointly optimizes functional correctness and power-performance-area (PPA) metrics in LLM-based RTL generation, achieving 97.5% Pass@1 on VerilogEval 2.0 and the best PPA on 43 of 49 synthesizable RTLLM designs.
Developers and researchers using LLM-based RTL generation can now optimize functional correctness and hardware efficiency together without discarding partially correct designs, which allows broader exploration of the correctness-PPA trade-off space.
Existing LLM-based RTL generation methods universally decouple functional correctness from power-performance-area (PPA) optimization, whether through sequential multi-agent pipelines, evolutionary search with binary correctness gates, or hierarchical reward dependencies. This decoupling systematically discards partially correct but architecturally promising candidates. These methods also reduce the multi-objective PPA space to a single scalar fitness, obscuring the trade-offs among area, delay, and power.
COEVO is a co-evolutionary framework that unifies correctness and PPA optimization within a single evolutionary loop. Correctness is formulated as a continuous co-optimization dimension alongside area, delay, and power, enabled by an enhanced testbench that provides fine-grained scoring and diagnostic feedback. An adaptive correctness gate with annealing allows PPA-promising but partially correct candidates to guide the search toward jointly optimal solutions. To preserve the full PPA trade-off structure, COEVO employs four-dimensional Pareto-based non-dominated sorting with configurable intra-level sorting, which replaces scalar fitness and removes the need for manual weight tuning.
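The four-dimensional non-dominated sorting described above can be illustrated with a minimal sketch. This is not COEVO's implementation. It assumes each candidate is a tuple `(correctness, area, delay, power)`, that correctness is maximized while the three PPA metrics are minimized, and that the intra-level order is set by a caller-supplied key. The names `dominates`, `non_dominated_sort`, and `intra_level_key` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed objective vector: (correctness, area, delay, power).
# Assumption: correctness is maximized; area, delay, and power are minimized.
Candidate = Tuple[float, float, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    # Negate correctness so that all four objectives are minimized.
    ka = (-a[0], a[1], a[2], a[3])
    kb = (-b[0], b[1], b[2], b[3])
    no_worse = all(x <= y for x, y in zip(ka, kb))
    strictly_better = any(x < y for x, y in zip(ka, kb))
    return no_worse and strictly_better


def non_dominated_sort(
    pop: Sequence[Candidate],
    intra_level_key: Optional[Callable[[int], float]] = None,
) -> List[List[int]]:
    """Peel off successive Pareto fronts; returns lists of indices into pop."""
    remaining = set(range(len(pop)))
    fronts: List[List[int]] = []
    while remaining:
        # A candidate is in the current front if nothing left dominates it.
        front = [
            i for i in remaining
            if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)
        ]
        # Hypothetical stand-in for "configurable intra-level sorting":
        # order by the supplied key, or by index when none is given.
        front.sort(key=intra_level_key if intra_level_key else (lambda i: i))
        fronts.append(front)
        remaining -= set(front)
    return fronts


population = [
    (1.0, 120.0, 2.0, 5.0),  # fully correct, fast
    (1.0, 100.0, 2.5, 5.0),  # fully correct, smaller but slower
    (0.8, 90.0, 1.8, 4.0),   # partially correct, best PPA
    (1.0, 130.0, 2.5, 6.0),  # dominated by the first two
    (0.5, 95.0, 2.0, 4.5),   # dominated by the third
]
fronts = non_dominated_sort(population)
```

In this toy population the partially correct third candidate lands in the first front next to the fully correct ones, because no other candidate beats it on all four objectives. A binary correctness gate would have discarded it before its PPA was ever compared.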
On VerilogEval 2.0 and RTLLM 2.0, COEVO achieves 97.5% and 94.5% Pass@1, respectively, with GPT-5.4-mini, surpassing all agentic baselines across four LLM backbones, while attaining the best PPA on 43 out of 49 synthesizable RTLLM designs.
Key facts
- COEVO treats correctness as a continuous co-optimization dimension rather than a binary gate, allowing partially correct candidates to contribute to PPA-optimal solutions
- The framework uses four-dimensional Pareto-based non-dominated sorting to preserve area, delay, and power trade-offs without manual weight tuning
- An adaptive correctness gate with annealing enables PPA-promising but partially correct designs to guide the evolutionary search
- COEVO achieves 97.5% Pass@1 on VerilogEval 2.0 and 94.5% Pass@1 on RTLLM 2.0 with GPT-5.4-mini, surpassing all agentic baselines across four LLM backbones
- The framework attains the best PPA on 43 out of 49 synthesizable RTLLM designs
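The adaptive correctness gate with annealing listed in the key facts can be pictured with a small sketch. The source does not state the schedule, so everything here is an assumption for illustration: a minimum correctness score that rises linearly from `start` to `end` over the run, and the hypothetical names `gate_threshold` and `passes_gate`.

```python
def gate_threshold(generation: int, total_generations: int,
                   start: float = 0.5, end: float = 1.0) -> float:
    """Assumed linear annealing of the minimum correctness score.

    Early generations admit partially correct candidates; the final
    generation requires full correctness. The schedule shape and the
    default bounds are illustrative, not taken from the source.
    """
    if total_generations <= 1:
        return end
    # Fraction of the run completed, clamped to [0, 1].
    frac = min(max(generation / (total_generations - 1), 0.0), 1.0)
    return start + (end - start) * frac


def passes_gate(correctness: float, generation: int,
                total_generations: int) -> bool:
    """A candidate survives if its correctness meets the current threshold."""
    return correctness >= gate_threshold(generation, total_generations)


# A candidate scoring 0.8 on the testbench is admitted early in the
# search but filtered out once the gate has annealed to full correctness.
early = passes_gate(0.8, generation=0, total_generations=11)
late = passes_gate(0.8, generation=10, total_generations=11)
```

Under these assumptions the gate acts as a soft filter that tightens over time, so a PPA-promising design can influence early generations and must be fully correct to survive until the end.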