CASS-RTL steers LLM attention heads to improve RTL code accuracy
CASS-RTL is an inference-time framework that identifies correctness-aware attention heads inside LLMs and uses them to steer generation, improving the accuracy of generated RTL hardware code by a reported 10–20% on VerilogEval and 5% on CVDP, without retraining.
CASS-RTL shows that steering an LLM's internal attention mechanisms at inference time, with no retraining, can improve the functional accuracy of generated RTL hardware code. This matters because in RTL even small logical errors can make a circuit unusable or insecure.
Automatic generation of RTL code from natural language with LLMs is an active research area in chip design, but it poses challenges that general software coding does not: RTL demands strict cycle accuracy and correct handling of concurrency, and even minor logical errors can render a circuit unusable or insecure. Prior approaches to improving LLM-generated RTL have relied on external verification, self-evaluation prompts, retrieval-augmented prompting, domain-specific fine-tuning, agentic pipelines, and reasoning. These largely ignore the LLM's internal attention mechanisms, which may themselves correlate with output correctness.
CASS-RTL, proposed by Mohammad Akyash, Nowfel Mashnoor, and Kimia Azar, addresses this gap by operating directly on a model's internals at inference time. The framework first identifies attention heads whose activation patterns reliably differentiate correct from incorrect RTL outputs. It then constructs a low-dimensional subspace that captures these correctness-relevant signals and applies a geometry-aware intervention to steer the model's generation toward functionally accurate RTL. None of this requires retraining or additional labeled supervision. The authors describe it as a "first-of-its-kind" framework for this purpose.
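The three steps described above (select heads, build a subspace, steer) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the summary does not say how heads are scored or what the geometry-aware intervention looks like. The Fisher-style separability score, the SVD-based subspace, and the projected shift are all assumptions chosen for illustration. The names `score_heads`, `correctness_subspace`, `steer`, `rank`, and `alpha` are hypothetical, and synthetic arrays stand in for real per-head activations.

```python
import numpy as np


def score_heads(acts_correct, acts_incorrect):
    """Score each head by how well it separates correct from incorrect samples.

    Inputs have shape (n_samples, n_heads, head_dim).
    Assumption: a Fisher-style ratio (squared distance between class means
    over within-class variance) stands in for the paper's unspecified criterion.
    """
    mu_c = acts_correct.mean(axis=0)
    mu_i = acts_incorrect.mean(axis=0)
    within = acts_correct.var(axis=0) + acts_incorrect.var(axis=0)
    return ((mu_c - mu_i) ** 2).sum(axis=1) / (within.sum(axis=1) + 1e-8)


def correctness_subspace(acts_correct, acts_incorrect, head, rank):
    """Return an orthonormal basis of shape (head_dim, rank) for one head.

    Assumption: the top right-singular vectors of (correct activations minus
    the incorrect-class mean) approximate the correctness-relevant directions.
    """
    diffs = acts_correct[:, head] - acts_incorrect[:, head].mean(axis=0)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:rank].T


def steer(h, basis, target, alpha):
    """Move h toward target, but only along directions inside the subspace."""
    return h + alpha * basis @ (basis.T @ (target - h))


# Synthetic stand-in for activations collected from a real model.
rng = np.random.default_rng(0)
n, n_heads, d = 200, 4, 8
correct = rng.normal(size=(n, n_heads, d))
incorrect = rng.normal(size=(n, n_heads, d))
correct[:, 2, 0] += 3.0  # by construction, head 2 carries the signal

scores = score_heads(correct, incorrect)
best = int(np.argmax(scores))

basis = correctness_subspace(correct, incorrect, head=best, rank=2)
target = correct[:, best].mean(axis=0)
h = incorrect[0, best]
steered = steer(h, basis, target, alpha=1.0)

print("most separable head:", best)
```

Restricting the shift to the subspace leaves the component of `h` outside it untouched, which is one plausible reading of an intervention that changes correctness-related signals without disturbing the rest of the representation. In a real model the arrays would come from hooks on the attention outputs, and `alpha` would need tuning.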
Empirical results across multiple models show 10–20% improvement in pass@1/5/10 accuracy on the VerilogEval benchmark and a 5% improvement on CVDP. Because CASS-RTL is fully model-agnostic and integrates into existing models without modifying weights or requiring large labeled datasets, it offers a practical path to more reliable LLM-based hardware design workflows.
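For readers unfamiliar with the pass@1/5/10 metric, the sketch below computes pass@k with the unbiased estimator commonly used in code-generation benchmarks. The summary does not state which estimator the authors used, or whether the reported gains are absolute or relative, so this is an assumption. The function name `pass_at_k` and the sample counts are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: samples generated per problem
    c: how many of those n samples passed the testbench
    k: sampling budget being evaluated (k <= n)
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a pass.
        return 1.0
    # 1 minus the fraction of k-subsets drawn entirely from failing samples.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 20 samples for one problem, 5 of them pass.
for k in (1, 5, 10):
    print(f"pass@{k} = {pass_at_k(20, 5, k):.4f}")
```

A benchmark score is the mean of this per-problem value over all problems, so an intervention that raises `c` on even a few problems moves pass@1 more than pass@10, where scores are already close to 1.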
Key facts
- CASS-RTL identifies attention heads whose activation patterns consistently differentiate correct from incorrect RTL code.
- The framework constructs a low-dimensional subspace capturing correctness-relevant signals and applies a geometry-aware intervention at inference time.
- It is fully model-agnostic and requires no additional supervision or retraining.
- Evaluation shows 10–20% improvement in pass@1/5/10 accuracy on VerilogEval.
- Evaluation shows 5% improvement on CVDP.
- Prior RTL generation approaches largely overlooked LLMs' internal attention mechanisms as a lever for correctness.
- The paper is authored by Mohammad Akyash, Nowfel Mashnoor, and Kimia Azar.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 17:05 UTC.