Heliodor: An RVA23-Compliant Multicore Out-of-Order RISC-V Core in Veryl


2026-06-24

Heliodor is a from-scratch multicore out-of-order RISC-V core written in Veryl. We have been using it as a real-scale design to drive the Veryl compiler and the native Veryl simulator. This post announces a milestone: with the addition of the vector (V) extension, Heliodor is now compliant with the full RVA23 profile, and it runs a vector-enabled Linux kernel with userspace vector code on top of it. It also passes the official RISC-V Architectural Compatibility Tests (ACT) v4, not just for the base RV64GC but across the broad set of RVA23 extensions that ACT v4 covers.

Design

Heliodor is a 2-wide superscalar out-of-order RV64 core. It performs register renaming onto a physical register file (RAT + free list, with a 64-entry integer PRF and 64-entry FP PRF), uses Tomasulo-style dynamic scheduling, and commits in order through a 32-entry reorder buffer for precise exceptions. Branch prediction combines a BTB, a gshare BHT, a TAGE-lite direction predictor, an indirect-target BTB, and a return-address stack, with execute-time early redirect on mispredicts. The memory pipeline does memory-dependence speculation with commit-time replay, store-to-load forwarding from in-flight and committed stores, and a write-combining committed-store buffer.
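To make the RAT + free list idea concrete, here is a minimal Python sketch of integer register renaming sized to the 64-entry integer PRF mentioned above. It is a toy behavioral model, not Heliodor's Veryl implementation: the class name `Renamer`, its methods, and the initial mapping are all assumptions made for illustration, and it ignores checkpointing and mispredict recovery.

```python
from collections import deque

NUM_ARCH_REGS = 32  # RV64 integer architectural registers x0..x31
NUM_PHYS_REGS = 64  # matches the 64-entry integer PRF described above


class Renamer:
    """Toy RAT + free-list renamer (hypothetical, for illustration only)."""

    def __init__(self):
        # Assumed reset state: arch reg i maps to phys reg i, the rest are free.
        self.rat = list(range(NUM_ARCH_REGS))
        self.free = deque(range(NUM_ARCH_REGS, NUM_PHYS_REGS))

    def rename(self, rd, rs1, rs2):
        """Return (pd, ps1, ps2, old_pd), or None if the free list is empty."""
        # Sources are looked up before the destination mapping is updated.
        ps1, ps2 = self.rat[rs1], self.rat[rs2]
        if rd == 0:
            return (0, ps1, ps2, None)  # x0 is hardwired and never renamed
        if not self.free:
            return None  # a real rename stage would stall here
        pd = self.free.popleft()
        old_pd = self.rat[rd]
        self.rat[rd] = pd
        return (pd, ps1, ps2, old_pd)

    def commit(self, old_pd):
        """At in-order commit, the previous mapping of rd can be recycled."""
        if old_pd is not None:
            self.free.append(old_pd)


r = Renamer()
print(r.rename(rd=5, rs1=1, rs2=2))  # add x5, x1, x2 -> (32, 1, 2, 5)
print(r.rename(rd=5, rs1=5, rs2=3))  # add x5, x5, x3 -> (33, 32, 3, 32)
```

The second instruction reads x5 through physical register 32, the value produced by the first, while writing a fresh physical register 33. That is how renaming removes the write-after-write hazard on x5 while preserving the true dependence.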

It runs as 1 / 2 / 4 / 8-hart SMP. Each hart has private write-back L1 caches (16 KB, 4-way, non-blocking with MSHRs and critical-word-first fill) kept coherent through a shared 128 KB L2 with an inclusive MESI directory and cache-to-cache transfer; AMO / LR-SC stay in cache and the instruction side is coherent, so FENCE.I and SFENCE.VMA need no flush sweep. The shared bus is a split-transaction read controller with modeled DRAM latency.
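For readers less familiar with MESI, the sketch below shows the textbook state transitions for a single cache line in one L1. It is a generic illustration under simplifying assumptions (no transient states, no directory bookkeeping), and the function names are hypothetical; Heliodor's actual directory protocol is not described in this post and may differ.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_local_access(state, is_write, others_have_copy):
    """Next state of the line in the L1 that issues the access."""
    if is_write:
        # A write ends in Modified; any other copies are invalidated first.
        return State.M
    if state is State.I:
        # Read miss: Exclusive if no other L1 holds the line, else Shared.
        return State.S if others_have_copy else State.E
    return state  # a read hit leaves M / E / S unchanged


def on_remote_request(state, remote_is_write):
    """Next state of the line in an L1 probed on behalf of another hart."""
    if state is State.I:
        return State.I
    if remote_is_write:
        return State.I  # a remote write invalidates our copy
    return State.S  # a remote read downgrades M / E to S


s = on_local_access(State.I, is_write=False, others_have_copy=False)
print(s)  # State.E
s = on_remote_request(s, remote_is_write=False)
print(s)  # State.S
```

When the probed line is in M, the owning L1 has the only up-to-date data, which is where a cache-to-cache transfer like the one mentioned above saves a round trip to memory.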

The ISA is the full RVA23 profile, meaning RVA23S64, the supervisor-mode profile for 64-bit application processors, which subsumes the RVA23U64 user profile. That covers RV64GC plus Zba/Zbb/Zbs, Zacas/Zabha atomics, Zfa, Zfhmin, Sstc, Svnapot, Svpbmt, Svadu, Zicbom/Zicboz, Zicond, Zawrs, Svinval, Zkt, Sscofpmf, Supm/Ssnpm, PMP (Smpmp), Sv39 virtual memory at M / S / U, the hypervisor (H) extension that RVA23S64 mandates, and the just-added vector (V) extension, the last RVA23 piece to land.

The H-extension support spans the HS / VS / VU modes, two-stage Sv39 × Sv39x4 translation cached in a VMID/VS-ASID-tagged TLB, HLV/HLVX/HSV, the guest-page-fault trio, virtual-interrupt delivery, and Sstc-in-VS guest timers.
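As a rough illustration of what two-stage Sv39 × Sv39x4 translation involves, the Python sketch below splits addresses into page-table indices and counts the memory reads of a worst-case walk. The field widths follow the RISC-V privileged specification (Sv39x4 widens the root index to 11 bits for a 41-bit guest physical address); the function names are hypothetical, and the access count assumes 4 KiB leaf pages at both stages with no TLB or walk caching.

```python
def sv39_indices(va):
    """Split a 39-bit guest virtual address into (vpn2, vpn1, vpn0, offset)."""
    return ((va >> 30) & 0x1FF, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF, va & 0xFFF)


def sv39x4_indices(gpa):
    """Split a 41-bit guest physical address; the root index is 11 bits wide."""
    return ((gpa >> 30) & 0x7FF, (gpa >> 21) & 0x1FF, (gpa >> 12) & 0x1FF, gpa & 0xFFF)


def worst_case_walk_accesses(vs_levels=3, g_levels=3):
    """Memory reads for a full two-stage walk with no cached translations.

    Each VS-stage PTE address is a guest physical address, so it needs a
    G-stage walk (g_levels reads) before the PTE itself can be read. The
    final guest physical address then needs one more G-stage walk.
    """
    return vs_levels * (g_levels + 1) + g_levels


print(sv39x4_indices(0x80200000))  # (2, 1, 0, 0)
print(worst_case_walk_accesses())  # 15
```

Fifteen reads per uncached translation is the reason a TLB that caches the combined result, tagged by VMID and ASID as described above, matters so much for guest performance.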

The vector unit (RVV 1.0) is a decoupled, in-order unit beside the out-of-order scalar core: it owns the 32 × 128-bit vector register file (VLEN = 128, ELEN = 64) and the vector CSRs, and runs vector ops in program order rather than renaming them into the scalar out-of-order machinery. It covers integer and single/double FP at all LMUL, the full range of vector loads/stores, masking, and widening/narrowing. Vector state plugs into the standard mstatus.VS / sstatus.VS / vsstatus.VS controls and the same two-stage translation as the rest of the H extension, so it works across all of M / HS / VS / VU.
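To give a feel for what VLEN = 128 means for software, the sketch below computes VLMAX = LMUL × VLEN / SEW, the RVV 1.0 formula for the maximum number of elements per vector instruction, and then models a strip-mined loop. The helper names are hypothetical, and the loop assumes the simple policy vl = min(AVL, VLMAX), which the specification permits but does not require of every implementation.

```python
from fractions import Fraction

VLEN = 128  # bits per vector register, as stated above


def vlmax(sew, lmul):
    """VLMAX = LMUL * VLEN / SEW (sew in bits, lmul may be fractional)."""
    return int(Fraction(lmul) * VLEN / sew)


def strip_mine(n, sew, lmul):
    """Yield the vl used on each loop iteration to process n elements.

    Assumes vl = min(AVL, VLMAX); other legal vsetvli policies exist.
    """
    remaining = n
    while remaining > 0:
        vl = min(remaining, vlmax(sew, lmul))
        yield vl
        remaining -= vl


print(vlmax(sew=8, lmul=8))                # 128 byte elements across 8 registers
print(vlmax(sew=64, lmul=1))               # 2 doubleword elements per register
print(list(strip_mine(10, sew=32, lmul=1)))  # [4, 4, 2]
```

With a 128-bit register a single LMUL = 1 operation covers only two 64-bit elements, so code that wants longer vectors on this configuration would typically raise LMUL to group registers.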

Verification

Heliodor's scalar RVA23 ISA is machine-checked against the official RISC-V Architectural Compatibility Tests (ACT) v4, the conformance suite whose expected results are signed by the formal RISC-V Sail golden model. It passes 506 / 506 across integer / atomic (incl. Zacas / Zabha) / FP / compressed / CSR / Zb* / Zc* / Zfa / Zfhmin / Zicbo* / PMP / Sv* / exceptions. It also runs the riscv-tests arch suites, the in-tree directed tests for OoO, memory coherence, and H-extension corners, and an RVWMO litmus harness.

ACT v4 has no vector suite (nor one for the hypervisor extension), so those two are validated separately. The vector unit is checked by Heliodor's in-tree RVV arch suites, then exercised end-to-end by booting a V-aware Linux 7.1 kernel that discovers the vector extension and runs vector code. Heliodor boots mainline Linux 5.15 / 6.6 LTS / 7.1 SMP through OpenSBI to SBI shutdown on 1 / 2 / 4 harts (and 5.15 on 8 harts); the 7.1 boot also drives userspace floating point across SMP context switches. For the hypervisor side, a self-written bare-metal type-1 hypervisor boots an unmodified guest Linux to its own userspace using the H extension. The boots are cross-checked on Verilator and a second Veryl simulator backend.

Here is the full boot log. The [HV] lines are the type-1 hypervisor, and everything between them is an unmodified guest Linux 7.1 (Machine model: Heliodor guest) running on the virtualized HS / VS / VU modes, from hypervisor start through the guest discovering Sstc-in-VS, its own init, and SBI shutdown:

[HV] hypervisor up on heliodor H-extension
[HV] entering guest at 0x0000000080200000
Booting Linux on hartid 0
Linux version 7.1.0 (builder@host) (riscv64-unknown-elf-gcc (g04696df0963) 14.2.0, GNU ld (GNU Binutils)...
