Salt: Systems programming, mathematically verified
Salt delivers exact performance parity with clang -O3
Systems programming,
mathematically verified.
Verified safety without the runtime cost.
Z3-proven contracts at build time.
No GC. No borrow checker.
Mathematical certainty.
View on GitHub
Try it live →
hello.salt
package main
use std.collections.HashMap
fn main() {<br>let mut map = HashMapStringView, i64>::new();<br>map.insert("hello", 1);<br>map.insert("world", 2);
// Pipe + optional chaining<br>map.get("hello") |?> println(f"Found: {_}");
// Iterate with f-strings<br>for entry in map.iter() {<br>println(f"{entry.key}: {entry.value}");
Built with Salt.
Salt is built alongside real systems to validate its compiler guarantees in practice.
Basalt
A complete Llama 2 inference engine. Weights are memory-mapped directly from disk,<br>matrix multiplications are tiled through MLIR's affine optimizer, and every<br>compute kernel carries a Z3 proof that its array accesses are in bounds.<br>The whole thing is about 600 lines of Salt.
~870
tok/s
≈C
vs llama2.c
~600
lines of code
Z3
verified kernels
View source →
kernels.salt
// Z3-verified matrix multiply<br>fn mat_mul(<br>out: Ptrf32>,<br>a: Ptrf32>,<br>b: Ptrf32>,<br>m: i64, n: i64, d: i64<br>) requires(m > 0, n > 0, d > 0) {<br>for i in 0..m {<br>for j in 0..n {<br>let mut sum: f32 = 0.0;<br>for k in 0..d {<br>sum = sum + a[i*d+k] * b[k*n+j];<br>out[i * n + j] = sum;
kmain.salt
// KeuOS kernel entry point<br>@no_mangle<br>pub fn kmain(magic: u32, mb: &u8) {<br>serial.init();<br>gdt.init();<br>idt.init();<br>pit.init();<br>run_smp_tests(); // ACPI → APIC → AP boot<br>percpu_init_bsp();<br>scheduler.init();
serial.print("KEUOS BOOT [OK]\n");<br>scheduler.start();<br>// Preemptive scheduling is live<br>bench_suite_run();
KeuOS
A keuos microkernel that boots on real x86 hardware, written from scratch in Salt.<br>16-core SMP with Chase-Lev work-stealing, ACPI/APIC,<br>preemptive scheduler with 7-field fiber migration, per-core sharded allocators,<br>a zero-trap socket API where read() and write() are pure<br>shared-memory<br>operations with no kernel traps, a Ring 3 TCP stack hardened with stateless SYN cookies<br>(SipHash-2-4), and epoch-based memory reclamation.<br>No C runtime, no libc, no external dependencies.
59
cycle alloc
487
cycle ctx switch
297
cycle IPC
KVM
bare metal
View source →
59 cycles
Arena Allocation
~15 ns @ 4.0 GHz · glibc malloc ≈<br>1,000–5,000
487 cycles
Context Switch (16-core<br>SMP)
~122 ns · full GPR + 512B FXSAVE
297 cycles
IPC Ping-Pong
~74 ns · fiber-to-fiber zero-copy<br>yield
60M+ PPS
NetD C10M Data Plane
Ring 3 SPSC bridge · 6× C10M<br>target
102 cycles
Null Syscall (Ring<br>3→0→3)
~26 ns · SWAPGS + KPTI · 7.4×<br>Linux
136 cycles
Socket Data Plane
per 64B read+write · zero kernel<br>traps
Intel Xeon Platinum 8151 (Skylake, 4.0 GHz), AWS z1d.metal, QEMU/KVM.<br>-enable-kvm -cpu host.<br>Full methodology →
KeuOS Train
A neural network that trains on MNIST handwritten digits — entirely in Salt.<br>Weight matrices are arena-allocated, every array access is Z3-verified through<br>Slice bounds, and training time is competitive with the equivalent C implementation.<br>The entire trainer is 188 lines.
97%
accuracy
≈C
training time
188
lines of code
Z3
bounds verified
View source →
keuos_train.salt
// Arena + Slice — the safety bridge<br>let arena = Arena::new(4194304);<br>let w1 = Slicef32>::new(<br>arena.alloc_array::f32>(HIDDEN * INPUT),<br>HIDDEN * INPUT<br>);
for epoch in 0..8 {<br>for sample in 0..TRAIN_COUNT {<br>matvec(w1, input, pre_relu, HIDDEN, INPUT);<br>relu(hidden.as_ptr(), HIDDEN);<br>softmax_cross_entropy_grad(...);<br>println(f"Epoch {epoch}: {accuracy}%");
Lettuce
A Redis-compatible in-memory data store. Leads Redis at every<br>concurrency level tested (1–100 clients, 13-point sweep) using<br>zero-copy RESP parsing, an arena-backed hash table, and a kqueue<br>event loop. 567 total lines of Salt, zero external deps.
1–100
clients tested
all
concurrency wins
567
lines of code
Redis commands
View source →
server.salt
// The entire event loop<br>fn run(&mut self) {<br>let poller = Poller::new();<br>poller.add(self.listener.fd, 0);
loop {<br>let events = poller.wait(64);<br>for ev in events {<br>if ev.fd == self.listener.fd {<br>self.accept_client(&poller);<br>} else {<br>self.handle_client(ev.fd);
GPU-accelerated graphics: FACET
A full-stack 2D rendering engine: from Bézier curve flattening to Metal<br>compute shaders. Five composable layers (raster, window, compositor,<br>UI, GPU), and every pixel write carries a Z3 proof that it's within the<br>framebuffer. Algorithm-identical to the C reference: Salt's MLIR codegen<br>matches clang -O3 at 457 fps.
457
fps
≈C
raster perf
layers
Metal
GPU backend
View source →
demo_tiger.salt
raster.salt
// Z3-verified pixel writes<br>pub fn set_pixel(<br>&mut self,<br>x: i32, y: i32,<br>r: u8, g: u8, b: u8, a: u8<br>) requires(<br>x >= 0 && x = 0 && y let off = (y as i64) * self.stride<br>+ (x as i64) * 4;<br>self.pixels.offset(off).write(r);<br>// Z3 proves: off in [0, stride*height)
C10M
High-connection networking...