Lines or Less: Test Case Minimization

birdculture1 pts0 comments

256 Lines or Less: Test Case Minimization

Property Based Testing and fuzzing are a deep and science-intensive<br>topic. There are enough advanced techniques there for a couple of<br>PhDs, a PBT daemon, and a client-server architecture. But I have this weird parlor-trick<br>PBT library, implementable in a couple of hundred lines of code in one<br>sitting.

This week I’ve been thinking about a cool variation of a consensus<br>algorithm. I implemented it on the weekend. And it took just a couple<br>of hours to write a PBT library itself first, and then a test, that<br>showed a deep algorithmic flaw in my thinking (after a dozen trivial<br>flaws in my coding). So, I don’t get to write more about consensus<br>yet, but I at least can write about the library. It is very simple,<br>simplistic even. To use an old Soviet joke about<br>Babel and Bebel, it’s Gogol rather than Hegel. But for just 256 lines,<br>it’s one of the highest power-to-weight ratio tools in my toolbox.

Read this post if:

You want to stretch your generative testing muscles.

You are a do-it-yourself type, and wouldn’t want to pull a<br>ginormous PBT library off the shelf.

You would pull a library, but want to have a more informed opinion<br>about available options, about essential and accidental complexity.

You want some self-contained real-world Zig examples :P

Zig works well here because it, too, is exceptional in its<br>power-to-weight.

FRNG

The implementation is a single file,<br>FRNG.zig, because the core abstraction here is a<br>Finite Random Number Generator — a PRNG where all numbers are<br>pre-generated, and can run out. We start with standard boilerplate:

const std = @import("std");<br>const assert = std.debug.assert;

entropy: []const u8,

pub const Error = error{OutOfEntropy};

const FRNG = @This();

pub fn init(entropy: []const u8) FRNG {<br>return .{ .entropy = entropy };

In Zig, files are structs: you obviously need structs, and the<br>language becomes simpler if structs are re-used for what files are.<br>In the above<br>const FRNG = @This()<br>assigns a conventional name to the file struct, and<br>entropy: []const u8<br>declares instance fields (only one here). const Error<br>and fn init are “static” (container level)<br>declarations.

The only field we have is just a slice of raw bytes, our<br>pre-generated random numbers. And the only error condition we can<br>raise is OutOfEntropy.

The simplest thing we can generate is a slice of bytes. Typically,<br>API for this takes a mutable slice as an out parameter:

pub fn fill(prng: *PRNG, bytes: []u8) void { ... }

But, due to pre-generated nature of FRNG, we can return the slice<br>directly, provided that we have enough entropy. This is going to be<br>our (sole) basis function, everything else is going to be a<br>convenience helper on top:

pub fn bytes(frng: *FRNG, size: usize) Error![]const u8 {<br>if (frng.entropy.len return error.OutOfEntropy;<br>const result = frng.entropy[0..size];<br>frng.entropy = frng.entropy[size..];<br>return result;

The next simplest thing is an array (a slice with a fixed size):

pub fn array(frng: *FRNG, comptime size: usize) Error![size]u8 {<br>return (try frng.bytes(size))[0..size].*;

Notice how Zig goes from runtime-known slice length, to comptime<br>known array type. Because size is a comptime constant, slicing []const u8 with<br>[0..size] returns a pointer to array,<br>*const [size]u8.

We can re-interpret a 4-byte array into u32. But,<br>because this is Zig, we can trivially generalize the function to<br>work for any integer type, by passing in Int comptime<br>parameter of type type:

const builtin = @import("builtin");

pub fn int(frng: *FRNG, Int: type) Error!Int {<br>comptime {<br>assert(@typeInfo(Int).int.signedness == .unsigned);<br>assert(builtin.cpu.arch.endian() == .little);<br>return @bitCast(try frng.array(@sizeOf(Int)));

This function is monomorphised for every Int type, so<br>@sizeOf(Int) becomes a compile-time constant we can<br>pass to fn array.

Production code would be endian-clean here, but, for simplicity, we<br>encode our endianness assumption as a compile-time assertion. Note<br>how Zig communicates information about endianness to the program.<br>There isn’t any kind of side-channel or extra input to<br>compilation, like --cfg flags. Instead, the compiler<br>materializes all information about target CPU as Zig code. There’s<br>a builtin.zig file somewhere in the compiler caches<br>directory that contains

pub const cpu: std.Target.Cpu = .{<br>.arch = .aarch64,<br>.model = &std.Target.aarch64.cpu.apple_m3,<br>// ...

This file can be accessed via @import("builtin") and all the constants inspected<br>at compile time.

We can make an integer, and a boolean is even easier:

pub fn boolean(frng: *FRNG) Error!bool {<br>return (try frng.int(u8)) & 1 == 1;

Strictly speaking, we only need one bit, not one byte, but tracking<br>individual bits is too much of a hassle.

From an arbitrary int, we can generate an int in range. As per<br>Random Numbers Included, we use a closed range, which<br>makes the API infailable and is usually more convenient at the<br>call-site:

pub fn int_inclusive(frng: *FRNG, Int: type, max: Int)...

frng const size entropy error type

Related Articles