A Vision for a Rust Formal Specification

emschwartz1 pts0 comments

A Vision for a Rust Formal Specification | Nadri's musings

Many people want to know, precisely, which pieces of text are valid Rust programs,<br>and for those that are, what they do.<br>This group includes compiler writers, language designers, researchers, unsafe code writers,<br>safety-critical industry assessors, and of course any Rust developer trying to understand a piece of<br>code.

A document that answers these questions is called a "specification".<br>If it is precise enough, it's called "formal".<br>We'd quite like to get one of these.<br>Even better, an ideal specification would also be:

Understandable by non-experts;

Executable, so we can test it on real code;

Well-structured, so it can be turned into maths and have stuff proven about it;

Easy to evolve, so it can be at the heart of language evolution instead of an afterthought.

We're not close to having that today,<br>but lately a couple of efforts have been coming together<br>that paint a story I find quite compelling.<br>Allow me to sketch it for you.

The Core Ingredients: MiniRust, Desugarings, and Trait Proofs

The Base: MiniRust

MiniRust is the backbone of this whole vision.<br>It's a project led by Pr. Ralf Jung<br>to precisely specify the runtime (aka dynamic) semantics<br>of Rust code, which includes precisely answering the<br>question of what is or isn't Undefined Behavior (UB).

MiniRust ticks all the boxes I laid out above:

It's written in code mixed with English explanations, legible by anyone who knows Rust;

It takes the form of an interpreter for a simplified Rust, which makes it directly executable. UB<br>is simply an exception raised by this interpreter when encountering an invalid program state;

It is amenable to being turned into maths, I hear there are students working on it;

It is easy to evolve: two<br>recent proposals for changes to Rust<br>semantics came with patches to MiniRust in order to remove the ambiguity inherent in an<br>English-text proposal.

It's a real thing you can use today, and it is close to being considered the source of truth<br>about Rust's runtime semantics1.<br>That part I consider taken care of, and I tip my hat to Ralf for a beautiful vision and execution.

From Rust to MiniRust: Desugarings

MiniRust captures a very bare-bones version of Rust:<br>it has no generics, no patterns, no method call syntax, no dropping at end of scope (in fact, no<br>scopes), etc.<br>It's reduced to 1. a control-flow graph, 2. individual statements that each do one simple operation.<br>This is technically enough to execute any real Rust program,<br>but MiniRust doesn't describe how to handle any of these advanced features.

Hence the second piece of the puzzle: we need to describe how to go from a real Rust program<br>to one that MiniRust can execute.<br>Based on an idea shared by Niko Matsakis, I've been exploring the approach<br>of defining advanced features by translating (desugaring)<br>them into more basic ones.<br>E.g. pattern-matching can be defined by how it desugars into simple boolean conditions.

To prove that this idea could work, I wrote an mdbook<br>that sketches 30 or so desugaring steps that transform<br>a Rust function down to something very close to a MiniRust one.<br>I propose this as the second half of our spec: a tower of small desugarings<br>that explain powerful features in terms of simpler ones.

The Missing Piece: Generics and Trait Proofs

This two-part model (desugarings+MiniRust)<br>covers a significant part of the Rust language<br>but leaves one important domain uncovered: generics, and in particular traits.<br>MiniRust has no notion of traits, and it's also<br>not obvious how to describe trait solving as a desugaring.

It so happens that the types team has lately been<br>experimenting with an idea called "dictionary-passing style"<br>or "explicit trait proofs",<br>in which the trait solver would keep track explicitly of how it came to conclude that a particular<br>trait fact holds.

That's the third piece of the puzzle: I propose that<br>the target of our desugarings should have syntax to explicitly<br>refer to trait impls and trait bounds, to combine them, to call methods on them etc.<br>I wrote this blog post<br>to introduce the idea, and wrote up<br>a syntax proposal<br>that covers most of what we should need2.

Therefore what we get after desugaring isn't a MiniRust program,<br>but what I'm uninventively3 calling a PolyMiniRust program, which<br>is MiniRust extended with generics and trait proofs.

The Complete Specification

Here is what I have in mind:

The whole spec would be a program written in specr, the Rust-flavored literate language<br>backing MiniRust, with abundant explanatory English text (like MiniRust);

It would be structured as a book, much like rust-via-desugarings;

The first chapter would define the datastructures for our language, e.g. struct FunctionDeclaration, enum Type etc;

The second chapter would start with parsing text into these datastructures, followed by<br>a series of individual desugaring steps, applied in order to the source program;

The last chapter would define PolyMiniRust as a subset of...

rust minirust trait program desugarings specification

Related Articles