A Vision for a Rust Formal Specification | Nadri's musings
Many people want to know, precisely, which pieces of text are valid Rust programs,<br>and for those that are, what they do.<br>This group includes compiler writers, language designers, researchers, unsafe code writers,<br>safety-critical industry assessors, and of course any Rust developer trying to understand a piece of<br>code.
A document that answers these questions is called a "specification".<br>If it is precise enough, it's called "formal".<br>We'd quite like to get one of these.<br>Even better, an ideal specification would also be:
Understandable by non-experts;
Executable, so we can test it on real code;
Well-structured, so it can be turned into maths and have stuff proven about it;
Easy to evolve, so it can be at the heart of language evolution instead of an afterthought.
We're not close to having that today,<br>but lately a couple of efforts have been coming together<br>that paint a story I find quite compelling.<br>Allow me to sketch it for you.
The Core Ingredients: MiniRust, Desugarings, and Trait Proofs
The Base: MiniRust
MiniRust is the backbone of this whole vision.<br>It's a project led by Pr. Ralf Jung<br>to precisely specify the runtime (aka dynamic) semantics<br>of Rust code, which includes precisely answering the<br>question of what is or isn't Undefined Behavior (UB).
MiniRust ticks all the boxes I laid out above:
It's written in code mixed with English explanations, legible by anyone who knows Rust;
It takes the form of an interpreter for a simplified Rust, which makes it directly executable. UB<br>is simply an exception raised by this interpreter when encountering an invalid program state;
It is amenable to being turned into maths, I hear there are students working on it;
It is easy to evolve: two<br>recent proposals for changes to Rust<br>semantics came with patches to MiniRust in order to remove the ambiguity inherent in an<br>English-text proposal.
It's a real thing you can use today, and it is close to being considered the source of truth<br>about Rust's runtime semantics1.<br>That part I consider taken care of, and I tip my hat to Ralf for a beautiful vision and execution.
From Rust to MiniRust: Desugarings
MiniRust captures a very bare-bones version of Rust:<br>it has no generics, no patterns, no method call syntax, no dropping at end of scope (in fact, no<br>scopes), etc.<br>It's reduced to 1. a control-flow graph, 2. individual statements that each do one simple operation.<br>This is technically enough to execute any real Rust program,<br>but MiniRust doesn't describe how to handle any of these advanced features.
Hence the second piece of the puzzle: we need to describe how to go from a real Rust program<br>to one that MiniRust can execute.<br>Based on an idea shared by Niko Matsakis, I've been exploring the approach<br>of defining advanced features by translating (desugaring)<br>them into more basic ones.<br>E.g. pattern-matching can be defined by how it desugars into simple boolean conditions.
To prove that this idea could work, I wrote an mdbook<br>that sketches 30 or so desugaring steps that transform<br>a Rust function down to something very close to a MiniRust one.<br>I propose this as the second half of our spec: a tower of small desugarings<br>that explain powerful features in terms of simpler ones.
The Missing Piece: Generics and Trait Proofs
This two-part model (desugarings+MiniRust)<br>covers a significant part of the Rust language<br>but leaves one important domain uncovered: generics, and in particular traits.<br>MiniRust has no notion of traits, and it's also<br>not obvious how to describe trait solving as a desugaring.
It so happens that the types team has lately been<br>experimenting with an idea called "dictionary-passing style"<br>or "explicit trait proofs",<br>in which the trait solver would keep track explicitly of how it came to conclude that a particular<br>trait fact holds.
That's the third piece of the puzzle: I propose that<br>the target of our desugarings should have syntax to explicitly<br>refer to trait impls and trait bounds, to combine them, to call methods on them etc.<br>I wrote this blog post<br>to introduce the idea, and wrote up<br>a syntax proposal<br>that covers most of what we should need2.
Therefore what we get after desugaring isn't a MiniRust program,<br>but what I'm uninventively3 calling a PolyMiniRust program, which<br>is MiniRust extended with generics and trait proofs.
The Complete Specification
Here is what I have in mind:
The whole spec would be a program written in specr, the Rust-flavored literate language<br>backing MiniRust, with abundant explanatory English text (like MiniRust);
It would be structured as a book, much like rust-via-desugarings;
The first chapter would define the datastructures for our language, e.g. struct FunctionDeclaration, enum Type etc;
The second chapter would start with parsing text into these datastructures, followed by<br>a series of individual desugaring steps, applied in order to the source program;
The last chapter would define PolyMiniRust as a subset of...