The Typestate Pattern in Rust - Cliffle
Copyright ©2011-2024 Cliff L. Biffle
2019-06-05
What are typestates?
A simple example: the living and the dead
More than one “living” state
Typestate in the wild: serde
Variation: state type parameter
Variation: state types that contain actual state
Conclusions
The typestate pattern is an API design pattern that encodes information about an object’s run-time state in its compile-time type. In particular, an API using the typestate pattern will have:
1. Operations on an object (such as methods or functions) that are only available when the object is in certain states,
2. A way of encoding these states at the type level, such that attempts to use the operations in the wrong state fail to compile,
3. State transition operations (methods or functions) that change the type-level state of objects in addition to, or instead of, changing run-time dynamic state, such that the operations in the previous state are no longer possible.
This is useful because:
It moves certain types of errors from run-time to compile-time, giving programmers faster feedback.
It interacts nicely with IDEs, which can avoid suggesting operations that are illegal in a certain state.
It can eliminate run-time checks, making code faster/smaller.
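To make the three ingredients concrete, here is a minimal sketch. The `Stopped` and `Running` types and all of their methods are hypothetical names invented for this illustration, not part of any real library. Each state is its own type, each operation lives on the type where it is legal, and each transition takes `self` by value so the old state cannot be used again.

```rust
// Hypothetical types for illustration only.

/// An engine that is not running. It has no `rpm` method at all.
pub struct Stopped;

/// An engine that is running.
pub struct Running {
    rpm: u32,
}

impl Stopped {
    pub fn new() -> Self {
        Stopped
    }

    /// State transition: takes `self` by value, so the `Stopped` value is
    /// consumed and the caller gets a `Running` back in its place.
    pub fn start(self) -> Running {
        Running { rpm: 800 }
    }
}

impl Running {
    /// Only available in the `Running` state.
    pub fn rpm(&self) -> u32 {
        self.rpm
    }

    /// Stays in the `Running` state, so a borrow is enough.
    pub fn accelerate(&mut self, delta: u32) {
        self.rpm += delta;
    }

    /// State transition back to `Stopped`.
    pub fn stop(self) -> Stopped {
        Stopped
    }
}

/// Walks through one legal sequence of operations and returns the last rpm.
pub fn legal_sequence() -> u32 {
    let engine = Stopped::new();
    // engine.rpm();        // would not compile: `Stopped` has no `rpm` method
    let mut engine = engine.start();
    engine.accelerate(1200);
    let rpm = engine.rpm();
    let _stopped = engine.stop();
    // engine.rpm();        // would not compile: `engine` was moved by `stop`
    rpm
}
```

Nothing in this sketch checks at run time whether the engine is running; the question simply cannot be asked of a `Stopped` value.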
This pattern is so easy in Rust that it’s almost obvious, to the point that you may have already written code that uses it, perhaps without realizing it. Interestingly, it’s very difficult to implement in most other programming languages: most of them fail to satisfy items number 2 and/or 3 above.
I haven’t seen a detailed examination of the nuances of this pattern, so here’s my contribution.
What are typestates?
Typestates are a technique for moving properties of state (the dynamic information a program is processing) into the type level (the static world that the compiler can check ahead-of-time).
Typestates are a broader topic than the specific pattern I’ll discuss here, which is why I’m calling it the “typestate pattern.”
The special case of typestates that interests us here is the way they can enforce run-time order of operations at compile-time. Here are some examples of properties that can be enforced by the typestate pattern in Rust (I assert; I don’t have implementations of all of them):
“The buffer can only be translated if you have checked that it’s valid UTF-8.”
“You must not perform any I/O operations on a file handle after it’s been closed.”
“These messages can only be sent to the client after authentication has succeeded, and not after we have ended the session.”
“Once you have done action A, you must perform either B or C (but not both) before you can do D.”
In most other languages, we would have to handle these with runtime checks and errors/exceptions. Or, we might get lazy and not check them at all, instead mentioning them in the documentation and hoping people read it!
With the typestate pattern, we can prevent code that breaks these rules from compiling, helping programmers find mistakes earlier and eliminating the overhead of run-time checks.
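As a rough sketch of the first property in the list (“the buffer can only be translated if you have checked that it’s valid UTF-8”), consider the following. `RawBuffer`, `Utf8Buffer`, `check`, and `translate` are hypothetical names made up for this example, and uppercasing is an arbitrary stand-in for “translation.” The only real library calls are `String::from_utf8` and `FromUtf8Error::into_bytes` from the standard library.

```rust
/// Bytes of unknown encoding (hypothetical type). It has no text operations.
pub struct RawBuffer {
    bytes: Vec<u8>,
}

/// Bytes that have been checked as valid UTF-8 (hypothetical type).
pub struct Utf8Buffer {
    text: String,
}

impl RawBuffer {
    pub fn new(bytes: Vec<u8>) -> Self {
        RawBuffer { bytes }
    }

    /// Fallible state transition. On success the raw buffer is consumed and
    /// a `Utf8Buffer` comes back; on failure the caller gets the raw buffer
    /// back and is still in the unchecked state.
    pub fn check(self) -> Result<Utf8Buffer, RawBuffer> {
        match String::from_utf8(self.bytes) {
            Ok(text) => Ok(Utf8Buffer { text }),
            // `into_bytes` hands back the original, unmodified bytes.
            Err(err) => Err(RawBuffer { bytes: err.into_bytes() }),
        }
    }
}

impl Utf8Buffer {
    /// Stand-in for any operation that needs valid text. It exists only on
    /// the checked type, so it cannot be called on a `RawBuffer`.
    pub fn translate(&self) -> String {
        self.text.to_uppercase()
    }
}
```

The validity check happens exactly once, in `check`. After that, holding a `Utf8Buffer` is itself the proof that the check passed, so `translate` does not need to repeat it.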
A simple example: the living and the dead
There’s a common pattern in Rust libraries that allows an API to have two states, “living” and “dead.” Or, to put things concretely, std::fs::File from the standard library has two states: “open” and “closed.” If you have access to a File, it’s open: the only way to obtain one is from the open operation:
// At this point, we do not yet have access to a File.
// Open one:
let file = std::fs::File::open("myfile.txt")?;
// Here, we have access to `file`, and it's open. If opening had
// failed, we would have gotten an error.
In every case, when I say something like “the only way to X,” there’s an implicit caveat: “using safe code.” Using unsafe code it is possible to violate any invariant in the language, if you work hard enough at it! All guarantees discussed here are in terms of safe code.
We can close a file by letting it go out of scope, but for the sake of this discussion, let’s explicitly give up access using drop:
drop(file);
// Trying to use the file after close is a compile error:
// file.read_to_string(&mut buffer);
This works because of the signature of drop:
pub fn drop<T>(value: T);
That is, drop takes its argument by value, not by reference (&T). This means the argument gets moved into the drop function, and the caller loses access to it.
This is an example of the typestate pattern, enforcing the property “you may not perform other operations on a File after closing it.”
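The same mechanism can be observed with a user-defined type. In the sketch below, `Handle`, `close`, and `open_then_close` are hypothetical names for illustration. `close` has the same shape as drop (generic, by value, empty body), and the `Cell<bool>` is only there so we can see when the cleanup code runs.

```rust
use std::cell::Cell;

/// Hypothetical resource handle. The `Cell` lets us observe when cleanup runs.
pub struct Handle<'a> {
    closed: &'a Cell<bool>,
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        // Stand-in for releasing an OS resource.
        self.closed.set(true);
    }
}

impl Handle<'_> {
    /// An operation that is only meaningful while the handle is live.
    pub fn is_live(&self) -> bool {
        !self.closed.get()
    }
}

/// Same shape as `drop`: takes the value by move and does nothing with it.
/// The value goes out of scope at the end of this function, which runs `Drop`.
pub fn close<T>(_value: T) {}

/// Opens a handle, uses it, closes it, and reports whether it was live.
pub fn open_then_close(closed: &Cell<bool>) -> bool {
    let handle = Handle { closed };
    let was_live = handle.is_live();
    close(handle);
    // handle.is_live();   // would not compile: use of moved value `handle`
    was_live
}
```

Because `close` takes ownership, the caller has no `handle` left to misuse afterwards; the “dead” state is represented by not having a value at all.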
This might look like RAII to you, and you’re right. In Rust, most cases of the RAII pattern are also applications of the two-state typestate pattern. For instance, Box maintains two properties of the pointer it contains:
Only one Box may point to a given heap-allocated chunk at any given time.
Once the heap-allocated chunk...