Diplomat: Multi-language FFI for Rust Libraries - In Pursuit of Laziness
This is a post I’ve been meaning to write and publish for years, and only recently got around to doing it. I’m hoping to get back into writing more!
For the past few years, as a part of my work on ICU4X, I’ve been working on Diplomat, a multi-language unidirectional FFI tool for wrapping Rust libraries.
I originally designed Diplomat in 2021 as a response to the question “What is the best way to expose ICU4X (a Rust library) to other programming languages?”. For context, while written in Rust, one of ICU4X’s core design goals was to be available to any programming language, starting with a core set and expanding over time. This is in contrast to the existing Unicode libraries ICU4C and ICU4J, which serve C/C++ and Java respectively.
In the long run, for such a project, tooling becomes a necessity. If ICU4X were only being exposed to a single language, handwritten bindings could potentially be feasible: someone manually writes FFI for every new API that gets written in Rust, and you need to ramp up at least part of the team on writing FFI for one particular language. However, as the number of languages you wish to support grows, this becomes more and more untenable. It is unreasonable to expect most members of an engineering team to be experts on the FFI peculiarities of C++, JS, Dart, the JVM, etc.
When we were getting started, I performed an investigation of the available tooling at the time, and arrived at the conclusion that none of the existing tools served our use case: a library in Rust wishing to expose an API to multiple languages. Some of these tools answered part of the story but would need to be stitched together with other work. I also wrote down a design for my “pie in the sky FFI tool” that I figured would be too much of a yak shave to build, but would fill this gap in the Rust FFI tooling ecosystem I have felt for a long time. In the meantime, we stuck to manually written C bindings as we were still figuring stuff out.
One of the core reasons the existing FFI tools didn’t work was that they weren’t “unidirectional” in the direction we needed: they were either “bidirectional”, or “unidirectional” but going in the opposite direction [1].
What’s “unidirectional” and “bidirectional” in the context of an FFI tool?
So, it’s possible this is terminology I just made up one day [2], but it’s an ontology that I’ve found useful on many, many occasions, so I think it’s worth introducing.
Unidirectional vs bidirectional FFI tools
In general when doing FFI there are, broadly speaking, two distinct possible goals, with distinct characteristics.
One use case, served by tools like bindgen, cbindgen, wasm-bindgen, uniffi, and PyO3, is when you have a library in one language which you wish to use from another language. This is “unidirectional” FFI, since the wrapped library doesn’t need to know anything about the codebase calling into it.
Note that calls in “unidirectional” FFI can still go in both ways; a unidirectional FFI tool may support things like callbacks that allow the calling codebase to pass a closure to the library and have the library invoke it. This is still unidirectional since the API definition is within the wrapped library.
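To make the callback point concrete, here is a minimal sketch in plain Rust, without any FFI tool. The names `mylib_for_each_even`, `ItemCallback`, `sum_into`, and `demo_sum_evens` are hypothetical and exist only for illustration. The library side defines the whole API, including the callback signature; the "caller" (simulated in Rust here, though it would normally be C, C++, JS, etc.) merely supplies a function pointer and a context pointer.

```rust
use std::os::raw::c_void;

/// Callback signature, defined by the wrapped library (hypothetical name).
/// The caller supplies an opaque context pointer that is handed back on each call.
pub type ItemCallback = extern "C" fn(ctx: *mut c_void, item: u32);

/// Hypothetical library entry point. A real export would also carry a
/// no-mangle attribute so that the symbol name is stable; it is omitted here
/// to keep the sketch edition-independent.
/// Invokes `cb` once per even number below `limit` and returns the call count.
pub extern "C" fn mylib_for_each_even(limit: u32, cb: ItemCallback, ctx: *mut c_void) -> u32 {
    let mut calls = 0;
    for n in (0..limit).filter(|n| n % 2 == 0) {
        cb(ctx, n);
        calls += 1;
    }
    calls
}

/// Stand-in for code written on the calling side: adds each item to a `u32`
/// that lives behind the context pointer.
extern "C" fn sum_into(ctx: *mut c_void, item: u32) {
    // SAFETY: `demo_sum_evens` passes a valid pointer to a live `u32`.
    unsafe {
        *(ctx as *mut u32) += item;
    }
}

/// Simulates the calling codebase: control flows library -> caller through the
/// callback, but the API definition still lives entirely in the library.
pub fn demo_sum_evens(limit: u32) -> (u32, u32) {
    let mut total: u32 = 0;
    let calls = mylib_for_each_even(limit, sum_into, &mut total as *mut u32 as *mut c_void);
    (calls, total)
}
```

Calling `demo_sum_evens(10)` visits 0, 2, 4, 6, and 8. The library never needs to know anything about the caller beyond the callback signature it declared itself, which is what keeps this unidirectional.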
The other use case, served by tools like cxx, autocxx, crubit, and swift-bridge, is where you are working on a combined codebase of two languages and need interop in “both ways”, e.g. you need Rust to be able to access C++ APIs and C++ to be able to access Rust APIs. This is the kind of interop situation I recall when working on Stylo, the project to use Servo’s style system in Firefox. Even with Servo being relatively modular, this was not a case of “call Servo like a library”, it was a case of integrating two codebases with a somewhat jagged API boundary. At the time there was not much tooling and we managed to convince bindgen to work for this, but this was very much a “bidirectional” use case.
Bidirectional tools can often be used for unidirectional use cases, but they are also usually designed with those two specific languages in mind, which constrains the utility of the underlying bindings for work with other languages. You can’t use the bindings as a neutral “hub” that many languages radiate out from.
A wishlist for an FFI tool
When designing Diplomat, there were several things I had in mind that may not necessarily match choices made by other FFI tools:
No action-at-a-distance
Editing your regular library Rust code should never silently change your FFI layer. I did not want Diplomat to parse the full dependency graph: it should be abundantly clear when an edit to code is going to change the FFI layer, by restricting what Diplomat consumes to specially-tagged “bridge” [3] code. In ICU4X, the FFI layer only changes when people update the Diplomat “bridge” code living under ffi/capi.
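As a rough sketch of that separation, the layout below keeps ordinary library code and the bridge in separate modules. The `mylib` module and the `Accumulator` type are hypothetical. The Diplomat attributes are shown only as comments so that the sketch compiles without the `diplomat` crate; in a real bridge they would be applied to the module and the wrapper type.

```rust
/// Ordinary library code. Edits here do not change the FFI surface on their own.
pub mod mylib {
    #[derive(Default)]
    pub struct Accumulator {
        total: i32,
    }

    impl Accumulator {
        pub fn add(&mut self, value: i32) {
            self.total += value;
        }

        pub fn total(&self) -> i32 {
            self.total
        }

        /// Not wrapped by the bridge below, so it is not part of the FFI layer.
        /// Adding more methods like this one leaves the bindings untouched.
        pub fn reset(&mut self) {
            self.total = 0;
        }
    }
}

/// The bridge: the only code the FFI tool would look at.
// #[diplomat::bridge]   // commented out: assumes the `diplomat` crate is available
pub mod ffi {
    /// Wrapper type exposed across FFI, holding the real library type.
    // #[diplomat::opaque]
    pub struct Accumulator(super::mylib::Accumulator);

    impl Accumulator {
        /// Opaque values are handed out behind a `Box`.
        pub fn create() -> Box<Accumulator> {
            Box::new(Accumulator(super::mylib::Accumulator::default()))
        }

        pub fn add(&mut self, value: i32) {
            self.0.add(value);
        }

        pub fn total(&self) -> i32 {
            self.0.total()
        }
    }
}
```

In this layout, exposing `reset` to other languages requires a deliberate edit inside `mod ffi`. Nothing done to `mylib` alone can add to or remove from the exposed surface, although changing a signature that the bridge calls would surface as a compile error in the bridge rather than as a silent change.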
Why is this a useful property for a tool to have?
For one, it’s just easier to design a tool when it does not need to parse the full range of what Rust...