Cross-Language Data Types


Andreas Weis - 15/06/2026

When using different programming languages like C++ and Rust in the same project, one problem that always comes up is how to share data across language boundaries. In this article we will explore some of the options for sharing data between C++ and Rust code. Our working assumption is that we want to share data without copying it, so that large sets of data can be shared efficiently.

Memory Representation

The first step in allowing such forms of data sharing is to ensure that our data type of choice can actually be represented in both C++ and Rust. What this usually boils down to is that we restrict ourselves to the data types available in C: the elementary types for signed and unsigned integers, pointers, and floating-point numbers, plus the ability to build compound array and struct types from those elementary types. C is the de facto lingua franca of interoperability between programming languages, so whenever we want to ensure that data can be passed across language boundaries, we fall back to what is representable in C.

In C++, thanks to its backward compatibility with C, struct types follow the same memory layout as long as they don't use any C++-only language features that affect the layout. The C++ standard uses the term standard-layout class type for such types. The C++ standard library also provides the std::is_standard_layout type trait to check whether a type upholds these constraints.

```cpp
#include <cstdint>
#include <type_traits>

// A point in 3D space
struct Point3 {
    std::int32_t x;
    std::int32_t y;
    std::int32_t z;
};

// Fails to compile if Point3 ever stops being a standard-layout type
static_assert(std::is_standard_layout_v<Point3>);
```

By default, Rust reserves far more freedom over the exact layout of its data types, but it provides the repr(C) representation to force a type into a memory layout that is compatible with C.

```rust
// A point in 3D space
#[repr(C)]
struct Point3 {
    x: i32,
    y: i32,
    z: i32,
}
```
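Each side can at least pin down the layout it expects at compile time. The following is a minimal sketch on the Rust side, assuming the C++ counterpart is three tightly packed 32-bit integers; the concrete sizes and offsets are that assumption written out, not something the compiler derives from the C++ declaration. It relies on the `offset_of!` macro, which is available in stable Rust since 1.77.

```rust
use std::mem::{align_of, offset_of, size_of};

/// A point in 3D space, laid out like the equivalent C struct.
#[repr(C)]
pub struct Point3 {
    pub x: i32,
    pub y: i32,
    pub z: i32,
}

// Compile-time checks: the build fails if the layout drifts away from
// what the other side is assumed to expect (three packed 32-bit integers).
const _: () = assert!(size_of::<Point3>() == 12);
const _: () = assert!(align_of::<Point3>() == align_of::<i32>());
const _: () = assert!(offset_of!(Point3, x) == 0);
const _: () = assert!(offset_of!(Point3, y) == 4);
const _: () = assert!(offset_of!(Point3, z) == 8);
```

Mirroring the same numbers with `static_assert`, `sizeof`, and `offsetof` on the C++ side gives both declarations a shared, explicit contract, although keeping the two sets of assertions in sync remains a manual task.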

While each language provides mechanisms for ensuring that a declared type uses a memory layout compatible with C, neither language has a built-in way to ensure that the two types from the C++ and Rust worlds are compatible with each other. We must take extra care that the declarations are indeed consistent. The reward for this effort is a data type with exactly the same layout in both languages, so we can send the raw bits from Rust to C++ (or vice versa) and access them directly, with the same meaning, in the other language.

Preserving Type Invariants

As long as our shared data is nothing more than a soup of numerical values, ensuring a consistent memory layout is all we need. For more complex data types, additional concerns may arise, in particular if a type relies on complex invariants regarding its state. The valid values for a member of the type are often constrained, potentially depending on the values of other fields. For example, consider the following type representing rational numbers:

```rust
struct Rational {
    numerator: i32,
    denominator: i32,
}
```

```cpp
struct Rational {
    std::int32_t numerator;
    std::int32_t denominator;
};
```

The denominator must not be set to 0. Also, if the fraction is stored in reduced form, each change to one of the fields potentially requires a change to the other to maintain the reduced form. Violating these constraints may result in a value that is no longer valid for this type.

Such problems are well addressed by encapsulation. Encapsulation requires a set of operations to be shipped alongside the data. Data is not accessed directly, but only through the operations defined on the type, which in turn have been carefully designed (and tested) to uphold all type invariants. In the example above, a setter for the fraction could reject a denominator of 0 and take care of properly reducing the fraction when writing the fields.

For complex types, it is not sufficient to ensure a consistent memory layout; we must also ensure that the surrounding program logic operating on such data is consistent. There are generally two ways to address this.

Language Bindings

Instead of just sharing the data layout between languages, this approach shares code as well. We implement the methods that act on the underlying data once in our programming language of choice, and then provide bindings for all the other languages that allow invoking these functions. Internally, these bindings use the foreign function interface (FFI) of the respective language.

The obvious advantage of this approach is that it is very easy to enforce consistency between implementations, as there is only a single implementation of the core logic interacting with the data. The maintenance and testing burden is also carried in large part by that single implementation. The downside of this approach is that the complete interface of the type will have to fit through the...
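To make the language-bindings approach concrete, here is a minimal sketch, assuming Rust hosts the single implementation of the Rational type. The function name `rational_new`, the out-parameter with a `bool` result, and the choice to keep the denominator positive are illustrative assumptions, not part of any existing library. The `#[unsafe(no_mangle)]` spelling assumes Rust 1.82 or newer.

```rust
// In the prelude since edition 2021; imported explicitly for older editions.
use std::convert::TryFrom;

/// A rational number kept in reduced form with a positive denominator.
/// The fields are private, so safe Rust code outside this module has to
/// go through `new`, which upholds the invariants.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Rational {
    numerator: i32,
    denominator: i32,
}

/// Greatest common divisor (Euclidean algorithm), computed in i64 so that
/// i32::MIN does not overflow.
fn gcd(mut a: i64, mut b: i64) -> i64 {
    while b != 0 {
        (a, b) = (b, a % b);
    }
    a.abs()
}

impl Rational {
    /// Returns None for a zero denominator, or if the normalized value
    /// does not fit into i32 (for example i32::MIN / -1).
    pub fn new(numerator: i32, denominator: i32) -> Option<Rational> {
        if denominator == 0 {
            return None;
        }
        let (n, d) = (i64::from(numerator), i64::from(denominator));
        let g = gcd(n, d); // at least 1, because d is nonzero
        let sign = if d < 0 { -1 } else { 1 };
        Some(Rational {
            numerator: i32::try_from(sign * n / g).ok()?,
            denominator: i32::try_from(sign * d / g).ok()?,
        })
    }

    pub fn numerator(&self) -> i32 {
        self.numerator
    }

    pub fn denominator(&self) -> i32 {
        self.denominator
    }
}

/// C-callable constructor (hypothetical binding). Writes the result to
/// `out` and returns true on success; leaves `out` untouched on failure.
///
/// # Safety
/// `out` must be null or point to writable, properly aligned memory for
/// one `Rational`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn rational_new(
    numerator: i32,
    denominator: i32,
    out: *mut Rational,
) -> bool {
    if out.is_null() {
        return false;
    }
    match Rational::new(numerator, denominator) {
        Some(r) => {
            // SAFETY: `out` is non-null and valid per the contract above.
            unsafe { out.write(r) };
            true
        }
        None => false,
    }
}
```

On the C++ side, a matching declaration such as `extern "C" bool rational_new(std::int32_t, std::int32_t, Rational*);` would be needed, and C++ code would have to agree not to write the fields directly, since the privacy of the Rust fields does not carry across the language boundary.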
