Ariadne's Space
Pulling at the threads of complexity...
Writing portable ARM64 assembly
Apr 12, 2023 · 4 min read
An unfortunate side effect of the rising popularity of Apple’s ARM-based computers is an increase in unportable assembly code which targets the 64-bit ARM ISA. This is because developers are writing these bits of assembly code to speed up their programs when run on Apple’s ARM-based computers, without considering the other 64-bit ARM devices out there, such as SBCs and servers running Linux or BSD.
The good news is that it is very easy to write assembly which targets Apple’s computers as well as the other 64-bit ARM devices running operating systems other than Darwin. It just requires being aware of a few differences between the Mach-O and ELF ABIs, as well as knowing what Apple-specific syntax extensions to avoid. By following the guidance in this blog, you will be able to write assembly code which is portable between Apple’s toolchain, the official ARM assembly toolchain, and the GNU toolchain.
Differences between the ELF and Mach-O ABIs
Modern UNIX systems, including Linux-based systems, largely use the ELF binary format. Apple uses Mach-O in Darwin instead, for historical reasons. This is not a requirement imposed by their use of Mach: OSFMK, the kernel that Darwin, MkLinux and OSF/1 are all based on, supports ELF binaries just fine. Apple just decided to use the Mach-O format instead.
When it comes to writing assembly (or, really, just linking code in general) targeting Darwin, the main difference to be aware of is that all symbols are prefixed with a single underscore. For example, if you have a function that would be declared in C like:
extern void unmask(const char *payload, const char *mask, size_t len);
On Darwin, the function in your assembly code must be defined as _unmask.
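If you want to see which prefix your own compiler applies, GCC and Clang predefine a `__USER_LABEL_PREFIX__` macro that expands to `_` on Darwin and to nothing on ELF targets. The sketch below stringifies it to build the linker-level name of `unmask`. The helper name `unmask_symbol_name` is made up for illustration, and the fallback for other compilers is an assumption.

```c
/* Two-step stringification so the macro argument is expanded first. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* GCC and Clang predefine this macro; assume an empty prefix elsewhere. */
#ifndef __USER_LABEL_PREFIX__
# define __USER_LABEL_PREFIX__
#endif

/* Hypothetical helper: the symbol name the linker sees for `unmask`. */
static const char *unmask_symbol_name(void)
{
    /* Adjacent string literals are concatenated by the compiler. */
    return STR(__USER_LABEL_PREFIX__) "unmask";
}
```

Printing the result of `unmask_symbol_name()` on each platform you build for is a quick way to confirm what your assembly has to define.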
The other major difference is that ELF defines different classes of data, for example STT_FUNC and STT_OBJECT. There is no equivalent in Mach-O, and thus the .type directive that you would use when writing assembly for ELF targets is not supported.
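To make the ELF side concrete: the symbol type ends up in the low four bits of each symbol table entry's `st_info` byte, with the binding in the high four bits. The following is a minimal sketch of decoding that byte. The constants mirror the values in the ELF specification (normally supplied by `<elf.h>` on ELF platforms), while the `SYM_*` names and helper functions are hypothetical.

```c
#include <stdint.h>

/* Symbol type values as defined by the ELF specification. */
enum { SYM_NOTYPE = 0, SYM_OBJECT = 1, SYM_FUNC = 2 };

/* The symbol type is stored in the low four bits of st_info. */
static unsigned sym_type(uint8_t st_info)
{
    return st_info & 0x0fu;
}

/* The symbol binding (local, global, weak) is stored in the high four bits. */
static unsigned sym_binding(uint8_t st_info)
{
    return (unsigned)st_info >> 4;
}
```

A global function symbol would typically carry an `st_info` of `0x12`: binding 1 (global) and type 2 (function).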
A brief note on Platform ABIs
You will also need to be aware of minor differences between the Darwin ABI and other platform ABIs. A notable example is that the x18 register is reserved by the Darwin ABI and is explicitly zeroed on context switches in some cases. This register is also reserved on Android, but not on GNU/Linux or Alpine.
Apple-specific vector mnemonics
The other main thing to watch out for is Apple’s custom mnemonics for NEON. In order to make writing NEON code less cumbersome, Apple introduced a set of mnemonics that simplify how NEON instructions are written. For example, if you are targeting Apple devices only, you might write an exclusive-or NEON instruction like so:
eor.16b v2, v2, v0
This is an Apple-specific extension to the ARM assembly syntax. The official ARM assembly manual requires the arrangement (lane count and element size) to be specified on each register operand:
eor v2.16b, v2.16b, v0.16b
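For readers less familiar with NEON, here is a rough C sketch of what that instruction computes: a bytewise XOR across sixteen 8-bit lanes. On AArch64 it uses the `veorq_u8` intrinsic from `<arm_neon.h>`, and elsewhere it falls back to a scalar loop so it can be compiled anywhere. The helper name `xor_block16` is hypothetical and not part of the original code.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
# include <arm_neon.h>
#endif

/* Hypothetical helper: dst = a XOR b over sixteen 8-bit lanes. */
static void xor_block16(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
#if defined(__aarch64__)
    /* Load both blocks into 128-bit registers, XOR them, store the result. */
    vst1q_u8(dst, veorq_u8(vld1q_u8(a), vld1q_u8(b)));
#else
    /* Scalar fallback with the same result, one lane at a time. */
    for (size_t i = 0; i < 16; i++)
        dst[i] = (uint8_t)(a[i] ^ b[i]);
#endif
}
```

Intrinsics sidestep the syntax question entirely, since the compiler emits the instruction for you. The mnemonic difference only matters once you write the assembly by hand.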
Abstracting the ABI details with some macros
The good news is that the ABI details can easily be abstracted with a few macros. As for using NEON instructions, the answer is simple: stick to what the ARM manual says to do, rather than using Apple’s mnemonics.
There are two macros that you need. These can be placed in a header file somewhere if you like.
The first macro allows you to deal with the underscore requirement of the Darwin ABI:
#ifdef __APPLE__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif
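One way to sanity-check the macro before using it in a real assembly file is to stringify its expansion from a small C file. This is only a sketch: the `STRINGIFY` helpers and the `proc_name_matches` function are hypothetical names introduced for the check.

```c
#include <string.h>

#ifdef __APPLE__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif

/* Two-step stringification so PROC_NAME is expanded before quoting. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Hypothetical check: does PROC_NAME(unmask) expand to `expected`? */
static int proc_name_matches(const char *expected)
{
    return strcmp(STRINGIFY(PROC_NAME(unmask)), expected) == 0;
}
```

With this in place, `proc_name_matches("_unmask")` should hold on Darwin and `proc_name_matches("unmask")` elsewhere.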
The second macro is optional, but it allows you to define the correct ELF symbol types outside of Apple’s toolchain:
#ifdef __APPLE__
/* Mach-O has no symbol types, so .type is dropped on Darwin. */
# define TYPE(__proc, __typ)
#else
# define TYPE(__proc, __typ) .type __proc, __typ
#endif
Then you just write your assembly as normal, but using these macros:
.global PROC_NAME(unmask)
.align 2
TYPE(unmask, @function)
PROC_NAME(unmask):
...
And that’s all there is to it. As long as you follow these guidelines, you will have assembly which is portable to any UNIX-like environment on 64-bit ARM.