Dissecting Apple's Sparse Image Format (ASIF) | schamper.dev
2026-06-18 · 18 min read · Erik Schamper
At WWDC 2025, Apple announced macOS 26 Tahoe. One of its new features is a new disk image format: ASIF.
Designed for use with virtual machines (its documentation lives under the Virtualization framework), ASIF takes a lot of inspiration from existing virtual disk formats. Practically, that means it’s another sparse virtual disk format that functions very similarly to sparse VMDK, VHDX or QCOW2 files (for the uninitiated, it allows you to store a large disk, or file, in a smaller, “sparse” manner).
Shortly before the release of macOS Tahoe (late 2025), I thought it’d be a fun exercise to try and write a parser for ASIF files. It’s been a while since then, but I wanted to go back and show how I approach these kinds of problems. Maybe someone unfamiliar with reverse engineering file formats can pick up a thing or two. For that reason, you can find the occasional “Research note” sprinkled throughout this post with some additional insights.
Let’s create a test file with the command listed in the Apple documentation, write a test pattern to it and get started:
Research note
For testing purposes, I usually like to write a test pattern that allows me to verify the content matches the “offset”. In this case, basically just numbered 1 MiB blocks of bytes. There are definitely better test patterns, but for an initial peek at the file format, it’s also important to just fill up the file with anything. Having a predictable and verifiable pattern can make later steps easier.
❯ diskutil image create blank --fs none --format ASIF --size 1GiB file
file.asif created
❯ diskutil image attach -nomount file.asif
/dev/disk4
❯ python3
Python 3.14.0 (main, Oct 7 2025, 09:34:52) [Clang 17.0.0 (clang-1700.3.19.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> fh = open("/dev/disk4", "wb")
>>> for i in range(255):
...     fh.write(bytes([i] * 1024 * 1024))
...
>>> fh.close()
❯ hdiutil detach disk4
"disk4" ejected.
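Once a parser can hand back the virtual disk contents, the numbered pattern makes verification nearly trivial. Here is a minimal sketch of such a check, assuming the same 1 MiB block size and 255 blocks as the write loop above. The names `expected_block` and `verify_pattern` are made up for illustration, and the in-memory image is only a stand-in for whatever stream a real parser would expose.

```python
import io

BLOCK_SIZE = 1024 * 1024  # 1 MiB, matching the blocks written to the disk


def expected_block(index: int) -> bytes:
    """Return the 1 MiB block the write loop produced for block `index`."""
    return bytes([index]) * BLOCK_SIZE


def verify_pattern(fh, block_count: int = 255) -> list[int]:
    """Return the indices of all blocks that do not match the test pattern."""
    mismatches = []
    for index in range(block_count):
        fh.seek(index * BLOCK_SIZE)
        if fh.read(BLOCK_SIZE) != expected_block(index):
            mismatches.append(index)
    return mismatches


if __name__ == "__main__":
    # Hypothetical stand-in for a parsed disk stream: four pattern blocks in memory.
    image = io.BytesIO(b"".join(expected_block(i) for i in range(4)))
    print(verify_pattern(image, block_count=4))
```

An empty list means every block sits at the offset its value says it should. Any index in the result points straight at the part of the mapping logic that needs another look.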
Eyeball hexdumps
As usual, we start by eyeballing some hexdumps to see if we can discern some details.
❯ xxd file.asif | head -5
Offset    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  ASCII
00000000  73 68 64 77 00 00 00 01 00 00 02 00 00 00 00 00  shdw············
00000010  00 00 00 00 00 00 02 00 00 00 00 00 00 04 14 00  ················
00000020  8a f9 ea d2 cf 38 49 c0 8e ec 00 95 cf 5c 78 99  ·····8I······\x·
00000030  00 00 00 00 00 1d cd 65 00 00 08 00 00 00 00 00  ·······e········
00000040  00 10 00 00 02 00 00 00 00 00 00 00 ff ff ff ff  ················
We can immediately spot some kind of file magic, followed by some big endian looking integers.
Research note
Whenever you’re reverse engineering a file format and you see some magic bytes, it’s always a good idea to search for any available information on it online. I usually search for a combination of the string/byte representation, big endian hex and little endian hex of the file magic in various search engines (Google, GitHub, VirusTotal Retrohunt). In this case, I didn’t find much useful information.
As for “spotting endianness” or integer fields, it’s almost like riding a bicycle after a while. My main tip is to scan from left to right in chunks of 4 bytes (uint32), then 8 bytes (uint64), then possibly smaller chunks (uint16 or even uint8), until you can parse out reasonable-looking integers (round numbers in base 16, or values you can cross-reference with offsets in the file, optionally multiplied by other values you spot). If you see “natural order” looking integers, it’s big endian. If they look reversed, it’s little endian.
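Both tips in this note are easy to try out with nothing but the standard library. The sketch below takes the first 16 bytes from the hexdump above, prints the search terms for the magic, and decodes the remaining bytes as uint32 values in both byte orders. The helper names `search_terms` and `as_uint32` are my own, purely for illustration.

```python
import struct

# First 16 bytes of file.asif, copied from the hexdump above.
raw = bytes.fromhex("73686477000000010000020000000000")


def search_terms(magic: bytes) -> dict[str, str]:
    """Build the string and hex representations worth searching for online."""
    return {
        "ascii": magic.decode("ascii", errors="replace"),
        "hex_be": magic.hex(),        # bytes in file order
        "hex_le": magic[::-1].hex(),  # bytes reversed
    }


def as_uint32(data: bytes) -> dict[str, list[str]]:
    """Decode data as consecutive uint32 values, once per byte order."""
    count = len(data) // 4
    chunk = data[: count * 4]
    big = struct.unpack(f">{count}I", chunk)
    little = struct.unpack(f"<{count}I", chunk)
    return {
        "big": [hex(value) for value in big],
        "little": [hex(value) for value in little],
    }


if __name__ == "__main__":
    print(search_terms(raw[:4]))
    print(as_uint32(raw[4:]))
```

For the three integers after the magic, the big endian reading gives 0x1, 0x200 and 0x0, while the little endian reading gives 0x1000000, 0x20000 and 0x0. The first set looks far more like a version number and a sector size, which is the kind of "natural order" the note refers to.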
Let’s quickly type up a rough structure, making a best guess at the integer widths, and inspect it further with dissect.cstruct:
# /// script
# requires-python = ">=3.10"
# dependencies = ["dissect.cstruct"]
# ///
import sys

from dissect.cstruct import cstruct, dumpstruct

# Rough first guess at the header layout; unknown fields are named after their offset.
asif_def = """
struct header {
    char    magic[4];
    uint32  field4;
    uint32  field8;
    uint32  fieldC;
    uint64  field10;
    uint64  field18;
    char    field20[16];
    uint64  field30;
    uint64  field38;
    uint32  field40;
    uint32  field44;
    uint32  field48;
    uint32  field4C;
};
"""
# The integers in the hexdump look big endian, so parse them that way.
c_asif = cstruct(asif_def, endian=">")

# Parse the header from the file given on the command line and dump it.
with open(sys.argv[1], "rb") as fh:
    header = c_asif.header(fh)
    dumpstruct(header)
Offset    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  ASCII
00000000  73 68 64 77 00 00 00 01 00 00 02 00 00 00 00 00  shdw············
00000010  00 00 00 00 00 00 02 00 00 00 00 00 00 04 14 00  ················
00000020  8a f9 ea d2 cf 38 49 c0 8e ec 00 95 cf 5c 78 99  ·····8I······\x·
00000030  00 00 00 00 00 1d cd 65 00 00 08 00 00 00 00 00  ·······e········
00000040  00 10 00 00 02 00 00 00 00 00 00 00 ff ff ff ff  ················
header
Field        Offset  Size     Value
magic[4]     0x0000  4 bytes  b'shdw'
field4       0x0004  4 bytes  0x1
field8       0x0008  4 bytes  0x200
fieldC       0x000c  4 bytes  0x0
field10      0x0010  8 bytes  0x200
field18      0x0018  8 bytes  0x41400
field20[16]
0x0020
16...