Dissecting Apple's Sparse Image Format (ASIF)


2026-06-18 · 18 min read · Erik Schamper

At WWDC 2025, Apple announced macOS 26 Tahoe. One of the new features in macOS Tahoe is a new disk image format: ASIF.

Designed for use with virtual machines (its documentation lives under the Virtualization framework), ASIF takes a lot of inspiration from existing virtual disk formats. Practically, that means it’s another sparse virtual disk format, and it functions very similarly to sparse VMDK, VHDX or QCOW2 files (for the uninitiated, it allows you to store a large disk, or file, in a smaller, “sparse” manner).

Shortly before the release of macOS Tahoe (late 2025), I thought it’d be a fun exercise to try and write a parser for ASIF files. It’s been a while since then, but I wanted to go back and show how I approach these kinds of problems. Maybe someone unfamiliar with reverse engineering file formats can pick up a thing or two. For that reason, you can find the occasional “Research note” sprinkled throughout this post with some additional insights.

Let’s create a test file with the command listed in the Apple documentation, write a test pattern to it and get started:

Research note

For testing purposes, I usually like to write a test pattern that allows me to verify that the content matches the “offset”. In this case, that is basically just numbered 1 MiB blocks of bytes. There are definitely better test patterns, but for an initial peek at the file format, it’s also important to just fill up the file with anything. Having a predictable and verifiable pattern can make later steps easier.

❯ diskutil image create blank --fs none --format ASIF --size 1GiB file

file.asif created

❯ diskutil image attach -nomount file.asif
/dev/disk4

❯ python3
Python 3.14.0 (main, Oct 7 2025, 09:34:52) [Clang 17.0.0 (clang-1700.3.19.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> fh = open("/dev/disk4", "wb")
>>> for i in range(255):
...     fh.write(bytes([i] * 1024 * 1024))
>>> fh.close()

❯ hdiutil detach disk4
"disk4" ejected.
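The point of a numbered-block pattern is that any byte read back later can be checked against its offset. A minimal sketch of such a check follows, assuming the pattern written in the session above (block `i` of 1 MiB is filled with byte value `i`, for `i` in `range(255)`). The helper names `expected_byte` and `matches_pattern` are hypothetical and only serve the illustration.

```python
MIB = 1024 * 1024
BLOCKS_WRITTEN = 255  # the session above writes range(255) blocks


def expected_byte(offset: int, block_size: int = MIB) -> int:
    """Return the byte value the test pattern stores at a disk offset."""
    block = offset // block_size
    if block >= BLOCKS_WRITTEN:
        raise ValueError("offset lies beyond the written test pattern")
    return block


def matches_pattern(data: bytes, start: int, block_size: int = MIB) -> bool:
    """Check that data, read from disk offset start, follows the pattern."""
    return all(
        byte == expected_byte(start + i, block_size)
        for i, byte in enumerate(data)
    )


# The last 2 bytes of block 3 followed by the first 2 bytes of block 4.
sample = bytes([3, 3, 4, 4])
print(matches_pattern(sample, 4 * MIB - 2))  # True
```

A parser under development can run a check like this on whatever it reads from the virtual disk to confirm that block lookups land on the right data.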

Eyeball hexdumps

As usual, we start by eyeballing some hexdumps to see if we can discern some details.

❯ xxd file.asif | head -5

          00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000  73 68 64 77 00 00 00 01 00 00 02 00 00 00 00 00  shdw············
00000010  00 00 00 00 00 00 02 00 00 00 00 00 00 04 14 00  ················
00000020  8a f9 ea d2 cf 38 49 c0 8e ec 00 95 cf 5c 78 99  ·····8I······\x·
00000030  00 00 00 00 00 1d cd 65 00 00 08 00 00 00 00 00  ·······e········
00000040  00 10 00 00 02 00 00 00 00 00 00 00 ff ff ff ff  ················

We can immediately spot some kind of file magic, followed by some big endian looking integers.

Research note

Whenever you’re reverse engineering a file format and you see some magic bytes, it’s always a good idea to search for any available information on it online. I usually search for a combination of the string/byte representation, big endian hex and little endian hex of the file magic in various search engines (Google, GitHub, VirusTotal Retrohunt). In this case, I didn’t find much useful information.
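As a small illustration, the different representations of a magic value can be generated with the standard library. This is only a sketch: the dictionary labels are arbitrary, and the only input taken from the file is the `shdw` magic.

```python
magic = b"shdw"

# The same four bytes, rendered the ways they might show up in source
# code, documentation or other people's hexdumps.
search_terms = {
    "ascii": magic.decode("ascii"),
    "raw hex": magic.hex(),
    "big endian": f"0x{int.from_bytes(magic, 'big'):08x}",
    "little endian": f"0x{int.from_bytes(magic, 'little'):08x}",
}

for label, term in search_terms.items():
    print(f"{label:>13}: {term}")
```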

As for “spotting endianness” or integer fields, it’s almost like riding a bicycle after a while. I guess a tip is to scan from left to right in chunks of 4 bytes (uint32), then 8 bytes (uint64), then possibly dividing into smaller chunks (uint16 or even uint8), until you can parse out reasonable looking integers (round in base 16, or values you can cross reference with offsets in the file, optionally multiplied by other values you spot). If you see “natural order” looking integers, it’s big endian. If they look reversed, it’s little endian.

Let’s quickly type up a rough structure, making a best guess at the integer widths, and inspect it further with dissect.cstruct:
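The same comparison can be done mechanically. The sketch below decodes the twelve bytes that follow the magic (offsets 0x04 to 0x0f in the hexdump above) as three uint32 values in both byte orders. Preferring the order with the smaller sum is a crude heuristic assumed here for illustration, not a general rule.

```python
import struct

# Bytes 0x04..0x0f of the header, copied from the hexdump above.
chunk = bytes.fromhex("00000001" "00000200" "00000000")

big = struct.unpack(">3I", chunk)     # interpret as big endian uint32s
little = struct.unpack("<3I", chunk)  # interpret as little endian uint32s

print([hex(v) for v in big])     # ['0x1', '0x200', '0x0']
print([hex(v) for v in little])  # ['0x1000000', '0x20000', '0x0']

# Crude heuristic: small, round values are more plausible header fields.
guess = "big" if sum(big) < sum(little) else "little"
print(guess)  # big
```

The big endian reading yields 0x1 and 0x200, which look like a version number and a sector-sized value, while the little endian reading yields large, oddly aligned numbers.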

# /// script
# requires-python = ">=3.10"
# dependencies = ["dissect.cstruct"]
# ///

import sys

from dissect.cstruct import cstruct, dumpstruct

# Field names encode their offset in the file; the widths are a first guess.
asif_def = """
struct header {
    char    magic[4];
    uint32  field4;
    uint32  field8;
    uint32  fieldC;
    uint64  field10;
    uint64  field18;
    char    field20[16];
    uint64  field30;
    uint64  field38;
    uint32  field40;
    uint32  field44;
    uint32  field48;
    uint32  field4C;
};
"""
# The integers in the hexdump read in natural order, so parse as big endian.
c_asif = cstruct(asif_def, endian=">")

with open(sys.argv[1], "rb") as fh:
    header = c_asif.header(fh)
    dumpstruct(header)

          00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000  73 68 64 77 00 00 00 01 00 00 02 00 00 00 00 00  shdw············
00000010  00 00 00 00 00 00 02 00 00 00 00 00 00 04 14 00  ················
00000020  8a f9 ea d2 cf 38 49 c0 8e ec 00 95 cf 5c 78 99  ·····8I······\x·
00000030  00 00 00 00 00 1d cd 65 00 00 08 00 00 00 00 00  ·······e········
00000040  00 10 00 00 02 00 00 00 00 00 00 00 ff ff ff ff  ················

header

Field        Offset  Size     Value
magic[4]     0x0000  4 bytes  b'shdw'
field4       0x0004  4 bytes  0x1
field8       0x0008  4 bytes  0x200
fieldC       0x000c  4 bytes  0x0
field10      0x0010  8 bytes  0x200
field18      0x0018  8 bytes  0x41400
field20[16]  0x0020  16...
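To cross-check the first few values without extra dependencies, the same guessed layout can be unpacked with Python's standard `struct` module. This is a sketch that assumes the field widths guessed above and covers only the first 0x20 bytes; the variable names simply mirror the placeholder field names.

```python
import struct

# First 0x20 bytes of file.asif, copied from the hexdump above.
raw = bytes.fromhex(
    "73686477000000010000020000000000"
    "00000000000002000000000000041400"
)

# ">" = big endian without padding, 4s = char[4], I = uint32, Q = uint64
magic, field4, field8, fieldC, field10, field18 = struct.unpack(">4s3I2Q", raw)

print(magic, hex(field4), hex(field8), hex(fieldC), hex(field10), hex(field18))
# b'shdw' 0x1 0x200 0x0 0x200 0x41400
```

dissect.cstruct remains the more convenient tool while the layout is still changing, since renaming or resizing a field only means editing the C-like definition.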
