Show HN: ZeroFS – Make S3 your primary storage

Eikon · 2 pts · 0 comments

ZeroFS — S3 as your primary storage

Make S3 your primary storage.

ZeroFS serves S3-compatible buckets as POSIX filesystems over NFS and 9P, or as raw block devices over NBD. Data is compressed and encrypted before upload. Warm reads come from local cache in microseconds.

curl -sSfL https://sh.zerofs.net | sh

Quickstart →

or via Docker · GitHub Action

8,662 [1]
POSIX suite tests in CI

1.6 µs [2]
Random reads, warm cache

0.83 ms [3]
Mean small-write latency

16 EiB [4]
Maximum filesystem size

[1] pjdfstest runs on every change. A few cases needing semantics NFS/9P can't express are excluded; the list is public in the repo.
[2] SQLite bench, random reads on ZeroFS served from local cache. A raw S3 round-trip is 50–300 ms.
[3] File appends over NFS, ZeroFS bench suite; data at rest in S3.
[4] Addressable by design: 64-bit inode and size fields, 32 KiB chunks.
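The 16 EiB figure in footnote 4 can be checked with shell arithmetic. This is a quick sketch that assumes the 64-bit size field counts bytes; it works in exponents because 2^64 itself does not fit in the shell's signed 64-bit integers. The variable names are illustrative.

```shell
#!/bin/sh
# 64-bit size field  => 2^64 addressable bytes (assumption: the field counts bytes)
# 1 EiB = 2^60 bytes, 32 KiB = 2^15 bytes
size_bits=64
eib_bits=60
chunk_bits=15

# 2^64 / 2^60 = 2^4 EiB
max_eib=$(( 1 << (size_bits - eib_bits) ))

# 2^64 / 2^15 = 2^49 chunks of 32 KiB to cover the full address space
max_chunks=$(( 1 << (size_bits - chunk_bits) ))

echo "maximum size: ${max_eib} EiB"
echo "32 KiB chunks at maximum size: ${max_chunks}"
```
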

01 Verification

The test suites run in public CI.

CI runs pjdfstest, xfstests, kernel builds, stress-ng, and ZFS scrubs on every change. The first three run separately over NFS, 9P, and the FUSE client. Each card links to its workflow.

POSIX semantics

The pjdfstest suite runs on every change, once per protocol: permissions, ownership, links, rename behavior. The exclude lists, a few cases per protocol, are published in the repository.

pjdfstest workflow →

xfstests

The kernel filesystem test suite runs over NFS, 9P, and FUSE in separate workflows. These are the tests ext4 and XFS themselves are validated against.

xfstests workflow →

ZFS as the end-to-end test

CI builds a ZFS pool on ZeroFS block devices, extracts the Linux kernel source tree onto it, and runs a full scrub. The scrub reports no checksum errors.

zfs-test workflow →

Kernel builds

CI compiles the Linux kernel on NFS, 9P, and FUSE mounts with make -j$(nproc). Parallel compilation is the stress test: many processes writing the same tree at once.

kernel-compile workflow →

stress-ng

Filesystem stressors run against live mounts in CI: access, chdir, chmod, chown, and the rest of the file-handling set, all at once.

stress-ng workflow →

Self-hosting

The Rust toolchain builds ZeroFS on a filesystem that ZeroFS itself is serving. This one is a recorded session rather than a CI job.

watch the recording →

02 Protocols

Files over NFS and 9P, block devices over NBD.

All three servers run in one userspace process against the same bucket. Clients mount it with the NFS and 9P support already in Linux, the NFS clients other systems ship, or nbd-client for block devices.

[Diagram: NFS, 9P, NBD · S3 · encrypted objects]

NFS (File · everywhere)

macOS, Linux, Windows, and the BSDs mount it over their own NFS support, with nothing extra installed on the client. The server stays in userspace.

# mounts from any major OS
mount -t nfs 127.0.0.1:/ /mnt/zerofs

9P (File · precise)

9P follows POSIX more closely than NFS, and fsync returns only after data reaches stable storage. The bundled FUSE client mounts without root and reconnects on its own.

# bundled FUSE client, no root
zerofs mount 127.0.0.1:5564 /mnt/zerofs

NBD (Block · raw)

Raw block devices stored in the bucket hold ext4 filesystems, ZFS pools, or VM boot disks. New devices are picked up at runtime, with no server restart.

# attach a block device
nbd-client 127.0.0.1 10809 /dev/nbd0 -N vol1
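Once attached, the device can be treated like a local disk. The sketch below formats it as ext4 and mounts it. The run and format_and_mount helpers and the /mnt/vol1 mount point are illustrative, not part of ZeroFS. By default it only prints the commands; set RUN=1 to execute them as root. Formatting destroys any data already on the device.

```shell
#!/bin/sh
# Print each command instead of running it unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Format an attached NBD device as ext4 and mount it.
# $1: block device (e.g. /dev/nbd0), $2: mount point (illustrative path).
format_and_mount() {
  dev=$1
  mnt=$2
  run mkfs.ext4 "$dev"
  run mkdir -p "$mnt"
  run mount "$dev" "$mnt"
}

format_and_mount /dev/nbd0 /mnt/vol1
```
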

Runs on: Amazon S3 · Google Cloud Storage · Azure Blob · Any S3-compatible store · local disk

03 Geo-distribution

A ZFS mirror across three S3 regions.

Each ZeroFS instance exposes one S3 region as a block device. To ZFS they are plain disks, so a mirror that spans continents is set up like any other pool.


$ nbd-client 10.0.1.5 10809 /dev/nbd0 -N storage -persist  # us-east
$ nbd-client 10.0.2.5 10809 /dev/nbd1 -N storage -persist  # eu-west
$ nbd-client 10.0.3.5 10809 /dev/nbd2 -N storage -persist  # ap-southeast
$ zpool create global-pool mirror /dev/nbd0 /dev/nbd1 /dev/nbd2
$ zpool status global-pool | grep state
 state: ONLINE

If a region becomes unreachable, the pool degrades and the data stays available from the other two.
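A script can watch for that degraded state by reading the state line from zpool status. This is a minimal sketch: the pool_state and check_pool helpers are illustrative, and it assumes the usual "state:" line in zpool status output.

```shell
#!/bin/sh
# Print the value of the "state:" line from `zpool status` output on stdin.
pool_state() {
  awk '$1 == "state:" { print $2; exit }'
}

# Map the pool state read from stdin to a message and an exit code.
check_pool() {
  state=$(pool_state)
  case "$state" in
    ONLINE)   echo "ok: all mirror members reachable"; return 0 ;;
    DEGRADED) echo "degraded: at least one member is missing"; return 1 ;;
    *)        echo "unexpected state: ${state:-none}"; return 2 ;;
  esac
}

# Usage on a host with the pool imported:
#   zpool status global-pool | check_pool
```
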

04 Capabilities

Eight properties of the storage engine.

04.1 Always encrypted
Every block is encrypted with XChaCha20-Poly1305 before upload. The data key is wrapped with a key derived from your password using Argon2id. There is no unencrypted mode.

04.2 Compression
Data is compressed with zstd or lz4 before encryption. The codec can change at any time without migration, since the codec of existing data is detected on read.
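How ZeroFS records the codec is not described on this page. One common way to detect a codec on read is the magic number at the start of a standard zstd or LZ4 frame. The sketch below assumes frame-format data that has already been decrypted; it illustrates the idea and is not ZeroFS's actual mechanism. The detect_codec helper is hypothetical.

```shell
#!/bin/sh
# Identify a compressed frame by its first four bytes.
#   zstd frame magic: 28 B5 2F FD
#   LZ4 frame magic:  04 22 4D 18
# Assumption: standard frame formats; ZeroFS's own tagging may differ.
detect_codec() {
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  case "$magic" in
    28b52ffd) echo zstd ;;
    04224d18) echo lz4 ;;
    *)        echo unknown ;;
  esac
}

# Usage (the file name is illustrative):
#   detect_codec block.bin
```

Because detection happens per block, old blocks written with one codec and new blocks written with another can sit side by side, which is why no migration is needed.
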

04.3 Caching
Configurable memory and disk caches hold recently used blocks. Warm reads return in microseconds. A raw S3 round trip takes 50 to 300 milliseconds.
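The gap between those two numbers makes the cache hit ratio the dominant factor in read latency. As a rough sketch, assume a hit costs the 1.6 µs warm-read figure and every miss costs one S3 round trip at the 50 ms lower bound. Real misses may cost more or less, and the mean_read_us helper is illustrative.

```shell
#!/bin/sh
# Mean read latency in microseconds for a given cache hit ratio.
# Model assumption: hit = warm_us, miss = one S3 round trip (cold_us).
# $1: hit ratio (0..1), $2: warm read in us, $3: cold read in us
mean_read_us() {
  awk -v h="$1" -v warm="$2" -v cold="$3" \
    'BEGIN { printf "%.1f\n", h * warm + (1 - h) * cold }'
}

# 1.6 us warm read and 50 ms (50000 us) S3 round trip, both from this page.
for h in 0.90 0.99 0.999; do
  echo "hit ratio $h: $(mean_read_us "$h" 1.6 50000) us mean read"
done
```

Under this model, even a 1% miss rate puts the mean read near half a millisecond, so cache sizing matters more than the warm-read figure itself.
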

04.4 Checkpoints
Named checkpoints capture the filesystem at a point in time. Any of them can be opened read-only with a single flag at startup.

04.5 Read replicas
One instance writes while read-only instances serve the same bucket. Replicas pick up the writer's...
