Building Docker images 7x Faster with Clipper


In my previous post, I announced native BuildKit support for Clipper. In this one, I'll go into why it's so much faster.

Background

First off, a quick explanation of what makes Clipper's format different from Docker's.

I'm not going to go into too much detail on how Docker works, but generally: during a docker build, each command in a Dockerfile (such as RUN apt install vim or COPY --from=build /install /usr/local/) creates a filesystem. That filesystem consists of the files that were added, changed, or removed by that command. Once the build is complete, each of those filesystems is transformed into a layer.

For Docker/OCI, that layer is a compressed tarball. A simple layer that adds three shared objects might look like this, as one monolithic tar file:

Clipper instead produces a table of contents JSON file holding the metadata as well as separate blobs containing the file data.

(a lot of fields are omitted here as well as some implementation details around small files)
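To make the split concrete, here is a minimal sketch of a layer broken into a table of contents plus content-addressed blobs. The field names (path, mode, size, digest) and the build_layer helper are hypothetical, chosen for illustration only; this is not Clipper's actual schema, and it ignores the small-file handling mentioned above.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content address of a blob: SHA-256 over the uncompressed bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_layer(files):
    """Split {path: (mode, content)} into a JSON TOC and a digest-keyed blob store.

    Hypothetical layout for illustration; not Clipper's real format.
    """
    blobs = {}
    entries = []
    for path in sorted(files):
        mode, content = files[path]
        d = digest(content)
        blobs[d] = content  # identical content is stored only once
        entries.append(
            {"path": path, "mode": mode, "size": len(content), "digest": d}
        )
    toc = json.dumps({"entries": entries}, indent=2)
    return toc, blobs


# A layer that adds three shared objects (placeholder contents).
layer = {
    "usr/lib/liba.so": (0o644, b"contents of liba"),
    "usr/lib/libb.so": (0o644, b"contents of libb"),
    "usr/lib/libc.so": (0o644, b"contents of libc"),
}
toc, blobs = build_layer(layer)
print(toc)
```

The TOC carries only metadata and digests, so it stays small; the file data lives in the blob store and can be referenced by any number of TOCs.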

When clipper pull is run with such a layer, the client does one of two things:

for containerd (or Kubernetes/Docker on top of containerd), we can directly write a new filesystem into containerd's content store, with no round tripping through other formats

for Docker (no containerd) and Podman, we convert back to a tar layer and let those systems handle conversion back to a filesystem
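The second path, converting back to a tar layer, can be sketched in a few lines. This assumes a hypothetical TOC shape (a list of entries with path, mode, and digest) and an in-memory blob store; it illustrates the idea and is not Clipper's implementation.

```python
import hashlib
import io
import tarfile


def to_tar(entries, blobs) -> bytes:
    """Rebuild a single uncompressed tar layer from TOC entries and blobs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for entry in entries:
            data = blobs[entry["digest"]]  # look the file data up by digest
            info = tarfile.TarInfo(name=entry["path"])
            info.mode = entry["mode"]
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical TOC and blob store with a single file.
content = b"contents of liba"
key = "sha256:" + hashlib.sha256(content).hexdigest()
entries = [{"path": "usr/lib/liba.so", "mode": 0o644, "digest": key}]
blobs = {key: content}

layer_tar = to_tar(entries, blobs)
with tarfile.open(fileobj=io.BytesIO(layer_tar)) as tar:
    print(tar.getnames())
```

The containerd path skips this step entirely, which is why it avoids the round trip through another format.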

Some differences to note:

With regular OCI layers, changing the metadata for a file results in a completely new tarball layer with no data shared with the old one. Clipper doesn't suffer from this: a new TOC is created, usually under 1 MB in size, and the file data is shared between the old and new layers.

All OCI objects are referenced by compressed digest. Recompressing, decompressing, or just running a build on another machine with a newer compression library will result in a new hash for an object. All Clipper references are by uncompressed digest, even if compression is later applied to an object.
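Both differences can be checked with a small experiment using only the Python standard library. This is a sketch under assumptions: the file names and contents are made up, and two zlib compression levels stand in for "a different compression library".

```python
import hashlib
import io
import tarfile
import zlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tar_layer(files) -> bytes:
    """Pack {path: (mode, content)} into one uncompressed tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):
            mode, content = files[path]
            info = tarfile.TarInfo(name=path)
            info.mode = mode
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


before = {
    "usr/lib/liba.so": (0o644, b"A" * 4096),
    "usr/lib/libb.so": (0o644, b"B" * 4096),
}
after = dict(before)
after["usr/lib/liba.so"] = (0o755, b"A" * 4096)  # metadata-only change

# Difference 1: the tar digest changes, the per-file data digests do not.
tar_changed = sha256(tar_layer(before)) != sha256(tar_layer(after))
blobs_shared = (
    {sha256(c) for _, c in before.values()}
    == {sha256(c) for _, c in after.values()}
)

# Difference 2: same bytes, different compression settings.
layer = tar_layer(before)
fast, best = zlib.compress(layer, 1), zlib.compress(layer, 9)
compressed_differ = sha256(fast) != sha256(best)
uncompressed_same = sha256(zlib.decompress(fast)) == sha256(zlib.decompress(best))

print(tar_changed, blobs_shared, compressed_differ, uncompressed_same)
# prints: True True True True
```

A reference keyed on the compressed bytes breaks as soon as the compressor's output changes; one keyed on the uncompressed bytes does not.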

It's somewhat possible to squint and see how this will result in faster pushes and pulls due to better sharing between layers, so let's talk about builds.

The test setup

Scenarios

llamacpp

Building llama.cpp, with CUDA enabled, and copying the build results on top of a runtime CUDA base layer.

We bust the cache every build by injecting the current timestamp into the main cpp file.
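As a rough sketch of how such a build could be driven, the timestamp can be passed as a build argument matching the ARG CACHE_BUST that the Dockerfile declares. The build_command helper, file names, and image names below are hypothetical placeholders; only docker build, -f, and --build-arg are real CLI features, and the actual benchmark harness may differ.

```python
import shlex
import time


def build_command(dockerfile, context, base_image, runtime_base):
    """Assemble a docker build invocation with a fresh CACHE_BUST value."""
    cache_bust = str(time.time_ns())  # current timestamp, new on every build
    return [
        "docker", "build",
        "-f", dockerfile,
        "--build-arg", f"BASE_IMAGE={base_image}",
        "--build-arg", f"RUNTIME_BASE={runtime_base}",
        "--build-arg", f"CACHE_BUST={cache_bust}",
        context,
    ]


cmd = build_command(
    "Dockerfile.llamacpp",              # hypothetical file name
    ".",
    "example.invalid/cuda-devel",       # hypothetical build base image
    "example.invalid/cuda-runtime",     # hypothetical runtime base image
)
print(shlex.join(cmd))
```

Because the value changes on each run, every step from the one that uses CACHE_BUST onward is rebuilt, while the earlier apt and ccache install steps can still come from cache.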

Dockerfile

ARG BASE_IMAGE
ARG RUNTIME_BASE

FROM ${BASE_IMAGE} AS build

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    rm -f /etc/apt/apt.conf.d/docker-clean && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential cmake git ca-certificates curl xz-utils

ARG CCACHE_VERSION=4.13.6
ARG TARGETARCH
RUN --mount=type=tmpfs,target=/tmp case "$TARGETARCH" in \
        amd64) cc_arch=x86_64 ;; \
        arm64) cc_arch=aarch64 ;; \
        *) echo "unsupported TARGETARCH=$TARGETARCH for ccache install" >&2; exit 1 ;; \
    esac && \
    pkg="ccache-${CCACHE_VERSION}-linux-${cc_arch}-musl-static" && \
    curl -fsSL "https://github.com/ccache/ccache/releases/download/v${CCACHE_VERSION}/${pkg}.tar.xz" -o /tmp/ccache.tar.xz && \
    tar -xJf /tmp/ccache.tar.xz -C /usr/local/bin --strip-components=1 --no-same-owner "${pkg}/ccache" && \
    ccache --version

ADD https://github.com/ggml-org/llama.cpp.git#0827b2c1da299805288abbd556d869318f2b121e /src
WORKDIR /src

ARG CACHE_BUST
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs/
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1
RUN --mount=type=cache,target=/root/.cache/ccache \
    export CCACHE_LOGFILE=/tmp/ccache.log CCACHE_STATSLOG=/tmp/ccache.statslog && \
    echo "// bench-mutation ${CACHE_BUST}" >> src/llama.cpp && \
    cmake -B build \
        -DGGML_CUDA=ON \
        -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
        -DCMAKE_C_COMPILER_LAUNCHER=ccache \
        -DCMAKE_CXX_COMPILER_LAUNCHER=ccache && \
    cmake --build build -j"$(nproc)" --target llama-cli

RUN mkdir -p /opt/llama/bin /opt/llama/lib && \
    cp build/bin/llama-cli /opt/llama/bin/ && \
    cp build/bin/*.so* /opt/llama/lib/

FROM ${RUNTIME_BASE}
COPY --from=build /opt/llama /usr/local

uv

Running uv sync on a pyproject.toml with a number of ML packages.

Dockerfile

ARG BASE_IMAGE
FROM ${BASE_IMAGE}

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app
COPY pyproject.toml uv.lock ./

ARG CACHE_BUST

RUN echo "$CACHE_BUST" >/dev/null && \
    uv sync --frozen --no-progress && \
    rm -rf /root/.cache/uv
ENV PATH=/app/.venv/bin:$PATH

pyproject.toml

[project]
name = "ml-bench"
version = "0"
requires-python = "==3.14.*"
dependencies = [
    "tensorrt",
    "accelerate",
    "datasets",
    "scipy",
    "scikit-learn",
    "pandas",

[tool.uv]
package = false

There are ways to improve both Dockerfiles, but they represent pretty normal usage.

Builders

GHA:

Default GitHub Actions 4-core Ubuntu builders

There's some amount of variance between runs in...
