Building Docker images 7x Faster — Clipper Blog
In my previous post, I announced native BuildKit support for Clipper. In this one, I'll go into why it's so much faster.
Background
First off, a quick explanation of what makes Clipper's format different from Docker's.
I'm not going to go into too much detail on how Docker works, but in general: during a docker build, each command in a Dockerfile (such as RUN apt install vim or COPY --from=build /install /usr/local/) creates a filesystem. That filesystem consists of the files that were added, changed, or removed during that step. Once the build is complete, each of those filesystems is transformed into a layer.
For Docker/OCI, that layer is a compressed tarball. A simple layer that adds three shared objects might look like this, as one monolithic tar file:
Clipper instead produces a table of contents JSON file holding the metadata as well as separate blobs containing the file data.
(a lot of fields are omitted here as well as some implementation details around small files)
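To make the contrast concrete, here's a minimal Python sketch. It is not Clipper's actual format or code: the file names, the TOC field names (path, mode, size, digest), and the build_tar_layer and split_layer helpers are all assumptions made up for illustration. It packs three fake shared objects into one tar, then splits that tar into a metadata-only table of contents plus blobs keyed by the digest of their uncompressed content.

```python
import hashlib
import io
import json
import tarfile

# Hypothetical layer contents: three shared objects (names and bytes are made up).
FILES = {
    "usr/local/lib/libfoo.so": b"foo" * 1000,
    "usr/local/lib/libbar.so": b"bar" * 1000,
    "usr/local/lib/libbaz.so": b"baz" * 1000,
}


def build_tar_layer(files):
    """Pack files into one uncompressed tar: data and metadata interleaved."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            info.mode = 0o755
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def split_layer(tar_bytes):
    """Split a tar into a TOC dict (metadata only) and content-addressed blobs."""
    entries = []
    blobs = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # this sketch only handles regular files
            data = tar.extractfile(member).read()
            digest = "sha256:" + hashlib.sha256(data).hexdigest()
            blobs[digest] = data
            entries.append(
                {
                    "path": member.name,
                    "mode": member.mode,
                    "size": member.size,
                    "digest": digest,
                }
            )
    return {"entries": entries}, blobs


toc, blobs = split_layer(build_tar_layer(FILES))
print(json.dumps(toc, indent=2))
print(f"{len(blobs)} blobs stored separately from the TOC")
```

The point of the split is that the TOC stays small, while the blobs can be stored, deduplicated, and transferred independently of it.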
When clipper pull is run with such a layer, the client does one of two things:
for containerd (or Kubernetes/Docker on top of containerd), we can directly write a new filesystem into containerd's content store, with no round-tripping through other formats
for Docker (no containerd) and Podman, we convert back to a tar layer and let those systems handle the conversion back to a filesystem
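The second path, converting back to a tar layer, could look roughly like the sketch below. Again, this is an illustration under assumptions rather than Clipper's real code: the TOC shape (a list of entries with path, mode, and digest) and the toc_to_tar helper are hypothetical, and a real converter would also need to handle directories, symlinks, ownership, and timestamps.

```python
import hashlib
import io
import tarfile


def toc_to_tar(toc, blobs):
    """Rebuild an uncompressed tar layer from a TOC and its blobs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for entry in toc["entries"]:
            data = blobs[entry["digest"]]
            # Check the blob against the digest recorded in the TOC.
            actual = "sha256:" + hashlib.sha256(data).hexdigest()
            if actual != entry["digest"]:
                raise ValueError(f"digest mismatch for {entry['path']}")
            info = tarfile.TarInfo(name=entry["path"])
            info.size = len(data)
            info.mode = entry["mode"]
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Tiny made-up example: one file, one blob.
data = b"example shared object bytes"
digest = "sha256:" + hashlib.sha256(data).hexdigest()
toc = {
    "entries": [
        {"path": "usr/local/lib/libexample.so", "mode": 0o755, "digest": digest}
    ]
}
layer = toc_to_tar(toc, {digest: data})

with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
    names = tar.getnames()
print(names)
```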
Some differences to note:
With regular OCI layers, changing the metadata for a file results in a completely new tarball layer with no data shared with the old one. Clipper doesn't suffer from this: a new table of contents (TOC) is created, usually under 1 MB in size, and the file data is shared between the old and new layers.
All OCI objects are referenced by compressed digest. Recompressing, decompressing, or just running a build on another machine with a newer compression library will result in a new hash for an object. All Clipper references are by uncompressed digest, even if compression is later applied to an object.
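Both differences are easy to reproduce with the Python standard library. The sketch below is only a demonstration of the general effect, not of Clipper or of any registry: the file name and payload are made up, and zlib stands in for the gzip compression typically used for OCI layers.

```python
import hashlib
import io
import tarfile
import zlib


def sha256(data):
    return hashlib.sha256(data).hexdigest()


def tar_layer(mode, payload):
    """Build a one-file uncompressed tar layer with the given file mode."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="usr/local/lib/libexample.so")
        info.size = len(payload)
        info.mode = mode
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


payload = b"shared object bytes " * 1000

# 1. A metadata-only change (chmod 644 -> 755) yields a different tar layer,
#    even though the file data, and therefore its digest, is unchanged.
old_layer = tar_layer(0o644, payload)
new_layer = tar_layer(0o755, payload)
layers_differ = sha256(old_layer) != sha256(new_layer)

# 2. Compressing the same layer with different settings yields different
#    compressed digests, while the uncompressed digest stays the same.
fast = zlib.compress(old_layer, 1)
best = zlib.compress(old_layer, 9)
compressed_digests_differ = sha256(fast) != sha256(best)
uncompressed_digests_match = sha256(zlib.decompress(fast)) == sha256(
    zlib.decompress(best)
)

print(layers_differ, compressed_digests_differ, uncompressed_digests_match)
```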
If you squint, you can see how this results in faster pushes and pulls thanks to better sharing between layers, so let's talk about builds.
The test setup
Scenarios
llamacpp
Building llama.cpp, with CUDA enabled, and copying the build results on top of a runtime CUDA base layer.
We bust the cache every build by injecting the current timestamp into the main cpp file.
Dockerfile
ARG BASE_IMAGE
ARG RUNTIME_BASE

FROM ${BASE_IMAGE} AS build

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    rm -f /etc/apt/apt.conf.d/docker-clean && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential cmake git ca-certificates curl xz-utils

ARG CCACHE_VERSION=4.13.6
ARG TARGETARCH
RUN --mount=type=tmpfs,target=/tmp \
    case "$TARGETARCH" in \
        amd64) cc_arch=x86_64 ;; \
        arm64) cc_arch=aarch64 ;; \
        *) echo "unsupported TARGETARCH=$TARGETARCH for ccache install" >&2; exit 1 ;; \
    esac && \
    pkg="ccache-${CCACHE_VERSION}-linux-${cc_arch}-musl-static" && \
    curl -fsSL "https://github.com/ccache/ccache/releases/download/v${CCACHE_VERSION}/${pkg}.tar.xz" -o /tmp/ccache.tar.xz && \
    tar -xJf /tmp/ccache.tar.xz -C /usr/local/bin --strip-components=1 --no-same-owner "${pkg}/ccache" && \
    ccache --version

ADD https://github.com/ggml-org/llama.cpp.git#0827b2c1da299805288abbd556d869318f2b121e /src
WORKDIR /src

ARG CACHE_BUST
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs/
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1
RUN --mount=type=cache,target=/root/.cache/ccache \
    export CCACHE_LOGFILE=/tmp/ccache.log CCACHE_STATSLOG=/tmp/ccache.statslog && \
    echo "// bench-mutation ${CACHE_BUST}" >> src/llama.cpp && \
    cmake -B build \
        -DGGML_CUDA=ON \
        -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
        -DCMAKE_C_COMPILER_LAUNCHER=ccache \
        -DCMAKE_CXX_COMPILER_LAUNCHER=ccache && \
    cmake --build build -j"$(nproc)" --target llama-cli

RUN mkdir -p /opt/llama/bin /opt/llama/lib && \
    cp build/bin/llama-cli /opt/llama/bin/ && \
    cp build/bin/*.so* /opt/llama/lib/

FROM ${RUNTIME_BASE}
COPY --from=build /opt/llama /usr/local

uv
Running uv sync on a pyproject with a number of ML packages
Dockerfile
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app
COPY pyproject.toml uv.lock ./

ARG CACHE_BUST

RUN echo "$CACHE_BUST" >/dev/null && \
    uv sync --frozen --no-progress && \
    rm -rf /root/.cache/uv
ENV PATH=/app/.venv/bin:$PATH

pyproject.toml
[project]
name = "ml-bench"
version = "0"
requires-python = "==3.14.*"
dependencies = [
    "tensorrt",
    "accelerate",
    "datasets",
    "scipy",
    "scikit-learn",
    "pandas",

[tool.uv]
package = false

There are ways to improve both Dockerfiles, but they represent pretty normal usage.
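Both Dockerfiles take a CACHE_BUST build argument, so every run needs a fresh value passed in. The benchmark harness itself isn't shown here, so the following is only a sketch of how such an invocation might be assembled, assuming a plain docker build driver. The build_command helper and the image names are hypothetical; --file and --build-arg are standard docker build flags.

```python
import time


def build_command(dockerfile, context, base_image, runtime_base=None, cache_bust=None):
    """Assemble a docker build argv that passes a fresh CACHE_BUST value."""
    if cache_bust is None:
        # Current timestamp in nanoseconds, so each run gets a new value.
        cache_bust = str(time.time_ns())
    cmd = [
        "docker", "build",
        "--file", dockerfile,
        "--build-arg", f"BASE_IMAGE={base_image}",
        "--build-arg", f"CACHE_BUST={cache_bust}",
    ]
    if runtime_base is not None:
        cmd += ["--build-arg", f"RUNTIME_BASE={runtime_base}"]
    cmd.append(context)
    return cmd


# Hypothetical image names, for illustration only.
cmd = build_command(
    "Dockerfile",
    ".",
    base_image="example.com/cuda-devel:latest",
    runtime_base="example.com/cuda-runtime:latest",
    cache_bust="1700000000",
)
print(" ".join(cmd))
```

Because the value only changes at the ARG CACHE_BUST line, every step before it can still be served from cache, and only the steps after it are rebuilt.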
Builders
GHA:
Default GitHub Actions 4-core Ubuntu builders
There's some amount of variance between runs in...