Making devenv start fast, and the whole nixpkgs with it

domenkozar | 1 pts | 0 comments


I'm sitting here next to Farid Zakaria at Tacosprint, where we looked at the stat storm that has been haunting nixpkgs for a decade.

devenv auto activation runs devenv hook-should-activate on every shell prompt to decide whether you've stepped into a project directory. It does almost nothing: discover the project, check the trust database, print a path. So its runtime is pure startup overhead, and it runs on every single prompt redraw.

$ time devenv hook-should-activate
/home/domen/dev/myproject
real 0m0.070s
...

70ms before a prompt, every prompt.

And this isn't devenv's tax to pay, it's nixpkgs'. Every program pays it before it runs a line of its own code: the dynamic loader has to find each shared library, and the way Nix scatters packages across the store makes that search slow. This is not news. The cost has been measured, written up, and partly fixed more than once, and yet it has sat in limbo for the better part of a decade with no general fix merged into nixpkgs.

Most of that is the dynamic loader looking for a shared object that is sitting right there in the store, just not in the first directory it tried. The loader knocks on 486 wrong doors before it finds the right ones, and almost all of it happens before main even starts.

That latency is the whole game. Above ~30ms you have to bolt a caching layer on top of the hook; in single-digit milliseconds you just run it on every prompt and throw the cache away.
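To make that threshold concrete, here is a minimal sketch of how one might measure a command's startup cost and compare it with the ~30ms line. The helper names `median_runtime_ms` and `needs_cache` are made up for this illustration and are not part of devenv.

```python
import statistics
import subprocess
import time
from typing import List

# The "~30ms" line from the text; treated here as a hard cutoff for simplicity.
CACHE_THRESHOLD_MS = 30.0


def median_runtime_ms(argv: List[str], runs: int = 5) -> float:
    """Median wall-clock time, in milliseconds, of running argv to completion."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded: only process startup and exit are of interest.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def needs_cache(runtime_ms: float, threshold_ms: float = CACHE_THRESHOLD_MS) -> bool:
    """True when a per-prompt hook is slow enough to want a caching layer."""
    return runtime_ms > threshold_ms
```

Calling `median_runtime_ms(["devenv", "hook-should-activate"])` inside a project would approximate the `time` measurement above; at 70ms, `needs_cache` says yes.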

And it scales with the closure: imagemagick's magick --version makes 1225 failing opens:

$ strace -f -e openat magick --version 2>&1 >/dev/null | grep '\.so' | grep -c ENOENT
1225
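The same count can be taken from a saved strace log. The sketch below is a rough equivalent of the grep pipeline, with one deliberate difference: it only looks for ".so" inside the path argument, and it also reports successful opens. The function name and the store paths in the usage note are hypothetical, and lines that strace splits as "unfinished" under -f are simply skipped.

```python
import re
from typing import Iterable, Tuple

# Matches e.g.  openat(AT_FDCWD, "/path/libz.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (...)
OPENAT_RE = re.compile(r'openat\([^,]+, "([^"]+)".*\)\s+=\s+(-?\d+)(?:\s+(\w+))?')


def count_so_opens(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (failed, succeeded) counts of openat calls whose path contains '.so'."""
    failed = 0
    succeeded = 0
    for line in lines:
        match = OPENAT_RE.search(line)
        if match is None or ".so" not in match.group(1):
            continue
        if match.group(3) == "ENOENT":
            failed += 1
        elif int(match.group(2)) >= 0:
            succeeded += 1
    return failed, succeeded
```

Fed a log with two ENOENT probes for libz.so.1 followed by one hit, it returns `(2, 1)`. The ratio of the two numbers is a quick way to see how many wrong doors each library costs.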

The community has been circling a real fix for years. This post walks through the problem, the approaches people have tried with their tradeoffs, and a more radical one we spiked for devenv to see if it was even possible: deleting the dynamic loader altogether by linking the whole program into one static binary.

The umbrella tracking issue for the general problem is NixOS/nixpkgs#481620.

Why Nix makes the loader work so hard

On a traditional distribution every shared library lives in a handful of global directories such as /usr/lib. The dynamic loader has a short, mostly cached search path, and ld.so.cache (built by ldconfig) turns soname lookups into a hash table hit.
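As a toy model of why that is cheap, the sketch below indexes every soname once and then answers lookups from the index. It is not the real ld.so.cache format: a plain dict stands in for the on-disk hash table, and the "first directory wins" rule is an assumption of this sketch.

```python
from typing import Dict, List, Optional


def build_cache(dirs: Dict[str, List[str]]) -> Dict[str, str]:
    """Index every soname once, roughly the job ldconfig does ahead of time."""
    cache: Dict[str, str] = {}
    for directory, sonames in dirs.items():
        for soname in sonames:
            # Toy rule: the first directory that provides a soname wins.
            cache.setdefault(soname, f"{directory}/{soname}")
    return cache


def lookup(cache: Dict[str, str], soname: str) -> Optional[str]:
    """One dict hit per soname, and no failed filesystem probes."""
    return cache.get(soname)
```

The point of the model is the cost shape: the directory walk happens once at index time, not once per process start.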

Nix is different by design. Every package lives in its own /nix/store/<hash>-name/lib directory, and there is no global ld.so.cache for store libraries. To make a binary find its dependencies, Nix records a DT_RUNPATH entry in the binary's ELF dynamic section that lists one directory per dependency. A program linked against fifty libraries gets a DT_RUNPATH with dozens of entries.
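You can see this on any binary with `readelf -d` from binutils, which prints the entry as a colon-separated list. Here is a small sketch that pulls the directories out of that output; the function name is made up for this example and the store paths in the usage note are placeholders.

```python
import re
from typing import List

# readelf -d prints the entry as:
#   0x000000000000001d (RUNPATH)   Library runpath: [/dir/a:/dir/b]
RUNPATH_RE = re.compile(r"\(RUNPATH\)\s+Library runpath: \[(.*)\]")


def runpath_dirs(readelf_dynamic_output: str) -> List[str]:
    """Return the DT_RUNPATH directories found in `readelf -d` output, in order."""
    for line in readelf_dynamic_output.splitlines():
        match = RUNPATH_RE.search(line)
        if match:
            # Drop empty components left by stray or doubled colons.
            return [d for d in match.group(1).split(":") if d]
    return []
```

On a Nix-built binary, `len(runpath_dirs(output))` is the M in the search cost: one store directory per dependency.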

Now recall how glibc resolves a DT_NEEDED soname with DT_RUNPATH present: it walks every DT_RUNPATH directory in order, trying to open dir/soname in each, until one succeeds. So resolving N libraries against a path of M directories costs on the order of N times M openat() attempts, almost all of which fail. That is the stat storm.
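The walk described above is easy to simulate. This is a simplified model of one object's lookups, not glibc's actual code: it ignores LD_LIBRARY_PATH, hwcaps subdirectories, and any caching inside the loader, and the store paths it generates are placeholders.

```python
from typing import Dict, List, Set, Tuple


def resolve(
    needed: List[str],
    runpath: List[str],
    contents: Dict[str, Set[str]],
) -> Tuple[int, int]:
    """Return (failed_opens, found) for an in-order DT_RUNPATH walk."""
    failed = 0
    found = 0
    for soname in needed:
        for directory in runpath:
            if soname in contents.get(directory, set()):
                found += 1
                break  # first hit wins; stop searching for this soname
            failed += 1  # one openat() that comes back ENOENT
    return failed, found


def one_lib_per_dir(n: int):
    """Nix-like layout: n dependencies, each alone in its own directory."""
    runpath = [f"/nix/store/hash{i}-dep{i}/lib" for i in range(n)]
    contents = {d: {f"libdep{i}.so"} for i, d in enumerate(runpath)}
    needed = [f"libdep{i}.so" for i in range(n)]
    return needed, runpath, contents
```

With `resolve(*one_lib_per_dir(10))` the model reports 45 failed opens for 10 hits. In this layout the misses add up to n(n-1)/2, which is the N times M growth in miniature.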

It gets worse. For every directory it searches, glibc first probes the glibc-hwcaps subdirectories for your CPU (x86-64-v3, x86-64-v2, and so on), which adds roughly three more failing opens per directory on a modern machine. On a fast SSD with a warm cache none of this is noticeable. On a slow disk, a network filesystem, a cold cache, or a low power ARM board, it is the difference between snappy and sluggish, and it multiplies across every process a shell script spawns.
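Here is a rough sketch of how the hwcaps probes inflate the count for a single soname. The three-level list used in the usage note is an assumption for an x86-64 machine that supports all of them, the store paths are placeholders, and the model ignores whatever bookkeeping glibc does to skip directories it already knows are missing.

```python
from typing import List


def candidate_paths(directory: str, soname: str, hwcaps_levels: List[str]) -> List[str]:
    """Paths probed for one directory: hwcaps subdirectories first, then the directory."""
    paths = [f"{directory}/glibc-hwcaps/{level}/{soname}" for level in hwcaps_levels]
    paths.append(f"{directory}/{soname}")
    return paths


def failed_opens_before_hit(
    runpath: List[str],
    soname: str,
    home_dir: str,
    hwcaps_levels: List[str],
) -> int:
    """Count failing opens until soname is found directly inside home_dir."""
    target = f"{home_dir}/{soname}"
    failed = 0
    for directory in runpath:
        for path in candidate_paths(directory, soname, hwcaps_levels):
            if path == target:
                return failed
            failed += 1
    raise LookupError(f"{soname} is not reachable through this runpath")
```

For a library sitting in the third of three directories, the plain walk costs 2 misses; with the levels `["x86-64-v4", "x86-64-v3", "x86-64-v2"]` it costs 11.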

Concretely, the two workloads we traced most closely:

| Workload | Loaded libraries | DT_RUNPATH dirs | Failing .so opens |
| --- | --- | --- | --- |
| devenv version | 83 | 12 (leaf binary) | ~486 |
| imagemagick magick --version | 91 | 35 | ~1225 |

The wider a binary's own DT_RUNPATH and the deeper its transitive graph, the worse the storm.
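One way to read those two workloads is as failed opens per loaded library. The figures below are the approximate ones from the traces above, so treat the ratios as rough; the helper name is made up for this example.

```python
# Approximate figures from the two traced workloads: (loaded libraries, failing .so opens).
WORKLOADS = {
    "devenv version": (83, 486),
    "magick --version": (91, 1225),
}


def misses_per_library(loaded: int, failed: int) -> float:
    """Average number of failing opens paid for each library that ends up loaded."""
    return failed / loaded


for name, (loaded, failed) in WORKLOADS.items():
    print(f"{name}: {misses_per_library(loaded, failed):.1f} failed opens per library")
```

That works out to roughly 6 misses per library for devenv and roughly 13 for imagemagick: a similar number of libraries, but about twice the wasted opens for each one.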

What a good fix has to preserve

The reason this problem has stayed open so long is that the obvious fixes break things people rely on. Any serious solution is judged against a checklist:

LD_LIBRARY_PATH override. NixOS injects the GPU driver by putting /run/opengl-driver/lib on LD_LIBRARY_PATH. If a fix stops that from winning, graphics break.

LD_PRELOAD. Interposers and shims must still load first.

The libGL / glvnd runtime swap. A program built against Mesa must be able to pick up the vendor driver at runtime.

Two libraries with the same soname. This is the heart of the Nix model: different parts of one closure can legitimately depend on different builds of the same soname, and resolution must stay per object.

dlopen. Plugins loaded at runtime are a related but separate...
