Unfathomable bugs #10: The Broken Windows Build
27 Jun 2026
It all started as I was showing one of our summer interns how to use stimflow to make a quantum circuit.<br>We noticed a stupid bug: adding a flow with start="auto" was failing if the flow was named.<br>Easy fix.<br>I wrote it up, created a pull request on stim’s github repository… and the nightmare started.
The windows builds were failing.
The windows builds weren’t just failing for this PR, but for all PRs.<br>Halfway through unit testing, continuous integration would crash with a vague but ominous message: “access violation”.<br>Something had broken, and created an error that could plausibly lead to security exploits.
Dependency Whinging
One of the frustrating things about modern software engineering is that things never just keep working.<br>You can get something to work, but nothing keeps working.<br>Eventually someone somewhere changes something, and you lose your day figuring out what the hell happened.
This “can’t just keep working” problem gets worse if you have a lot of dependencies.<br>Fortunately, I took a principled stand on this when writing stim and use very few dependencies.<br>…Except in the build system.
In case you didn’t know: building python packages is bullshit.<br>The process is known to be very brittle, and every year they make things more complicated by trying to fix that it’s complicated.<br>Currently, the recommended method for building a package is to use a docker container.<br>Otherwise too many details about the system doing the building can make their way into the package and cause problems.<br>But even this isn’t enough; after doing the containerized build you still need to run a tool called auditwheel over the package to fix some remaining problems.
Of course, the bullshit of building a python package is layered on top of the usual bullshit of getting something to build cross-platform.<br>For example, did you know that on Windows it’s so hard to find the location of the C++ compiler that they ship a program called vswhere.exe that specializes in solving this task?<br>(How do you find vswhere.exe? Well it should be at %ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe of course.)<br>Even after you’ve used vswhere.exe to find the compiler, you’re not done.<br>Compilation will fail due to the compiler complaining it can’t find standard headers.<br>Worry not; it’s known that finding standard headers is also very hard on Windows.<br>There’s another program, vcvarsall.bat, which specializes in solving that task.
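To make the two-step dance concrete, here is a minimal sketch in python of how one might locate vswhere.exe and then vcvarsall.bat. It is not what cibuildwheel or setup.py actually do. The helper names (vswhere_path, vswhere_command, vcvarsall_path, find_vcvarsall) are hypothetical, the fallback value for %ProgramFiles(x86)% is an assumption, and the VC\Auxiliary\Build location of vcvarsall.bat is the usual layout for recent Visual Studio installs rather than a guarantee.

```python
import ntpath
import subprocess


def vswhere_path(env):
    """Return the conventional vswhere.exe location for an environment mapping."""
    # Assumption: fall back to the default directory if the variable is missing.
    program_files_x86 = env.get("ProgramFiles(x86)", r"C:\Program Files (x86)")
    return ntpath.join(
        program_files_x86, "Microsoft Visual Studio", "Installer", "vswhere.exe"
    )


def vswhere_command(env):
    """Build a vswhere invocation asking for the newest install with C++ tools."""
    return [
        vswhere_path(env),
        "-latest",
        "-products", "*",
        "-requires", "Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
        "-property", "installationPath",
    ]


def vcvarsall_path(install_dir):
    """Return where vcvarsall.bat usually lives inside a Visual Studio install."""
    return ntpath.join(install_dir, "VC", "Auxiliary", "Build", "vcvarsall.bat")


def find_vcvarsall(env):
    """Run vswhere (Windows only) and return the expected vcvarsall.bat path."""
    result = subprocess.run(
        vswhere_command(env), capture_output=True, text=True, check=True
    )
    return vcvarsall_path(result.stdout.strip())
```

On a Windows machine, find_vcvarsall(os.environ) should give you the batch file to run before invoking cl.exe. The path-building helpers use ntpath so they behave the same on any platform, which at least makes them testable without a Windows box.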
…Perhaps now you have a sense of why one might throw up their hands, shout “FINE! I’ll let someone else solve it!”, and take on a build dependency.
Because of how complicated it is to build cross-platform python packages, I use cibuildwheel to make stim’s packages.<br>cibuildwheel solves my immediate problem, but it isn’t a flawless library.<br>Libraries of this kind have a nasty tendency to solve 90% of your problem while making the remaining 10% ten times harder.<br>The simple cases work great, but as soon as you do something non-standard (like runtime detection of SIMD support) the abstraction buckles.<br>You inevitably end up needing to explain to tool A how to explain to tool B how to explain to tool C how to explain to tool D to please do this one simple thing, but unfortunately the layers of explanation turn that simple desire into some eldritch incantation that you write once and then try to never think of again.<br>Solving these buckled abstractions yields no enduring lessons and creates no sense of satisfaction; at best it kindles a latent fear of abstraction.
Anyways, all of this was to try to explain to you why “the windows builds are failing with an access violation” is such a nightmare.<br>It could be a bug in my code.<br>It could be a bug in cibuildwheel.<br>It could be a bug in github actions.<br>It could be a bug in visual studio.<br>It could be a bug in the interaction between these systems.<br>There’s very little to go on, and it’s invariably going to end in some unsatisfying way.
Adding salt to the wound, I can’t reproduce the bug locally.<br>I don’t have a windows machine, and Github actions doesn’t support local execution.<br>The only environment I know the bug happens in is the Windows build within continuous integration triggered by pushing to github.<br>(I’ve never managed to explain to cibuildwheel to explain to setup.py to explain to cl.exe to please build things in parallel, so these builds take several minutes.)<br>Progress will be slow.
Scorched Earth Debugging
When I’m stuck on a complicated debugging problem, I pull what I call the nuclear strategy.<br>I start deleting everything.<br>Find something irrelevant, delete it, check that the bug is still there, repeat.<br>Keep going until the bug is small enough that you can understand your own stupidity.
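The delete-check-repeat loop can be sketched in a few lines of python. This is a toy illustration, not a tool I actually ran: shrink and still_fails are hypothetical names, and in the real situation the "check" is a human pushing a commit and waiting several minutes for continuous integration rather than a function call.

```python
def shrink(parts, still_fails):
    """Greedily delete parts one at a time while the failure persists.

    parts: a list of things that could be deleted (files, tests, build steps).
    still_fails: a callable taking a candidate list and returning True if the
        bug still reproduces with only those parts present.
    """
    kept = list(parts)
    progress = True
    while progress:
        progress = False
        for i in range(len(kept)):
            # Try removing one part and see whether the bug survives.
            candidate = kept[:i] + kept[i + 1:]
            if still_fails(candidate):
                kept = candidate
                progress = True
                break  # Restart the scan over the smaller list.
    return kept


if __name__ == "__main__":
    # Toy failure: the bug needs both "b" and "d" to be present.
    culprit = shrink(["a", "b", "c", "d", "e"], lambda p: "b" in p and "d" in p)
    print(culprit)
```

The loop stops when no single deletion keeps the bug alive, so whatever remains is a set of parts that are each needed to reproduce the failure. It makes no attempt to be clever about the number of checks, which matters a lot when each check costs a full CI run.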
(I have a fond...