One bitmask in task_struct fixes 15 years of Linux signal conflicts

One bitmask in task_struct: how a 10-line kernel patch resolves 15 years of multi-runtime signal conflicts on Linux — Goga Koreli

gkoreli.com



I spent the last two days debugging why a Bun server on Linux would permanently freeze the moment a Go shared library and a WebAssembly module coexisted in the same process. The strace showed 8,500 SIGPWR signals per second flooding the main thread. The event loop never recovered.

A fix is in progress — Bun's team is patching their WebKit fork to work around it. But the root cause isn't a bug in any one project. It's a kernel feature that doesn't exist yet — one that would take about 10 lines to implement.

The bug

A server process on Linux loads two things:

A Go CGo shared library via dlopen() (for authentication)

A WebAssembly module (for collaborative editing)

The first WASM function call permanently kills the event loop. setTimeout never fires. fetch never resolves. Microtasks still work (Promise.resolve is fine), but all macrotasks are dead. The process burns 100% CPU doing nothing useful.

Strace reveals the cause:

    [pid 555498] tgkill(555498, 555498, SIGPWR) = 0
    [pid 555498] tgkill(555498, 555498, SIGPWR) = 0
    [pid 555498] tgkill(555498, 555498, SIGPWR) = 0
    ... (25,678 times in 3 seconds)

A compilation helper thread sends SIGPWR to the main thread in an infinite retry loop. The signal handler never acknowledges. The helper never stops.

Why it happens

Three facts about Linux signal delivery:

sigaction flags (including SA_ONSTACK) are process-wide. All threads share one signal disposition per signal.

sigaltstack is per-thread. Each thread can configure its own alternate signal stack.

The kernel delivers on the alt stack if and only if BOTH are true: SA_ONSTACK is set on the handler AND the receiving thread has a sigaltstack configured.
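The third fact can be checked from user space. The following is a minimal sketch, not code from any of the projects discussed: the helper name handler_ran_on_alt_stack, the 64KB stack size, and the use of SIGUSR1 as a stand-in for SIGPWR are all assumptions made for illustration. It registers an alternate stack on the calling thread, installs a handler with the flags it is given, and reports which stack the handler actually ran on.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)    /* illustrative size, not Bun's 512KB */

static char *alt_base;                /* base of this thread's alternate stack */
static volatile sig_atomic_t on_alt;  /* written by the handler */

/* Records whether the handler's own frame lies inside the alternate stack. */
static void probe_handler(int signo) {
    (void)signo;
    char marker;                      /* a local, so its address is the handler's stack */
    uintptr_t sp = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_base;
    on_alt = (sp >= lo && sp < lo + ALT_STACK_SIZE);
}

/* Hypothetical helper: installs probe_handler for SIGUSR1 with sa_flags,
 * raises the signal on the calling thread, and returns 1 if the handler
 * ran on the alternate stack, 0 if it ran on the normal stack, or -1 if
 * setup failed. */
int handler_ran_on_alt_stack(int sa_flags) {
    if (alt_base == NULL) {
        /* Per-thread half: give the calling thread an alternate stack. */
        alt_base = malloc(ALT_STACK_SIZE);
        if (alt_base == NULL) return -1;
        stack_t ss = { .ss_sp = alt_base, .ss_size = ALT_STACK_SIZE, .ss_flags = 0 };
        if (sigaltstack(&ss, NULL) != 0) return -1;
    }

    /* Process-wide half: the handler and its flags. */
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sa.sa_flags = sa_flags;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    on_alt = -1;
    raise(SIGUSR1);                   /* handler runs before raise() returns */
    return on_alt;
}
```

Calling handler_ran_on_alt_stack(0) and then handler_ran_on_alt_stack(SA_ONSTACK) on the same thread should show the handler moving from the normal stack to the alternate stack, even though the thread's sigaltstack configuration never changed.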

Now the sequence:

Bun starts. Main thread calls sigaltstack(512KB) for its crash handler (needs alt stack to report stack overflows). Installs a SIGPWR handler without SA_ONSTACK — SIGPWR is used for thread suspension and must run on the normal stack for the handler's stack-position check to work.

Go .so loaded via dlopen. Go's runtime calls setsigstack() on every signal whose existing handler is neither default nor ignored. This reads the current sigaction, ORs in SA_ONSTACK, and reinstalls it. The call sits in Go's runtime/signal_unix.go:

    // Even if we are not installing a signal handler,
    // set SA_ONSTACK if necessary.
    if fwdSig[i] != _SIG_DFL && fwdSig[i] != _SIG_IGN {
        setsigstack(i)
    }

Next SIGPWR delivery. Kernel checks: SA_ONSTACK? Yes (Go added it). Thread has sigaltstack? Yes (Bun's crash handler). Delivers on the alt stack.

Handler runs on wrong stack. The handler's stack-position check fails (it's on the alt stack, not the normal stack). It doesn't acknowledge the suspension. The sender retries. Forever.
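The four steps above can be replayed in a single process without Go or Bun. This is a sketch under stated assumptions: host_init, add_onstack_flag, and deliver_and_check_ack are hypothetical names, SIGUSR1 stands in for SIGPWR, and the "stack-position check" is reduced to "am I on the alternate stack?", which is simpler than whatever the real handler does.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)   /* illustrative size */

static char *alt_base;               /* host's alternate stack for crash handling */
static volatile sig_atomic_t acked;  /* 1 once the handler accepts the suspension */

/* Hypothetical suspension handler: it acknowledges only when it finds
 * itself on the normal stack, mimicking a stack-position check. */
static void suspend_handler(int signo) {
    (void)signo;
    char marker;
    uintptr_t sp = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_base;
    int on_alt = (sp >= lo && sp < lo + ALT_STACK_SIZE);
    if (!on_alt) acked = 1;
}

/* Step 1: the host registers an alternate stack on this thread and
 * installs its suspension handler WITHOUT SA_ONSTACK. */
int host_init(void) {
    alt_base = malloc(ALT_STACK_SIZE);
    if (alt_base == NULL) return -1;
    stack_t ss = { .ss_sp = alt_base, .ss_size = ALT_STACK_SIZE, .ss_flags = 0 };
    if (sigaltstack(&ss, NULL) != 0) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = suspend_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Step 2: stand-in for the behaviour described for setsigstack():
 * read the current disposition, OR in SA_ONSTACK, reinstall it. */
int add_onstack_flag(int signo) {
    struct sigaction sa;
    if (sigaction(signo, NULL, &sa) != 0) return -1;
    if (sa.sa_flags & SA_ONSTACK) return 0;  /* already set, nothing to do */
    sa.sa_flags |= SA_ONSTACK;
    return sigaction(signo, &sa, NULL);
}

/* Steps 3 and 4: deliver one signal to the calling thread and report
 * whether the handler acknowledged it. */
int deliver_and_check_ack(void) {
    acked = 0;
    raise(SIGUSR1);
    return acked;
}
```

After host_init(), deliver_and_check_ack() should return 1. After add_onstack_flag(SIGUSR1), the same call should return 0: the handler code is unchanged, but it now runs on the alternate stack and its check fails. A sender that retries until it sees an acknowledgement would loop at this point.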

This isn't a bug in Go, Bun, or WebKit

Go's behavior is documented and intentional:

"If there is an existing signal handler, the Go runtime will turn on the SA_ONSTACK flag and otherwise keep the signal handler."

Go needs SA_ONSTACK because goroutine stacks are small: they start at 2KB in current Go releases and grow on demand. Without it, a signal handler running on a thread that is executing a goroutine could overflow that stack. Go configures per-thread sigaltstack on its own threads, but the kernel requires SA_ONSTACK on the handler too; otherwise the alt stack won't be used.

Bun needs sigaltstack on its main thread for crash reporting. Without it, a stack overflow followed by SIGSEGV would have no stack to run the crash handler on.

Both are correct. Both are necessary. They're incompatible because POSIX was designed for single-runtime processes — a world where one process meant one runtime with one signal handling policy.

The same bug, everywhere

Once I understood the mechanism, I found it recurring across the ecosystem:

| Year | Project | Issue | Impact |
|------|---------|-------|--------|
| 2015 | Go | #13034 | Signal forwarding broken with embedders |
| 2016 | Linux kernel | bugzilla #153531 | AVX-512 overflows MINSIGSTKSZ → memory corruption (P1, still open) |
| 2025 | Go + .NET | #78883 | CoreCLR SIGSEGV when loaded with Go |
| 2026 | Bun + Go | #31158 | Event loop permanently dead |
| 2026 | Bun + Go + Prisma | #29843 | Database queries hang |
| | Valve/Proton | #6762 | Games crash on Linux |
| | Duplicati | #5793 | .NET + Go backup crashes |
| | AFLplusplus | #2545 | Fuzzer sigaltstack failure |
| | LLVM | #48092 | libFuzzer breaks ASAN stack-overflow detection |

Each team thought it was their bug. Each shipped their own workaround:

Bun: read the interrupted SP from ucontext instead of the handler's own SP (WebKit #235)

.NET: increase alt stack size (dotnet/runtime#110368)

LLVM: preserve SA_ONSTACK flag in libFuzzer

Go: "host must use SA_ONSTACK" (documentation, not a fix)

Valve: unfixed
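The first workaround in the list, reading the interrupted stack pointer from ucontext, can be sketched as follows. This is an illustration of the general technique only, not the WebKit patch: probe_stack_pointers is a hypothetical helper, SIGUSR1 stands in for SIGPWR, and the register access is written for x86-64 and AArch64 on Linux (other architectures fall through and report 0).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

#define ALT_STACK_SIZE (64 * 1024)       /* illustrative size */

static volatile uintptr_t handler_sp;      /* where the handler itself runs */
static volatile uintptr_t interrupted_sp;  /* where the thread was when interrupted */

static void sp_probe(int signo, siginfo_t *info, void *ctx) {
    (void)signo; (void)info;
    ucontext_t *uc = ctx;
    char marker;
    handler_sp = (uintptr_t)&marker;
#if defined(__x86_64__)
    interrupted_sp = (uintptr_t)uc->uc_mcontext.gregs[REG_RSP];
#elif defined(__aarch64__)
    interrupted_sp = (uintptr_t)uc->uc_mcontext.sp;
#else
    (void)uc;
    interrupted_sp = 0;                  /* unsupported architecture in this sketch */
#endif
}

static int in_range(uintptr_t p, const char *base) {
    uintptr_t lo = (uintptr_t)base;
    return p >= lo && p < lo + ALT_STACK_SIZE;
}

/* Hypothetical helper: runs sp_probe with SA_ONSTACK on an alternate stack
 * and reports whether each stack pointer falls inside that stack.
 * Returns 0 on success, -1 if setup failed. */
int probe_stack_pointers(int *handler_on_alt, int *interrupted_on_alt) {
    char *alt = malloc(ALT_STACK_SIZE);
    if (alt == NULL) return -1;
    stack_t ss = { .ss_sp = alt, .ss_size = ALT_STACK_SIZE, .ss_flags = 0 };
    if (sigaltstack(&ss, NULL) != 0) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sp_probe;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;  /* SA_SIGINFO supplies the ucontext */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    raise(SIGUSR1);
    *handler_on_alt = in_range(handler_sp, alt);
    *interrupted_on_alt = in_range(interrupted_sp, alt);
    return 0;
}
```

On a supported architecture, the handler's own stack pointer should land inside the alternate stack while the saved one still points at the normal stack, which is why a check based on the saved value is unaffected by SA_ONSTACK being added behind the host's back.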

Nobody stepped back and asked: why does this keep happening?

The missing kernel primitive

The...
