A corrupted linked list caused .NET createdump to consume 100% CPU forever
I triggered a 100% CPU infinite loop in .NET createdump, and it got fixed upstream
Ruwanpurage Pawan Ranasinghe
May 29, 2026
A few days ago, I was debugging a Linux process that behaved unexpectedly during crash handling.
Instead of producing a crash dump, the process entered a state where CPU usage remained pinned at 100%, and the dump generation never completed.
That investigation led into the internals of the .NET runtime’s Linux crash tool: createdump.
The result was a reproducible infinite loop in the code that enumerates the modules loaded by the ELF dynamic loader, later confirmed and fixed in the upstream dotnet/runtime repository.
Reference:
GitHub Issue: https://github.com/dotnet/runtime/issues/128623
Fix Commit: https://github.com/dotnet/runtime/commit/3849630
What createdump does
On Linux, when a .NET process crashes, the runtime launches a helper tool called createdump.
Its responsibility is to:
attach to the crashed process
inspect memory
enumerate loaded native modules
generate a crash dump
To enumerate loaded libraries, it walks a Linux dynamic loader structure called:
> link_map
This is a linked list of shared libraries:
libA → libB → libC → NULL
Traversal is implemented as a simple pointer walk:
for (link_map* m = head; m != nullptr; m = m->l_next)
Under normal conditions, this always terminates safely.
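To make the walk concrete, here is a minimal sketch of the same traversal over a local list. `FakeLinkMap`, `collect_module_names`, and `demo_walk` are hypothetical names for illustration. The struct mirrors only the `l_name` and `l_next` fields of glibc's `struct link_map`, and everything lives in one process rather than being read from a crashed target the way createdump does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the two fields of glibc's `struct link_map`
// that matter here; the real structure in <link.h> has more members.
struct FakeLinkMap {
    const char* l_name;   // path of the shared object
    FakeLinkMap* l_next;  // next loaded object, or nullptr at the end
};

// Walk from head to the terminating nullptr and collect the names.
// The loop ends only because a well-formed list ends in nullptr.
std::vector<std::string> collect_module_names(const FakeLinkMap* head) {
    std::vector<std::string> names;
    for (const FakeLinkMap* m = head; m != nullptr; m = m->l_next) {
        assert(m->l_name != nullptr);
        names.emplace_back(m->l_name);
    }
    return names;
}

// Builds libA -> libB -> libC -> nullptr and walks it.
std::vector<std::string> demo_walk() {
    FakeLinkMap c{"libC.so", nullptr};
    FakeLinkMap b{"libB.so", &c};
    FakeLinkMap a{"libA.so", &b};
    return collect_module_names(&a);
}
```

The only exit condition is `m != nullptr`, so termination depends entirely on the data being well-formed.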
The core issue
The implementation assumes a critical invariant:
> The link_map structure is always acyclic and well-formed.
However, that assumption is not guaranteed under memory corruption or deliberate manipulation.
If the structure becomes cyclic:
libA → libB → libC → libA → ...
then traversal never terminates.
There are no safeguards in the original logic:
no cycle detection
no visited-node tracking
no iteration bound
As a result, the traversal becomes:
> an unbounded loop over remote process memory
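As an illustration of the first missing safeguard, cycle detection, here is a sketch using Floyd's tortoise-and-hare algorithm on a local list. This is not what createdump does: `Node`, `has_cycle`, and `demo_cycle_detected` are hypothetical names, and the sketch assumes the nodes are directly addressable rather than read from another process.

```cpp
#include <cassert>

// Hypothetical stand-in for a link_map node; only the next pointer matters.
struct Node {
    Node* next;
};

// Floyd's cycle detection: advance one pointer by one hop and another by two.
// If the list loops back on itself, the fast pointer eventually meets the slow one.
bool has_cycle(const Node* head) {
    const Node* slow = head;
    const Node* fast = head;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return true;
        }
    }
    return false;  // fast reached the end, so the list is acyclic
}

// Builds libA -> libB -> libC, checks it, then links libC back to libA.
bool demo_cycle_detected() {
    Node c{nullptr};
    Node b{&c};
    Node a{&b};
    assert(!has_cycle(&a));  // well-formed list
    c.next = &a;             // corrupt the tail so it points at the head
    return has_cycle(&a);
}
```

Floyd's algorithm needs no extra memory, but it dereferences roughly three pointers per step. When every dereference is a read from another process, a simple iteration bound is cheaper.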
Observed behavior
In the reproduction environment, a cyclic link_map was installed before triggering a crash.
When the crash occurred:
createdump started normally
module enumeration began
traversal entered an infinite loop
CPU usage remained ~100%
no crash dump was produced
A representative strace snippet showed continuous remote memory reads:
process_vm_readv(...) = 4
process_vm_readv(...) = 4
process_vm_readv(...) = 4
process_vm_readv(...) = 4
No state change or termination was observed until the process was manually killed.
Impact
This is not a security boundary violation.
However, it is a serious reliability failure in crash-handling infrastructure:
crash dumps never complete
diagnostic pipelines stall indefinitely
CPU resources are consumed continuously
container restart workflows may be blocked
failure analysis becomes impossible exactly when needed
In practice:
> the crash handler becomes the failure point.
Root cause
The root cause is an unchecked assumption in a failure-critical code path:
createdump trusts that loader metadata is well-formed even during crash states.
Specifically:
> it assumes link_map traversal will always terminate via NULL pointer
This assumption does not hold under corrupted or adversarial process memory.
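The idea of treating loader metadata as untrusted input can be sketched with a simulated address space. Everything here is an assumption for illustration: `RemoteMemory` models the target process as a map from node address to the `l_next` value stored there, and `read_next` stands in for a real remote read. The walk uses visited-node tracking, which is a different safeguard from the iteration bound in the upstream fix.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <unordered_set>
#include <vector>

// Hypothetical model of the crashed process: node address -> l_next value.
// A real tool would read target memory instead of consulting a map.
using RemoteMemory = std::map<std::uint64_t, std::uint64_t>;

// Simulated remote read; fails for addresses that are not mapped.
bool read_next(const RemoteMemory& mem, std::uint64_t addr, std::uint64_t* next) {
    assert(next != nullptr);
    auto it = mem.find(addr);
    if (it == mem.end()) {
        return false;
    }
    *next = it->second;
    return true;
}

// Walks the remote list while treating every value read as untrusted.
// Returns false on an unreadable node or a revisited address (a cycle).
bool walk_untrusted(const RemoteMemory& mem, std::uint64_t head,
                    std::vector<std::uint64_t>* visited_order) {
    std::unordered_set<std::uint64_t> seen;
    for (std::uint64_t addr = head; addr != 0;) {
        if (!seen.insert(addr).second) {
            return false;  // address already visited: the list is cyclic
        }
        std::uint64_t next = 0;
        if (!read_next(mem, addr, &next)) {
            return false;  // pointer led to unreadable memory
        }
        visited_order->push_back(addr);
        addr = next;
    }
    return true;
}
```

In the cyclic case every individual read succeeds, which is why the original loop never saw an error to stop on. The failure is only visible across hops.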
Fix
The upstream fix introduces a traversal bound to prevent infinite loops.
Example mitigation:
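constexpr int MAX_LINK_MAP_ENTRIES = 100000;
int count = 0;
for (link_map* m = head; m != nullptr; m = m->l_next)
{
    // Give up once the list is longer than any plausible module count,
    // which indicates a cycle or other corruption.
    if (++count > MAX_LINK_MAP_ENTRIES)
        return false;
}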
This ensures:
deterministic termination
protection against cyclic structures
minimal overhead in normal execution
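To show the termination property in a form that can be run, here is a self-contained sketch of a bounded walk exercised against a deliberately cyclic list. `Node`, `kMaxEntries`, and `count_entries_bounded` are hypothetical names, the list is local rather than remote, and the code is an approximation of the idea rather than the upstream patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a link_map node; only the next pointer matters.
struct Node {
    Node* next;
};

// Illustrative cap, using the same value as the example mitigation.
constexpr std::size_t kMaxEntries = 100000;

// Counts list entries, giving up once more than max_entries have been seen.
// Returns true and stores the count for a list that ends in nullptr;
// returns false if the bound is exceeded, which is what a cycle causes.
bool count_entries_bounded(const Node* head, std::size_t max_entries,
                           std::size_t* out_count) {
    assert(out_count != nullptr);
    std::size_t count = 0;
    for (const Node* m = head; m != nullptr; m = m->next) {
        if (++count > max_entries) {
            return false;
        }
    }
    *out_count = count;
    return true;
}
```

On a cyclic list the function wastes at most `max_entries` hops before returning `false`. It uses no extra memory, which matters in a tool that runs while the process is already in a bad state.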
Why this matters
Crash dump systems operate at the most fragile point in the system lifecycle:
after memory corruption may already exist
under partial process failure
with inconsistent runtime state
This makes them uniquely sensitive to unsafe assumptions.
This bug is a clear example:
> a simple linked-list traversal becoming a full CPU-exhaustion loop in failure-handling infrastructure.
Reproduction environment
.NET Runtime: 8.0.26
OS: Amazon Linux 2023
Architecture: ARM64 (aarch64)
Reproduction was performed using a controlled modification of link_map prior to crash triggering.
Takeaway
The most interesting part of this issue is not the loop itself.
It is the assumption behind it:
> even failure-handling systems can fail catastrophically when they trust corrupted state too much.
Reliability engineering does not end at application logic.
It extends into the code that runs after everything has already gone wrong.