A corrupted linked list caused .NET createdump to consume 100% CPU forever

Ruwanpurage's Substack

I triggered a 100% CPU infinite loop in .NET createdump, and it got fixed upstream

Ruwanpurage Pawan Ranasinghe
May 29, 2026

A few days ago, I was debugging a Linux process that behaved unexpectedly during crash handling.

Instead of producing a crash dump, the process entered a state where CPU usage remained pinned at 100%, and the dump generation never completed.

That investigation led into the internals of the .NET runtime’s Linux crash tool: createdump.

The result was a reproducible infinite traversal condition in ELF loader enumeration logic, later confirmed and fixed in the upstream dotnet/runtime repository.

Reference:

GitHub Issue: https://github.com/dotnet/runtime/issues/128623

Fix Commit: https://github.com/dotnet/runtime/commit/3849630

What createdump does

On Linux, when a .NET process crashes, the runtime launches a helper tool called createdump.

Its responsibility is to:

attach to the crashed process

inspect memory

enumerate loaded native modules

generate a crash dump

To enumerate loaded libraries, it walks a Linux dynamic loader structure called:

> link_map

This is a linked list of shared libraries:

libA → libB → libC → NULL

Traversal is implemented as a simple pointer walk:

for (link_map* m = head; m != nullptr; m = m->l_next)

Under normal conditions, this always terminates safely.
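To make the walk concrete, here is a minimal sketch of the same loop when the list lives in another process. Everything in it is an assumption for illustration: RemoteLinkMap, FakeRemoteMemory, ReadEntry and EnumerateModules are hypothetical names, the "remote memory" is just a map from addresses to entries, and none of this is createdump's actual code. A real tool would copy each entry out of the target process instead, for example with process_vm_readv.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified view of one link_map entry as seen from outside
// the target process. The real structure has more fields.
struct RemoteLinkMap {
    std::string name;    // stands in for the string behind l_name
    std::uint64_t next;  // remote address of the next entry, 0 means NULL
};

// Hypothetical stand-in for the target's address space.
using FakeRemoteMemory = std::unordered_map<std::uint64_t, RemoteLinkMap>;

// Simulated remote read: returns false when the address is not mapped.
bool ReadEntry(const FakeRemoteMemory& mem, std::uint64_t addr, RemoteLinkMap* out) {
    auto it = mem.find(addr);
    if (it == mem.end()) {
        return false;
    }
    *out = it->second;
    return true;
}

// Same shape as the pointer walk above: one remote read per entry, and the
// loop stops only at a NULL next pointer or a failed read.
std::vector<std::string> EnumerateModules(const FakeRemoteMemory& mem, std::uint64_t head) {
    std::vector<std::string> names;
    for (std::uint64_t addr = head; addr != 0; ) {
        RemoteLinkMap entry;
        if (!ReadEntry(mem, addr, &entry)) {
            break;  // unreadable memory ends the walk
        }
        names.push_back(entry.name);
        addr = entry.next;
    }
    return names;
}
```

With three entries chained libA → libB → libC and a final next address of 0, EnumerateModules returns the three names in order. Note that the only exits are a NULL next pointer and a failed read.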

The core issue

The implementation assumes a critical invariant:

> The link_map structure is always acyclic and well-formed.

However, that assumption is not guaranteed under memory corruption or deliberate manipulation.

If the structure becomes cyclic:

libA → libB → libC → libA → ...

then traversal never terminates.

There are no safeguards in the original logic:

no cycle detection

no visited-node tracking

no iteration bound

As a result, the traversal becomes:

> an unbounded loop over remote process memory
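For comparison, this is roughly what visited-node tracking could look like. It is a sketch under assumptions, not the approach taken upstream: Node is a hypothetical stand-in for link_map, the pointers are in-process rather than remote, and WalkWithVisitedSet is a name made up for this example.

```cpp
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for link_map; only the next pointer matters here.
struct Node {
    const char* name;
    Node* next;
};

// Walks the list while remembering every node address already seen.
// Returns true if the walk reached NULL and stores the node count in *count.
// Returns false as soon as an address repeats, which means the list is cyclic.
bool WalkWithVisitedSet(const Node* head, std::size_t* count) {
    std::unordered_set<const Node*> visited;
    for (const Node* m = head; m != nullptr; m = m->next) {
        if (!visited.insert(m).second) {
            return false;  // insert failed: this node was visited before
        }
    }
    *count = visited.size();
    return true;
}
```

The trade-off is memory: the set grows with the list, and a crash-time tool may prefer not to allocate more than it has to. A plain iteration bound avoids that cost, at the price of detecting a cycle later.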

Observed behavior

In the reproduction environment, a cyclic link_map was installed before triggering a crash.

When the crash occurred:

createdump started normally

module enumeration began

traversal entered an infinite loop

CPU usage remained ~100%

no crash dump was produced

A representative strace snippet showed continuous remote memory reads:

process_vm_readv(...) = 4
process_vm_readv(...) = 4
process_vm_readv(...) = 4
process_vm_readv(...) = 4

No state change or termination was observed until the process was manually killed.

Impact

This is not a security boundary violation.

However, it is a serious reliability failure in crash-handling infrastructure:

crash dumps never complete

diagnostic pipelines stall indefinitely

CPU resources are consumed continuously

container restart workflows may be blocked

failure analysis becomes impossible exactly when needed

In practice:

> the crash handler becomes the failure point.

Root cause

The root cause is an unchecked assumption in a failure-critical code path:

createdump trusts that loader metadata is well-formed even during crash states.

Specifically:

> it assumes link_map traversal will always terminate at a NULL pointer

This assumption does not hold under corrupted or adversarial process memory.

Fix

The upstream fix introduces a traversal bound to prevent infinite loops.

Example mitigation:

constexpr int MAX_LINK_MAP_ENTRIES = 100000;

int count = 0;

for (link_map* m = head; m != nullptr; )
{
    // Give up once the walk exceeds the bound, e.g. on a cyclic list
    if (++count > MAX_LINK_MAP_ENTRIES)
        return false;

    m = m->l_next;
}

This ensures:

deterministic termination

protection against cyclic structures

minimal overhead in normal execution
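To see the bound doing its job, here is a small self-contained sketch that can be compiled on its own. It is not the upstream patch: ModuleEntry is a hypothetical stand-in for link_map, CountModulesBounded is a made-up helper, the pointers are in-process rather than remote, and the maxEntries parameter exists only so the limit can be lowered in a test.

```cpp
// Hypothetical stand-in for link_map.
struct ModuleEntry {
    const char* name;
    ModuleEntry* next;
};

// Same bound as in the example mitigation above.
constexpr int MAX_LINK_MAP_ENTRIES = 100000;

// Counts the entries in the list. Returns false, leaving *outCount untouched,
// once more than maxEntries nodes have been visited. A cyclic list always
// ends up on that path, so the function terminates either way.
bool CountModulesBounded(const ModuleEntry* head, int* outCount,
                         int maxEntries = MAX_LINK_MAP_ENTRIES) {
    int count = 0;
    for (const ModuleEntry* m = head; m != nullptr; m = m->next) {
        if (++count > maxEntries) {
            return false;
        }
    }
    *outCount = count;
    return true;
}
```

On a well-formed three-entry list the call succeeds with a count of 3. If the last entry is pointed back at the first, the same call returns false after the bound is exceeded instead of spinning forever.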

Why this matters

Crash dump systems operate at the most fragile point in the system lifecycle:

after memory corruption may already exist

under partial process failure

with inconsistent runtime state

This makes them uniquely sensitive to unsafe assumptions.

This bug is a clear example:

> a simple linked-list traversal becoming a full CPU-exhaustion loop in failure-handling infrastructure.

Reproduction environment

.NET Runtime: 8.0.26

OS: Amazon Linux 2023

Architecture: ARM64 (aarch64)

Reproduction was performed using a controlled modification of link_map prior to crash triggering.

Takeaway

The most interesting part of this issue is not the loop itself.

It is the assumption behind it:

> even failure-handling systems can fail catastrophically when they trust corrupted state too much.

Reliability engineering does not end at application logic.

It extends into the code that runs after everything has already gone wrong.
