SupXML: The modern, memory-safe XML parser drop-in replacement for libxml2


Introducing SupXML, the modern memory-safe XML parser alternative to libxml2 - Supported Source


Published Jun 5, 2026


A fast, memory-safe, modern XML parser from Supported Source

For over two decades, most companies have relied on libxml2 for parsing XML files.<br>It ships inside browsers, operating<br>systems, programming language runtimes, and countless enterprise systems.<br>It is, by any measure, critical infrastructure for the internet.<br>But concerns arose recently when its long-time maintainer stepped<br>down.

SupXML is our attempt to carry that legacy forward by making a modern,<br>memory-safe XML library built for the next twenty years.

Standing on libxml2's shoulders

libxml2 is a genuinely remarkable piece of engineering. Written in C in the<br>late 1990s, it became the default XML toolkit.<br>If you've parsed an XML config file, validated an XSD, signed a SAML<br>assertion, or rendered a web page, there's a good chance libxml2 was somewhere<br>in the stack. We have enormous respect for it.

What's changed since the late 90s is the language we'd choose to write it in.<br>Both Microsoft and the Chromium project have reported that roughly 70% of their<br>serious security vulnerabilities come from<br>memory-safety bugs: buffer overflows, use-after-free, and the like. Any large<br>C codebase that parses untrusted input lives with that risk because C<br>neither bounds-checks array accesses nor tracks object lifetimes.

SupXML: memory-safe by construction

SupXML is a memory-safe, fast, spec-compliant XML library for Rust, with a<br>drop-in C ABI replacement for libxml2. It's written in pure Rust, which means<br>an entire class of vulnerabilities simply can't happen — the memory-safety<br>bugs that account for the majority of XML-parser CVEs are ruled out at compile<br>time.
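To make the "can't happen" claim concrete, here is a minimal sketch of how safe Rust handles an untrusted length field. It is not taken from SupXML; the `read_field` function and its one-byte length prefix are assumptions made up for this illustration. The point is that a length that lies about the buffer size produces `None` instead of an out-of-bounds read.

```rust
/// Reads a length-prefixed field from an untrusted buffer.
/// Layout assumed for this demo: one length byte, then that many payload bytes.
/// (Hypothetical helper, not part of SupXML.)
fn read_field(input: &[u8]) -> Option<&[u8]> {
    // `split_first` returns None on an empty buffer instead of reading past it.
    let (&len, rest) = input.split_first()?;
    // `get` checks the range against the slice length, so a length byte that
    // claims more data than is present yields None rather than an overread.
    rest.get(..len as usize)
}
```

Calling `read_field(&[3, b'x', b'm', b'l'])` gives the three payload bytes, while `read_field(&[200, b'x'])` gives `None`. Plain indexing such as `rest[..len as usize]` would panic on the bad input rather than read adjacent memory, which is the behavior the equivalent unchecked C code cannot promise.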

SupXML isn't memory-safe by accident, either. The small amount of<br>unsafe code it does contain is a tightly audited core where every<br>block carries a safety comment, checked with Miri and fuzzing to<br>catch the unknown unknowns.
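As an illustration of what a safety comment on an unsafe block looks like, here is a small sketch. It is not SupXML's code; `first_non_whitespace` is a hypothetical helper, and the unchecked access is used only to show the convention of stating the invariant that makes the block sound.

```rust
/// Returns the first byte of `bytes` that is not XML whitespace, if any.
/// (Hypothetical helper for illustration, not part of SupXML.)
fn first_non_whitespace(bytes: &[u8]) -> Option<u8> {
    let mut i = 0;
    while i < bytes.len() {
        // SAFETY: the loop condition guarantees `i < bytes.len()`, so the
        // index passed to `get_unchecked` is always in bounds.
        let b = unsafe { *bytes.get_unchecked(i) };
        // XML whitespace is space, tab, carriage return, and line feed.
        if !matches!(b, b' ' | b'\t' | b'\r' | b'\n') {
            return Some(b);
        }
        i += 1;
    }
    None
}
```

The comment records why the block is sound, so a reviewer can check the claim against the surrounding code. Running a test suite under Miri exercises the same claim dynamically: if the invariant were wrong on some input, the out-of-bounds access would be reported as undefined behavior.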

We're faster, too

Safety often comes with a performance tax, but SupXML is built for efficiency too.<br>On a full-validation DOM parse across the benchmark fixtures, SupXML is faster<br>than libxml2 on every one, with a median speedup of about<br>2.1×. Entity-heavy and attribute-dense documents see even<br>larger gains: a 599 KB Arabic text parses ~3.75× faster, and an English Wikipedia article<br>~2.96× faster.

It's strict on correctness, as well. On the W3C XML conformance suite, SupXML<br>passes all the tests. Across a broader 1,147-file corpus drawn from multiple<br>vendors, SupXML scores 95.6%. On XSD 1.0 validation it reaches 99.2%<br>conformance. These results are all as good as, if not better than, libxml2's.

This isn't meant to single out libxml2. It's worth pointing out that libxml2 is<br>actually significantly better than most other XML parsers. I've tested<br>them, and basically nothing comes close to the scores of libxml2 or SupXML.<br>Most XML parsers fall well short on conformance, which has security implications<br>for companies relying on them.<br>Even if you think you trust your XML files, the trust boundary is fuzzy,<br>and there are lots of ways a malicious file can get into what you think<br>is a trusted space.

Drop-in replacement for libxml2

When I set out to make SupXML, I intentionally built it in Rust,<br>but added a C ABI (Application Binary Interface) matching libxml2's<br>functions, to make it easy to migrate. This means that anyone<br>using libxml2, from any language, can migrate to SupXML without<br>having to update their code. The existing code will call our C<br>functions, which then use our Rust library.
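The layering described above, C callers on the outside and Rust on the inside, can be sketched roughly as follows. This is not SupXML's source: the symbol `demo_count_start_tags` and the toy `count_start_tags` function are hypothetical stand-ins. A real drop-in would instead export the names and signatures that libxml2 callers already link against, such as `xmlReadMemory` and `xmlFreeDoc`. The sketch assumes a toolchain that accepts the `#[unsafe(no_mangle)]` attribute form (Rust 1.82 or later).

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Safe Rust core: counts '<' bytes that open an element, skipping
/// `</`, `<?`, and `<!`. A toy stand-in for real parsing logic.
fn count_start_tags(xml: &str) -> usize {
    let bytes = xml.as_bytes();
    let mut count = 0;
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'<' {
            match bytes.get(i + 1).copied() {
                Some(b'/') | Some(b'?') | Some(b'!') | None => {}
                Some(_) => count += 1,
            }
        }
    }
    count
}

/// Hypothetical exported symbol with the C calling convention.
/// Returns the count, or -1 for a null pointer or invalid UTF-8.
///
/// # Safety
/// `text` must be null or point to a valid NUL-terminated string.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn demo_count_start_tags(text: *const c_char) -> c_int {
    if text.is_null() {
        return -1;
    }
    // SAFETY: `text` was checked for null above, and the caller guarantees
    // it points to a NUL-terminated string that outlives this call.
    let c_str = unsafe { CStr::from_ptr(text) };
    match c_str.to_str() {
        Ok(s) => count_start_tags(s) as c_int,
        Err(_) => -1,
    }
}
```

The unsafe surface is confined to the thin boundary function, which validates the raw pointer and then hands a checked `&str` to ordinary safe Rust. From C, the function would be declared as `int demo_count_start_tags(const char *text);` and called like any other library function.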

Why does critical software depend on volunteers?

The technical case for SupXML is strong, but it isn't the most important part<br>of the story. The most important part is why a library this critical<br>was carried for so long by so few.

Open source built the modern world, and it did so without much financial support at all.<br>The people who write and maintain the libraries everyone depends on are, with<br>very few exceptions, not paid for it. They do it on nights and weekends, out<br>of a sense of duty, until they burn out and walk away, at which point the<br>multi-billion-dollar companies depending on their work scramble to find<br>someone else to do it for free.

The fact is, open source software is worth billions, if not trillions,<br>of dollars to the global economy, but barely gets any monetary support.<br>Companies are free riders when it comes to critical software. It's<br>the tragedy of the commons.

libxml2 is a case in point. In recent years, its long-time maintainer was candid about<br>the fact that a single unpaid volunteer can't be expected to provide urgent,<br>ongoing security work indefinitely. He was right, and it wasn't a complaint<br>about the code or about the people who use it. It's a structural problem: a<br>foundational library used by the largest companies in the world should not<br>depend on one person's goodwill. It's not the maintainer's fault, really.<br>It's a failure of how the world funds (or rather,...

