Getting LLMs Drunk to Find Remote Linux Kernel OOB Writes (and More)


Hey, it’s Asim


TLDR: the grossly overengineered, self-orchestrating team of vulnerability-hunting agents detailed below has discovered 20+ CVEs over the past few months, including CVE-2026-31432 and CVE-2026-31433: two remote, unauthenticated OOB writes in the Linux kernel’s ksmbd. Read on for the details of the setup that achieved this, including – yes! – getting LLMs drunk.

Background

“LLMing” vulnerability research has been on my “Do Something About This” list since DARPA’s AIxCC and XBOW’s initial results. But back in 2023-24, models required a lot of harnessing to get anything useful, tool use was rudimentary, and the idea of squeezing as much code as I could into a model’s context – then triaging away the false positives – filled me with dread.

The push to actually do something came in the summer of 2025. Rich Mirch reported a dead-simple, unnoticed-for-12-years local privilege escalation in sudo: CVE-2025-32462. Contrary to the documentation, the --host flag did not just permit listing privileges on a different host – it made the hostname portion of sudo rules irrelevant. So, e.g., if a sudoers rule granted you root on somehost but not the local host, you could abuse the flag to get full root locally.
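To make the mismatch concrete, here is a toy model of the rule matching described above. It is a sketch under loud assumptions, not sudo's actual code: the `Rule` type, `is_allowed`, and the `flag_applies_to_exec` switch are all hypothetical names invented for illustration. The only thing it tries to capture is that letting a user-supplied host stand in for the local host during rule matching makes the host portion of a rule meaningless.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    """One simplified sudoers-style grant (hypothetical model, not sudo's format)."""
    user: str
    host: str
    command: str  # "ALL" matches any command


def is_allowed(
    rules: Iterable[Rule],
    user: str,
    command: str,
    local_host: str,
    host_flag: Optional[str] = None,
    flag_applies_to_exec: bool = True,
) -> bool:
    """Return True if some rule grants `command` to `user`.

    flag_applies_to_exec=True models the flawed behavior: the host named by
    the flag is used for rule matching even when running a command.
    flag_applies_to_exec=False models the documented intent: the flag never
    changes which host is used to authorize command execution.
    """
    if host_flag is not None and flag_applies_to_exec:
        match_host = host_flag  # attacker-chosen value decides the match
    else:
        match_host = local_host
    return any(
        r.user == user and r.host == match_host and r.command in ("ALL", command)
        for r in rules
    )


# Assumed example policy: alice is root on "somehost" only.
RULES = [Rule(user="alice", host="somehost", command="ALL")]
```

With this policy, `is_allowed(RULES, "alice", "/bin/sh", "localbox")` is denied, while passing `host_flag="somehost"` flips the answer to allowed in the flawed mode and stays denied once `flag_applies_to_exec=False`.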

This LPE was not LLM-found (AFAICT), but it did make me wonder: what if instead of getting LLMs to drive various tools, we had them hunt for (stupid simple) mismatches between documentation and the actual code? It seemed like an easier lift for (local) LLMs in terms of context size, harnessing complexity, and intelligence required. These would not be the most technically exciting findings, but their practical effects would be just as serious: impact-wise, an LPE is an LPE!

By the end of 2025, I’d begun working on a harness to do just this. But, to paraphrase Mike Tyson, everyone has a plan until a new model drops. Almost as soon as my harness was done, the models got good enough to greatly simplify the scaffolding required even for context-heavy external tool use. At this point, my quest fissioned into three questions:

Can we find the “docs ↔ code mismatch”-type vulnerabilities – the original goal, inspired by the finding above?

Given the step change in capabilities, what about vulnerabilities in general?

More speculatively, can we get a “move 37” out of LLMs to either a) find entirely novel bug classes, or at least b) unlock something in smaller models to enhance their hunting capabilities?

Findings

The answers were roughly “yes,” “yes,” and “maybe.” Below are 30+ findings (20+ CVEs assigned as of 2026-04-29, some not yet published) discovered fully autonomously via the custom harness. I prioritized network-reachable services first, given the impending avalanche:

| Target | Issue | CVE # |
| --- | --- | --- |
| Linux kernel (ksmbd) | Compound READ + QUERY_INFO(Security) requests can trigger a (remote, unauthenticated) out-of-bounds write in ksmbd | CVE-2026-31432, fix |
| Linux kernel (ksmbd) | Compound QUERY_DIRECTORY + QUERY_INFO(FILE_ALL_INFORMATION) requests can trigger a (remote, unauthenticated) out-of-bounds write in ksmbd | CVE-2026-31433, fix |
| Docker | PUT /containers/{id}/archive executes container binary on the host -> container-to-host-root breakout | CVE-2026-41567 |
| Docker | Crafted Docker API requests can make AuthZ plugins see no request body, bypassing body-inspecting authorization policies | CVE-2026-34040 |
| Docker | Race condition in docker cp allows creation of arbitrary empty files on the host via symlink swap | CVE-2026-41568 |
| OpenSSL | CMS AuthEnvelopedData processing may accept forged messages | CVE-2026-34182 |
| MariaDB | wsrep SST unsafe parameter handling on the donor side -> RCE on the donor host | CVE-2026-44168 |
| CUPS | On network-exposed CUPS with a shared PostScript queue, unauthenticated Print-Job requests can reach arbitrary code execution over the network as lp | CVE-2026-34980 |
| CUPS | An unprivileged local attacker can coerce cupsd into leaking a reusable local admin token, escalating to a rootful file (over)write | CVE-2026-34990 |
| CUPS | RSS notify-recipient-uri path traversal lets a remote IPP client write RSS XML outside CacheDir/rss, including clobbering job.cache | CVE-2026-34978 |
| HAProxy | Single-packet infinite-loop DoS (QUIC) | CVE-2026-26080 |
| HAProxy | Single-packet DoS (QUIC) | CVE-2026-26081 |
| Caddy | Large host lists make MatchHost case-sensitive, enabling host-based routing/access-control bypass | CVE-2026-27588 |
| Caddy | %xx escaped-path matching skips case normalization, enabling path-based access-control bypass | CVE-2026-27587 |
| Traefik | TCP readTimeout bypass in the Postgres STARTTLS handling path, allowing an unauthenticated connection-stalling denial of service | CVE-2026-25949 |
| udisks | Missing authorization on LUKS header restore lets a local unprivileged user... | |
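Several of the findings listed above are instances of very old bug classes. As one illustration, the CUPS RSS path traversal comes down to a client-supplied name escaping the directory it was supposed to be confined to. The sketch below shows the generic containment check for that bug class in Python. It is not the CUPS code or its fix (CUPS is written in C), and `resolve_inside` plus the example directory are hypothetical names chosen for illustration.

```python
import os


def resolve_inside(base_dir: str, untrusted_name: str) -> str:
    """Join an untrusted name onto base_dir, refusing results outside it.

    Both paths are canonicalized first, so "..", absolute names, and
    symlinks that already exist on disk are resolved before the comparison.
    """
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, untrusted_name))
    # commonpath() returns the deepest directory shared by both paths;
    # if that is not the base itself, the candidate has escaped.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes {base!r}: {untrusted_name!r}")
    return candidate


# Assumed layout for illustration: an "rss" directory under a cache dir.
RSS_DIR = os.path.join(os.sep, "tmp", "example-cache", "rss")
```

Here `resolve_inside(RSS_DIR, "feed.rss")` returns a path under the RSS directory, while `resolve_inside(RSS_DIR, "../job.cache")` raises `ValueError` instead of pointing at a sibling of the directory. A check like this does nothing about a symlink being swapped in between the check and the write, which is a separate (race) bug class.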
