SafeRE: Building a production-quality regex library with agents

Eddie Aftandilian

June 20, 2026. Part of the SafeRE series.

This is the first in a series of blog posts about SafeRE, my linear-time regular expression library for Java.

A few months ago, I was having coffee with a friend, and we were talking about how good AI agents had gotten. Recent frontier models like Opus 4.6 and GPT-5.5 felt like a step change: not just better at small coding tasks, but much more capable of working through complex, long-running tasks. I started to wonder: could I build a substantial, production-quality software project purely with agents, with no human-written code at all? How would I ensure correctness if I wasn’t writing every line myself? Could agents make a project like this feasible to attempt in my spare time?

I decided to try an experiment.

When I worked on the Java team at Google, we considered building a linear-time regular expression library in pure Java. A bit of background: many popular regular expression libraries use backtracking engines, which can take exponential time on some patterns and inputs. Attackers can exploit that behavior by sending inputs that cause a service to burn huge amounts of CPU evaluating a regex – a class of attacks known as regular expression denial of service, or ReDoS. A linear-time regex library avoids that failure mode by ensuring that matching time grows linearly with the size of the input. While this might sound like a niche concern, it was a real problem at Google.
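To make the difference concrete, here is a toy sketch. It is not taken from SafeRE, RE2, or java.util.regex. It hard-codes the pattern (a|aa)+, chosen purely for illustration, and counts the steps taken by a naive backtracking matcher versus a single pass that tracks the set of reachable positions, which is the basic idea behind linear-time matching. The class and method names are made up for this example, and it assumes Java 11 or later for String.repeat.

```java
final class BacktrackingCost {
    // Step counter for the toy backtracking matcher below.
    private static long steps;

    /** Toy backtracking matcher for the hard-coded pattern (a|aa)+, matching the whole input. */
    private static boolean backtrack(String s, int pos) {
        steps++;
        if (pos == s.length()) {
            return pos > 0; // consumed everything with at least one iteration
        }
        if (s.charAt(pos) != 'a') {
            return false;
        }
        // Alternative 1: consume "a", then try to match the rest.
        if (backtrack(s, pos + 1)) {
            return true;
        }
        // Alternative 2: consume "aa", then try to match the rest.
        return pos + 1 < s.length() && s.charAt(pos + 1) == 'a' && backtrack(s, pos + 2);
    }

    /** Steps the backtracking matcher takes on n 'a' characters followed by '!' (which never matches). */
    static long backtrackingSteps(int n) {
        steps = 0;
        backtrack("a".repeat(n) + "!", 0);
        return steps;
    }

    /** Same pattern and input, but one pass that records which positions are reachable. */
    static long linearSteps(int n) {
        String s = "a".repeat(n) + "!";
        boolean[] reachable = new boolean[s.length() + 1];
        reachable[0] = true;
        long count = 0;
        for (int pos = 0; pos < s.length(); pos++) {
            count++;
            if (!reachable[pos] || s.charAt(pos) != 'a') {
                continue;
            }
            reachable[pos + 1] = true; // took the "a" alternative
            if (pos + 1 < s.length() && s.charAt(pos + 1) == 'a') {
                reachable[pos + 2] = true; // took the "aa" alternative
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 20, 30}) {
            System.out.println("n=" + n
                    + " backtracking=" + backtrackingSteps(n)
                    + " linear=" + linearSteps(n));
        }
    }
}
```

On this input the backtracking step count grows roughly like the Fibonacci numbers (232 steps for n = 10), while the single pass takes n + 1 steps. Real engines are far more sophisticated on both sides, so treat this as an illustration of the growth rates rather than a benchmark.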

Building a new regex library would have been a lot of work. We estimated it at roughly two engineer-years. We couldn’t justify the investment, so we never built it. But I could never fully let go of the idea. It’s the kind of project that’s the reason I got into this field: using computer science to solve a real-world problem.

Perhaps naively, I thought I could build this library in my spare time with agents doing the bulk of the work. So I decided to try it.

The outcome is SafeRE, which is open-source and available at github.com/eaftan/safere.

When I say SafeRE was built with agents, I don’t mean that I told an agent “go build a regex engine” and came back a week later to a finished project. I mean that agents wrote the code, while I directed the work: breaking down tasks, reviewing code, steering the agents when they went in the wrong direction, and shaping how I wanted them to approach the problem. My role was somewhere between tech lead and pair programmer.

Suitability for agents

I initially chose this project because it seemed well suited to agents. It turned out to be much harder than I expected; I was overly optimistic at the start.

Why did it seem well-suited?

While it’s technically difficult to build a linear-time regular expression library, the core ideas are well understood. There are existing libraries, RE2 in particular, that SafeRE could learn from. Russ Cox, the author of RE2, also wrote an excellent series of blog posts explaining the ideas behind it. So while the work is difficult, it is not research. We don’t have to invent new techniques to do this.

SafeRE owes a huge debt to RE2. The project started as a Java port of RE2, and I intentionally kept RE2’s license and license header to make that lineage clear. As the project evolved, SafeRE diverged from RE2 because the goal shifted from “RE2 in Java” to drop-in compatibility with java.util.regex, whose semantics are often different. But RE2 was the starting point, both technically and intellectually.

Regular expression engines are also unusually testable. They are deterministic and self-contained. You don’t have to wire together a distributed system to test them. There are also extensive open-source test suites that can be reused or adapted, where licenses permit and with appropriate attribution.
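Because matching is deterministic, one common way to exploit this testability is differential testing: run the same patterns and inputs through a reference engine and a candidate engine and report every disagreement. The sketch below is a minimal, hypothetical harness, not SafeRE's actual test setup. The class name, method name, and the stand-in candidate are assumptions for illustration; only java.util.regex.Pattern.matches is a real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.regex.Pattern;

final class DifferentialCheck {
    /** Returns one message per (pattern, input) pair on which the two engines disagree. */
    static List<String> disagreements(
            BiPredicate<String, String> reference,
            BiPredicate<String, String> candidate,
            List<String> patterns,
            List<String> inputs) {
        List<String> failures = new ArrayList<>();
        for (String pattern : patterns) {
            for (String input : inputs) {
                boolean expected = reference.test(pattern, input);
                boolean actual = candidate.test(pattern, input);
                if (expected != actual) {
                    failures.add("/" + pattern + "/ on \"" + input
                            + "\": expected " + expected + ", got " + actual);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("a+b", "[0-9]{2,}", "(foo|bar)?baz");
        List<String> inputs = List.of("", "ab", "aab", "42", "7", "baz", "foobaz");

        // Reference engine: full-string matching with java.util.regex.
        BiPredicate<String, String> reference = (p, s) -> Pattern.matches(p, s);
        // Hypothetical candidate with a deliberate bug: it rejects two-character inputs.
        BiPredicate<String, String> candidate = (p, s) -> s.length() != 2 && Pattern.matches(p, s);

        disagreements(reference, candidate, patterns, inputs).forEach(System.out::println);
    }
}
```

In a real project the candidate would be the engine under test, and the pattern and input lists would come from a much larger corpus, such as an adapted open-source test suite.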

Why was it hard?

This is the part where I was overconfident. Regular expressions are a type of programming language, and they are very widely used. The popular implementations are incredibly battle-tested. My stated goal was for SafeRE to be a drop-in replacement for java.util.regex. That meant SafeRE had to be in the same neighborhood as the Java standard library’s regex implementation for correctness.

java.util.regex has been around since Java 1.4 in 2002 and is very widely used. SafeRE was built from scratch. For it to be viable in production, I was going to have to polish it to an incredibly high standard. This turned out to be where I spent most of my time on the project.

A concrete example: SafeRE inherited support for POSIX bracket classes from RE2. In RE2, expressions like [[:lower:]] and [[:digit:]] have special meaning. Java’s regex library accepts those strings, but doesn’t treat them as POSIX bracket classes. In Java, POSIX-style character properties are written with escapes like \p{Lower}. So this was not a parser error or a missing feature. It was...
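The java.util.regex side of this difference can be observed directly. The short sketch below uses only the standard library; the class and helper names are made up for illustration. In Java, [[:lower:]] is parsed as an ordinary character class containing a nested class, so it matches the literal characters ':', 'l', 'o', 'w', 'e', and 'r' rather than any lowercase letter.

```java
import java.util.regex.Pattern;

final class PosixBracketDemo {
    /** True if the whole input matches the regex under java.util.regex. */
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Java reads [[:lower:]] as a union of literal characters, not a POSIX class.
        System.out.println(matchesWhole("[[:lower:]]", "a")); // false: 'a' is not in the set
        System.out.println(matchesWhole("[[:lower:]]", "l")); // true: 'l' is one of the literals
        System.out.println(matchesWhole("[[:lower:]]", ":")); // true: ':' is one of the literals

        // The Java spelling of the POSIX lowercase class.
        System.out.println(matchesWhole("\\p{Lower}", "a")); // true
    }
}
```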
