Lessons from building SQLite with LLMs

Lessons from building (SQLite) with LLMs | by Sharvanath Pathak | Jun, 2026 | Medium


Sharvanath Pathak


Having worked as a founding engineer at Glean and now as CTO at WisdomAI, I have seen the LLM transformation up close. In this post, I capture what I have learned over the last few years of managing an AI-obsessed engineering team. To make the point, I run a simple experiment and then draw out lessons on how to do efficient software development with LLM agents. Spoiler alert: the answer is more nuanced than “adopt agents everywhere.”

Preface

Unless you have been living under a rock, you know that the way software is built has changed significantly. At a high level, the process of building software has been: gather product requirements, define the design and architecture, and finally code and implement it. All of this can now be done by an LLM. However, that still doesn’t mean you can ask Claude to build the complete software. In this post we take a well-known system, SQLite, as the example and single-shot Codex to build it.

MiniSQL experiment

I try to recreate SQLite using Codex and name the recreation MiniSQL. Here’s the prompt I used for development:

“Mini SQL is a self-contained, serverless relational database engine that stores entire databases as single, cross-platform files. Build this fully featured, make your best judgement on everything, take your time and come back with a finished software.”

The session completed in ~15m, and what it generated (https://github.com/sharvanath/MiniSQL) is simply amazing. Most engineers would take much longer to build a first version of this quality. Note, however, that this is still a first version and would have to go through several iterations before it could be used as a production-grade database.

Feature Incompleteness

I created a very basic test script that passes on SQLite and tried to adapt it to test MiniSQL. Codex ran into the following issues:
“MiniSQL doesn’t have SQLite’s parameter binding, AUTOINCREMENT, COUNT(*), or real indexes yet, so I’m going to write the test in the same spirit but against the current MiniSQL surface: create, insert, select, update, delete, uniqueness, and an explicit assertion that CREATE INDEX is currently unsupported.”

Architectural gaps

MiniSQL’s core architectural gap is that it currently behaves like an in-memory SQL interpreter with whole-file JSON persistence, while SQLite is a layered database engine built around paged storage, B-tree tables and indexes, a pager-managed transaction system, crash recovery, a query planner, and a bytecode virtual machine. In short: MiniSQL can execute simple relational operations, but it does not yet have the storage, transaction, indexing, and execution architecture that makes SQLite durable, scalable, and concurrent. Again, you only notice these gaps when you dig a level deeper.

Correctness bugs

I asked Codex and Claude Code to find bugs in MiniSQL, and they found several interesting ones. Here are the top few:

Transactional semantics (found by both Codex and Claude Code): minisql/engine.py#L143 appends each inserted row as it validates. If a later row violates a constraint, the earlier rows stay in memory even though the statement failed. A later successful write can then persist that partial, failed insert. For instance, INSERT INTO users VALUES (1, 'a'), (2, 'a'); fails on the duplicate email, but row (1, 'a') remains visible and can later be committed.

Join bug (found only by Claude Code): SELECT * silently drops or overwrites columns on JOINs when both tables share a column name (e.g. id). The merged row context uses the same flat key for both sides, so the right-hand table's value clobbers the left-hand table's. This is data loss, not just a display quirk. Qualified selects (a.id, b.id) are unaffected, so the underlying join data is fine; only the * expansion is broken.

Conditional parsing bug (found only by Claude Code): Three-valued NULL logic isn’t implemented correctly.
Comparisons (= and similar operators) short-circuit to False whenever either side is NULL, rather than evaluating to UNKNOWN. This mostly avoids weirdness in plain WHERE/OR clauses, but it breaks NOT: for example, NOT (a = b) evaluates to TRUE when both a and b are NULL, whereas standard SQL would exclude that row (the result is UNKNOWN, not TRUE).

The essence of the iteration loop

This experiment is not the best example, since a suitably designed harness or agent can easily build a much better first version of SQLite. However, that is precisely because a well-written, serverless local SQL engine like SQLite already exists for the agent to learn from and iterate against. In practice, you don’t have these lessons captured in advance; you learn them through usage-based iteration. The lesson is simple: building software is iterative by nature. We build and launch something, get feedback (feature requirements, bugs, or performance gaps),...
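
Because SQLite exists as a reference to iterate against, the three correctness bugs described above can be pinned down as checks on its behavior. Below is a minimal sketch using Python’s built-in sqlite3 module. The function name, table names, and schemas are assumptions made for illustration (the post does not show the schema behind the users example), and this is not how the bugs were originally found.

```python
import sqlite3


def sqlite_reference_behavior():
    """Run the three bug scenarios against real SQLite and report what it does."""
    con = sqlite3.connect(":memory:")
    observed = {}

    # 1. Statement atomicity. Assumed schema: the post implies a unique
    #    email column but does not show the DDL.
    con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    try:
        con.execute("INSERT INTO users VALUES (1, 'a'), (2, 'a')")
    except sqlite3.IntegrityError:
        pass  # the second row duplicates the email, so the statement fails
    con.commit()  # a later commit must not persist the partial insert
    observed["rows_after_failed_insert"] = con.execute(
        "SELECT COUNT(*) FROM users"
    ).fetchone()[0]

    # 2. Three-valued logic: NOT (NULL = NULL) is UNKNOWN, so WHERE drops the row.
    con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    con.execute("INSERT INTO t VALUES (NULL, NULL)")
    observed["rows_matching_not_eq"] = con.execute(
        "SELECT COUNT(*) FROM t WHERE NOT (a = b)"
    ).fetchone()[0]

    # 3. SELECT * on a join where both tables share column names
    #    (hypothetical tables l and r, joined on k so the two ids differ).
    con.execute("CREATE TABLE l (id INTEGER, k TEXT)")
    con.execute("CREATE TABLE r (id INTEGER, k TEXT)")
    con.execute("INSERT INTO l VALUES (1, 'x')")
    con.execute("INSERT INTO r VALUES (2, 'x')")
    cur = con.execute("SELECT * FROM l JOIN r ON l.k = r.k")
    observed["join_columns"] = [col[0] for col in cur.description]
    observed["join_row"] = cur.fetchone()

    con.close()
    return observed


if __name__ == "__main__":
    for name, value in sqlite_reference_behavior().items():
        print(f"{name}: {value}")
```

SQLite leaves no rows behind after the failed insert, matches no rows for NOT (a = b), and returns both id values side by side in the join. Running the same statements through MiniSQL and comparing the results against these would turn each bug report into a regression check for the next iteration.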
