Bootstrapping a SQL catalog on a flat key-value store

Keys and Values Are All You Need | nevzheng

14 May, 2026

This post is a walkthrough: how to bootstrap a SQL catalog on top of a key-value store, following every write end to end, from a name, to a UUID, to a row in a system table, to a durable user record.

It’s based on what I worked through while building Strata, my personal LSM project, but the recipe is general. A handful of conventions on top of a four-operation KV API gets you surprisingly far. Keys and values really are all you need.

A note on the code: snippets throughout are rusty pseudocode for ease of explanation — not necessarily lifted directly from Strata.

Table of contents

Everything is a Table

What’s a table anyway?

The Catalog API

The Challenge

Composite Keys

System Tables & Reserved UUIDs

Creating a Table: End to End

Step 1: Mint new IDs

Step 2: Register in _uuids

Step 3: Write the metadata

Step 4: Write rows

Wiring It Together

Implications

What’s Next

Further Reading

Everything is a Table

In databases, anything you can model as a table tends to be easier to work with. Indexes, metadata, system configuration. SQL already operates on tables, so why not store everything that way too? But how do you represent a table in a system that only speaks key-value? StrataDB’s storage engine exposes just four operations:

pub trait StorageEngine {
    /// Insert or update a key-value pair.
    fn put(&mut self, key: &[u8], value: &[u8]);

    /// Retrieve the value for a key, or None if it doesn't exist.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;

    /// Delete a key by writing a tombstone.
    fn delete(&mut self, key: &[u8]);

    /// Return all key-value pairs whose keys fall within the given range,
    /// sorted by key.
    fn scan(&self, range: impl RangeBounds<Vec<u8>>) -> Vec<KVPair>;
}

What’s a table anyway?

Before we can store anything, we need to define what we’re storing. In Strata, the catalog is organized around three concepts borrowed from BigQuery:

Project — the top-level namespace, like an organization or account

Dataset — a logical grouping of tables within a project

Table — a named collection of rows with a schema

Each level has a corresponding metadata type:

pub struct ProjectId(Uuid);
pub struct DatasetId(Uuid);
pub struct TableId(Uuid);

pub struct ProjectMeta {
    pub id: ProjectId,
    pub name: String,
}

pub struct DatasetMeta {
    pub id: DatasetId,
    pub name: String,
}

pub struct TableMeta {
    pub id: TableId,
    pub name: String,
    pub schema: Schema,
}

Names are human-readable. IDs are stable and opaque: the rest of the system never references a resource by name internally, only by ID. Each ID is a newtype wrapping a UUID. The compiler now enforces that you can’t accidentally pass a TableId where a ProjectId is expected, a mistake that would be invisible with raw Uuid values and silent at runtime.

A table additionally carries a Schema: a list of typed fields that describes the shape of its rows.
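To make the newtype point concrete, here is a minimal sketch. The `Uuid` struct below is a stand-in wrapping 16 raw bytes so the snippet compiles on its own (Strata presumably uses a real UUID type), and `project_prefix` is a hypothetical helper invented for illustration.

```rust
/// Stand-in for a real UUID type: just 16 raw bytes (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Uuid([u8; 16]);

impl Uuid {
    pub fn as_bytes(&self) -> &[u8; 16] {
        &self.0
    }
}

// Two distinct newtypes over the same underlying representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ProjectId(pub Uuid);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TableId(pub Uuid);

/// Hypothetical helper that only makes sense for a project.
pub fn project_prefix(id: ProjectId) -> Vec<u8> {
    id.0.as_bytes().to_vec()
}

fn main() {
    let raw = Uuid([7u8; 16]);
    let project = ProjectId(raw);
    let table = TableId(raw);

    // Accepted: the argument is a ProjectId.
    let prefix = project_prefix(project);
    assert_eq!(prefix.len(), 16);

    // Rejected at compile time if uncommented (mismatched types):
    // project_prefix(table);
    let _ = table;
}
```

Both IDs hold identical bytes, yet only one of them type-checks as an argument. With bare `Uuid` parameters the mix-up would compile and fail silently at runtime.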

The Catalog API

The public API exposes these as a fluent, scoped interface:

pub trait CatalogApi {
    fn project(&self, name: &str) -> ProjectScope;
}

pub trait ProjectScope {
    fn create_dataset(&self, name: &str) -> DatasetScope;
    fn dataset(&self, name: &str) -> DatasetScope;
    fn drop_dataset(&self, name: &str);
    fn list_datasets(&self) -> Vec<String>;
}

pub trait DatasetScope {
    fn create_table(&self, name: &str, schema: Schema) -> TableScope;
    fn table(&self, name: &str) -> TableScope;
    fn drop_table(&self, name: &str);
    fn list_tables(&self) -> Vec<String>;
}

pub trait TableScope {
    fn put(&self, key: &[u8], value: Value);
    fn get(&self, key: &[u8]) -> Option<Value>;
    fn delete(&self, key: &[u8]);
}

// Create a table
let table = db
    .project("acme")
    .create_dataset("metrics")
    .create_table("events", schema);

// Write and read a row
table.put(b"event:001", json!({ "type": "click", "ts": 1234567890 }));
table.get(b"event:001"); // Some({"type": "click", "ts": 1234567890})

// Navigate to an existing table
let table = db
    .project("acme")
    .dataset("metrics")
    .table("events");

// List tables in a dataset
db.project("acme")
    .dataset("metrics")
    .list_tables(); // ["events"]

The Challenge

We have a KV store that speaks bytes. We have a catalog that speaks projects, datasets, and tables. How do we bridge the gap?

Composite Keys

Encode the full namespace into the key itself, and the flat KV store gets structure for free.

{project_id}|{dataset_id}|{table_id}|{user_key}|{version} → value

Every row in the database gets a key that encodes exactly where it lives. Two rows in different tables can never collide because their keys differ at the table_id segment. And because the KV store sorts keys lexicographically, all rows in the same table are physically adjacent, so listing them is just a prefix scan.

Appending a version gives us a basis for MVCC for free. Multiple versions of the same row sort together naturally, laying the groundwork for point-in-time reads and snapshot isolation down the line.
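The layout can be sketched with a `BTreeMap` standing in for the sorted KV store. Everything here is an assumption made for illustration rather than Strata's actual encoding: IDs are 16 fixed-width bytes concatenated without separators, the user key is terminated with a `0x00` byte (so user keys must not contain `0x00`), and the version is a big-endian `u64` so that byte order matches numeric order. The function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Stand-in for the 16 raw bytes of a UUID (assumption: fixed-width IDs).
type Id = [u8; 16];

/// Prefix shared by every row in one table: project, dataset, table IDs.
/// Fixed-width segments stay unambiguous without a separator.
fn table_prefix(project: &Id, dataset: &Id, table: &Id) -> Vec<u8> {
    let mut prefix = Vec::with_capacity(48);
    prefix.extend_from_slice(project);
    prefix.extend_from_slice(dataset);
    prefix.extend_from_slice(table);
    prefix
}

/// Full row key: table prefix, user key, 0x00 terminator, big-endian version.
fn row_key(prefix: &[u8], user_key: &[u8], version: u64) -> Vec<u8> {
    let mut key = prefix.to_vec();
    key.extend_from_slice(user_key);
    key.push(0x00);
    key.extend_from_slice(&version.to_be_bytes());
    key
}

/// Prefix scan over a sorted map: seek to the prefix, then stop at the
/// first key that no longer starts with it.
fn scan_prefix(store: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
    store
        .range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

fn main() {
    let (project, dataset) = ([1u8; 16], [2u8; 16]);
    let events = table_prefix(&project, &dataset, &[3u8; 16]);
    let users = table_prefix(&project, &dataset, &[4u8; 16]);

    let mut store = BTreeMap::new();
    store.insert(row_key(&users, b"user:001", 1), b"alice".to_vec());
    store.insert(row_key(&events, b"event:001", 2), b"click v2".to_vec());
    store.insert(row_key(&events, b"event:001", 1), b"click v1".to_vec());
    store.insert(row_key(&events, b"event:002", 1), b"scroll".to_vec());

    // Only the `events` rows come back; both versions of event:001 are
    // adjacent and in ascending version order.
    let rows = scan_prefix(&store, &events);
    assert_eq!(rows.len(), 3);
    assert_eq!(rows[0].1, b"click v1".to_vec());
    assert_eq!(rows[1].1, b"click v2".to_vec());
    assert_eq!(rows[2].1, b"scroll".to_vec());
}
```

A little-endian version would break the ordering guarantee, since version 256 would then sort before version 1. Whether newer versions should sort first or last is a separate design choice that this sketch does not try to settle.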

That’s the...
