Passing DBs Through Continuations

cps

Dedicated to the Minnowbrook Analytic Reasoning Seminar with special thanks to Kris Micinski and Michael Ballantyne

Suppose you want to write a database. You'd probably start by implementing relational algebra operators: projection, filter, join, etc. The easy way is to implement them as functions that take in tables and return tables, and assemble them into a larger expression. That was how Prela worked in its first incarnation. The code was clean, but it was hella slow! Which was not surprising, because every operator materialized every intermediate result.

The standard solution to this is the iterator model, where each operator implements an Iterator interface that streams intermediate tables row by row instead of materializing them. But implementing the iterator model naively still incurs overhead: every call to Iterator.next() triggers a dynamic dispatch, which costs vtable lookups and destroys cache locality. There are two standard remedies: vectorization and compilation. A vectorized database amortizes the overhead by implementing Iterator.next_batch(), which returns a whole batch of data that can be processed together; a compiled database, well, compiles the incoming query directly to fast machine code that runs without any dynamic dispatch.

Either approach takes a lot of very smart people spending their entire working life to build, and it's why systems like DuckDB and Umbra exist. I'm moderately smart but don't have a lot of time, so I was looking for a shortcut. The shortcut I stumbled upon was so beautiful that I literally cried[1] when I finally understood it, and I hope my explanation below will make you cry too :' )

To keep things simple, let's suppose we're just dealing with lists of numbers, and we want to do two very simple things to them: inc adds 1 to every number, and dbl doubles every number. That's pretty easy to write:[2]

inc(xs) = [x + 1 for x in xs]

dbl(xs) = [2 * x for x in xs]

Now, we can chain them together with dbl(inc(xs)), which will do the two steps in sequence. Problem is, because each function takes in a list and returns a list, our program produces an intermediate list, namely inc(xs). This allocates a new list only to be thrown away by the call to dbl. Things only get worse when we chain together multiple calls to inc and dbl. A more efficient implementation would fuse together the operations:

inc_n_dbl(xs) = [2 * (x + 1) for x in xs]
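To make the cost of the intermediate list concrete, here is a small sketch that compares the chained and fused versions. The input `xs` is made up for illustration. `@allocated` reports the bytes allocated by an expression; the exact numbers depend on the Julia version and on how the code is run, so none are quoted here.

```julia
inc(xs) = [x + 1 for x in xs]
dbl(xs) = [2 * x for x in xs]
inc_n_dbl(xs) = [2 * (x + 1) for x in xs]

xs = collect(1:1_000)            # hypothetical input

# Both versions compute the same result...
@assert dbl(inc(xs)) == inc_n_dbl(xs)

# ...but the chained one builds inc(xs) first and then discards it.
dbl(inc(xs)); inc_n_dbl(xs)      # warm up so compilation is not measured
chained = @allocated dbl(inc(xs))
fused   = @allocated inc_n_dbl(xs)
println("chained: ", chained, " bytes, fused: ", fused, " bytes")
```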

Of course, we can't write down every possible combination of operators like this. Is there a way to define each operator modularly, yet still have them compose into tightly fused operations automatically? Yes, if we use a bit of magic from functional compilers — continuation-passing style (CPS).

The key idea of CPS is to define operators that do things instead of making things. inc and dbl as defined above each take in a list and make a list. Instead, the CPS version of each operator takes in a list and an additional input k: this k is a function that the caller passes in, specifying what it wants to do with each element after the operator's work is done. k is called the continuation. Let's look at some code:

function inc(xs, k)
    for x in xs
        k(x + 1)
    end
end

Now suppose k is the print function, then inc as defined above will add 1 to each number, then print the result. Note that nothing is returned, and inc only does its job (adding 1) then performs what it's told to (apply k). As an exercise, you can try and write down dbl in CPS style.
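As a quick illustration of "doing instead of making", the sketch below calls the CPS inc with a few different continuations. It uses `println` rather than `print` so each result lands on its own line; the names `out` and `total` are made up for this example.

```julia
function inc(xs, k)
    for x in xs
        k(x + 1)
    end
end

inc([1, 2, 3], println)          # prints 2, 3 and 4 on separate lines

# Any function works as k. Here the continuation appends to `out`,
# so it is the caller who decides whether a list gets built at all.
out = Int[]
inc([1, 2, 3], y -> push!(out, y))
@assert out == [2, 3, 4]

# Or fold the results into a running total without building any list.
total = Ref(0)
inc([1, 2, 3], y -> (total[] += y))
@assert total[] == 9
```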

But currently each of inc and dbl still takes in a list, and there's no obvious way to compose multiple operators. To do that, we replace xs with a "child" operator op:

inc(op, k) = op(x -> k(x + 1))

dbl(op, k) = op(x -> k(x * 2))

function scan(xs, k)
    for x in xs
        k(x)
    end
end

Intuitively, inc now trusts its child op to do its job, namely, that op will apply the continuation it receives to each item. So instead of iterating over xs, inc simply tags the + 1 step onto the continuation and passes it to op. I've also defined a "source" operator scan that connects the input list to the operators. Let's see the code in action.

Start by calling inc(scan(xs), print).[3]

According to the definition of inc, this will call scan(xs, x -> print(x + 1))

Plugging in the definition of scan, this gets us for x in xs; print(x + 1); end
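The three steps above can be checked by running them. One assumption is needed to make the call runnable as written: scan is only defined with two arguments, so this sketch adds a one-argument method where `scan(xs)` returns an operator that is still waiting for its continuation. That method is a guess at the intended shorthand, not something the definitions above spell out.

```julia
inc(op, k) = op(x -> k(x + 1))

function scan(xs, k)
    for x in xs
        k(x)
    end
end

# Assumption: one-argument scan returns an operator still waiting for k.
scan(xs) = k -> scan(xs, k)

# Use a collecting continuation instead of print so the result can be checked.
out = Int[]
inc(scan([1, 2, 3]), x -> push!(out, x))
@assert out == [2, 3, 4]

# The hand-expanded loop from the last step gives the same result.
expected = Int[]
for x in [1, 2, 3]
    push!(expected, x + 1)
end
@assert out == expected
```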

So chaining together inc and scan indeed does what we want! Now let's try a longer chain dbl(inc(scan(xs)), print):

Expanding dbl gets us inc(scan(xs), x -> print(x * 2))

Expanding inc gets us scan(xs, x -> print((x + 1) * 2))

Finally, expanding scan gets us for x in xs; print((x + 1) * 2); end
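Here is a runnable sketch of the longer chain. As before, the one-argument methods (`scan(xs)`, `inc(op)`, `dbl(op)`) are assumptions added so that operators can be nested without passing k at every level, and `collect_all` is a hypothetical helper that gathers whatever reaches the end of the pipeline.

```julia
function scan(xs, k)
    for x in xs
        k(x)
    end
end
inc(op, k) = op(x -> k(x + 1))
dbl(op, k) = op(x -> k(x * 2))

# Assumption: one-argument forms that wait for their continuation.
scan(xs) = k -> scan(xs, k)
inc(op)  = k -> inc(op, k)
dbl(op)  = k -> dbl(op, k)

# Hypothetical helper: run a pipeline and gather what reaches the end.
function collect_all(op)
    out = Int[]
    op(x -> push!(out, x))
    return out
end

xs = [1, 2, 3]

# dbl(inc(scan(xs))) computes (x + 1) * 2 for each x.
@assert collect_all(dbl(inc(scan(xs)))) == [4, 6, 8]

# A four-operator chain computes ((x + 1) * 2 + 1) * 2 for each x.
@assert collect_all(dbl(inc(dbl(inc(scan(xs)))))) == [10, 14, 18]
```

The innermost operator runs first on each element, which matches the order in which the continuations were stacked up during the expansion.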

Notice how I used the word expand — if we annotate every operator definition with @inline, the compiler will actually unfold the code as we did above, and an operator chain gets compiled down to a fused loop in the end! You can try expanding longer chains like dbl(inc(dbl(inc(scan(xs)))), print) to get some practice thinking about CPS. Julia also has handy tools...
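A minimal sketch of what the `@inline` annotations might look like, under the same assumed one-argument shorthands as before. `sum_pipeline` and `sum_fused` are hypothetical names used only to compare the operator chain with a hand-written loop. The sketch checks that both give the same answer; whether the compiler really reduces the chain to a single loop depends on the Julia version and is best confirmed by inspecting the generated code.

```julia
@inline function scan(xs, k)
    for x in xs
        k(x)
    end
end
@inline inc(op, k) = op(x -> k(x + 1))
@inline dbl(op, k) = op(x -> k(x * 2))
@inline scan(xs) = k -> scan(xs, k)   # assumed one-argument shorthand
@inline inc(op)  = k -> inc(op, k)    # assumed one-argument shorthand

# Hypothetical driver: sums the pipeline's output through a Ref.
function sum_pipeline(xs)
    acc = Ref(0)
    dbl(inc(scan(xs)), x -> (acc[] += x))
    return acc[]
end

# Hand-fused version for comparison.
function sum_fused(xs)
    acc = 0
    for x in xs
        acc += (x + 1) * 2
    end
    return acc
end

@assert sum_pipeline([1, 2, 3]) == sum_fused([1, 2, 3]) == 18

# In the REPL, the compiled form can be inspected with, for example:
#   @code_typed sum_pipeline([1, 2, 3])
```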
