Streaming-Joins

prakashqwerty1 pts0 comments

Polars — Streaming sort-merge joins in Polars<br>We're hiring

Docs User guide Rust Python

Resources Our services Academy Blog About us

We're hiring

--> Back to blog<br>Streaming sort-merge joins in Polars<br>By Thijs Nieuwdorp on Thu, 9 Apr 2026

Joins are often one of the most expensive parts of a query.<br>Once tables get large, the join can heavily impact both runtime and memory usage, especially when the engine has to build a large hash table as part of execution.

If the join keys are already sorted, Polars can now take a cheaper path: a streaming sort-merge join.<br>This post explains how that algorithm works, how to make use of it, how to verify that Polars chose the merge-join using the physical plan, and when this path is worth using.<br>At the end, we show you can reach up to 18x performance improvements .

Why sort-merge joins matter

A sort-merge join becomes attractive when the expensive part, sorting, has already been done or is unnecessary because the data already arrives ordered.<br>You can think of taxi ride logs, ticker data, or sensor readings.

In that situation, it has three useful properties:

it does not require any intermediate data structures

it has low additional memory pressure

it has sequential access patterns that fit streaming execution well

However, if your data is not sorted, a hash join is often still the better default.<br>You can view it as “sorted data unlocks a cheaper join strategy”.

The basic sort-merge algorithm

At the core of the algorithm are two sorted sequences and one linear scan.

The naive algorithm

The naive approach requires join keys that are both sorted and unique.<br>This would be the pseudocode:

result = []<br>while not left.done and not right.done:<br>if left right:<br>left.next()<br>elif left > right:<br>right.next()<br>else:<br>result.append(left + right)<br>left.next()<br>right.next()<br>return result<br>Let’s take the following sequences and see how this plays out:<br>Left = [1, 3, 4] and Right = [2, 3, 4].

Step 1:

Left : [1] 3 4<br>Right: [2] 3 4<br>Output: []<br>1 , so Left advances.

Step 2:

Left : 1 [3] 4<br>Right: [2] 3 4<br>Output: []<br>3 > 2, so Right advances.

Step 3:

Left : 1 [3] 4<br>Right: 2 [3] 4<br>Output: [(3, 3)]<br>Now the keys match, so we emit one joined row and advance both sequences.

Step 4:

Left : 1 3 [4]<br>Right: 2 3 [4]<br>Output: [(3, 3), (4, 4)]<br>Again, the keys match, so we emit and advance.

Step 5:

Left : 1 3 4 [done]<br>Right: 2 3 4 [done]<br>Output: [(3, 3), (4, 4)]<br>At least one sequence is done, so we’re finished.

That is the whole idea: each side moves forward once, and we never need to jump backwards.<br>Two pointers advance in lockstep.

The problem: duplicate keys

However, real data has repeated keys.<br>What happens when the sequences we want to join look like Left: [2, 2] and Right: [2, 2]?

With Left = [2, 2] and Right = [2, 2], the naive scan looks fine at first:

Frame 1:

Left : [2] 2<br>Right: [2] 2<br>Output: [(L[0], R[0])]<br>Frame 2:

Left : 2 [2]<br>Right: 2 [2]<br>Output: [(L[0], R[0]), (L[1], R[1])]<br>Then the sequences run out:

Frame 3:

Left : 2 2 [done]<br>Right: 2 2 [done]<br>Output: [(L[0], R[0]), (L[1], R[1])]<br>We still owe (L[0], R[1]) and (L[1], R[0]), but the scan has no way to return to the start of the matching run and emit the full Cartesian product of matches.<br>We need to be able to rewind.

The fix: mark

mark saves where matching keys started in Right.<br>When Left advances to another row with the same key, we rewind, advance only Left, and match again, instead of advancing both Left and Right.

This is the pseudocode for an inner join:

result = []

while True:<br>if left.done:<br>break

if right.done and not mark:<br>break

if not mark:<br>while left right:<br>left.next()<br>if left.done: break<br>while left > right:<br>right.next()<br>if right.done: break<br>mark = right

if not right.done and left == right:<br>result.append(left + right)<br>right.next()<br>else:<br>right = mark<br>left.next()<br>mark = None

return result<br>Step 1: save mark when the first match is found.

Left : [2] 2<br>Right: [2] 2<br>Mark : ^<br>Output: []<br>Steps 2 & 3: consume the whole matching run for L0.

Left : [2] 2<br>Right: 2 [2]<br>Mark : ^<br>Output: [(L[0], R[0]), (L[0], R[1])]<br>Step 4: rewind Right to mark, then advance Left.

Left : 2 [2]<br>Right: [2] 2<br>Mark : ^<br>Output: [(L[0], R[0]), (L[0], R[1])]<br>Steps 5 & 6: consume the same right-hand run again, now for L1.

Left : 2 [2]<br>Right: 2 [2]<br>Mark : ^<br>Output: [(L[0], R[0]), (L[0], R[1]), (L[1], R[0]), (L[1], R[1])]<br>Step 7: We rewind Right, but then run out of elements on the Left side when we advance it.<br>We finish, emitting (L[0], R[0]), (L[0], R[1]), (L[1], R[0]), (L[1], R[1]).

Left : 2 2 [done]<br>Right: [2] 2<br>Mark : ^<br>Output: [(L[0], R[0]), (L[0], R[1]), (L[1], R[0]), (L[1], R[1])]<br>That is the purpose of mark: turn one contiguous run of equal keys on the right into a reusable range for every matching row on the left.

The important detail is that we do not need to remember all previous rows.<br>We only need to remember where the current run of equal keys started.<br>As long as the left key stays the same, that one bookmark is...

right left mark done output join

Related Articles