How ClickHouse became fast at joins<br>Open searchOpen region selectorEnglish<br>Japanese
47.8kSign inGet started
->Scroll to top
BackBlog<br>Engineering<br>Copy pageCopied!More actionsView as Markdown Open this page in Markdown<br>Open in ChatGPT Ask questions about this page<br>Open in Claude Ask questions about this page<br>Open in v0 Ask questions about this page
How ClickHouse became fast at joins
Tom Schreiber<br>Jun 3, 2026 · 20 minutes read
TL;DR
Over two years, ClickHouse became 26× faster on join-heavy analytical workloads. This post explains the engineering that made joins a first-class strength.
Two years of focused join engineering #
ClickHouse is known for fast analytical queries, high compression, and real-time performance at scale.
Over the last two years, one major engineering focus has been bringing that same performance profile to join-heavy SQL queries.
At the ClickHouse 24.5 release webinar, Alexey Milovidov, inventor of ClickHouse, described the direction clearly:
“From now on, you will see JOIN improvements in every ClickHouse release.”
The chart below shows what that looked like in practice.
The first year laid the foundation: faster parallel hash join, smarter planning, aggressive filter pushdown, and local join reordering.
By 25.4, the same TPC-H SF100 join-heavy workload was already 4.4× faster than in 22.4.
The second year pushed much further. Between 25.4 and 26.4, a new wave of optimizer and execution improvements made the same workload another 6× faster with default settings.
End to end, ClickHouse is now 26× faster on TPC-H SF100 than it was in 22.4.
This post explains how we got there. The companion post shows what it unlocked: ClickHouse Cloud now runs TPC-H for less than a cent , and competes head-to-head with Snowflake, Databricks, BigQuery, and Redshift on SF100.
Year one: building the foundation #
A year ago, at our first Open House user conference in San Francisco, ClickHouse join engineering lead, Robert Schulze, presented the first year of major join-performance work.
That first year was about building the foundation. ClickHouse made parallel hash join the default in 24.12, added local automatic join reordering for two-table joins, and delivered a steady stream of low-level execution improvements.
Several of those changes landed across consecutive releases:
24.7: improved hash table allocation for faster parallel hash joins
24.12: made parallel hash join the default strategy and introduced local automatic join reordering
25.1: sped up the hash join probe phase
25.2: removed thread contention in the hash join build phase
Together, those improvements made the same TPC-H SF100 join-heavy workload 4.4× faster by 25.4 compared with 22.4.
But year one was only the foundation. The second year is where joins became dramatically better with default settings.
Year two: making joins competitive by default #
This year at Open House, Robert is back with the second chapter of the join-performance story.
Year one built the foundation. Year two made joins competitive with default settings.
The goal was simple: users should not have to rewrite queries, tune join orders by hand, or know which internal optimization to enable. ClickHouse should recognize more join-heavy SQL, automatically choose better plans, and avoid unnecessary work during execution.
The chart below shows the four main improvements that became effective by default between 25.4 and 26.4. Together, they made the same TPC-H SF100 join-heavy workload another 6× faster.
The rest of this post walks through those four improvements:
① Correlated subqueries in JOINs — support more real-world SQL directly.
② Lazy column replication — avoid copying repeated values produced by joins.
③ Runtime filters — skip probe-side rows before expensive hash-table lookups.
④ Statistics-based join reordering — choose better join plans automatically.
All examples use TPC-H SF100 on the same hardware, an AWS EC2 m6i.8xlarge instance (32 vCPUs, 128 GiB RAM), so the improvements are easy to compare.
① Correlated subqueries in JOINs #
Two years ago, the problem was not just that some joins were slower than we wanted them to be. Some important join-heavy queries could not run at all.
Why this mattered #
In the TPC-H benchmark section of our first research paper, presented at VLDB 2024, we had to exclude seven queries: Q2, Q4, Q13, Q17, and Q20–Q22. Because they used correlated subqueries, which ClickHouse did not fully support at the time.
A correlated TPC-H query #
TPC-H Q4 is a good example. It contains an inner query over lineitem that references orders, the table from the outer query:
This is what makes it a correlated subquery : the inner query cannot be understood in isolation, because it depends on values from the outer query.
From row-by-row execution to set-oriented plans #
Correlated subqueries are common because they are natural to write. They are also increasingly common in generated SQL, including queries...