CostBench: an open benchmark for data warehouse cost-performance

Introducing CostBench: an open benchmark for data warehouse cost-performance

Tom Schreiber and Lionel Palacin · May 27, 2026 · 5-minute read

TL;DR

CostBench is an open benchmark for cloud data warehouse cost-performance: performance-per-dollar, not just speed.

It helps teams choose the system that delivers the most performance per dollar for real-time analytical workloads.

Performance alone is only half the story

Most benchmarks tell you how fast a query runs. That is useful, but incomplete.

In cloud data platforms, speed and cost are inseparable.

If warehouse A is faster than warehouse B, A looks better on a performance chart. But if A costs three times more to run, the comparison changes. You might spend the same budget on a larger configuration of B, get more compute, and make B faster than A for less money overall.
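The arithmetic behind that scenario can be sketched in a few lines. Everything here is an illustrative assumption, not a CostBench result: the runtimes and hourly prices are made up, `Config` and `scale_up` are hypothetical names, and the sketch assumes perfect linear scaling (three times the compute at three times the price finishes three times sooner), which real systems rarely achieve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    runtime_s: float     # wall-clock time to finish the workload
    usd_per_hour: float  # compute price of this configuration

    @property
    def workload_cost_usd(self) -> float:
        # Cost of one workload run: hours of compute times the hourly price.
        return self.runtime_s / 3600 * self.usd_per_hour


def scale_up(cfg: Config, factor: float) -> Config:
    # ASSUMPTION: factor-times the compute costs factor-times as much per hour
    # and finishes factor-times sooner (perfect linear scaling).
    return Config(
        f"{cfg.name} x{factor:g}",
        cfg.runtime_s / factor,
        cfg.usd_per_hour * factor,
    )


# Hypothetical numbers: A is faster than B but costs three times more per hour.
a = Config("A", runtime_s=60.0, usd_per_hour=12.0)
b = Config("B", runtime_s=90.0, usd_per_hour=4.0)
b_big = scale_up(b, 3)  # spend A's hourly budget on a larger B

for cfg in (a, b, b_big):
    print(f"{cfg.name}: {cfg.runtime_s:.0f}s, ${cfg.workload_cost_usd:.2f} per run")
```

Under these assumptions the larger B matches A's hourly price, finishes the workload sooner, and costs less per run, which is why runtime alone cannot rank the two.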

That comparison is hard because every platform exposes cost differently: credits, DBUs, slot-seconds, compute units, RPUs.

The unit names differ, but the underlying question is the same:

How much compute did the system need to finish the workload, and what did that compute cost?

CostBench answers that question directly. It also exposes where cost-performance breaks: during ingest, while making data query-ready, or when serving reads.
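One way to picture the normalization is to convert each vendor's billing unit into dollars for a single workload run. The sketch below is a simplification under stated assumptions, not CostBench's published method: it treats every unit as time-metered compute with a fixed burn rate, and the system names, burn rates, and prices are placeholders rather than real vendor figures. `MeteredRun` and `run_cost_usd` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredRun:
    system: str
    unit: str              # vendor billing unit, e.g. "credit" or "DBU"
    units_per_hour: float  # burn rate of the chosen configuration
    usd_per_unit: float    # price of one billing unit
    runtime_s: float       # time the workload kept the compute busy


def run_cost_usd(run: MeteredRun) -> float:
    # Units consumed while the workload ran, then priced in dollars.
    units_consumed = run.units_per_hour * run.runtime_s / 3600
    return units_consumed * run.usd_per_unit


# Placeholder values only: not real vendor prices or measurements.
runs = [
    MeteredRun("system-x", "credit", units_per_hour=8.0, usd_per_unit=3.0, runtime_s=120.0),
    MeteredRun("system-y", "DBU", units_per_hour=20.0, usd_per_unit=0.5, runtime_s=180.0),
]
costs = {run.system: run_cost_usd(run) for run in runs}

for system, cost in costs.items():
    print(f"{system}: ${cost:.2f} for the workload")
```

Billing models that meter usage differently (for example per slot-second consumed rather than per provisioned hour) would need their own conversion, but the output is the same: dollars per workload.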

Why this matters in the AI era

Agentic analytics raises the pressure on every layer of the database.

New data never stops: events, transactions, logs, traces, user activity, fraud signals, operational state. At the same time, users and agents expect fast answers over fresh data.

If the database is slow, the agent is slow. If the database is expensive, teams start rationing what agents can do: fewer retries, smaller datasets, less context, stale data.

In the AI era, a system has to stay fast and low-cost across the full analytics path: continuous ingest, query-ready preparation, and reads.

Read-side pressure comes from query volume. A single user question can trigger many SQL queries: schema exploration, validation, retries, refinements, drilldowns, and follow-ups. Each extra query burns compute. At agentic scale, query volume turns directly into cost pressure.
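A back-of-the-envelope model shows how that fan-out multiplies spend. The numbers below (questions per day, queries per question, cost per query) are hypothetical placeholders, and `daily_read_cost_usd` is an illustrative helper, not part of CostBench.

```python
def daily_read_cost_usd(
    questions_per_day: int,
    queries_per_question: int,
    usd_per_query: float,
) -> float:
    # Every question fans out into several SQL queries, each burning compute.
    return questions_per_day * queries_per_question * usd_per_query


# Hypothetical workload: the same 10,000 daily questions, served two ways.
dashboard = daily_read_cost_usd(10_000, queries_per_question=1, usd_per_query=0.002)
agent = daily_read_cost_usd(10_000, queries_per_question=15, usd_per_query=0.002)

print(f"dashboard: ${dashboard:.2f}/day, agent: ${agent:.2f}/day")
```

With the per-query cost held constant, the daily bill grows in direct proportion to the number of queries each question triggers.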

Write-side pressure comes from real-time freshness: fresh data has to be continuously ingested, compressed, and organized so queries can skip more data. That work burns compute before the first query even runs, and determines how much compute those queries burn later.

What CostBench measures

CostBench turns that pressure into a full-path cost-performance lens with two measurable dimensions:

Read-side cost-performance: how much query performance you get per dollar.

Write-side cost-performance: how efficiently each dollar turns fresh ingest into query-ready data.

Together, they help answer the question that matters when choosing a platform:

Which system gives you the most performance per dollar for real-time analytical workloads?
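As a rough illustration, both dimensions can be expressed as work done divided by compute cost. The post does not give exact formulas, so the definitions below are assumptions: read-side work is counted as completed queries, write-side work as rows made query-ready, and every dollar figure is a placeholder. `per_dollar` is a hypothetical helper.

```python
def per_dollar(work_done: float, compute_cost_usd: float) -> float:
    # Generic "performance per dollar": units of useful work per USD of compute.
    if compute_cost_usd <= 0:
        raise ValueError("compute cost must be positive")
    return work_done / compute_cost_usd


# Read side (assumed definition): queries completed per dollar.
# 43 is the query count of the first release; the $0.86 cost is a placeholder.
read_side = per_dollar(work_done=43, compute_cost_usd=0.86)

# Write side (assumed definition): rows made query-ready per dollar.
# Both numbers are placeholders.
write_side = per_dollar(work_done=1_000_000_000, compute_cost_usd=4.0)

print(f"read side:  {read_side:.1f} queries per dollar")
print(f"write side: {write_side:,.0f} rows per dollar")
```

Higher is better on both dimensions, and a system can score well on one while doing poorly on the other.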

The first release focuses on the read side: analytical queries over data that has already been loaded. We have also started measuring the write side, beginning with Snowflake as a contrast point for ClickHouse. Broader write-side coverage will follow.

This gives CostBench a simple roadmap: expose whether real-time cost-performance holds across the full analytics pipeline, from making fresh data query-ready to querying it efficiently.

The first results: read-side cost-performance

The first CostBench release turns read-side performance into a comparable performance-per-dollar result across major cloud data warehouses.

We compare ClickHouse Cloud, Snowflake, Databricks, BigQuery, and Redshift using 43 production-derived analytical queries on a real anonymized dataset, then apply each vendor’s actual compute billing model to place every system on the same cost-performance plane: faster or slower, lower-cost or higher-cost.

ClickHouse Cloud is the only system that stays in the fast and low-cost zone as data scales. The nearest competitor is 23× worse in cost-performance.

That is the value of CostBench: it turns vendor-specific runtimes and billing models into a result teams can use when choosing a platform.
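The cost-performance plane can be pictured as a four-way classification over (runtime, cost) pairs. The sketch below is only one possible reading: splitting the plane at the median runtime and median cost is an assumption, the sample values are invented, and the system names are deliberately generic. `place_on_plane` is a hypothetical helper.

```python
from statistics import median


def place_on_plane(results: dict[str, tuple[float, float]]) -> dict[str, str]:
    """Map {system: (runtime_s, cost_usd)} to a quadrant label.

    ASSUMPTION: the plane is split at the median runtime and median cost;
    values on the median count as fast / low-cost.
    """
    runtime_mid = median(runtime for runtime, _ in results.values())
    cost_mid = median(cost for _, cost in results.values())
    labels = {}
    for system, (runtime, cost) in results.items():
        speed = "fast" if runtime <= runtime_mid else "slow"
        price = "low-cost" if cost <= cost_mid else "high-cost"
        labels[system] = f"{speed}, {price}"
    return labels


# Invented sample data, not CostBench measurements.
sample = {
    "S1": (10.0, 0.5),
    "S2": (12.0, 4.0),
    "S3": (40.0, 1.0),
    "S4": (60.0, 6.0),
    "S5": (30.0, 2.0),
}
labels = place_on_plane(sample)

for system, label in labels.items():
    print(f"{system}: {label}")
```

Rerunning the same classification at several data sizes would show whether a system stays in the fast and low-cost quadrant as data scales or drifts out of it.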

Open and reproducible by design

CostBench is open because cost-performance claims should be inspectable.

The benchmark publishes the workload, scripts, configurations, pricing assumptions, raw JSON results, and methodology. If a result looks surprising, you can inspect the setup that produced it. If a configuration can be improved, it can be reviewed and corrected in the open.

Try it yourself

Explore the results on the ClickHouse benchmark hub, inspect the...
