Postgres FDW: Pushdown is a negotiation

saisrirampur1 pts0 comments

Postgres FDW: Pushdown is a negotiation<br>Open searchOpen region selectorEnglish<br>Japanese

47.5kSign inGet started

->Scroll to top

BackBlog<br>Engineering<br>Copy pageCopied!More actionsView as Markdown Open this page in Markdown<br>Open in ChatGPT Ask questions about this page<br>Open in Claude Ask questions about this page<br>Open in v0 Ask questions about this page

Postgres FDW: Pushdown is a negotiation

Kaushik Iska, David Wheeler and Philip Dubé<br>May 14, 2026 · 20 minutes read

Postgres extensions add functionality that Postgres itself does not include. Examples include PostGIS for geospatial data, pgvector for embeddings, and TimescaleDB for time-series. Via extensions the Postgres ecosystem distributes new functionality; just CREATE EXTENSION to hook into Postgres internals, and thereafter the user mostly forgets about it.

Foreign Data Wrapper extensions, also known as “FDW”s, teach Postgres how to read (and sometimes write) data outside of Postgres. Simply declare a foreign table:

1CREATE FOREIGN TABLE events (...)<br>2SERVER my_clickhouse OPTIONS (table 'events');

Then SELECT * FROM events resembles a query against any other table. Internally, Postgres asks the FDW to fetch the data from elsewhere.

pg_clickhouse, the FDW we maintain, fetches data from ClickHouse. While people often rely on the unified data stack — Postgres for transactional data and ClickHouse for analytics — pg_clickhouse executes SQL queries on both systems. As we worked on it for the past 6 months, we found that a single question drives most of the engineering work: how much of a query do we send across the wire as SQL versus how many rows do we drag back as data?

That question encompasses the meaning of pushdown: how much work can we “push down” to the remote service? The answer seems simple: “send everything!” — the WHERE clause, the GROUP BY, the LIMIT. But a closer examination reveals the complexity: Some clauses we can send. Some we can almost send — if we rewrite them. Some we used to send but stopped, because they returned the wrong results. And some clauses can never be sent by any FDW, regardless of the engineering one throws at them.

We’ve found the process of making these determinations highly iterative.

To demonstrate, let’s examine the impact of that iteration on a single query: what gets pushed down, what doesn't, and how we continuously modified the code in response to the question.

The goal is to illuminate the inner workings of an FDW for people who’ve heard of FDWs but don’t know how they work in detail. Whether you're a Postgres user curious about ClickHouse, a ClickHouse user curious about Postgres, or someone thinking about writing an FDW yourself, we hope you find this exercise edifying.

The query that takes 80 ms or 4 minutes #

This query ranks the busiest web events in the last week for the US, UK, and DE, by country and event name. It reports volume, unique users, premium share, p95 latency, and each event’s rank by country, returning the top 100 rows overall to provide a small snapshot and avoid streaming raw events.

1SELECT<br>2 u.country,<br>3 e.event_name,<br>4 count(*) AS n,<br>5 count(DISTINCT e.user_id) AS unique_users,<br>6 count(*) FILTER (WHERE e.properties->>'tier' = 'premium') AS premium_count,<br>7 percentile_cont(0.95) WITHIN GROUP (ORDER BY e.duration_ms) AS p95_ms,<br>8 ROW_NUMBER() OVER (PARTITION BY u.country ORDER BY count(*) DESC) AS rank_in_country<br>9FROM events_ch e JOIN users_ch u USING (user_id)<br>10WHERE e.ts >= now() - interval '7 days'<br>11 AND u.country IN ('US', 'UK', 'DE')<br>12 AND e.properties->>'platform' = 'web'<br>13GROUP BY u.country, e.event_name<br>14ORDER BY n DESC<br>15LIMIT 100;

events_ch and users_ch are foreign tables backed by ClickHouse. With every clause in this query pushed down, it returns 100 rows in roughly 80 ms.

If a single clause fails to push down, the query takes minutes. We built support for pushing down the window function, the percentile, the JSON access, and the FILTER aggregate; but each was, at some earlier point, not yet pushed down. When an operation cannot be pushed down, the rest of the query can't push down either; rows that should have been aggregated remotely must stream back to Postgres so it can aggregate them locally. The wire ends up carrying tens to hundreds of millions of rows instead of 100.

Pushdown looks like a feature, but it’s really an agreement between two SQL grammars, revised for every Postgres release. This explains why the pg_clickhouse release notes often look like corrections: revoking incorrect array-function pushdown, adding safer functions like levenshtein and soundex, and respecting planner invariants like EvalPlanQual even when the remote could move faster.

What the FDW actually negotiates #

An FDW does not send the original Postgres query to the remote database; doing so causes errors where the SQL dialects differ. But ideally, It also does not pull every row back unless it has to (as simple FDWs like file_fdw do), because although the results would...

postgres query data down pushdown send

Related Articles