Has the lakehouse battle shifted from table formats to catalogs?

Databricks Iceberg Support Has a Catch. It's Called Unity Catalog.


June 11, 2026

Written by: Kyle Weller

The Iceberg Interoperability Promise

Apache Iceberg was built around one architectural principle: no single vendor owns your data. The spec is open, the catalog protocol is open, and any engine that implements the reader/writer spec can interoperate with any other without translation layers, without data copies, without routing through a control plane you don't control.

That last part is what matters architecturally. The Iceberg REST Catalog (IRC) exists precisely so that catalog governance is decoupled from compute. Apache Polaris, Project Nessie, AWS Glue, Snowflake Horizon: any vendor that implements the IRC spec becomes a first-class governing catalog for Iceberg tables. The compute engine (Spark, Flink, Trino, DuckDB, Athena, BigQuery) doesn't need to know or care who's governing. It speaks IRC. The data sits in your object store. You own it. That separation of compute from catalog is not an implementation detail; it is the entire value proposition of the format.

The breadth of native engine support today reflects how seriously the ecosystem has taken this promise. Spark, Flink, Trino, DuckDB, Snowflake, Athena, and BigQuery are all reading and writing the same underlying files, governed by whichever catalog your team chooses, with no single point of control. This is in production at companies running serious multi-engine lakehouse stacks, and the interoperability story is exactly why teams have been migrating to Iceberg from proprietary formats.

What’s below the surface of the Databricks Iceberg?
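Before testing any vendor's claim, it helps to see what the catalog decoupling described above looks like from an engine's side. The following is a minimal sketch, not a definitive setup: it builds the Spark properties that Iceberg's Spark integration uses to register a REST catalog. The helper name `iceberg_rest_catalog_conf`, the catalog name `lake`, the warehouse name `analytics`, and both endpoint URLs are hypothetical placeholders we made up for illustration.

```python
from typing import Dict, Optional


def iceberg_rest_catalog_conf(
    name: str,
    uri: str,
    warehouse: Optional[str] = None,
    token: Optional[str] = None,
) -> Dict[str, str]:
    """Build Spark properties that register an Iceberg REST catalog as `name`.

    Hypothetical helper for illustration; the property keys are the ones
    Iceberg's Spark integration reads for a catalog of type "rest".
    """
    prefix = f"spark.sql.catalog.{name}"
    conf = {
        # Enables Iceberg's SQL extensions in Spark.
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers an Iceberg catalog under the chosen name.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Tells Iceberg to talk to the catalog over the REST protocol.
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
    }
    if warehouse is not None:
        conf[f"{prefix}.warehouse"] = warehouse
    if token is not None:
        conf[f"{prefix}.token"] = token
    return conf


# Assumed, made-up endpoints: two different IRC-compatible catalogs.
polaris = iceberg_rest_catalog_conf(
    "lake", "https://polaris.example.com/api/catalog", warehouse="analytics"
)
nessie = iceberg_rest_catalog_conf(
    "lake", "https://nessie.example.com/iceberg", warehouse="analytics"
)

# Which properties differ when the governing catalog is swapped?
changed = sorted(key for key in polaris if polaris[key] != nessie[key])
print(changed)  # ['spark.sql.catalog.lake.uri']
```

In this sketch, swapping the governing catalog changes one property, the endpoint URI, and nothing about the engine or the data files. In a real deployment, authentication settings and the warehouse identifier would usually differ between catalogs as well. Each pair can be passed to `SparkSession.builder.config(key, value)`, assuming the Iceberg Spark runtime is on the classpath.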

"Full performance. Full interoperability. No tradeoffs." That is Databricks' own headline for its Iceberg support, as written by the original Iceberg creators. Databricks has certainly earned (or purchased) the right to be taken seriously on a claim like this: this is the vendor that paid over $1 billion to merge a competing table format into its own. Despite the two years it took to reach “Public Preview”, if the headline claim held, Databricks could be one of the strongest platforms anywhere for running a multi-engine Iceberg stack: write from Flink, query from Trino, govern from the catalog of your choice, with Databricks as one engine among equals.

But the test of that claim isn't just reading the announcement. It's the limitations section of Databricks' own documentation, and reading the two side by side is disorienting, because they describe different products. The announcement describes the open lakehouse Iceberg was designed for. The documentation describes an implementation where Iceberg exists only inside Unity Catalog, every external catalog becomes read-only, and a substantial list of Iceberg-native features is simply absent.

What follows walks through the documentation in detail: what's mandatory, what's read-only, and what's missing. While the majority of this blog will stick to public docs and community references, we also have a pile of receipts, for anyone interested, from our experience supporting Iceberg users in production who face regular friction when integrating with the Databricks platform.

Unity Catalog Is Not Optional

To use Iceberg on Databricks, Unity Catalog is mandatory. This is not a configuration default you can override; it is a hard architectural requirement stated plainly in the Iceberg documentation: "You must meet the following requirements: A workspace with Unity Catalog enabled."

What this means concretely: you cannot use Apache Polaris, Project Nessie, AWS Glue, or any other Iceberg REST Catalog for Iceberg tables with Databricks. You cannot mix and match that part of the stack. To make sure this is locked down tight, the LOCATION property (which would let you specify where table metadata lives) is not supported for Iceberg in Unity Catalog. Databricks controls the metadata path. The legacy Hive Metastore path is also closed entirely.

Unity Catalog is Databricks' proprietary SaaS metadata layer; it is not an IRC implementation you can run elsewhere, substitute with an equivalent, or operate outside Databricks' control plane. The whole point of the IRC standard is that any conforming catalog can govern any table. Unity Catalog's mandatory status is a direct contradiction of that principle, and the asymmetry this creates is worth stating precisely.

External engines (Spark, Flink, Trino) can connect to Unity Catalog as an IRC server and read/write Databricks-managed Iceberg tables. That half of the interoperability story works. What doesn't work is Databricks compute acting as an IRC client connecting outward to Polaris, Nessie, or Glue. The IRC standard flows one direction on Databricks: inward to Unity Catalog. The asymmetry is confusing enough that a...
