Delta Lake and Apache Iceberg: 2025 comparison
Updated: 2026-05-03
Open table formats over data lakes have moved in three years from a minor technical decision to the core of any serious analytics architecture. Delta Lake, originally created by Databricks and released under the Apache 2.0 license, and Apache Iceberg, incubated at Netflix and graduated as a top-level Apache project, are the two formats that have captured most of the market. With Delta Lake 4.0 released in April 2025 and Apache Iceberg 1.9 in May 2025, it's time to review where both formats stand and what criteria to apply when choosing.
The data governance and catalog decision connects to what we describe in Microsoft GraphRAG in the enterprise and in the architecture of RAG 2.0 with knowledge graphs. For the infrastructure layer these formats run on, the Kubernetes 1.33 improvements analysis covers the relevant platform context.
Key takeaways
- Delta Lake 4.0 brings Spark 4.0 support, a native Rust engine (delta-rs) that removes the JVM dependency, liquid clustering, and UniForm improvements.
- Apache Iceberg 1.9 refines the REST catalog; the arrival of Snowflake's Polaris addresses what was historically Iceberg's weak point: the catalog.
- Delta’s UniForm lets a single file set be read as Delta or Iceberg, reducing the ecosystem gap between the two.
- For companies starting fresh without Databricks ties, Iceberg is the safer bet for neutrality and mature REST catalogs.
- For teams already operating on Databricks, Delta remains the natural choice.
What they are and where they come from
Both Delta and Iceberg solve the same problem: bringing ACID transactions, schema evolution, time travel, and metadata management to file formats like Parquet or ORC on object stores like S3 or compatible storage. The difference is not in what they do but in how they do it and in the ecosystem around each.
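To make that concrete, here is a minimal sketch of the two capabilities teams notice first, schema evolution and time travel, using Delta SQL on Spark. It assumes an active SparkSession `spark` with the Delta extensions configured; `sales.orders` is a placeholder table name, and the Iceberg equivalents on Spark are nearly identical.

```python
# Minimal sketch: schema evolution and time travel on a Delta table.
# Assumes an active SparkSession `spark` with the Delta Lake extensions;
# `sales.orders` is a placeholder table name.
spark.sql("CREATE TABLE sales.orders (order_id BIGINT, amount DOUBLE) USING DELTA")
spark.sql("INSERT INTO sales.orders VALUES (1, 9.50), (2, 3.20)")

# Schema evolution: add a column without rewriting existing data files.
spark.sql("ALTER TABLE sales.orders ADD COLUMNS (channel STRING)")

# Time travel: query the table as of an earlier committed version.
spark.sql("SELECT * FROM sales.orders VERSION AS OF 0").show()
```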
Delta Lake was born at Databricks in 2017 and was open-sourced in 2019. For years it had an ambiguous relationship with its parent company, with advanced features available only on Databricks, until the 3.0 release in 2023 opened nearly the entire feature surface and introduced UniForm. Databricks' acquisition of Tabular, the company founded by Iceberg's creators, in June 2024 marked a shift toward convergence.
Apache Iceberg emerged at Netflix in 2017 and reached the Apache Software Foundation in 2018. It was designed as a neutral format with no dominant company behind it, and that neutrality led Snowflake, Google BigQuery, AWS Athena, Trino, Dremio, and Starburst to adopt it as a first-class format. Iceberg became the bet of those who wanted multi-engine architectures without vendor lock-in.
What is new in recent versions
Delta Lake 4.0 (April 2025) brings three relevant changes:
- Full support for Spark 4.0, plus Rust as a native engine via the delta-rs project, enabling Delta reads and writes without the JVM (see the sketch after this list). That matters for environments that want to avoid the weight of Spark in simple ingestion jobs.
- Liquid clustering as a data layout strategy, replacing traditional Z-ordering and adapting better to changing query patterns.
- Improvements to the UniForm protocol allowing a single file set to be read as Delta or Iceberg without duplication.
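As a taste of the JVM-free path, here is a minimal sketch using the `deltalake` Python package, the bindings for delta-rs; the local path is a placeholder.

```python
# JVM-free Delta via delta-rs: write and read a table with no Spark at all.
# The /tmp path is a placeholder.
import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

events = pa.table({"user_id": [1, 2, 3], "amount": [9.5, 3.2, 7.8]})

# Each write is an ACID commit to the Delta transaction log.
write_deltalake("/tmp/events_delta", events, mode="append")

dt = DeltaTable("/tmp/events_delta")
print(dt.version())           # current transactional version
print(dt.to_pyarrow_table())  # scan the table back into Arrow memory
```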
Apache Iceberg 1.9 (May 2025) brings improvements to the REST catalog, hidden partitioning, and concurrent write performance. The most relevant development of the last year wasn't an Iceberg feature itself but the arrival of mature REST catalogs such as Snowflake's Polaris, open-sourced in 2024, which lets Iceberg tables be managed independently of any single provider. This matters because the catalog has historically been Iceberg's weak point: too many mutually incompatible implementations.
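To illustrate what a REST catalog buys you, here is a sketch of connecting to a Polaris-style endpoint with pyiceberg; the URI, warehouse, and credential values are placeholders to adapt to your deployment.

```python
# Connecting to an Iceberg REST catalog (Polaris-style) with pyiceberg.
# All connection values below are placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "analytics",
    **{
        "type": "rest",
        "uri": "https://polaris.example.com/api/catalog",
        "warehouse": "analytics_wh",
        "credential": "client_id:client_secret",
    },
)

# The same catalog endpoint serves Spark, Trino, or any other engine.
table = catalog.load_table("sales.orders")
print(table.schema())
```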
Real-world performance
Performance depends on both the format and the engine, so clean comparisons are hard. What can be said is that on equivalent workloads, Delta on Spark and Iceberg on Trino give very similar read results. The difference appears in concurrent writes: Iceberg's snapshot-based optimistic concurrency control scales better when many processes write in parallel, whereas Delta has traditionally been more conservative about what it treats as a conflict.
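From a writer's point of view, optimistic concurrency looks like retrying the commit when another writer wins the race. A sketch, assuming the `deltalake` package surfaces conflicts as `CommitFailedError` (check the exception types in your version):

```python
# Retry loop for optimistic concurrency: on a commit conflict, back off
# and try again. Exception type assumed from deltalake's exceptions module.
import time
import pyarrow as pa
from deltalake import write_deltalake
from deltalake.exceptions import CommitFailedError

def append_with_retry(path: str, batch: pa.Table, attempts: int = 5) -> None:
    for attempt in range(attempts):
        try:
            write_deltalake(path, batch, mode="append")
            return
        except CommitFailedError:
            # Another writer committed first; back off exponentially and retry.
            time.sleep(0.2 * 2 ** attempt)
    raise RuntimeError("gave up after repeated commit conflicts")
```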
In compaction and maintenance, the differences are subtle. Delta has mature OPTIMIZE commands, and the new liquid clustering improves query performance on variable patterns. Iceberg has equivalent procedures, but integration depends on the engine: each engine implements its own maintenance commands, which can create variability in the operational experience.
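For a sense of that variability, here is how the same maintenance looks through Spark SQL on each side; table and catalog names are placeholders, and the Iceberg calls assume the Iceberg Spark extensions are configured.

```python
# Delta: compact small files (liquid clustering tables recluster here too).
spark.sql("OPTIMIZE sales.orders")

# Iceberg on Spark: compaction is a catalog stored procedure.
spark.sql("CALL analytics.system.rewrite_data_files(table => 'sales.orders')")

# Iceberg metadata hygiene: expire old snapshots to bound metadata growth.
spark.sql(
    "CALL analytics.system.expire_snapshots("
    "table => 'sales.orders', "
    "older_than => TIMESTAMP '2025-01-01 00:00:00')"
)
```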
For massive workloads, both formats handle petabytes without known issues. Beyond a certain volume, the performance difference between formats becomes smaller than the difference between engine implementations or partitioning strategies.
Ecosystem and engines
This is where the real competition has played out during 2024 and 2025. Iceberg leads on neutrality: Snowflake supports it natively, BigQuery can read it without ingestion, Athena queries it directly, and the main open-source engines (Trino, Spark, Flink, and Dremio) support it maturely.
Delta has closed that gap at full speed. With UniForm, a single Delta file set can be read as Iceberg by engines without native Delta support. Databricks' acquisition of Tabular in 2024 points to greater convergence: Iceberg's main maintainers now work at Databricks, and the company has said publicly that it wants Delta and Iceberg to be complementary rather than mutually exclusive.
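Enabling UniForm comes down to table properties. A sketch, with property names taken from the Delta UniForm documentation (verify them against your runtime version); the table name is a placeholder.

```python
# Create a Delta table that Iceberg clients can also read via UniForm.
# Assumes an active SparkSession `spark`; property names per the Delta
# UniForm docs, to be verified against your runtime version.
spark.sql("""
    CREATE TABLE sales.orders_uniform (
        order_id BIGINT,
        amount   DOUBLE
    )
    USING DELTA
    TBLPROPERTIES (
        'delta.enableIcebergCompatV2' = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
```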
Practical choice criteria
For a company starting from scratch in 2025, the decision depends on the main engine:
- Center of gravity in Databricks: Delta is the natural choice.
- Snowflake, BigQuery, or a multi-engine catalog with Trino: Iceberg fits better through native support.
- Mixed architectures: Delta's UniForm offers a pragmatic path, writing in Delta while staying consumable as Iceberg when needed.
The catalog is a factor worth analyzing carefully: Iceberg with a REST catalog like Polaris or AWS Glue enables multi-engine access with centralized control, while Delta with Unity Catalog offers a very polished experience that has historically been tied to Databricks.
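In practice, centralized control means every engine points at the same catalog endpoint. A sketch of the Spark side against a REST catalog; the catalog name, URI, and credential are placeholders, and the iceberg-spark-runtime package is assumed to be on the classpath.

```python
# Point Spark at the same REST catalog other engines use, so access
# control lives in the catalog rather than in each engine. Placeholder
# values throughout; requires the iceberg-spark-runtime package.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-rest")
    .config("spark.sql.catalog.analytics", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.analytics.type", "rest")
    .config("spark.sql.catalog.analytics.uri", "https://polaris.example.com/api/catalog")
    .config("spark.sql.catalog.analytics.credential", "client_id:client_secret")
    .getOrCreate()
)

spark.sql("SELECT count(*) FROM analytics.sales.orders").show()
```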
Migration cost
A frequent question is whether to migrate from a format already in use. The short answer: rarely. Migrating petabytes is expensive and risky, and it rarely pays off if the current platform works. It can make sense for companies that want to open their data to external consumers or to internal teams on different engines. In that case, UniForm or an Iceberg mirror layer over Delta can solve the problem without a full migration.
My read
Both formats are solid, both have a future, and the competition between them has improved the overall offering. For new companies, the decision reduces to where their analytics platform's center of gravity sits.
- For teams starting now without prior Databricks ties: Iceberg is probably the safer bet, for its neutrality and for the maturity its REST catalogs have reached.
- For teams already operating on Databricks: Delta remains natural.
The interesting scenario is that convergence may make this decision matter less over time. If UniForm evolves well and catalogs improve their interoperability, the underlying format may matter less and less, and the decision will move to where it matters more: governance, cost management, and data quality.