Apache XTable Architecture - Lakehouse interoperability.

XTable is an omni-directional translation layer on top of open table formats such as Apache Hudi, Apache Iceberg & Delta Lake. It is NOT a new table format!

Essentially, what we are doing is this:

SOURCE ---> (read metadata) ---> XTable's internal model ---> write into TARGET

We read the metadata from the SOURCE table format, transform it into a unified representation, and write the metadata out in the TARGET format. Note that at this stage XTable only touches metadata, not the actual data files (such as #Parquet).

I have been asked a lot about the inner workings of Apache XTable (Incubating), so let's break it down. XTable's architecture consists of three key components:

1. Conversion Source
- Table-format-specific modules responsible for reading metadata from the source
- They extract information such as schema, transactions, and partitions, and translate it into XTable's unified internal representation

2. Conversion Logic
- The central processing unit of XTable
- It orchestrates the entire translation process, including initializing all components and managing sources and targets, among other critical things

3. Conversion Target
- These mirror the source readers
- They take the internal representation of the metadata and map it onto the target format's metadata structure

A minimal illustrative sketch of how these pieces fit together follows below.

Interested in a detailed read? Paper link in comments.

#dataengineering #softwareengineering
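To make the three components a bit more concrete, here is a minimal, hypothetical sketch in plain Java of how a source reader, the unified internal model, and target writers could fit together. The interface and record names below are illustrative assumptions for this post, not XTable's actual classes.

```java
import java.util.List;

// Conversion Source (illustrative): reads metadata from the source table format.
interface ConversionSource {
    InternalTable extractMetadata();          // schema, partition fields, table identity
    List<InternalCommit> extractCommits();    // transaction history to replay
}

// Unified internal representation (format-agnostic), hypothetical shape.
record InternalTable(String name, String schemaJson, List<String> partitionFields) {}
record InternalCommit(long timestampMillis, List<String> addedFiles, List<String> removedFiles) {}

// Conversion Target (illustrative): maps the internal model onto the target format's metadata.
interface ConversionTarget {
    void syncSchema(InternalTable table);
    void syncCommit(InternalCommit commit);
    void complete();                          // publish the target metadata
}

// Conversion Logic (illustrative): orchestrates source -> internal model -> target(s).
final class ConversionOrchestrator {
    void sync(ConversionSource source, List<ConversionTarget> targets) {
        InternalTable table = source.extractMetadata();
        for (ConversionTarget target : targets) {
            target.syncSchema(table);
            for (InternalCommit commit : source.extractCommits()) {
                target.syncCommit(commit);    // only metadata is produced;
            }                                 // the Parquet data files are untouched
            target.complete();
        }
    }
}
```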
Apache XTable (Incubating)
Data Infrastructure and Analytics
Menlo Park, CA · 5,942 followers
Seamless cross-table interop between Apache Hudi, Delta Lake, and Apache Iceberg
About us
Apache XTable (Incubating) is a cross-table, omni-directional interoperability layer for the lakehouse table formats Apache Hudi, Apache Iceberg, and Delta Lake. XTable was formerly known as OneTable and was recently renamed. XTable is NOT a new or separate table format; it provides abstractions and tools for translating lakehouse table format metadata. Choosing a table format is a costly evaluation: each project has rich features that may fit different use cases, and some vendors use a table format as a point of lock-in. Your data should be UNIVERSAL! https://github.com/apache/incubator-xtable
- Website
- https://xtable.apache.org
- Industry
- Data Infrastructure and Analytics
- Company size
- 11-50 employees
- Headquarters
- Menlo Park, CA
- Type
- Partnership
- Founded
- 2023
- Specialties
- Data Lakehouse, Data Engineering, Lakehouse, Apache Iceberg, Apache Hudi, Delta Lake, Apache Spark, Trino, Apache Flink, and Presto
Locations
- Primary
- US, CA, Menlo Park, 94025
Updates
Matrix fans assemble! In The Matrix, The Architect is the creator of the Matrix itself. He represents cold, machine logic, determinism, and control, in contrast to The Oracle, who embodies intuition, free will, and chaos. This battle between rigid structures vs. adaptable intelligence mirrors the evolution of data architectures.

The Architect built the Matrix to be a perfectly controlled system, but the first versions failed because they didn't account for human (data) unpredictability. The Oracle saw this flaw and introduced an essential element: choice. More interestingly, the illusion of choice.

The same applies to traditional Lakehouse architectures, where a single table format (e.g., Iceberg, Delta, or Hudi) dictates the entire ecosystem.

In the Architect's vision of the Lakehouse:
- Everything follows strict rules.
- You're locked into one format, ensuring consistency.
- The system is optimized, but only for those willing to conform.

But what happens when business needs evolve? What if different teams need different tools that speak different table formats? Just like the first Matrix, this rigid system eventually fails because real-world use cases demand flexibility.

The future of the Lakehouse is not about choosing one format; it's about interoperability. The underlying data may be stored in a specific format, but end users are able to work with it through their preferred formats (the illusion of choice), where Iceberg, Delta, and Hudi coexist and Apache XTable acts as the Oracle, translating metadata across them. Instead of forcing teams to conform to a single format, XTable introduces interoperability without breaking transactional consistency.

In the Oracle's vision of the Lakehouse:
- No format lock-in: choose Iceberg, Delta, or Hudi without constraints.
- Seamless interoperability: read/write across multiple engines without costly conversions.
- Evolvability: the lakehouse adapts to business needs, not the other way around.

Apache XTable makes this possible by acting as a translator, not a disruptor. Instead of forcing a single "perfect" format (the Architect's mistake), it allows organizations to move fluidly between table formats.

1. XTable detects the table format (Iceberg, Delta, or Hudi).
2. Maps the metadata to an intermediate abstraction layer.
3. Exposes the table in a compatible way to query engines and apps.
4. Ensures transactional consistency across different formats.
5. Enables format-specific optimizations (e.g., partition pruning, snapshot isolation).

What's your take on this Open Table Format (OTF) Revolution?

#DataManagement #Lakehouse #DataArchitectures
Databases & data lakes SME Brenna Buuck explores Apache XTable (Incubating), an #opensource metadata translator that simplifies interoperability between the #opentable formats Apache Iceberg, Apache Hudi, and Delta Lake: what it is, how it works, use cases, limitations, and more. Check it out. https://lnkd.in/dyVymqeH
In Episode 2 of 'Apache XTable in the Lakehouse', we will be joined by Matthias Rudolph & Stephen Said from AWS for a show-and-tell session. They will cover:
1) Open Table Formats: Iceberg, Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
Join us! (A rough sketch of the Lambda pattern is included after the event link below.)
Apache XTable on AWS - Efficiently converting between Lakehouse Formats
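Ahead of the session, here is a rough, hypothetical sketch of the "XTable on Lambda" idea: a Lambda function that triggers a metadata sync for one table per invocation. The handler wiring uses the standard aws-lambda-java-core interfaces; the sync call itself is left as a stub, since how you invoke XTable (bundled utilities jar, library call) and how you register the result in the Glue Data Catalog will depend on your setup.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.Map;

public class XTableSyncHandler implements RequestHandler<Map<String, String>, String> {

    @Override
    public String handleRequest(Map<String, String> event, Context context) {
        // Assumed event shape: which table to sync and into which formats.
        String tableBasePath = event.get("tableBasePath");   // e.g. s3://bucket/warehouse/orders
        String sourceFormat  = event.get("sourceFormat");     // e.g. HUDI
        String targetFormats = event.get("targetFormats");    // e.g. ICEBERG,DELTA

        context.getLogger().log("Syncing " + tableBasePath
                + " from " + sourceFormat + " to " + targetFormats);

        runXTableSync(tableBasePath, sourceFormat, targetFormats);
        return "SYNC_COMPLETE";
    }

    // Placeholder: in practice this would invoke XTable's sync (run the bundled
    // utilities jar with a dataset config, or call its Java API) and then
    // register/refresh the resulting table metadata in the Glue Data Catalog.
    private void runXTableSync(String tableBasePath, String sourceFormat, String targetFormats) {
        // intentionally left as a stub for this sketch
    }
}
```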
Join us for Episode 2 of 'Apache XTable in the Lakehouse'! We will be joined by Matthias Rudolph & Stephen Said from Amazon Web Services (AWS) for a show-and-tell session. They will cover:
1) Open Table Formats: Apache Iceberg, Apache Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
11th March 2025 | 9 AM PT
https://lnkd.in/gWMUS8pX
Dremio published their 2025 State of Data survey results this week, with responses from data and technology leaders across industries. One of the questions asked was about adoption of the open table formats Delta Lake, Apache Hudi, and Apache Iceberg. It looks like Delta Lake comes out on top, and the numbers don't seem to match common social media narratives. Read the report in full here: https://lnkd.in/geCNH7bX #datalakehouse #deltalake #apachehudi #apacheiceberg
Very grateful for the opportunity to co-present with Microsoft and speak on Open Table Format Interoperability using Apache XTable (Incubating) at the Seattle Apache Iceberg Meetup. Interoperability is critical: it allows organizations to avoid lock-in, choose the best tools for their specific use case, and adapt more easily to emerging technologies. Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines. Thank you to Anand Sivaram and Ashvin A. for the opportunity to speak together, as well as to Kevin Liu, Tej Luthra, and Matthew Hicks for organizing this meetup. If you're interested in learning more about Apache XTable (Incubating), check out the talk between Onehouse and Microsoft below. https://lnkd.in/giuAFp8v We are looking for contributors to join XTable and help shape the future of open lakehouse formats!
Open Table Format Interoperability with Apache XTable
https://www.youtube.com/
Nice write-up by Brenna from MinIO on Apache XTable. It touches upon:
- Enhanced Interoperability
- Simplified Data Management
- Architectural Components
- Current Limitations
Link: https://lnkd.in/dyVymqeH #lakehouse #dataengineering
Excited to share some thoughts on Apache XTable (Incubating), a game-changer for data lakehouse interoperability! As organizations juggle Apache Hudi, Apache Iceberg, and Delta Lake, Apache XTable steps in as a cross-table converter, enabling seamless metadata translation without duplicating data.

Why it's useful:
- Flexibility: Write data in one format (e.g., Hudi for fast ingestion) and query it in another (e.g., Iceberg with Snowflake or Delta with Databricks Photon).
- Cost & Efficiency: Avoids expensive data rewrites by syncing only metadata, keeping storage lean and ensuring data consistency.
- Ecosystem Synergy: Bridges diverse tools and vendor preferences, making data accessible across platforms and teams.

How to leverage it:
- Migration Made Easy: Transition between table formats without disrupting pipelines; perfect for evolving data strategies.
- Real-Time Analytics: Pair it with high-throughput systems to unlock faster insights across formats.

A rough sketch of what a Hudi-to-Iceberg/Delta metadata sync could look like is shown below.

Read more in the documentation: https://xtable.apache.org/

Whether you're optimizing a lakehouse, unifying team workflows, or future-proofing your data stack, Apache XTable is worth exploring. What's your take on table format interoperability? Let's discuss!

#DataLakehouse #ApacheXTable #BigData #OpenSource #apachespark #aws #dataengineer #ApacheIceberg #hudi
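For the "write in Hudi, query as Iceberg or Delta" pattern above, a metadata sync might look roughly like the following Java driver. This is a sketch under assumptions: the dataset config keys and the --datasetConfig flag follow the pattern shown in the XTable documentation, but the jar name, bucket path, and table name here are placeholders to adapt to your own environment and XTable version.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class SyncHudiToIcebergAndDelta {
    public static void main(String[] args) throws Exception {
        // Dataset config (keys modeled on the XTable docs; verify against your version).
        String config = """
                sourceFormat: HUDI            # table was written by a Hudi pipeline
                targetFormats:
                  - ICEBERG                   # expose the same data to Iceberg readers
                  - DELTA                     # ...and to Delta readers
                datasets:
                  - tableBasePath: s3://my-bucket/warehouse/orders   # hypothetical path
                    tableName: orders
                """;
        Path configFile = Files.createTempFile("xtable-sync", ".yaml");
        Files.writeString(configFile, config);

        // Only metadata is generated under the table's base path; the Parquet data
        // files are never copied or rewritten.
        Process sync = new ProcessBuilder(
                "java", "-jar", "xtable-utilities-bundled.jar",   // jar name is illustrative
                "--datasetConfig", configFile.toString())
                .inheritIO()
                .start();
        System.exit(sync.waitFor());
    }
}
```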
Join the AWS Team next week for Episode 2 of 'Apache XTable in the Lakehouse'!
Agenda:
1) Open Table Formats: Iceberg, Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
11th March | 9 AM PT
Join us! Link: https://lnkd.in/gWMUS8pX