Apache XTable Architecture - Lakehouse interoperability.

XTable is an omni-directional translation layer on top of open table formats such as Apache Hudi, Apache Iceberg & Delta Lake. It is NOT a new table format!

Essentially, what we are doing is this:

SOURCE ---> (read metadata) ---> XTable's internal model ---> write into TARGET

We read the metadata from the SOURCE table format, transform it into a unified representation, and write the metadata out in the TARGET format. Note that at this stage XTable only touches metadata, not the actual data files (such as #Parquet).

I have been asked a lot about the inner workings of Apache XTable (Incubating), so let's break it down. XTable's architecture consists of three key components:

1. Conversion Source
- Table-format-specific modules responsible for reading metadata from the source
- They extract information such as schema, transactions, and partitions, and translate it into XTable's unified internal representation

2. Conversion Logic
- The central processing unit of XTable
- It orchestrates the entire translation process, including initializing all components and managing sources and targets, among other critical things

3. Conversion Target
- These mirror the source readers
- They take the internal representation of the metadata and map it onto the target format's metadata structure

A minimal illustrative sketch of how these pieces fit together follows below.

Interested in a detailed read? Paper link in comments.

#dataengineering #softwareengineering
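To make the three components a bit more concrete, here is a minimal, hypothetical sketch in plain Java of how a source reader, the unified internal model, and target writers could fit together. The interface and record names below are illustrative assumptions for this post, not XTable's actual classes.

```java
import java.util.List;

// Conversion Source (illustrative): reads metadata from the source table format.
interface ConversionSource {
    InternalTable extractMetadata();          // schema, partition fields, table identity
    List<InternalCommit> extractCommits();    // transaction history to replay
}

// Unified internal representation (format-agnostic), hypothetical shape.
record InternalTable(String name, String schemaJson, List<String> partitionFields) {}
record InternalCommit(long timestampMillis, List<String> addedFiles, List<String> removedFiles) {}

// Conversion Target (illustrative): maps the internal model onto the target format's metadata.
interface ConversionTarget {
    void syncSchema(InternalTable table);
    void syncCommit(InternalCommit commit);
    void complete();                          // publish the target metadata
}

// Conversion Logic (illustrative): orchestrates source -> internal model -> target(s).
final class ConversionOrchestrator {
    void sync(ConversionSource source, List<ConversionTarget> targets) {
        InternalTable table = source.extractMetadata();
        for (ConversionTarget target : targets) {
            target.syncSchema(table);
            for (InternalCommit commit : source.extractCommits()) {
                target.syncCommit(commit);    // only metadata is produced;
            }                                 // the Parquet data files are untouched
            target.complete();
        }
    }
}
```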
Apache XTable (Incubating)
Data Infrastructure and Analytics
Menlo Park, CA · 5,942 followers
Seamless cross-table interop between Apache Hudi, Delta Lake, and Apache Iceberg
About us
Apache XTable (Incubating) is a cross-table, omni-directional interoperability layer for the lakehouse table formats Apache Hudi, Apache Iceberg, and Delta Lake. XTable was formerly known as OneTable and was recently renamed. XTable is NOT a new or separate table format; it provides abstractions and tools for translating lakehouse table format metadata. Choosing a table format is a costly evaluation: each project has rich features that may fit different use cases, and some vendors use a table format as a point of lock-in. Your data should be UNIVERSAL! https://github.com/apache/incubator-xtable
- Website
- https://xtable.apache.org
- Industry
- Data Infrastructure and Analytics
- Company size
- 11-50 employees
- Headquarters
- Menlo Park, CA
- Type
- Partnership
- Founded
- 2023
- Specialties
- Data Lakehouse, Data Engineering, Lakehouse, Apache Iceberg, Apache Hudi, Delta Lake, Apache Spark, Trino, Apache Flink, and Presto
Locations
- Primary
- US, CA, Menlo Park, 94025
Updates
Matrix fans assemble! In The Matrix, The Architect is the creator of the Matrix itself. He represents cold, machine logic, determinism, and control, in contrast to The Oracle, who embodies intuition, free will, and chaos. This battle between rigid structures vs. adaptable intelligence mirrors the evolution of data architectures.

The Architect built the Matrix to be a perfectly controlled system, but the first versions failed because they didn't account for human (data) unpredictability. The Oracle saw this flaw and introduced an essential element: choice. More interestingly, the illusion of choice.

The same applies to traditional Lakehouse architectures, where a single table format (e.g., Iceberg, Delta, or Hudi) dictates the entire ecosystem.

In the Architect's vision of the Lakehouse:
- Everything follows strict rules.
- You're locked into one format, ensuring consistency.
- The system is optimized, but only for those willing to conform.

But what happens when business needs evolve? What if different teams need different tools that speak different table formats? Just like the first Matrix, this rigid system eventually fails because real-world use cases demand flexibility.

The future of the Lakehouse is not about choosing one format; it's about interoperability. The underlying data may be stored in a specific format, but end users are able to work with it through their preferred formats (the illusion of choice), where Iceberg, Delta, and Hudi coexist and Apache XTable acts as the Oracle, translating metadata across them. Instead of forcing teams to conform to a single format, XTable introduces interoperability without breaking transactional consistency.

In the Oracle's vision of the Lakehouse:
- No format lock-in: choose Iceberg, Delta, or Hudi without constraints.
- Seamless interoperability: read/write across multiple engines without costly conversions.
- Evolvability: the lakehouse adapts to business needs, not the other way around.

Apache XTable makes this possible by acting as a translator, not a disruptor. Instead of forcing a single "perfect" format (the Architect's mistake), it allows organizations to move fluidly between table formats.

1. XTable detects the table format (Iceberg, Delta, or Hudi).
2. Maps the metadata to an intermediate abstraction layer.
3. Exposes the table in a compatible way to query engines and apps.
4. Ensures transactional consistency across different formats.
5. Enables format-specific optimizations (e.g., partition pruning, snapshot isolation).

What's your take on this Open Table Format (OTF) Revolution?

#DataManagement #Lakehouse #DataArchitectures
Databases & data lakes SME Brenna Buuck explores Apache XTable (Incubating), an #opensource metadata translator that simplifies interoperability between the #opentable formats Apache Iceberg, Apache Hudi, and Delta Lake: what it is, how it works, use cases, limitations, and more. Check it out. https://lnkd.in/dyVymqeH
In Episode 2 of 'Apache XTable in the Lakehouse', we will be joined by Matthias Rudolph & Stephen Said from AWS for a show-and-tell session. They will cover:
1) Open Table Formats: Iceberg, Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
Join us! (A rough sketch of the Lambda pattern is included after the event link below.)
Apache XTable on AWS - Efficiently converting between Lakehouse Formats
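Ahead of the session, here is a rough, hypothetical sketch of the "XTable on Lambda" idea: a Lambda function that triggers a metadata sync for one table per invocation. The handler wiring uses the standard aws-lambda-java-core interfaces; the sync call itself is left as a stub, since how you invoke XTable (bundled utilities jar, library call) and how you register the result in the Glue Data Catalog will depend on your setup.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.Map;

public class XTableSyncHandler implements RequestHandler<Map<String, String>, String> {

    @Override
    public String handleRequest(Map<String, String> event, Context context) {
        // Assumed event shape: which table to sync and into which formats.
        String tableBasePath = event.get("tableBasePath");   // e.g. s3://bucket/warehouse/orders
        String sourceFormat  = event.get("sourceFormat");     // e.g. HUDI
        String targetFormats = event.get("targetFormats");    // e.g. ICEBERG,DELTA

        context.getLogger().log("Syncing " + tableBasePath
                + " from " + sourceFormat + " to " + targetFormats);

        runXTableSync(tableBasePath, sourceFormat, targetFormats);
        return "SYNC_COMPLETE";
    }

    // Placeholder: in practice this would invoke XTable's sync (run the bundled
    // utilities jar with a dataset config, or call its Java API) and then
    // register/refresh the resulting table metadata in the Glue Data Catalog.
    private void runXTableSync(String tableBasePath, String sourceFormat, String targetFormats) {
        // intentionally left as a stub for this sketch
    }
}
```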
Join us for Episode 2 of 'Apache XTable in the Lakehouse'! We will be joined by Matthias Rudolph & Stephen Said from Amazon Web Services (AWS) for a show-and-tell session. They will cover:
1) Open Table Formats: Apache Iceberg, Apache Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
11th March 2025 | 9 AM PT
https://lnkd.in/gWMUS8pX
Dremio published their 2025 State of Data survey results this week, with responses from data and technology leaders across industries. One of the questions asked was about adoption of the open table formats Delta Lake, Apache Hudi, and Apache Iceberg. It looks like Delta Lake comes out on top, and the numbers don't seem to match common social media narratives. Read the report in full here: https://lnkd.in/geCNH7bX #datalakehouse #deltalake #apachehudi #apacheiceberg
Very grateful for the opportunity to co-present with Microsoft and speak on Open Table Format Interoperability using Apache XTable (Incubating) at the Seattle Apache Iceberg Meetup. Interoperability is critical: it allows organizations to avoid lock-in, choose the best tools for their specific use case, and adapt more easily to emerging technologies. Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines. Thank you to Anand Sivaram and Ashvin A. for the opportunity to speak together, as well as to Kevin Liu, Tej Luthra, and Matthew Hicks for organizing this meetup. If you're interested in learning more about Apache XTable (Incubating), check out the talk between Onehouse and Microsoft below. https://lnkd.in/giuAFp8v We are looking for contributors to join XTable and help shape the future of open lakehouse formats!
Open Table Format Interoperability with Apache XTable
https://www.youtube.com/
Nice write-up by Brenna from MinIO on Apache XTable. It touches upon:
- Enhanced Interoperability
- Simplified Data Management
- Architectural Components
- Current Limitations
Link: https://lnkd.in/dyVymqeH #lakehouse #dataengineering
Excited to share some thoughts on Apache XTable (Incubating), a game-changer for data lakehouse interoperability! As organizations juggle Apache Hudi, Apache Iceberg, and Delta Lake, Apache XTable steps in as a cross-table converter, enabling seamless metadata translation without duplicating data.

Why it's useful:
- Flexibility: Write data in one format (e.g., Hudi for fast ingestion) and query it in another (e.g., Iceberg with Snowflake or Delta with Databricks Photon).
- Cost & Efficiency: Avoids expensive data rewrites by syncing only metadata, keeping storage lean and ensuring data consistency.
- Ecosystem Synergy: Bridges diverse tools and vendor preferences, making data accessible across platforms and teams.

How to leverage it:
- Migration Made Easy: Transition between table formats without disrupting pipelines; perfect for evolving data strategies.
- Real-Time Analytics: Pair it with high-throughput systems to unlock faster insights across formats.

A rough sketch of what a Hudi-to-Iceberg/Delta metadata sync could look like is shown below.

Read more in the documentation: https://xtable.apache.org/

Whether you're optimizing a lakehouse, unifying team workflows, or future-proofing your data stack, Apache XTable is worth exploring. What's your take on table format interoperability? Let's discuss!

#DataLakehouse #ApacheXTable #BigData #OpenSource #apachespark #aws #dataengineer #ApacheIceberg #hudi
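For the "write in Hudi, query as Iceberg or Delta" pattern above, a metadata sync might look roughly like the following Java driver. This is a sketch under assumptions: the dataset config keys and the --datasetConfig flag follow the pattern shown in the XTable documentation, but the jar name, bucket path, and table name here are placeholders to adapt to your own environment and XTable version.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class SyncHudiToIcebergAndDelta {
    public static void main(String[] args) throws Exception {
        // Dataset config (keys modeled on the XTable docs; verify against your version).
        String config = """
                sourceFormat: HUDI            # table was written by a Hudi pipeline
                targetFormats:
                  - ICEBERG                   # expose the same data to Iceberg readers
                  - DELTA                     # ...and to Delta readers
                datasets:
                  - tableBasePath: s3://my-bucket/warehouse/orders   # hypothetical path
                    tableName: orders
                """;
        Path configFile = Files.createTempFile("xtable-sync", ".yaml");
        Files.writeString(configFile, config);

        // Only metadata is generated under the table's base path; the Parquet data
        // files are never copied or rewritten.
        Process sync = new ProcessBuilder(
                "java", "-jar", "xtable-utilities-bundled.jar",   // jar name is illustrative
                "--datasetConfig", configFile.toString())
                .inheritIO()
                .start();
        System.exit(sync.waitFor());
    }
}
```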
Join the AWS Team next week for Episode 2 of 'Apache XTable in the Lakehouse'!
Agenda:
1) Open Table Formats: Iceberg, Hudi & Delta Lake
2) Apache XTable
3) How to run XTable using AWS Lambda & Glue
4) The XTable-Lambda architecture
5) Other use cases (e.g. MWAA)
11th March | 9 AM PT
Join us! Link: https://lnkd.in/gWMUS8pX