Apache XTable (Incubating)

Data Infrastructure and Analytics

Menlo Park, CA, 5,942 followers

Seamless cross-table interop between Apache Hudi, Delta Lake, and Apache Iceberg

About us

Apache XTable (Incubating) provides cross-table, omni-directional interoperability between the lakehouse table formats Apache Hudi, Apache Iceberg, and Delta Lake. XTable was formerly known as OneTable and was recently renamed. XTable is NOT a new or separate format; it provides abstractions and tools for translating lakehouse table format metadata. Choosing a table format is a costly evaluation. Each project has rich features that may fit different use cases. Some vendors use a table format as a point of lock-in. Your data should be UNIVERSAL! https://github.com/apache/incubator-xtable

Website
https://xtable.apache.org
Industry
Data Infrastructure and Analytics
Company size
11-50 employees
Headquarters
Menlo Park, CA
Type
Partnership
Founded
2023
Specialties
Data Lakehouse, Data Engineering, Lakehouse, Apache Iceberg, Apache Hudi, Delta Lake, Apache Spark, Trino, Apache Flink, and Presto

Updates

  • Apache XTable (Incubating) reposted

    Dipankar Mazumdar, M.Sc

    Staff Data Engineer Advocate @Onehouse.ai | Apache Hudi, Iceberg Contributor | Author of "Engineering Lakehouses"

    Apache XTable Architecture - Lakehouse interoperability.

    XTable is an omni-directional translation layer on top of open table formats such as Apache Hudi, Apache Iceberg & Delta Lake. It is NOT a new table format!

    Essentially what we are doing is this:

    SOURCE ---> (read metadata) ---> XTable's Model ---> write into TARGET

    We read the metadata from the SOURCE table format, put it into a unified representation & write the metadata in the TARGET format. Note that with XTable we are only touching metadata at this stage, not the actual data files (such as #Parquet).

    I have been asked a lot about the inner workings of Apache XTable (Incubating). Let's break this down.

    XTable’s architecture consists of three key components:

    1. Conversion Source:
       • These are table-format-specific modules responsible for reading metadata from the source
       • They extract information like schema, transactions, partitions & translate it into XTable’s unified internal representation
    2. Conversion Logic:
       • This is the central processing unit of XTable
       • It orchestrates the entire translation process, including initializing all components, managing sources and targets, among other critical things
    3. Conversion Target:
       • These mirror the source readers
       • They take the internal representation of the metadata & map it to the target format’s metadata structure

    Interested in a detailed read? Paper link in comments.

    #dataengineering #softwareengineering
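
The source, logic, and target split described in this post can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `InternalTable`, `ToyHudiSource`, `ToyDeltaTarget`, `ToyIcebergTarget`, `convert`, and the dictionary layouts are all hypothetical names invented for illustration. They are not XTable's actual classes or the real on-disk metadata of any format. The point is just to show metadata being read once, held in a unified representation, and written out per target while the data file paths stay untouched.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class InternalTable:
    """Hypothetical stand-in for a unified internal representation."""
    name: str
    schema: Dict[str, str]            # column name -> type name
    partition_fields: Tuple[str, ...]
    data_files: Tuple[str, ...]       # paths of existing Parquet files


class ConversionSource(Protocol):
    def read_table(self) -> InternalTable: ...


class ConversionTarget(Protocol):
    def write_metadata(self, table: InternalTable) -> dict: ...


class ToyHudiSource:
    """Reads a made-up, in-memory 'Hudi-like' metadata dict."""

    def __init__(self, metadata: dict) -> None:
        self.metadata = metadata

    def read_table(self) -> InternalTable:
        m = self.metadata
        return InternalTable(
            name=m["table"],
            schema=dict(m["fields"]),
            partition_fields=tuple(m["partition_by"]),
            data_files=tuple(m["base_files"]),
        )


class ToyDeltaTarget:
    """Emits a made-up 'Delta-like' dict; not the real Delta log layout."""

    def write_metadata(self, table: InternalTable) -> dict:
        return {
            "format": "delta",
            "name": table.name,
            "partitionColumns": list(table.partition_fields),
            "add": [{"path": p} for p in table.data_files],
        }


class ToyIcebergTarget:
    """Emits a made-up 'Iceberg-like' dict; not the real Iceberg layout."""

    def write_metadata(self, table: InternalTable) -> dict:
        return {
            "format": "iceberg",
            "schema": table.schema,
            "partition-spec": list(table.partition_fields),
            "manifest-entries": list(table.data_files),
        }


def convert(source: ConversionSource, targets: List[ConversionTarget]) -> List[dict]:
    """Conversion logic: read the source once, then fan out to every target."""
    table = source.read_table()
    return [target.write_metadata(table) for target in targets]


hudi_metadata = {
    "table": "trips",
    "fields": {"id": "long", "city": "string"},
    "partition_by": ["city"],
    "base_files": ["city=sf/part-0001.parquet", "city=nyc/part-0002.parquet"],
}
delta_meta, iceberg_meta = convert(
    ToyHudiSource(hudi_metadata), [ToyDeltaTarget(), ToyIcebergTarget()]
)

# Both targets point at the same data files; nothing was copied or rewritten.
print([a["path"] for a in delta_meta["add"]] == iceberg_meta["manifest-entries"])
```

Keeping the orchestration in `convert` separate from the readers and writers means a new format in this sketch only needs one new source or target class.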

  • Apache XTable (Incubating) reposted

    Animesh Kumar

    CTO | DataOS: Data Products in 6 Weeks

    Matrix fans assemble!

    In The Matrix, The Architect is the creator of the Matrix itself. He represents cold, machine logic, determinism, and control, in contrast to The Oracle, who embodies intuition, free will, and chaos. This battle between rigid structures and adaptable intelligence mirrors the evolution of data architectures.

    The Architect built the Matrix to be a perfectly controlled system, but the first versions failed because they didn’t account for human (data) unpredictability. The Oracle saw this flaw and introduced an essential element: choice. More interestingly, the illusion of choice.

    The same applies to traditional Lakehouse architectures, where a single table format (e.g., Iceberg, Delta, or Hudi) dictates the entire ecosystem.

    In the Architect’s vision of the Lakehouse:
    • Everything follows strict rules.
    • You’re locked into one format, ensuring consistency.
    • The system is optimized, but only for those willing to conform.

    But what happens when business needs evolve? What if different teams need different tools that speak different table formats? Just like the first Matrix, this rigid system eventually fails because real-world use cases demand flexibility.

    The future of the Lakehouse is not about choosing one format; it’s about interoperability. The underlying data may be in a specific format, but end users are able to work with it through their preferred formats (illusion of choice). It is a future where formats like Iceberg, Delta, and Hudi coexist and Apache XTable acts as the Oracle, translating metadata across them. Instead of forcing teams to conform to a single format, XTable introduces interoperability without breaking transactional consistency.

    In the Oracle’s vision of the Lakehouse:
    • No format lock-in: choose Iceberg, Delta, or Hudi without constraints.
    • Seamless interoperability: read/write across multiple engines without costly conversions.
    • Evolvability: the lakehouse adapts to business needs, not the other way around.

    Apache XTable makes this possible by acting as a translator, not a disruptor. Instead of forcing a single “perfect” format (the Architect’s mistake), it allows organizations to move fluidly between table formats.

    1. XTable detects the table format (Iceberg, Delta, or Hudi).
    2. Maps the metadata to an intermediate abstraction layer.
    3. Exposes the table in a compatible way to query engines and apps.
    4. Ensures transactional consistency across different formats.
    5. Enables format-specific optimizations (e.g., partition pruning, snapshot isolation).

    What's your take on this Open Table Format (OTF) Revolution?

    #DataManagement #Lakehouse #DataArchitectures
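
To make the "detects the table format" step in this post concrete, here is a minimal sketch that inspects a table's base directory for each format's conventional metadata location: `.hoodie` for Hudi, `_delta_log` for Delta Lake, and a `metadata` directory holding `*.metadata.json` files for Iceberg. The function name `detect_formats` and the heuristic itself are assumptions made for illustration. This is not XTable's actual logic, and it only handles local paths.

```python
import tempfile
from pathlib import Path
from typing import List


def detect_formats(base_path: str) -> List[str]:
    """Return every table format whose metadata is found under base_path.

    Hypothetical heuristic based on conventional metadata directory names.
    """
    root = Path(base_path)
    found = []
    if (root / ".hoodie").is_dir():
        found.append("HUDI")
    if (root / "_delta_log").is_dir():
        found.append("DELTA")
    iceberg_dir = root / "metadata"
    # A bare "metadata" folder is too generic, so also require a metadata file.
    if iceberg_dir.is_dir() and any(iceberg_dir.glob("*.metadata.json")):
        found.append("ICEBERG")
    return found


# Demo on a throwaway directory that carries metadata for two formats at once.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".hoodie").mkdir()
    (Path(tmp) / "_delta_log").mkdir()
    print(detect_formats(tmp))
```

The function returns a list rather than a single value because a table whose metadata has been translated can carry metadata for several formats side by side over the same data files.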

    • The Lakehouse Vision: The Oracle (Interoperability) vs The Architect (Conformity)
  • Apache XTable (Incubating) reposted

    MinIO

    27,126 followers

    Databases & datalakes SME Brenna Buuck explores Apache XTable (Incubating), an #opensource metadata translator that simplifies interoperability between the #opentable formats: Apache Iceberg, Apache Hudi, and Delta Lake—what it is, how it works, use cases, limitations and more. Check it out. https://lnkd.in/dyVymqeH

  • Apache XTable (Incubating)

    5,942 followers

    In Episode 2 of 'Apache XTable in the Lakehouse', we will be joined by Matthias Rudolph & Stephen Said from AWS for a show & tell session. They will go over: 1) Open Table Formats - Iceberg, Hudi & Delta Lake 2) Apache XTable 3) How to run XTable using AWS Lambda & Glue 4) The XTable-Lambda architecture 5) Other use cases (e.g. MWAA) Join us!

    Apache XTable on AWS - Efficiently converting between Lakehouse Formats



  • Join us for Episode 2 of 'Apache XTable in the Lakehouse'. We will be joined by Matthias Rudolph & Stephen Said from Amazon Web Services (AWS) for a show & tell session. They will go over: 1) Open Table Formats - Apache Iceberg, Apache Hudi & Delta Lake 2) Apache XTable 3) How to run XTable using AWS Lambda & Glue 4) The XTable-Lambda architecture 5) Other use cases (e.g. MWAA) 11th March 2025 | 9 AM PT https://lnkd.in/gWMUS8pX

  • Apache XTable (Incubating) reposted

    Kyle Weller

    VP of Product @ Onehouse.ai | ex Azure Databricks

    This week Dremio published its 2025 State of Data survey results, with responses from data and technology leaders across industries. One of the questions asked about adoption of the open table formats Delta Lake, Apache Hudi, and Apache Iceberg. It looks like Delta Lake comes out on top? The numbers don't seem to match common social media narratives? Read the report in full here: https://lnkd.in/geCNH7bX #datalakehouse #deltalake #apachehudi #apacheiceberg

  • Apache XTable (Incubating) reposted

    Rahil Chertara

    Software Engineer @ Onehouse | Ex-AWS

    Very grateful for the opportunity to co-present with Microsoft and speak on Open Table Format Interoperability using Apache XTable (Incubating) at the Seattle Apache Iceberg Meetup. Interoperability is critical, as it allows organizations to avoid lock-in, choose the best tools for their specific use case, and adapt more easily to emerging technologies. Apache XTable is a cross-table converter for Lakehouse table formats that facilitates interoperability across data processing systems and query engines. Thank you to Anand Sivaram and Ashvin A. for the opportunity to speak together, as well as to Kevin Liu, Tej Luthra, and Matthew Hicks for organizing this meetup. If you're interested in learning more about Apache XTable (Incubating), check out the talk between Onehouse and Microsoft below. https://lnkd.in/giuAFp8v We are looking for contributors to join XTable and help shape the future of open lakehouse formats!

  • Apache XTable (Incubating) reposted

    Mahir Jayswal

    Immediate joiner || AWS Data engineer || 5X AWS || snowflake || Databricks || Lambda || Glue || Pyspark || SQL || Data warehousing || GenAI

    Excited to share some thoughts on Apache XTable (Incubating), a game-changer for data lakehouse interoperability! As organizations juggle Apache Hudi, Apache Iceberg, and Delta Lake, Apache XTable steps in as a cross-table converter, enabling seamless metadata translation without duplicating data.

    Why it’s useful:
    - Write data in one format (e.g., Hudi for fast ingestion) and query it in another (e.g., Iceberg with Snowflake or Delta with Databricks Photon).
    - Avoid expensive data rewrites by syncing only metadata, keeping storage lean and ensuring data consistency.
    - Bridge diverse tools and vendor preferences, making data accessible across platforms and teams.

    How to leverage it:
    - Transition between table formats without disrupting pipelines, which suits evolving data strategies.
    - Pair it with high-throughput systems to unlock faster insights across formats.

    Read more in the documentation: https://xtable.apache.org/

    Whether you’re optimizing a lakehouse, unifying team workflows, or future-proofing your data stack, Apache XTable is worth exploring. What’s your take on table format interoperability? Let’s discuss!

    #DataLakehouse #ApacheXTable #BigData #OpenSource #apachespark #aws #dataengineer #ApacheIceberg #hudi
