Lately, we have heard a lot of people conflating open table formats (Apache Hudi, Apache Iceberg, Delta Lake) with the #datalakehouse architecture. Formats matter, but there is much more to a lakehouse. Dipankar Mazumdar, M.Sc literally wrote the book “Engineering Lakehouses with Open Table Formats,” so we are excited to hear his take on all the necessary components of a data lakehouse. Join us for this upcoming webinar. #dataengineering #deltalake #apachehudi #apacheiceberg
About us
Onehouse, the pioneer in open data lakehouse technology, empowers enterprises to deploy and manage a world-class data lakehouse in minutes on Apache Hudi, Apache Iceberg, and Delta Lake. Delivered as a fully-managed cloud service in your VPC, Onehouse offers high-performance ingestion pipelines for minute-level freshness and optimizes tables for maximum query performance. Thanks to its truly open data architecture, Onehouse eliminates data format, table format, compute and catalog lock-ins, guarantees interoperability with virtually any warehouse/data processing engine, and ensures exceptional ELT and query performance for all your workloads. Companies worldwide rely on Onehouse to power their analytics, reporting, data science, machine learning, and GenAI use cases from a single, unified source of data. Built on Apache Hudi and Apache XTable (Incubating), Onehouse features advanced capabilities such as indexing, ACID transactions, and time travel, ensuring consistent data across all downstream query engines and tools. The platform’s unique incremental processing capabilities deliver unmatched ELT cost and performance by minimizing data movement and optimizing resource usage. With 24/7 reliability, immediate cost savings, and open access for all major tools and query engines, benefit from Onehouse's #nolockin philosophy to future-proof any stack.
- Website: https://onehouse.ai
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: Sunnyvale, California
- Type: Privately Held
- Founded: 2021
Locations
- Primary: 150 Mathilda Place, Suite 106, Sunnyvale, California 94086, US
Updates
-
We love all the conversations about open #datalakehouse table formats! But a lakehouse is about more than Apache Hudi, Apache Iceberg or Delta Lake. Now it's time to talk about the clouds, catalogs, engines and more that you need to support. Dipankar Mazumdar, M.Sc breaks down all the necessary components of a data lakehouse tomorrow in this webinar. https://lnkd.in/djHDpm_p
-
Onehouse reposted this
Blogs on Lakehouse Architecture. In the past couple of months I have published numerous blogs about the lakehouse architecture, covering the open table formats Apache Hudi, Apache Iceberg, and Delta Lake! These blogs go over the architectural details and nuances and help you understand some of the real-world problems you can run into in a lakehouse. Here are the 8 recent ones:
- Data Deduplication Strategies in an Open Lakehouse Architecture - https://lnkd.in/gyTRYasQ
- What is Apache Arrow Flight, Flight SQL & ADBC? - https://lnkd.in/gxchSSpx
- ACID Transactions in an Open Data Lakehouse - https://lnkd.in/g6BM3Fxt
- What is Clustering in an Open Data Lakehouse? - https://lnkd.in/dvXsxNab
- How to Optimize Performance for Your Open Data Lakehouse? - https://lnkd.in/dY9k9SUp
- Open Table Formats and the Open Data Lakehouse, In Perspective - https://lnkd.in/dhDC_hNP
- Run Apache XTable in AWS Lambda for background conversion of open table formats - https://lnkd.in/dSrgCCFt
- Concurrency Control in Open Data Lakehouse - https://lnkd.in/dkbWyrFi
If you find them insightful, bookmark and share. Happy reading! #dataengineering #softwareengineering
-
At Onehouse, we know the value of real-time insights. That’s why we’re excited to be a launch partner for the GA of Confluent Tableflow, turning Kafka topics into open lakehouse tables instantly accessible for analytics with any open engine. Unlock the power of Tableflow + Onehouse: https://lnkd.in/gjmupDjE And watch the demo: https://lnkd.in/duRztzSY
-
You’ve settled on Apache Hudi, Apache Iceberg or Delta Lake for your open table format, the foundation of your #datalakehouse. But what clouds, catalogs, engines and more do you need to consider? Dipankar Mazumdar, M.Sc breaks down all the necessary components of a data lakehouse in this upcoming webinar. #dataengineering #apachehudi #apacheiceberg #deltalake
Table Format != Data Lakehouse: Breaking Down the Lakehouse Components
-
Data duplication can be a silent killer in data pipelines: it drives up storage costs, hurts query performance, and compromises data integrity. But how do you deal with duplicates in an open lakehouse? In this blog, we explore:
- How duplication creeps into data pipelines, from ingestion to storage merging.
- The challenges of deduplication in streaming, batch, and multi-source integrations.
- How Apache Hudi provides built-in deduplication strategies at multiple stages.
- How Apache Iceberg and Delta Lake approach deduplication.
- A hands-on example demonstrating record merging and deduplication with Apache Hudi.
LINK: https://lnkd.in/eXEBbxnR #dataengineering #datalakehouse #dataarchitecture
-
Onehouse reposted this
The Format Wars Are Over. One data lake. Multiple table formats. Zero copies.
- Streaming teams write in Apache Hudi
- ML teams access in Delta Lake
- Analytics teams use Apache Iceberg
All from a single source of truth. No data duplication needed. See how: https://lnkd.in/gCYzcMGi
-
Excited to hear from Koti Darla at Southwest Airlines next week on the Apache Hudi Community Sync. Join us to learn the lessons Koti and his team learned during their data architecture transformation. #dataengineering #datalakehouse https://lnkd.in/gFpRRW_E
Upcoming Apache Hudi Community Sync. Join Southwest Airlines' Tech Lead Data Engineer, Koti Darla, for an in-depth exploration of their data architecture transformation. In this session, you'll get an inside look at Southwest Airlines' journey from older systems to a modern, high-performance framework powered by Apache Hudi. Koti will break down key performance challenges, share real-world insights, and highlight the significant gains achieved through this transition. 19th March 2025 | 9 AM PT. https://lnkd.in/dj5q3qmx #dataengineering #softwareengineering
-
Onehouse reposted this
This week Dremio published its 2025 State of Data survey results, with responses from data and technology leaders across industries. One of the questions asked about adoption of the open table formats Delta Lake, Apache Hudi, and Apache Iceberg. It looks like Delta Lake comes out on top? The numbers don't seem to match common social media narratives. Read the report in full here: https://lnkd.in/geCNH7bX #datalakehouse #deltalake #apachehudi #apacheiceberg