How Companies Are Implementing a "DATA LAKE"
The era of data processing applications started in 1959 with the launch of COBOL (Common Business-Oriented Language) and evolved further thanks to the RDBMS (Relational Database Management System). The entire world adopted RDBMSs and then started facing challenges with "Big Data".
Developer communities credit Google for publishing its white papers on the "Google File System" and the MapReduce framework/programming model to tackle the Big Data problem. The world adopted the same strategy and created the open-source project "Hadoop". Hadoop was the first attempt at a Data Lake implementation. Hadoop is also slowly losing its charm, and the world is moving towards the cloud. Modern data management has many components now, and I would like to discuss how companies are leveraging them.
What is a DATA LAKE?
A Data Lake is a central repository that can store structured, unstructured and semi-structured data in its raw format and process it at scale.
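To make "raw format" concrete, here is a minimal sketch in Python of landing the three kinds of data in one repository without transforming them. Everything in it is an assumption for illustration: a local temporary folder stands in for the lake's storage, and the `raw/<source>/` layout, the `land_raw` helper and the sample payloads are all hypothetical. It shows only the storage side, not processing at scale.

```python
import tempfile
from pathlib import Path


def land_raw(lake_root: Path, source: str, filename: str, payload: bytes) -> Path:
    """Write the payload byte-for-byte under raw/<source>/ (hypothetical layout)."""
    target_dir = lake_root / "raw" / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)  # no parsing, no schema: the data stays raw
    return target


def demo() -> dict:
    """Land one made-up sample of each data kind and return the written paths."""
    # A temporary local folder stands in for the lake's storage (assumption).
    lake_root = Path(tempfile.mkdtemp(prefix="lake_"))
    samples = {
        "structured": ("orders.csv", b"order_id,amount\n1,9.99\n2,24.50\n"),
        "semi_structured": ("clicks.json", b'{"user": "u1", "page": "/home"}\n'),
        "unstructured": ("support_note.txt", b"Customer called about a late delivery."),
    }
    landed = {}
    for source, (filename, payload) in samples.items():
        landed[source] = land_raw(lake_root, source, filename, payload)
    return landed


if __name__ == "__main__":
    for kind, path in demo().items():
        print(kind, "->", path)
```

The point of the sketch is that the repository does not care about structure at write time; any schema is applied later, when the data is read and processed.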
How was the DATA LAKE initially imagined?
However, Data Lake technology misses two critical features:
So, Data Lakes started integrating RDBMSs/data warehouses for reporting and BI purposes. ML/AI workloads run on the Data Lake, but BI/reporting shifted to warehouses. This is the current Data Lake architecture for many organizations.
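The lake-plus-warehouse split described above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, a Python list of JSON strings stands in for raw files in the lake, and the `orders` table, the function names and the sample events are all made up.

```python
import json
import sqlite3

# Raw, semi-structured events as they might sit in the lake (made-up data).
RAW_EVENTS = [
    '{"order_id": 1, "region": "EU", "amount": 10.0}',
    '{"order_id": 2, "region": "US", "amount": 25.5}',
    '{"order_id": 3, "region": "EU", "amount": 4.5, "coupon": "SPRING"}',
]


def load_into_warehouse(raw_lines, conn):
    """Parse raw JSON lines and load only the reporting columns into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    rows = []
    for line in raw_lines:
        event = json.loads(line)
        # Extra fields such as "coupon" stay in the lake; the warehouse
        # receives a fixed, structured subset.
        rows.append((event["order_id"], event["region"], event["amount"]))
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def revenue_by_region(conn):
    """A typical BI-style aggregate, run against the warehouse, not the lake."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    load_into_warehouse(RAW_EVENTS, warehouse)
    print(revenue_by_region(warehouse))
```

The raw events keep every field for ML/AI work, while the reporting query only ever sees the curated table, which mirrors the split between lake and warehouse.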
How Did the Data Lake Transition Happen?
Recently, Hadoop as a platform has lost its excitement, while cloud infrastructure has become more economical and is seeing wider development. Present-day data lake implementations include many cloud components.
#dataengineering #datalake #Spark
Co-founder @Streambased
5 months ago
Great write-up! A new trend is emerging around the convergence of operational and analytical systems via Kafka, effectively turning it into a "streaming datalake." With solutions like Confluent's Tableflow and Streambased, you can now query streaming data directly at the source, bypassing complex and costly ETL/ELT processes. This approach not only ensures total consistency but also unlocks a much greater volume of data for prediction and advanced analytics.
Data Scientist | Senior Recruiter
8 months ago
Hi Suresh, we have a community of Power BI, Tableau, and other BI technology professionals. If interested, you can join our group and share your experience with us. https://www.dhirubhai.net/groups/8164518/