
Lambda Architecture - What is the Buzz?

Abhishek Srivastava

Data, AI / ML, Cloud, Enterprise Architecture Leader & Innovator | IIT, Mccombs

Published: June 9, 2016

Nathan Marz, who created Apache Storm, coined the term Lambda Architecture (LA). Although there is nothing Greek about it, I think it is called so primarily because of its shape. It is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. LA is an approach to building stream processing applications on top of MapReduce, Storm, or similar systems. It has become popular in the big data space at companies such as LinkedIn, Twitter, and Amazon.

The Lambda Architecture pattern addresses the problem of speed on big data. It suits applications where there is a delay between data collection and its availability through dashboards, and where older data sets must remain valid for online processing so that behavioral patterns can be found as users need them. One basic requirement for LA is an immutable data store, which appends new data instead of applying the update and delete operations of CRUD. The downside is that batch processing over this store is not real time. Although batch processing will get faster over time, the volume of data grows at the same pace, if not faster. Applications in the BI or delivery layer expect to access data in real time and cannot rely entirely on batch processing to finish.
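To make the append-only idea concrete, here is a minimal Python sketch. The names `Event`, `AppendOnlyLog`, and `current_value` are hypothetical and chosen only for illustration; a real system would keep the log in a distributed store rather than an in-memory list. A change is recorded as a new fact, and the current state is derived by reading the log.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a recorded fact cannot be modified afterwards
class Event:
    user_id: str
    field: str
    value: str
    timestamp: int


class AppendOnlyLog:
    """Hypothetical in-memory stand-in for an immutable data store."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        # The only write operation: no update, no delete.
        self._events.append(event)

    def events(self) -> Tuple[Event, ...]:
        # Return a read-only snapshot of everything recorded so far.
        return tuple(self._events)


def current_value(log: AppendOnlyLog, user_id: str, field: str) -> Optional[str]:
    """Derive the current state by scanning the log for the latest matching fact."""
    latest: Optional[Event] = None
    for event in log.events():
        if event.user_id == user_id and event.field == field:
            if latest is None or event.timestamp >= latest.timestamp:
                latest = event
    return latest.value if latest is not None else None


log = AppendOnlyLog()
log.append(Event("u1", "city", "Austin", timestamp=1))
log.append(Event("u1", "city", "Dallas", timestamp=2))  # a "change" is a new fact
print(current_value(log, "u1", "city"))
print(len(log.events()))
```

Both facts remain in the log, so the history stays available for reprocessing. The full scan in `current_value` also hints at why answering questions straight from the raw log is slow.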

The way it works is that an immutable sequence of records is captured and fed into the batch system and the stream processing system in parallel. The transformation logic is implemented twice, once in the batch system and once in the stream processing system. At query time, the results from both systems are stitched together to present the final answer.
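The flow described above can be sketched in a few lines of Python. This is a toy, single-process illustration under stated assumptions, not a production design: `LambdaPipeline`, `count_views`, and the page-view records are hypothetical, and one shared function stands in for transformation logic that a real deployment would typically implement separately in the batch framework and in the stream framework.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Record = Tuple[int, str]  # (timestamp, page) -- a hypothetical page-view record


def count_views(records: Iterable[Record]) -> Counter:
    """The transformation logic: count views per page."""
    return Counter(page for _, page in records)


class LambdaPipeline:
    def __init__(self) -> None:
        self.master_log: List[Record] = []  # immutable, append-only input
        self.batch_view: Counter = Counter()  # complete but stale
        self.speed_view: Counter = Counter()  # fresh but covers recent data only

    def ingest(self, record: Record) -> None:
        # Each record goes to both paths: the log read by batch jobs
        # and the incrementally updated real-time view.
        self.master_log.append(record)
        self.speed_view.update(count_views([record]))

    def run_batch(self) -> None:
        # Recompute the batch view from the entire log, then discard the
        # real-time results that the new batch view now covers.
        self.batch_view = count_views(self.master_log)
        self.speed_view = Counter()

    def query(self, page: str) -> int:
        # Stitch both views together at query time.
        return self.batch_view[page] + self.speed_view[page]


pipeline = LambdaPipeline()
for record in [(1, "home"), (2, "home"), (3, "about")]:
    pipeline.ingest(record)

before_batch = pipeline.query("home")  # answered from the speed view alone
pipeline.run_batch()
pipeline.ingest((4, "home"))  # arrives after the batch run
after_batch = pipeline.query("home")  # batch view plus speed view
print(before_batch, after_batch)
```

In this sketch the batch run and the reset of the speed view happen in one step. In a real system the batch job takes a long time, so the speed view has to keep covering everything that arrived after the batch job's input snapshot.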

So why is there so much buzz about Lambda Architecture these days? Well, the reason is most likely that growing complexity in the data space and rising business expectations for quick data insights create a need for low-latency processing systems. What we have at our disposal is a scalable, high-latency batch system that can process historical data and a low-latency stream processing system that can process recent data. By merging the two, we can build a workable solution.

Amit Pandey

Professor CSE

8 years ago

Thanks for the article.

Mayank Srivastava

Driving Digital Transformation, Product Innovation, and Operational Excellence | Senior Product, Technology, and Operations Leader | Financial Services and FinTech Expert

8 years ago

Thanks for writing this article. Very informative.

