Consumption Layer
Mustafa Qizilbash
Open for New Opportunities (Globally) | Author & Podcaster of “Let’s Talk About Data!” | Data & AI Practitioner & CDMP Certified | Innovator of DAC Architecture & PVP Approach
The Consumption Layer is often confused with the Serving Layer because of its name. People assume it is the layer that consumers use to extract data. The layer that consumers actually use to extract information is the Serving Layer (explained separately).
Let’s decode it...
The Consumption Layer, as its name suggests, is a layer that consumes, or takes in, data from source systems. Today, huge volumes of data arrive every second from sources such as social media, CCTV, images, audio, and sensors, so technology was needed that can consume data at a fast pace in MPP (massively parallel processing) mode.
Tools such as Hadoop, object storage, and many RDBMSs now offer MPP features.
In Lambda Architecture, the Serving Layer is the last layer; the Consumption Layer is the layer that processes data from the source up to the Serving Layer.
In the image, the Batch and Speed Layers are Consumption Layers. In an OLAP system that includes OLAP cubes and semantic layers, the consumption layer is the Data Warehouse and Data Mart; within a Data Warehouse, however, the consumption layer is the Landing and Staging zones.
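To make the Lambda picture concrete, here is a minimal sketch, assuming page-view events held in plain Python lists. The function names (`batch_layer`, `speed_layer`, `serving_layer`) and the sample events are hypothetical and only illustrate how the two consumption layers feed the serving layer; real systems would use distributed engines rather than in-memory lists.

```python
from collections import Counter

def batch_layer(historical_events):
    """Recompute a complete view from all historical events (slow, thorough)."""
    return Counter(event["page"] for event in historical_events)

def speed_layer(recent_events):
    """Build an incremental view from events the batch run has not seen yet."""
    return Counter(event["page"] for event in recent_events)

def serving_layer(batch_view, speed_view):
    """Merge both views so consumers can query a single result."""
    return dict(batch_view + speed_view)

# Hypothetical sample data
historical = [{"page": "home"}, {"page": "home"}, {"page": "pricing"}]
recent = [{"page": "home"}, {"page": "docs"}]

merged = serving_layer(batch_layer(historical), speed_layer(recent))
print(merged)
```

Both `batch_layer` and `speed_layer` take data in from the source side, which is why they count as consumption layers here; only `serving_layer` is what a consumer would query.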
Different Zones
· Landing Zone is where the source data model is copied on an as-is basis.
· Staging Zone is where the data model remains the same but basic technical data quality rules are applied, for example: convert all date formats to one date format, convert null values to a default value, and convert a field's data type to numeric when its data is in number format. Please note: from the Raw to the Processed layer, these are technical data quality rules, not business data quality rules.
· Transformed Zone is where the data model is changed as per business requirements, i.e., in the Data Warehouse, Data Mart, etc.
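The Staging Zone rules above can be sketched in a few lines. This is only an illustration under stated assumptions: landing rows arrive as dictionaries of strings or `None`, date columns are recognizable by a `_date` suffix, and the accepted date formats and the `UNKNOWN` default are hypothetical choices, not something prescribed by any tool.

```python
from datetime import datetime

# Assumed source date formats and default value (hypothetical choices)
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")
DEFAULT_TEXT = "UNKNOWN"

def normalize_date(value):
    """Convert any accepted date format to a single ISO format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")

def to_staging(landing_row):
    """Apply technical rules only; the column names (data model) stay the same."""
    staged = {}
    for column, value in landing_row.items():
        if value is None or value == "":
            staged[column] = DEFAULT_TEXT           # null -> default value
        elif column.endswith("_date"):
            staged[column] = normalize_date(value)  # one date format
        elif value.replace(".", "", 1).isdigit():
            # number stored as text -> numeric type
            staged[column] = float(value) if "." in value else int(value)
        else:
            staged[column] = value
    return staged

# Hypothetical landing row, copied as-is from the source
landing_row = {"customer": "Acme", "order_date": "25/12/2023",
               "amount": "199.50", "region": None}
staged_row = to_staging(landing_row)
print(staged_row)
```

Note that `to_staging` never renames, drops, or derives columns. Changing the model itself would belong to the Transformed Zone.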
For real-time processing, Kafka is one of the most widely used consumption tools. CDC (change data capture) tools are also becoming very popular in Big Data, where data volumes are huge and one expects to pull only the changed data from the source rather than a full dump every time data flows in. Spark, Hadoop, and object storage are a few other technologies known for MPP consumption layers.
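The idea of pulling only changed data can be illustrated with a simplified watermark-based incremental pull. This is not how log-based CDC tools work internally (they typically read the database's transaction log); it is a minimal sketch that assumes each source row carries an `updated_at` value, and the function name `pull_changes` and the sample rows are hypothetical.

```python
def pull_changes(source_rows, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark."""
    changed = [row for row in source_rows if row["updated_at"] > last_watermark]
    # Keep the old watermark when nothing changed
    new_watermark = max((row["updated_at"] for row in changed),
                        default=last_watermark)
    return changed, new_watermark

# Hypothetical source table; updated_at is an increasing timestamp-like number
source = [
    {"id": 1, "name": "alpha", "updated_at": 100},
    {"id": 2, "name": "beta", "updated_at": 105},
    {"id": 3, "name": "gamma", "updated_at": 112},
]

# First run: only rows changed after the previous watermark are pulled
changed, watermark = pull_changes(source, last_watermark=104)

# Second run with no new changes: nothing is pulled, watermark is unchanged
changed_again, watermark_again = pull_changes(source, watermark)
print(len(changed), watermark, len(changed_again))
```

Storing the watermark between runs is what avoids re-reading the whole dump. One known limitation of this simplified approach is that it does not see deleted rows, which is one reason log-based CDC is often preferred.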
Cheers.