DATA KA GHAR (House of DATA)
Nitin Kumar Gaur
GCP Data Engineer Consultant | 2xGCP-Certified Professional Data Engineer | CI/CD | DataFlow | Terraform | Big Query | Apache Airflow | SQL | Python | Hadoop | Big Data | 5★ SQL, Python HackerRank
I was always a bit confused about the difference between a data warehouse, a data lake and a database. At the start I only knew about the database, where we can store data and access it very quickly as required.
Then I heard that there is also the data warehouse, where we can store data as well (that was my initial understanding of a data warehouse).
And then came the data lake, about which I only knew that it stores anything without any specific structure. In other words, bring whatever you have, in whatever shape, and we can store it here.
But that was just the beginning of exploring "DATA KA GHAR".
Then, while working with BigQuery, which is the data warehouse on GCP, I wondered: storing and accessing happen here too, so why is it called a warehouse, and why can't we call it a database? Now I can say that a data warehouse is not just a storage place. After all, if we only want to store data, the data lake is there for that.
So here is the explanation... :)
DATA LAKE, DATA WAREHOUSE, DATABASE
1...:> Data Lake (the biggest house, for every kind of data)
Data lake: from the start, my understanding was correct only for the data lake. A data lake is where you store all types of data:
structured (like relational tables),
semi-structured (like JSON or XML files),
or unstructured data (like text documents or images).
Data lakes are often used to store large volumes of data, including big data.
The bottom line is that a data lake stores every type of data, but it imposes no specific structure on how the data is stored. It simply takes the data and keeps it as-is, in raw form. Remember, data lakes are designed to store vast amounts of raw, diverse, and unprocessed data, providing flexibility and scalability for data exploration, analysis, and advanced analytics.
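To make the "store anything, as-is" idea concrete, here is a minimal sketch in Python. It assumes a local folder stands in for the lake (a real data lake would usually sit on object storage), and the helper names `put_in_lake` and `build_demo_lake`, the file paths and the sample values are all made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def put_in_lake(lake_root, name, payload):
    """Store raw bytes as-is; no schema is checked at write time."""
    target = Path(lake_root) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def build_demo_lake(lake_root):
    """Drop three differently shaped files into the same lake (hypothetical data)."""
    lake_root = Path(lake_root)
    # Structured: rows and columns in CSV form.
    put_in_lake(lake_root, "sales/orders.csv", b"order_id,amount\n1,250\n2,400\n")
    # Semi-structured: nested JSON.
    event = {"user": "ram", "action": "login", "meta": {"device": "mobile"}}
    put_in_lake(lake_root, "events/login.json", json.dumps(event).encode("utf-8"))
    # Unstructured: free text (an image would be stored the same way, as bytes).
    put_in_lake(lake_root, "notes/feedback.txt", "Delivery was quick.".encode("utf-8"))
    # List everything that landed in the lake, relative to its root.
    return sorted(
        p.relative_to(lake_root).as_posix()
        for p in lake_root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in build_demo_lake(tmp):
            print(path)
```

Notice that nothing in `put_in_lake` looks inside the payload. Making sense of the data is left to whoever reads it later, which is exactly the flexibility the lake offers.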
Now, let's discuss some characteristics of a data lake:
2.::> Data Warehouse (the house of data for structured and organized data)
A data warehouse does not only store data; it also lets you process, query and analyze it. The main thing is that it stores only structured data with a pre-defined schema.
That's why I said that a data warehouse is not just a place where we store data.
So, a data warehouse focuses on organizing and storing specific types of structured data in a structured manner.
Ex. BigQuery on GCP. (Our next topic will be BigQuery).
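Here is a small sketch of the "pre-defined schema plus querying" idea. It assumes `sqlite3` from the Python standard library as a stand-in, so this is not BigQuery and not its API. The `sales` table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory store whose schema is fixed before any data arrives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        " order_id INTEGER PRIMARY KEY,"
        " region TEXT NOT NULL,"
        " amount REAL NOT NULL CHECK (amount >= 0))"
    )
    conn.executemany(
        "INSERT INTO sales (order_id, region, amount) VALUES (?, ?, ?)",
        [(1, "north", 250.0), (2, "south", 400.0), (3, "north", 150.0)],
    )
    return conn


def total_by_region(conn):
    """Analysis step: aggregate the stored rows with a query."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


def rejects_bad_row(conn):
    """Return True if a row that breaks the schema (missing region) is refused."""
    try:
        conn.execute(
            "INSERT INTO sales (order_id, region, amount) VALUES (?, ?, ?)",
            (4, None, 10.0),
        )
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(total_by_region(warehouse))
    print(rejects_bad_row(warehouse))
```

The two points to take away are that data must fit the declared columns before it is accepted, and that the same place that stores the data also answers analytical questions about it.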
characteristics of a data warehouse:
3.:::> Database (a particular house for data)
Database: it has much the same characteristics as a data warehouse. It is also a structured collection of data, organized and stored in a way that allows for efficient retrieval, manipulation, and management of the data.
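The difference from a data warehouse is mainly one of scope: a database typically serves one application and its day-to-day reads and writes, while a data warehouse brings together data from many such sources for analysis. In that loose sense, a database can be thought of as a subset of a data warehouse.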
We can understand it like this: suppose Ram is living in building RMP-A, in room no. 123. We can say RMP-A is the data warehouse, where many rooms are available in a structured layout, and room no. 123 is the database for Ram.
So I hope you can now relate to where and why we use a data warehouse, and where and why we use a database, for data storage.
Here is a very simple example for every DATA KA GHAR (data lake, data warehouse and database).
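One way to picture the building-and-room analogy in code is the rough sketch below. It assumes each "room" is a small `sqlite3` database serving one application, and the "building" is a single table loaded from all of them. The names `make_room`, `load_building`, `room_123`, `room_124` and the payment rows are hypothetical, and real warehouses are loaded through proper pipelines rather than a loop like this.

```python
import sqlite3


def make_room(rows):
    """One 'room': a small database serving a single application."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payments (customer TEXT, amount REAL)")
    db.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    return db


def load_building(rooms):
    """The 'building': one warehouse table filled from every room."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE all_payments (source TEXT, customer TEXT, amount REAL)"
    )
    for name, db in rooms.items():
        rows = db.execute("SELECT customer, amount FROM payments").fetchall()
        wh.executemany(
            "INSERT INTO all_payments VALUES (?, ?, ?)",
            [(name, customer, amount) for customer, amount in rows],
        )
    return wh


def spend_per_customer(wh):
    """A question that needs data from more than one room."""
    cur = wh.execute(
        "SELECT customer, SUM(amount) FROM all_payments"
        " GROUP BY customer ORDER BY customer"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    rooms = {
        "room_123": make_room([("ram", 100.0), ("sita", 50.0)]),
        "room_124": make_room([("ram", 25.0)]),
    }
    print(spend_per_customer(load_building(rooms)))
```

Each room can answer questions only about its own rows, while the building can answer questions that span all the rooms.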
Example for DATA LAKE: Imagine you have a big, magical lake called the "Toy Lake." This lake can hold any kind of toy, no matter what type it is. You can throw all your toys into the Toy Lake without worrying about organizing them. So, you have cars, action figures, and puzzles all swimming together in the lake. When you want to play with a toy, you simply jump into the Toy Lake, swim around, and pick out any toy that catches your eye. The Toy Lake is like a big storage place where you can keep all your toys without categorizing them.
Question: If you want to play with a random toy, which place would you go to, the Toy Warehouse or the Toy Lake?
I hope you now have a clear picture.
Example for Data Warehouse: Think of a data warehouse as a special room in your house where you store all your toys based on their types. In this case, we'll call it a "Toy Warehouse." In the Toy Warehouse, you have separate shelves for cars, action figures, and puzzles. So, when you want to play with a car, you go to the car shelf and find all your cars in one place. The Toy Warehouse helps you organize and store your toys by their types, making it easier for you to find and play with the specific kind of toy you want.
Question: If you want to play with an action figure, which shelf in the Toy Warehouse would you go to?
Example for Database: Lastly, let's talk about a database. Imagine you have a special cabinet in your room called the "Toy Cabinet." This cabinet has different drawers, and each drawer is labeled with a specific category, like cars, action figures, and puzzles. In each drawer, you can organize your toys neatly. So, all your cars are in the car drawer, all your action figures are in the action figure drawer, and all your puzzles are in the puzzle drawer. The Toy Cabinet helps you keep your toys organized by category, and whenever you want to play with a specific type of toy, you open the corresponding drawer and pick one out.
Question: If you want to play with a puzzle, which drawer in the Toy Cabinet would you open?
I'm thrilled to receive your suggestion! I would be delighted if you could share it with me. Your input is highly valued, and I'm excited to see what you have in mind.
Thank you
Nitin Kumar Gaur