How big MNCs like Google, Facebook, Instagram etc. store, manage and manipulate thousands of terabytes of data

WHAT IS BIG DATA?

Big Data is a collection of data that is huge in volume, yet growing exponentially with time. It is data of such large size and complexity that no traditional data management tool can store or process it efficiently. Big data is still data, just of huge size.

Big data is not a concept or a technology; it is a big problem that big tech companies face in today's world of technology.

THE PROBLEMS OF BIG DATA ARE THE 5 V'S

However, many experts consider only 3 V's:

(i) Volume :- The name Big Data itself refers to an enormous size. The size of the data plays a crucial role in determining the value that can be drawn from it. Whether a particular data set can be considered Big Data at all also depends on its volume. Hence, volume is one characteristic that must be considered when dealing with Big Data.

(ii) Velocity :- refers to the speed at which data is generated and needs to be accessed. The majority of companies involved with technologies such as social media, the Internet of Things and eCommerce meet this criterion.

(iii) Variety :- refers to heterogeneous sources and the nature of data, both structured and unstructured. In earlier days, spreadsheets and databases were the only sources of data considered by most applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, etc. is also considered in analysis applications. This variety of unstructured data poses certain issues for storing, mining and analyzing data.

LET US SEE SOME COMPANIES' DAILY DATA TRANSFER

IN THE YEAR 2014

In 2014, researchers published a study in the journal Supercomputing Frontiers and Innovations estimating the storage capacity of the Internet at 10^24 bytes, or 1 million exabytes.

Over 2.5 quintillion bytes of data are created every single day, and it's only going to grow from there. By 2020, it's estimated that 1.7 MB of data will be created every second for every person on earth.
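To put the figures above on a common scale, here is a small arithmetic sketch. It assumes decimal (SI) units, where 1 MB is 10^6 bytes and 1 exabyte is 10^18 bytes; the variable names are illustrative only.

```python
# Decimal (SI) byte units; binary units (MiB, GiB, ...) would give slightly different numbers.
GB = 10**9
EB = 10**18

# Figures quoted in the text above.
internet_capacity_bytes = 10**24                    # estimated storage capacity of the Internet
daily_bytes = 2_500_000_000_000_000_000             # 2.5 quintillion bytes created per day
per_person_bytes_per_second = 1_700_000             # 1.7 MB per person per second

SECONDS_PER_DAY = 24 * 60 * 60

internet_capacity_eb = internet_capacity_bytes // EB
daily_eb = daily_bytes / EB
per_person_gb_per_day = per_person_bytes_per_second * SECONDS_PER_DAY / GB

print(f"Internet capacity: {internet_capacity_eb:,} EB")
print(f"Created per day: {daily_eb} EB")
print(f"Per person per day: {per_person_gb_per_day:.2f} GB")
```

Under these assumptions, 2.5 quintillion bytes per day is 2.5 exabytes, and 1.7 MB per second works out to roughly 147 GB per person per day.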

YouTube is the second most popular website on the planet after Google. As of May 2019, more than 500 hours of video content are uploaded to YouTube per minute, nearly 16,016 petabytes.

SIMILARLY, MANY COMPANIES ARE GENERATING THIS MUCH DATA. BUT WHAT IS THE SOLUTION TO THIS PROBLEM?

They are turning this problem to their advantage with a concept called the distributed storage cluster, using the Hadoop technology and the MapReduce method.



WHAT IS DISTRIBUTED STORAGE?

It is difficult to maintain huge volumes of data on a single machine. Therefore, it becomes necessary to break the data down into smaller chunks and store them on multiple machines. Filesystems that manage storage across a network of machines are called distributed file systems.
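The idea of breaking data into chunks and spreading them over machines can be sketched in a few lines. This is a toy illustration, not how any real distributed file system is implemented: the function names, the machine names and the round-robin placement are all assumptions made for the example, and real systems use far larger chunks and their own placement policies.

```python
from collections import defaultdict

def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Break data into fixed-size chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def assign_round_robin(chunks: list[bytes], machines: list[str]) -> dict[str, list[int]]:
    """Map each (hypothetical) machine name to the indices of the chunks it stores."""
    placement = defaultdict(list)
    for index in range(len(chunks)):
        placement[machines[index % len(machines)]].append(index)
    return dict(placement)

data = b"x" * 1000                       # stand-in for a large file
chunks = split_into_chunks(data, 300)    # 300-byte chunks, for illustration only
placement = assign_round_robin(chunks, ["node-1", "node-2", "node-3"])

print(len(chunks))
print(placement)
```

Because the chunks are kept in order and indexed, joining them back together reproduces the original data, which is what lets a reader of the file ignore where each piece physically lives.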

Hadoop Distributed File System (HDFS) is the storage component of Hadoop. All data stored on Hadoop is stored in a distributed manner across a cluster of machines. It has a few defining properties:

Huge volumes – Being a distributed file system, it is highly capable of storing petabytes of data without any glitches.

Data access – It is based on the philosophy that “the most effective data processing pattern is a write-once, read-many-times pattern”.

Cost-effective – HDFS runs on a cluster of commodity hardware. These are inexpensive machines that can be bought from any vendor.

MAPREDUCE

MapReduce is a software framework and programming model used for processing huge amounts of data. A MapReduce program works in two phases, namely Map and Reduce. Map tasks deal with splitting and mapping the data, while Reduce tasks shuffle and reduce the data.
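The two phases can be illustrated with a word count, a common introductory MapReduce example. The sketch below runs in a single Python process and does not use the Hadoop API; the function names are hypothetical, and the grouping step is written out by hand where a real cluster would do it across machines.

```python
from collections import defaultdict

def map_phase(line: str) -> list[tuple[str, int]]:
    """Map: split one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

lines = ["big data is big", "data is growing"]

mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
counts = dict(reduce_phase(key, values) for key, values in grouped.items())

print(counts)
```

Each input line can be mapped independently and each key can be reduced independently, which is why this model lends itself to being spread over many machines.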

CONCLUSION :-)

The world is changing the way it operates, and Big Data is playing an important role in that change. Hadoop is a framework that makes an engineer's life easier when working with large data sets. There are improvements on all fronts. The future is exciting.
