How does Facebook manage its huge data?
1). Where does Facebook store this data?
------- Obviously, Facebook stores its data on servers in data centers. A group of servers is known as a rack, and a single server can store petabytes of data.
To prevent these racks from overheating, Facebook uses a seven-room rooftop cooling system that relies on natural air.
To maintain a round-the-clock power supply, it uses approximately 40 generators, each producing 3 megawatts of electricity.
2). Which technologies does Facebook use to manage its data?
-------- Hadoop
“Facebook runs the world’s largest Hadoop cluster,” says Jay Parikh, Vice President of Infrastructure Engineering at Facebook.
This cluster spans more than 4,000 machines and stores hundreds of petabytes of data.
Facebook Messenger is built on Apache HBase, a database layered on top of Hadoop, whose architecture supports the enormous number of messages exchanged every single day.
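To make this concrete, here is a minimal sketch of a message write using the standard Apache HBase Java client. The table name ("messages"), column family ("m"), and row-key scheme are placeholders invented for illustration, not Facebook's actual Messenger schema.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MessageStore {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("messages"))) {
                // Row key combines user id and timestamp, so one user's
                // messages sort together on disk (placeholder scheme).
                Put put = new Put(Bytes.toBytes("user42#20240101T120000"));
                put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("body"),
                              Bytes.toBytes("Hello!"));
                table.put(put);
            }
        }
    }

Keying rows by user and timestamp keeps one user's messages physically adjacent, which suits the sorted, column-family storage model that HBase provides.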
---------- Scuba
With a huge amount of unstructured data arriving every day, Facebook realized it needed a platform to speed up the analysis work. That’s when it developed Scuba, which helps developers dive into the massive data sets in close to real time.
---------- Cassandra
Traditional data storage started lagging behind when Facebook's search team took on the Inbox Search problem: letting users search across all of their messages.
The challenge called for a new storage solution, so Prashant Malik and Avinash Lakshman started developing Cassandra.
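Cassandra was later open-sourced and is now queried through CQL. As a hedged sketch of the kind of term-to-message index Inbox Search needed, here is an example using the DataStax Java driver; the keyspace, table, and column names are hypothetical, and the CQL interface shown postdates Facebook's original internal version.

    import com.datastax.oss.driver.api.core.CqlSession;

    public class InboxIndex {
        public static void main(String[] args) {
            // Connects to a Cassandra node on localhost by default;
            // all names below are illustrative, not Facebook's schema.
            try (CqlSession session = CqlSession.builder().build()) {
                session.execute(
                    "CREATE KEYSPACE IF NOT EXISTS inbox WITH replication = "
                  + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
                session.execute(
                    "CREATE TABLE IF NOT EXISTS inbox.search_index ("
                  + "user_id text, term text, message_id timeuuid, "
                  + "PRIMARY KEY ((user_id, term), message_id))");
                // Index one message under a search term it contains.
                session.execute(
                    "INSERT INTO inbox.search_index (user_id, term, message_id) "
                  + "VALUES ('user42', 'hello', now())");
            }
        }
    }

Partitioning by (user_id, term) means a search for one term in one user's inbox touches a single partition, which is how Cassandra keeps such lookups fast at scale.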
---------- Hive
After Yahoo implemented Hadoop for its search engine, Facebook thought about empowering its data scientists, whose workloads had outgrown the Oracle data warehouse, by giving them SQL-like access to the much larger volumes of data stored in Hadoop. Hence, Hive came into existence.
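Hive exposes data stored in Hadoop through HiveQL, a SQL-like query language. Here is a minimal sketch of how a data scientist might run an aggregate query through the standard Hive JDBC driver; the server address, table, and column names are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQuery {
        public static void main(String[] args) throws Exception {
            // Register the Hive JDBC driver and connect to HiveServer2.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://localhost:10000/default"; // placeholder host
            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                     "SELECT country, COUNT(*) AS users "
                   + "FROM user_profiles GROUP BY country")) {
                while (rs.next()) {
                    System.out.println(rs.getString("country") + ": "
                        + rs.getLong("users"));
                }
            }
        }
    }

Behind the scenes, Hive compiles such a query into jobs that run on the Hadoop cluster, so analysts never have to write low-level MapReduce code by hand.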
---------- Prism
When Facebook initially implemented Hadoop, it was not designed to run across multiple data centers, and that’s when Facebook’s team felt the need to develop Prism.
Prism is a platform that exposes many namespaces instead of the single one Hadoop provides, which in turn makes it possible to carve out many logical clusters.
The system can now expand to as many servers as needed, regardless of how many data centers they span.
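Prism itself is internal to Facebook and was never open-sourced. As a rough open-source analogy, HDFS federation likewise splits a Hadoop filesystem into multiple independent namespaces; the sketch below, with placeholder NameNode hostnames, shows a client addressing two such namespaces through the standard Hadoop FileSystem API.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MultiNamespace {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Two independent namespaces, each served by its own NameNode.
            // The hostnames are placeholders for illustration.
            FileSystem logs  = FileSystem.get(URI.create("hdfs://nn-logs:8020"), conf);
            FileSystem adhoc = FileSystem.get(URI.create("hdfs://nn-adhoc:8020"), conf);
            logs.mkdirs(new Path("/daily"));
            adhoc.mkdirs(new Path("/scratch"));
            // Each namespace has its own directory tree, so the two
            // logical clusters can grow and fail independently.
        }
    }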
Note: Hadoop stores such huge amounts of data using a master-slave topology: files are split into blocks, a master (the NameNode) keeps track of where each block lives, and slaves (the DataNodes) hold the blocks themselves.
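One way to see this master-slave split in practice is to ask the NameNode where the blocks of a file live. The sketch below uses the standard Hadoop FileSystem API; the file path is a placeholder.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockReport {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/data/events.log"); // placeholder path
            FileStatus status = fs.getFileStatus(file);
            // The NameNode answers with the offset, length, and the
            // DataNodes (slaves) that hold a replica of each block.
            BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.printf("offset=%d length=%d hosts=%s%n",
                    block.getOffset(), block.getLength(),
                    String.join(",", block.getHosts()));
            }
        }
    }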