Integrating LVM with Hadoop and providing Elasticity to DataNode Storage
To understand the term 'Big Data', we first need to understand what data is. Data is a collection of facts, such as numbers, words, measurements, observations, or simply descriptions of things. In a more technical sense, data is a set of values of qualitative or quantitative variables about one or more persons or objects.
What is Big Data?
Big Data is still data, but when its volume exceeds our storage capability, we call it Big Data. Traditionally, we store data on hard disks.
For example, if we want to download a 100 GB file but our system can store only 64 GB of data, then relative to our system that file is Big Data.
Take Facebook: it has revealed some big stats on Big Data, including that its systems process 2.5 billion pieces of content and 500+ terabytes of data each day. It pulls in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half-hour. As you can see, 500+ TB per day is a very large volume of data to receive from users, and it adds up to about 15 petabytes in a month. Facebook has to store this data permanently if it wants to run its business, because users can ask to see their content at any time.
And if we talk about Google, it now processes over 40,000 search queries every second on average, which translates to over 3.5 billion searches per day and 1.2 trillion searches per year worldwide. Google currently processes over 20 petabytes of data per day, more than 40 times what Facebook handles.
In this task, I'm going to discuss the integration of LVM with Hadoop and provide elasticity to the DataNode storage.
- First, I create a Hadoop cluster with two RHEL 8 systems. I'm performing this task on my VMs.
- I configure one system as the Master Node (NameNode) and the other as the DataNode, which shares its storage with the master (a configuration sketch follows this list).
- Next, on the DataNode I create one logical volume named lv1, with path /dev/demovg/lv1 and an LV size of 50G (see the LVM commands after this list).
- After that, I format the LV, mount it at the mountpoint /lv, and give the mountpoint's path inside the /etc/hadoop/hdfs-site.xml file as <value>/lv</value> (sketched below).
- Now I check whether the DataNode is successfully sharing its storage by running the command 'hadoop dfsadmin -report'.
- Finally, I increase the size of the LV by +5G on the fly, and the size of the shared storage increases with it. After increasing the size, I check the DataNode's shared storage again with 'hadoop dfsadmin -report' (the exact commands are shown after this list).
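
For the first two steps, here is a minimal configuration sketch, assuming a Hadoop 1.x install (which matches the /etc/hadoop config path used in this task); the master IP 192.168.1.10 and port 9001 are placeholders I chose for illustration, not values from the task:

```bash
# --- On the Master Node (NameNode) ---
# Tell Hadoop which host and port the filesystem lives on.
# 192.168.1.10:9001 is an assumed address -- substitute your master's IP.
cat > /etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.10:9001</value>
  </property>
</configuration>
EOF
hadoop namenode -format           # one-time format of the NameNode metadata
hadoop-daemon.sh start namenode   # start the NameNode daemon

# --- On the DataNode ---
# Write the same core-site.xml, pointing at the master's IP, so the
# DataNode knows which NameNode to register with; its own daemon is
# started once its storage directory is configured (later steps).
```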
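The logical volume from the third step can be created with the standard LVM commands; /dev/sdb below is an assumed spare disk attached to the DataNode VM:

```bash
# Initialize the spare disk as an LVM physical volume (device name assumed).
pvcreate /dev/sdb
# Create the volume group 'demovg' backed by that physical volume.
vgcreate demovg /dev/sdb
# Carve out a 50 GiB logical volume named lv1 -> /dev/demovg/lv1
lvcreate --size 50G --name lv1 demovg
# Verify the LV's size and path.
lvdisplay /dev/demovg/lv1
```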
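Formatting, mounting, and handing the mountpoint to HDFS (the fourth step) looks roughly like this; ext4 and the Hadoop 1.x property name dfs.data.dir are my assumptions:

```bash
mkfs.ext4 /dev/demovg/lv1        # put a filesystem on the LV (ext4 assumed)
mkdir -p /lv                     # create the mountpoint
mount /dev/demovg/lv1 /lv        # mount the LV there

# Point the DataNode's storage directory at the mountpoint.
cat > /etc/hadoop/hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/lv</value>
  </property>
</configuration>
EOF

hadoop-daemon.sh start datanode  # start the DataNode; it now contributes /lv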
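```

And the elasticity itself (the last two steps): check the report, extend the LV, grow the filesystem online, and check again. Note that resize2fs applies because the LV was formatted as ext4 above; an XFS filesystem would need xfs_growfs instead:

```bash
hadoop dfsadmin -report               # note the DataNode's configured capacity

lvextend --size +5G /dev/demovg/lv1   # grow the LV by 5 GiB, on the fly
resize2fs /dev/demovg/lv1             # grow the ext4 filesystem to fill the LV
                                      # (no unmount, no DataNode restart needed)

hadoop dfsadmin -report               # capacity should now be ~5 GiB larger
```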
Here you can see that the size of the shared storage increased on the fly from 49 GiB to 54 GiB.
And with that, we are done with the integration of LVM with Hadoop.
Thank You!!!