Integrating LVM with Hadoop and Providing Elasticity to the DataNode with the LVM Concept
Naga phani
Computer Science Engineer || Software Engineer || Cloud & DevOps Enthusiast || AI & Machine Learning Practitioner
What is Hadoop?
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.
What is LVM?
LVM, or Logical Volume Management, is a storage device management technology that gives users the power to pool and abstract the physical layout of component storage devices for easier and more flexible administration. Utilizing the device mapper Linux kernel framework, the current iteration, LVM2, can be used to gather existing storage devices into groups and allocate logical units from the combined space as needed.
The main advantages of LVM are increased abstraction, flexibility, and control. Logical volumes can have meaningful names like “databases” or “root-backup”. Volumes can be resized dynamically as space requirements change and migrated between physical devices within the pool on a running system or exported easily. LVM also offers advanced features like snapshotting, striping, and mirroring.
The diagram above shows how powerful a tool LVM is: physical volumes are pooled into a volume group and carved up into flexible logical volumes.
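As a small taste of that flexibility, a point-in-time snapshot of a mounted logical volume can be taken online with a single command (the volume group vg0 and volume databases here are illustrative, not part of this task):
command: lvcreate --snapshot --size 500M --name databases-snap /dev/vg0/databases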
Task: Attaching an LV to the Hadoop DataNode
For this task I am using AWS Cloud throughout:
- An AWS Linux instance (Ubuntu 20.04).
- Hadoop (3.3.0) installed on this instance.
- An EBS volume.
First, log in to the AWS instance.
After logging in, I created a new user for Hadoop named hduser and logged in as that user.
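The exact user-creation commands are not shown here; on Ubuntu, creating the user and switching to it looks roughly like this:
command: sudo adduser hduser
command: su - hduser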
After that, check the disks on the system with the fdisk command:
command: fdisk -l
Creating an EBS volume and attaching it to the instance
We can clearly see that the instance already has one 8 GiB disk. Now add a new 2 GiB EBS volume named fordatanode.
Warning: the EBS volume must be created in the same Availability Zone as the instance, because a volume cannot be attached to an instance in a different Availability Zone.
Attach this 2 GiB EBS volume to our Ubuntu instance.
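If you prefer the AWS CLI over the console, the equivalent calls look roughly like this (the Availability Zone, volume ID, and instance ID below are placeholders):
command: aws ec2 create-volume --size 2 --availability-zone <same-AZ-as-instance>
command: aws ec2 attach-volume --volume-id <vol-id> --instance-id <instance-id> --device /dev/xvdf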
Now check the disks on the instance again:
command: fdisk -l
We can see a new disk has appeared with the name /dev/xvdf.
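For a more compact tree view of the same disks, lsblk also works:
command: lsblk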
Creating Physical Volume
- Create a physical volume on the newly attached disk.
Command: pvcreate /dev/xvdf
Display the physical volume with:
command: pvdisplay /dev/xvdf
Creating a volume group (lucky) and attaching the physical volume to it
Create the volume group:
Command: vgcreate lucky /dev/xvdf
Display the volume group to verify whether the physical volume is attached:
command: vgdisplay lucky
The physical volume is successfully attached to the volume group.
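For quick one-line summaries instead of the full display output, LVM also provides:
command: pvs
command: vgs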
Start Namenode
Now we have to start our Hadoop NameNode.
command: hdfs namenode   # I am using 3.3.0, so some commands vary between versions
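Note that hdfs namenode runs the daemon in the foreground and occupies the terminal; in Hadoop 3.x you can also detach it into the background:
command: hdfs --daemon start namenode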
Now we have to look at the report of the Hadoop cluster:
command: hdfs dfsadmin -report -live
We can see that no DataNodes are running yet.
Create a logical volume of 1 GiB
Create a 1 GiB logical volume named logvol from the volume group lucky:
command: lvcreate --size 1G --name logvol lucky
We can see the logical volume details with the command lvdisplay /dev/lucky/logvol
Formatting the LV with ext4 filesystem
Format the logical volume with the ext4 filesystem so data can be stored on the LV.
Command: mkfs.ext4 /dev/lucky/logvol
Mount LV to Hadoop Datanode
After creating the filesystem on the logical volume (logvol), we have to mount this LV on the Hadoop DataNode directory:
command: mount /dev/lucky/logvol /usr/local/hadoop_store/hdfs/datanode
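This works because the DataNode writes its blocks to whatever directory dfs.datanode.data.dir points at. A minimal sketch of that property in hdfs-site.xml, assuming the same path as above:
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hadoop_store/hdfs/datanode</value>
</property>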
After mounting the LV storage on the DataNode directory, we can check whether it is mounted with the df -h command.
Wow, we can see that it is successfully mounted on the DataNode directory.
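Note that a plain mount does not survive a reboot; to make it persistent, an /etc/fstab entry along these lines (same device and mount point as above) can be added:
/dev/lucky/logvol /usr/local/hadoop_store/hdfs/datanode ext4 defaults,nofail 0 2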
Start Datanode
Now we have to start the DataNode.
command: hdfs datanode   # hadoop datanode also works but is deprecated in 3.x
After running the command, the DataNode is up. Check whether it is running with the jps command.
The jps output clearly shows that the NameNode and DataNode are running.
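The jps output looks roughly like this (the PIDs will differ on your machine):
12345 NameNode
12467 DataNode
12581 Jps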
Now we have to check the Hadoop cluster info again:
command: hdfs dfsadmin -report -live
One DataNode is now running, and the LV has been successfully shared with the Hadoop DataNode.
Extend the LV by 0.5 GiB without shutting down the DataNode
Now we extend the logical volume by 0.5 GiB:
command: lvextend --size +0.5G /dev/lucky/logvol
It is successfully extended. Next we have to grow the filesystem over the extended space; since ext4 on LVM can be resized online, we don't need to unmount the LV.
To grow the filesystem over the extended space we use the resize2fs command:
command: resize2fs /dev/lucky/logvol
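As a shortcut, lvextend can grow the filesystem in the same step via its --resizefs (-r) flag, which would make the separate resize2fs call unnecessary:
command: lvextend --resizefs --size +0.5G /dev/lucky/logvol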
After resizing the filesystem we can check whether it has grown with the df -h command.
Wow! Yes, logvol is successfully extended. We should also check whether the DataNode's storage has increased:
command: hdfs dfsadmin -report -live
We made it! We can clearly see that the DataNode's capacity has increased from about 1 GB to 1.45 GB.
Thanks for reading, and for your patience :)