Integrating LVM with Hadoop: Providing Elasticity to the DataNode with the LVM Concept

What is Hadoop?

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

What is LVM?

LVM, or Logical Volume Management, is a storage device management technology that gives users the power to pool and abstract the physical layout of component storage devices for easier and flexible administration. Utilizing the device mapper Linux kernel framework, the current iteration, LVM2, can be used to gather existing storage devices into groups and allocate logical units from the combined space as needed.

The main advantages of LVM are increased abstraction, flexibility, and control. Logical volumes can have meaningful names like “databases” or “root-backup”. Volumes can be resized dynamically as space requirements change and migrated between physical devices within the pool on a running system or exported easily. LVM also offers advanced features like snapshotting, striping, and mirroring.
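
As a small illustration of one of those advanced features, a point-in-time snapshot of a logical volume can be taken with a single command; the volume group vg0, the origin volume databases, and the 500M snapshot size below are made-up values for illustration only:

lvcreate --size 500M --snapshot --name databases-snap /dev/vg0/databases   # copy-on-write snapshot of an existing LV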

[Image: LVM architecture diagram]

The diagram above illustrates how LVM pools physical devices into a volume group and allocates logical volumes from it, which is what makes it such a powerful and flexible tool.

Task: Attaching an LV to the Hadoop DataNode

For this task I am working entirely on the AWS cloud, with the following setup:

  • An AWS EC2 instance running Ubuntu 20.04.
  • Hadoop 3.3.0 installed on this instance (a rough install sketch follows this list).
  • An EBS volume for the additional DataNode storage.
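
If Hadoop 3.3.0 is not installed yet, this is a minimal sketch of pulling it from the Apache archive; the install path /usr/local/hadoop and an already-installed Java runtime are assumptions, not part of the original setup:

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
tar -xzf hadoop-3.3.0.tar.gz
sudo mv hadoop-3.3.0 /usr/local/hadoop       # assumed install location
export HADOOP_HOME=/usr/local/hadoop         # add $HADOOP_HOME/bin and $HADOOP_HOME/sbin to PATH as well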

First, log in to the AWS instance.


After logging in, I created a new user named hduser for Hadoop and switched to it.
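
A minimal sketch of creating and switching to such a user; giving it sudo rights is optional and an assumption about this setup:

sudo adduser hduser              # create the dedicated Hadoop user
sudo usermod -aG sudo hduser     # optional: allow hduser to run sudo (assumption)
su - hduser                      # switch to the new user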


After that, check the disks present in the system:

command: fdisk -l


Creating an EBS volume and attaching it to the instance

We can clearly see that the instance already has an 8 GiB root disk. Now we add a new 2 GB EBS volume named fordatanode.


Warning: the EBS volume must be created in the same Availability Zone as the instance, because a volume cannot be attached to an instance in a different Availability Zone.

Attach this 2 GB EBS volume to our Ubuntu instance.
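
The same create-and-attach steps can also be done from the AWS CLI; this is only a sketch, and the Availability Zone, volume ID, instance ID, and gp2 volume type below are placeholders or assumptions you would replace with your own values:

aws ec2 create-volume --size 2 --volume-type gp2 --availability-zone ap-south-1a \
    --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=fordatanode}]'
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/xvdf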


Now check the disks on the instance again:

command: fdisk -l


We can see that a new disk has appeared with the name /dev/xvdf.
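
lsblk gives a more compact view of the same information and should now list the new 2 GB device as well:

lsblk /dev/xvdf    # or plain lsblk to list every block device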

Creating Physical Volume

  • Create a physical volume on the newly attached hard disk.

Command: pvcreate /dev/xvdf


Display the newly created physical volume with

command: pvdisplay /dev/xvdf

Creating a volume group (lucky) and attaching the physical volume to it

Create the volume group:

Command: vgcreate lucky /dev/xvdf


Display the volume group to check whether the physical volume has been attached or not.

command: vgdisplay lucky


The physical volume has been successfully attached to the volume group.

Start the NameNode

Now we have to start our Hadoop NameNode.

command: hdfs namenode   # I am using Hadoop 3.3.0, so some commands vary between versions
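
Running hdfs namenode like this keeps the NameNode in the foreground of the current terminal. In Hadoop 3.x it can also be started as a background daemon:

hdfs --daemon start namenode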


Now let's look at the report of the Hadoop cluster.

command: hdfs dfsadmin -report -live


We can see that no DataNodes are running yet.

Create a logical volume of 1 GB

Create a 1 GB logical volume named logvol from the volume group lucky.

command: lvcreate --size 1G --name logvol lucky


We can see the logical volume details with the command lvdisplay /dev/lucky/logvol.

Formatting the LV with the ext4 filesystem

Format the logical volume with the ext4 filesystem so that data can be stored on the LV.

Command: mkfs.ext4 /dev/lucky/logvol


Mount the LV on the Hadoop DataNode directory

After creating the filesystem on the logical volume (logvol), we mount the LV on the Hadoop DataNode directory, i.e. the path configured as dfs.datanode.data.dir in hdfs-site.xml.

command: mount /dev/lucky/logvol /usr/local/hadoop_store/hdfs/datanode

After mounting the LV on the DataNode directory, we can check whether it is mounted or not with the df -h command.


Wow, we can see that it is successfully mounted on the DataNode directory.
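
If the DataNode directory does not exist yet, or you want the mount to come back after a reboot, this is a rough sketch; the hduser ownership and the fstab entry are assumptions about this particular setup, not steps from the original walkthrough:

sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode                 # create the mount point (before mounting) if it is missing
sudo chown -R hduser:hduser /usr/local/hadoop_store/hdfs/datanode   # let the Hadoop user write to the mounted volume (assumed user/group)
echo '/dev/lucky/logvol /usr/local/hadoop_store/hdfs/datanode ext4 defaults 0 0' | sudo tee -a /etc/fstab   # remount automatically at boot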

Start the DataNode

Now we have to start the DataNode.

Command: hadoop datanode
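
In Hadoop 3.x, hdfs datanode is the current spelling of the same command, and the DataNode can likewise be run as a background daemon:

hdfs --daemon start datanode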


After running the command, the DataNode is up. Check whether it is running or not with the jps command.


From this output we can clearly see that the NameNode and DataNode are running.

Now we have to check the Hadoop cluster info again:

command: hdfs dfsadmin -report -live


One DataNode is running, and the LV has been successfully shared with the Hadoop DataNode.

Extend the LV by 0.5 GB without shutting down the DataNode

Now we have to extend the logical volume by 0.5 GB.

command: lvextend --size +0.5G /dev/lucky/logvol


It has been successfully extended. Next we have to grow the filesystem over the newly added space; with LVM this is done online, so we don't need to unmount the LV.

To grow the ext4 filesystem over the extended LV we use the resize2fs command.

command: resize2fs /dev/lucky/logvol
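
As a shorthand, lvextend can grow the filesystem in the same step, making the separate resize2fs call unnecessary:

lvextend --resizefs --size +0.5G /dev/lucky/logvol   # extend the LV and resize the ext4 filesystem online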


After resizing the filesystem, we can check whether it has been extended or not with the df -h command.


Wow! Yes, logvol has been successfully extended. We should also check whether the DataNode's storage has increased or not.

command: hdfs dfsadmin -report -live


We made it. We can clearly see that the DataNode's storage has increased from about 1 GB to 1.45 GB.

Thanks for your patience and for reading :)

