Integrating LVM with Hadoop
Hey Everyone, in this article we will learn how we can provide elasticity to our datanode's storage.
First of all, we need to set up a Hadoop DFS cluster. For our requirement, I took 2 datanodes and 1 namenode (you can do this practical with 1 datanode as well).
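If you are setting the cluster up from scratch, each datanode has to be told which directory to use for DFS storage. Here is a minimal sketch of the datanode's hdfs-site.xml, assuming the Hadoop 2.x/3.x property name and config layout (older 1.x releases use dfs.data.dir under conf/ instead), pointing at the /dn folder we use later:
# On each datanode: point DFS storage at the folder we will mount the LV on
cat > $HADOOP_HOME/etc/hadoop/hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn</value>
  </property>
</configuration>
EOF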
This is what our dfsadmin report looks like before LVM, with static 15 GB storage:
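For reference, this report comes from the dfsadmin command run on the namenode; the launcher name depends on your Hadoop version:
# Prints total capacity, DFS used/remaining, plus a per-datanode breakdown
hdfs dfsadmin -report      # Hadoop 2.x / 3.x
# hadoop dfsadmin -report  # older Hadoop 1.x clusters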
Now, before integrating LVM, we must know what it is and its hierarchy.
This is what the hierarchy of LVM looks like. The foundation of LVM is its Physical Volumes (PVs). From PVs we create Volume Groups (VGs), and from VGs we create Logical Volumes (LVs). Logical Volumes are the units we work with at the top level, hence the name Logical Volume Management (LVM).
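On any Linux box with the lvm2 tools installed, you can inspect each layer of this hierarchy directly:
pvs   # Physical Volumes: the raw disks/partitions handed over to LVM
vgs   # Volume Groups: storage pools built from one or more PVs
lvs   # Logical Volumes: the resizable units carved out of a VG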
Now the datanode's folder (the directory which we provide for DFS storage) should be mounted on an LV, so that we can increase or decrease it as per our needs. This is what provides the elasticity to our DataNode's storage.
Steps to provide elasticity to the datanode folder:
- Create a PV
- Assign the PV to a VG
- Create an LV out of the VG
- Format the LV and mount it to the datanode folder (I took /dn as the datanode folder)
- Then we can increase/decrease the size
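Before step 1, confirm that the extra disk attached for the datanode is actually visible to the OS (xvdf in my case; the name may differ on your machine, e.g. sdb):
lsblk                 # lists all block devices and their mount points
fdisk -l /dev/xvdf    # shows the size of the new, still-unpartitioned disk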
1: Creating PV
command: pvcreate /dev/xvdf (xvdf is my device name)
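We can verify that the device is now registered as a physical volume:
pvdisplay /dev/xvdf   # shows the PV size and (for now) no VG association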
2: Assigning PVs to VG
command: vgcreate v1 /dev/xvdf (v1 is my vg name)
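vgdisplay confirms the VG and its free physical extents. If we ever attach more disks, the same VG can be grown with vgextend (device name below is hypothetical):
vgdisplay v1
# pvcreate /dev/xvdg && vgextend v1 /dev/xvdg   # grow the pool later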
3: Creating LV out of VG:
command: lvcreate -n vol1 -L 10G v1 (vol1 is my LV name, 10G is size of LV)
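Again, a quick check that the LV exists and has the requested size:
lvdisplay v1/vol1   # the LV is exposed as the block device /dev/v1/vol1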
4: Formatting and Mounting:
A fresh LV has no filesystem yet, so we format it first (I used ext4, which is why resize2fs and e2fsck appear later):
commands:
mkfs.ext4 /dev/v1/vol1
mount /dev/v1/vol1 /dn
This updates the datanode storage to 10 GB, as we have mounted /dn on v1/vol1 and vol1 has 10 GB capacity.
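df confirms the mount (ext4 reserves a little space for metadata, so the figure shown is slightly under 10 GB). Optionally, an fstab entry makes the mount survive reboots:
df -h /dn
echo '/dev/v1/vol1 /dn ext4 defaults 0 0' >> /etc/fstab   # optional persistence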
5: Increasing/Decreasing Size:
Now we have come to the main part of this practical. Here we will extend or reduce the size of the datanode folder by performing operations on the LV.
1. Increasing/Extending the Datanode Storage
commands:
lvextend -L 17G v1/vol1
resize2fs /dev/v1/vol1
This increases the storage capacity to 17 GB, hence the datanode will now provide 17 GB to the namenode.
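Since ext4 can be grown online, we never had to unmount /dn here. As a shortcut, lvextend's -r (--resizefs) flag performs both steps at once:
lvextend -r -L 17G v1/vol1   # extend the LV and resize its filesystem together
df -h /dn                    # should now report roughly 17G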
dfsadmin report after increasing the storage capacity (to 17 GB):
2. Decreasing/Reducing the Datanode Storage
For this, we first have to unmount the LV and then run the following commands. Be careful here: shrinking a filesystem below the space actually in use will lose data, so check usage (and take a backup) before reducing.
commands:
umount /dn
e2fsck -f /dev/v1/vol1
resize2fs /dev/v1/vol1 10G
lvreduce -L 10G v1/vol1
mount /dev/v1/vol1 /dn
These commands shrink the filesystem first and then the LV itself, resulting in reduced datanode storage. The datanode capacity will be reduced to 10 GB.
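We can cross-check the shrink locally before looking at the dfsadmin report:
lvs v1/vol1   # the LV size should now read 10.00g
df -h /dn     # the filesystem should report roughly 10G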
dfsadmin report after reducing the size to 10 GB:
Conclusion:
Hence we have successfully integrated LVM with Hadoop and provided elasticity to the DataNode's storage. Thus we can now easily increase or decrease the datanode storage.
Thanks for the read. See you in another article!