Elasticity Task
Before starting the task, let's review some key concepts:
What is Hadoop?
Apache Hadoop is an open-source software framework used to develop data-processing applications that run in a distributed computing environment.
Applications built using Hadoop run on large data sets distributed across clusters of commodity computers. Commodity computers are cheap and widely available, and are mainly useful for achieving greater computational power at low cost.
What is LVM?
LVM is a tool for logical volume management, which includes allocating disks and striping, mirroring, and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices, which might span two or more disks.
The physical volumes are combined into volume groups, with the exception of the /boot partition. The /boot partition cannot be on a logical volume because the boot loader cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition that is not part of a volume group.
Since a single physical volume cannot span multiple drives, to span more than one drive you create one or more physical volumes per drive and add them to the same volume group.
Volume groups can be divided into logical volumes, which are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3. When "partitions" reach full capacity, free space from the volume group can be added to the logical volume to increase its size. When a new hard drive is added to the system, it can be added to the volume group, and the partitions that are logical volumes can then be increased in size.
Task Description
1. Integrating LVM with Hadoop and providing Elasticity to DataNode Storage.
2. Increase or Decrease the Size of a Static Partition in Linux.
Let's start the task:
Part 1: Integrating LVM with Hadoop and providing Elasticity to DataNode Storage
First, launch an instance in AWS and configure it as the NameNode.
To configure it, we need two software packages installed on this instance:
1. jdk-8u171-linux-x64.rpm
2. hadoop-1.2.1-1.x86_64.rpm
Install the JDK and Hadoop packages with the following commands: rpm -ivh jdk-8u171-linux-x64.rpm and rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force respectively.
Next, create a directory, e.g. mkdir /namenode. Then go to the directory /etc/hadoop.
Then we have to configure the hdfs-site.xml file using command: vi hdfs-site.xml
Now configure core-site.xml file using command: vi core-site.xml
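For Hadoop 1.x, the NameNode directory is set via dfs.name.dir and the cluster address via fs.default.name. A minimal sketch of the two files, assuming the /namenode directory created above and port 9001 (the port is an assumption of this sketch; use whatever your cluster is configured with):

```xml
<!-- hdfs-site.xml: where the NameNode stores its metadata -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/namenode</value>
  </property>
</configuration>
```

```xml
<!-- core-site.xml: the address clients and DataNodes connect to -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>
```

Binding to 0.0.0.0 lets the NameNode accept connections on any interface; on AWS, also make sure the security group allows the port.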
Now format the directory that we created, /namenode, using the command: hadoop namenode -format
Start the service with the command: hadoop-daemon.sh start namenode, and use jps to check whether it has started.
Next, we have to configure the DataNode:
The JDK and Hadoop installation steps are the same on the DataNode machine.
First, launch an instance in AWS and configure it as the DataNode. Then install the JDK and Hadoop packages with: rpm -ivh jdk-8u171-linux-x64.rpm and rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force respectively.
Now create a directory, e.g. mkdir /datanode; there is no need to format this directory. After creating it, go to the /etc/hadoop directory (cd /etc/hadoop) to configure the hdfs-site.xml and core-site.xml files.
Now configure the core-site.xml file using the command: vi core-site.xml; in this file we use the IP address of the NameNode.
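On the DataNode side the property is dfs.data.dir, and core-site.xml must point at the NameNode. A sketch, assuming the /datanode directory above and port 9001 (the placeholder NameNode-IP must be replaced with the real address; both the IP placeholder and the port are assumptions of this sketch):

```xml
<!-- hdfs-site.xml: where this DataNode stores HDFS blocks -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
  </property>
</configuration>
```

```xml
<!-- core-site.xml: points to the NameNode of the cluster -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NameNode-IP:9001</value>
  </property>
</configuration>
```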
Now start the DataNode with the command: hadoop-daemon.sh start datanode, and use jps to check whether it has started.
Now go to the NameNode and use the command hadoop dfsadmin -report to check whether the DataNode is connected.
Here you can see the storage contributed by the DataNode to the cluster, 7.99 GB, and we will be changing that using LVM.
Now let's start with our main topic: we have to attach volumes to the DataNode system. I have attached two disks to the VM, named redhat8_arth_1.vdi and redhat8_arth_2.vdi.
Now check which disks are attached to the OS using the command: fdisk -l
Create the physical volumes with the following commands: pvcreate /dev/sdb and pvcreate /dev/sdc
To display a physical volume, use: pvdisplay followed by the device name, e.g. pvdisplay /dev/sdb
After creating the physical volumes, create a volume group with the command: vgcreate Arth_VG /dev/sdb /dev/sdc
To display Volume Group use command: vgdisplay Arth_VG
Here Volume Group is created and its size is 29.99GiB.
Now we have to create a logical volume with the command: lvcreate --size 5G --name LV1 Arth_VG
To display Logical volume use command: lvdisplay Arth_VG/LV1
After creating the LV, format it with the command: mkfs.ext4 /dev/Arth_VG/LV1
Now create a directory to mount this logical volume, using the command: mkdir /DN, and mount the logical volume onto it with: mount /dev/Arth_VG/LV1 /DN
Now check whether the logical volume is mounted using the command: df -h
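The whole LVM setup above can be summarized in one sketch (the device names /dev/sdb and /dev/sdc, and the names Arth_VG, LV1, and /DN, come from this walkthrough; adjust them to your own machine — these commands need root and real block devices):

```shell
# Turn the two attached disks into LVM physical volumes
pvcreate /dev/sdb /dev/sdc

# Combine both physical volumes into one volume group
vgcreate Arth_VG /dev/sdb /dev/sdc

# Carve a 5 GiB logical volume named LV1 out of the group
lvcreate --size 5G --name LV1 Arth_VG

# Format the logical volume with ext4 and mount it
mkfs.ext4 /dev/Arth_VG/LV1
mkdir -p /DN
mount /dev/Arth_VG/LV1 /DN

# Verify the mount and its size
df -h /DN
```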
Now Start the Service of DataNode.
Now check the report of hadoop cluster in NameNode using command: hadoop dfsadmin -report
Here you can see the DataNode is now contributing only 5 GB to the cluster.
Part 2: Increase or Decrease the Size of a Static Partition in Linux
First, increase the logical volume size. I am increasing the size of the logical volume by 4 GB using the command: lvextend --size +4G /dev/Arth_VG/LV1
Now we have to extend the filesystem over the extra 4 GB. resize2fs grows the ext4 filesystem to cover the whole LV (there is no need to reformat): resize2fs /dev/Arth_VG/LV1
Now check how much the size has increased using the command: lvdisplay Arth_VG/LV1
Here, the LV size has increased from 5 GiB to 9 GiB. Now check the Hadoop cluster report on the NameNode with the command: hadoop dfsadmin -report
Here, we have extended the logical volume. We can also reduce the LV size using the command: lvreduce -L 2G /dev/Arth_VG/LV1 — but note that shrinking is not symmetric with extending: the filesystem must be shrunk before the LV is reduced, otherwise the data on it is destroyed.
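An ext4 filesystem can only be shrunk offline, so the safe shrink sequence (a sketch, assuming the names used above) is:

```shell
# Take the filesystem offline; shrinking a mounted ext4 fs is not allowed
umount /DN

# resize2fs requires a clean filesystem check before shrinking
e2fsck -f /dev/Arth_VG/LV1

# Shrink the filesystem to the target size FIRST
resize2fs /dev/Arth_VG/LV1 2G

# Only then shrink the logical volume to match
lvreduce -L 2G /dev/Arth_VG/LV1

# Remount the now-smaller volume
mount /dev/Arth_VG/LV1 /DN
```

Alternatively, lvreduce -r performs the filesystem resize and the LV reduction together in the correct order.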
Now check the size of logical volume using command: lvdisplay Arth_VG/LV1
Here, you can see the LV size is reduced from 9 GB to 2 GB. Finally, the task is completed.