Elasticity Task

Before starting the task, let's review some concepts:

What is Hadoop?

Apache Hadoop is an open-source software framework used to develop data-processing applications that run in a distributed computing environment.

Applications built with Hadoop run on large data sets distributed across clusters of commodity computers, which are cheap and widely available. This makes it possible to achieve great computational power at low cost.

What is LVM?

LVM is a tool for logical volume management which includes allocating disks, striping, mirroring and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks.

The physical volumes are combined into logical volumes, with the exception of the /boot partition. The /boot partition cannot be on a logical volume because the boot loader cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition which is not part of a volume group.

Since a physical volume cannot span over multiple drives, to span over more than one drive, create one or more physical volumes per drive.


The volume groups can be divided into logical volumes, which are assigned mount points such as /home and /, and file system types such as ext2 or ext3. When a "partition" reaches its full capacity, free space from the volume group can be added to the logical volume to increase the size of the partition. When a new hard drive is added to the system, it can be added to the volume group, and the logical volumes can be increased in size.
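As a quick sketch of that grow-on-demand workflow (the device, volume group, and logical volume names here are hypothetical, used only for illustration):

pvcreate /dev/sdd                        # initialize the newly added drive as a physical volume
vgextend my_vg /dev/sdd                  # add it to the existing volume group
lvextend --size +10G /dev/my_vg/my_lv    # grow the logical volume using the new free space
resize2fs /dev/my_vg/my_lv               # grow the ext4 filesystem to use the added space

The mounted "partition" grows in place; nothing is reformatted and no data is moved.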


Task Description

Integrating LVM with Hadoop and providing Elasticity to DataNode Storage

Increase or Decrease the Size of Static Partition in Linux.

Let's start the task:

Integrating LVM with Hadoop and providing Elasticity to DataNode Storage

First, launch an instance in AWS and configure it as the NameNode.


To configure it, we first need two packages installed on this instance:

1. jdk-8u171-linux-x64.rpm

2. hadoop-1.2.1-1.x86_64.rpm

Install the JDK and Hadoop packages using the following commands: rpm -ivh jdk-8u171-linux-x64.rpm and rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force, respectively.

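To confirm both packages installed correctly (a quick sanity check, not part of the original steps), you can query the RPM database and the tool versions:

rpm -qa | grep -i -e jdk -e hadoop    # both packages should be listed
java -version                         # prints the installed JDK version
hadoop version                        # prints 1.2.1 if the install succeeded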

Next we have to create a directory, e.g. mkdir /namenode. Then go to the /etc/hadoop directory.

mkdir /namenode

Then we have to configure the hdfs-site.xml file using the command: vi hdfs-site.xml

vi hdfs-site.xml
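A minimal hdfs-site.xml for the NameNode would look like the following. This is a sketch reconstructed from the steps above (written as a shell heredoc, equivalent to typing the XML in vi); dfs.name.dir is the Hadoop 1.x property for the NameNode metadata directory:

cat > hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/namenode</value>
  </property>
</configuration>
EOF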

Now configure the core-site.xml file using the command: vi core-site.xml

vi core-site.xml
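Again as a sketch, core-site.xml on the NameNode sets the HDFS entry point. The port 9001 is an assumption (the article's screenshot is unavailable); use whatever port your cluster standardizes on:

cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- 0.0.0.0 accepts connections on any of the instance's addresses;
         port 9001 is an assumption, match it to your own setup -->
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>
EOF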

Now format the /namenode directory that we created, using the command: hadoop namenode -format

hadoop namenode -format

Start the service using the following command: hadoop-daemon.sh start namenode, then use jps to check whether it has started.

hadoop-daemon.sh start namenode
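If the daemon came up, jps should list a NameNode process (the PIDs shown here are illustrative):

jps
# 1234 NameNode
# 1301 Jps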

Next we have to configure the DataNode:

The JDK and Hadoop installation steps are the same on the DataNode.

First launch an instance in AWS and configure it as the DataNode. Then install the JDK and Hadoop packages using the following commands: rpm -ivh jdk-8u171-linux-x64.rpm and rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force, respectively.


Now create a directory, e.g. mkdir /datanode; there is no need to format this directory. Then go to the /etc/hadoop directory to configure the hdfs-site.xml and core-site.xml files.

mkdir /datanode
vi hdfs-site.xml
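On the DataNode, hdfs-site.xml points at the data directory instead. Again a sketch in the same heredoc style; dfs.data.dir is the Hadoop 1.x property for where block data is stored:

cat > hdfs-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/datanode</value>
  </property>
</configuration>
EOF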

Now configure the core-site.xml file using the command: vi core-site.xml. In this file we use the IP address of the NameNode.

vi core-site.xml
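A sketch of that file; replace the NAMENODE_IP placeholder with the NameNode's actual IP address, and keep the port identical to whatever the NameNode's core-site.xml uses:

cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- NAMENODE_IP is a placeholder for your NameNode's IP address -->
    <value>hdfs://NAMENODE_IP:9001</value>
  </property>
</configuration>
EOF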

Now start the DataNode using the command: hadoop-daemon.sh start datanode, and use jps to check whether it has started.

hadoop-daemon.sh start datanode

Now go to the NameNode and use the command hadoop dfsadmin -report to check whether the DataNode is connected.

hadoop dfsadmin -report

Here you can see that the storage provided by the DataNode to the cluster is 7.99 GB; we will change that using LVM.

Now let's start with our main topic: we have to attach volumes to the DataNode system. I have attached two virtual disks to the VM, named redhat8_arth_1.vdi and redhat8_arth_2.vdi.


Now check how many disks are attached to the OS using the command: fdisk -l

fdisk -l

Create the physical volumes using the following commands: pvcreate /dev/sdb and pvcreate /dev/sdc

pvcreate /dev/sdb
pvcreate /dev/sdc

To display a physical volume, use the command: pvdisplay <storage_name>, e.g. pvdisplay /dev/sdb

pvdisplay /dev/sdb

After creating the physical volumes, we have to create a volume group using the command: vgcreate Arth_VG /dev/sdb /dev/sdc

vgcreate Arth_VG  /dev/sdb  /dev/sdc

To display the volume group, use the command: vgdisplay Arth_VG

vgdisplay Arth_VG

Here the volume group is created and its size is 29.99 GiB.

Now we have to create a logical volume using the command: lvcreate --size 5G --name LV1 Arth_VG

lvcreate --size 5G --name LV1 Arth_VG

To display the logical volume, use the command: lvdisplay Arth_VG/LV1

lvdisplay Arth_VG/LV1

After creating the LV, we have to format it using the command: mkfs.ext4 /dev/Arth_VG/LV1

mkfs.ext4 /dev/Arth_VG/LV1

Now create a directory to mount this logical volume, using the command: mkdir /DN. Then mount the logical volume onto this directory using the command: mount /dev/Arth_VG/LV1 /DN

mkdir /DN
mount /dev/Arth_VG/LV1 /DN

Now check whether the logical volume is mounted, using the command: df -h

df -h
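With a default LVM setup, the mounted LV appears under its device-mapper name. The sizes below are illustrative (ext4 overhead makes a 5G LV report slightly less than 5G):

df -h /DN
# Filesystem               Size  Used Avail Use% Mounted on
# /dev/mapper/Arth_VG-LV1  4.9G   20M  4.6G   1% /DN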

Now start the DataNode service again. (Note: for the DataNode to contribute the logical volume's capacity, its data directory must be on the mounted volume, e.g. by pointing dfs.data.dir in hdfs-site.xml at /DN or by mounting the LV over /datanode.)


Now check the report of the Hadoop cluster on the NameNode using the command: hadoop dfsadmin -report

hadoop dfsadmin -report

Here you can see the DataNode is now contributing only 5 GB to the cluster.

Increase or Decrease the Size of Static Partition in Linux.

First, increase the logical volume size. I am increasing the size of the logical volume by 4 GB using the command: lvextend --size +4G /dev/Arth_VG/LV1

lvextend --size +4G  /dev/Arth_VG/LV1

Now the extra 4 GB must be made usable. We do not reformat the volume; resize2fs grows the existing ext4 filesystem over the newly added space, online and without data loss, using the command: resize2fs /dev/Arth_VG/LV1

resize2fs /dev/Arth_VG/LV1
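As a side note, the two steps above can be combined: lvextend's --resizefs flag runs resize2fs for you after extending the LV (an alternative, not what the article did):

lvextend --resizefs --size +4G /dev/Arth_VG/LV1    # extend the LV and grow the filesystem in one step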

Now check how much the size has increased, using the command: lvdisplay Arth_VG/LV1

lvdisplay Arth_VG/LV1

Here the LV size has increased from 5 GiB to 9 GiB. Now check the Hadoop cluster report on the NameNode using the command: hadoop dfsadmin -report

hadoop dfsadmin -report

Here, we have extended the logical volume. In the same way we can also reduce the LV size using the command: lvreduce -L 2G /dev/Arth_VG/LV1. Unlike extending, reducing is destructive if the filesystem is not shrunk first; see the sketch after the command below.

lvreduce -L 2G /dev/Arth_VG/LV1
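A caution worth spelling out: lvreduce on its own shrinks the LV out from under the filesystem, so cutting below the filesystem's size destroys data. A safer sequence (a sketch; the filesystem must be unmounted, checked, and shrunk before the LV itself):

umount /DN                         # the ext4 filesystem cannot be shrunk while mounted
e2fsck -f /dev/Arth_VG/LV1         # force a filesystem check (resize2fs requires this)
resize2fs /dev/Arth_VG/LV1 2G      # shrink the filesystem to the target size first
lvreduce -L 2G /dev/Arth_VG/LV1    # now shrink the logical volume to match
mount /dev/Arth_VG/LV1 /DN         # remount

Alternatively, lvreduce --resizefs -L 2G /dev/Arth_VG/LV1 performs the filesystem shrink for you.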

Now check the size of the logical volume using the command: lvdisplay Arth_VG/LV1

lvdisplay Arth_VG/LV1

Here, you can see the LV size is reduced from 9 GiB to 2 GiB. With that, the task is complete.


Thank you for reading the article.