Contribute a Limited Amount of Storage from a DataNode in a Hadoop Cluster
Task :-
In a Hadoop cluster, how do we contribute only a limited/specific amount of storage from a slave (DataNode) to the cluster?
* To complete this task, we have to use the Linux partition concept.
* I will follow the steps below for this task -
<1>. Add a New Hard Disk to the DataNode
<2>. Create a Partition on the Added Device at the DataNode
<3>. Format & Mount the Partition at the DataNode
<4>. Configure the NameNode
<5>. Configure the DataNode
<6>. Check the Contribution of the DataNode to the Distributed File Storage of the Hadoop Cluster
I have a Hadoop cluster in which one NameNode and one DataNode are present.
IP of NameNode - 192.168.43.106, hostname "NN1"
IP of DataNode - 192.168.43.65, hostname "DN1"
Step - 1: Add a New Hard Disk to the DataNode -
I am using Oracle VirtualBox, so we don't need to purchase a new physical hard disk; we will use a virtual hard disk instead.
We are adding a new hard disk because we don't have any unallocated space left to create partitions.
* You can also refer to this video for a better understanding -
To add a new hard disk, the DataNode VM must be in the "Stopped" (powered off) state. Then follow these steps -
(A) Go to Storage in the Settings of the DataNode VM -
(B) Click on "Controller: SATA" and then click the "+" icon next to "Controller: SATA" -
(C) Click on "Create" -
(D) Click on "Next" -
(E) Click "Next" again -
(F) Choose your hard disk size & click "Create" -
In my case the hard disk size is 10 GiB.
(G) Select "DN_1.vdi" (in my case the new hard disk is named "DN_1.vdi") and choose it -
(H) Now our hard disk is attached -
(I) To check whether the hard disk is attached, run the "fdisk -l" command -
You will see an entry like "/dev/sdb : 10 GiB".
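A minimal sketch of this check (the device name /dev/sdb and the 10 GiB size are from my setup; yours may differ):

fdisk -l                # list all disks attached to the DataNode
# the new virtual disk should appear roughly as:
#   Disk /dev/sdb: 10 GiB, 10737418240 bytes, 20971520 sectors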
Step - 2: Create a Partition on the Added Device at the DataNode -
* You can also refer to this video for a better understanding -
(A) Run "fdisk /dev/sdb" command -
"/dev/sdb" is name of added device in previous step.
(B) Run "n" to create new partition -
(C) Run "p" -
* Here I want to create Primary Partitions.
(D) Press "Enter" -
(E) Again press "Enter" -
(F) Give the value of the last sector -
* I want to create a 2 GiB partition so that the DataNode can contribute only 2 GiB to the Hadoop cluster.
(G) Run "w" to Save this Partition -
(H) Run "fdisk -l /dev/sdb" to check partition -
(I) Run "udevadm settle" to load Driver for Partition -
* Whenever New device is added in Computer then we have to load respectively driver so that we can communicate with that device.
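Here is a rough sketch of the whole fdisk dialogue for this step. The "+2G" last-sector value is what I use to get a 2 GiB partition; adjust it to whatever size you want the DataNode to contribute:

fdisk /dev/sdb          # open the new disk in fdisk
#   Command (m for help): n          -> new partition
#   Partition type:       p          -> primary
#   Partition number:     <Enter>    -> accept the default (1)
#   First sector:         <Enter>    -> accept the default
#   Last sector:          +2G        -> make the partition 2 GiB
#   Command (m for help): w          -> write the table and exit

fdisk -l /dev/sdb       # verify that /dev/sdb1 now exists with ~2 GiB
udevadm settle          # wait until the kernel/udev has created /dev/sdb1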
Step - 3: Format & Mount the Partition at the DataNode -
* You can also refer to this video for a better understanding -
(A) Run "mkfs.ext4 /dev/sdb1" to format the partition -
* In my case I am using the "ext4" filesystem type; you can choose another type if you prefer.
(B) Create a directory where you want to mount the partition -
* I will use this directory for the Hadoop cluster's distributed file storage.
(C) Mount the partition at the "/DataNode" directory -
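A sketch of the commands for this step (ext4 and the /DataNode mount point are my choices, not requirements):

mkfs.ext4 /dev/sdb1         # format the 2 GiB partition with ext4
mkdir /DataNode             # create the mount point
mount /dev/sdb1 /DataNode   # mount the partition
df -h /DataNode             # verify: the size should show roughly 2 GiB

Note that a mount done this way does not survive a reboot; add an entry to /etc/fstab if you want it to be permanent.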
Step - 4: Configure the NameNode -
(A) Make a directory "/nn" -
(B) Configure the "hdfs-site.xml" file -
(C) Configure the "core-site.xml" file -
(D) Format the NameNode -
(E) Start the NameNode -
Check with the "jps" command whether the NameNode is running or not.
(F) Stop firewalld -
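To make these sub-steps concrete, here is a minimal sketch of the NameNode side. I am assuming Hadoop 1.x-style property names (dfs.name.dir, fs.default.name), port 9001, and config files under /etc/hadoop/ - adjust the path, port, and property names (dfs.namenode.name.dir / fs.defaultFS on Hadoop 2+) to your installation:

mkdir /nn                                  # (A) directory for the NameNode metadata

# (B) hdfs-site.xml (assumed path /etc/hadoop/hdfs-site.xml)
cat > /etc/hadoop/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>
EOF

# (C) core-site.xml (assumed path; 9001 is an assumed port)
cat > /etc/hadoop/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.43.106:9001</value>
  </property>
</configuration>
EOF

hadoop namenode -format                    # (D) format the NameNode
hadoop-daemon.sh start namenode            # (E) start the NameNode daemon
jps                                        # should list "NameNode"
systemctl stop firewalld                   # (F) stop firewalld so the DataNode can connect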
Step - 5: Configure the DataNode -
(A) Configure the "hdfs-site.xml" file -
(B) Configure the "core-site.xml" file -
(C) Stop firewalld -
(D) Start the DataNode -
Check with the "jps" command whether the DataNode is running or not.
Step - 6: Check the Contribution of the DataNode to the Distributed File Storage of the Hadoop Cluster -
* Run the command "hadoop dfsadmin -report" -
You can see that the DataNode is contributing around 2 GiB. Thus we can limit the amount of storage a DataNode contributes to a Hadoop cluster.
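The relevant part of the report looks roughly like this (the values below are illustrative; "Configured Capacity" reflects the size of the mounted partition, minus a little filesystem overhead):

hadoop dfsadmin -report
#   Datanodes available: 1
#   Name: 192.168.43.65:50010
#   Configured Capacity: ~2 GiB    (the /dev/sdb1 partition mounted at /DataNode)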
Thank you for giving your time to my article.