Building a Hadoop Cluster with the Powerful Automation Tool: Ansible

Hello everyone,

Here's my new blog where I am going to show you how to configure a whole Hadoop cluster, that is, configuring one node as the Namenode and another as the Datanode.
What's new...?

This whole cluster setup will not be done by me manually; instead, one of the most powerful automation tools on the market, Ansible, will do it for me.

Sounds cool....

Firstly, we have to create one virtual machine as the Ansible controller node and two more VMs, one for the Namenode/Master and the other for the Datanode/Slave. These two VMs will work as the target nodes for Ansible.


Now we will install Ansible on our controller node using pip, because Ansible is written in Python. Use this command...

#pip3 install ansible
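
To check that the installation worked, we can print the installed version:

#ansible --version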

As Ansible is agentless, we don't need to install any Ansible software on our target nodes.

Now, the controller node doesn't have any information about the targets, so how will it do the configuration on them? For that, we have to give both target nodes' IPs, user names and login passwords to the controller node in the /etc/ip.txt file so that the controller node can do the configuration. This file is known as the inventory, where we list the IP of every target node we want to configure, with one node's information on each line.
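
Since the screenshot isn't reproduced here, this is a minimal sketch of what /etc/ip.txt might look like, assuming the group names Master and Slave that the ad-hoc commands below use; the IPs and the password are placeholders you should replace with your own:

[Master]
192.168.43.10  ansible_user=root  ansible_ssh_pass=redhat

[Slave]
192.168.43.20  ansible_user=root  ansible_ssh_pass=redhat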


And now we have to tell Ansible about this file by defining its name in the Ansible configuration file. By default, no configuration file is provided when Ansible is installed via pip, so we have to create the folder and the file ourselves using the following commands:

#mkdir /etc/ansible

#vim /etc/ansible/ansible.cfg

Now write the following in this file...
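
A minimal sketch of what ansible.cfg might contain, pointing Ansible to the inventory file created above (host_key_checking = False is an optional extra that skips the SSH host-key prompt on the first connection):

[defaults]
inventory = /etc/ip.txt
host_key_checking = False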


It is always good practice to check that all the managed nodes are reachable by pinging them.
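
The check can be done with Ansible's ping module; a SUCCESS reply from each node means the controller can reach and log in to it:

#ansible all -m ping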


Now we will configure the Namenode and the Datanode using the Ansible automation tool.

Before using the Ansible tool, it is always good practice to make a hard-copy or soft-copy note of all the steps we need for this configuration.

Ansible can be used in two ways: firstly, via the CLI method, basically known as ad-hoc commands, and secondly, by creating a playbook.

First we will run the ad-hoc commands step by step.

Note: If you find any of the ad-hoc commands difficult, then go for the playbook.

Step 1: Copying the JDK and Hadoop software to all nodes

Both packages, which I have downloaded into my Hadoop folder, will be copied to the Master and Slave nodes.

#ansible all -m copy -a "src=/hadoop/hadoop-1.2.1-1.x86_64.rpm dest=/root"


#ansible all -m copy -a "src=/hadoop/jdk-8u171-linux-x64.rpm dest=/root"

Step 2: Clearing the caches on all nodes

In some cases, after copying these two packages, if the base RedHat OS has too little free RAM we will not be able to do the installation. So in this step we are going to clear the caches.

#ansible all -m shell -a "echo 3 > /proc/sys/vm/drop_caches"

Step 3: Installing the JDK and Hadoop on all nodes

#ansible all -m shell -a "rpm -i jdk-8u171-linux-x64.rpm"


#ansible all -m shell -a "rpm -i hadoop-1.2.1-1.x86_64.rpm"

Step 4: Configuring the Master node

Now we are going to configure the Namenode, and these are the steps:

Creating a Namenode Directory

# ansible Master -m file -a "name='nn' state=directory"

Copying hdfs-site.xml

#ansible Master -m copy -a "src=/hadoop/nn_hdfs-site.xml dest=/etc/hadoop/hdfs-site.xml"


Copying core-site.xml

#ansible Master -m copy -a "src=/hadoop/nn_core-site.xml dest=/etc/hadoop/core-site.xml"


Format the Master node

#ansible Master -m shell -a "echo Y | hadoop namenode -format"

Start the services

#ansible Master -m shell -a "hadoop-daemon.sh start namenode"

Step 5: Configuring the Slave node

Now we are going to configure the Datanode, and these are the steps:

Creating a Datanode Directory

# ansible Slave -m file -a "name='dn' state=directory"

Copying hdfs-site.xml

#ansible Slave -m copy -a "src=/hadoop/dn_hdfs-site.xml dest=/etc/hadoop/hdfs-site.xml"

Copying core-site.xml

#ansible Slave -m copy -a "src=/hadoop/dn_core-site.xml dest=/etc/hadoop/core-site.xml"

Starting the Datanode services

#ansible Slave -m shell -a "hadoop-daemon.sh start datanode"

Here is the playbook for the same steps. (You can also download it from my GitHub: https://github.com/24-komal/Ansible_Hadoop_Cluster_Configuration)
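
The playbook screenshot isn't reproduced here, so below is a minimal sketch of what cluster.yml might look like, assembled from the ad-hoc steps above. The group names Master and Slave come from the inventory; the absolute /nn and /dn paths and the ignore_errors flags (so that re-runs don't stop at the rpm steps, as mentioned below) are my own assumptions:

- hosts: all
  tasks:
    - name: Copy the JDK and Hadoop packages
      copy:
        src: "/hadoop/{{ item }}"
        dest: /root
      loop:
        - jdk-8u171-linux-x64.rpm
        - hadoop-1.2.1-1.x86_64.rpm

    - name: Clear the caches
      shell: echo 3 > /proc/sys/vm/drop_caches

    - name: Install the JDK
      shell: rpm -i /root/jdk-8u171-linux-x64.rpm
      ignore_errors: yes

    - name: Install Hadoop
      shell: rpm -i /root/hadoop-1.2.1-1.x86_64.rpm
      ignore_errors: yes

- hosts: Master
  tasks:
    - name: Create the Namenode directory
      file:
        path: /nn
        state: directory

    - name: Copy the Namenode configuration files
      copy:
        src: "/hadoop/nn_{{ item }}"
        dest: "/etc/hadoop/{{ item }}"
      loop:
        - hdfs-site.xml
        - core-site.xml

    - name: Format the Namenode
      shell: echo Y | hadoop namenode -format

    - name: Start the Namenode service
      shell: hadoop-daemon.sh start namenode

- hosts: Slave
  tasks:
    - name: Create the Datanode directory
      file:
        path: /dn
        state: directory

    - name: Copy the Datanode configuration files
      copy:
        src: "/hadoop/dn_{{ item }}"
        dest: "/etc/hadoop/{{ item }}"
      loop:
        - hdfs-site.xml
        - core-site.xml

    - name: Start the Datanode service
      shell: hadoop-daemon.sh start datanode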

To run this Playbook use the following command...

#ansible-playbook cluster.yml


Here my packages are already installed and I am running this playbook again and again, which is why it shows this error. It can be ignored.


As we can see, our Namenode service has been started.


Now the Datanode service is also started.

After running the playbook on the controller node, we can check from the target nodes whether everything has launched successfully by using the "jps" command.
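
As a quick sketch of that check, run on each target node:

#jps

On the Master the list should include a NameNode process, and on the Slave a DataNode process. Optionally, running "hadoop dfsadmin -report" on the Master shows whether the Datanode has joined the cluster.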

Hope you find this article interesting.
Thank you!!