Configuring Hadoop via Ansible Playbook
Shreya Garg
LLM Engineer @ ZS | Artificial Intelligence Enthusiast | Red Hat Certified Engineer
Before configuring Hadoop with an Ansible playbook, we first need to know what Hadoop is.
HADOOP
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.
We use Hadoop to solve big data problems.
To learn about Ansible playbooks, see my previous blog post.
CONFIGURING HADOOP VIA ANSIBLE PLAYBOOK
First, we'll move into the Ansible playbook folder and create a YAML file for the playbook.
#cd /ansible-playbook
#vim 11task.yml
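The exact contents of 11task.yml are not reproduced in the text, so the following is only a minimal sketch of what such a playbook might contain, written out with a heredoc instead of typed into vim. It assumes Hadoop 1.x and Java are already installed on the managed nodes, that the configuration lives in /etc/hadoop, and that /nn and /dn are the storage directories. All of these, along with the mapping of cnn.xml, hnn.xml and hdn.xml to core-site.xml and hdfs-site.xml, are assumptions made for illustration.

```shell
# Hypothetical sketch: writes a minimal two-play playbook to 11task.yml.
# Paths (/etc/hadoop, /nn, /dn) and the file-to-destination mapping are assumptions.
cat > 11task.yml <<'EOF'
- hosts: namenode
  tasks:
    - name: Copy core-site.xml to the NameNode
      copy:
        src: cnn.xml
        dest: /etc/hadoop/core-site.xml

    - name: Copy hdfs-site.xml to the NameNode
      copy:
        src: hnn.xml
        dest: /etc/hadoop/hdfs-site.xml

    - name: Format the NameNode once (skipped if /nn/current exists)
      shell: echo Y | hadoop namenode -format
      args:
        creates: /nn/current

    - name: Start the NameNode daemon
      command: hadoop-daemon.sh start namenode

- hosts: datanode
  tasks:
    - name: Create the DataNode storage directory
      file:
        path: /dn
        state: directory

    - name: Copy core-site.xml to the DataNode
      copy:
        src: cnn.xml
        dest: /etc/hadoop/core-site.xml

    - name: Copy hdfs-site.xml to the DataNode
      copy:
        src: hdn.xml
        dest: /etc/hadoop/hdfs-site.xml

    - name: Start the DataNode daemon
      command: hadoop-daemon.sh start datanode
EOF
```

The `creates` argument keeps the format step from running again on later playbook runs. The start tasks are not idempotent, which is acceptable for a first walkthrough but worth tightening in real use.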
The playbook refers to some additional XML files that Hadoop needs for its configuration, so we will create them in the Ansible playbook folder as well.
#vim cnn.xml
#vim hnn.xml
#vim hdn.xml
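The contents of these three files are not shown here, so the following is a rough sketch of what they could hold, assuming Hadoop 1.x property names, that cnn.xml is the shared core-site.xml, and that hnn.xml and hdn.xml are the hdfs-site.xml files for the NameNode and DataNode. The IP address, the port 9001 and the /nn and /dn directories are placeholders, not values from the original setup.

```shell
# Hypothetical sketch: generate the three Hadoop 1.x config files.
# NAMENODE_IP is a placeholder; replace it with the real NameNode address.
NAMENODE_IP="192.0.2.10"

# cnn.xml: assumed core-site.xml, tells every node where the NameNode listens.
cat > cnn.xml <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${NAMENODE_IP}:9001</value>
  </property>
</configuration>
EOF

# hnn.xml: assumed hdfs-site.xml for the NameNode (metadata directory).
cat > hnn.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>
EOF

# hdn.xml: assumed hdfs-site.xml for the DataNode (block storage directory).
cat > hdn.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn</value>
  </property>
</configuration>
EOF
```

On Hadoop 2.x and later these properties are named `fs.defaultFS`, `dfs.namenode.name.dir` and `dfs.datanode.data.dir`, so the names would need adjusting for a newer release.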
Also define groups for the datanode and the namenode in the inventory file, ip.txt
#vim ip.txt
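As an illustration, the inventory could look roughly like the sketch below. The group names match the namenode and datanode groups mentioned above, while the two IP addresses are placeholders from a documentation range and not the real managed nodes. Connection details such as users or passwords are left out on purpose.

```shell
# Hypothetical sketch: an INI-style inventory with one host per group.
# Both addresses are placeholders; substitute the real managed-node IPs.
cat > ip.txt <<'EOF'
[namenode]
192.0.2.10

[datanode]
192.0.2.11
EOF
```

If ansible.cfg does not already point at ip.txt, the inventory can be passed explicitly with `ansible-playbook -i ip.txt 11task.yml`.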
Now we will run the Ansible playbook
#ansible-playbook 11task.yml
Now log in to managed node 1 and run the jps command to check whether the NameNode process is running
#jps
Do the same on managed node 2 to confirm that the DataNode process is running, then check whether the DataNode has connected to the NameNode
#jps
#hadoop dfsadmin -report