Configuration of Hadoop Cluster using Ansible
Task 11.1
Configure Hadoop cluster using Ansible
1. Installation of Hadoop requirements
2. Configuration of Name Node and Data Node
3. Starting Hadoop services
Let's start with Ansible
Ansible is a configuration management tool. It is agentless and works on a push mechanism. Ansible is built on top of Python, so Python 3 must be installed before Ansible. We can then install Ansible with pip3 install ansible. After that, configure /etc/ansible/ansible.cfg and the inventory file.
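As a rough illustration of that configuration step, the script below writes a minimal ansible.cfg and inventory into a scratch directory. The group names (namenode, datanode), the 192.0.2.x addresses and the root login are placeholders assumed for this sketch, not values from the original setup.

```shell
#!/bin/sh
# Sketch: write a minimal inventory and ansible.cfg into a scratch directory.
# Copy them to /etc/ansible/ once the values match your environment.
set -e

workdir=$(mktemp -d)

# Hypothetical host groups and addresses; replace with your own nodes.
cat > "$workdir/inventory" <<'EOF'
[namenode]
192.0.2.10 ansible_user=root

[datanode]
192.0.2.11 ansible_user=root
EOF

# Unquoted EOF so that $workdir is expanded into the inventory path.
cat > "$workdir/ansible.cfg" <<EOF
[defaults]
inventory = $workdir/inventory
host_key_checking = False
EOF

echo "Wrote $workdir/ansible.cfg and $workdir/inventory"
# Once Ansible is installed, connectivity can be checked with:
#   ANSIBLE_CONFIG=$workdir/ansible.cfg ansible all -m ping
```

Disabling host_key_checking is a convenience for a lab setup; on a shared network it is safer to leave it enabled.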
Let's write the playbook for the master node
1. Transfer the Java JDK and install it on the target node, because Hadoop is written in Java. Then transfer the Hadoop package, which must be compatible with the installed Java version.
2. Create the directory for the master node and update the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files.
3. Format the /master directory, which stores the HDFS metadata, then start the master node service.
4. Then add firewall rules for 50070/tcp, 50010/tcp and 9001/tcp: 9001 is used for the Name Node service, 50070 for the web UI, and 50010 is the Data Node data transfer port.
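The four steps above could be sketched as a playbook like the one written by the script below. It is an illustration only: it assumes RPM packages, Hadoop 1.x property names (dfs.name.dir, fs.default.name), firewalld on the target, and a variables file called vars.yml with the hypothetical names jdk_rpm, hadoop_rpm, namenode_dir and namenode_ip.

```shell
#!/bin/sh
# Sketch: write a Name Node playbook to a scratch directory for review.
set -e
workdir=$(mktemp -d)

# Quoted 'EOF' keeps the {{ ... }} placeholders literal for Ansible to expand.
cat > "$workdir/namenode.yml" <<'EOF'
- hosts: namenode
  vars_files:
    - vars.yml   # hypothetical variables file
  tasks:
    - name: Copy the JDK and Hadoop packages to the node
      copy:
        src: "{{ item }}"
        dest: /root/
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    # Assumes RPM packages; plain rpm -i is not idempotent, so a second
    # run may fail once the package is installed.
    - name: Install the JDK, then Hadoop
      command: "rpm -ivh /root/{{ item }}"
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"

    - name: Create the Name Node metadata directory
      file:
        path: "{{ namenode_dir }}"
        state: directory

    # Property names are the Hadoop 1.x ones (an assumption).
    - name: Write hdfs-site.xml
      copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>{{ namenode_dir }}</value>
            </property>
          </configuration>

    - name: Write core-site.xml
      copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_ip }}:9001</value>
            </property>
          </configuration>

    - name: Format the Name Node directory (skipped once formatted)
      shell: echo Y | hadoop namenode -format
      args:
        creates: "{{ namenode_dir }}/current"

    - name: Start the Name Node service
      command: hadoop-daemon.sh start namenode

    - name: Open the firewall ports
      ansible.posix.firewalld:
        port: "{{ item }}"
        permanent: true
        immediate: true
        state: enabled
      loop:
        - 50070/tcp
        - 50010/tcp
        - 9001/tcp
EOF

echo "Playbook written to $workdir/namenode.yml"
```

The copy module overwrites both XML files completely, which is the simplest approach for a fresh node; if the files already carry other settings, a template or an in-place edit would be the safer choice.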
Let's start the configuration of the data node
1. Transfer the JDK and install it on the target node, because Hadoop is written in Java. Then transfer the Hadoop package, which must be compatible with the installed Java version.
2. Create the directory for the data node and update the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files.
3. Start the data node service and add the firewall rules.
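A possible shape for the data node playbook is sketched below. The transfer and install tasks from step 1 are left out for brevity. As before, the Hadoop 1.x property names, firewalld, the vars.yml file and the variable names datanode_dir and namenode_ip are assumptions made for this example, and opening only 50010/tcp on the data node is a guess at which rule is needed there.

```shell
#!/bin/sh
# Sketch: write a Data Node playbook to a scratch directory for review.
set -e
workdir=$(mktemp -d)

# Quoted 'EOF' keeps the {{ ... }} placeholders literal for Ansible to expand.
cat > "$workdir/datanode.yml" <<'EOF'
- hosts: datanode
  vars_files:
    - vars.yml   # hypothetical variables file
  tasks:
    # The JDK and Hadoop transfer/install tasks from step 1 go here.

    - name: Create the Data Node storage directory
      file:
        path: "{{ datanode_dir }}"
        state: directory

    # Property names are the Hadoop 1.x ones (an assumption).
    - name: Write hdfs-site.xml
      copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.data.dir</name>
              <value>{{ datanode_dir }}</value>
            </property>
          </configuration>

    # The Data Node points at the Name Node address, not at itself.
    - name: Write core-site.xml
      copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_ip }}:9001</value>
            </property>
          </configuration>

    - name: Start the Data Node service
      command: hadoop-daemon.sh start datanode

    - name: Open the Data Node port
      ansible.posix.firewalld:
        port: 50010/tcp
        permanent: true
        immediate: true
        state: enabled
EOF

echo "Playbook written to $workdir/datanode.yml"
```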
Then let's create a file for the variables mentioned in the playbooks above.
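A variables file might look like the one written below. Every name and value is a placeholder chosen for this sketch, apart from /master, which is the Name Node directory mentioned earlier; the package file names in particular need to be replaced with the real ones.

```shell
#!/bin/sh
# Sketch: write a variables file to a scratch directory.
set -e
workdir=$(mktemp -d)

cat > "$workdir/vars.yml" <<'EOF'
# Placeholder values; adjust all of them to your environment.
jdk_rpm: jdk.rpm            # hypothetical JDK package file name
hadoop_rpm: hadoop.rpm      # hypothetical Hadoop package file name
namenode_dir: /master       # Name Node metadata directory
datanode_dir: /data         # hypothetical Data Node storage directory
namenode_ip: 192.0.2.10     # hypothetical Name Node address
EOF

echo "Variables written to $workdir/vars.yml"
```

Keeping these values in one file means both playbooks can load it with vars_files and stay free of hard-coded paths and addresses.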
Then check the syntax of the playbook with ansible-playbook --syntax-check playbook_name, and run it with ansible-playbook playbook_name.
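The two commands can be chained so that a playbook only runs when its syntax check passes. The helper below is a small sketch of that idea; the function name check_and_run, the ANSIBLE_PLAYBOOK override and the playbook file names in the usage comment are all hypothetical.

```shell
#!/bin/sh
# Sketch: run a playbook only if its syntax check succeeds.

# Allows pointing at a specific ansible-playbook binary, e.g. in a virtualenv.
ANSIBLE_PLAYBOOK=${ANSIBLE_PLAYBOOK:-ansible-playbook}

check_and_run() {
    playbook=$1
    # Stop early when the syntax check reports a problem.
    "$ANSIBLE_PLAYBOOK" --syntax-check "$playbook" || {
        echo "syntax check failed for $playbook" >&2
        return 1
    }
    "$ANSIBLE_PLAYBOOK" "$playbook"
}

# Usage (file names are hypothetical):
#   check_and_run namenode.yml
#   check_and_run datanode.yml
```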
Name node:
Data Node:
Thus, I have successfully completed Task 11.1.