CONFIGURE HADOOP AND START CLUSTER SERVICES USING ANSIBLE PLAYBOOK:-
Udit Agarwal
REDHAT ANSIBLE:-
Ansible is an open-source automation tool used for IT tasks such as configuration management, application deployment, intra-service orchestration, and provisioning. Automation is crucial these days: IT environments are too complex, and often need to scale too quickly, for system administrators and developers to keep up if they had to do everything manually. Automation simplifies complex tasks, not just making developers' jobs more manageable but allowing them to focus on other work that adds value to the organization. In other words, it frees up time and increases efficiency.
HADOOP CLUSTER:-
Hadoop is an Apache open-source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.
It uses a master-slave architecture, also known as the NameNode-DataNode architecture.
Task 11.1 Description:-
Configure Hadoop and start cluster services using an Ansible playbook.
Let's start...
My controller node's IP is 192.168.0.102; Ansible is installed there, and it also serves as my Hadoop Master/NameNode.
My target/managed nodes' IPs are 192.168.0.104 and 192.168.0.103, which are my Hadoop Slave/DataNode and Client respectively.
STEP 1:- Check the Ansible version.
ansible --version
STEP 2:- Update the inventory file and then verify connectivity with the ping module.
gedit /root/ip.txt
ansible all --list-hosts
ansible all -m ping
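The inventory file lists the managed nodes Ansible should control. A minimal sketch of `/root/ip.txt` matching the IPs above — the group names and the connection variables are assumptions, adjust them to your environment:

```
[datanode]
192.168.0.104

[client]
192.168.0.103

# Connection settings are illustrative; use SSH keys where possible
[all:vars]
ansible_user=root
```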
STEP 3:- Update the Ansible configuration file.
gedit /etc/ansible/ansible.cfg
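The article does not show the exact contents of the configuration file, but for a setup like this the key settings are typically the inventory path and host-key checking. A minimal sketch (values assumed from the inventory path used above):

```
[defaults]
# Point Ansible at the inventory file created in STEP 2
inventory = /root/ip.txt
# Skip interactive SSH host-key prompts on first connection
host_key_checking = False
```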
STEP 4:- Write the Ansible playbook.
gedit task11-1.yml
To configure the Hadoop cluster we need the JDK and the Hadoop software, so the playbook first copies and installs both packages on the NameNode, DataNode, and Client.
It then configures the NameNode, DataNode, and Client individually.
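The software-distribution part of the playbook described above could be sketched as follows. This is a minimal illustration, not the author's actual playbook: the variable names (`jdk_rpm`, `hadoop_rpm`) and package file names are assumptions, and the real task list also handles the per-role configuration:

```yaml
# Sketch: copy and install the JDK and Hadoop packages on every node.
- hosts: all
  vars_files:
    - var.yml
  tasks:
    - name: Copy JDK installer to the node
      copy:
        src: "{{ jdk_rpm }}"        # assumed variable, e.g. a JDK rpm file name
        dest: /root/

    - name: Copy Hadoop installer to the node
      copy:
        src: "{{ hadoop_rpm }}"     # assumed variable, e.g. hadoop-1.2.1-1.x86_64.rpm
        dest: /root/

    - name: Install the JDK
      command: "rpm -i /root/{{ jdk_rpm }}"

    - name: Install Hadoop
      command: "rpm -i /root/{{ hadoop_rpm }} --force"
```

After this common part, separate plays (or task blocks) would template the XML configuration files onto the NameNode, DataNode, and Client, format the NameNode, and start the daemons.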
gedit var.yml
It contains all the variables and their values used in the playbook.
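A hypothetical shape for `var.yml` — these names and values are illustrative only, chosen to line up with the IPs mentioned earlier; the real file may differ:

```yaml
# Assumed variable file; adjust names and values to your setup.
jdk_rpm: jdk-8u171-linux-x64.rpm          # assumed JDK package name
hadoop_rpm: hadoop-1.2.1-1.x86_64.rpm     # assumed Hadoop 1.x package name
namenode_ip: 192.168.0.102                # controller node / NameNode from above
namenode_dir: /nn                         # assumed NameNode storage directory
datanode_dir: /dn                         # assumed DataNode storage directory
```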
gedit hdfs-site.xml
The hdfs-site.xml files for the NameNode and DataNode, used as templates in the playbook, are stored in their respective folders, namenode_files and datanode_files.
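For Hadoop 1.x, the two hdfs-site.xml templates typically differ only in which storage property they set. A sketch of the NameNode variant, assuming the variable names from a `var.yml` like the one described above:

```xml
<!-- namenode_files/hdfs-site.xml (sketch); the DataNode template would
     instead set dfs.data.dir to the DataNode's storage directory. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>{{ namenode_dir }}</value>
  </property>
</configuration>
```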
gedit core-site.xml
The core-site.xml file for the NameNode, DataNode, and Client, used as a template in the playbook, is stored in the workspace.
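All three roles can share one core-site.xml template because they all need the same thing: the address of the NameNode. A sketch, assuming a `namenode_ip` variable and Hadoop 1.x property names; the port is an assumption, commonly 9001 in Hadoop 1.x tutorials:

```xml
<!-- core-site.xml (sketch), templated onto NameNode, DataNode, and Client -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://{{ namenode_ip }}:9001</value>
  </property>
</configuration>
```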
STEP 5:- Run the Ansible playbook.
ansible-playbook task11-1.yml
STEP 6:- Verify that the directories were created and the services started, then generate a report.
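On the NameNode, the verification could be done with commands like these (Hadoop 1.x CLI; the exact output depends on your cluster):

```
# List the running Java daemons; NameNode should appear here,
# and DataNode on the slave node
jps

# Summarize the cluster: configured capacity, live DataNodes, free space
hadoop dfsadmin -report
```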
We can clearly see that one DataNode (slave) has been successfully added to the NameNode (master), contributing 49.98 GB of configured capacity, of which 43.87 GB is available. To grow the cluster, we just need to add the IPs of additional DataNodes to the inventory file and rerun the playbook.
TASK COMPLETED SUCCESSFULLY!
Thanks for reading!