Configure Hadoop and Start Cluster Services Using an Ansible Playbook, and Make HTTPD Service Restarts Idempotent Using an Ansible Playbook

Nishant Singh

Senior Software Engineer@HCL Tech | Red Hat Certified System Administrator | AWS Certified Solution Architect-Associate | AWS Certified Developer Associate | AWS Cloud Practitioner Certified

Published: November 29, 2020

What is Ansible?

Ansible is an open-source automation tool, or platform, used for IT tasks such as configuration management, application deployment, intraservice orchestration, and provisioning. Automation is crucial these days: IT environments are too complex, and often need to scale too quickly, for system administrators and developers to keep up if they do everything manually. Automation simplifies complex tasks, making developers' jobs more manageable and letting them focus on other work that adds value to an organization. In other words, it frees up time and increases efficiency, and Ansible is rapidly rising to the top in the world of automation tools.

Advantages of Ansible:

Free: Ansible is an open-source tool.
Very simple to set up and use: No special coding skills are necessary to use Ansible’s playbooks (more on playbooks later).
Powerful: Ansible lets you model even highly complex IT workflows.
Flexible: You can orchestrate the entire application environment no matter where it’s deployed. You can also customize it based on your needs.
Agentless: You don't need to install any agent software or open extra firewall ports on the client systems you want to automate. You also don't have to set up a separate management structure.
Efficient: Because no extra software has to be installed on your servers, more resources remain available for your applications.

Task Description

11.1 Configure Hadoop and start cluster services using an Ansible playbook.

11.3 Restarting the HTTPD service is not idempotent and also consumes more resources. Suggest a way to rectify this challenge in an Ansible playbook.

Solution 11.1:

Before starting the task, you have to install Ansible and configure its inventory. Run this command on your VM to install Ansible:

pip3 install ansible

Next, create an inventory file with any name you like; in my case it is /etc/myhosts.txt. In it, write the IP address of each target virtual machine (the VMs on which you want to configure the Hadoop NameNode and DataNode) along with connection details such as the user (root) and the password.
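As a rough illustration, an inventory for this setup could look like the sketch below. It uses Ansible's YAML inventory format rather than the plain-text file from this article, so the file name inventory.yml is hypothetical. The namenode and datanode group names match the hosts lines of the playbook further down. The address 192.168.43.102 is assumed to be the NameNode because it is the name_ip value in var.yml; the DataNode address and the password are placeholders.

```yaml
# inventory.yml (hypothetical file name; the article itself uses /etc/myhosts.txt)
all:
  children:
    namenode:
      hosts:
        192.168.43.102:          # assumed to be the NameNode (name_ip in var.yml)
    datanode:
      hosts:
        192.168.43.103:          # placeholder address, replace with your DataNode VM
  vars:
    ansible_user: root
    ansible_password: "CHANGE_ME"   # placeholder, never commit a real password
    ansible_connection: ssh
```

Password-based SSH from Ansible typically requires the sshpass program on the controller, and keeping the password in Ansible Vault is generally safer than storing it in plain text.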

Now check the Ansible version:

ansible --version

As the ansible --version output shows, Ansible reads its configuration from the /etc/ansible/ansible.cfg file, so configure this file (for example, point its inventory setting at /etc/myhosts.txt).

List all the hosts by typing ansible all --list-hosts.

Ping the hosts, for example with ansible all -m ping, to confirm that there is SSH connectivity between the virtual machines.
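The same connectivity check can also be written as a tiny playbook, sketched here with the built-in ping module. This module does not send ICMP packets; it connects over SSH and verifies that a usable Python interpreter is available on each host.

```yaml
# ping.yml (hypothetical file name)
- hosts: all
  gather_facts: false
  tasks:
    - name: Check SSH connectivity and Python availability
      ansible.builtin.ping:
```

A host that answers with "pong" is reachable and ready to be managed.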

Now I am ready with my playbook code. The first play configures the NameNode and the second configures the DataNode.

- hosts: namenode
  vars_files:
          - var.yml
  tasks:
          - name: Copy Java Software
            copy:
                    src: "/root/jdk-8u171-linux-x64.rpm"
                    dest: "/root/"

          - name: Copy Hadoop Software
            copy:
                    src: "/root/hadoop-1.2.1-1.x86_64.rpm"
                    dest: "/root/"

          - name: Install Java Software
            shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
            register: java_install

          - name: java install information
            debug:
                    var: java_install

          - name: Install Hadoop Software
            shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
            register: hadoop_install
            when: java_install.rc == 0

          - name: hadoop install information
            debug:
                    var: hadoop_install

          - name: Create Directory
            file:
                    state: directory
                    path: "{{ name_dir }}"

          - name: Copy hdfs-site.xml file
            template:
                    src: "n_hdfs-site.xml"
                    dest: "/etc/hadoop/hdfs-site.xml"

          - name: Copy core-site.xml file
            template:
                    src: "n_core-site.xml"
                    dest: "/etc/hadoop/core-site.xml"

          - name: Format the namenode directory
            shell: "echo Y | hadoop namenode -format"

          - name: Start Namenode Service
            shell: "hadoop-daemon.sh start namenode"

- hosts: datanode
  vars_files:
          - var.yml
  tasks:
          - name: Copy Java Software
            copy:
                    src: "/root/jdk-8u171-linux-x64.rpm"
                    dest: "/root/"

          - name: Copy Hadoop Software
            copy:
                    src: "/root/hadoop-1.2.1-1.x86_64.rpm"
                    dest: "/root/"

          - name: Install Java Software
            shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
            register: java_install

          - name: java install information
            debug:
                    var: java_install

          - name: Install Hadoop Software
            shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
            register: hadoop_install
            when: java_install.rc == 0

          - name: hadoop install information
            debug:
                    var: hadoop_install

          - name: Create Directory
            file:
                    state: directory
                    path: "{{ data_dir }}"

          - name: Copy hdfs-site.xml file
            template:
                    src: "d_hdfs-site.xml"
                    dest: "/etc/hadoop/hdfs-site.xml"

          - name: Copy core-site.xml file
            template:
                    src: "d_core-site.xml"
                    dest: "/etc/hadoop/core-site.xml"

          - name: Start Datanode Service
            shell: "hadoop-daemon.sh start datanode"
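The install and format tasks above use shell, which runs the command on every execution, so a second run would try to install the JDK again and would reformat the NameNode. One possible way to guard them, shown as a sketch rather than as the author's method, is the creates argument, which skips the command when the given path already exists. Both paths are assumptions: /usr/java/jdk1.8.0_171-amd64 is where this JDK RPM usually installs, and {{ name_dir }}/current/VERSION assumes that hdfs-site.xml points dfs.name.dir at name_dir.

```yaml
- hosts: namenode
  vars_files:
    - var.yml
  tasks:
    - name: Install Java only if it is not already present
      ansible.builtin.command:
        cmd: rpm -i /root/jdk-8u171-linux-x64.rpm
        creates: /usr/java/jdk1.8.0_171-amd64      # assumed install path

    - name: Format the NameNode directory only on the first run
      ansible.builtin.shell:
        cmd: echo Y | hadoop namenode -format
        creates: "{{ name_dir }}/current/VERSION"  # assumed marker of a formatted directory
```

With these guards, a repeated run reports the tasks as skipped instead of reinstalling the package or wiping the existing HDFS metadata.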

And here is var.yml, the file where I store the variables:

name_ip: 192.168.43.102
name_port: 9001
name_dir: /nn8
data_dir: /dn8
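The n_core-site.xml and d_core-site.xml templates are not shown in this article. As a hedged sketch of how these variables are probably consumed, the play below writes an equivalent core-site.xml inline with the copy module. fs.default.name is the Hadoop 1.x property for the NameNode address; the exact contents of the author's templates are an assumption.

```yaml
- hosts: namenode
  vars_files:
    - var.yml
  tasks:
    - name: Write core-site.xml inline (stand-in for the n_core-site.xml template)
      ansible.builtin.copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ name_ip }}:{{ name_port }}</value>
            </property>
          </configuration>
```

The hdfs-site.xml templates would presumably set dfs.name.dir to name_dir on the NameNode and dfs.data.dir to data_dir on the DataNode in the same way.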

Now check the syntax of the main playbook with ansible-playbook --syntax-check hadoop.yml, and after that run the playbook with ansible-playbook hadoop.yml.

ansible-playbook --syntax-check hadoop.yml
ansible-playbook hadoop.yml

Now I check on the NameNode virtual machine whether everything went well.

In the image above, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the image above, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

Now I check on the DataNode virtual machine whether everything went well.

In the image above, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the image above, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

You can check the report of the Hadoop cluster by typing hadoop dfsadmin -report.

hadoop dfsadmin -report
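If you prefer to collect the report from the controller instead of logging in to the NameNode, a small play along these lines could do it. This is a sketch: the file name and the dfs_report variable name are illustrative, and changed_when: false simply marks the task as read-only so it never reports a change.

```yaml
# report.yml (hypothetical file name)
- hosts: namenode
  gather_facts: false
  tasks:
    - name: Collect the HDFS cluster report
      ansible.builtin.command: hadoop dfsadmin -report
      register: dfs_report
      changed_when: false

    - name: Show the report
      ansible.builtin.debug:
        var: dfs_report.stdout_lines
```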

Hadoop setup completed.

11.3 Restarting the HTTPD service is not idempotent and also consumes more resources. Suggest a way to rectify this challenge in an Ansible playbook.

Solution 11.3:

Before starting the task, you have to install Ansible and configure its inventory. Run this command on your VM to install Ansible:

pip3 install ansible

Next, create an inventory file with any name you like; in my case it is /etc/myhosts.txt. In it, write the IP address of the target virtual machine (the VM on which you want to configure the HTTPD web server) along with connection details such as the user (root) and the password.

Now check the Ansible version:

ansible --version

As the ansible --version output shows, Ansible reads its configuration from the /etc/ansible/ansible.cfg file, so configure this file (for example, point its inventory setting at /etc/myhosts.txt).

List all the hosts by typing ansible all --list-hosts.

Ping the hosts, for example with ansible all -m ping, to confirm that there is SSH connectivity between the virtual machines.

Now I am ready with my playbook code.

---
- hosts: all
  vars_files:
  - var1.yml

  tasks:
  - name: "Create directory for dvd mount"
    file:
              state: directory
              path: "{{ dvd_dir }}"

  - name: "Mount the dvd to the directory"
    mount:
              src: "/dev/cdrom"
              path: "{{ dvd_dir }}"
              state: mounted
              fstype: "iso9660"

  - name: "Configure AppStream for yum"
    yum_repository:
              baseurl: "{{ dvd_dir }}/AppStream"
              name: "dvd1"
              description: "dvd1 for AppStream packages"
              gpgcheck: no

  - name: "Configure BaseOS for yum"
    yum_repository:
              baseurl: "{{ dvd_dir }}/BaseOS"
              name: "dvd2"
              description: "dvd2 for BaseOS packages"
              gpgcheck: no

  - name: "Install package"
    package:
              name: "httpd"
              state: present
    register: x

  - name: "Create directory for web server"
    file:
              state: directory
              path: "{{ doc_root }}"
    register: y

  - name: "Copy the configuration file"
    template:
              dest: "/etc/httpd/conf.d/lw.conf"
              src: "lw.conf"
    when: x.rc == 0
    notify:
              - Start service

  - name: "Copy the web page"
    copy:
              dest: "{{ doc_root }}/index.html"
              content: "this is neeew web page\n"
    when: y.failed == false
            
  - name: "start httpd service"
    service:
              name: "httpd"
              state: started

  - name: "Create firewall rule"
    firewalld:
              port: "{{ http_port }}/tcp"
              state: enabled
              permanent: yes
              immediate: yes

  handlers:
  - name: Start service
    service:
              name: "httpd"
              state: restarted

And here is var1.yml, the file where I store the variables:

doc_root: "/var/www/nishant"
dvd_dir: "/dvd5"
http_port: 8082
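The lw.conf template is not included in this article. As a sketch of how doc_root and http_port might be used in it, the play below writes a comparable virtual host configuration inline with the copy module. The Listen and VirtualHost directives are standard Apache configuration, but the actual contents of the author's template are an assumption.

```yaml
- hosts: all
  vars_files:
    - var1.yml
  tasks:
    - name: Write the virtual host configuration inline (stand-in for lw.conf)
      ansible.builtin.copy:
        dest: /etc/httpd/conf.d/lw.conf
        content: |
          # Assumed contents; the author's lw.conf template is not shown
          Listen {{ http_port }}
          <VirtualHost *:{{ http_port }}>
              DocumentRoot {{ doc_root }}
          </VirtualHost>
```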

Now check the syntax of the main playbook with ansible-playbook --syntax-check hadoop.yml, and after that run the playbook with ansible-playbook hadoop.yml.

ansible-playbook --syntax-check hadoop.yml
ansible-playbook hadoop.yml

Now you can check the result on the virtual machine whose IP is 192.168.43.131, where I want to deploy the web server.

You can also check from a browser whether the web server is running.

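If you run the playbook again, Ansible reports that the service is already started, so there is no need to restart it. This is possible because of the handlers and notify keywords in Ansible: the handler that restarts httpd runs only when the task that notifies it (copying the configuration file) actually reports a change.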
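Reduced to its essentials, the pattern looks like the sketch below. The handler name Restart httpd is illustrative (the playbook above calls it Start service), and lw.conf is the template from the playbook above.

```yaml
- hosts: all
  tasks:
    - name: Deploy the web server configuration
      ansible.builtin.template:
        src: lw.conf
        dest: /etc/httpd/conf.d/lw.conf
      notify: Restart httpd        # fires only when the file content changes

    - name: Ensure httpd is running
      ansible.builtin.service:
        name: httpd
        state: started             # idempotent: does nothing if already running

  handlers:
    - name: Restart httpd
      ansible.builtin.service:
        name: httpd
        state: restarted           # not idempotent, so it is kept in a handler
```

Keeping state: restarted out of the regular task list is the design choice that matters here: a plain task with state: restarted would bounce the service on every run, while a handler runs at most once per play and only after a notifying task has changed something.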

Now I change my var file where I store the variables.

doc_root: "/var/www/harsh"
dvd_dir: "/dvd5"
http_port: 8083

Now I run my playbook again with the new variables.

Now you can check the result on the virtual machine whose IP is 192.168.43.131, where I want to deploy the web server.

You can check the final output from a browser by entering both port numbers, 8082 as well as 8083.

GitHub Link:

Thanks guys for reading this article.
