How to practically contribute a specific amount of storage to the NameNode in a Hadoop cluster
Anushka Visapure
Problem: In a Hadoop cluster, how do we contribute a limited/specific amount of storage as a slave (DataNode) to the cluster?
Solution:
What is Hadoop? Hadoop is a tool used to manage big data, and it is built on top of Java. Like Hadoop, there are many other tools for managing big data, such as Apache Spark, Google BigQuery, etc.
What is Hadoop Architecture? A basic Hadoop architecture has one NameNode, which acts as the master node, one client from which we upload or read data, and "n" DataNodes. The NameNode stores only metadata about all the DataNodes in the cluster; each DataNode shares some of its storage with the NameNode. The NameNode shares these details with the client, and the client then does whatever it needs to do, either uploading data to the DataNodes or reading files from them.
We did this practical as a team, with 1 master and 1 slave.
STEP 1: Configure the configuration files on the NameNode
Here we have configured the core-site.xml file with the hdfs protocol and an IP address, because internally Hadoop uses the HDFS protocol to transfer files. In this configuration we have allowed all IPs on port 9001.
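A minimal sketch of what this core-site.xml can look like, assuming we bind to all IPs (0.0.0.0) on port 9001; fs.default.name is the classic Hadoop 1.x property key:

<configuration>
    <property>
        <!-- HDFS endpoint of the cluster; 0.0.0.0 accepts connections from all IPs -->
        <name>fs.default.name</name>
        <value>hdfs://0.0.0.0:9001</value>
    </property>
</configuration>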
We have also configured the hdfs-site.xml file, where we created the directory in which all the metadata about the DataNodes is saved. After configuring these files successfully, start the NameNode service.
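A sketch of the matching hdfs-site.xml, assuming the metadata directory is named /name_node (the directory name here is only an example), followed by the command that starts the NameNode service:

<configuration>
    <property>
        <!-- directory where the NameNode keeps its metadata -->
        <name>dfs.name.dir</name>
        <value>/name_node</value>
    </property>
</configuration>

hadoop-daemon.sh start namenode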
Here you can check that the NameNode service is up and running successfully, and you can check the NameNode report, i.e. how many DataNodes are connected to the NameNode, by using the command:
" hadoop dfsadmin -report "
STEP 2: Configure the same configuration files on the DataNodes
This is the DataNode's core-site.xml file, where we configure the NameNode's IP address. Every DataNode we have in this cluster has to be configured in the same way.
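A sketch of the DataNode's core-site.xml, assuming the NameNode's IP is 192.168.1.10 (a placeholder; replace it with the real NameNode IP) and the same port 9001:

<configuration>
    <property>
        <!-- points this DataNode at the NameNode -->
        <name>fs.default.name</name>
        <value>hdfs://192.168.1.10:9001</value>
    </property>
</configuration>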
Here we have created the directory in which the DataNode stores the data about the files it holds. The DataNode also sends a heartbeat to the NameNode every three seconds. The same has to be configured on every DataNode in the cluster.
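A sketch of the DataNode's hdfs-site.xml, assuming the data directory is /data_node, i.e. the same directory we will later mount the EBS partition on:

<configuration>
    <property>
        <!-- directory whose storage this DataNode contributes to the cluster -->
        <name>dfs.data.dir</name>
        <value>/data_node</value>
    </property>
</configuration>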
After successfully configuring the configuration files on the DataNode, start the DataNode service and check whether the DataNode process has started successfully or not.
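For example, with the Hadoop 1.x daemon script on the PATH, the service can be started and verified like this:

hadoop-daemon.sh start datanode
jps    (the DataNode process should appear in this list)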
STEP 3: After the DataNode has started successfully, create an EBS volume of the required size and attach it to the DataNode instance.
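This can be done from the AWS web console, or as a rough sketch with the AWS CLI (the size, availability zone, volume ID, instance ID, and device name below are all placeholders):

aws ec2 create-volume --size 1 --availability-zone ap-south-1a --volume-type gp2
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/xvdf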
After that, we have to check whether the EBS volume has been attached or not by using the command: " fdisk -l "
STEP 4: After attaching the EBS volume, we have to create a partition on it by using the command: " fdisk device_name "
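For example, assuming the attached volume shows up as /dev/xvdf in fdisk -l, an interactive fdisk session looks roughly like this:

fdisk /dev/xvdf
   n    (new partition)
   p    (primary)
   1    (partition number; accept the default first and last sectors)
   w    (write the partition table and exit)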
STEP 5: After creating the partition, we have to format it by using the command: " mkfs.ext3 device_name "
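Continuing with the assumed device name from above, the newly created partition is formatted like this:

mkfs.ext3 /dev/xvdf1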
STEP 6: Now we have to mount this partition on the directory that the DataNode shares with the NameNode; in my case the directory name is /data_node.
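A sketch of the mount step, assuming the same device name and the /data_node directory configured earlier:

mkdir -p /data_node
mount /dev/xvdf1 /data_node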
After that, we have to check whether the partition has been mounted or not by using the command: " df -l ", and to see how much storage we have given to the directory, we use the command: " lsblk "
After completing all of the above steps successfully, we can check on the NameNode whether the DataNode has connected and shared the given storage successfully or not by using the command: " hadoop dfsadmin -report "
And here we can see that the DataNode shares only as much storage as we have given it.
THANK YOU FOR READING!
DevOps, Cloud & Performance Engineer| DevOps Engineer
4 年Well done Anushka Visapure ?
Java|| Python||Linux and Networking||Hadoop ||Ansible || Kubernetes|| Jenkins|| AWS ||Docker||DSA
4 年????Anushka Visapure
DevOps @Forescout ?? | Google Developer Expert | AWS | DevOps | 3X GCP | 1X Azure | 1X Terraform | Ansible | Kubernetes | SRE | Platform | Jenkins | Tech Blogger ??
4 年Nice work Anushka Visapure ??
MTS 1 @Cohesity | Ex-Veritas | Kubernetes | Docker | Golang | Python
4 年Great work Anushka Visapure ?????