Hadoop: Contributing specified storage on Data Node to the Cluster
Ankit Kumar
Platform Engineer @ Brevo | Kubernetes | Python | Linux | Cloud | RHCE | RHCSA
Many times while creating a Hadoop cluster, a situation arises where we don't want to contribute the entire storage available on a Data Node. So, today we'll see how to contribute only a specified amount of space on the Data Node to the Name Node, i.e. to the cluster. We'll achieve this with the help of partitions.
Prerequisites: For this demonstration, I've already configured a Name Node and a Data Node on AWS EC2 instances. I have also created a 1GB EBS volume and attached it to the Data Node.
You can see in the image below that the EBS volume (i.e. /dev/xvdf) isn't mounted yet.
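If you'd like to check this from the terminal as well, lsblk and df -h show the attached devices and their mount points. The device name /dev/xvdf is from my setup and may differ on yours.
cmd# lsblk
cmd# df -h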
Let's say that we only want to contribute 512MB of the attached EBS volume to the cluster. So, we'll create a partition of that size and contribute its space to the cluster.
Step 1: Creating Partition
cmd# fdisk /dev/xvdf
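Inside the fdisk prompt, a typical sequence to create a 512MB primary partition looks roughly like this (the exact prompts may vary slightly with your fdisk version):
n        <- create a new partition
p        <- make it a primary partition
1        <- partition number
(Enter)  <- accept the default first sector
+512M    <- last sector, i.e. the size of the partition
w        <- write the partition table and exit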
Now you can see that the new partition has appeared as /dev/xvdf1.
Step 2: Formatting the partition
cmd# mkfs.ext4 /dev/xvdf1
We'll have to format the partition with the above command so that we can store data on it.
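To double-check that the filesystem was created, you can inspect the partition; the output (UUID, etc.) will of course differ on your system.
cmd# blkid /dev/xvdf1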
Step 3: Mounting the partition
cmd# mkdir /dn1
cmd# mount /dev/xvdf1 /dn1
Now, we'll have to mount the partition to a directory that we'll be dedicating to the cluster.
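You can also verify the mount from the terminal; df -h should list /dev/xvdf1 with /dn1 as its mount point and roughly 512MB of capacity (a little less, due to filesystem overhead).
cmd# df -h /dn1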
Now you can see that the partition has been mounted at /dn1 and is ready for use. I had already configured hdfs-site.xml on the Data Node with /dn1 as the data directory.
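For reference, the relevant part of hdfs-site.xml on the Data Node would look roughly like this. I'm using the dfs.data.dir property from Hadoop 1.x here; on Hadoop 2.x/3.x the equivalent property is dfs.datanode.data.dir.
<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/dn1</value>
    </property>
</configuration>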
So, now we'll start the Data Node.
cmd# hadoop-daemon.sh start datanode
You can confirm whether the Data Node has started by using the jps command.
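For example (the process IDs will differ on your machine), a DataNode entry in the jps output confirms the daemon is running:
cmd# jps
1234 DataNode
1300 Jps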
cmd# hadoop dfsadmin -report
Finally, by using the above command you can see from the Data Node's configured capacity that only about 512MB (the size of the partition) is being contributed to the cluster, so no more than the allocated space is being used.
Thanks...
Hope you enjoyed it...
See you again...!!!