Distributed storage cluster and Hadoop

Earlier, most companies used an RDBMS to store their data, where records can be read and written any number of times. As data volumes grew, a single database server became hard to scale, and Hadoop came up as a solution.

Distributed Storage

The exponential growth of data volumes demands new storage technology. To manage this data, distributed storage systems were introduced. In a distributed storage system, data is divided into blocks, and the blocks are stored on different virtual or physical machines. A distributed storage system such as Hadoop's is based on a master-slave topology.
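The block idea can be sketched in a few lines of Python. This is only an illustration, not how any particular storage system is implemented: the function names `split_into_blocks` and `place_blocks`, the machine names, the tiny block size, and the round-robin placement policy are all assumptions made for the example.

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks: List[bytes], machines: List[str]) -> Dict[str, List[int]]:
    """Assign block indices to machines in round-robin order.

    Round-robin is an assumed policy chosen for simplicity.
    """
    if not machines:
        raise ValueError("at least one machine is required")
    placement: Dict[str, List[int]] = {machine: [] for machine in machines}
    for index in range(len(blocks)):
        placement[machines[index % len(machines)]].append(index)
    return placement


data = b"abcdefghij"  # 10 bytes of sample data
blocks = split_into_blocks(data, block_size=4)
placement = place_blocks(blocks, ["machine-1", "machine-2"])
print(blocks)     # the data cut into blocks of up to 4 bytes
print(placement)  # which block indices each machine holds
```

Joining the blocks back in index order reproduces the original data, which is why the system only needs to remember where each block lives.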

Hadoop

Hadoop is software that lets us set up a master-slave topology in a distributed storage system. Hadoop is written in Java, so to install it we first need to install the JDK (Java Development Kit). Hadoop is an open-source project of the Apache Software Foundation.
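Before installing Hadoop, it can help to confirm that Java is already available on the machine. The following is a minimal sketch of such a check; the helper names `find_java` and `java_version_text` are made up for this example, and it only looks for a `java` executable on the PATH, which does not by itself prove that a full JDK is installed.

```python
import shutil
import subprocess
from typing import Optional


def find_java() -> Optional[str]:
    """Return the path of the `java` executable on PATH, or None if absent."""
    return shutil.which("java")


def java_version_text(java_path: str) -> str:
    """Run `java -version` and return what it prints (Java writes this to stderr)."""
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    return (result.stderr or result.stdout).strip()


java_path = find_java()
if java_path is None:
    print("No java executable found on PATH; install a JDK before installing Hadoop.")
else:
    print(java_version_text(java_path))
```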

Hadoop Cluster

Two terms used most frequently when talking about a Hadoop cluster are:

  • Cluster: A collection of nodes.
  • Node: A process running on a virtual or physical machine.

In a Hadoop cluster, one system is designated the master and the other systems associated with it are slaves. In Hadoop, the master node is called the NameNode and a slave node is called a DataNode. Every node added to the cluster gives a roughly corresponding boost in throughput. Together, the NameNode and the DataNodes form HDFS (Hadoop Distributed File System), the file system through which the master coordinates storage on the slaves.
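The division of labour between master and slaves can be illustrated with a toy model. This is a sketch under heavy assumptions and not Hadoop's actual API: the `NameNode` and `DataNode` classes below are hypothetical Python stand-ins, block placement is plain round-robin, and the data passes through the master only to keep the example short (in real HDFS, clients exchange block data with the DataNodes directly, and the NameNode serves only metadata).

```python
from typing import Dict, List, Tuple


class DataNode:
    """Toy slave node: stores block contents keyed by block id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}


class NameNode:
    """Toy master node: remembers which DataNode holds each block of a file."""

    def __init__(self, data_nodes: List[DataNode]) -> None:
        if not data_nodes:
            raise ValueError("a cluster needs at least one DataNode")
        self.data_nodes = data_nodes
        # file name -> ordered list of (block id, DataNode name)
        self.block_map: Dict[str, List[Tuple[str, str]]] = {}

    def write(self, file_name: str, data: bytes, block_size: int) -> None:
        """Split data into blocks and spread them over the DataNodes (round-robin)."""
        locations: List[Tuple[str, str]] = []
        for offset in range(0, len(data), block_size):
            index = offset // block_size
            node = self.data_nodes[index % len(self.data_nodes)]
            block_id = f"{file_name}#{index}"
            node.blocks[block_id] = data[offset:offset + block_size]
            locations.append((block_id, node.name))
        self.block_map[file_name] = locations

    def read(self, file_name: str) -> bytes:
        """Look up each block's location and reassemble the file in order."""
        nodes_by_name = {node.name: node for node in self.data_nodes}
        return b"".join(
            nodes_by_name[node_name].blocks[block_id]
            for block_id, node_name in self.block_map[file_name]
        )


cluster = NameNode([DataNode("datanode-1"), DataNode("datanode-2")])
cluster.write("notes.txt", b"hello hadoop", block_size=5)
print(cluster.block_map["notes.txt"])  # metadata kept by the master
print(cluster.read("notes.txt"))       # file reassembled from the slaves
```

The point of the sketch is that the master holds only the map of block locations, while the slaves hold the data itself, so adding another `DataNode` adds storage without changing how files are looked up.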

Advantages of a Hadoop Cluster

  1. Scalable
  2. Cost efficient
  3. Flexible
  4. Fast
  5. Resilient to Failure



