HDFS goals

Fault detection and recovery: Because an HDFS cluster is built from a large number of commodity hardware components, component failures are frequent. HDFS therefore needs mechanisms to detect faults and recover from them quickly and automatically.
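
To make the detect-and-recover idea concrete, here is a minimal Python sketch. HDFS DataNodes do report to the NameNode with periodic heartbeats, but everything else here is an illustrative assumption rather than HDFS's actual implementation: the `ClusterMonitor` class, the `blocks_to_rereplicate` helper, the 30-second timeout, and the node and block names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical timeout, chosen for illustration only (not an HDFS default).
HEARTBEAT_TIMEOUT_S = 30.0


@dataclass
class ClusterMonitor:
    """Tracks the last heartbeat time seen from each node."""
    timeout_s: float = HEARTBEAT_TIMEOUT_S
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record that `node` was alive at time `now` (seconds).
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> list:
        # A node counts as failed once it has been silent longer than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)


def blocks_to_rereplicate(block_locations: dict, dead: list,
                          target_replicas: int) -> dict:
    """Return {block: number of replicas still needed} after node failures."""
    dead_set = set(dead)
    missing = {}
    for block, nodes in block_locations.items():
        live = [n for n in nodes if n not in dead_set]
        if len(live) < target_replicas:
            missing[block] = target_replicas - len(live)
    return missing


monitor = ClusterMonitor()
monitor.heartbeat("dn1", now=0.0)
monitor.heartbeat("dn2", now=0.0)
monitor.heartbeat("dn1", now=40.0)   # dn2 has gone silent
failed = monitor.dead_nodes(now=50.0)
plan = blocks_to_rereplicate(
    {"blk_1": ["dn1", "dn2"], "blk_2": ["dn1", "dn3"]}, failed, target_replicas=2
)
print(failed, plan)
```

Detection (a missed heartbeat) and recovery (copying under-replicated blocks to healthy nodes) are kept as separate steps, so both can run automatically without operator intervention.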

Huge data sets: An HDFS cluster should scale to hundreds of nodes so that it can serve applications with very large data sets.

Hardware at data: A task runs efficiently when the computation takes place near the data it needs. Especially when large data sets are involved, this reduces network traffic and increases performance.
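
The "compute near the data" idea can be sketched as a simple placement rule: prefer a free node that already stores the block, and fall back to a remote node only when none is available. The function `pick_node` and the node names below are hypothetical. Real Hadoop schedulers also weigh factors such as rack locality, which this sketch ignores.

```python
def pick_node(block_replicas, free_nodes):
    """Choose where to run a task that reads one block.

    Returns (node, is_local). `is_local` is True when the chosen node
    already holds a replica, so the block need not cross the network.
    """
    for node in block_replicas:
        if node in free_nodes:
            return node, True          # data-local: no block transfer needed
    if free_nodes:
        return min(free_nodes), False  # remote read; min() keeps the choice deterministic
    return None, False                 # no capacity available right now


local_choice = pick_node(["dn2", "dn3"], {"dn1", "dn3"})
remote_choice = pick_node(["dn2"], {"dn1", "dn4"})
print(local_choice, remote_choice)
```

With large blocks, every data-local placement avoids shipping an entire block over the network, which is where the traffic savings come from.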
