Hadoop vs Hive
Darshika Srivastava
Associate Project Manager @ HuQuo | MBA, Amity Business School
Difference Between Hadoop and Hive
Hadoop is an open-source framework built to manage very large data sets, or Big Data. It stores and processes data that is distributed across a cluster of commodity servers: storage is handled by the Hadoop Distributed File System (HDFS), and processing and querying are done with the MapReduce programming model. Hive is an application that runs on top of the Hadoop framework and provides an SQL-like interface for processing and querying that data. Hive was designed and developed by Facebook before becoming part of the Apache Hadoop project. Queries are written in HQL (Hive Query Language). Hive organizes data into tables much as an RDBMS does, and many of the same commands can be used in Hive. Hive can also define external tables over data that lives outside its own warehouse directory, including in storage systems other than HDFS, so keeping the data in HDFS is not mandatory. It supports file formats such as ORC, Avro, SequenceFile, and plain text.
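To make the idea of an SQL-like layer over MapReduce more concrete, here is a minimal Python sketch of how a grouped aggregation could be broken into map, shuffle, and reduce steps. The table `sales(region, amount)`, its rows, and the function name `total_by_region` are hypothetical, and the code is only a conceptual illustration, not how Hive actually compiles a query.

```python
from collections import defaultdict

# Hypothetical rows of a table sales(region, amount); not taken from the article.
SALES = [("north", 120), ("south", 80), ("north", 30), ("east", 50), ("south", 20)]


def total_by_region(rows):
    """Conceptually mimic: SELECT region, SUM(amount) FROM sales GROUP BY region."""
    grouped = defaultdict(list)
    # Map + shuffle: emit (region, amount) pairs and group them by key.
    for region, amount in rows:
        grouped[region].append(amount)
    # Reduce: collapse each group of amounts into one total per region.
    return {region: sum(amounts) for region, amounts in grouped.items()}


print(total_by_region(SALES))
```

The GROUP BY column plays the role of the MapReduce key and the aggregate function plays the role of the reducer, which is why the user can stay at the query level.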
Hadoop’s Major Components
Figure 1: Basic architecture of Hadoop's components.
Hadoop Base/Common: Hadoop Common provides the shared libraries and utilities that all other Hadoop components depend on, giving them one common platform.
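HDFS (Hadoop Distributed File System): HDFS is a major part of the Hadoop framework. It takes care of all the data in the Hadoop cluster. It works on a master/slave architecture, in which a NameNode (master) tracks file metadata and DataNodes (slaves) hold the data blocks, and it protects the data by replicating each block across several nodes.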
Master/Slave Architecture & Replication
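The block storage and replication that HDFS performs can be sketched as a toy model in Python. The 128 MB block size and replication factor of 3 match common Hadoop defaults, but the node names, the function names, and the round-robin placement are assumptions made for illustration; real HDFS uses a rack-aware placement policy decided by the NameNode.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB   # common HDFS default block size
REPLICATION = 3         # common HDFS default replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Toy round-robin placement (assumption): each block goes to
    `replication` distinct DataNodes. Real HDFS placement is rack-aware."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    return {
        block_id: [datanodes[(block_id + r) % len(datanodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


# Hypothetical cluster with four DataNodes and one 300 MB file.
nodes = ["dn1", "dn2", "dn3", "dn4"]
blocks = split_into_blocks(300 * MB)
for block_id, replicas in place_replicas(len(blocks), nodes).items():
    print(f"block {block_id} ({blocks[block_id] // MB} MB) -> {replicas}")
```

Because every block lives on three different nodes in this model, losing any single DataNode still leaves two copies of each block available.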
YARN (Yet Another Resource Negotiator): YARN manages the cluster's resources. It also plays a vital role in scheduling users' applications.
MR (MapReduce): This is Hadoop's primary programming model. It is used to process and query data within the Hadoop framework.
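The MapReduce model can be illustrated with the classic word-count example. The sketch below runs in a single Python process and only imitates the three phases (map, shuffle, reduce); the function names are made up for this example and it does not use Hadoop's actual Java API or run on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["Hadoop stores data", "Hive queries data"]))
```

On a real cluster the map calls run in parallel on the nodes that hold the input blocks, and the framework performs the shuffle over the network before the reducers run.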
Hive’s Major Components
Figure 2: Hive’s Architecture & Its Major Components
Hive Clients: Besides HQL, Hive supports client applications written in programming languages such as Java, C, and Python through drivers and interfaces such as ODBC, JDBC, and Thrift. A Hive client application can therefore be written in any of these languages and run against Hive using these clients.
Hive Services: Commands and queries are executed under Hive Services, which has five sub-components.