Introduction to Hive

Ankit Singh

Research Scholar @Glbimr || 5★ in SQL @Hackerrank || Data Analyst || SQL || Python || Machine Learning || Data Scientist || Poem Writing skill

Published: April 15, 2024

The term ‘Big Data’ refers to collections of datasets characterized by huge volume, high velocity, and wide variety, all of which keep growing day by day.
Using traditional data management systems, it is difficult to process Big Data.
Therefore, the Apache Software Foundation introduced a framework called Hadoop to solve Big Data management and processing challenges.

Hadoop

Hadoop is an open-source framework to store and process Big Data in a distributed environment.

At its core it is a combination of two modules: MapReduce and the Hadoop Distributed File System (HDFS).

HDFS: The Hadoop Distributed File System is the storage layer of the Hadoop framework. It provides a fault-tolerant file system that runs on commodity hardware. (This is where your data gets distributed across the cluster.)
MapReduce: A parallel programming model for processing large amounts of structured, semi-structured, and unstructured data on large clusters of commodity hardware. (This is where you write transformation jobs in Java; MapReduce then runs your code in a distributed fashion, reading the data from HDFS.)
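
To make the map, shuffle, and reduce steps concrete, here is a minimal single-machine sketch in Python. It is only an illustration of the programming model, not Hadoop code: real jobs are usually written in Java against the Hadoop API and read their input from HDFS. The function names and the sample lines are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input; on a cluster these lines would come from files in HDFS.
    sample = ["big data needs big tools", "hive runs on hadoop"]
    print(word_count(sample))
```

On a real cluster the mappers and reducers run on many machines at once, and the framework takes care of the shuffle step between them.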

Now imagine you have to write a join query. In MapReduce that means writing 10-15 lines of Java code, whereas in a database like Oracle you can express the same join in a single line of SQL.
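
The gap between the two approaches can be sketched in Python. The first function spells out a join the MapReduce way (tag each record, group by key, pair up matches); the second gets the same result from one SQL statement. Everything here is an assumption for illustration: the table names, the sample rows, and the use of Python's built-in sqlite3 as a stand-in for a SQL engine. It is not Hive or Oracle code.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (emp_id, name, dept_id) and (dept_id, dept_name).
EMPLOYEES = [(1, "Asha", 10), (2, "Ravi", 20), (3, "Meena", 10)]
DEPARTMENTS = [(10, "Sales"), (20, "Finance")]


def mapreduce_style_join(employees, departments):
    """Join by hand: map (tag by source), shuffle (group by key), reduce (pair up)."""
    # Map: key every record by dept_id and tag it with the table it came from.
    tagged = [(dept_id, ("emp", name)) for _, name, dept_id in employees]
    tagged += [(dept_id, ("dept", dept_name)) for dept_id, dept_name in departments]

    # Shuffle: collect all tagged records that share a join key.
    groups = defaultdict(list)
    for key, record in tagged:
        groups[key].append(record)

    # Reduce: within each group, pair every employee with every department.
    joined = []
    for records in groups.values():
        names = [value for tag, value in records if tag == "emp"]
        depts = [value for tag, value in records if tag == "dept"]
        joined.extend((name, dept) for name in names for dept in depts)
    return sorted(joined)


def sql_join(employees, departments):
    """Same join, expressed declaratively; sqlite3 is only a stand-in engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (emp_id INTEGER, name TEXT, dept_id INTEGER)")
    conn.execute("CREATE TABLE departments (dept_id INTEGER, dept_name TEXT)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", employees)
    conn.executemany("INSERT INTO departments VALUES (?, ?)", departments)
    rows = conn.execute(
        "SELECT e.name, d.dept_name FROM employees e "
        "JOIN departments d ON e.dept_id = d.dept_id"
    ).fetchall()
    conn.close()
    return sorted(rows)


if __name__ == "__main__":
    print(mapreduce_style_join(EMPLOYEES, DEPARTMENTS))
    print(sql_join(EMPLOYEES, DEPARTMENTS))
```

Both functions return the same rows; the difference is that the SQL version leaves the grouping and matching to the engine.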

SQL developers liked what Hadoop offered, but there was one problem: MapReduce jobs had to be written in Java. They wanted a way to talk to MapReduce in SQL, and this is where Hive comes in.

What is Hive?

• Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.

• It was developed by Facebook.

• Think of Hive as a vehicle whose engine is MapReduce.

• Hive is a query engine, not a storage system: it has no storage of its own and uses HDFS to store the data.

• Hive is an abstraction over MapReduce.

Hive was initially developed by Facebook; the Apache Software Foundation later took it up and developed it further as open source under the name Apache Hive.
It is used by many companies. For example, Amazon uses it in Amazon Elastic MapReduce.

Hive is not

A relational database
A design for OnLine Transaction Processing (OLTP)
A language for real-time queries and row-level updates
