Hadoop
The gist of #Hadoop
History: At the end of the 1990s, the internet was booming, and a wave of open-source projects and startups raced to build automated web crawlers and powerful search engines. Hadoop emerged out of the Nutch project, with distributed storage and processing at its core. In 2008, Yahoo released Hadoop as an open-source project. Today, Hadoop’s framework and ecosystem of technologies are managed and maintained by the non-profit Apache Software Foundation (ASF), a global community of software developers and contributors.
What can you do with it?
1. Store and process huge amounts of data quickly #AbilityToStoreAndProcessBigDataQuickly
2. Its distributed computing model processes big data fast: the more nodes you use, the more processing power you have #ComputerPower
3. Jobs don’t fail when a node goes down, because its work is automatically redirected to another node #FaultTolerance
4. You can store as much data as you want without preprocessing it, and decide how to use it later (see the storage sketch after this list) #Flexibility
5. Keep costs low, as the framework is free and runs on commodity hardware #LowCost
6. You can easily expand your system by adding more nodes #Scalability
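To make points 1 and 4 a bit more concrete, here is a minimal sketch of writing a raw, unprocessed file into HDFS with Hadoop’s Java FileSystem API. The NameNode address and the paths are hypothetical placeholders, not values from any real cluster:

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster's NameNode (address is a placeholder).
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/data/raw/events.txt"))) {
            // HDFS splits the file into blocks and replicates each block across
            // several nodes; that replication is also where the fault tolerance
            // in point 3 comes from.
            out.write("first raw, unprocessed record\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```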
When should we reconsider using it?
1. #MapReduce #programming is not a good match for iterative and interactive analytic tasks, because it is file-intensive: each pass of a multi-step job writes its intermediate results back to disk (see the word-count sketch after this list)
2. #TalentGap It can be difficult to find entry-level programmers who have sufficient Java skills to be productive with MapReduce.
3. #DataSecurity is an issue, but the Kerberos authentication protocol is a good first step
4. #DataManagement and governance are difficult, as Hadoop lacks easy-to-use, full-featured tools for data cleansing, governance and metadata
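To make points 1 and 2 concrete, here is the classic MapReduce word count, essentially the canonical example from the Hadoop tutorials. Notice how much Java boilerplate a simple count requires, and that the map output is shuffled to the reducers through the file system between the two phases:

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map phase: emit (word, 1) for every word in this node's input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts for each word. Between map and reduce,
    // intermediate results go through the file system, which is the
    // "file-intensive" step that slows down iterative jobs.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, a job like this would typically be launched with hadoop jar wordcount.jar WordCount <input dir> <output dir>, with both directories living on HDFS.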
You can find more information about Hadoop here: https://www.sas.com/en_us/insights/big-data/hadoop.html
#TechCareerMentors #SoftwareToolsAndPlatforms