Hadoop
The gist of #Hadoop
History: At the end of the 1990s, the internet was booming, and a wave of open-source projects and startups raced to build automated web crawlers and powerful search engines. Hadoop emerged out of the Nutch project, with distributed storage and processing at its core. In 2008, Yahoo released Hadoop as an open-source project. Today, Hadoop’s framework and ecosystem of technologies are managed and maintained by the non-profit Apache Software Foundation (ASF), a global community of software developers and contributors.
What can you do with it?
1. Store and process huge amounts of data quickly #AbilityToStoreAndProcessBigDataQuickly
2. Its distributed computing model processes big data fast: the more nodes you use, the more processing power you have #ComputerPower
3. Jobs don’t fail when a node goes down, because its work is automatically redirected to another node #FaultTolerance
4. You can store as much data as you want without preprocessing it, and decide how to use it later (see the storage sketch after this list) #Flexibility
5. Keep costs low, as the framework is free and runs on commodity hardware #LowCost
6. You can easily expand your system by adding more nodes #Scalability
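To make points 1 and 4 a bit more concrete, here is a minimal sketch of writing a raw, unprocessed file into HDFS with Hadoop’s Java FileSystem API. The NameNode address and the paths are hypothetical placeholders, not values from any real cluster:

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster's NameNode (address is a placeholder).
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/data/raw/events.txt"))) {
            // HDFS splits the file into blocks and replicates each block across
            // several nodes; that replication is also where the fault tolerance
            // in point 3 comes from.
            out.write("first raw, unprocessed record\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```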
When should we reconsider using it?
1. #MapReduce #programming is not a good match for iterative and interactive analytic tasks, because it is file-intensive: each pass of a multi-step job writes its intermediate results back to disk (see the word-count sketch after this list)
2. #TalentGap It can be difficult to find entry-level programmers who have sufficient Java skills to be productive with MapReduce.
3. #DataSecurity is an issue, but the Kerberos authentication protocol is a good first step
4. #DataManagement and governance are difficult, as Hadoop lacks easy-to-use, full-featured tools for data cleansing, governance and metadata
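To make points 1 and 2 concrete, here is the classic MapReduce word count, essentially the canonical example from the Hadoop tutorials. Notice how much Java boilerplate a simple count requires, and that the map output is shuffled to the reducers through the file system between the two phases:

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map phase: emit (word, 1) for every word in this node's input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts for each word. Between map and reduce,
    // intermediate results go through the file system, which is the
    // "file-intensive" step that slows down iterative jobs.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, a job like this would typically be launched with hadoop jar wordcount.jar WordCount <input dir> <output dir>, with both directories living on HDFS.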
You can find more information about Hadoop here: https://www.sas.com/en_us/insights/big-data/hadoop.html
#TechCareerMentors #SoftwareToolsAndPlatforms