BIG DATA HADOOP DEVELOPER

About The Course

Collabera's Big Data Hadoop Developer course delivers the key concepts and expertise necessary to create robust data processing applications using Apache Hadoop. The course covers core concepts in depth, along with their implementation on varied industry use cases. It equips participants to work in the Hadoop environment with ease and to learn vital components such as ZooKeeper, Oozie, Flume, Sqoop, Spark, MongoDB, Cassandra and Neo4j.

Curriculum:

Module 1 Learning Objectives - Introduction to Linux
Most Big Data software runs on Linux, so knowledge of Linux is a must for anyone looking to get into the various aspects of Big Data. Expertise in Linux is not required, but basic knowledge is. The Linux sessions cover just enough Ubuntu concepts for an aspirant to get started with Big Data quickly.

Module 2 Learning Objectives - What is Big Data?
With the prerequisites complete, it is time to jump into Big Data. Before diving into the technical aspects, participants are given a holistic view of what Big Data is all about. This will help them plan their career path and also work efficiently in their work environments.

Module 3 Learning Objectives - HDFS (Hadoop Distributed File System)
Data is everywhere, and we are constantly generating large amounts of it that need to be stored. HDFS, the Hadoop Distributed File System, allows huge amounts of data to be stored in a cost-effective manner. This session covers what HDFS is all about: its architecture and how to interface with it.
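The core of that architecture is simple: HDFS splits each file into fixed-size blocks (128 MB by default in recent Hadoop versions) and replicates every block (default factor 3) across DataNodes. The toy Python sketch below illustrates only that idea, with a tiny block size for readability; it is not the real HDFS implementation, and the node names are made up.

```python
# Illustrative sketch of HDFS-style block splitting and replica placement.
# Real HDFS uses 128 MB blocks and rack-aware placement; here the block
# size is tiny and placement is simple round-robin, purely for illustration.

def split_into_blocks(data: bytes, block_size: int):
    """Split a byte string into fixed-size blocks (the last may be smaller)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks: int, datanodes: list, replication: int = 3):
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    placement = {}
    for b in range(num_blocks):
        placement[b] = [datanodes[(b + r) % len(datanodes)]
                        for r in range(replication)]
    return placement

blocks = split_into_blocks(b"some file content stored in HDFS", block_size=8)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
print(len(blocks), placement[0])  # 4 blocks; block 0 on dn1, dn2, dn3
```

Losing one DataNode costs at most one replica of a block, never the block itself, which is how HDFS stays cost-effective on commodity hardware.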

Module 4 Learning Objectives - MapReduce
Once the data has been stored in HDFS, it is time to process it. There are many ways to process data, and MapReduce, introduced by Google, is one of the earliest and most popular models. We will look into how to develop, debug, optimize and deploy MapReduce programs in different languages.
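The model itself has three phases: map emits key-value pairs, the framework shuffles them so all values for a key land together, and reduce aggregates each group. A minimal word-count sketch of those phases in plain Python (on a real cluster the map and reduce calls run in parallel on many nodes and the shuffle moves data over the network, but the logic is the same):

```python
# Word count expressed as map -> shuffle -> reduce, in plain Python.
from collections import defaultdict

def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word in the input line."""
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: sum the counts for one word."""
    return (key, sum(values))

lines = ["big data hadoop", "hadoop stores big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'hadoop': 2, 'stores': 1}
```

In Hadoop proper the same mapper and reducer would be written as Java classes (or fed to Hadoop Streaming), but every MapReduce job reduces to these three phases.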

Module 5 Learning Objectives - Pig
MapReduce, from the previous session, is a bit verbose, and it is difficult to write programs directly in it. That is why Yahoo started a project called Pig for data processing. Pig programs are compact and easy to write, which is why most companies pick Pig over raw MapReduce programming. This session will look at the Pig programming model.
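To see the productivity gain Pig aims for, compare the separate map, shuffle and reduce functions above with a compact high-level equivalent. The sketch below uses Python's `Counter` only to mimic the brevity of a Pig script; a real Pig script would do roughly LOAD, TOKENIZE, GROUP and COUNT in a few lines of Pig Latin instead of separate mapper and reducer classes.

```python
# The same word count as two lines of high-level code, mimicking the
# compactness Pig offers over hand-written MapReduce.
from collections import Counter

lines = ["big data hadoop", "hadoop stores big data"]
counts = Counter(word for line in lines for word in line.split())
print(counts["hadoop"])  # 2
```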

Module 6 Learning Objectives - Hive
Similar to Pig at Yahoo, Hive was developed by Facebook as an alternative to the MapReduce processing model. Like Pig, Hive also provides better developer productivity than MapReduce. The good thing about Hive is that it provides an SQL-like interface, which makes it easy to write queries against Hive.
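HiveQL looks very much like standard SQL. Since no Hadoop cluster is assumed here, the sketch below uses Python's built-in SQLite as a stand-in to show the kind of query a Hive user writes; in Hive the table would be backed by files in HDFS and the query compiled into MapReduce jobs. The table and data are made up for illustration.

```python
# SQL-like querying, with SQLite standing in for Hive (no cluster needed).
# A HiveQL query such as  SELECT dept, COUNT(*) FROM employees GROUP BY dept
# reads the same; Hive would execute it as MapReduce jobs over HDFS files.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("asha", "data"), ("ravi", "data"), ("meera", "web")])
rows = conn.execute(
    "SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept"
).fetchall()
print(rows)  # [('data', 2), ('web', 1)]
```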

Module 7 Learning Objectives - NoSQL (HBase)
NoSQL databases are the databases of Big Data. There are more than 125 NoSQL databases, and they fall into the following categories:
- Key-value databases (Accumulo, Dynamo, Riak, etc.)
- Columnar databases (HBase, Cassandra, etc.)
- Document databases (MongoDB, CouchDB, etc.)
- Graph databases (Neo4j, FlockDB, etc.)
In this session, we will look into what NoSQL is all about, the characteristics of these databases, and where NoSQL performs better than an RDBMS. We will also look at HBase in detail.
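Conceptually, HBase stores data as a sparse, sorted map: row key → column (family:qualifier) → timestamped versions of a value. The toy Python class below sketches only that data model; it is not the HBase client API (real clients such as the Java API or happybase talk to a running cluster), and the table contents are invented.

```python
# Toy model of HBase's logical layout:
#   row key -> column ("family:qualifier") -> {timestamp: value}.
# Real HBase also handles regions, WALs and compactions; this sketch
# illustrates the data model only, not the HBase client API.

class ToyHBaseTable:
    def __init__(self):
        self.rows = {}  # row key -> {column: {timestamp: value}}

    def put(self, row, column, value, timestamp):
        self.rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column):
        """Return the newest version of a cell, like HBase's default Get."""
        versions = self.rows.get(row, {}).get(column, {})
        return versions[max(versions)] if versions else None

    def scan(self):
        """Yield rows in sorted row-key order, as an HBase scan does."""
        for row in sorted(self.rows):
            yield row, self.rows[row]

t = ToyHBaseTable()
t.put("user1", "info:name", "Asha", timestamp=1)
t.put("user1", "info:name", "Asha K", timestamp=2)  # newer version wins on read
print(t.get("user1", "info:name"))  # Asha K
```

Note what is missing compared to an RDBMS: no fixed schema (any row can have any columns) and no joins, which is exactly the trade-off that lets columnar stores scale horizontally.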

Module 8 Learning Objectives - Big Data Ecosystem
Hadoop started the Big Data revolution, but there are many software projects besides Hadoop that either address its limitations or augment it. In this session we will look at some of them.
Key topics
- ZooKeeper, Oozie, Flume, Sqoop, Spark, MongoDB, Cassandra, Neo4j

Module 9 Learning Objectives - Big Data Administration
The course is mainly geared towards developers, so it deals more with how to use each piece of software than with how to install it. This section will briefly touch upon the administrative aspects of Big Data.
Key topics
- Theory on how the Big Data virtual machine has been created
- Introduction to Cloud
- Demo of the creation of the Cloudera CDH cluster on the Amazon AWS cloud.

Module 10 Learning Objectives – Proof of Concepts
The sessions above covered how the individual pieces of software work. In the POC (proof of concept) we will see how they can be integrated and what can be accomplished as a whole.
The POCs will be close to real-life use cases such as those at Amazon, eBay, Google and other big companies. They will give participants an idea of how Big Data software has to be integrated and how it is used to solve actual problems.
The POC section will involve close to 3 hours of discussion and practice. An Internet connection is required for participants to work on the POC.

WATCH SAMPLE VIDEO: https://www.youtube.com/watch?v=lOTNMMAiOb0&feature=youtu.be 

Interested candidates may inbox me their contact details OR email me their updated resume at unmeesh.sankpal@collabera.com or call me on

India (Cell): +91 759.787.8426 | US: 973.606.3154 | Singapore: +65 31580212

Visit tact.collabera.com 

 
