Aggregation Framework and Map-Reduce in MongoDB
Task Description: Use the Aggregation Framework of MongoDB and create a mapper and reducer program.
What is NoSQL ?
NoSQL databases (also known as "not only SQL") are non-tabular and store data differently from relational tables. They come in a variety of types based on their data model; the main types are document, key-value, wide-column, and graph. They provide flexible schemas and scale easily with large amounts of data and high user loads.
What is MongoDB ?
MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.
What is MongoDB Aggregation Framework ?
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.
What is Aggregation Pipeline ?
MongoDB’s aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.
What is Map Reduce Function ?
MapReduce is a software framework and programming model used for processing huge amounts of data. A MapReduce program works in two phases, namely Map and Reduce. Map tasks deal with splitting and mapping the data, while Reduce tasks shuffle and reduce the data.
In MongoDB, map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results.
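The phases described above can be sketched in plain JavaScript, without a database. This is only an illustration of the paradigm, not how MongoDB implements it; the sample documents and the helper names (mapPhase, shufflePhase, reducePhase) are hypothetical.

```javascript
// Hypothetical sample documents; the field names follow the article's `countries` collection.
const countries = [
  { CountryName: "France", Language: "French" },
  { CountryName: "Belgium", Language: "French" },
  { CountryName: "Spain", Language: "Spanish" },
];

// Map phase: turn each document into a [key, value] pair.
function mapPhase(docs) {
  return docs.map((doc) => [doc.Language, 1]);
}

// Shuffle phase: collect all values that share a key.
function shufflePhase(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce phase: condense each key's values into a single result.
function reducePhase(groups) {
  const result = {};
  for (const [key, values] of groups) {
    result[key] = values.reduce((sum, v) => sum + v, 0);
  }
  return result;
}

const countsByLanguage = reducePhase(shufflePhase(mapPhase(countries)));
console.log(countsByLanguage); // { French: 2, Spanish: 1 }
```

Keeping the three phases as separate functions makes it easy to see that only the map and reduce steps are user-supplied; the grouping in between is done by the framework.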
We will count the countries per language in a countries collection using two MongoDB aggregation approaches:
Method 1: Aggregation Pipeline
db.countries.aggregate([
  {$group: {_id: {Language: "$Language"}, totalCountry: {$sum: 1}}},
  {$sort: {totalCountry: 1}}
])
# {$group: {_id: {Language: "$Language"}}} --> group the documents by Language
# totalCountry: {$sum: 1} --> count the countries associated with each language
# {$sort: {totalCountry: 1}} --> sort the groups by that count in ascending order
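To make the effect of the two stages concrete, here is a rough in-memory emulation in plain JavaScript. It is a sketch under assumptions: the sample documents are made up, and groupByLanguage and sortByTotal are hypothetical helpers that only mimic the shape of what $group and $sort return.

```javascript
// Hypothetical sample documents shaped like the article's `countries` collection.
const sampleCountries = [
  { CountryName: "France", Language: "French" },
  { CountryName: "Spain", Language: "Spanish" },
  { CountryName: "Belgium", Language: "French" },
  { CountryName: "Mexico", Language: "Spanish" },
  { CountryName: "Japan", Language: "Japanese" },
  { CountryName: "Argentina", Language: "Spanish" },
];

// Mimics {$group: {_id: {Language: "$Language"}, totalCountry: {$sum: 1}}}.
function groupByLanguage(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.Language, (totals.get(doc.Language) || 0) + 1);
  }
  return [...totals].map(([language, total]) => ({
    _id: { Language: language },
    totalCountry: total,
  }));
}

// Mimics {$sort: {totalCountry: 1}} (ascending order).
function sortByTotal(docs) {
  return [...docs].sort((a, b) => a.totalCountry - b.totalCountry);
}

const grouped = sortByTotal(groupByLanguage(sampleCountries));
console.log(JSON.stringify(grouped, null, 2));
```

Each stage takes an array of documents and returns a new array, which is the same contract the pipeline stages follow: the output of one stage is the input of the next.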
Method 2: Map Reduce Function
var mapFunction = function() { … };
var reduceFunction = function(key, values) { … };
db.runCommand(
{
mapReduce: <input-collection>,
map: mapFunction,
reduce: reduceFunction,
out: { merge: <output-collection> },
query: <query>
}
)
Declaring Map variable :
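var mapFunc1 = function() {
    // Emit one (Language, 1) pair per country document
    emit(this.Language, 1);
};
# The mapper emits the country's language as the key and 1 as the value, so the emitted values are grouped by Language. emit() has no return value, and $split is an aggregation pipeline operator rather than a JavaScript statement, so neither is used inside the function.
Declaring Reduce variable :
var ReduceFunc1 = function(keyLang, valuesCount) {
    // Add up the 1s emitted for this language
    return Array.sum(valuesCount);
};
# After grouping, the reducer counts the countries for each language by summing the values sent by the mapper. Summing, unlike returning valuesCount.length, stays correct when MongoDB calls the reducer again on partial results, and it matches the value kept for a language that has only one country (the reducer is not called for a single value).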
Using Map Reduce Function :
db.countries.mapReduce(
mapFunc1,
ReduceFunc1,
{out: "map_reduced"}
)
# Run the map-reduce job and save the result in the map_reduced collection
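One point worth checking when writing a reducer: MongoDB may invoke the reduce function more than once for the same key, passing earlier reduce output back in as one of the values. The plain JavaScript sketch below imitates that behavior, assuming the mapper emits 1 per document; reduceInChunks is a hypothetical helper, not a MongoDB API. It compares a reducer that sums the emitted values with one that returns the array length.

```javascript
// Two candidate reducers over values emitted as 1 per document.
const sumReducer = (key, values) => values.reduce((total, v) => total + v, 0);
const lengthReducer = (key, values) => values.length;

// Hypothetical helper: reduce the values in chunks, then feed the partial
// results back into the same reducer, imitating a re-reduce step.
function reduceInChunks(reducer, key, values, chunkSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    partials.push(reducer(key, values.slice(i, i + chunkSize)));
  }
  return reducer(key, partials);
}

const emitted = [1, 1, 1, 1, 1]; // five countries sharing one language

const sumResult = reduceInChunks(sumReducer, "Spanish", emitted, 2);
const lengthResult = reduceInChunks(lengthReducer, "Spanish", emitted, 2);
console.log(sumResult);    // 5
console.log(lengthResult); // 3
```

The summing reducer gives the same answer however the values are batched, while the length-based one counts the number of partial results instead of the number of countries.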