All about NoSQL - MongoDB

1) What is NoSQL?

NoSQL (Not Only SQL) is a type of database that provides a flexible alternative to traditional relational databases. It is designed to handle unstructured, semi-structured, and structured data efficiently. NoSQL databases are widely used in big data applications, real-time analytics, and cloud-based systems.

2) Why MongoDB?

MongoDB is one of the most popular NoSQL databases, known for its:

Document-oriented storage (uses JSON-like BSON format)

Schema flexibility (dynamic schemas)

Scalability (horizontal scaling using sharding)

High performance (fast reads/writes)

Rich query language (support for indexing, aggregation, and geospatial queries)

3) Key Features of MongoDB:

Collections and Documents: Instead of tables and rows, MongoDB stores data in collections and documents.

Flexible Schema: No predefined schema, allowing varied structures.

Indexing: Supports various types of indexes for faster queries.

Replication: Ensures high availability via replica sets.

Sharding: Distributes data across multiple servers for scalability.

Aggregation Framework: Enables complex data processing.
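To make the flexible-schema point concrete, here is a minimal plain JavaScript sketch in which an array stands in for a collection. It needs no MongoDB server, and the documents and field names are made up for illustration.

```javascript
// An array stands in for a MongoDB collection (illustrative data only).
const users = [
  { _id: 1, name: "Alice", age: 25, city: "NY" },
  { _id: 2, name: "Bob", email: "bob@example.com", tags: ["admin", "beta"] },
  { _id: 3, name: "Carol", address: { city: "CA", zip: "90001" } }
];

// Collect the distinct top-level field names used across all documents.
const fieldNames = [...new Set(users.flatMap((doc) => Object.keys(doc)))];

// Every document has _id and name, but the remaining fields differ per document.
console.log(fieldNames);
```

No document has to declare its fields in advance, which is what "no predefined schema" means in practice.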

4) When to Use NoSQL?

High scalability needed

Unstructured or semi-structured data

High read/write throughput

Real-time big data processing

Flexible schema requirements

5) MongoDB Deployment Options

Self-hosted (on-premises)
MongoDB Atlas (cloud-based managed service)
Docker & Kubernetes (containerized deployments)

6) MongoDB vs. SQL Databases

7) Basic MongoDB Operations

7.1) Create/Select Database

use myUser

db.createCollection("myUser")

7.2) Inserting Document(s)

db.myUser.insertOne({ name: "Alice", age: 25, city: "NY" }); // Inserts a single document

db.myUser.insertMany([ { name: "Alice", age: 25, city: "NY" }, { name: "Bob", age: 28, city: "CA" } ]); // Inserts multiple documents at once

7.3) Retrieving Document(s)

db.myUser.find(); // Fetch all documents

db.myUser.find({ age: { $gt: 25 } }); // Fetch data using operators

db.myUser.findOne({ name: "John Doe" }); // Fetch the first document that matches the filter

7.4) Updating Document(s)

db.myUser.updateOne({ name: "John Doe" }, { $set: { age: 31 } }); // Updates one field on the first matching document

db.myUser.updateMany({}, { $set: { status: "active" } }); // Updates all documents in the collection

7.5) Deleting Document(s)

db.myUser.deleteOne({ name: "Alice" }); // Delete single document

db.myUser.deleteMany({ age: { $lt: 27 } }); // Deletes all documents that match the condition

7.6) Indexing for Performance

db.myUser.createIndex({ email: 1 }); // Creates an index on the email field
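Why an index speeds up a lookup can be sketched in plain JavaScript. This is a simplified model that runs without MongoDB: a Map stands in for the index (MongoDB's own indexes are B-tree structures), and the documents and helper names are hypothetical.

```javascript
// Hypothetical in-memory "collection".
const docs = [
  { _id: 1, name: "Alice", email: "alice@example.com" },
  { _id: 2, name: "Bob", email: "bob@example.com" },
  { _id: 3, name: "Carol", email: "carol@example.com" },
  { _id: 4, name: "Dave", email: "dave@example.com" }
];

// Without an index: examine documents one by one until one matches.
function findByScan(collection, email) {
  let examined = 0;
  for (const doc of collection) {
    examined += 1;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// With an index: build a lookup structure once, then jump straight to the document.
function buildEmailIndex(collection) {
  const index = new Map();
  for (const doc of collection) index.set(doc.email, doc);
  return index;
}

const emailIndex = buildEmailIndex(docs);
const scanResult = findByScan(docs, "carol@example.com");
const indexedDoc = emailIndex.get("carol@example.com");

console.log(scanResult.examined, indexedDoc.name);
```

The scan cost grows with the size of the collection, while the indexed lookup does not, at the price of extra storage and extra work on every write.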

8) Aggregation in MongoDB

The aggregation framework in MongoDB provides powerful data processing capabilities, similar to SQL's GROUP BY, COUNT(*), and aggregate functions. It allows grouping, filtering, sorting, transforming, and computing values across collections.

8.1) Aggregation Pipeline

The pipeline consists of a series of stages where each stage processes input documents and passes the output to the next stage.
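The idea of stages feeding one another can be sketched in plain JavaScript, where each stage is a function from an array of documents to a new array. This is only a model of the concept, not how the server executes pipelines; the helper names and sample data are assumptions.

```javascript
// Each stage takes an array of documents and returns a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// Run the stages in order, feeding each stage's output into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const pipelineOrders = [
  { orderId: 1, status: "shipped", amount: 40 },
  { orderId: 2, status: "pending", amount: 90 },
  { orderId: 3, status: "shipped", amount: 75 },
  { orderId: 4, status: "shipped", amount: 10 }
];

const topShipped = runPipeline(pipelineOrders, [
  match((o) => o.status === "shipped"),
  sortBy("amount", -1), // -1 means descending, as in $sort
  limit(2)
]);

// The two largest shipped orders: orderId 3, then orderId 1.
console.log(topShipped.map((o) => o.orderId));
```

Because every stage only sees the previous stage's output, the order of stages matters: filtering first means later stages handle fewer documents.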

8.2) Common Aggregation Stages

$match – Filters documents based on conditions (similar to find).

$group – Groups documents by a specified field and performs aggregation (sum, avg, count, etc.).

$project – Reshapes documents by including/excluding fields or computing new ones.

$sort – Sorts documents based on a specified field.

$limit – Limits the number of documents.

$skip – Skips a specified number of documents.

$unwind – Deconstructs an array field into multiple documents.

$lookup – Performs a left outer join with another collection.

$addFields – Adds new fields to documents.

$facet – Runs multiple aggregation pipelines in a single query.

Example For Aggregation Pipeline:

db.orders.aggregate([

{ $match: { status: "shipped" } },

{ $group: { _id: "$customerId", totalAmount: { $sum: "$amount" } } },

{ $sort: { totalAmount: -1 } },

{ $limit: 5 }

])

Explanation of the example above:

The pipeline first keeps only orders whose status is "shipped", then groups them by customerId and sums amount into totalAmount. It sorts the groups by totalAmount in descending order and uses $limit to return only the top 5 customers.
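For readers more comfortable with application code, the same computation can be written in plain JavaScript. This is a rough equivalent under assumed sample data (the customer IDs and amounts are invented), useful mainly for checking your understanding of $group and $sum.

```javascript
// Hypothetical sample data shaped like the orders used in the example.
const sampleOrders = [
  { customerId: "c1", status: "shipped", amount: 120 },
  { customerId: "c2", status: "shipped", amount: 80 },
  { customerId: "c1", status: "pending", amount: 500 },
  { customerId: "c2", status: "shipped", amount: 70 },
  { customerId: "c3", status: "shipped", amount: 30 }
];

function topCustomers(orders, n) {
  const totals = new Map();
  for (const order of orders) {
    if (order.status !== "shipped") continue;                // $match
    const running = totals.get(order.customerId) || 0;
    totals.set(order.customerId, running + order.amount);    // $group with $sum
  }
  return [...totals.entries()]
    .map(([customerId, totalAmount]) => ({ _id: customerId, totalAmount }))
    .sort((a, b) => b.totalAmount - a.totalAmount)           // $sort: -1
    .slice(0, n);                                            // $limit
}

console.log(topCustomers(sampleOrders, 2));
```

With this data, the pending order for c1 is ignored, so c2 (150) ranks ahead of c1 (120).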

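Performance Considerations:

Put $match as early as possible so it can use an index and shrink the input to later stages.
Minimize $unwind where possible, since it multiplies the number of documents in the pipeline.
Use $project to drop fields you do not need so that less data flows through later stages.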

8.3) Here's an advanced example demonstrating multiple aggregation stages in MongoDB.

Scenario: Let's say we have a sales collection where each document has a status field and an items array, and each item has a product, a quantity, and a price. We want to keep only completed sales, unwind the items array so that each item becomes its own document, calculate total revenue per product, sort products by total revenue, and show only the top 5 products.

Query:

db.sales.aggregate([

{ $match: { status: "completed" } },

{ $unwind: "$items" },

{

$group: {

_id: "$items.product",

totalRevenue: { $sum: { $multiply: ["$items.quantity", "$items.price"] } },

totalUnitsSold: { $sum: "$items.quantity" }

}

},

{ $sort: { totalRevenue: -1 } },

{ $limit: 5 }

])

Explanation of the query above:

$match: Filters documents where status is "completed".
$unwind: Outputs one document per element of the items array.
$group: Groups by product name and calculates:

totalRevenue = sum of quantity * price across the product's items

totalUnitsSold = sum of quantities sold

$sort: Sorts products by total revenue in descending order.
$limit: Returns only the top 5 products by revenue.
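The $unwind and $group steps are often the hardest to picture, so here is a small plain JavaScript sketch of what they do. The sales data and function names are assumptions made for the illustration; no MongoDB server is involved.

```javascript
// Hypothetical sales documents with the shape used in the query.
const sales = [
  { status: "completed", items: [
    { product: "Laptop", quantity: 1, price: 1000 },
    { product: "Mouse", quantity: 2, price: 25 }
  ] },
  { status: "completed", items: [
    { product: "Mouse", quantity: 3, price: 25 }
  ] },
  { status: "cancelled", items: [
    { product: "Laptop", quantity: 5, price: 1000 }
  ] }
];

// $unwind: one output document per array element, other fields copied along.
function unwindItems(docs) {
  return docs.flatMap((doc) => doc.items.map((item) => ({ ...doc, items: item })));
}

// $match + $unwind + $group + $sort, written out by hand.
function revenueByProduct(docs) {
  const byProduct = new Map();
  const completed = docs.filter((d) => d.status === "completed");
  for (const row of unwindItems(completed)) {
    const entry = byProduct.get(row.items.product) ||
      { _id: row.items.product, totalRevenue: 0, totalUnitsSold: 0 };
    entry.totalRevenue += row.items.quantity * row.items.price;
    entry.totalUnitsSold += row.items.quantity;
    byProduct.set(entry._id, entry);
  }
  return [...byProduct.values()].sort((a, b) => b.totalRevenue - a.totalRevenue);
}

console.log(revenueByProduct(sales));
```

Note how the cancelled sale never contributes: it is removed before unwinding, which also keeps the intermediate result small.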

Sample output:

[

{ "_id": "Laptop", "totalRevenue": 50000, "totalUnitsSold": 50 },

{ "_id": "Mouse", "totalRevenue": 1500, "totalUnitsSold": 60 },

{ "_id": "Keyboard", "totalRevenue": 1200, "totalUnitsSold": 40 }

]

8.4) Let's explore $lookup, MongoDB's equivalent of a SQL left outer join, using an advanced example.

Scenario: We have two collections, customers and orders. We want to retrieve all customers along with their orders and show the total order value per customer.

A document in the customers collection (plain string IDs are used here for readability; a real ObjectId is built from a 24-character hexadecimal string):

{
  "_id": "C001",
  "name": "Stacy Doe",
  "email": "[email protected]"
}

A document in the orders collection:

{
  "_id": "O1001",
  "customerId": "C001",
  "items": [
    { "product": "Laptop", "quantity": 1, "price": 1200 },
    { "product": "Mouse", "quantity": 2, "price": 30 }
  ],
  "orderDate": ISODate("2024-03-10T12:00:00Z"),
  "status": "shipped"
}

Query using $lookup:

db.customers.aggregate([
  {
    $lookup: {
      from: "orders",             // The collection to join
      localField: "_id",          // The field from 'customers'
      foreignField: "customerId", // The field from 'orders'
      as: "customerOrders"        // Output array field
    }
  },
  { $unwind: "$customerOrders" }, // One document per order; customers with no orders are dropped
  {
    $group: {
      _id: "$_id",                // Group by customer
      name: { $first: "$name" },
      email: { $first: "$email" },
      totalSpent: {
        $sum: {                   // Accumulates across the customer's orders
          $sum: {                 // Adds up quantity * price within one order
            $map: {
              input: "$customerOrders.items",
              as: "item",
              in: { $multiply: ["$$item.quantity", "$$item.price"] }
            }
          }
        }
      }
    }
  },
  { $sort: { totalSpent: -1 } }   // Sort by highest spending customers
])

Explanation of the query above:

1. $lookup: Joins customers with orders on _id = customerId.

2. $unwind: Expands the customerOrders array to process each order separately.

3. $group: Groups by the customer's _id and calculates:

totalSpent: Multiplies quantity * price for each item in an order and sums the results across all of the customer's orders.

4. $sort: Orders customers by total spending in descending order.
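The join itself can be modeled in a few lines of plain JavaScript. The sketch below is an approximation with assumed data: the second customer, who has no orders, is hypothetical and is there to show the left outer join behavior and the effect of $unwind.

```javascript
const customers = [
  { _id: "C001", name: "Stacy Doe" },
  { _id: "C002", name: "Sam Roe" } // hypothetical customer with no orders
];
const orders = [
  { _id: "O1001", customerId: "C001", items: [
    { product: "Laptop", quantity: 1, price: 1200 },
    { product: "Mouse", quantity: 2, price: 30 }
  ] }
];

// $lookup: attach every matching order as an array; unmatched customers get [].
function lookupOrders(customerDocs, orderDocs) {
  return customerDocs.map((c) => ({
    ...c,
    customerOrders: orderDocs.filter((o) => o.customerId === c._id)
  }));
}

// Sum quantity * price over every item of every order.
function totalSpent(customerOrders) {
  return customerOrders
    .flatMap((o) => o.items)
    .reduce((sum, item) => sum + item.quantity * item.price, 0);
}

const joined = lookupOrders(customers, orders);
const report = joined
  .filter((c) => c.customerOrders.length > 0) // mimics $unwind dropping empty arrays
  .map((c) => ({ _id: c._id, name: c.name, totalSpent: totalSpent(c.customerOrders) }));

console.log(report);
```

After the join, the customer without orders still exists with an empty customerOrders array; it only disappears at the unwind step. In MongoDB, adding preserveNullAndEmptyArrays: true to $unwind keeps such customers in the result.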

Sample output:

[

{

"_id": "C001",

"name": "Stacy Doe",

"email": "[email protected]",

"totalSpent": 1260

}

]

8.5) Filtering Orders by Date

Scenario: To get only customers who placed orders in the last 30 days, add a $match stage after $lookup. The cutoff date in the query is hard-coded for illustration.

Query:

db.customers.aggregate([
  {
    $lookup: {
      from: "orders",             // The collection to join
      localField: "_id",          // The field from 'customers'
      foreignField: "customerId", // The field from 'orders'
      as: "customerOrders"        // Output array field
    }
  },
  // Keep customers with at least one order on or after the cutoff date
  { $match: { "customerOrders.orderDate": { $gte: ISODate("2024-02-10T00:00:00Z") } } },
  { $unwind: "$customerOrders" }, // One document per order
  {
    $group: {
      _id: "$_id",                // Group by customer
      name: { $first: "$name" },
      email: { $first: "$email" },
      totalSpent: {
        $sum: {                   // Accumulates across the customer's orders
          $sum: {                 // Adds up quantity * price within one order
            $map: {
              input: "$customerOrders.items",
              as: "item",
              in: { $multiply: ["$$item.quantity", "$$item.price"] }
            }
          }
        }
      }
    }
  },
  { $sort: { totalSpent: -1 } }   // Sort by highest spending customers
])
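The query above uses a fixed cutoff date. A small helper can compute "30 days ago" instead; the function name daysAgo is an assumption, and the sketch is plain JavaScript that also works in mongosh, which accepts JavaScript Date values in filters.

```javascript
// Return the Date that lies `days` days before `now`.
function daysAgo(days, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - days * msPerDay);
}

// With a fixed "now" the result is reproducible.
const cutoff = daysAgo(30, new Date("2024-03-11T00:00:00Z"));
console.log(cutoff.toISOString()); // 2024-02-10T00:00:00.000Z
```

In the $match stage, the hard-coded ISODate(...) value could then be replaced with daysAgo(30).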

9) Query Operators

$gt – Greater than. Example: { age: { $gt: 25 } }

$lt – Less than. Example: { age: { $lt: 30 } }

$gte – Greater than or equal. Example: { age: { $gte: 18 } }

$lte – Less than or equal. Example: { age: { $lte: 60 } }

$eq – Equals. Example: { city: { $eq: "NY" } }

$ne – Not equals. Example: { city: { $ne: "LA" } }

$in – Matches any value in the array. Example: { city: { $in: ["NY", "LA"] } }

$nin – Matches none of the values in the array. Example: { city: { $nin: ["NY", "LA"] } }

$exists – Field exists or not. Example: { state: { $exists: true } }
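To see how such filters are evaluated, here is a deliberately simplified matcher in plain JavaScript. It is a sketch that handles only top-level fields and the operators listed above; it is not MongoDB's implementation and ignores details such as nested paths, array fields, and cross-type comparison rules.

```javascript
// One small function per operator: (field value, operand) => boolean.
const operators = {
  $gt: (value, x) => value > x,
  $lt: (value, x) => value < x,
  $gte: (value, x) => value >= x,
  $lte: (value, x) => value <= x,
  $eq: (value, x) => value === x,
  $ne: (value, x) => value !== x,
  $in: (value, list) => list.includes(value),
  $nin: (value, list) => !list.includes(value)
};

// True when `doc` satisfies every condition in `filter`.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const isOperatorObject =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorObject) return doc[field] === condition; // plain equality
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$exists") return (field in doc) === operand;
      if (!operators[op]) throw new Error("Unsupported operator: " + op);
      return operators[op](doc[field], operand);
    });
  });
}

const people = [
  { name: "Alice", age: 25, city: "NY" },
  { name: "Bob", age: 28, city: "CA", state: "California" }
];

console.log(people.filter((p) => matches(p, { age: { $gt: 25 } })).map((p) => p.name));
```

Several conditions in one filter are combined with a logical AND, for example { age: { $gte: 18, $lte: 60 }, city: "NY" }.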

10) Array Operators

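db.orders.find({ items: { $all: ["laptop", "mouse"] } }); // Matches documents whose items array contains all the listed values
db.orders.find({ items: { $size: 3 } }); // Matches documents whose items array has exactly 3 elements
db.orders.updateOne({}, { $push: { items: "keyboard" } }); // Appends an item to the array of the first matching document
db.orders.updateOne({}, { $pull: { items: "mouse" } }); // Removes matching items from the array of the first matching document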
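The semantics of these four operators map closely onto ordinary array operations. The following plain JavaScript sketch, with helper names chosen for illustration, shows the intended behavior for arrays of simple values.

```javascript
// $all: the array contains every wanted value (order does not matter).
const containsAll = (array, wanted) => wanted.every((w) => array.includes(w));
// $size: the array has exactly n elements.
const hasSize = (array, n) => array.length === n;
// $push: append a value to the end of the array.
const push = (array, value) => [...array, value];
// $pull: remove every element equal to the value.
const pull = (array, value) => array.filter((v) => v !== value);

const items = ["laptop", "mouse", "mouse"];

console.log(containsAll(items, ["laptop", "mouse"])); // true
console.log(hasSize(items, 3));                       // true
console.log(push(items, "keyboard").length);          // 4
console.log(pull(items, "mouse"));                    // only "laptop" remains
```

Note that $pull removes all matching elements, not just the first one.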

11) Transactions (MongoDB 4.0+)

MongoDB 4.0 introduced multi-document transactions, allowing atomic operations across multiple documents and collections within a single replica set. MongoDB 4.2 extended them to sharded clusters.

Key Concepts of Transactions in MongoDB

Atomicity – Either all the operations in the transaction succeed, or none are applied.
ACID Compliance – Transactions ensure Atomicity, Consistency, Isolation, and Durability.
Session-Based – Transactions require a session to execute.
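Atomicity can be illustrated without a database. The plain JavaScript sketch below applies a list of operations to a working copy and publishes the copy only if every operation succeeds. It models the all-or-nothing guarantee only; it is not how MongoDB implements transactions, and all names in it are hypothetical.

```javascript
// Apply every operation to a working copy; keep the copy only if all succeed.
function runAtomically(state, operations) {
  const working = JSON.parse(JSON.stringify(state)); // deep copy of plain data
  try {
    for (const operation of operations) operation(working);
    return { committed: true, state: working };
  } catch (error) {
    return { committed: false, state, error: error.message }; // original state untouched
  }
}

const initial = { users: [{ _id: 1, balance: 1000 }], orders: [] };

const debit = (amount) => (db) => {
  const user = db.users.find((u) => u._id === 1);
  if (user.balance < amount) throw new Error("insufficient balance");
  user.balance -= amount;
};
const addOrder = (order) => (db) => { db.orders.push(order); };

const ok = runAtomically(initial, [debit(500), addOrder({ orderId: 101, amount: 500 })]);
const failed = runAtomically(initial, [addOrder({ orderId: 102, amount: 5000 }), debit(5000)]);

console.log(ok.committed, ok.state.users[0].balance);    // true 500
console.log(failed.committed, failed.state.orders.length); // false 0
```

In the failed run the order was added to the working copy before the debit threw, yet it never becomes visible, which is the behavior a transaction abort provides.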

Sample Query:

const session = db.getMongo().startSession();

try {
  session.startTransaction();

  // Collections obtained through the session run their operations inside the transaction
  const usersCollection = session.getDatabase("mydb").users;
  const ordersCollection = session.getDatabase("mydb").orders;

  usersCollection.updateOne({ _id: 1 }, { $set: { balance: 500 } });
  ordersCollection.insertOne({ orderId: 101, amount: 500 });

  session.commitTransaction(); // Makes both writes visible together
} catch (error) {
  print("Transaction failed: " + error);
  session.abortTransaction();  // Discards every write made in the transaction
} finally {
  session.endSession();
}

Key Notes

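Transactions can only be used with replica sets (MongoDB 4.0+) and sharded clusters (MongoDB 4.2+).
In MongoDB 4.0, all writes in a transaction had to fit in a single 16MB oplog entry; from 4.2 onward, a transaction is split across as many oplog entries as it needs, each limited to 16MB.
Write conflicts cause transactions to abort and require retrying.
Unless set explicitly, a transaction uses the session's or client's read and write concern: read concern defaults to local, and the implicit default write concern is majority from MongoDB 5.0 onward (w: 1 in earlier versions).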

12) User & Role Management

User and Role Management in MongoDB involves defining users, assigning roles, and managing permissions to ensure secure access control. MongoDB uses Role-Based Access Control (RBAC) to manage user privileges.

Sample Query:

db.createUser({
  user: "admin",
  pwd: "password123", // Example only; use a strong password in practice
  roles: [{ role: "readWrite", db: "myDatabase" }]
})

db.getUsers() // List users

db.dropUser("admin") // Delete user
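How role-based access control decides whether an operation is allowed can be sketched in plain JavaScript. The action lists below are simplified for illustration (real built-in roles grant more actions), and the function name isAllowed is an assumption.

```javascript
// Simplified, illustrative action lists; real built-in roles grant more actions.
const roleActions = {
  read: ["find"],
  readWrite: ["find", "insert", "update", "remove"]
};

// A user may perform an action if any of their roles on that database grants it.
function isAllowed(user, db, action) {
  return user.roles.some(
    (r) => r.db === db && (roleActions[r.role] || []).includes(action)
  );
}

// Same shape as the user created in the sample above.
const adminUser = { user: "admin", roles: [{ role: "readWrite", db: "myDatabase" }] };

console.log(isAllowed(adminUser, "myDatabase", "insert")); // true
console.log(isAllowed(adminUser, "otherDatabase", "find")); // false
```

The key point is that a role is always scoped to a database, so readWrite on myDatabase grants nothing on any other database.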

13) Backup & Restore

Backing up and restoring MongoDB is essential for data recovery, disaster management, and migrations. MongoDB provides several methods to back up and restore databases, including mongodump/mongorestore, file system snapshots, and oplog backups.

Sample commands (run from the system shell, not from mongosh):

# Backup MongoDB Database

mongodump --db=myDatabase --out=/backup/

# Restore MongoDB Database

mongorestore --db=myDatabase /backup/myDatabase/

Conclusion:

MongoDB is fast, scalable, and flexible.
Use indexing, aggregation, and sharding for performance optimization.
Implement replication & backups for high availability.
Secure MongoDB with authentication & role-based access.
$lookup = SQL JOIN
$unwind = Flatten arrays
$group = Summarize data

When to Use MongoDB?

Schema-less & flexible data

High write throughput

Real-time big data applications

Scaling horizontally (sharding)

JSON-like document storage


Varshini G

Sr. Quality Assurance Engineer/Sr. Scrum Master, currently seeking for new opportunities QA professional or AI (No Sponsorship Required)
