The Pymongo Playbook
nosql pymongo


A guide to the basic setup of a MongoDB Atlas cluster, and to using PyMongo to connect to it from Python and perform CRUD operations.

NoSQL

NoSQL, also referred to as “not only SQL” or “non-SQL”, is an approach to database design that enables data to be stored and queried outside the traditional tabular structures of relational databases, typically in a schema-less JSON format.

Instead of the typical tabular structure of a relational database, NoSQL databases house data within a single data structure, such as a JSON document.

NoSQL is also a type of distributed database: information is copied and stored on multiple servers, remote or local, which ensures availability and reliability. If part of the data goes offline, the rest of the database can continue to run.

Document Based DB → e.g. MongoDB

Key-Value Based DB → e.g. Amazon DynamoDB

Graph Based DB → e.g. Neo4j

Column Based DB → e.g. Cassandra, HBase
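To make the document model concrete, a single record in a document database is just a schema-less JSON document. A small illustrative sketch (the field names here are made up, not from any particular dataset):

```python
import json

# A schema-less document: fields can vary from record to record,
# unlike the fixed columns of a relational table.
iris_doc = {
    "species": "setosa",
    "sepalLength": 5.1,
    "petalLength": 1.4,
    "tags": ["flower", "iris"],  # nested arrays are allowed too
}

print(json.dumps(iris_doc, indent=2))
```

Because there is no fixed schema, another document in the same collection could add or omit fields freely.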

MongoDB

MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database, MongoDB uses JSON-like documents with optional schemas and supports CRUD operations.

C — Create DB | Collection | Document

R — Read DB | Collection | Document

U — Update DB | Collection | Document

D — Delete DB | Collection | Document

MongoDB Atlas

  • Create a cluster
  • Choose the Python driver and copy the connection string (server URI)

MongoDB with Python

PyMongo is a Python distribution containing tools for working with MongoDB, and is the recommended way to work with MongoDB from Python.

Requirements.txt:

pymongo
pymongo[srv]    # the [srv] extra installs dnspython, needed for mongodb+srv:// URIs

Create a new client and connect to the server

from pymongo import MongoClient

uri = "mongodb+srv://<username>:<password>@cluster0.k3s4vuf.mongodb.net/?retryWrites=true&w=majority"

# Create a new client and connect to the server
client = MongoClient(uri)
print(client)

## Output ##
MongoClient(host=['ac-edosefl-shard-00-02.k3s4vuf.mongodb.net:27017', 'ac-edosefl-shard-00-01.k3s4vuf.mongodb.net:27017', 'ac-edosefl-shard-00-00.k3s4vuf.mongodb.net:27017'], document_class=dict, tz_aware=False, connect=True, retrywrites=True, w='majority', authsource='admin', replicaset='atlas-t2x2vx-shard-0', tls=True)        

Send a ping to confirm a successful connection

status = client.admin.command('ping')
print("Response: {}".format(status))

## Output ##
Response: {'ok': 1}

Create a new Database

db = client.kaggle_iris
print("Database name is {0}".format(db.name))

## Output ##
Database name is kaggle_iris

Print list of Database names

print("List of databases in MongoDB Cluster: {0}".format(client.list_database_names()))

## Output ##
List of databases in MongoDB Cluster: ['forum', 'kaggle_iris', 'mongotest1', 'mongotest2', 'sample_airbnb', 'sample_analytics', 'sample_geospatial', 'sample_guides', 'sample_mflix', 'sample_restaurants', 'sample_supplies', 'sample_training', 'sample_weatherdata', 'admin', 'local']        

Create a new Collection & Print list of Collections

coll = db.iris
print("Collection name is {0}".format(coll.name))
print("List of collections in Database {0}: {1}".format(db.name, db.list_collection_names()))

## Output ##
Collection name is iris
List of collections in Database kaggle_iris: ['iris']        

Datasets | Documents in JSON format

doc = {"sepalLength": 5.5, "sepalWidth": 2.4, "petalLength": 3.7, "petalWidth": 1.0, "species": "versicolor"}
    
doc_list = [
{"sepalLength": 5.1, "sepalWidth": 3.5, "petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 4.9, "sepalWidth": 3.0, "petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 4.7, "sepalWidth": 3.2, "petalLength": 1.3, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 4.6, "sepalWidth": 3.1, "petalLength": 1.5, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 5.0, "sepalWidth": 3.6, "petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 5.4, "sepalWidth": 3.9, "petalLength": 1.7, "petalWidth": 0.4, "species": "setosa"},
{"sepalLength": 5.8, "sepalWidth": 2.6, "petalLength": 4.0, "petalWidth": 1.2, "species": "versicolor"},
{"sepalLength": 5.0, "sepalWidth": 3.4, "petalLength": 1.5, "petalWidth": 0.2, "species": "setosa"},
{"sepalLength": 4.4, "sepalWidth": 2.9, "petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
    {"sepalLength": 5.9, "sepalWidth": 3.0, "petalLength": 5.1, "petalWidth": 1.8, "species": "virginica"}]        

CRUD Operations

i.e. Create, Read, Update, and Delete

Insert one document

coll.insert_one(doc)

## Output ##
<pymongo.results.InsertOneResult at 0x7f8240a9e520>        

Insert more than one document

coll.insert_many(doc_list)

## Output ##
<pymongo.results.InsertManyResult at 0x7f8230bc7fa0>        

Find all documents in the collection

for x in coll.find():
    print(x)

## Output ##
{'_id': ObjectId('648f33cf03184eea7a245faa'), 'sepalLength': 5.5, 'sepalWidth': 2.4, 'petalLength': 3.7, 'petalWidth': 1.0, 'species': 'versicolor'}
{'_id': ObjectId('648f33cf03184eea7a245fab'), 'sepalLength': 5.1, 'sepalWidth': 3.5, 'petalLength': 1.4, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fac'), 'sepalLength': 4.9, 'sepalWidth': 3.0, 'petalLength': 1.4, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fad'), 'sepalLength': 4.7, 'sepalWidth': 3.2, 'petalLength': 1.3, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fae'), 'sepalLength': 4.6, 'sepalWidth': 3.1, 'petalLength': 1.5, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245faf'), 'sepalLength': 5.0, 'sepalWidth': 3.6, 'petalLength': 1.4, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fb0'), 'sepalLength': 5.4, 'sepalWidth': 3.9, 'petalLength': 1.7, 'petalWidth': 0.4, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fb1'), 'sepalLength': 5.8, 'sepalWidth': 2.6, 'petalLength': 4.0, 'petalWidth': 1.2, 'species': 'versicolor'}
{'_id': ObjectId('648f33cf03184eea7a245fb2'), 'sepalLength': 5.0, 'sepalWidth': 3.4, 'petalLength': 1.5, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fb3'), 'sepalLength': 4.4, 'sepalWidth': 2.9, 'petalLength': 1.4, 'petalWidth': 0.2, 'species': 'setosa'}
{'_id': ObjectId('648f33cf03184eea7a245fb4'), 'sepalLength': 5.9, 'sepalWidth': 3.0, 'petalLength': 5.1, 'petalWidth': 1.8, 'species': 'virginica'}        

Find only the first document in the collection

print(coll.find_one())

## Output ##
{'_id': ObjectId('648f33cf03184eea7a245faa'), 'sepalLength': 5.5, 'sepalWidth': 2.4, 'petalLength': 3.7, 'petalWidth': 1.0, 'species': 'versicolor'}        

Update document

query = {"sepalLength": 5.9}
newvalues = {"$set": {"petalWidth": 11.9}}
print("Document before update: {0}".format(coll.find_one(query)))
coll.update_one(query, newvalues)
print("Document after update: {0}".format(coll.find_one(query)))


## Output ##
Document before update: {'_id': ObjectId('648f33cf03184eea7a245fb4'), 'sepalLength': 5.9, 'sepalWidth': 3.0, 'petalLength': 5.1, 'petalWidth': 1.8, 'species': 'virginica'}
Document after update: {'_id': ObjectId('648f33cf03184eea7a245fb4'), 'sepalLength': 5.9, 'sepalWidth': 3.0, 'petalLength': 5.1, 'petalWidth': 11.9, 'species': 'virginica'}

Delete document

x = coll.delete_one({"sepalLength": 5.4})
print("{0} document(s) deleted using delete_one()".format(x.deleted_count))

## Output ##
1 document(s) deleted using delete_one()

Delete all documents

x = coll.delete_many({})  # An empty filter matches every document in the collection
print("{0} document(s) deleted using delete_many()".format(x.deleted_count))

## Output ##
10 document(s) deleted using delete_many()

Aggregate Pipeline

An aggregation pipeline consists of one or more stages that process documents:

  • Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values.
  • The documents that are output from a stage are passed to the next stage.
  • An aggregation pipeline can return results for groups of documents. For example, return the total, average, maximum, and minimum values.

e.g. $group $sum $min $max $filter $zip

from datetime import datetime

# Create new collection
mycol2 = db['Orders']

# Dates are stored as datetime objects so MongoDB saves them as BSON
# dates; a quoted "ISODate(...)" string would be stored as plain text.
multipleRecords = [
    { "_id": 0, "name": "Pepperoni", "size": "small", "price": 19,
      "quantity": 10, "date": datetime(2021, 3, 13, 8, 14, 30) },
    { "_id": 1, "name": "Pepperoni", "size": "medium", "price": 20,
      "quantity": 20, "date": datetime(2021, 3, 13, 9, 13, 24) },
    { "_id": 2, "name": "Pepperoni", "size": "large", "price": 21,
      "quantity": 30, "date": datetime(2021, 3, 17, 9, 22, 12) },
    { "_id": 3, "name": "Cheese", "size": "small", "price": 12,
      "quantity": 15, "date": datetime(2021, 3, 13, 11, 21, 39, 736000) },
    { "_id": 4, "name": "Cheese", "size": "medium", "price": 13,
      "quantity": 50, "date": datetime(2022, 1, 12, 21, 23, 13, 331000) },
    { "_id": 5, "name": "Cheese", "size": "large", "price": 14,
      "quantity": 10, "date": datetime(2022, 1, 12, 5, 8, 13) },
    { "_id": 6, "name": "Vegan", "size": "small", "price": 17,
      "quantity": 10, "date": datetime(2021, 1, 13, 5, 8, 13) },
    { "_id": 7, "name": "Vegan", "size": "medium", "price": 18,
      "quantity": 10, "date": datetime(2021, 1, 13, 5, 10, 13) }
]
status = mycol2.insert_many(multipleRecords)

print(status.inserted_ids)        

Output:

[0, 1, 2, 3, 4, 5, 6, 7]        
# Calculate Total Order Quantity

aggregate = mycol2.aggregate( [

    # Stage 1: Filter pizza order documents by pizza size and pass the output to the next stage
    {
        "$match": { "size": "medium" }
    },

    # Stage 2: Group remaining documents by pizza name and calculate total quantity
    {
        "$group": { "_id": "$name", "totalQuantity": { "$sum": "$quantity" } }
    }

] )

for doc in aggregate:
    print(doc)        

Output:

{'_id': 'Pepperoni', 'totalQuantity': 20} 
{'_id': 'Vegan', 'totalQuantity': 10} 
{'_id': 'Cheese', 'totalQuantity': 50}        
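For intuition, the two stages above can be mimicked with a plain-Python filter and group-by over similar order data. This is only a sketch of the semantics; the real pipeline executes server-side inside MongoDB:

```python
from collections import defaultdict

orders = [
    {"name": "Pepperoni", "size": "medium", "quantity": 20},
    {"name": "Cheese", "size": "medium", "quantity": 50},
    {"name": "Vegan", "size": "medium", "quantity": 10},
    {"name": "Pepperoni", "size": "small", "quantity": 10},
]

# Stage 1: $match -- keep only medium pizzas
medium = [o for o in orders if o["size"] == "medium"]

# Stage 2: $group with $sum -- total quantity per pizza name
totals = defaultdict(int)
for o in medium:
    totals[o["name"]] += o["quantity"]

print(dict(totals))  # {'Pepperoni': 20, 'Cheese': 50, 'Vegan': 10}
```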

Use Case

  • Import data from an external source kept in CSV format

# Importing CSV

import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/abhijitpaul0212/DataSets/main/AirPassengers.csv")        

  • Convert the data into a list of JSON-format records

import json
mycsvfile = list(json.loads(df.T.to_json()).values())        
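The transpose-and-parse trick above works; an equivalent and arguably more direct route is pandas' own `to_dict('records')`. A small sketch on an illustrative two-row frame (the column names here are made up for the example):

```python
import pandas as pd

df = pd.DataFrame({"date": ["1949-01-01", "1949-02-01"], "value": [112, 118]})

# One dict per row, ready to pass to insert_many()
records = df.to_dict("records")
print(records)
# [{'date': '1949-01-01', 'value': 112}, {'date': '1949-02-01', 'value': 118}]
```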

  • Push the converted data into MongoDB in one go

# Create new collection
mycol1 = db['AirPassengers']

# Insert all records
mycol1.insert_many(mycsvfile)        

  • Verify records are inserted

for doc in mycol1.find({}).limit(5):
    print(doc)

## Output ##
{'_id': ObjectId('64eaf250f4c2730326ff5b9c'), 'date': '1949-01-01', 'value': 112}
{'_id': ObjectId('64eaf250f4c2730326ff5b9d'), 'date': '1949-02-01', 'value': 118}
{'_id': ObjectId('64eaf250f4c2730326ff5b9e'), 'date': '1949-03-01', 'value': 132}
{'_id': ObjectId('64eaf250f4c2730326ff5b9f'), 'date': '1949-04-01', 'value': 129}
{'_id': ObjectId('64eaf250f4c2730326ff5ba0'), 'date': '1949-05-01', 'value': 121}
        

#nosql #mongodb #pymongo #knowledgesharing #opensource #communitybuilding #continuouslearning

If you liked reading this, you can follow this newsletter; I share a post here every Friday.

Thank you so much for reading!

If you want to read my learnings on any specific topic, let me know in the comment box. I will tag you too if I write on that.

xx

Abhijit Paul

