Single Node Mongo Replica on Docker
Sumit Sengupta
Multi-Cloud Architect 12x certified - Azure, AWS, GCP, OCI | Ex- (Microsoft, Apple, MongoDB) | Cybersecurity Instructor | AWS Academy Educator | 2x Top Voice - Database, Data Architecture | Mentor / Tech Volunteer
If you use MongoDB and you are like me, you probably run a Docker image for quick tests instead of doing a full-blown MongoDB installation.
Luckily, running mongo on Docker is very easy. First, let's download the image:
docker pull mongo
Next, if you want the data to persist, you can either use a Docker volume or map the data directory of the mongo container to a directory on your machine, so that your data is still there the next time you start a container. The default location of the MongoDB data directory is /data/db and the default port is 27017. To keep things simple, we will use these default values. We will run it as a single-node replica set by passing the --replSet flag shown below. Finally, for convenience, we will give the container a name so we can refer to it by name.
$ docker run -d --name gutlo-mongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0
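If you would rather use a Docker volume than a host directory, a minimal sketch could look like the following. The volume name gutlo-mongo-data and the function name are hypothetical, chosen only for this example; the rest of this article keeps using the /data/mongo host directory.

```shell
# Sketch: same container, but backed by a named Docker volume
# instead of a host directory.
# "gutlo-mongo-data" is a hypothetical volume name for this example.
start_mongo_with_volume() {
  local name="${1:-gutlo-mongo}"
  local volume="${2:-gutlo-mongo-data}"
  # Create the volume explicitly so it shows up in "docker volume ls".
  docker volume create "$volume" >/dev/null
  # Same flags as the command above; only the -v source changes
  # from a host path to the volume name.
  docker run -d --name "$name" -v "$volume":/data/db -p 27017:27017 mongo --replSet rs0
}

# Usage: start_mongo_with_volume gutlo-mongo gutlo-mongo-data
```

The hostname problem described later in this article applies in exactly the same way, because the replica set configuration lives in the data directory wherever it is stored.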
Now we have a docker container running.
$ docker ps
CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS        PORTS                      NAMES
56468ef53d83   mongo   "docker-entrypoint.s…"   4 seconds ago   Up 1 second   0.0.0.0:27017->27017/tcp   gutlo-mongo
Notice a few things. First, the container ID: it is the same as the hostname inside the container, which we can confirm by running the hostname command (the Linux/Unix command that prints the machine name) in the container.
docker exec -it gutlo-mongo hostname
56468ef53d83
If you open a bash shell in the container, the default prompt says root@<hostname>, which also happens to be the container ID shown by the docker ps command above. Now, if you are running this container for the first time on your machine, the database directory /data/mongo is empty and is initialized when the container starts. You will then have to initiate the replica set from the mongo shell, as with any mongo replica set.
mongo> rs.initiate()
If the initialization was successful, the prompt changes to show that you are in the rs0 replica set, connected to the primary.
rs0:PRIMARY>
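If you prefer to script this step instead of typing it in the shell, here is a rough sketch. It assumes the image ships mongosh (as recent mongo images do) and that no authentication is configured; the function names wait_for_mongo and init_replica_set are made up for this example.

```shell
# Sketch: wait until mongod accepts connections, then initiate the replica set.
# Assumes mongosh is available inside the container and auth is not enabled.
wait_for_mongo() {
  local name="${1:-gutlo-mongo}"
  local tries="${2:-30}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # The ping command succeeds as soon as mongod is accepting connections.
    if docker exec "$name" mongosh --quiet --eval 'db.adminCommand("ping").ok' >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

init_replica_set() {
  local name="${1:-gutlo-mongo}"
  wait_for_mongo "$name" || { echo "mongod did not come up in $name" >&2; return 1; }
  # Same call as typed in the shell above, run non-interactively.
  docker exec "$name" mongosh --quiet --eval 'rs.initiate()'
}

# Usage: init_replica_set gutlo-mongo
```

Running it a second time against an already initiated data directory should simply report the "already initialized" error rather than change anything.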
So far, so good and things "just work". Now comes the catch.
If you stop and remove the container, the container goes away but your dataset, including the collection that stores the container's hostname, is saved. So the next time you start a container on that data, the replica set expects the same hostname. But Docker generates a new container ID, and with it a new hostname, for every new container, so you will actually get a different name. Always.
Why is the Docker hostname important? Because when you run a single-node replica set in mongo, the replica set (even with only one node) needs to know who its members are, i.e. its own machine name. In other words, it stores the machine name of the node as a member. This is saved in the database, in a system collection. In the query below, notice that the "host" key matches the output of the hostname command above and the container ID in docker ps.
mongosh --quiet
rs0:PRIMARY> use local
switched to db local
rs0:PRIMARY> db.system.replset.find({},{"members":1})
{ "_id" : "rs0", "members" : [ { "_id" : 0, "host" : "56468ef53d83:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ] }
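The comparison between the stored host and the container's hostname can also be scripted. The sketch below reads the stored value through rs.conf() rather than querying the collection directly, and assumes mongosh is available in the container; the three function names are hypothetical.

```shell
# Sketch: compare the host saved in the replica set config with the
# container's current hostname. Function names are made up for this example.

# Print the host recorded for member 0 of the replica set config.
stored_replica_host() {
  local name="${1:-gutlo-mongo}"
  docker exec "$name" mongosh --quiet --eval 'print(rs.conf().members[0].host)'
}

# Print the host:port the container would report right now.
current_replica_host() {
  local name="${1:-gutlo-mongo}"
  echo "$(docker exec "$name" hostname):27017"
}

# Return 0 when both values agree, non-zero otherwise.
replica_host_matches() {
  local name="${1:-gutlo-mongo}"
  [ "$(stored_replica_host "$name")" = "$(current_replica_host "$name")" ]
}

# Usage: replica_host_matches gutlo-mongo && echo "hostname matches"
```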
Now let's remove the container and start a new one. It will have a different container ID.
$ docker rm -f gutlo-mongo
$ docker run -d --name gutlo-mongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0
$ docker exec -it gutlo-mongo hostname
25599e3f7780
So we have a new container ID / hostname, 25599e3f7780 instead of 56468ef53d83. Thus, when mongo starts, it does not find itself in the replica set, because the configuration saved in replset still has the old hostname.
To confirm that this is indeed the case, query the collection again. If you try to initialize the replica set, you get an error:
> rs.initiate().errmsg
already initialized
Now we need to change the hostname to the new Docker hostname. In real life, this is similar to moving your replica from one machine to another. There is a simple procedure for this described in the MongoDB manual.
Here is a small shell script that you can run in your MongoDB container every time it is brought up with a new hostname. Recall that the mongo shell can read a Unix environment variable.
# First we create a shell script on your host machine.
# It will capture the current hostname in a shell variable and use it to reset the replica set hostname for the single-node replica.
$ cat change_replica.sh
export myhost=`hostname`:27017
mongosh <<EOF
myhost=_getEnv("myhost")
conf = rs.config()
conf.members[0].host=myhost
rs.reconfig(conf,{force:true})
EOF
The above script needs to be run from a shell inside the Docker container so that it captures the proper hostname, so it needs to be copied there first.
# First copy the shell script to the docker container
docker cp change_replica.sh gutlo-mongo:/tmp
# Then run the script
docker exec -it gutlo-mongo bash /tmp/change_replica.sh
Voilà!
Now you are back in the replica set and able to see the same saved data!
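As a variation on the same reconfiguration, the copy step can be skipped by doing everything from the host in one function. This is only a sketch under the same assumptions as before (mongosh inside the container, no authentication); fix_replica_host is a hypothetical name, and it uses rs.conf() and a forced rs.reconfig() just as the script above does.

```shell
# Sketch: reset the replica set member host from the host machine,
# without copying a script into the container.
# fix_replica_host is a made-up name for this example.
fix_replica_host() {
  local name="${1:-gutlo-mongo}"
  local host
  # Ask the container for its own hostname and append the default port.
  host="$(docker exec "$name" hostname):27017" || return 1
  # The shell expands $host before mongosh sees the script text.
  docker exec "$name" mongosh --quiet --eval "
    const conf = rs.conf();
    conf.members[0].host = '$host';
    rs.reconfig(conf, { force: true });
  "
}

# Usage: fix_replica_host gutlo-mongo
```

A forced reconfig is reasonable here only because this is a throwaway single-node test setup; on a real multi-node replica set it deserves much more care.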
Why is it important to run in a replica set?
Because certain features, such as change streams, which are very useful for doing CDC from mongo to some other database, can only be used in replica set mode.
Plus - "purely for fun" is also a good reason.
Please let me know if you have any comments or suggestions. All comments are welcome!
Data Engineer / Teamlead – AutoDoc
2 years ago
Thanks for the article. I was looking for the reason why my single-node replica set in Docker always fails after a Docker restart. Actually, you can do it without a script. Just set the hostname on docker run. It will give you the same hostname on each Docker start. Like this:
$ docker run -d --name gutlo-mongo --hostname=mymongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0