Single Node Mongo Replica on Docker

If you use MongoDB and are anything like me, you probably run a docker image for quick tests instead of doing a full-blown MongoDB installation.

Luckily, running mongo on docker is very easy. First, let's download the image:

docker pull mongo
        

Next, if you want the data to persist, you either use a docker volume or map the data directory of the mongo container to a directory on your machine, so that you can "see" your data the next time you start a container. The default location of the MongoDB data directory is /data/db and the default port is 27017. To keep things simple, we will use these default values. We will run it as a single node replica set by passing the --replSet flag shown below. Finally, for convenience, we will give our container a name so we can refer to it by name.

 
$ docker run -d --name gutlo-mongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0
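
The docker volume option mentioned above is not used in the rest of this article, so here is a minimal sketch of what it could look like. The volume name gutlo-mongo-data and the function name run_mongo_with_volume are made up for illustration; the rest mirrors the command above.

```shell
# Sketch: same container, but backed by a named docker volume instead of
# a host directory. "gutlo-mongo-data" is a hypothetical volume name.
run_mongo_with_volume() {
    volume="${1:-gutlo-mongo-data}"
    # Create the volume first; its name is printed, so silence the output.
    docker volume create "$volume" >/dev/null &&
    docker run -d --name gutlo-mongo \
        -v "$volume":/data/db \
        -p 27017:27017 \
        mongo --replSet rs0
}

# Usage:
#   run_mongo_with_volume              # uses gutlo-mongo-data
#   run_mongo_with_volume my-volume    # uses a volume of your choice
```

With a named volume, docker manages where the data lives, so you will not find it under /data/mongo on the host.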
        

Now we have a docker container running.


$ docker ps

CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS        PORTS                      NAMES
56468ef53d83   mongo   "docker-entrypoint.s…"   4 Seconds ago   Up 1 second   0.0.0.0:27017->27017/tcp   gutlo-mongo
        

Notice a few things. The container ID is the same as the container's hostname, which we can confirm by running the hostname command (the Linux/Unix command that prints the machine name) inside the container:


docker exec -it gutlo-mongo hostname
56468ef53d83
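
To check this without eyeballing the two values, something like the following sketch could be used. It assumes the container was started without the --hostname flag, and the function name hostname_is_container_id is hypothetical.

```shell
# Sketch: succeed when a container's hostname equals its short
# (12 character) container ID, as displayed by "docker ps".
hostname_is_container_id() {
    name="$1"
    # Full 64-character container ID
    full_id="$(docker inspect --format '{{.Id}}' "$name")" || return 2
    # Hostname docker configured for the container
    host="$(docker inspect --format '{{.Config.Hostname}}' "$name")" || return 2
    short_id="$(printf '%s' "$full_id" | cut -c1-12)"
    [ "$host" = "$short_id" ]
}

# Usage:
#   hostname_is_container_id gutlo-mongo && echo "hostname is the container ID"
```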
        

If you open a bash prompt in the container, the default prompt says root@<hostname>, where the hostname is the container ID shown by the docker ps command above. If you are running this container for the first time on your machine, the database directory /data/mongo is empty and it will be initialized as part of starting the container. You will then have to "initialize" the replica set from the mongo shell, like any mongo replica set:


mongo> rs.initiate()
        

If the initialization was successful, you get a prompt saying you are in the rs0 replica set, connected to the primary:


rs0:PRIMARY>
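
For scripted test setups, the same initialization can probably be driven from the host without opening a shell. This is a rough sketch, assuming the image ships mongosh (older images only have the legacy mongo binary) and that mongod is already accepting connections. The function name init_replset and the retry count are my own choices.

```shell
# Sketch: initiate the replica set from the host, then poll until the
# node reports itself as a writable primary.
init_replset() {
    container="${1:-gutlo-mongo}"
    tries="${2:-30}"
    # Fails (non-zero) if the replica set was already initiated.
    docker exec "$container" mongosh --quiet --eval 'rs.initiate()' >/dev/null || return 1
    while [ "$tries" -gt 0 ]; do
        state="$(docker exec "$container" mongosh --quiet \
            --eval 'db.hello().isWritablePrimary')"
        [ "$state" = "true" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage:
#   init_replset gutlo-mongo && echo "rs0 is ready"
```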
        

So far, so good and things "just work". Now comes the catch.

If you stop and remove the container, the container goes away but your dataset, including the collection that stores the hostname of the docker container, is saved. So the next time you start a container on the same data, MongoDB expects the same hostname. But docker generates a new container ID, and therefore a new hostname, every time you create a container, so you will actually get a different name. Always.

Why is the docker hostname important? Because even a single node replica set needs to know who its members are, which here means its own machine name. In other words, it needs to store the machine name of the node as a member. This is saved as part of the database in a system collection. In the query below, notice that the "host" key has the same value as the output of the hostname command above and the container ID in docker ps.


mongosh --quiet
rs0:PRIMARY> use local
switched to db local
rs0:PRIMARY> db.system.replset.find({},{"members":1})
{ "_id" : "rs0", "members" : [ { "_id" : 0, "host" : "56468ef53d83:27017", "arbiterOnly" : false, "buildIndexes" : true, "hidden" : false, "priority" : 1, "tags" : { }, "slaveDelay" : NumberLong(0), "votes" : 1 } ] }
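
If you only want the stored member address, a one-liner from the host may be enough. The sketch below assumes mongosh is available in the container and uses rs.conf(), which returns the same configuration document that lives in local.system.replset. The function name stored_member_host is hypothetical.

```shell
# Sketch: print the host:port stored for the first (and only) member of
# the replica set configuration.
stored_member_host() {
    container="${1:-gutlo-mongo}"
    docker exec "$container" mongosh --quiet \
        --eval 'rs.conf().members[0].host'
}

# Usage:
#   stored_member_host gutlo-mongo
```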
        

Now let's remove the container and start a new one on the same data directory. It will have a different container ID.


$ docker rm -f gutlo-mongo
$ docker run -d --name gutlo-mongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0
$ docker exec -it gutlo-mongo hostname
25599e3f7780
        

So we have a new docker container ID / hostname, 25599e3f7780 instead of 56468ef53d83. Thus when mongo starts, the node is not part of the replica set, because the replica set configuration still has the old hostname saved.

To make sure that this is indeed the case, query the collection again. If you try to initialize the replica set, you get an error:


> rs.initiate().errmsg
already initialized
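
The mismatch itself is just a string comparison between the stored member address and the current hostname plus port. A small sketch, with a hypothetical function name, of how a script could detect it:

```shell
# Sketch: succeed when the stored member address no longer matches the
# current hostname.
#   $1 = host stored in the replica set config, e.g. "56468ef53d83:27017"
#   $2 = current container hostname, e.g. "25599e3f7780"
#   $3 = port (defaults to 27017)
replset_host_is_stale() {
    stored="$1"
    current="$2:${3:-27017}"
    [ "$stored" != "$current" ]
}

# Usage with the values from this article:
#   replset_host_is_stale "56468ef53d83:27017" "25599e3f7780" \
#       && echo "replica set config needs a new hostname"
```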
        

Now we need to change the hostname stored in the replica set configuration to the new docker hostname. In real life, this is similar to moving your replica from one machine to another, and the MongoDB manual describes a simple procedure for it.

Here is a small shell script that you can run in your MongoDB container every time it is brought up with a new hostname. Recall that the mongo shell can read a Unix environment variable.


First we create a shell script on the host machine. It captures the current hostname in a shell variable and uses it to reset the member hostname of the single node replica set.



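$ cat change_replica.sh
# Export hostname:port so the shell started below can read it
export myhost=`hostname`:27017
mongosh <<EOF
// mongosh exposes environment variables via process.env
// (the legacy mongo shell used _getEnv("myhost") instead)
myhost = process.env.myhost
conf = rs.config()
// Point the only member at the current hostname
conf.members[0].host = myhost
// force is required because the node is not currently primary
rs.reconfig(conf, {force: true})
EOF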

Now the above script needs to be run from the docker container shell so it captures the proper hostname. So it needs to be copied there first.

# First copy the shell script to the docker container



docker cp change_replica.sh gutlo-mongo:/tmp



# Then run the script



docker exec -it gutlo-mongo bash /tmp/change_replica.sh        
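
If you would rather skip the copy step, the script can probably be streamed to bash in the container over stdin instead. This is a sketch under that assumption; the function name run_script_in_container is made up, and it uses -i without -t because no terminal is involved.

```shell
# Sketch: run a host-side script inside a container without copying it.
#   $1 = path to the script on the host
#   $2 = container name (defaults to gutlo-mongo)
run_script_in_container() {
    script="$1"
    container="${2:-gutlo-mongo}"
    [ -r "$script" ] || { echo "cannot read $script" >&2; return 1; }
    # -i keeps stdin open so bash in the container reads the script from it.
    docker exec -i "$container" bash -s < "$script"
}

# Usage:
#   run_script_in_container change_replica.sh gutlo-mongo
```

The hostname command still runs inside the container, so the script picks up the container's hostname rather than the host's.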


Voila!

Now you are back in the replica set and able to see the same saved data!

Why is it important to run as a replica set?

Because certain features, such as change streams, which are very useful for doing CDC (change data capture) from mongo to some other database, can only be used in replica set mode.

Plus - "purely for fun" is also a good reason.

Please let me know if you have any comments or suggestions. All comments are welcome!

Oleksandr Ryzhenko

Data Engineer / Teamlead – AutoDoc

2 years ago

Thanks for article. I was looking for reason why my single node replicaset in docker always fails after docker restart. Actually you may do it without script. Just set hostname on docker run. It will give you the same hostname on each docker start. Like this: $ docker run -d --name gutlo-mongo --hostname=mymongo -v /data/mongo:/data/db -p 27017:27017 mongo --replSet rs0

