Learning on the go with Python container builds
With the world getting into the groove of ML and AI, Python is arguably the number one programming language right now.
Developing projects locally can get pretty challenging with Python, with all the bootstrapping, dependencies and configuration to handle.
I recently got some opportunities to work on Python builds and help teams dockerize them. I will try to cover these experiences in my posts.
The first question that comes to mind is how to containerize a Python service.
Let us illustrate this with a simple example - let's call it server.py:
from flask import Flask

server = Flask(__name__)

@server.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    server.run(host='0.0.0.0')
To run this we need to install the dependencies first - this is where requirements.txt comes in.
Our requirements.txt contains just one line: Flask==1.1.1
Let's put some structure on the project by keeping the code in a src folder:
app
├─── requirements.txt
└─── src
    └─── server.py
Now comes the Dockerfile.
# set base image (host OS)
FROM python:3.8

# set the working directory in the container
WORKDIR /code

# copy the dependencies file to the working directory
COPY requirements.txt .

# install dependencies
RUN pip install -r requirements.txt

# copy the content of the local src directory to the working directory
COPY src/ .

# command to run on container start
CMD [ "python", "./server.py" ]
For each instruction or command from the Dockerfile, the Docker builder generates an image layer and stacks it upon the previous ones. Therefore, the Docker image resulting from the process is simply a read-only stack of different layers.
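As a loose analogy for that read-only stack, here is a small Python sketch. It assumes each layer can be pictured as a dictionary of the files one instruction adds; the file names and contents are made up for illustration, and real layers are filesystem diffs managed by Docker's storage driver, not Python objects.

```python
from collections import ChainMap

# Each dict stands in for the files one Dockerfile instruction adds.
# Paths and contents are hypothetical, for illustration only.
layers = [
    {"/usr/local/bin/python": "interpreter from the base image"},  # FROM python:3.8
    {"/code/requirements.txt": "Flask==1.1.1"},                    # COPY requirements.txt .
    {"/usr/local/lib/python3.8/site-packages/flask": "package"},   # RUN pip install ...
    {"/code/server.py": "application source"},                     # COPY src/ .
]

# ChainMap searches its first mapping first, so the newest layer goes first.
image_view = ChainMap(*reversed(layers))

print(sorted(image_view))             # every path visible in the final image
print(image_view["/code/server.py"])  # found in the top layer
```

The container sees one merged view, while each layer underneath stays unchanged and can be shared with other images.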
Run docker build -t <imagename> . to build the image. Once it finishes, docker images will list it.
A few important points and takeaways:
Base Image:
The first line of the Dockerfile specifies the foundation on which the other layers of your app are going to sit. It is very important to choose the right base image for the features it brings to the table. Alpine is a popular base for small images, but for Python the official Docker python image works well.
Instruction order matters in Dockerfile:
The number of instructions matters, and their order matters even more - it decides the speed and efficiency of your Docker builds. Dependencies change less often than source code, so they are usually installed first; subsequent builds can then reuse the cached layers.
Once an instruction or its inputs change, that layer and every layer after it has to be rebuilt, so the more often an early instruction changes, the less the cache helps and the longer the builds take.
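To make the effect of ordering concrete, here is a simplified Python model of the build cache. It is a sketch under assumptions, not Docker's actual algorithm: each step's cache key is taken to be a hash of the previous key, the instruction text and a digest of any copied files. The digests ("r1", "s1", "s2") and the SOURCE_FIRST ordering are hypothetical.

```python
import hashlib

def layer_keys(instructions, file_digests):
    """Return one cache key per instruction; each key chains on the previous one."""
    keys = []
    parent = ""
    for instruction, inputs in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in inputs:
            h.update(file_digests[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def rebuilt_steps(instructions, before, after):
    """List the instructions whose cache key differs between two builds."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [instr for (instr, _), o, n in zip(instructions, old, new) if o != n]

# Same instructions as the Dockerfile above, dependencies first.
DEPS_FIRST = [
    ("FROM python:3.8", []),
    ("WORKDIR /code", []),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY src/ .", ["src"]),
]
# Hypothetical reordering that copies the source before the dependencies.
SOURCE_FIRST = [
    ("FROM python:3.8", []),
    ("WORKDIR /code", []),
    ("COPY src/ .", ["src"]),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
]

before = {"requirements.txt": "r1", "src": "s1"}
after = {"requirements.txt": "r1", "src": "s2"}  # only the source changed

print(rebuilt_steps(DEPS_FIRST, before, after))
print(rebuilt_steps(SOURCE_FIRST, before, after))
```

In this model a source-only change rebuilds one step with dependencies first, but also forces the pip install step to rerun when the source is copied first.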
Multi-stage builds:
This concept matters less during development, but it plays an important role in the size of the images. If you check, the image we built above is comparatively large.
With a multi-stage build you can leave unnecessary files and software packages out of the final image.
# first stage
FROM python:3.8 AS builder
COPY requirements.txt .

# install dependencies to the local user directory (e.g. /root/.local)
RUN pip install --user -r requirements.txt

# second unnamed stage
FROM python:3.8-slim
WORKDIR /code

# copy only the dependencies installation from the 1st stage image
COPY --from=builder /root/.local /root/.local
COPY ./src .

# make scripts installed with --user available on PATH
ENV PATH=/root/.local/bin:$PATH

# command to run on container start
CMD [ "python", "./server.py" ]
Notice that we have a two-stage build where we name only the first stage, as builder. We name a stage by adding AS <NAME> to the FROM instruction, and we use this name in the COPY --from instruction to copy only the necessary files into the final image.
Making the dockerfile shorter and smarter:
Use incremental builds and make use of the cache by putting dependencies first, use base images and packages consistently with pinned, stable versions, and minimize the layers by chaining related commands into one instruction instead of writing each one as its own instruction.
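One way to keep versions stable is to check that every dependency is pinned before building. The helper below is a minimal sketch, assuming a plain requirements file with one package per line; the function name and the extra packages in the sample are hypothetical, and it does not handle every syntax pip accepts (for example URLs containing a #).

```python
def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like -r
            continue
        if "==" not in line:
            loose.append(line)
    return loose

# Hypothetical file contents: only Flask==1.1.1 comes from the example above.
sample = "Flask==1.1.1\nrequests\ngunicorn>=20.0  # loose\n"
print(unpinned_requirements(sample))  # ['requests', 'gunicorn>=20.0']
```

An empty result means every listed package is pinned, so two builds from the same file should install the same versions of those packages.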
Final thing: run the Docker container using docker run -d -p 5000:5000 myimage. Flask listens on port 5000 by default, so the service is then reachable at http://localhost:5000.
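Once the container is up, a quick check from the host confirms that the port mapping works. This is a small sketch using only the standard library; the function name, retry count and delay are arbitrary choices, and it assumes the service is published on a local port as in the docker run command above.

```python
import time
import urllib.request

def wait_for_http(url, attempts=10, delay=0.5):
    """Poll url until it answers with HTTP 200; return the body text, or None."""
    # Bypass any configured proxy, since the target is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for _ in range(attempts):
        try:
            with opener.open(url, timeout=2) as response:
                if response.status == 200:
                    return response.read().decode("utf-8")
        except OSError:
            pass  # connection refused or timed out: the container may still be starting
        time.sleep(delay)
    return None
```

Calling wait_for_http("http://localhost:5000/") after starting the container should return the text served by the hello() route, and None if nothing answers within the retry window.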
In my next post we will see how external sources are added to Python projects and how they can be used in dockerization. We will also work on making our images smaller and more efficient.
Note: This post reflects my own views and does not represent the views of any firm or organization.