What Is a Dockerfile?

A Docker container is a live instance of a Docker image. It is a self-contained environment that includes only the necessary components and the application code.

Docker images are the result of a build process, and Docker containers are the running instances of these images. At the core of Docker are Dockerfiles. These files instruct Docker on how to build images, which are then used to create containers.

Each Docker image is built from a file called a Dockerfile. By convention, the file is named exactly "Dockerfile", with no extension. When you run the docker build command to create a new image, Docker looks for this file in the root of the build context, which is usually the current working directory. If the file is located elsewhere or has a different name, its path can be specified using the -f flag.

Images consist of layers, and a running container adds one more layer on top. Every layer except that topmost one is read-only. The Dockerfile instructs Docker on which layers to add to the image and in what order.

Each layer is essentially a file that describes the change in the image's state compared to its state after adding the previous layer. In Unix systems, almost everything is treated as a file.

A base image provides the initial layers upon which the new image is built. The base image is also known as the parent image.

When an image is downloaded from a remote repository to a local machine, only the layers that are not already present on the local machine are physically downloaded. Docker aims to save space and time by reusing existing layers whenever possible.

Dockerfiles contain instructions for building an image. These instructions are written in uppercase letters at the beginning of each line, followed by their arguments. The instructions are processed from top to bottom during the image build process. Here's what a simple Dockerfile might look like:

FROM ubuntu:20.04

COPY . /app

Only the FROM, RUN, COPY, and ADD instructions create layers in the final image. Other instructions configure the image, provide metadata, or tell Docker what to do when the container is run, such as opening a port or executing a command.
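To make the distinction concrete, here is a minimal annotated sketch. The label text, the port number, and the filelist.txt file are assumptions chosen for illustration only.

```dockerfile
# Base image: its layers form the bottom of the stack.
FROM ubuntu:20.04

# Metadata only: no filesystem layer is added.
LABEL description="layer demo"

# Filesystem changes: each of these two instructions adds a layer.
COPY . /app
RUN ls /app > /app/filelist.txt

# Runtime configuration only: no filesystem layer is added.
EXPOSE 8080
CMD ["cat", "/app/filelist.txt"]
```

Running docker history on the resulting image is one way to inspect which steps contributed data to it.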

Here, we assume that the Docker image is based on a Unix-like operating system. While it is possible to use an image based on Windows, this is less common and more complex to work with. Therefore, if possible, use Unix-based images.

Let’s begin with a list of Dockerfile instructions with brief explanations.

A Dozen Dockerfile Instructions

FROM — Specifies the base (parent) image.

LABEL — Describes metadata, such as information about who created and maintains the image.

ENV — Sets persistent environment variables.

RUN — Executes a command and creates an image layer. Often used to install packages in the image.

COPY — Copies files and directories from the build context into the image.

ADD — Similar to COPY but can also extract local .tar files and fetch files from remote URLs.

CMD — Provides the default command and arguments to execute when the container starts. These can be overridden at runtime. Only the last CMD instruction in a Dockerfile takes effect.

WORKDIR — Sets the working directory for subsequent instructions.

ARG — Defines build-time variables that can be passed to Docker during the image build process.

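ENTRYPOINT — Sets the command and arguments that always run when the container starts. In exec form, arguments supplied by CMD or passed to docker run are appended to it instead of replacing it; only the --entrypoint flag overrides it.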

EXPOSE — Informs Docker that the container will listen on the specified network ports at runtime.

VOLUME — Creates a mount point in the container for data stored outside the container's writable layer, so that the data persists independently of the container.
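Several of these instructions (WORKDIR, ARG, ENTRYPOINT, EXPOSE, VOLUME) do not appear in the worked examples of this article, so the following sketch shows how they might fit together. The APP_VERSION variable, the /data path, and port 8000 are hypothetical values, and Python's built-in http.server module stands in for a real application.

```dockerfile
FROM python:3.10.0-alpine

# ARG: build-time variable; override with --build-arg APP_VERSION=<value>.
ARG APP_VERSION=0.1.0

# LABEL: metadata; the ARG value is substituted at build time.
LABEL version="${APP_VERSION}"

# WORKDIR: later instructions run relative to /app (created if missing).
WORKDIR /app

# COPY: bring the build context into the working directory.
COPY . .

# EXPOSE: documents the port; publishing it still requires -p at run time.
EXPOSE 8000

# VOLUME: marks /data as a mount point for storage kept outside the container.
VOLUME /data

# ENTRYPOINT + CMD: a fixed executable plus a default argument (the port).
ENTRYPOINT ["python", "-m", "http.server"]
CMD ["8000"]
```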

Instructions and Examples of Their Use

A Simple Dockerfile

A Dockerfile can be extremely simple and short, like this:

FROM ubuntu:20.04

The Dockerfile must start with a FROM instruction or an ARG instruction followed by a FROM.
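The ARG-before-FROM case might look like the sketch below. The variable name UBUNTU_TAG is an assumption made for this example.

```dockerfile
# An ARG declared before FROM is only in scope for FROM lines.
ARG UBUNTU_TAG=20.04
FROM ubuntu:${UBUNTU_TAG}

# Redeclare it (without a value) to use it after FROM.
ARG UBUNTU_TAG
RUN echo "Built from ubuntu:${UBUNTU_TAG}"
```

With this layout, the base image tag can be changed at build time, for example with docker build --build-arg UBUNTU_TAG=22.04 ., without editing the file.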

The FROM keyword tells Docker to use a base image corresponding to the provided name and tag when building the image. This base image is also called the parent image.

In this example, the base image is stored in the ubuntu repository. Ubuntu is the name of the official Docker repository that provides the base version of the popular Linux operating system.

Notice that this Dockerfile includes the 20.04 tag, specifying the exact base image needed. This is the image that will be downloaded during the build process. If the tag is omitted, Docker pulls the image tagged latest, which can point to a different release over time. To keep builds reproducible, it is recommended that Dockerfile authors specify the exact image they need.

When this Dockerfile is used on a local machine to build an image for the first time, Docker will download the layers defined by the ubuntu image. You can think of these layers as being stacked on top of each other. Each subsequent layer represents a file that describes the differences in the image’s state compared to its state after adding the previous layer.

When creating a container, a writable layer is added on top of all the other layers. Data in the other layers can only be read.

A More Complex Dockerfile

While the Dockerfile we just discussed is neat and understandable, it’s overly simplistic, using only one instruction. Moreover, it lacks instructions that are executed at runtime. Let’s look at another Dockerfile that builds a small image and includes mechanisms that define commands executed during the container's runtime.

FROM python:3.10.0-alpine

LABEL maintainer="[email protected]"

ENV USER="codikup"

RUN apk update && apk upgrade && apk add bash

COPY . ./app

RUN ["mkdir", "/a_directory"]

CMD ["python", "./app/my_script.py"]

At first glance, this file might seem complex, so let’s break it down.

The base of this image is the official Python image with the 3.10.0-alpine tag. As the tag suggests, this base image includes Alpine Linux, Python, and not much else. Alpine-based images are popular in the Docker world because they are small and quick to download, and their minimal contents reduce the attack surface. However, Alpine images lack many of the tools typical of full-fledged operating systems. Therefore, to build something useful on such an image, the creator must install the necessary packages.

The LABEL Instruction

The LABEL instruction allows you to add metadata to an image. In this case, it includes the contact details of the image creator. Labels do not slow down the build process or increase the image size. They simply contain useful information about the Docker image, so it is recommended to include them in the Dockerfile.

The ENV Instruction

The ENV instruction sets persistent environment variables that will be available in the container during its runtime.

The ENV instruction is well-suited for defining constants. If you use a certain value in the Dockerfile multiple times (for example, in commands executed in the container) and think you might need to change it later, it makes sense to store it in an environment variable.
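As a minimal sketch of that idea, the file below defines a path once and reuses it in later instructions. The APP_HOME name and the /opt/app path are hypothetical.

```dockerfile
FROM python:3.10.0-alpine

# Hypothetical constant, defined once and reused below.
ENV APP_HOME="/opt/app"

# Both instructions substitute the variable at build time.
WORKDIR ${APP_HOME}
COPY . ${APP_HOME}

# The variable is also visible to the process at run time;
# here the shell, not Docker, expands $APP_HOME.
CMD ["sh", "-c", "echo Running from $APP_HOME"]
```

Changing the location later then means editing a single line.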

The RUN Instruction

The RUN instruction creates a layer during the image build process. After it executes, a new layer is added to the image, and its state is saved. The RUN instruction is often used to install additional packages in the image. In the previous example, apk update && apk upgrade refreshes the package index and upgrades the packages that came with the base image. The final part of the same instruction, && apk add bash, installs bash in the image. Because the three commands are chained with &&, they run in a single RUN instruction and produce a single layer.

The apk command is the Alpine Linux package manager (the name stands for Alpine Package Keeper). If you’re using a base image from another Linux distribution, such as Ubuntu, you might need to use a command like RUN apt-get to install packages. We’ll discuss other ways to install packages later.

The RUN instruction and similar instructions, such as CMD and ENTRYPOINT, can be used either in exec form or in shell form. Exec form uses a syntax resembling a JSON array description. For example, it might look like this: RUN ["my_executable", "my_first_param1", "my_second_param2"].

In the previous example, we used the shell form of the RUN instruction as follows: RUN apk update && apk upgrade && apk add bash.

Later in our Dockerfile, the exec form of the RUN instruction is used as RUN ["mkdir", "/a_directory"] to create a directory. When using the instruction in this form, remember to enclose strings in double quotes as required by the JSON format.
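The practical difference between the two forms is whether a shell processes the command. The sketch below contrasts them; the directory names are made up for illustration.

```dockerfile
FROM python:3.10.0-alpine

# Shell form: run through /bin/sh -c, so && and $HOME are interpreted.
RUN echo "home is $HOME" && mkdir -p /from_shell_form

# Exec form: the program is executed directly, with no shell involved.
RUN ["mkdir", "-p", "/from_exec_form"]

# Exec form that needs shell features must invoke the shell explicitly.
RUN ["sh", "-c", "echo home is $HOME"]
```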

The COPY Instruction

The COPY instruction in our file is presented as COPY . ./app. It tells Docker to take the files and directories from the local build context and add them to the app directory, resolved relative to the current working directory in the image. If the target directory does not exist, this instruction will create it.
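How the destination path resolves depends on the working directory in effect at that point. The sketch below assumes the base image leaves the working directory at /, and requirements.txt is a hypothetical file that would have to exist in the build context.

```dockerfile
FROM python:3.10.0-alpine

# No WORKDIR set yet: ./app resolves against / and becomes /app.
COPY . ./app

# After WORKDIR, relative destinations resolve against /srv.
WORKDIR /srv
COPY . ./app

# A trailing slash marks the destination as a directory (/srv/deps/).
COPY requirements.txt /srv/deps/
```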

The CMD Instruction

The CMD instruction provides Docker with a command to execute when the container starts. The results of this command are not added to the image during the build process. In our example, this command runs the my_script.py script during the container's runtime.

Here are a few more things to know about the CMD instruction:

Only one CMD instruction takes effect in a Dockerfile. If the file contains multiple CMD instructions, Docker will ignore all but the last one.

The CMD instruction can be used in exec form. If it lists only arguments and no executable, the Dockerfile must also contain an ENTRYPOINT instruction that supplies the executable. In this case, both instructions should use the JSON array (exec) format.

Command-line arguments passed to docker run override the arguments provided by the CMD instruction in the Dockerfile.
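The interplay between CMD defaults and docker run arguments can be sketched as follows. The inline Python one-liner and the --default argument are placeholders chosen for this example, not part of the Dockerfile discussed above.

```dockerfile
FROM python:3.10.0-alpine

# Fixed executable: prints whatever arguments it receives.
ENTRYPOINT ["python", "-c", "import sys; print('args:', sys.argv[1:])"]

# Default arguments, appended to ENTRYPOINT when docker run supplies none.
CMD ["--default"]
```

Assuming the image is built with a hypothetical tag such as demo, docker run demo should report the default argument, while docker run demo one two should report one and two instead, because the command-line arguments replace the CMD values.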

Evgeni Malackowski

Software Engineer in Test at CCC Intelligent Solutions
