
CONTAINER IMAGES - DEEP DIVE

Nived V.

Solutions Architect @ Red Hat | Application Modernization | Infrastructure Modernization | Hybrid & Multi Cloud Migrations

Published: May 25, 2020

What is a Container Image: A container image contains your packaged application along with its dependencies and information about which processes to run when it is launched. Container images are created from a set of instructions in a Dockerfile.

[root@homelab ~]$ cat Dockerfile 

FROM httpd:latest

WORKDIR /usr/local/apache2/htdocs

COPY webapp/ .

EXPOSE 8080

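Instructions in this file that change the filesystem each add a "layer" to the container image. Here, FROM supplies the layers of the httpd base image and COPY adds one on top, while an instruction such as EXPOSE only records metadata. Each layer stores only the differences from the layer below it, and all of the layers are stacked together to form a read-only container image.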

How does that work?

Don't worry, I've got you covered! You need to know a few things about this, in this order. ;)

Union file systems
Copy-on-Write
Overlay File Systems
Snapshotters

Union File Systems (e.g. AUFS):

Wikipedia describes a union file system this way: "It allows files and directories of separate file systems, known as branches, to be transparently overlaid, forming a single coherent file system. Contents of directories which have the same path within the merged branches will be seen together in a single merged directory, within the new, virtual filesystem."

The idea here is that if multiple images contain some identical data, that data is not copied again for each image. Instead, it is shared by means of something called a layer.

Each layer is a file system and can be shared across multiple containers. For example, the base image httpd, the official Apache HTTP Server image, can be used by any number of containers. Since every one of those containers uses the same base layers, they are stored on disk only once, however many containers we run.
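
To make the disk-space saving concrete, here is a minimal Python sketch of a layer store that keeps each distinct layer once, keyed by its SHA-256 digest. The LayerStore class, the image names, and the layer contents are all made up for illustration; this is a toy model, not how Docker's storage is implemented.

```python
import hashlib


def digest(layer_bytes):
    """Content address of a layer, in the sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()


class LayerStore:
    """Toy layer store (hypothetical): each distinct layer is kept once."""

    def __init__(self):
        self.blobs = {}  # digest -> layer bytes

    def add_image(self, layers):
        """Store an image's layers and return their digests, bottom first."""
        ids = []
        for blob in layers:
            d = digest(blob)
            self.blobs.setdefault(d, blob)  # already present: nothing new stored
            ids.append(d)
        return ids

    def stored_bytes(self):
        return sum(len(b) for b in self.blobs.values())


# Made-up layer contents: one shared base plus a small app layer per image.
base = b"httpd base layer " * 1000
shop = [base, b"shop webapp files"]
blog = [base, b"blog webapp files"]

store = LayerStore()
shop_ids = store.add_image(shop)
blog_ids = store.add_image(blog)

naive_bytes = sum(len(b) for b in shop + blog)  # every image keeps its own copy
shared_bytes = store.stored_bytes()             # base layer stored once
print(naive_bytes - shared_bytes, "bytes saved by sharing the base layer")
```

Both images report the same digest for their bottom layer, so the second add_image call stores only the small app layer.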

These image layers are always read-only, but when we create a new container from the image, a thin writable layer is added on top of it. This writable layer is where files are created, modified, or deleted for that particular container.
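
The union view with a thin writable layer on top can be sketched with Python's collections.ChainMap, which looks keys up in a stack of dictionaries from top to bottom and sends all writes to the first one. The file paths and contents below are invented for illustration, and a dictionary is of course only a stand-in for a real file system.

```python
from collections import ChainMap

# Read-only image layers, modelled as path -> content (made-up files).
base_layer = {"/usr/local/apache2/conf/httpd.conf": "Listen 80"}
app_layer = {"/usr/local/apache2/htdocs/index.html": "<h1>webapp</h1>"}


def new_container():
    """Union view: an empty writable layer on top of the shared image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so upper layers shadow lower
    # ones, and every write lands in the first (writable) map only.
    return ChainMap(writable, app_layer, base_layer)


web1 = new_container()
web2 = new_container()

web1["/tmp/session"] = "only in web1"                              # create
web1["/usr/local/apache2/htdocs/index.html"] = "<h1>patched</h1>"  # modify

web1_page = web1["/usr/local/apache2/htdocs/index.html"]
web2_page = web2["/usr/local/apache2/htdocs/index.html"]
print(web1_page, web2_page)
```

The second container still sees the original page, and the shared app_layer dictionary is untouched: the change lives only in the first container's writable map.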

Copy-on-Write:

When we start a container, it appears to have an entire file system of its own. If that were literally true, every container running on the system would need its own copy of the file system. Wouldn't this take up a lot of disk space and also make containers slow to start? No, because a container does not need its own copy of the file system!

We use a copy-on-write mechanism to achieve this. Instead of copying up front, the copy-on-write strategy shares a single instance of the data among all the processes that need access to it, and makes a copy only when a process needs to modify or write to that data. All other processes continue to use the original data.

Docker makes use of the copy-on-write mechanism with both images and containers. To do this, the changes between the image and the running container are tracked by a graph driver in older versions and by a snapshotter in newer ones.

Before any write operation is performed in the running container, a copy of the file to be modified is placed in the writable layer of the container, and the write takes place on that copy. Now you know why it's called "copy-on-write".

This strategy optimizes both image disk space usage and the performance of container start times and works in conjunction with the Union File System.
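
The copy-on-write idea itself fits in a few lines of Python. In this sketch the CowFile class and the file content are hypothetical: two "containers" hold a reference to the same buffer, and a private copy is made only at the moment one of them writes.

```python
class CowFile:
    """A file whose bytes are shared until one holder writes to it."""

    def __init__(self, shared):
        self._data = shared     # points at the shared, original buffer
        self._private = False   # becomes True after the first write

    def read(self):
        return bytes(self._data)

    def append(self, more):
        if not self._private:
            # The "copy" in copy-on-write: duplicate the whole buffer once,
            # just before the first modification.
            self._data = bytearray(self._data)
            self._private = True
        self._data += more

    def shares(self, original):
        """True while this file still uses the original buffer."""
        return self._data is original


original = bytearray(b"ServerName localhost\n")  # made-up file content
container_a = CowFile(original)
container_b = CowFile(original)

shared_before = container_a.shares(original) and container_b.shares(original)
container_a.append(b"LogLevel debug\n")          # triggers the one-time copy
print(container_a.read(), container_b.read())
```

Until the append, no bytes are duplicated at all, which is why starting a container is cheap. After it, only the writer pays for a copy, and only of the file it touched.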

Overlay File System:

An overlay sits on top of an existing filesystem, and combines an upper and a lower directory tree and presents them as a single directory. These directories are called layers. The lower layer remains unmodified. Each layer will only add the difference from the layer that was below it and this unification process is referred to as a "union mount".

The lower directory, or image layer, is called "lowerdir", and the upper directory, or container layer, is called "upperdir". The final overlaid, or unified, view is called "merged".

Docker, by default, uses the overlay2 storage driver (OverlayFS) for this. The overlay2 driver requires Linux kernel 4.0 or higher and solves the problem of inode exhaustion (https://github.com/moby/moby/pull/22126) seen with the older overlay driver.

[nivedv@homelab ~]$ docker container run -d -p 80:80 httpd


[nivedv@homelab ~]$ sudo mount | grep overlay2 
overlay on /var/lib/docker/overlay2/a3a07027e3db46f05e3622501ace9627de967953e9737e2e12b19a649ed06df0/merged type overlay (rw,relatime,seclabel ......


[nivedv@homelab ~]$ sudo ls -l /var/lib/docker/overlay2/a3a07027e3db46f05e3622501ace9627de967953e9737e2e12b19a649ed06df0
total 28
drwxr-xr-x. 3 root root 4096 May 23 09:40 diff
-rw-r--r--. 1 root root   26 May 23 09:40 link
-rw-r--r--. 1 root root  173 May 23 09:40 lower
drwxr-xr-x. 1 root root 4096 May 23 09:40 merged
drwx------. 3 root root 4096 May 23 09:40 work
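
In this listing, "diff" is the container's upperdir, "merged" is the mount point of the unified view, "work" is a scratch directory OverlayFS uses internally, "lower" records the image layers underneath, and "link" holds a shortened identifier for this layer. The mount line above is truncated, so the Python sketch below parses a hypothetical option string instead: `<id>`, AAA and BBB are placeholders, not values from this system. Only the option names lowerdir, upperdir and workdir are real OverlayFS mount options.

```python
def parse_overlay_options(options):
    """Split an overlay mount option string into its directory roles."""
    roles = {"lowerdir": [], "upperdir": None, "workdir": None}
    for opt in options.split(","):
        key, _, value = opt.partition("=")
        if key == "lowerdir":
            # Several lower layers are colon-separated, topmost first.
            roles["lowerdir"] = value.split(":")
        elif key in ("upperdir", "workdir"):
            roles[key] = value
    return roles


# Hypothetical option string: <id>, AAA and BBB are placeholders.
example = (
    "rw,relatime,"
    "lowerdir=/var/lib/docker/overlay2/l/AAA:/var/lib/docker/overlay2/l/BBB,"
    "upperdir=/var/lib/docker/overlay2/<id>/diff,"
    "workdir=/var/lib/docker/overlay2/<id>/work"
)

roles = parse_overlay_options(example)
for name, value in roles.items():
    print(name, "->", value)
```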

So with the overlay2 driver, the layer structure is slightly different. Now, you have:

Base Layer: This is where the base files of your filesystem are located (the lowerdir). In terms of container images, this layer would be your base image.
Overlay Layer: This is the unified view that the running container sees (the "merged" directory), a "union" of the Base and Diff layers. All changes made to a running container, such as adding, deleting, or modifying files, are made through this view and are stored in the Diff layer.
Diff Layer: All changes made in the Overlay layer are stored in this writable layer (the "diff" directory, which is the upperdir), which is why it is often called the "container layer". If you write to a file that already exists in the Base Layer, the overlay file system first copies the file to the Diff Layer and then applies your modification to that copy. This is copy-on-write in action.
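
The interplay of the three layers can be modelled in a short Python sketch. The Overlay class is hypothetical and works on dictionaries rather than directories. The WHITEOUT marker stands in for the way an overlay hides a deleted file that still exists in the read-only Base layer; it is an assumption of this toy model, not a description of the on-disk format.

```python
WHITEOUT = object()  # stand-in marker for "deleted in the Diff layer"


class Overlay:
    """Toy model of Base (lowerdir) + Diff (upperdir) shown as one view."""

    def __init__(self, lower):
        self.lower = lower  # Base layer: never modified
        self.upper = {}     # Diff layer: all changes end up here

    def read(self, path):
        if path in self.upper:
            if self.upper[path] is WHITEOUT:
                raise FileNotFoundError(path)
            return self.upper[path]
        if path in self.lower:
            return self.lower[path]
        raise FileNotFoundError(path)

    def append(self, path, text):
        # Copy-up: take the current content into the Diff layer first,
        # then apply the modification to that copy.
        try:
            current = self.read(path)
        except FileNotFoundError:
            current = ""
        self.upper[path] = current + text

    def delete(self, path):
        self.read(path)              # raises if the file is not visible
        self.upper[path] = WHITEOUT  # hide it; the Base layer keeps its copy

    def merged(self):
        """Paths visible in the Overlay (merged) layer."""
        paths = set(self.lower) | set(self.upper)
        return sorted(p for p in paths if self.upper.get(p) is not WHITEOUT)


base = {"/etc/motd": "hello\n", "/etc/hosts": "127.0.0.1 localhost\n"}
fs = Overlay(base)
fs.append("/etc/motd", "patched\n")
fs.delete("/etc/hosts")
print(fs.merged(), repr(fs.read("/etc/motd")))
```

After these two operations the Base dictionary is byte-for-byte what it was, while the merged view shows a modified /etc/motd and no /etc/hosts.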

Snapshotters:

Container runtimes can build, manage, and distribute changes to a container's filesystem by using layers and a graph driver. But working with graph drivers is complicated and error-prone. Snapshotters are different from graph drivers, as they have no knowledge of images or containers.

Snapshotters work much like Git, where you have trees and each commit tracks the changes that were made to those trees. A snapshot represents a filesystem state. Snapshots have parent-child relationships using a set of directories. A diff can be taken between a parent and its snapshot to create a layer.

The snapshotter provides an API for allocating, snapshotting, and mounting abstract, layered file systems.
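
The parent-child idea can be sketched in Python as follows. ToySnapshotter is hypothetical: its prepare and commit methods are only loosely modelled on the operations described in the containerd snapshots design document linked below, and the diff method and all file names are invented here. A real snapshotter would hand out mounts (for example overlay mounts) instead of copying the parent's files.

```python
class ToySnapshotter:
    """Tracks filesystem states by name; knows nothing about images."""

    def __init__(self):
        self.committed = {}  # name -> (parent name or None, frozen files)
        self.active = {}     # key  -> (parent name or None, mutable files)

    def prepare(self, key, parent=None):
        """Allocate a writable snapshot that starts from its parent's state."""
        files = dict(self.committed[parent][1]) if parent else {}
        self.active[key] = (parent, files)
        return files  # the caller "mounts" this and changes it

    def commit(self, name, key):
        """Freeze an active snapshot under a permanent name."""
        parent, files = self.active.pop(key)
        self.committed[name] = (parent, dict(files))

    def diff(self, name):
        """Changes between a committed snapshot and its parent: one layer."""
        parent, files = self.committed[name]
        parent_files = self.committed[parent][1] if parent else {}
        changed = {p: c for p, c in files.items() if parent_files.get(p) != c}
        removed = {p: None for p in parent_files if p not in files}
        return {**changed, **removed}  # None marks a removed path


snap = ToySnapshotter()

rootfs = snap.prepare("build-base")
rootfs["/htdocs/index.html"] = "It works!"
rootfs["/htdocs/old.html"] = "stale"
snap.commit("base", "build-base")

rootfs = snap.prepare("build-app", parent="base")
rootfs["/htdocs/index.html"] = "<h1>webapp</h1>"  # modified
rootfs["/htdocs/app.js"] = "js"                   # added
del rootfs["/htdocs/old.html"]                    # removed
snap.commit("app", "build-app")

layer = snap.diff("app")
print(layer)
```

Note that the snapshotter never learns what an image or a container is: it only records states, their parents, and the difference between them.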

Want more info?

Understanding Container Terminology - https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction/
Snapshotter - https://github.com/containerd/containerd/blob/master/design/snapshots.md
Storage Drivers - https://docs.docker.com/storage/storagedriver/

Image Credits:

CoW - Julia Evans, @b0rk ( https://twitter.com/b0rk )

OverlayFS - Docker docs team ( https://docs.docker.com/storage/storagedriver/overlayfs-driver/ )

