Running containers without Docker
We all use Docker nowadays, one way or another. Okay, okay... it might not be Docker but some other container tool (e.g. rkt, runc, Buildah). What I want to ask is: have you ever wondered what happens internally when you run `docker run ...`?
We know that Docker, like any container tool, internally uses Linux namespaces, cgroups and a union-capable file system (UFS). The Linux kernel has provided these features since around 2008, and Docker arrived in about 2013. So we can certainly run containers without Docker, using just a few Linux commands.
I will walk through the steps to give a high-level idea of how engineers could have brought up their first containers before Docker was born.
Fundamentals
First of all, you should know that containers use the same kernel as the host. The kernel features that containers rely on are Linux namespaces, cgroups and so on.
In a nutshell, namespaces control what a container process can see (processes, file paths and so on), and cgroups control how much of a resource (CPU, memory and so on) a container process can consume.
You can explore further by searching for these terms to find out which namespaces and cgroups the kernel provides.
So we need a few Linux commands built around the above to get our container running without Docker.
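Before changing anything, it can help to look at the namespaces and cgroups an ordinary process already belongs to. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function names `list_namespaces` and `show_cgroups` are my own and not part of any tool.

```shell
#!/bin/sh
# Print the namespaces a process belongs to. Each entry under
# /proc/<pid>/ns is a symlink such as "uts:[4026531838]"; two processes
# share a namespace when their link targets are identical.
list_namespaces() {
    pid="${1:-$$}"
    for ns in /proc/"$pid"/ns/*; do
        # Skip the unexpanded glob when /proc/<pid>/ns is not available.
        [ -L "$ns" ] || continue
        printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
    done
}

# Print the cgroup membership of a process, if the kernel exposes it.
show_cgroups() {
    pid="${1:-$$}"
    if [ -r "/proc/$pid/cgroup" ]; then
        cat "/proc/$pid/cgroup"
    fi
}

# Inspect the current shell.
list_namespaces
show_cgroups
```

Running the same two functions later with the PID of the container process should show different link targets for the namespaces that were unshared.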
Let's get our hands dirty
I will walk through the different commands and why I need each of them to run a container. The main ones are unshare and pivot_root.
Get environment ready
I will be using btrfs because it makes snapshots cheap. Strictly speaking, btrfs is a copy-on-write file system rather than a union file system, but its snapshots give us the same image-plus-writable-copy behaviour. Btrfs has other good features (multi-device support, for example), but here I am using it purely for snapshots. Let me set up the environment first.
Prerequisite: a btrfs file system. The subvolume commands below only work if /btrfs lives on btrfs.
Say you are using Ubuntu:
apt install btrfs-progs
Create the working directory
mkdir /btrfs && cd /btrfs
Next, I will make the root mount private. Unlike a shared or slave mount, a private mount neither receives nor forwards propagation events. The r in rprivate applies the change recursively to every mount below /.
mount --make-rprivate /
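To see what this command changes, one can read the propagation state straight from /proc/self/mountinfo. Here is a minimal sketch (the function name `mount_propagation` is my own, not a standard tool). It relies on the optional fields of the mountinfo format: a `shared:N` field marks a shared mount, `master:N` marks a slave, and a mount with neither is private.

```shell
#!/bin/sh
# Print "<mount point> <propagation>" for every line of a mountinfo file.
# Fields 1-6 are fixed; optional fields start at field 7 and end at "-".
mount_propagation() {
    awk '{
        mode = "private"
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)
                mode = (mode == "slave") ? "shared,slave" : "shared"
            else if ($i ~ /^master:/)
                mode = (mode == "shared") ? "shared,slave" : "slave"
            else if ($i == "unbindable")
                mode = "unbindable"
        }
        print $5, mode
    }' "${1:-/proc/self/mountinfo}"
}

# Show the first few mounts of the current process, if /proc is available.
if [ -r /proc/self/mountinfo ]; then
    mount_propagation | head -n 5
fi
```

Run it before and after `mount --make-rprivate /`; on many distributions the mounts start out as shared, and afterwards they should all be reported as private.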
Create directories for images and containers
mkdir -p images containers
Create a subvolume for the image
btrfs subvol create images/alpine
Now let's get the base image
I will use Docker to fetch the alpine image and save it locally, so that I can start my container from it.
CID=$(docker run -d alpine true)
Get a tarball of the container and unpack it into the alpine image subvolume created above
docker export $CID | tar -C images/alpine/ -xf-
Take a snapshot of the image just made, and use it as the container's file system
btrfs subvol snapshot images/alpine/ containers/ramesh
Just for fun, I will drop in a flag file that marks this as my container's root directory
touch containers/ramesh/i_am_ramesh
Ensure that we have the contents of the alpine image root, along with the flag above
ls containers/ramesh/
Now let's change the apparent root directory for the current running process and its children
chroot containers/ramesh/ sh
Let's ensure that apk is available now, and then leave the chroot.
apk --help   # should print the apk usage text
exit
Real isolation starts now
Let's isolate the container in a few namespaces, using the `unshare` command
unshare --mount --uts --ipc --net --pid --fork bash
Let's also set the hostname (this happens in the container's UTS namespace, so the host keeps its own name)
hostname ramesh
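To convince yourself that the hostname change stays inside the namespace, here is a small, self-contained check. It is only a sketch: the `--user --map-root-user` flags are my addition so that it can run without root on hosts that allow user namespaces, whereas the walkthrough itself runs as root and does not need them.

```shell
#!/bin/sh
# Remember the host's name before touching anything.
host_name_before=$(uname -n)

# Create a throwaway UTS namespace, rename the host inside it and print
# the name as seen from inside. --user --map-root-user makes this work
# for an unprivileged user; as root, --uts alone is enough.
if inner=$(unshare --user --map-root-user --uts \
        sh -c 'hostname ramesh && uname -n' 2>/dev/null); then
    echo "inside the namespace: $inner"
else
    echo "unshare is not permitted here (user namespaces may be disabled)"
fi

# The host's own name is untouched either way.
echo "on the host:          $(uname -n)"
```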
We also want the container to have a root that is separate from the host's. So first I will create an oldroot directory, which is where pivot_root will park the host's root.
mkdir containers/ramesh/oldroot
I also want to restrict access to the top-level hierarchy. Bind-mount the container directory onto itself (pivot_root needs its new root to be a mount point), then move that mount on top of /btrfs
mount --bind containers/ramesh containers/ramesh
mount --move containers/ramesh /btrfs
Ensure that /btrfs now represents our container: check for `i_am_ramesh` and oldroot.
cd /btrfs && ls
We have to tell the process that nothing exists above `/btrfs`. So let's do pivot_root
pivot_root . oldroot/
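Make sure the container process shares no mounts with the host
umount -a
umount -l /oldroot/   # lazy unmount of the old root to clean everything up
Now a bit of magic: mount a fresh /proc, so that tools like ps see only the container's processes, with PIDs starting from 1.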
mount -t proc none /proc
Hurray, it's working
exec sh    # Alpine ships BusyBox sh rather than bash
ps aux     # check the list of processes and their process IDs
ls /       # ensure `i_am_ramesh` is present
ifconfig   # no interfaces other than lo
exit
Further tasks
Of course, there are tons of things still to be done to get a really usable container. One of them is networking.
Making interfaces
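In a nutshell, you create a pair of interfaces on the host and move one of them into the container.
First, from a second terminal on the host and while the container shell is still running, fetch the PID of the container process (the unshare command)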
CPID=$(pidof unshare)
Create a pair of veth interfaces, peered with each other
ip link add name h$CPID type veth peer name c$CPID
Move one of them into the container (that is, into the net namespace of the unshare process)
ip link set c$CPID netns $CPID
Attach the other one to the Docker bridge and bring it up
ip link set h$CPID master docker0 up
ifconfig   # ensure the new interface shows up on the host
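The host-side steps above can be collected into one small helper. This is a sketch rather than a finished tool: the names `run` and `attach_to_bridge` and the `DRY_RUN` switch are my own inventions, and the PID 1234 is a placeholder. By default it only prints the commands; to really execute them, set DRY_RUN=0 and run it as root.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# attach_to_bridge PID [BRIDGE]
# Creates the veth pair h<PID>/c<PID>, moves c<PID> into the net namespace
# of process PID and attaches h<PID> to the bridge (docker0 by default).
attach_to_bridge() {
    cpid="$1"
    bridge="${2:-docker0}"
    run ip link add name "h$cpid" type veth peer name "c$cpid"
    run ip link set "c$cpid" netns "$cpid"
    run ip link set "h$cpid" master "$bridge" up
}

# Placeholder PID; in the walkthrough this would be "$(pidof unshare)".
attach_to_bridge 1234
```

Keeping the commands behind `run` makes it easy to review exactly what will be executed before touching the host's network configuration.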
Bring up all the network interfaces inside the container
exec sh
ip link set lo up
ip link set c1234 name eth0 up   # assuming CPID=1234
Set the IP address of the new interface (using an address from 172.17.0.0/16, the range used by the Docker bridge) and add a default route via the host
ip addr add 172.17.0.2/16 dev eth0
ip route add default via 172.17.0.1   # docker0 on the host has 172.17.0.1
Hurray, it works
Let's check the final state in the container
ifconfig          # all interfaces, including eth0 (the renamed c1234), are up
ping google.com   # works