Containers

 

Overview

Virtualization provides a way for software to run in an isolated environment. Virtual machines running under a hypervisor are independent of one another, and each runs a complete operating system along with its software.

Containerization provides an alternative to virtualization. A container is an application that runs in an isolated environment and is bundled with the libraries, code, and data it needs. The primary difference between a container and a virtual machine is that containers share the OS kernel of the host system, while each VM runs its own OS.

Containers are essentially just processes which rely on the Linux kernel to isolate them from the rest of the system:

Containers are run under a container manager which is similar to a hypervisor, but for containers. Docker is the most widely used container manager.
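To make the "just processes" point concrete, here is a minimal sketch that lists the kernel namespaces the current process belongs to. It assumes a Linux system with /proc mounted, and it assumes that namespaces are among the kernel isolation features meant above; the function name list_namespaces is made up for illustration. On other systems it simply returns an empty result.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process (Linux only).

    Each entry under /proc/<pid>/ns is a symbolic link whose target
    identifies one namespace the process belongs to.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    if not os.path.isdir(ns_dir):
        # Not Linux, or /proc is not mounted: nothing to report.
        return namespaces
    for name in sorted(os.listdir(ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries we are not allowed to read.
            continue
    return namespaces


for name, ident in list_namespaces().items():
    print(f"{name:20} {ident}")
```

Running this once on the host and once inside a container would normally show different identifiers for several namespaces, which is one way to see that a container is an ordinary process placed in its own set of namespaces.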


 

Containerization vs. Virtualization

Virtualization and containerization are similar, but the key difference is that each virtual machine runs a full operating system. All of the software in the virtual machine is completely independent of the software on the host.

Running multiple virtual machines entails running a full operating system for each. Running multiple containers does not.

Virtual machines take much longer to start up, since the guest OS needs to boot, which can take a few minutes. They also take much more space to store, on the order of gigabytes. In contrast, launching a container only involves launching an isolated process, which is much faster, and a container takes much less space to store, on the order of megabytes.

It is usually not practical to run more than a handful to a dozen virtual machines on one physical machine, because each one needs memory and disk for a full guest OS. The same machine can run many more containers simultaneously.

On the other hand, the benefits of virtualization include:


 

Docker

Docker is the most widely used containerization software. It consists of the docker executable, which accepts commands that let the user interact with containers, and the dockerd daemon, which implements the container operations. The docker command calls into the daemon to make changes.
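The split between the docker client and the dockerd daemon can be seen by talking to the daemon directly. The sketch below assumes a Linux host where the daemon listens on its default Unix socket, /var/run/docker.sock, and uses the Docker Engine API's /_ping endpoint. The class and function names are made up for illustration, and the function returns None when no daemon is reachable.

```python
import http.client
import os
import socket

# Default daemon socket on Linux; other setups may use a different path.
DOCKER_SOCKET = "/var/run/docker.sock"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that goes over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def ping_daemon(socket_path=DOCKER_SOCKET):
    """Return (status, body) from the daemon's /_ping endpoint, or None."""
    if not os.path.exists(socket_path):
        return None
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/_ping")
        response = conn.getresponse()
        return response.status, response.read().decode()
    except (OSError, http.client.HTTPException):
        # Socket exists but the daemon is not reachable (or not permitted).
        return None
    finally:
        conn.close()
```

Calling ping_daemon() on a machine with a running daemon (and permission to use the socket) should return an HTTP status and a short body. The docker command does essentially the same thing with richer requests, which is why it needs the daemon to be running.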

There are a few important terms with Docker:

The following image shows an example of using Docker:

Docker consists of images, containers, and a registry
Image credit: Docker

Here, we run some docker commands, which call into the Docker daemon. The daemon interacts with the containers and images stored on our machine, and downloads images from a registry when needed.

Docker containers run as Linux processes. If Docker is running on Linux, it uses the host operating system kernel, which is shared by all container processes. On Windows and Mac, Docker Desktop instead runs a Linux VM that hosts all containers (using WSL 2 on Windows and LinuxKit on Mac). For that reason, there is a little more overhead on non-Linux systems.


 

Docker Commands

Command          Meaning
docker images    Lists the images stored locally
docker info      Displays summary information about the Docker installation
docker ps        Lists running containers
docker pull      Downloads an image from a registry
docker push      Uploads an image to a registry
docker rm        Deletes containers
docker rmi       Deletes images
docker run       Creates and starts a new container from an image
docker start     Starts a stopped container
docker stop      Stops a running container
docker top       Shows the processes running inside a container
docker version   Shows the Docker version number
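As a rough illustration of how these commands fit together, the sketch below builds the command lines for one container's life cycle and runs them only if the docker executable is installed. The container name demo-web is hypothetical, and httpd is assumed here to be the Apache image pulled from the default registry. This is one possible ordering, not a prescribed workflow.

```python
import shutil
import subprocess

IMAGE = "httpd"              # assumed: Apache image on the default registry
CONTAINER_NAME = "demo-web"  # hypothetical container name


def lifecycle_commands(image=IMAGE, name=CONTAINER_NAME):
    """Return the docker command lines for one container's life cycle."""
    return [
        ["docker", "pull", image],                       # download the image
        ["docker", "run", "-d", "--name", name, image],  # create and start, detached
        ["docker", "ps"],                                # list running containers
        ["docker", "top", name],                         # processes inside it
        ["docker", "stop", name],                        # stop the container
        ["docker", "start", name],                       # start it again
        ["docker", "stop", name],
        ["docker", "rm", name],                          # delete the container
        ["docker", "rmi", image],                        # delete the image
    ]


def run_all(commands):
    """Run each command in order; only print them if docker is missing."""
    if shutil.which("docker") is None:
        for cmd in commands:
            print("(docker not installed) would run:", " ".join(cmd))
        return False
    for cmd in commands:
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)
    return True
```

Calling run_all(lifecycle_commands()) walks through the whole sequence. Note that docker rm only works once the container has been stopped, and docker rmi only once no container still uses the image, which is why they come last.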

 

Overlay File System

Docker does something clever to let containers share files as well as the OS kernel they run on. It uses an overlay file system (OverlayFS), which merges multiple layers together to present a single file system:

Containers take their files from layers, some supplied by
images and some from the container itself.
Image credit: Docker

The base image contains a set of files, and the container itself can then either add new files or replace existing ones. To see how this is useful, imagine we have an image for the Apache web server and want to make containers for serving two different web sites. Many of the files in these two containers will be the same: the basic OS libraries, the Apache binaries, many of the configuration files, and so on. These go in the image layer. Some files in each container override files in the image, such as the Apache configuration, which may differ between the two containers. This is the case for "file2" in the image above. Finally, each container can add entirely new files, such as the actual data served by its website. This is like "file4" in the image.

OverlayFS merges these layers together so that the container sees one logical file system, even though the files come from multiple different sources.
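The merge rule can be modeled in a few lines. This is only a toy sketch using dictionaries, not how OverlayFS is implemented: each layer maps a file name to its contents, and the upper (container) layer wins when both layers have the same name. "file2" and "file4" follow the example above; "file1", "file3", and the file contents are assumptions added for illustration.

```python
from collections import ChainMap

# Lower layer: files supplied by the image (shared, read-only).
image_layer = {
    "file1": "from image",
    "file2": "from image",
    "file3": "from image",
}

# Upper layer: files written by one particular container.
container_layer = {
    "file2": "from container",  # replaces the image's copy
    "file4": "from container",  # exists only in this container
}


def merged_view(upper, lower):
    """Return the single file system the container sees.

    ChainMap looks in `upper` first and falls back to `lower`,
    which mirrors the 'upper layer wins' rule.
    """
    return dict(ChainMap(upper, lower))


view = merged_view(container_layer, image_layer)
for name in sorted(view):
    print(name, "->", view[name])
```

A second container would have its own upper layer but reuse the same image_layer, which is how two Apache containers can share the common files while keeping separate configuration and site data. The real file system also handles details this model ignores, such as deleting a file that exists in a lower layer.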


 

Next Steps

Next time we will see: