Image specifications

The goal of this lesson is to get a basic understanding of the three Open Container Initiative (OCI) specifications, which cover images, runtimes, and distribution, and which together govern how container images are built, found, shared, and run.
To get the full benefit from this lesson, you need:
- runc installed on your machine.
- Podman installed on your machine.
By the end of this lesson, you will be able to:
- Explain the image, runtime, and distribution specifications.
- Describe the major metadata files.
- Start a container from scratch.
At rest vs. running
At the highest level, containers are two things: files when at rest and processes when running.
First, we will take a look at what makes up a container repository on disk, then look at what directives are defined to create a running container. If you are interested in a slightly deeper understanding, take a few minutes to look at the OCI work. It's all publicly available in GitHub repositories.
The OCI image specification
First, let's take a quick look at the contents of a container repository once it's uncompressed. We will use a utility you may have seen before called Podman; its syntax is nearly identical to Docker's. If you are working in a Unix-like shell (e.g., on RHEL, macOS, or Windows WSL), simply run the following commands at your command line.
Windows
If you are using Windows, the simplest solution is to use SSH to get inside your Podman machine, which is running Linux.
To get inside your Podman machine, run the following command:
podman machine ssh
Once inside, you will need to install the jq command. Inside your Podman machine, run the following command:
sudo dnf -y install jq
Create a working directory for our experiment, then make sure the Fedora image is cached locally. Run the following three commands:
cd ~/ && mkdir fedora
cd fedora
podman pull fedora
Next, to export the image to a tar file and extract it, run the following two commands:
podman save -o fedora.tar fedora
tar xvf fedora.tar
You will see results like this:
732c5d746792608a9ed91354118f71f223e9650b6d9898b4bea6eeefa078d374.tar
50e2c7f186f08818f744c58facfe07e763dc62da5e94e0c5824e3ef3542fa08d.json
7ab3b6dd64cbf4532489181f7461de79252de2f68d989a2dc66a968f38fceffa/layer.tar
7ab3b6dd64cbf4532489181f7461de79252de2f68d989a2dc66a968f38fceffa/VERSION
7ab3b6dd64cbf4532489181f7461de79252de2f68d989a2dc66a968f38fceffa/json
manifest.json
repositories
Finally, let's take a look at the three major pieces found in a container repository:
- Manifest: A metadata file (manifest.json) that defines the layers and config files to be used.
- Config: A config file that is consumed by the container engine. This config file is combined with user input specified at start time, as well as defaults provided by the container engine, to create the runtime config.json. This file is then handed to the container runtime (runc), which communicates with the Linux kernel to start the container.
- Image layers: tar files that are typically gzipped. They are merged when you run the container to create a mounted root file system.
In the Manifest, you should see one or more config and layer entries.
Run the following command:
cat manifest.json | jq
In the config file, notice all of the metadata that looks strikingly similar to command-line options in Docker and Podman.
To view the contents of the configuration file, run the following command:
cat $(cat manifest.json | awk -F 'Config' '{print $2}' | awk -F '["]' '{print $3}') | jq
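The awk pipeline above depends on how the JSON text happens to be laid out. Since jq is already installed, a slightly more robust sketch is to ask for the fields by name. The manifest written below is a made-up sample with hypothetical file names, so the snippet runs without an exported image; the helper names manifest_config and manifest_layers are also just illustrative.

```shell
#!/usr/bin/env bash
# Sketch: read fields out of a `podman save` style manifest.json by name.
# The manifest written here is a made-up sample, not real Podman output.
workdir=$(mktemp -d)
cat > "$workdir/manifest.json" <<'EOF'
[{"Config":"aaaa.json","RepoTags":["example/fedora:latest"],"Layers":["bbbb.tar","cccc.tar"]}]
EOF

# Print the config file name of the first image in the manifest.
manifest_config() { jq -r '.[0].Config' "$1"; }

# Print every layer file name of the first image, one per line.
manifest_layers() { jq -r '.[0].Layers[]' "$1"; }

manifest_config "$workdir/manifest.json"
manifest_layers "$workdir/manifest.json"
```

Against the real export, running jq . "$(manifest_config manifest.json)" inside the extracted directory should show the same config file as the awk version.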
Each image layer is just a tar file. When all of the necessary tar files are extracted into a single directory, they can be mounted into a container's mount namespace.
To view this, run the following command:
tar tvf $(cat manifest.json | awk -F 'Layers' '{print $2}' | awk -F '["]' '{print $3}')
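To make the idea of merging layers concrete, here is a minimal sketch that builds two tiny, made-up layers and extracts them in order into one directory. It only approximates what a container engine does: engines typically use an overlay filesystem rather than extracting files on every run, and this sketch ignores the special files that mark deletions. All file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: later layers win when tar files are extracted in order.
work=$(mktemp -d)
mkdir -p "$work/l1/etc" "$work/l2/etc" "$work/rootfs"

# Layer 1 provides two files; layer 2 replaces one of them.
echo "base"    > "$work/l1/etc/motd"
echo "kept"    > "$work/l1/etc/hostname"
echo "updated" > "$work/l2/etc/motd"

tar -cf "$work/layer1.tar" -C "$work/l1" .
tar -cf "$work/layer2.tar" -C "$work/l2" .

# Extract each layer into the same directory, base layer first.
merge_layers() {
  local rootfs=$1; shift
  local layer
  for layer in "$@"; do
    tar -xf "$layer" -C "$rootfs"
  done
}

merge_layers "$work/rootfs" "$work/layer1.tar" "$work/layer2.tar"
cat "$work/rootfs/etc/motd"      # content that came from layer 2
cat "$work/rootfs/etc/hostname"  # content that came from layer 1
```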
The takeaway from inspecting the three major parts of a container repository is that they amount to little more than tarballs and JSON metadata. Now that we understand what is on disk, let's move on to the runtime.
The OCI runtime specification
The OCI runtime specification governs the format of the file that is passed to the container runtime. Every OCI-compliant runtime will accept this file format, including:
- runc
- crun
- Kata
- gVisor
- Railcar
Typically, this file is constructed by a container engine such as CRI-O, Podman, containerd, or Docker. These files can be created manually, but it's a tedious process. Instead, we are going to do a couple of experiments so you can get a feel for this file without having to create one manually.
Before we begin our experiments, you need to have a basic understanding of the inputs that go into creating this spec file. The container image comes with a config.json, which provides input. We inspected this file in the previous section on the image specification. These inputs are a combination of things provided by the image builder (such as CMD) and defaults specified by the build tool (such as Architecture). The inputs specified at build time can be thought of as a way for the image builder to communicate with the image consumer about how the image should be run.
The container engine itself also provides some default inputs. Some of these can be set in the container engine's configuration (seccomp profiles), some are dynamically generated by the container engine (sVirt/SELinux contexts or bind mounts), and others are hardcoded into the container engine (the default namespaces to use).
The command-line options specified by the user of the container engine (or by automation, in the case of Kubernetes) can override many of the defaults provided in the image or by the container engine. Some are simple, like bind mounts (-v /data:/data); others are more complex, like security options (--privileged).
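The layering of inputs described above can be pictured as a simple precedence rule: a value given by the user wins, then the value from the image config, then the engine's default. The function below is only a toy model of that idea, with a hypothetical helper name; real engines merge many fields with field-specific rules.

```shell
#!/usr/bin/env bash
# Toy model: pick the first non-empty value in precedence order
# user option > image config > engine default.
# resolve_setting is a hypothetical helper, not part of any container engine.
resolve_setting() {
  local user_value=$1 image_value=$2 engine_default=$3
  if [ -n "$user_value" ]; then
    printf '%s\n' "$user_value"
  elif [ -n "$image_value" ]; then
    printf '%s\n' "$image_value"
  else
    printf '%s\n' "$engine_default"
  fi
}

# The image sets a command and the user does not override it.
resolve_setting "" "/bin/bash" "/bin/sh"
# The user overrides the image's command on the command line.
resolve_setting "sleep 60" "/bin/bash" "/bin/sh"
```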
Now, let's start with some experiments. Being the reference implementation for the runtime specification, runc has the ability to create a very simple spec file. Let's create one and take a quick look at the fairly simple set of directives.
If you are using Windows
If you are using Windows, the simplest solution is to use SSH to get inside your Podman machine, which is running Linux.
To get inside your Podman machine, run the following command:
podman machine ssh
Once inside your Podman machine, run the following command:
sudo dnf install -y runc
Then run the following two commands:
cd ~/ && runc spec
cat config.json | jq
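Rather than reading the whole file, you can ask jq for a few directives by name. The fields used below (process.args, root.path, and linux.namespaces) are defined by the OCI runtime specification; the config.json written here is a hand-trimmed sample for illustration, not the full output of runc spec, and the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pull a few well-known directives out of a runtime config.json.
# The file below is a shortened, hand-written sample.
spec_dir=$(mktemp -d)
cat > "$spec_dir/config.json" <<'EOF'
{
  "ociVersion": "1.0.2",
  "process": { "terminal": true, "args": ["sh"], "cwd": "/" },
  "root": { "path": "rootfs", "readonly": true },
  "linux": { "namespaces": [ {"type": "pid"}, {"type": "mount"} ] }
}
EOF

# Command the container will run, joined into one line.
spec_command() { jq -r '.process.args | join(" ")' "$1"; }

# Directory (relative to the bundle) used as the root filesystem.
spec_rootfs() { jq -r '.root.path' "$1"; }

# Namespace types the runtime is asked to create, one per line.
spec_namespaces() { jq -r '.linux.namespaces[].type' "$1"; }

spec_command "$spec_dir/config.json"
spec_rootfs "$spec_dir/config.json"
spec_namespaces "$spec_dir/config.json"
```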
The simple file created by runc is a good introduction, but to truly understand the breadth of what a container engine does, we need to look at a more complex example. Podman has the ability to create a container and generate a spec file without actually starting the container.
Run the following two commands:
podman create --name fedora -t fedora bash
podman init fedora
The podman init command generates a config.json that we can take a look at. First, to find the ID of the container you just created, run the following command:
podman ps -a
You will get output similar to this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
50ad33c8ab15 registry.fedoraproject.org/fedora:latest bash 13 minutes ago Initialized fedora
The CONTAINER ID is what we are looking for. With that knowledge, we can run a command to locate the config.json file for this container.
Run the following two commands:
cd ~/
find . -name "config.json"
The result will include an entry for our config.json file. You can locate it by looking for the container ID in the path:
50ad33c8ab154d7dd8196eea72f792199a0b4d814166f9b6c67c3d532213b7ac/userdata/config.json
Copy the path to the file and paste it into the following command:
cat path/to/file/pasted/here/config.json | jq
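If you would rather not pick the path out by eye, find can match on the container ID directly. The sketch below assumes only what the output above shows, namely that the file sits at a path ending in the container ID followed by userdata/config.json. The directory tree and ID it creates are fake stand-ins so it can run anywhere, and find_container_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: locate a container's config.json from the start of its ID.
find_container_config() {
  local search_root=$1 id_prefix=$2
  find "$search_root" -type f -path "*/${id_prefix}*/userdata/config.json"
}

# Fake storage tree with a made-up container ID, for demonstration only.
store=$(mktemp -d)
mkdir -p "$store/containers/0123456789abcdef/userdata"
echo '{}' > "$store/containers/0123456789abcdef/userdata/config.json"

find_container_config "$store" 0123456789ab
```

With the real container from this lesson, the call would look like find_container_config ~ 50ad33c8ab15, using the ID shown by podman ps -a.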
Take a minute to browse through the JSON output. See if you can spot directives that come from the container image, the container engine, and the user.
Now that we have a basic understanding of the runtime spec file, let's move on to starting a container.
The OCI runtime reference implementation
The goal of this lesson is to learn how to use the container runtime to communicate with the Linux kernel to start a container. You will build a simple metadata set and start a container. This will give you insight into what the container engine is actually doing every time you run a command.
Setup
To get runc to start a new container, you need two things:
- A filesystem to mount (often called a RootFS)
- A config.json file
First, let's create (or steal) a RootFS, which is really nothing more than a Linux distribution extracted into a directory. Podman makes this ridiculously easy to do. The commands below create a container, get its ID, mount it, then rsync the filesystem contents out of it into a directory.
If rsync is not installed yet, install it first:
sudo dnf install -y rsync
Then run the following commands:
mkdir -p ~/fedora/rootfs
podman unshare
rsync -av $(podman mount $(podman create fedora bash))/ ~/fedora/rootfs/
We have ourselves a RootFS directory to work with. Check it out:
ls -alh ~/fedora/rootfs
Now that we have a RootFS, let's create a spec file and modify it.
rm -rf ~/fedora/config.json
runc spec -b ~/fedora/
sed -i 's/"terminal": true/"terminal": false/' ~/fedora/config.json
Now we have ourselves a full bundle, which is a colloquial way of referring to the RootFS and config together in one directory.
ls -alh ~/fedora
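A quick sanity check of the bundle layout can save a confusing error later. This is a minimal sketch with a hypothetical helper name; it only checks that the two pieces named above exist, not that config.json is valid.

```shell
#!/usr/bin/env bash
# Sketch: verify that a directory looks like a bundle, meaning it
# contains a config.json file and a rootfs directory.
check_bundle() {
  local bundle=$1
  [ -f "$bundle/config.json" ] || { echo "missing config.json" >&2; return 1; }
  [ -d "$bundle/rootfs" ]      || { echo "missing rootfs/" >&2; return 1; }
  echo "bundle looks complete: $bundle"
}

# Demonstration on a throwaway directory instead of ~/fedora.
demo=$(mktemp -d)
mkdir "$demo/rootfs"
echo '{}' > "$demo/config.json"
check_bundle "$demo"
```

For the bundle built in this section, the call would be check_bundle ~/fedora.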
Experiments
First, let's create an empty container. This essentially creates the user space definition for the container, but no processes are spawned yet:
runc create -b ~/fedora/ fedora
List the created containers.
runc list
Now execute a Bash process in the container so we can see what's going on. Any number of processes can be executed in the same set of namespaces, and they will all share the same PID and mount tables.
runc exec --tty fedora bash
If you get any error messages related to missing commands, ignore them. The container looks just like a normal container you would see with Podman, Docker, or CRI-O inside of Kubernetes. That's because it is one.
cat /etc/os-release
Get out of the container.
exit
Delete the container and verify that things are cleaned up. You may notice other containers in the list; those are most likely containers that CRI-O, Podman, or Docker started on the same system.
runc delete fedora
runc list
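The create, list, and delete steps above can be collected into one small script. This is a sketch, not a production tool: the RUNC variable is an assumption introduced here so the sequence can be previewed as a dry run before executing it for real, and the interactive runc exec step is left out because it needs a terminal.

```shell
#!/usr/bin/env bash
# Sketch: the create -> list -> delete lifecycle from this section.
# RUNC defaults to a dry run that only prints the commands; set RUNC=runc
# to execute them.
RUNC=${RUNC:-echo runc}

container_lifecycle() {
  local bundle=$1 name=$2
  $RUNC create -b "$bundle" "$name" || return 1
  $RUNC list
  $RUNC delete "$name"
  $RUNC list
}

container_lifecycle "$HOME/fedora/" fedora
```

Run as-is, the script prints the four runc commands it would execute; running it with RUNC=runc set in the environment performs them against the bundle.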
Summary
In summary, we have learned how to create containers with a terse little program called runc. It is the reference OCI runtime, and the major container engines either use it directly or use a compatible runtime such as crun. In production, you would never create containers like this, but it's useful to understand what is going on under the hood in CRI-O, Podman, and Docker. When you run into new projects like Kata, gVisor, and others, you will now understand exactly how and where they fit into the software stack. While this information is not necessary for app development, it never hurts to know what’s going on under the covers.