Container image size matters. Let’s look at an experiment that reduced Apache HTTP and NGINX servers to micro container images. This article walks through the process we used to achieve the final result, plus how many megabytes (MB) this approach saved.
For this experiment, we used Fedora RPMs, but a similar approach should work in other operating systems or container images, as you see in the list of available images (there is an Apache HTTP server image that uses CentOS Stream 8 and 9 RPMs).
A shortcut: If you are only interested in the result and want to try out the micro variant of Apache HTTP or NGINX server container images, check out the images in the following registries.
podman pull quay.io/fedora/httpd-24-micro
podman pull quay.io/fedora/nginx-122-micro
podman pull quay.io/sclorg/httpd-24-micro-c8s
podman pull quay.io/sclorg/httpd-24-micro-c9s
The benefits of smaller container images
First, let’s explain a bit more about the story behind those micro containers.
Container images include everything that a specific application needs except a Linux kernel. That's the spirit of container technology. Here, our focus is on web servers, so in addition to the Apache HTTPd and NGINX server daemons, the container also needs to include the libraries that those daemons use, the necessary userspace components, and so on.
Even in 2023, when Internet speeds are tremendous, the size of such containers is important. For example, it matters in environments where the connection is still very limited (have you heard about OpenShift running on satellites somewhere far in space?). A smaller image also makes the user experience more pleasant (waiting dozens of seconds is not fun) and reduces the potential attack surface, because it leaves out software that the application does not need in order to work.
These are just a few reasons why developers want to make the container image as small as practically possible. Now let's review the steps we took to reduce the web server container image size to a minimum. Our experiments used Apache HTTP server 2.4 and NGINX server 1.22.
Choosing binaries
For these tech preview container images, we decided to use a Fedora 36 base image and therefore take RPMs from Fedora repositories. There are different attempts to make the container image small by compiling just the necessary pieces directly from the source, which results in a small image, but that’s not always a good idea.
Using packages from a distribution has a clear benefit—they are well-tested, maintained when there is a security issue, interact well with the rest of the operating system, and are proven to work well outside of the container, so we only need to focus on the container specifics.
You might think about removing files after the RPMs are installed. This might make the image smaller, but it would be rather risky: the container could crash in a corner case where a removed file turns out to be needed, even though it did not seem so. If our goal is to create a container image good enough for production, we should follow a simple principle: do not remove files from RPMs, so that RPM packages are installed fully or not at all.
However, let's first see what we started with. The container images users currently can use for the latest stable web servers are as follows:
Container image | Compressed size | Uncompressed size |
Apache HTTP Server 2.4 | 120 MB | 376 MB |
Nginx 1.22 | 111 MB | 348 MB |
Minimizing the container image
The main trick is to build the container image in two phases. We use a parent image only for installing RPMs into an empty directory, and then use only the content of this directory as the final result. This way we get rid of not only the package installer (DNF) but also the RPM database and the RPM tooling itself. We end up with only the web server RPMs and their direct and indirect dependencies.
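The two-phase build described above can be sketched as a two-stage Containerfile. This is a minimal illustration and not the exact file behind the published images: the `/mnt/rootfs` directory, the package list, and the DNF options are assumptions chosen for the example, and the start command, configuration, and any further cleanup (caches, package database) are left out. The script only writes the file to a temporary directory.

```shell
#!/bin/sh
# Sketch only: the paths, package list, and DNF options are illustrative assumptions.
workdir=$(mktemp -d)

cat > "$workdir/Containerfile" <<'EOF'
# Stage 1: a full Fedora image that serves only as an installer.
FROM registry.fedoraproject.org/fedora:36 AS build

# Install the web server and its dependencies into an empty directory
# (/mnt/rootfs) instead of into the build image itself.
RUN dnf install -y --installroot /mnt/rootfs --releasever 36 \
        --setopt=install_weak_deps=False --nodocs \
        httpd-core coreutils-single glibc-minimal-langpack && \
    dnf -y --installroot /mnt/rootfs clean all

# Stage 2: start from an empty image and copy in only the installed tree.
# DNF and the RPM tooling stay behind in the build stage.
FROM scratch
COPY --from=build /mnt/rootfs/ /
EOF

echo "Containerfile written to $workdir/Containerfile"
```

Building it would then be a matter of running something like `podman build -t my-httpd-micro "$workdir"`, where the `my-httpd-micro` tag is a hypothetical name.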
This change alone already makes a big difference in size, but it also means installing additional software into such an image is not easy. Extending such an image would either mean copying files directly to the image, or the image would need to be rebuilt from scratch. That’s an acceptable disadvantage because, for many use cases, users do not need to extend images with web servers.
Analyzing dependencies
The next step was looking closely at what we actually had in the container image. For example, we saw systemd and all of its dependencies. That makes sense when we install the web servers outside of a container image, but in a container? It's likely not needed. So, we worked with the Apache HTTPd and NGINX server maintainers, who helped us get rid of the systemd dependency by installing only the httpd-core and nginx-core packages. We also avoided installing the Perl module in the case of NGINX, because it pulled in a lot of additional MBs in the form of the Perl interpreter and several base libraries.
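One way to do this kind of dependency inspection is with `dnf repoquery`. The sketch below is not the tooling we necessarily used; it assumes DNF 4 (as shipped in Fedora 36) with the Fedora repositories enabled, and the helper names `list_deps` and `summarize_deps` are made up for this example.

```shell
#!/bin/sh
# Assumes DNF 4 (as in Fedora 36) with the repoquery command and enabled repositories.

# Print the resolved, recursive dependency set of one package, one package per line.
list_deps() {
    dnf repoquery --quiet --requires --resolve --recursive "$1" | sort -u
}

# Print how many packages a given package pulls in and whether systemd is among them.
summarize_deps() {
    deps=$(list_deps "$1")
    count=$(printf '%s\n' "$deps" | grep -c .)
    # repoquery prints name-epoch:version-release.arch, so a digit follows "systemd-".
    if printf '%s\n' "$deps" | grep -q '^systemd-[0-9]'; then
        systemd=yes
    else
        systemd=no
    fi
    printf '%s: %s dependencies, pulls in systemd: %s\n' "$1" "$count" "$systemd"
}

# Summarize every package named on the command line, for example:
#   sh deps.sh httpd httpd-core nginx nginx-core
for pkg in "$@"; do
    summarize_deps "$pkg"
done
```

Comparing the summary for a full package with the one for its `-core` counterpart gives a quick view of what the split removes.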
These changes again reduced the size significantly. We didn't stop there, though. We analyzed other packages and saw that we installed nss_wrapper, which pulled in the Perl interpreter as well. We also installed the gettext package in order to have the envsubst utility available (for expanding environment variable references in configuration files, because environment variables are a common way to configure container images). In both cases, we worked with the package maintainers, and they allowed us to use only the minimal required parts of their tools, so we could install only the nss_wrapper-libs and envsubst packages, which removed additional MBs.
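To show why envsubst is worth keeping, here is a minimal sketch of expanding an environment variable in a configuration template. The template content, the `NGINX_PORT` variable, and the `render_config` helper are hypothetical and are not taken from the images' startup scripts.

```shell
#!/bin/sh
# Hypothetical template and variable name, for illustration only.
tmpdir=$(mktemp -d)

cat > "$tmpdir/nginx.conf.template" <<'EOF'
listen ${NGINX_PORT};
proxy_set_header Host $host;
EOF

# Substitute only ${NGINX_PORT}; other $-references, such as NGINX's own
# $host variable, are left untouched.
render_config() {
    envsubst '${NGINX_PORT}' < "$1"
}

# envsubst reads values from the environment, so the variable must be exported.
export NGINX_PORT=8080

if command -v envsubst >/dev/null 2>&1; then
    render_config "$tmpdir/nginx.conf.template"
else
    echo "envsubst is not installed" >&2
fi
```

Passing the list of variables explicitly is a deliberate choice here: without it, envsubst would try to expand every `$name` in the file, including ones that belong to the web server's own configuration syntax.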
What we kept in the image
What we didn't get rid of were several Bash scripts that help the container when starting (starting the daemon, handling the configuration, etc.). These scripts do not take more than a few kilobytes (kB) anyway, so we didn’t touch those.
There are also a couple of other packages that we installed explicitly to make the container images work reasonably (coreutils-single, glibc-minimal-langpack), but those were already made as minimal as possible.
Using the micro web server images
The container images we worked with can be used either directly via a container command-line interface (Podman or Docker) or in Kubernetes, but they were specifically designed to work well in Red Hat OpenShift.
Read more about specific usage in the README files available in the GitHub repositories:
The final result
Did we succeed? Except for the Perl module in the case of the NGINX container image, the tests we have for the images passed fine for the micro container images as well. So, the main use cases should work fine and the micro images should still be pretty useful.
Now we can see how big the micro images are after all those changes:
Container image | Compressed size | Uncompressed size |
Apache HTTP Server 2.4 micro | 16 MB (13%) | 46 MB (12%) |
Nginx 1.22 micro | 23 MB (21%) | 63 MB (18%) |
In summary, we were able to shrink the images to roughly one-eighth (Apache HTTP Server) and one-fifth (NGINX) of their original size, so the images need correspondingly less storage space and less data to download.
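The percentages in the table can be recomputed from the sizes listed in the two tables. The sketch below does this with awk; the `report` helper and the `name|original|micro` input format are made up for this example.

```shell
#!/bin/sh
# Read "name|original MB|micro MB" lines and print the micro size as a
# percentage of the original, plus the reduction factor.
report() {
    awk -F'|' '{ printf "%s: %.0f%% of original, %.1fx smaller\n", $1, $3 / $2 * 100, $2 / $3 }'
}

# Sizes in MB, copied from the two tables in this article.
sizes='httpd compressed|120|16
httpd uncompressed|376|46
nginx compressed|111|23
nginx uncompressed|348|63'

printf '%s\n' "$sizes" | report
```

For the compressed Apache HTTP Server image, for example, this prints `httpd compressed: 13% of original, 7.5x smaller`.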
Conclusion
The price for such a great difference is not large; the most important feature we lose is the ability to install additional software (due to the missing RPM and DNF). If your use case is to serve static content, then micro HTTPd and NGINX images should do the work without trouble. If your use case is beyond this and you want to serve something complicated or install further RPMs, then the original web server images might be a better choice for you. Or you can create your own micro image, based on the principles explained in this article.
Enjoy the micro web servers, and don't forget to let us know what you think by visiting the GitHub projects below. You can also leave a comment here if you simply like this approach and the images work for you.
Last updated: August 14, 2023