Flexibility and isolation are key for developers working with various tools and technologies in today's fast-paced development environments. Tools like LocalStack, TestContainers, Ansible, Amazon Web Services Cloud Development Kit (AWS CDK), AWS Serverless Application Model (SAM) CLI, and Knative Functions have become essential for developers to simulate cloud services, run containerized tests, and manage automation tasks. These tools often rely on the ability to run containers within a development workspace. However, when operating in platforms like Red Hat OpenShift Dev Spaces, the lack of support for nested containers can hinder these processes.
This article explores how user namespaces (Tech Preview in Red Hat OpenShift 4.17) can enable nested containers in OpenShift Dev Spaces, allowing developers to achieve the required flexibility while maintaining security.
The problem: Nested containers require elevated privileges in Dev Spaces
OpenShift Dev Spaces provides an isolated, containerized development environment, making it a powerful tool for developers. However, a recurring challenge that development teams face is the need to run a container inside a Dev Spaces workspace container. This nested container capability is especially important for teams using tools such as TestContainers, LocalStack, or Ansible Navigator which need to spin up additional containers to simulate cloud services, create isolated test environments, or run additional tooling.
Currently, nested containers cannot run because container engines (like Podman or Docker) require elevated privileges to run within a container. Running containers inside other containers, also known as nested containers, requires specific permissions that introduce security concerns, especially around privilege escalation. Users want to run nested containers without having to make their pods privileged.
The solution: User namespaces for containers
One approach to solving the problem of running nested containers while maintaining a less risky security profile is through user namespaces in Linux. User namespaces enable containers to have separate user and group mappings from the host, allowing containers to run processes with root privileges inside the container but as non-root users on the host machine. This isolation of privilege greatly reduces the security risks associated with running nested containers.
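To make the mapping concrete, the following sketch translates a UID as seen inside a user namespace into the UID seen outside it, using the "inside-start outside-start length" line format of /proc/<pid>/uid_map. The function name map_uid and the example range starting at 100000 are illustrative assumptions, not values OpenShift is guaranteed to choose.

```shell
#!/usr/bin/env bash
# Translate a UID inside a user namespace to the UID seen outside it.
# Each line of a uid_map file reads: <inside-start> <outside-start> <length>
map_uid() {
  local uid="$1" map_file="${2:-/proc/self/uid_map}"
  local inside outside length
  while read -r inside outside length; do
    if (( uid >= inside && uid < inside + length )); then
      echo $(( outside + uid - inside ))
      return 0
    fi
  done < "$map_file"
  return 1  # the UID is not mapped in this namespace
}
```

Given a map file containing the single line "0 100000 65536", calling map_uid 0 with that file prints 100000: root inside the container is an ordinary unprivileged UID outside it.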
Key benefits of user namespaces
Developers can reap the following benefits from user namespaces:
- Improved security: User namespaces reduce the risk of privilege escalation attacks by mapping users within the container to non-root users on the host.
- Reduced privilege requirements: Developers can run containers inside containers without requiring the parent container to be privileged, making it easier to meet security policies.
- Seamless integration with existing tools: With user namespaces enabled, container engines like Podman or Docker can operate within a Dev Spaces workspace, allowing developers to use their familiar tools without significant changes to the environment.
Implementation: Enabling user namespaces in Red Hat OpenShift
To enable nested containers in OpenShift Dev Spaces using user namespaces, several steps are needed:
- Enable user namespaces in Kubernetes pods: In Red Hat OpenShift 4.17, the user namespace feature is available as a technology preview but must be enabled through the proper API and security policies. This allows pods to run containers with their own user mappings.
- Configure Dev Spaces workspaces for nested containers: Once user namespaces are enabled in the cluster, Dev Spaces workspaces can be configured to support nested containers. This involves adding specific configuration changes to the workspace environment, such as allowing the necessary privileges and ensuring that container engines like Podman are available inside the workspace.
- Run nested containers with Podman or Docker: After user namespaces are enabled, developers can run podman run or docker run commands from within their Dev Spaces workspaces without needing to elevate privileges, enabling them to use tools like LocalStack, TestContainers, or Knative Functions seamlessly.
Cluster configuration
Because this feature is in Tech Preview in 4.17, a cluster admin needs to perform some cluster configuration to gain access to user namespaces.
Note
Enabling Tech Preview feature gates in Red Hat OpenShift Container Platform will prevent the cluster from applying future updates. You should only do this in a cluster that you plan to destroy after testing. You will not be able to update the affected cluster.
Enable feature gates:
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
This will enable the TechPreviewNoUpgrade feature set, giving access to the required feature gates.
Enable crun:
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: enable-crun-worker
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
  containerRuntimeConfig:
    defaultRuntime: crun
This sets the default OCI runtime to crun, which is currently the only OCI runtime with support for user namespaces packaged in OpenShift 4.17.
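Both changes take a while to settle, so it can help to check on them before continuing. The following is a minimal sketch, assuming the oc CLI is logged in with cluster-admin rights; the function names and the 30-minute default timeout are illustrative choices.

```shell
#!/usr/bin/env bash
# Succeeds only if the cluster-wide FeatureGate reports TechPreviewNoUpgrade.
tech_preview_enabled() {
  [ "$(oc get featuregate cluster -o jsonpath='{.spec.featureSet}')" = "TechPreviewNoUpgrade" ]
}

# Blocks until the worker MachineConfigPool reports the Updated condition,
# or until the timeout (first argument, default 30m) expires.
wait_for_workers() {
  oc wait machineconfigpool/worker --for=condition=Updated --timeout="${1:-30m}"
}
```

The worker pool may still report Updated for a short time right after the ContainerRuntimeConfig is applied, so treat wait_for_workers as a convenience rather than a guarantee that crun is active on every node.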
Next, create a custom security context constraint (SCC) for Dev Spaces which will allow nested containers:
Note
In the future, there might be specific SCCs for the nested container use case.
cat << EOF | oc apply -f -
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: nested-podman-scc
priority: null
allowPrivilegeEscalation: true
allowedCapabilities:
- SETUID
- SETGID
fsGroup:
  type: MustRunAs
  ranges:
  - min: 1000
    max: 65534
runAsUser:
  type: MustRunAs
  uid: 1000
seLinuxContext:
  type: MustRunAs
  seLinuxOptions:
    type: container_engine_t
supplementalGroups:
  type: MustRunAs
  ranges:
  - min: 1000
    max: 65534
EOF
Edit the CheCluster custom resource to use the above SCC:
apiVersion: org.eclipse.che/v2
kind: CheCluster
metadata:
  ...
spec:
  devEnvironments:
    containerBuildConfiguration:
      openShiftSecurityContextConstraint: nested-podman-scc
    disableContainerBuildCapabilities: false
  ...
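The same change can be applied as a merge patch instead of editing the resource by hand. The sketch below only builds the patch document; the helper name che_scc_patch is hypothetical, and the CheCluster name and namespace mentioned afterwards are assumptions that vary by installation.

```shell
#!/usr/bin/env bash
# Print a JSON merge patch that points Dev Spaces at the given SCC
# and keeps container build capabilities enabled.
che_scc_patch() {
  local scc="${1:-nested-podman-scc}"
  printf '{"spec":{"devEnvironments":{"disableContainerBuildCapabilities":false,"containerBuildConfiguration":{"openShiftSecurityContextConstraint":"%s"}}}}\n' "$scc"
}
```

The output could then be passed to something like oc patch checluster devspaces -n openshift-devspaces --type merge -p "$(che_scc_patch)", where devspaces and openshift-devspaces are assumed names; substitute the ones used in your cluster.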
Create a workspace
Now that you have Dev Spaces configured for nested containers, it’s time to create a workspace and try it out. We have provided a Git repository that is already configured for you to try.
Create a workspace from this repository. Learn more about this configuration in the README or in the Deeper Dive section below.
After the workspace is created, open a terminal in your new workspace and run the following:
podman run -d --rm --name webserver -p 8080:80 quay.io/libpod/banner
curl http://localhost:8080
You should see the following output:
   ___          __
  / _ \___  ___/ /_ _  ___ ____
 / ___/ _ \/ _  /  ' \/ _ `/ _ \
/_/   \___/\_,_/_/_/_/\_,_/_//_/
To stop the container, run:
podman kill webserver
That’s it! You are now ready for your own container-based adventures in OpenShift Dev Spaces.
Deeper dive: What goes into the workspace configuration?
Here is a simplified example of running nested containers in OpenShift Dev Spaces using Podman and user namespaces to illustrate how this works.
Example devfile for a nested-container enabled workspace:
schemaVersion: 2.2.0
attributes:
  controller.devfile.io/storage-type: per-workspace
metadata:
  name: nested-containers-demo
components:
- name: dev-tools
  attributes:
    pod-overrides:
      metadata:
        annotations:
          io.kubernetes.cri-o.Devices: "/dev/fuse,/dev/net/tun" # (1)
      spec:
        hostUsers: false # (2)
    container-overrides:
      securityContext:
        procMount: Unmasked # (3)
  container:
    image: quay.io/cgruver0/che/ocp-4-17-userns-tp:latest # (4)
In this devfile, there are three modifications to the pod and container that set it up for running nested containers:
(1) This special annotation allows users to specify simple devices without a device plug-in. There are two devices allowed to be specified with this annotation in OpenShift, and both are specified here. Adding /dev/fuse gives access to the fuse driver, allowing Podman to use fuse-overlayfs inside of the container, instead of the comparatively slower vfs storage driver. Adding /dev/net/tun allows a container to access the outer network in a safe way controlled by the kernel.
(2) hostUsers: false is the bread and butter of this feature. Typically, the default of hostUsers is true meaning the pod is in the host user namespace. Setting it to false requests the pod be put in a pod level user namespace.
(3) procMount: Unmasked utilizes a feature called ProcMountType which allows the user to request the /proc inside of the container to be unmasked, or mounted rw. This is safe to do because the pod is in a user namespace, so any access to the host /proc from the container is limited by UID/GID permissions. Doing so allows the outer container to edit its own sysctl settings, which is needed for setting up container networking for the nested containers.
(4) This image is provided for convenience, and is built from the below Containerfile.
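From a terminal inside the workspace, these settings can be sanity-checked. The helpers below are a sketch with hypothetical names: one checks that a device node exists, and the other reads a uid_map file and reports whether the process is in a user namespace other than the host's, assuming the host's initial namespace shows the identity mapping "0 0 4294967295".

```shell
#!/usr/bin/env bash
# True if the given path is a character device (e.g. /dev/fuse, /dev/net/tun).
has_device() {
  [ -c "$1" ]
}

# True if the first uid_map line is anything other than the initial
# namespace's identity mapping "0 0 4294967295".
in_user_namespace() {
  local inside outside length
  read -r inside outside length < "${1:-/proc/self/uid_map}" || return 1
  [ "$inside $outside $length" != "0 0 4294967295" ]
}
```

In a workspace configured as described, has_device /dev/fuse && has_device /dev/net/tun && in_user_namespace would be expected to succeed.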
Example Containerfile to enable nested containers in Dev Spaces:
FROM registry.access.redhat.com/ubi9-minimal
ARG USER_HOME_DIR="/home/user"
ARG WORK_DIR="/projects"
ENV HOME=${USER_HOME_DIR}
# (1)
ENV BUILDAH_ISOLATION=chroot
# (2)
COPY --chown=0:0 entrypoint.sh /
# Note: compat-openssl11 & libbrotli are needed for che-code (DevSpaces build of VS Code)
RUN microdnf --disableplugin=subscription-manager install -y openssl compat-openssl11 libbrotli git tar shadow-utils bash zsh podman buildah skopeo ; \
    microdnf update -y ; \
    microdnf clean all ; \
    mkdir -p ${USER_HOME_DIR} ; \
    mkdir -p ${WORK_DIR} ; \
    chgrp -R 0 /home ; \
    #
    # Setup for root-less podman
    #
    mkdir -p "${HOME}"/.config/containers ; \
    # (3)
    setcap cap_setuid+ep /usr/bin/newuidmap ; \
    setcap cap_setgid+ep /usr/bin/newgidmap ; \
    touch /etc/subgid /etc/subuid ; \
    chown 0:0 /etc/subgid ; \
    chown 0:0 /etc/subuid ; \
    chown 0:0 /etc/passwd ; \
    chown 0:0 /etc/group ; \
    chmod +x /entrypoint.sh ; \
    chmod -R g=u /etc/passwd /etc/group /etc/subuid /etc/subgid /home ${WORK_DIR}
WORKDIR ${WORK_DIR}
ENTRYPOINT [ "/entrypoint.sh" ]
CMD [ "tail", "-f", "/dev/null" ]
Example entrypoint.sh for enabling nested containers in Dev Spaces:
#!/usr/bin/env bash
if [ ! -d "${HOME}" ]
then
  mkdir -p "${HOME}"
fi
if ! whoami &> /dev/null
then
  if [ -w /etc/passwd ]
  then
    echo "${USER_NAME:-user}:x:$(id -u):0:${USER_NAME:-user} user:${HOME}:/bin/bash" >> /etc/passwd
    echo "${USER_NAME:-user}:x:$(id -u):" >> /etc/group
  fi
fi
USER=$(whoami)
START_ID=$(( $(id -u)+1 )) # (2)
END_ID=$(( 65536-${START_ID} ))
echo "${USER}:${START_ID}:${END_ID}" > /etc/subuid
echo "${USER}:${START_ID}:${END_ID}" > /etc/subgid
/usr/libexec/podman/catatonit -- "$@" # (4)
This Containerfile (and the corresponding entrypoint script) is configured to be the outer Dev Spaces container.
(1) Setting the BUILDAH_ISOLATION environment variable to chroot prevents issues with running nested user namespaces. Running them is possible, but it takes additional setup, and we’re already being isolated by a user namespace (allocated to the outer container).
(2) Throughout the Containerfile, there is a lot of changing the user/group to 0. Specifically, this is done so that the UID randomly chosen by OpenShift can be given access to these files. This is finished in the entrypoint, where the dynamically chosen UID is added to the user configurations.
(3) newuidmap and newgidmap are programs that are delegated the SETUID/SETGID capabilities so that podman can run without them. Setting these means the dynamic non-root user will have access to the capabilities when running newuidmap and newgidmap, which are required to run rootless Podman.
(4) catatonit is necessary to provide a process reaper in your workspace. This prevents a lot of zombie processes from popping up… beware the apocalypse!
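The subordinate ID range written by the entrypoint is easiest to see with a worked example. This sketch repeats the script's arithmetic in a standalone function; subid_entry is a hypothetical name, and UID 1000 is used purely as an illustration.

```shell
#!/usr/bin/env bash
# Build the /etc/subuid (or /etc/subgid) line the entrypoint writes:
# the range starts one above the user's own UID and runs up to ID 65535.
subid_entry() {
  local user="$1" uid="$2"
  local start=$(( uid + 1 ))
  local count=$(( 65536 - start ))
  echo "${user}:${start}:${count}"
}

subid_entry user 1000   # prints user:1001:64535
```

The third field in these files is a count rather than an end ID, despite the END_ID variable name in the script: starting at 1001, a count of 64535 covers IDs 1001 through 65535.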
Conclusion
By leveraging user namespaces in OpenShift Dev Spaces, development teams can now run nested containers in a secure and flexible manner. This feature is particularly beneficial for developers who need to run containerized tools and environments inside their workspaces, enabling them to work more efficiently without compromising security.
With user namespaces, OpenShift Dev Spaces can truly offer developers the ease of configuration and flexibility they need, while ensuring that security is a top priority.
Credits
Much of the underlying work for the Containerfile and entrypoint was inspired by these two articles by Urvashi Mohnani and Dan Walsh.