IoT devices in rootless containers

Applications are often developed, tested, and delivered in containers, and Red Hat OpenShift is a great platform for that purpose. Sometimes, however, the target machine is much smaller than a Kubernetes cluster. It might be an embedded server, industrial PC hardware, or a single server.

But what if the target machine were a Red Hat Enterprise Linux (RHEL) edge server? How would you automate your container to run in that environment? And what if you had thousands of such devices? In that case, you would want fully automated rootless containers for security. But how do you automate rootless containers?

In this article, you'll learn how to use systemd, Podman, and Red Hat Ansible Automation to automate and push software as containers to small-scale edge and Internet-of-Things (IoT) gateway devices.

Note: While I have not included Red Hat Ansible Tower in the demonstration, integrating it would be the logical next step.

Why rootless containers?

Why should you use rootless containers to deliver applications to edge and IoT boxes? Well, it's another layer of security. Even if an evil blackhat manages to break into your container, find a security hole, and punch through your Security-Enhanced Linux (SELinux) confinement, the rootless container ensures they still won't have root privileges on the system.

As a developer, you also don't necessarily need root privileges on the target device. Your team could deliver the application to end devices as containers, while only admins have the privileges to manage the boxes.

Set up the development environment

I’ve previously written about automating Podman containers with Ansible, but I've only recently added the rootless option to my Ansible role. We’ll use the updated Ansible role and an Ansible module for Grafana for this demonstration.

To get started, we'll set up Grafana with a dummy dashboard and test users on an RHEL edge server. I have also tested the example on Fedora IoT and a standard RHEL server.

Our end goal is to create persistent containers using systemd, Podman, and Ansible Tower, where the containers have user-only privileges. In this case, systemd manages the user processes, but any Docker container would work the same. Figure 1 shows how these components would work together.

Figure 1: Creating persistent containers with systemd, Podman, and Ansible Tower.

In the next sections, we'll set up and configure this automation.

Systemd changes in Ansible

First, let's look at what needs to change to make the podman_container_systemd role work in user mode. We'll also look at the changes needed for Fedora CoreOS and similar servers that are intended to run only containers.

User systemd services

The first thing we'll do is move all of the service files into the user's home directory instead of the /etc/systemd/system or /usr/lib/systemd/system system config directories. We could technically use the /usr/lib/systemd/user directory, but we want the service files to be private to the user. This way, anyone on the application team can modify them as a regular user if needed.

Here’s the configuration code for this step:

- name: set systemd dir if user is not root
  set_fact:
    service_files_dir: "{{ user_info.home }}/.config/systemd/user"
    systemd_scope: user
  changed_when: false

- name: ensure systemd files directory exists if user not root
  file:
    path: "{{ service_files_dir }}"
    state: directory
    owner: "{{ container_run_as_user }}"
    group: "{{ container_run_as_group }}"
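
For reference, the user_info variable above is assumed to be registered earlier in the role. A minimal sketch using Ansible's built-in user module, whose return values include the home directory:

- name: look up the service user's details
  user:
    name: "{{ container_run_as_user }}"
  register: user_info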

Configure a persistent D-BUS session

The biggest hurdle for me was understanding how a service user gets a D-BUS session that remains available across reboots, even if the user never logs in. Additionally, systemd must be able to control the user's Podman session. We can handle both requirements by setting up a lingering session for the user, which activates the user's D-BUS instance at boot. Here's how to set up the lingering session:

- name: Check if user is lingering
  stat:
    path: "/var/lib/systemd/linger/{{ container_run_as_user }}"
  register: user_lingering
  when: container_run_as_user != "root"

- name: Enable lingering if needed
  command: "loginctl enable-linger {{ container_run_as_user }}"
  when:
    - container_run_as_user != "root"
    - not user_lingering.stat.exists
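
If you want to verify the result by hand on the target, loginctl can show the lingering state directly (assuming the service user is named grafana); it prints Linger=yes once lingering is enabled:

loginctl show-user grafana --property=Linger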

Next, we need to ensure that the systemd commands are executed in user scope. Because we log in as a privileged Ansible user rather than as the target user, we set an environment variable for xdg_runtime_dir. We can use this variable later to find the user's lingering D-BUS session. Here's how to set the runtime directory, falling back to system scope when the user is root:

- name: set systemd runtime dir
  set_fact:
    xdg_runtime_dir: "/run/user/{{ container_run_as_uid.stdout }}"
  changed_when: false

- name: set systemd scope to system if needed
  set_fact:
    systemd_scope: system
    service_files_dir: '/etc/systemd/system'
    xdg_runtime_dir: "/run/user/{{ container_run_as_uid.stdout }}"
  when: container_run_as_user == "root"
  changed_when: false
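
The container_run_as_uid value used above is assumed to be registered beforehand; a minimal sketch:

- name: get the UID of the service user
  command: "id -u {{ container_run_as_user }}"
  register: container_run_as_uid
  changed_when: false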

Set the default target

We also need to change the WantedBy target in the systemd service file when it runs rootless: user services are started as part of default.target rather than multi-user.target. Here's the template snippet for the [Install] section:

[Install]
{% if container_run_as_user == 'root' %}
WantedBy=multi-user.target
{% else %}
WantedBy=default.target
{% endif %}
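
For orientation, this [Install] section sits at the end of the generated unit file. A simplified, illustrative skeleton of a rootless unit (not the role's exact output; the names and flags here are assumptions):

[Unit]
Description=grafana Podman Container
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/bin/podman run --name grafana --rm -p 3000:3000 docker.io/grafana/grafana:latest
ExecStop=/usr/bin/podman stop -t 10 grafana
Restart=always

[Install]
WantedBy=default.target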

User systemd commands

We've set all the required variables. Next, we need to tell Ansible to use the D-BUS session:

- name: start service
  become: true
  become_user: "{{ container_run_as_user }}"
  environment:
    XDG_RUNTIME_DIR: "{{ xdg_runtime_dir }}"
  systemd:
    name: "{{ service_name }}"
    scope: "{{ systemd_scope }}"
    state: started

Note that we switch to the given user and set the runtime directory so that Ansible can reach the user's D-BUS session. Also, the scope is set to user instead of system.
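
The same pattern applies to the role's other state changes. For example, here is a hedged sketch of enabling the service at boot (the daemon_reload flag is my addition, to pick up freshly written unit files):

- name: enable service at boot
  become: true
  become_user: "{{ container_run_as_user }}"
  environment:
    XDG_RUNTIME_DIR: "{{ xdg_runtime_dir }}"
  systemd:
    name: "{{ service_name }}"
    scope: "{{ systemd_scope }}"
    enabled: true
    daemon_reload: true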

rpm-ostree package handling

For my own purposes, I want to run these containers in minimal, Fedora CoreOS-based machines. Strangely, Ansible doesn’t have a proper package module for this setup. So, I used the following workaround for checking and installing packages:

- name: ensure firewalld is installed (on fedora-iot)
  tags: firewall
  command: >-
    rpm-ostree install --idempotent --unchanged-exit-77
    --allow-inactive firewalld
  register: ostree
  failed_when: not (ostree.rc == 77 or ostree.rc == 0)
  changed_when: ostree.rc != 77
  when: ansible_pkg_mgr == "atomic_container"

- name: reboot if new stuff was installed
  reboot:
    reboot_timeout: 300
  when:
    - ansible_pkg_mgr == "atomic_container"
    - ostree.rc != 77

You might not want to install anything there, but this configuration handles the required reboot if you do. It made sense for my Fedora CoreOS-based IoT setup.

At this point, we are pretty much done with the changes.

Try it out

In case you want to try this yourself (and why wouldn't you?), I will share the commands to run this example in your own environment. I used RHEL 8 on my laptop and an RHEL edge server as the target virtual machine. If you don't have access to RHEL edge, you can also use Fedora IoT as the target.

I like to keep everything related to tasks in one directory, including the required collections and roles. I've set all of this up, including the requirements files, in the example project repository. All you need to do is get it:

sudo dnf install ansible
git clone https://github.com/ikke-t/ansible-podman-sample.git
cd ansible-podman-sample

Then, install the role and collection dependencies:

ansible-galaxy collection install -r collections/requirements.yml -p collections
ansible-galaxy role install -r roles/requirements.yml -p roles
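
For reference, those requirements files are what pull in the role and collection named earlier. An illustrative roles/requirements.yml might look like this (the repository's exact contents may differ):

- src: ikke_t.podman_container_systemd
- src: ikke_t.grafana_podman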

And run the playbook:

ln -s roles/ikke_t.grafana_podman/tests/test.yml run-container-grafana-podman.yml
ansible-playbook -i edge, -u cloud-user -b \
  -e container_state=running \
  -e ansible_pkg_mgr=atomic_container \
  run-container-grafana-podman.yml

You will need to change the following settings for your system:

  • You only need the ansible_pkg_mgr setting if the target is an RHEL edge server; otherwise, you can remove this line.
  • edge is my VM server's SSH address.
  • cloud-user is the sudo-privileged Ansible user in the target VM.
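
If you would rather keep these settings in a persistent inventory than on the command line, a minimal hosts file would look like this (the host alias and user are the examples from above):

[edge_devices]
edge ansible_user=cloud-user

You could then pass -i hosts to ansible-playbook instead of -i edge,.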

Let it run for a minute, and ... drumroll ... ta-dah! As shown in Figure 2, we have our Grafana dashboard running in a user session as a container (http://your_vm:3000).

Figure 2: The Grafana dashboard running as a rootless container.

Figure 3 shows the test users pushed over the API using the Ansible Grafana collection.

Figure 3: Grafana test users pushed over the API with the Ansible Grafana collection.

Debugging locally in the target

We are running the containers as a user, and this Ansible role places them under the user context of systemd. To use systemd as a given user, you need to set the D-BUS-related environment variable. However, the service user does not have login credentials, so you need to switch to the user manually using su. (Note that sudo won't work here because the original user ID would still show your SSH user ID.)

Here's the command to switch manually to the user:

su - root
su - grafana
export XDG_RUNTIME_DIR=/run/user/$UID

After setting XDG_RUNTIME_DIR, you will be able to use the systemctl --user and journalctl --user commands to investigate the systemd services set up for the container:

[cloud-user@edge ~]$ su -
Password:
Last login: Thu Dec 31 13:03:43 EET 2020 on pts/2
[root@edge ~]# su - grafana
Last login: Thu Dec 31 13:04:41 EET 2020 on pts/1
[grafana@edge ~]$ export XDG_RUNTIME_DIR=/run/user/$UID
[grafana@edge ~]$
[grafana@edge ~]$ systemctl --user status grafana-container-pod-grafana.service
● grafana-container-pod-grafana.service - grafana Podman Container
   Loaded: loaded (/var/home/grafana/.config/systemd/user/grafana-container-pod-grafana.service; enabled; ve>
   Active: active (running) since Thu 2020-12-31 13:06:35 EET; 29min ago
   Process: 1122 ExecStartPre=/usr/bin/rm -f /tmp/grafana-container-pod-grafana.service-pid /tmp/grafana-cont>
 Main PID: 1126 (podman)
   CGroup: /user.slice/user-1002.slice/user@1002.service/grafana-container-pod-grafana.service
           ├─1126 /usr/bin/podman run --name grafana --rm -p 3000:3000/tcp -e GF_INSTALL_PLUGINS=flant-statu>
           ├─1158 /usr/bin/podman run --name grafana --rm -p 3000:3000/tcp -e GF_INSTALL_PLUGINS=flant-statu>
           ├─1167 /usr/bin/podman
           ├─1200 /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox --enable-seccomp>
           ├─1206 /usr/bin/fuse-overlayfs -o lowerdir=/var/home/grafana/.local/share/containers/storage/over>
           ├─1214 containers-rootlessport
           ├─1227 containers-rootlessport-child
           ├─1240 /usr/bin/conmon --api-version 1 -c 2675941bf4743ff26860ff2e84ceaaae78f2fcfbe3fef218e0cfee9>
           └─2675941bf4743ff26860ff2e84ceaaae78f2fcfbe3fef218e0cfee914aa96b37
             └─1251 grafana-server --homepath=/usr/share/grafana --config=/etc/grafana/grafana.ini --packagi>
[grafana@edge ~]$ podman ps
CONTAINER ID  IMAGE                             COMMAND  CREATED         STATUS             PORTS                   NAMES
2675941bf474  docker.io/grafana/grafana:latest           29 minutes ago  Up 29 minutes ago  0.0.0.0:3000->3000/tcp  grafana
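
From the same shell, the container logs are available through the user journal, for example:

journalctl --user -u grafana-container-pod-grafana.service -e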

Cleanup (nuke it)

You've seen how automating rootless Podman containers with Ansible works; now it's time to clean it all up. Beware that the nuke=true option removes both the Grafana user and the data. Before using this option, make sure you've stored any data you don't want to lose. Note, again, that you need to remove the ansible_pkg_mgr setting if you are not working on an RHEL edge target:

ansible-playbook -i edge, -u cloud-user -b \
  -e container_state=absent \
  -e ansible_pkg_mgr=atomic_container \
  -e nuke=true \
  run-container-grafana-podman.yml

Conclusion

Podman and systemd work well for running containers in small setups where Kubernetes would be overkill. Ansible is a robust way to create such a setup. Sometimes, as I've shown here, you don’t even need backups because the target is super easy for Ansible to create from scratch, including the application installation and configurations. It also doesn't matter if you have one or thousands of machines at the edge; the setup is basically the same. Remember how we configured Grafana over the API and config file? Consider which is better for your application.

By the way, how about updating the application at the edge? In my example, I set up a Podman container tag to periodically poll for new versions of the container. If you enable that service, you only need to push a new version of the container to the registry at the end of a successful CI/CD pipeline in Red Hat OpenShift. Happy containerizing!
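
If you want to reproduce that polling by hand, Podman's auto-update feature is one way to do it. A minimal sketch, where the label value and image are illustrative and details vary by Podman version:

# Podman only auto-updates containers that systemd manages,
# so run this from a unit like the ones generated above:
podman run -d --name grafana \
  --label io.containers.autoupdate=image \
  -p 3000:3000 docker.io/grafana/grafana:latest

# Then enable the periodic check in the user session:
systemctl --user enable --now podman-auto-update.timer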
