Many developers would like to run their existing applications in a container with restricted capabilities to improve security. However, it may not be clear which capabilities an application uses, because its code relies on libraries or other code developed elsewhere. A developer could run the application in an unrestricted container that allows all syscalls and capabilities, which avoids hard-to-diagnose failures caused by the application's use of forbidden capabilities or syscalls. Of course, this eliminates the enhanced security of restricted containers. At Red Hat, we have developed a SystemTap script (container_check.stp) that reports the capabilities an application uses. Read the SystemTap Beginners Guide for information on how to set up SystemTap.

Below is an example of the container_check.stp script monitoring a sudo command and the child processes it creates to run the strace and ping commands. The SystemTap "-c" option sets up the SystemTap instrumentation, runs the command that follows the option, and shuts down the instrumentation once the command completes. The expected output of the ping and strace commands is printed first, followed by the output of the script. If the script warns about skipped probes, increase the number of active kretprobes allowed by passing a larger number in the "-DKRETACTIVE=100" option on the command line.

The container_check.stp script lists the capabilities used by each executable. The first section of the script output for this example shows that ping uses the setuid and net_raw capabilities and that sudo uses the setgid, setuid, and audit_write capabilities. The next section provides more detail on the specific system calls that use those capabilities for each executable. Thus, for this example to run in a container, the setuid, setgid, net_raw, and audit_write capabilities would be required.

$ ./container_check.stp -DKRETACTIVE=100 -c "sudo strace -c -f ping -c 1 people.redhat.com"
starting container_check.stp. monitoring 20146
PING people02.pubmisc.prod.ext.phx2.redhat.com (10.5.19.28) 56(84) bytes of data.
64 bytes from people02.pubmisc.prod.ext.phx2.redhat.com (10.5.19.28): icmp_seq=1 ttl=57 time=46.3 ms

--- people02.pubmisc.prod.ext.phx2.redhat.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 46.370/46.370/46.370/0.000 ms
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 30.90    0.000623          69         9         2 socket
 13.69    0.000276          14        20         1 open
  7.84    0.000158           7        22           mprotect
  7.14    0.000144           5        31           mmap
  5.41    0.000109           5        24           close
  4.37    0.000088           4        20           fstat
  4.07    0.000082           4        20           read
  3.08    0.000062          12         5         2 connect
  3.03    0.000061          31         2           sendto
  2.48    0.000050           8         6           write
  2.18    0.000044          44         1           sendmmsg
  1.93    0.000039           6         7           setsockopt
  1.84    0.000037           7         5           poll
  1.84    0.000037          12         3           munmap
  1.44    0.000029           6         5           ioctl
  1.24    0.000025           4         7           capget
  0.99    0.000020          20         1           recvmsg
  0.94    0.000019           6         3           recvfrom
  0.74    0.000015           5         3           rt_sigaction
  0.74    0.000015           5         3           capset
  0.55    0.000011           6         2         2 access
  0.50    0.000010          10         1           setuid
  0.50    0.000010           5         2           prctl
  0.45    0.000009           3         3           brk
  0.35    0.000007           4         2           getuid
  0.30    0.000006           6         1           setitimer
  0.30    0.000006           6         1           getsockname
  0.30    0.000006           6         1           getsockopt
  0.25    0.000005           5         1           rt_sigprocmask
  0.25    0.000005           5         1           geteuid
  0.20    0.000004           4         1           getpid
  0.20    0.000004           4         1           arch_prctl
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.002016                   215         7 total


capabilities used by executables
      executable:      prob capability

            ping:           cap_setuid
            ping:          cap_net_raw

            sudo:           cap_setgid
            sudo:           cap_setuid
            sudo:      cap_audit_write



capabilities used by syscalls
      executable,              syscall (       capability ) :            count
            ping,               socket (      cap_net_raw ) :                2
            ping,               setuid (       cap_setuid ) :                1
            sudo,            setresuid (       cap_setuid ) :               11
            sudo,            setresgid (       cap_setgid ) :               10
            sudo,            setgroups (       cap_setgid ) :                5
            sudo,               setgid (       cap_setgid ) :                1
            sudo,               setuid (       cap_setuid ) :                1
            sudo,               sendto (  cap_audit_write ) :                5


forbidden syscalls
      executable,              syscall:            count


failed syscalls
      executable,              syscall =            errno:            count
            ping,              connect =           ENOENT:                2
            ping,               socket =           EACCES:                2
            ping,               access =           ENOENT:                2
            ping,                 open =           ENOENT:                1
          stapio,               execve =           ENOENT:                5
          stapio,         rt_sigreturn =            EINTR:                1
          strace,                wait4 =           ECHILD:                1
          strace,               access =           ENOENT:                1
            sudo,                 read =           EAGAIN:                1
            sudo,                ioctl =           ENOTTY:                2
            sudo,              recvmsg =           EAGAIN:                3
            sudo,                 open =           ENOENT:               83
            sudo,                 stat =           ENOENT:                7
            sudo,               access =           ENOENT:                4
            sudo,                fstat =            EBADF:                1
            sudo,              connect =           ENOENT:               13
            sudo,                 poll =                 :                1
            sudo,         rt_sigreturn =            EINTR:                1

You can also monitor an already running process by passing its process ID to the "-x" option and stopping the instrumentation with Ctrl-C when data collection is done. Below is an example monitoring Wireshark, showing the dumpcap executable using the setgid, setuid, and net_raw capabilities:

$ pgrep wireshark
19015
$ ./container_check.stp -DKRETACTIVE=200 -x 19015
starting container_check.stp. monitoring 19015
^C

capabilities used by executables
      executable:      prob capability

         dumpcap:           cap_setgid
         dumpcap:           cap_setuid
         dumpcap:          cap_net_raw



capabilities used by syscalls
      executable,              syscall (       capability ) :            count
         dumpcap,            setresgid (       cap_setgid ) :                1
         dumpcap,            setresuid (       cap_setuid ) :                1
         dumpcap,               socket (      cap_net_raw ) :                1


forbidden syscalls
      executable,              syscall:            count


failed syscalls
      executable,              syscall =            errno:            count
         dumpcap,               select =                 :                1
         dumpcap,         rt_sigreturn =            EINTR:                1
         dumpcap,           setsockopt =            EBUSY:                1
         dumpcap,                 stat =           ENOENT:                1
         dumpcap,               access =           ENOENT:                2
         dumpcap,                ioctl =       EOPNOTSUPP:                2
         dumpcap,             recvfrom =           EAGAIN:                1
       wireshark,              recvmsg =           EAGAIN:             2840
       wireshark,                ioctl =           EINVAL:                2
       wireshark,                 open =           ENOENT:               31
       wireshark,                 stat =           ENOENT:               57
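
To judge whether a capability is really needed, it can help to see which system calls triggered it. The sketch below groups the rows of the "capabilities used by syscalls" section by capability. It is an illustration only: the `syscalls_by_capability` helper and the `ROW` pattern are hypothetical and assume the report layout shown above.

```python
import re
from collections import defaultdict

# Sample copied from the "capabilities used by syscalls" section above.
SYSCALL_REPORT = """\
capabilities used by syscalls
      executable,              syscall (       capability ) :            count
         dumpcap,            setresgid (       cap_setgid ) :                1
         dumpcap,            setresuid (       cap_setuid ) :                1
         dumpcap,               socket (      cap_net_raw ) :                1


forbidden syscalls
      executable,              syscall:            count
"""

# One data row: "executable, syscall ( cap_name ) : count".
# The header row does not match because "capability" lacks the "cap_" prefix.
ROW = re.compile(r"\s*(\S+),\s+(\S+)\s+\(\s*(cap_\w+)\s*\)\s*:\s*(\d+)\s*")


def syscalls_by_capability(report):
    """Return {capability: [(executable, syscall, count), ...]} in report order."""
    usage = defaultdict(list)
    for line in report.splitlines():
        match = ROW.fullmatch(line)
        if match:
            exe, syscall, cap, count = match.groups()
            usage[cap].append((exe, syscall, int(count)))
    return dict(usage)


for cap, rows in sorted(syscalls_by_capability(SYSCALL_REPORT).items()):
    for exe, syscall, count in rows:
        print(f"{cap}: {exe} called {syscall} {count} time(s)")
```

Grouping by capability makes it easier to spot a capability that is only needed by a single call, for example a one-time privilege drop at startup.
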
Last updated: March 20, 2023