SystemTap (stap) provides a command-line interface (CLI) and a scripting language for writing instrumentation for a live running kernel or a user space application. A SystemTap script associates handlers with named events. When a specified event occurs, the default SystemTap kernel runtime runs the handler in the kernel as a quick subroutine, and then the probed code resumes.

SystemTap translates the script to C, builds a kernel module from it, loads the module, and connects the probed events. It can set probes at arbitrary kernel locations or at user space locations. While SystemTap is a powerful tool, loading the kernel module requires privilege, and this requirement can be a barrier to use, for example on managed machines or in containers that lack the necessary privilege. For these cases, SystemTap has another runtime that uses the Dyninst instrumentation framework to provide many features of the kernel module runtime while requiring only user privilege.

SystemTap Dyninst runtime use cases

The following examples use the Dyninst runtime, either because the kernel runtime is not available or to take advantage of the Dyninst runtime in its own right. The examples can be cut, pasted, and executed using the given stap command lines.

Many programs, such as Python, Perl, Tcl, editors, and web servers, employ event loops. The next example reports parameter changes in the Python event loop while the Python program pyexample.py converts Celsius to Fahrenheit. The example requires debuginfo for python3-libs to be installed:

stap --dyninst varwatch.stp 'process("/usr/lib64/libpython3.8.so.1.0").statement("PyEval_EvalCodeEx@*:*")' '$$parms' -c '/usr/bin/python3 pyexample.py 35'

where varwatch.stp is:

global var%
probe $1 {
  if (@defined($2)) {
    newvar = $2;
    if (var[tid()] != newvar) {
      printf("%s[%d] %s %s:\n", execname(), tid(), pp(), @2);
      println(newvar);
      var[tid()] = newvar;
    }
  }
}
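
The article does not list pyexample.py itself. The following is a minimal sketch of what such a program might look like, assuming it takes the Celsius value as its only command-line argument. The function names and the exact wording of the message are assumptions made for illustration, not the original source.

```python
import sys


def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def format_conversion(text):
    """Build the report line for a Celsius value given as a string."""
    fahrenheit = celsius_to_fahrenheit(float(text))
    return f"{text} Celsius is {fahrenheit} Fahrenheit"


def main(argv):
    # Hypothetical command line: pyexample.py CELSIUS
    if len(argv) != 2:
        print("usage: pyexample.py CELSIUS", file=sys.stderr)
        return 2
    try:
        print(format_conversion(argv[1]))
    except ValueError:
        print(f"not a number: {argv[1]}", file=sys.stderr)
        return 2
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Even a program this small goes through the interpreter's evaluation loop many times during startup and execution, which is why the probes in this section fire repeatedly.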

What is PyEval_EvalCodeEx@*:*, and how was it determined? Developers place static probes in the Python executable (see the User space static probes section below). One of these probes is function__entry. Searching the Python sources for that marker and following the code from that point leads to the PyEval_EvalCodeEx function. The @*:* portion requests a probe at every statement in the function. With the function identified, the next probe accumulates time statistics for the Python event loop:

stap --dyninst /work/scox/stap/scripts/func_time_stats.stp 'process("/usr/lib64/libpython3.8.so.1.0").function("PyEval_EvalCodeEx")' -c '/usr/bin/python3 pyexample.py 35'

where func_time_stats.stp is:

global start, intervals
probe $1 { start[tid()] = gettimeofday_us() }
probe $1.return {
  t = gettimeofday_us()
  old_t = start[tid()]
  if (old_t) intervals <<< t - old_t
  delete start[tid()]
}

where the output is:

35 Celsius is 95.0 Farenheit
intervals min:0us avg:49us max:6598us count:146 variance:297936
value |-------------------------------------------------- count
0 |@@@@@ 10
1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 64
2 |@@@@@@@@@@@@@@@@@@@@@ 43

The next example sets a probe at the static marker cmd__entry in libtcl8.6.so to display the arguments in the Tcl event loop:

stap --dyninst -e 'probe process("/usr/lib64/libtcl8.6.so").mark("cmd__entry") {printf("%s %#lxd %#lxd\n",$$name,$arg1,$arg2)}' -c /usr/bin/tclsh8.6

SystemTap Dyninst runtime overview

The Dyninst runtime differs from the kernel runtime in that it generates user space C source and builds a shared object from it instead of a kernel module. The Dyninst runtime does not require special privilege.

The SystemTap+Dyninst operation diagram in Figure 1 compares both approaches.

Figure 1. Comparison of SystemTap and Dyninst runtime environment and operation.

The SystemTap Dyninst runtime environment supports a subset of the probes that the kernel runtime supports. The following sections provide an overview of some of the probe families and the status of those probes in the Dyninst runtime.

User space probes

User space probes are source-level probes that require debuginfo. varwatch.stp and func_time_stats.stp exemplify these types of probes. With the kernel runtime, such scripts can be invoked without the -c COMMAND option, which associates the probes with every process on the system that runs the probed program, for example /usr/bin/ex. The Dyninst runtime cannot be used for this type of system-wide monitoring. It must be associated with a specific process, specified with either the -x PID or the -c COMMAND option.

User space variable access

SystemTap can access many types of variables. However, the Dyninst runtime cannot access certain types of variables that the default kernel runtime can access. Typically, these are global variables, which require tracking the virtual memory address, a feature not present in the Dyninst runtime. For example, accessing the global variable Rows in the ex program:

stap --dyninst -e 'probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",@var("Rows@main.c"))}'

gives the error:

semantic error: VMA-tracking is only supported by the kernel runtime (PR15052): operator '@var' at <input>:1:68
source: probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",@var("Rows@main.c"))}

When this error occurs, avoid attempting to access that particular variable. The -L option mentioned in the next section allows you to find and display a possible alternate context variable.

User space static probes

SystemTap can probe symbolic static instrumentation that is compiled into programs and shared libraries. The earlier mark("cmd__entry") probe is an example of this type of probe. Developers place static probes at useful locations. SystemTap can list the static probes that are available in an executable or shared object. For example, to list static probes in the libtcl shared object:

stap -L 'process("/usr/lib64/libtcl8.6.so").mark("*")'
process("/usr/lib64/libtcl8.6.so").mark("cmd__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libtcl8.6.so").mark("cmd__return") $arg1:long $arg2:long
...

A $argN reference in a static probe might receive a VMA tracking error. When this is the case, avoid that particular $argN reference.

Timer probes

The timer family of probes fires at intervals. Jiffies timers, timer.jiffies(N), are a kernel feature that is not available in the Dyninst runtime. Unit-of-time timers, such as timer.ms(N), are available in the Dyninst runtime. For example, to exit SystemTap after two seconds, a SystemTap script might include:

probe timer.ms(2000) {exit()}

Kernel space probes

Consider the following SystemTap command line:

stap -e 'probe kernel.function("bio*") { printf ("%s -> %s\n", thread_indent(1), probefunc())}'

Whenever any process invokes a kernel function whose name matches the wildcard bio*, the script prints the name of the probed function. If the --runtime=dyninst option is given, the script fails because the Dyninst runtime cannot probe kernel functions. The same is true of the syscall.* and perf.* families of probes, which require kernel functionality.

Tapsets

A tapset is a script that is designed for reuse and installed into a special directory. The Dyninst runtime doesn't implement all of the tapsets. For example, if a SystemTap script attempts to use the task_utime function, SystemTap reports a semantic error because the tapset that defines task_utime is not available in the Dyninst runtime:

stap --dyninst -e 'probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",task_utime())}'
semantic error: unresolved function task_utime (similar: ctime, qs_time, tz_ctime, tid, uid): identifier 'task_utime' at <input>:1:68

Probe summary

The Dyninst runtime does not support the following probe types:

  • kernel.*
  • perf.*
  • tapset.* (Dyninst runtime implements some, but not all scripts)

The Dyninst runtime supports the following probe types:

  • process.*  (if specified with -x or -c)
  • process.* {...@var("VAR")}  (if no VMA issues)
  • process.mark
  • timer.ms
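
The two lists above can be read as a simple lookup. The following Python sketch encodes only the rules stated in this article, so it is a reading aid rather than a statement of what stap actually accepts. The function name, its return strings, and the has_target parameter (true when -x PID or -c COMMAND is given) are hypothetical.

```python
def dyninst_support(probe_point, has_target=False):
    """Classify a probe point string using only the rules stated in this article."""
    # Families that need kernel functionality.
    if probe_point.startswith(("kernel.", "syscall.", "perf.", "timer.jiffies")):
        return "unsupported"
    # Unit-of-time timers are available in the Dyninst runtime.
    if probe_point.startswith("timer.ms"):
        return "supported"
    # Process probes need a specific target process.
    if probe_point.startswith("process"):
        return "supported" if has_target else "needs -x PID or -c COMMAND"
    return "not covered by the article"


if __name__ == "__main__":
    examples = [
        ('kernel.function("bio*")', False),
        ("timer.ms(2000)", False),
        ('process("/usr/bin/ex").function("do_cmdline")', False),
        ('process("/usr/lib64/libtcl8.6.so").mark("cmd__entry")', True),
    ]
    for point, has_target in examples:
        print(f"{point}: {dyninst_support(point, has_target)}")
```

The sketch cannot detect the VMA-tracking restriction on @var and $argN references, because that depends on the variable being accessed rather than on the probe family.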

Micro benchmark

This example compares the run time of a microbenchmark under both runtimes. The benchmark has eight threads, each running a null loop 10,000,000 times with a probe firing in each iteration. Times are in microseconds, and measurement starts after SystemTap completes the probe setup, when the benchmark begins executing. The lower system time for the Dyninst runtime (about 4.8 seconds versus 12.0 seconds for the kernel runtime) reflects the fact that the Dyninst runtime runs the probes in user space.

     Dyninst Runtime  Kernel Module Runtime
User     7,864,521             8,712,623
System   4,808,738            12,049,084
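
The relative cost is easier to see as ratios. The short Python calculation below uses only the four numbers from the table. The dictionary layout and function names are just one convenient way to organize them.

```python
# Times in microseconds, copied from the table above.
TIMES_US = {
    "dyninst": {"user": 7_864_521, "system": 4_808_738},
    "kernel": {"user": 8_712_623, "system": 12_049_084},
}


def total(runtime):
    """User plus system time for one runtime, in microseconds."""
    return sum(TIMES_US[runtime].values())


def kernel_to_dyninst_ratio(kind):
    """How many times larger the kernel runtime figure is than the Dyninst one."""
    if kind == "total":
        return total("kernel") / total("dyninst")
    return TIMES_US["kernel"][kind] / TIMES_US["dyninst"][kind]


if __name__ == "__main__":
    for kind in ("user", "system", "total"):
        print(f"{kind}: {kernel_to_dyninst_ratio(kind):.2f}x")
```

For this benchmark, the kernel runtime uses about 1.11 times the user time, 2.51 times the system time, and 1.64 times the combined time of the Dyninst runtime.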

Dyninst execution details

Dyninst can instrument a running process (dynamic instrumentation). It can also instrument a program that hasn't run yet (static instrumentation). Dyninst inserts the instrumentation code using ptrace, although the instrumentation runs in the process. Because of ptrace limitations, Dyninst can probe only one instance of a process. The program that inserts the instrumentation is called the mutator, and the program being instrumented is called the mutatee.

A typical mutator can perform the following:

  • Attach to a running process, create a new process, or load a nonrunning executable.
  • Build the Dyninst image of the process.
  • Find the function to be instrumented.
  • Create a Dyninst snippet, which is an abstraction that describes the instrumented code. For example, for a function call:
    • Create a Dyninst call snippet.
    • Create Dyninst argument snippets.
    • Translate the snippets into instructions and insert them at the probe point.
  • For dynamic instrumentation, which applies when the process is attached or created, let the process continue.
  • For static instrumentation, which applies when the process doesn't execute, create a new executable.

A Dyninst snippet is an abstraction of the code that is inserted at a probe point. A snippet can create or access variables and types, access registers, and express logical, conditional, and arithmetic operations. Dyninst converts the snippets into instructions and inserts them at the probe point.

In the case of SystemTap, the mutator is the SystemTap tool stapdyn. This tool creates snippets that call SystemTap handlers, which are defined in a shared object and correspond to the probe points. The Dyninst snippets do not handle the SystemTap probe. Instead, the probe handlers perform that function and the Dyninst snippets call those handlers.

Summary

The SystemTap Dyninst backend enables the use of SystemTap to probe a user application, without requiring any special privileges. Probes run with minimal overhead because they run in user space. The Dyninst runtime is an alternative the next time a user space application needs probing.

Last updated: February 5, 2024