SystemTap (stap) provides a command-line interface (CLI) and a scripting language for writing instrumentation for a live, running kernel or a user space application. A SystemTap script associates handlers with named events. When a specified event occurs, the default SystemTap kernel runtime runs the handler in the kernel as if it were a quick subroutine, and then the interrupted code resumes.
SystemTap translates the script to C, uses it to create a kernel module, loads the module, and connects the probed events. It can set probes at arbitrary kernel locations or at user space locations. While SystemTap is a powerful tool, loading the kernel module requires privilege, which can be a barrier to use, for example on managed machines or in containers that lack the necessary privilege. For these cases, SystemTap has another runtime that uses the Dyninst instrumentation framework to provide many features of the kernel module runtime while requiring only user privilege.
SystemTap Dyninst runtime use cases
The following examples use the Dyninst runtime, either because the kernel runtime is not available or because the Dyninst runtime is useful in its own right. The examples can be cut, pasted, and executed using the given stap command lines.
Many programs, such as Python, Perl, TCL, editors, and web servers, employ event loops. The next example watches for parameter changes in the Python event loop. The Python program, pyexample.py, converts Celsius to Fahrenheit. This example requires the installation of debuginfo for the python3-libs package:
stap --dyninst varwatch.stp 'process("/usr/lib64/libpython3.8.so.1.0").statement("PyEval_EvalCodeEx@*:*")' '$$parms' -c '/usr/bin/python3 pyexample.py 35'
where varwatch.stp is:
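# Last value seen per thread; the % suffix lets the array wrap when it fills up.
global var%

probe $1 {
  if (@defined($2)) {
    newvar = $2;
    # Report the variable only when its value changes for this thread.
    if (var[tid()] != newvar) {
      printf("%s[%d] %s %s:\n", execname(), tid(), pp(), @2);
      println(newvar);
      var[tid()] = newvar;
    }
  }
}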
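The article does not list pyexample.py. For readers who want to reproduce the run, a minimal stand-in might look like the sketch below. It is an assumption rather than the original program: the function names are hypothetical, and it only needs to accept a Celsius value on the command line and print the converted temperature.

```python
#!/usr/bin/python3
"""Hypothetical stand-in for pyexample.py: convert Celsius to Fahrenheit."""
import sys


def celsius_to_fahrenheit(celsius):
    """Return the Fahrenheit equivalent of a Celsius temperature."""
    return celsius * 9 / 5 + 32


def format_line(text):
    """Build the output line for a Celsius value given as a string."""
    fahrenheit = celsius_to_fahrenheit(float(text))
    return f"{text} Celsius is {fahrenheit} Fahrenheit"


def main(argv):
    """Convert the single command-line argument, or print a usage hint."""
    if len(argv) != 2:
        print("usage: pyexample.py CELSIUS")
        return
    try:
        print(format_line(argv[1]))
    except ValueError:
        print(f"not a number: {argv[1]}")


if __name__ == "__main__":
    main(sys.argv)
```

Saved as pyexample.py, it can be run with 35 as its only argument, as in the stap command line above.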
What is PyEval_EvalCodeEx@*:* and how did we determine it? Developers place static probes in the Python executable; for more details, see the User space static probes section below. One of these probes is function__entry. Searching the Python sources for that marker and following the code from that point leads to the PyEval_EvalCodeEx function. The @*:* portion then sets a probe at each statement in the function. Using the same information, we can set a probe that accumulates time statistics for the Python event loop:
stap --dyninst /work/scox/stap/scripts/func_time_stats.stp 'process("/usr/lib64/libpython3.8.so.1.0").function("PyEval_EvalCodeEx")' -c '/usr/bin/python3 pyexample.py 35'
where func_time_stats.stp is:
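global start, intervals

# Record the entry time for this thread.
probe $1 {
  start[tid()] = gettimeofday_us()
}

# On return, add the elapsed time to the statistics aggregate.
probe $1.return {
  t = gettimeofday_us()
  old_t = start[tid()]
  if (old_t) intervals <<< t - old_t
  delete start[tid()]
}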
where the output is:
35 Celsius is 95.0 Farenheit
intervals min:0us avg:49us max:6598us count:146 variance:297936
value |-------------------------------------------------- count
    0 |@@@@@                                              10
    1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   64
    2 |@@@@@@@@@@@@@@@@@@@@@                              43
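The `<<<` operator in func_time_stats.stp adds each elapsed time to a statistics aggregate, and the output above summarizes that aggregate as min, avg, max, count, and a histogram whose rows appear to be power-of-two buckets. The Python sketch below mimics that bookkeeping to show how samples land in buckets. It is an illustration under assumptions, not SystemTap's implementation: the function names and sample values are hypothetical, and only non-negative integer samples are handled.

```python
from collections import Counter


def log2_bucket(value):
    """Return the power-of-two bucket (0, 1, 2, 4, 8, ...) for a sample."""
    if value <= 0:
        return 0  # non-negative samples assumed; zero gets its own bucket
    return 1 << (value.bit_length() - 1)


def summarize(samples):
    """Collect count, min, avg, max, and bucket counts for the samples."""
    buckets = Counter(log2_bucket(s) for s in samples)
    return {
        "min": min(samples),
        "avg": sum(samples) // len(samples),  # integer average
        "max": max(samples),
        "count": len(samples),
        "buckets": dict(sorted(buckets.items())),
    }


def render(summary, width=50):
    """Format the summary as a header line plus one bar per bucket."""
    lines = [
        "intervals min:{min}us avg:{avg}us max:{max}us count:{count}".format(**summary)
    ]
    largest = max(summary["buckets"].values())
    for value, count in summary["buckets"].items():
        # Scale each bar to the fullest bucket, with at least one mark.
        bar = "@" * max(1, count * width // largest)
        lines.append(f"{value:>5} |{bar:<{width}} {count}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical elapsed times in microseconds.
    print(render(summarize([0, 1, 1, 2, 3, 5, 9])))
```

With this bucketing, a sample such as the 6598us maximum reported above would fall in the bucket labeled 4096.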
Next, set a probe at the static marker cmd__entry in libtcl8.6.so to display the arguments in the TCL event loop:
stap --dyninst -e 'probe process("/usr/lib64/libtcl8.6.so").mark("cmd__entry") {printf("%s %#lxd %#lxd\n",$$name,$arg1,$arg2)}' -c /usr/bin/tclsh8.6
SystemTap Dyninst runtime overview
The Dyninst runtime differs from the kernel runtime in that it translates the script to user space C source and generates a shared object rather than a kernel module. The Dyninst runtime does not require special privilege.
The SystemTap+Dyninst operation diagram in Figure 1 compares both approaches.
The SystemTap Dyninst runtime environment supports a subset of the probes that the kernel runtime supports. The following sections provide an overview of some of the probe families and the status of those probes in the Dyninst runtime.
User space probes
User space probes are source-level probes that require debuginfo. varwatch.stp and func_time_stats.stp exemplify these types of probes. With the kernel runtime, the examples can be invoked without the -c COMMAND option, which allows a probe to be associated with any process on the system that is running the probed executable (for example, /usr/bin/ex). The Dyninst runtime cannot be used for this type of system-wide monitoring. It requires association with a specific process, specified with either the -x PID or the -c COMMAND option.
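The requirement for a target process can be captured in a small wrapper that assembles the stap command line. The sketch below is a hypothetical helper, not part of SystemTap: the function name and parameters are illustrative, and the rule that exactly one of -x or -c must be given is an assumption drawn from the "either ... or" wording above. It only builds the argument list and does not run stap.

```python
import shlex


def build_dyninst_command(script, script_args=(), pid=None, command=None):
    """Return an argv list for running a script under the Dyninst runtime.

    Exactly one of pid (-x PID) or command (-c COMMAND) must be supplied,
    because the Dyninst runtime has to be tied to a specific process.
    """
    if (pid is None) == (command is None):
        raise ValueError("give exactly one of pid (-x) or command (-c)")
    argv = ["stap", "--dyninst"]
    if pid is not None:
        argv += ["-x", str(pid)]
    else:
        argv += ["-c", command]
    argv.append(script)
    argv.extend(script_args)
    return argv


if __name__ == "__main__":
    argv = build_dyninst_command(
        "varwatch.stp",
        ['process("/usr/lib64/libpython3.8.so.1.0").statement("PyEval_EvalCodeEx@*:*")',
         "$$parms"],
        command="/usr/bin/python3 pyexample.py 35",
    )
    # Print a shell-quoted form that can be pasted into a terminal.
    print(shlex.join(argv))
```

Raising an error when neither option is present surfaces the Dyninst restriction before stap is ever started.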
User space variable access
SystemTap can access many types of variables. However, the Dyninst runtime cannot access certain types of variables that the default kernel runtime can access. Typically, these are global variables, which require tracking the virtual memory address, a feature not present in the Dyninst runtime. For example, accessing the global variable Rows in the ex program:
stap --dyninst -e 'probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",@var("Rows@main.c"))}'
gives the error:
semantic error: VMA-tracking is only supported by the kernel runtime (PR15052): operator '@var' at <input>:1:68
        source: probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",@var("Rows@main.c"))}
When this error occurs, avoid attempting to access that particular variable. The -L option mentioned in the next section allows you to find and display a possible alternate context variable.
User space static probes
SystemTap can probe symbolic static instrumentation that is compiled into programs and shared libraries. The earlier mark("cmd__entry") probe is an example of this type of probe. Developers place static probes at useful locations. SystemTap can list the available static probes in an executable or shared object. For example, to list static probes in the libtcl shared object:
stap -L 'process("/usr/lib64/libtcl8.6.so").mark("*")'
process("/usr/lib64/libtcl8.6.so").mark("cmd__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libtcl8.6.so").mark("cmd__return") $arg1:long $arg2:long
...
A $argN reference in a static probe might receive a VMA tracking error. When this is the case, avoid that particular $argN reference.
Timer probes
There is a timer family of probes. Jiffies timers, called timer.jiffies, are a kernel feature not available in the Dyninst runtime. The Dyninst runtime does support another type of timer, the unit-of-time timers such as timer.ms(N). For example, to exit SystemTap after two seconds, a SystemTap script might include:
probe timer.ms(2000) {exit()}
Kernel space probes
Consider the following example SystemTap command line:
stap -e 'probe kernel.function("bio*") { printf ("%s -> %s\n", thread_indent(1), probefunc())}'
With this command, any process that invokes a kernel function matching the wildcard name bio* causes the name of the probed function to be displayed. If the --runtime=dyninst option is given, the command cannot succeed because the Dyninst runtime cannot probe kernel functions. This is also true of the syscall.* and perf.* families of probes, which require kernel functionality.
Tapsets
A tapset is a script that is designed for reuse and installed into a special directory. The Dyninst runtime doesn't implement all of the tapsets. For example, if a SystemTap script attempts to use the task_utime tapset function, SystemTap reports that task_utime cannot be resolved, because the tapset containing it is not available in the Dyninst runtime:
stap --dyninst -e 'probe process("/usr/bin/ex").function("do_cmdline") {printf("%d\n",task_utime())}'
semantic error: unresolved function task_utime (similar: ctime, qs_time, tz_ctime, tid, uid): identifier 'task_utime' at <input>:1:68
Probe summary
The Dyninst runtime does not support the following probe types:
kernel.*
perf.*
tapset.* (the Dyninst runtime implements some, but not all, tapset scripts)
The Dyninst runtime supports the following probe types:
process.* (if specified with -x or -c)
process.* {...@var("VAR")} (if no VMA issues)
process.mark
timer.ms
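As a rough aid, the summary can be encoded as a prefix lookup. The sketch below is purely illustrative and is not how stap decides what a runtime supports: the function name and the prefix tables are assumptions taken from the lists above and from the kernel space and timer sections, and it ignores the -x/-c, VMA, and tapset caveats.

```python
# Probe families this article describes as unavailable in the Dyninst runtime.
UNSUPPORTED_PREFIXES = ("kernel.", "syscall.", "perf.", "timer.jiffies")
# Probe families this article describes as available (subject to caveats).
SUPPORTED_PREFIXES = ("process(", "process.", "timer.ms(")


def dyninst_status(probe_point):
    """Classify a probe point as 'supported', 'unsupported', or 'unknown'."""
    point = probe_point.strip()
    if point.startswith(UNSUPPORTED_PREFIXES):
        return "unsupported"
    if point.startswith(SUPPORTED_PREFIXES):
        return "supported"
    return "unknown"  # not covered by the summary above


if __name__ == "__main__":
    for point in ('kernel.function("bio*")',
                  'process("/usr/bin/ex").function("do_cmdline")',
                  "timer.ms(2000)"):
        print(f"{point}: {dyninst_status(point)}")
```

Anything the summary does not mention is reported as unknown rather than guessed.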
Micro benchmark
This example compares the run time of a microbenchmark under the two runtimes. The microbenchmark has eight threads, each running a null loop 10,000,000 times, with a probe firing in each loop iteration. The timing is measured in microseconds and begins after SystemTap completes the probe setup, when the benchmark begins executing. The lower system time for the Dyninst runtime, compared to the system time for the kernel runtime, reflects the fact that the Dyninst runtime runs the probes in user space.
          Dyninst Runtime    Kernel Module Runtime
User      7,864,521          8,712,623
System    4,808,738          12,049,084
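To put the table in perspective, the following sketch computes the ratio of kernel runtime to Dyninst runtime for each row, using only the figures quoted above. The variable and function names are illustrative. The ratios work out to roughly 1.1 for user time and 2.5 for system time.

```python
# Timings in microseconds, copied from the table above.
timings_us = {
    "User": {"dyninst": 7_864_521, "kernel": 8_712_623},
    "System": {"dyninst": 4_808_738, "kernel": 12_049_084},
}


def kernel_to_dyninst_ratio(row):
    """Return how many times longer the kernel runtime took than Dyninst."""
    return row["kernel"] / row["dyninst"]


if __name__ == "__main__":
    for name, row in timings_us.items():
        ratio = kernel_to_dyninst_ratio(row)
        print(f"{name}: kernel module runtime / Dyninst runtime = {ratio:.2f}")
```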
Dyninst execution details
Dyninst can instrument a running process, which is called dynamic instrumentation. It can also instrument an executable that has not run yet, which is called static instrumentation.
Dyninst inserts the instrumentation code using ptrace, although the instrumentation itself runs in the process. Because of ptrace limitations, Dyninst can probe only one instance of a process. In Dyninst terminology, the program inserting the instrumentation is called the mutator, and the program being instrumented is called the mutatee.
A typical mutator can perform the following:
- Attach to a running process, create a new process, or load a nonrunning executable.
- Build the Dyninst image of the process.
- Find the function to be instrumented.
- Create a Dyninst snippet, which is an abstraction that describes the instrumented code. For example, for a function call:
  - Create a Dyninst call snippet.
  - Create Dyninst argument snippets.
- Translate the snippets into instructions and insert them at the probe point.
- For dynamic instrumentation, which occurs when the process is attached or created, let the process continue.
- For static instrumentation, which occurs when the process doesn't execute, create a new executable.
A Dyninst snippet is an abstraction of the code that is inserted at a probe point. The snippet can create or access variables or types, access registers, and evaluate logical, conditional, and arithmetic expressions. Dyninst converts the snippets into instructions and inserts them at the probe point.
In the case of SystemTap, the mutator is the SystemTap tool stapdyn. This tool creates snippets that call SystemTap handlers, which are defined in a shared object and correspond to the probe points. The Dyninst snippets do not handle the SystemTap probe. Instead, the probe handlers perform that function and the Dyninst snippets call those handlers.
Summary
The SystemTap Dyninst backend enables the use of SystemTap to probe a user application without requiring any special privileges. Probes run with minimal overhead because they run in user space. Consider the Dyninst runtime as an alternative the next time a user space application needs probing.
Last updated: February 5, 2024