Featured image for a Linux topic.

SystemTap allows developers to add instrumentation to Linux systems to better understand the behavior of the kernel as well as userspace applications and libraries. This article, the first in a two-part series, shows how SystemTap can reveal potential performance problems down to individual lines of code. The second part of the series will describe how a SystemTap performance monitoring script was written.

The SystemTap examples repository

Writing a program from scratch in an unfamiliar language can be daunting. To make it easier for people to use SystemTap, the project provides a collection of examples. Some investigate common system issues. Other examples are there to demonstrate particular SystemTap features.

To provide a bit of structure and make it easier to locate desired scripts out of the approximately 200 SystemTap examples, they are divided into subdirectories. Additionally, each of the examples has associated keywords such as "NETWORK", "DISK", "NFS", and "PROCESS." These keywords are used to create indexes that make it easier to find a script that implements the instrumentation you are interested in or provides a good starting point for your particular needs.

For example, suppose you want to better understand a section of code and where it is taking a lot of execution time. Other tools on Linux, such as perf, can identify the hot spots in the code, but you might like to have some additional information about the control flow in the program. You look at the examples associated with the PROFILING keyword and find a linetimes.stp script that looks promising. This article shows how to use this script as an illustration of SystemTap's capabilities as well as preparation for the programming guidelines in the second part of the series.

Profiling with linetime.stp

SystemTap examples are installed as part of the systemtap-client RPM in the /usr/share/systemtap/examples directory in Red Hat Enterprise Linux and Fedora.

The linetimes.stp script takes two arguments. The first argument determines whether the kernel, a kernel module, or a user-space binary is being monitored. The second argument is the name of the function being monitored.

To get a better understanding of the nfsd4_open function in the kernel file fs/nfsd/nfs4proc.c, I ran the following command on a Red Hat Enterprise Linux 8 machine running as an NFS server. The --example option indicates that the script is in the examples directory. Because the NFS server code is compiled as the nfsd module, the command needs to specify the code location with module("nfsd"). Core kernel code would be specified as kernel. The function name, nfsd4_open, follows on the command line. The -c "sleep 10" argument runs the script for 10 seconds and then exits:

$ sudo stap --example linetimes.stp 'module("nfsd")' nfsd4_open -c "sleep 10"

I got the following output from the script. The opening line states that the function was called 12 times. Looking through the times listed for the regions, I see that one line, 418, stands out as being very expensive with an average runtime of 3,937 microseconds:

module("nfsd") nfsd4_open call count: 12
region                                                     avg(us) max(us)
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:352")       2       4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353")       2       4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358")       6       8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382")       5      18
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418")    3937    8962
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419")       8      16
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431")       5       5
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453")       5       7
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454")       2       3
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465")       1       3
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466")       1       2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467")       1       2

The source file containing this line is:

/usr/src/debug/kernel-4.18.0-375.el8/linux-4.18.0-375.el8.x86_64/fs/nfsd/nfs4proc.c

A few relevant lines from the source code follow:

415 switch (open->op_claim_type) {
416         case NFS4_OPEN_CLAIM_DELEGATE_CUR:
417         case NFS4_OPEN_CLAIM_NULL:
418                 status = do_open_lookup(rqstp, cstate, open, &resfh);
419                 if (status)
420                         goto out;
421                 break;
422         case NFS4_OPEN_CLAIM_PREVIOUS:
423                 status = nfs4_check_open_reclaim(cstate->clp);
424                 if (status)
425                         goto out;
426                 open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
427                 reclaim = true;
428                 /* fall through */
429         case NFS4_OPEN_CLAIM_FH:
430         case NFS4_OPEN_CLAIM_DELEG_CUR_FH:
431                 status = do_open_fhandle(rqstp, cstate, open);
432                 if (status)
433                         goto out;
434                 resfh = &cstate->current_fh;
435                 break;

A look at the source file reveals that line 418 contains a do_open_lookup function call. To find out why the call is happening so often, it would make sense to perform the steps shown in this section on that function, but we won't pursue the matter further in this article.

Following the timing information, the script outputs a control flow graph. Each line of output shows the module name followed by information about the executing statement: the function name followed by an ampersand, the filename followed by a colon, and the line number within the filename. A from line is generated for each statement that runs, followed by a list of indented to lines indicating the statements that followed at various times.

The following output shows that much of the control flow in the function is linear, because each from line is followed by a single to destination. A linear flow is also suggested when the number of times a line runs is 12, the same as the function itself. However, some from statements are followed by more than one to destination, indicating a branch:

control flow graph information
from
    to
=======================
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:352")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403") 6
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418") 8
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419") 6
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461") 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466")
    module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467")
    module("nfsd").function("nfsd4_open@fs/nfsd/nfs4proc.c:350").return 12

In particular, note the first from line for nfsd4_open@fs/nfsd/nfs4proc.c, line 415. This line is a switch statement that invokes nfsd4_open@fs/nfsd/nfs4proc.c, line 418, 8 times, and invokes nfsd4_open@fs/nfsd/nfs4proc.c, line 431, 4 times. Lines 418 and 431 are different case statements in the switch-case statement. This information can be used to better understand the paths through the code.

Conclusion

SystemTap offers many example scripts out of the box to help uncover potential system problems. The example index helps you find appropriate scripts. These existing scripts can be easily run, as demonstrated in this article. The scripts can also be used as a starting point for your own instrumentation and a way to better understand how to write your own SystemTap scripts. The second article in the series will walk through the linetimes.stp example to teach how you can create or adapt a SystemTap script to your needs.

Last updated: November 6, 2023