SystemTap allows developers to add instrumentation to Linux systems to better understand the behavior of the kernel as well as userspace applications and libraries. This article, the first in a two-part series, shows how SystemTap can reveal potential performance problems down to individual lines of code. The second part of the series will describe how a SystemTap performance monitoring script was written.
The SystemTap examples repository
Writing a program from scratch in an unfamiliar language can be daunting. To make it easier for people to use SystemTap, the project provides a collection of examples. Some investigate common system issues. Other examples are there to demonstrate particular SystemTap features.
To provide a bit of structure and make it easier to locate desired scripts out of the approximately 200 SystemTap examples, they are divided into subdirectories. Additionally, each of the examples has associated keywords such as "NETWORK", "DISK", "NFS", and "PROCESS." These keywords are used to create indexes that make it easier to find a script that implements the instrumentation you are interested in or provides a good starting point for your particular needs.
For example, suppose you want to better understand a section of code and where it is taking a lot of execution time. Other tools on Linux, such as perf
, can identify the hot spots in the code, but you might like to have some additional information about the control flow in the program. You look at the examples associated with the PROFILING keyword and find a linetimes.stp
script that looks promising. This article shows how to use this script as an illustration of SystemTap's capabilities as well as preparation for the programming guidelines in the second part of the series.
Profiling with linetime.stp
SystemTap examples are installed as part of the systemtap-client RPM in the /usr/share/systemtap/examples
directory in Red Hat Enterprise Linux and Fedora.
The linetimes.stp
script takes two arguments. The first argument determines whether the kernel, a kernel module, or a user-space binary is being monitored. The second argument is the name of the function being monitored.
To get a better understanding of the nfsd4_open
function in the kernel file fs/nfsd/nfs4proc.c
, I ran the following command on a Red Hat Enterprise Linux 8 machine running as an NFS server. The --example
option indicates that the script is in the examples directory. Because the NFS server code is compiled as the nfsd
module, the command needs to specify the code location with module("nfsd")
. Core kernel code would be specified as kernel
. The function name, nfsd4_open
, follows on the command line. The -c "sleep 10"
argument runs the script for 10 seconds and then exits:
$ sudo stap --example linetimes.stp 'module("nfsd")' nfsd4_open -c "sleep 10"
I got the following output from the script. The opening line states that the function was called 12 times. Looking through the times listed for the regions, I see that one line, 418, stands out as being very expensive with an average runtime of 3,937 microseconds:
module("nfsd") nfsd4_open call count: 12
region avg(us) max(us)
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:352") 2 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353") 2 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358") 6 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382") 5 18
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418") 3937 8962
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419") 8 16
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431") 5 5
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453") 5 7
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454") 2 3
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465") 1 3
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466") 1 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467") 1 2
The source file containing this line is:
/usr/src/debug/kernel-4.18.0-375.el8/linux-4.18.0-375.el8.x86_64/fs/nfsd/nfs4proc.c
A few relevant lines from the source code follow:
415 switch (open->op_claim_type) {
416 case NFS4_OPEN_CLAIM_DELEGATE_CUR:
417 case NFS4_OPEN_CLAIM_NULL:
418 status = do_open_lookup(rqstp, cstate, open, &resfh);
419 if (status)
420 goto out;
421 break;
422 case NFS4_OPEN_CLAIM_PREVIOUS:
423 status = nfs4_check_open_reclaim(cstate->clp);
424 if (status)
425 goto out;
426 open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
427 reclaim = true;
428 /* fall through */
429 case NFS4_OPEN_CLAIM_FH:
430 case NFS4_OPEN_CLAIM_DELEG_CUR_FH:
431 status = do_open_fhandle(rqstp, cstate, open);
432 if (status)
433 goto out;
434 resfh = &cstate->current_fh;
435 break;
A look at the source file reveals that line 418 contains a do_open_lookup
function call. To find out why the call is happening so often, it would make sense to perform the steps shown in this section on that function, but we won't pursue the matter further in this article.
Following the timing information, the script outputs a control flow graph. Each line of output shows the module name followed by information about the executing statement: the function name followed by an ampersand, the filename followed by a colon, and the line number within the filename. A from
line is generated for each statement that runs, followed by a list of indented to
lines indicating the statements that followed at various times.
The following output shows that much of the control flow in the function is linear, because each from
line is followed by a single to
destination. A linear flow is also suggested when the number of times a line runs is 12, the same as the function itself. However, some from
statements are followed by more than one to
destination, indicating a branch:
control flow graph information
from
to
=======================
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:352")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:353")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:354")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:357")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:358")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:365")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:368")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:373")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:374")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:378")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:382")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:383")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:395")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:397")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:402")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:403")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:408")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:411")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:415")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:418")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461") 2
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:419")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461") 6
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:431")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:432")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:453")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:454")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:457")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:460")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465") 4
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:461")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:462")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:463")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465") 8
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:465")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:466")
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467") 12
module("nfsd").statement("nfsd4_open@fs/nfsd/nfs4proc.c:467")
module("nfsd").function("nfsd4_open@fs/nfsd/nfs4proc.c:350").return 12
In particular, note the first from
line for nfsd4_open@fs/nfsd/nfs4proc.c
, line 415. This line is a switch statement that invokes nfsd4_open@fs/nfsd/nfs4proc.c
, line 418, 8 times, and invokes nfsd4_open@fs/nfsd/nfs4proc.c
, line 431, 4 times. Lines 418 and 431 are different case statements in the switch-case statement. This information can be used to better understand the paths through the code.
Conclusion
SystemTap offers many example scripts out of the box to help uncover potential system problems. The example index helps you find appropriate scripts. These existing scripts can be easily run, as demonstrated in this article. The scripts can also be used as a starting point for your own instrumentation and a way to better understand how to write your own SystemTap scripts. The second article in the series will walk through the linetimes.stp
example to teach how you can create or adapt a SystemTap script to your needs.