The Linux perf tool was originally written to allow access to the performance monitoring hardware that counts hardware events, such as instructions executed, processor cycles, and cache misses. However, it can also be used to count software events, which can be useful in gauging how frequently some part of the system software is executed.
Recently, someone at Red Hat asked whether there was a way to get a count of system calls being executed on the system. The kernel has a predefined software trace point, raw_syscalls:sys_enter, which collects that exact information. It counts each time a system call is made. To use the trace point events, the perf command needs to be run as root.
The following command gives a system-wide count (-a option) of system calls (-e raw_syscalls:sys_enter) every second (-I 1000, an interval given in milliseconds):
# perf stat -a -e raw_syscalls:sys_enter -I 1000
# time counts unit events
1.000640941 1,250 raw_syscalls:sys_enter
2.001183785 1,901 raw_syscalls:sys_enter
3.001601593 1,922 raw_syscalls:sys_enter
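If you capture interval output like the above to a file, a small filter can total it up. The following is only a sketch, assuming the three-column layout shown above (time, count, event name); the function name summarize_intervals is hypothetical and not part of perf.

```shell
# Summarize `perf stat -I` interval output read from stdin.
# Assumption: each data line is "<time> <count> <event>", as in the sample above.
summarize_intervals() {
    awk '
        /^#/ || NF < 3 { next }              # skip the header and short lines
        {
            count = $2
            gsub(",", "", count)             # drop thousands separators
            if (count !~ /^[0-9]+$/) next    # skip lines without a numeric count
            total += count
            n++
        }
        END {
            if (n > 0)
                printf "intervals=%d total=%d mean=%.1f\n", n, total, total / n
        }
    '
}
```

For example, after saving the output with perf stat's -o option, `summarize_intervals < syscalls.txt` prints a single summary line (the file name here is just an illustration).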
The raw_syscalls:sys_enter trace point is just one predefined trace point event in the kernel. To list the other 1000+ predefined trace point events, run the following as root:
# perf list tracepoint
List of pre-defined events (to be used in -e):
block:block_bio_backmerge [Tracepoint event]
block:block_bio_bounce [Tracepoint event]
block:block_bio_complete [Tracepoint event]
block:block_bio_frontmerge [Tracepoint event]
block:block_bio_queue [Tracepoint event]
block:block_bio_remap [Tracepoint event]
block:block_dirty_buffer [Tracepoint event]
block:block_getrq [Tracepoint event]
block:block_plug [Tracepoint event]
...
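With more than a thousand trace points in the list, it can help to see how they group by subsystem (the part of the name before the colon). The filter below is a rough sketch that assumes each event line looks like "subsystem:event [Tracepoint event]", as in the listing above; count_by_subsystem is a hypothetical helper name.

```shell
# Count trace points per subsystem from `perf list tracepoint` output on stdin.
# Assumption: event lines have the form "<subsystem>:<event> [Tracepoint event]".
count_by_subsystem() {
    awk '
        $2 == "[Tracepoint" {
            split($1, parts, ":")            # parts[1] is the subsystem name
            count[parts[1]]++
        }
        END {
            for (s in count) print count[s], s
        }
    ' | sort -k1,1nr -k2,2                   # largest subsystems first
}
```

Running `perf list tracepoint | count_by_subsystem | head` as root should then show the subsystems with the most trace points.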
You may want to have a counter for some arbitrary function in the kernel that does not yet have a trace point. No problem. You can define your own probe points and then use them in the perf stat command to monitor functions that implement expensive operations. For example, clearing a 2MB huge page, which is 512 times the size of a traditional 4KB page, has a latency approximately 500 times longer than clearing the 4KB page. These latencies can be noticeable, and you might want to know when a significant number of these delays occur.
The following sets up a probe point on the clear_huge_page function and makes it accessible to perf:
# perf probe --add clear_huge_page
Added new event:
probe:clear_huge_page (on clear_huge_page)
You can now use it in all perf tools, such as:
perf record -e probe:clear_huge_page -aR sleep 1
The following reports the count every 10 seconds (10,000 milliseconds):
# perf stat -a -e probe:clear_huge_page -I 10000
# time counts unit events
10.000241215 73 probe:clear_huge_page
20.001129381 4 probe:clear_huge_page
30.001567364 3 probe:clear_huge_page
40.002202895 2 probe:clear_huge_page
50.003554968 1 probe:clear_huge_page
50.316752807 0 probe:clear_huge_page
...
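Since the interesting intervals are the ones with many huge page clears, you may want to filter the output down to those. Here is a minimal sketch, again assuming the time, count, event layout shown above; flag_busy_intervals and its threshold argument are hypothetical, not perf features.

```shell
# Print only the intervals whose count reaches a threshold.
# Usage: flag_busy_intervals THRESHOLD < interval-output
# Assumption: each data line is "<time> <count> <event>".
flag_busy_intervals() {
    awk -v min="${1:-1}" '
        /^#/ || NF < 3 { next }              # skip the header and short lines
        {
            count = $2
            gsub(",", "", count)             # drop thousands separators
            if (count ~ /^[0-9]+$/ && count + 0 >= min + 0)
                printf "%s %s %s\n", $1, count, $3
        }
    '
}
```

Because perf stat writes its counts to stderr by default, a live pipeline would need a redirect, along the lines of `perf stat -a -e probe:clear_huge_page -I 10000 2>&1 | flag_busy_intervals 50`.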
When you no longer need the probe point for the clear_huge_page function, it can be removed as shown below.
# perf probe --del=probe:clear_huge_page
Removed event: probe:clear_huge_page
The perf probe points can also be placed in user-space executables. You may need to compile the code with debuginfo enabled (GCC's -g option) or install the debuginfo RPMs to allow perf to find the location of the functions. To place a probe on the malloc function in the glibc library, the executable needs to be specified with the --exec option.
# perf probe --exec=/lib64/libc-2.17.so --add malloc
Added new event:
probe_libc:malloc (on malloc in /usr/lib64/libc-2.17.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc -aR sleep 1
Using probe_libc:malloc, you can get a count of the number of malloc calls occurring every 10 seconds. Below is the output from a machine that sits idle for the first 20 seconds. After 20 seconds, a parallel kernel build is started, and the number of malloc calls increases dramatically.
# perf stat -a -e probe_libc:malloc -I 10000
# time counts unit events
10.000900150 2 probe_libc:malloc
20.001803180 0 probe_libc:malloc
30.002286255 1,829,385 probe_libc:malloc
40.002442647 12,553,306 probe_libc:malloc
50.002578104 15,579,692 probe_libc:malloc
...
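The counts above are per 10-second interval. To compare runs that use different -I values, it can be handy to convert them to calls per second. The sketch below divides each count by the time elapsed since the previous line, assuming the same time, count, event layout; per_second_rate is a hypothetical helper name.

```shell
# Convert per-interval counts into events per second.
# Assumption: each data line is "<time> <count> <event>", with time in seconds.
per_second_rate() {
    awk '
        /^#/ || NF < 3 { next }              # skip the header and short lines
        {
            elapsed = $1 - prev              # length of this interval in seconds
            prev = $1
            count = $2
            gsub(",", "", count)             # drop thousands separators
            if (count ~ /^[0-9]+$/ && elapsed > 0)
                printf "%.0f %.0f\n", $1, count / elapsed
        }
    '
}
```

Each output line gives the interval's end time (rounded to whole seconds) followed by the approximate rate for that interval.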
Once you're done with the user-space probe, it can be deleted:
# perf probe --exec=/lib64/libc-2.17.so --del malloc
Removed event: probe_libc:malloc
Using perf stat with the software probe points can help you answer the question of how frequently some code is being executed. For more information about setting up software probe points, take a look at the perf-probe man page.