This article describes the new file descriptor related features recently added to Valgrind and explains how to use them to track the file descriptors that your program is using or, more importantly, misusing.
Valgrind has been able to track file descriptors for a while. My colleague Mark Wielaard and I improved this feature by turning all file descriptor operations into events. This lets Valgrind report misuse as errors that show where the operation occurred, can be suppressed, and can be reported in different formats.
What is Valgrind?
Valgrind is an instrumentation framework for building dynamic analysis tools. Its tools can detect various memory management and threading bugs. To detect these bugs, Valgrind decompiles, instruments, and recompiles your code, and intercepts syscalls, signals, threading, and so on.
The most used Valgrind tool, which is also the default, is Memcheck, which detects use of inaccessible or undefined memory. Other tools include Cachegrind, a cache profiler; Massif, a heap profiler; Helgrind, a thread debugger; and the special none tool (which previously didn't report any errors).
Valgrind Error Manager
Valgrind tools use the Error Manager to create and report warnings and errors with backtraces. Valgrind’s output is human readable by default, but with the --xml=yes option it is generated in machine-readable XML. Other tools, for example Eclipse, can then consume this output. See below:
<auxwhat>Previously closed</auxwhat>
<stack>
<frame>
<ip>0x........</ip>
<obj>...</obj>
<fn>client</fn>
<file>fdleak_ipv4.c</file>
<line>90</line>
</frame>
</stack>
<auxwhat>Originally opened</auxwhat>
<stack>
<frame>
<ip>0x........</ip>
<obj>...</obj>
<file>fdleak_ipv4.c</file>
<line>90</line>
</frame>
</stack>
Use Valgrind for testing purposes
The errors that Valgrind outputs can be suppressed. This is useful when clean output is required, for example when Valgrind is used for testing and some false positives or known errors are present. The --gen-suppressions=yes option generates output that can be placed into a file, which Valgrind can later use (through the --suppressions=<file> option) to suppress those errors.
It is also possible to make Valgrind return a specific exit code if it detects any error. For example, if Valgrind is run with the --error-exitcode=99 option, it exits with status 99 if any error is detected.
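A test harness can act on that exit code. The following is a minimal sketch in C, not part of Valgrind: the helper names run_and_get_exit_status and valgrind_found_errors are hypothetical, and the sketch assumes valgrind is on the PATH and that the program under test never exits with status 99 on its own.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and return its exit status.
 * Returns 127 if the program could not be executed, and -1 if fork()
 * or waitpid() failed or the child did not exit normally. */
int run_and_get_exit_status(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical test helper: returns 1 if Valgrind reported an error
 * while running prog, assuming prog itself never exits with 99. */
int valgrind_found_errors(const char *prog)
{
    char *const argv[] = {
        "valgrind", "--error-exitcode=99", "--track-fds=yes",
        (char *)prog, NULL
    };
    return run_and_get_exit_status(argv) == 99;
}
```

Choosing an exit code that the tested program does not use itself keeps "Valgrind found an error" distinguishable from "the program failed".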
File descriptors
File descriptors are either inherited at program startup (e.g., stdin/stdout/stderr) or created by system calls (such as creat(), open(), socket(), etc.). They are later destroyed by system calls (such as close() or close_range()).
We can think of file descriptors as a resource, much like memory. File descriptors are created and destroyed by syscalls, similar to how memory is allocated and then freed, and we can encounter similar bugs with them as with pointers to memory blocks. A file descriptor can be closed a second time after it was already closed, like memory that is freed twice. And you could use file descriptors that you never created or inherited.
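The lifecycle described above can be sketched in a few lines of C. This is only an illustration: the helper names fd_is_open and fd_lifecycle_demo are made up for this example, and pipe() is used as the creating syscall simply because it needs no file on disk.

```c
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if fd currently refers to an open file, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Create two descriptors, check them, destroy them, check again.
 * Returns 0 if every step behaved as expected, -1 otherwise. */
int fd_lifecycle_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)             /* creation, like malloc() */
        return -1;

    int ok = fd_is_open(fds[0]) && fd_is_open(fds[1]);

    close(fds[0]);                  /* destruction, like free() */
    close(fds[1]);

    /* The numbers are now stale, like dangling pointers. */
    ok = ok && !fd_is_open(fds[0]) && !fd_is_open(fds[1]);
    return ok ? 0 : -1;
}
```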
Misusing file descriptors
The following are ways in which file descriptors can be misused:
- Using too large or negative file descriptors
- File descriptor double close
- close_range() system call
- File descriptor leak
- File descriptor use after close
- Inspecting file descriptors with GDB
Using too large or negative file descriptors
To track file descriptors with Valgrind, use the --track-fds=yes option. The whole command is then:
valgrind --track-fds=yes ./prog_name
When a file descriptor number that is too high is used, for example:
write(12345, string, 3);
Valgrind checks the file descriptor limit (and whether the number is negative). If the file descriptor number is above the limit for the program (the RLIMIT_NOFILE resource limit), Valgrind generates this error:
==3695625== Invalid file descriptor 12345
==3695625== at 0x4991984: write (write.c:26)
==3695625== by 0x4012ED: main (bad.c:46)
Valgrind produces a similar error when a negative file descriptor is used:
==682855== Invalid file descriptor -1
==682855== at 0x4991984: write (write.c:26)
==682855== by 0x401363: main (bad.c:54)
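Without Valgrind, the only sign of such a bug is the return value of the call, which is easy to ignore. The sketch below shows what the program itself can observe; try_write and fd_soft_limit are hypothetical helper names, and the example assumes descriptor 12345 is not open in the process.

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/* Try to write three bytes to fd.
 * Returns 0 on success or the errno value on failure. */
int try_write(int fd)
{
    static const char string[] = "abc";
    if (write(fd, string, 3) == -1)
        return errno;
    return 0;
}

/* Return the soft limit on open file descriptors (RLIMIT_NOFILE),
 * or 0 if it cannot be queried. Valid descriptor numbers are
 * always below this value. */
rlim_t fd_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    return rl.rlim_cur;
}
```

Both try_write(-1) and try_write(12345) report EBADF here, with no indication of which line of the program was responsible. That is the gap the backtraces above fill.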
File descriptor double close
When we are done with a file descriptor, it should be closed, similarly to freeing memory. And as with freeing memory, a file descriptor is supposed to be closed only once; closing it again can cause bugs. In a simple program, closing a file descriptor a second time is not a big deal: the second close() call just fails with the EBADF error. The problem arises when close() is accidentally called repeatedly in a multi-threaded program. Another thread could have called open() between the two close() calls and received the same file descriptor number, so you may mistakenly close a file you did not intend to close, causing unexpected problems.
Since version 3.23.0, Valgrind is able to detect that a file descriptor was already closed:
==3521944== File descriptor 3: /dev/pts/0 is already closed
==3521944== at 0x497F804: close (close.c:27)
==3521944== by 0x401322: main (bad.c:51)
==3521944== Previously closed
==3521944== at 0x497F804: close (close.c:27)
==3521944== by 0x4012CF: main (bad.c:44)
==3521944== Originally opened
==3521944== at 0x497FA4B: dup (syscall-template.S:120)
==3521944== by 0x401208: main (bad.c:29)
This backtrace shows where the file descriptor was opened, where it was closed for the first time, and where it is closed again, causing potential problems.
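One common defensive pattern against double close is to invalidate the variable holding the descriptor as soon as it is closed. The following is a minimal sketch of that idea, not something Valgrind provides; close_once is a hypothetical helper, and it does not by itself make concurrent calls from several threads safe.

```c
#include <unistd.h>

/* Close *fd at most once.
 * Returns 0 if the descriptor was closed, 1 if there was nothing
 * to close, and -1 if close() reported an error. */
int close_once(int *fd)
{
    if (*fd < 0)
        return 1;                   /* already closed, or never opened */

    int rc = close(*fd);
    /* On Linux the descriptor is released even when close() reports
     * an error, so the number must not be used again either way. */
    *fd = -1;
    return rc == 0 ? 0 : -1;
}
```

A second close_once() call on the same variable then becomes a harmless no-op instead of a close() on a number that may already belong to another file.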
close_range() system call
This system call is relatively new; it was introduced in Linux 5.9 and glibc 2.34. The close_range() syscall closes all file descriptors in a given range. It can be used to close all the file descriptors that are possibly left open. For example:
close_range(3, ~0U, flags);
It can also be used to close a small range of file descriptors. Valgrind warns about a double close only when a small range is closed.
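Because the syscall is recent, portable code often needs a fallback. The sketch below is one possible approach under stated assumptions: close_fd_range is a hypothetical helper, it calls the syscall directly through syscall() so that it also builds against glibc versions older than 2.34, and it falls back to closing descriptors one by one when the kernel does not provide close_range() or a sandbox blocks it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Close every descriptor in [first, last].
 * Returns 0 on success, -1 on error with errno set. */
int close_fd_range(unsigned int first, unsigned int last)
{
    if (first > last) {
        errno = EINVAL;
        return -1;
    }

#ifdef SYS_close_range
    if (syscall(SYS_close_range, (unsigned long)first,
                (unsigned long)last, 0UL) == 0)
        return 0;
    if (errno == EINVAL)
        return -1;
    /* Otherwise the syscall is unavailable (ENOSYS on kernels older
     * than 5.9) or blocked by a sandbox: use the loop below. */
#endif

    /* Fallback: do not loop past the highest possible descriptor. */
    long open_max = sysconf(_SC_OPEN_MAX);
    if (open_max > 0 && last >= (unsigned long)open_max)
        last = (unsigned int)(open_max - 1);

    for (unsigned int fd = first; fd <= last; fd++) {
        /* EBADF only means this number was not open. */
        if (close((int)fd) != 0 && errno != EBADF)
            return -1;
        if (fd == last)
            break;                  /* avoid wrap-around at UINT_MAX */
    }
    return 0;
}
```

Calling close_fd_range(3, ~0U) mirrors the example above: it closes everything except stdin, stdout, and stderr.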
File descriptor leak
When a file descriptor is never closed, it leaks. This means the program might have lost track of the file descriptor. File descriptors take up resources, so they should be disposed of as soon as they are no longer used. Valgrind gives a backtrace showing where the leaked file descriptor was originally opened:
==3696499== FILE DESCRIPTORS: 4 open (3 std) at exit.
==3696499== Open file descriptor 4: /dev/pts/10
==3696499== at 0x498C1BB: dup (syscall-template.S:120)
==3696499== by 0x40121D: main (bad.c:31)
Inherited file descriptors, like stdin/stdout/stderr, are normally not closed by the program, so they are ignored if you use --track-fds=yes, but they can be tracked with --track-fds=all. See below:
==3696688== FILE DESCRIPTORS: 4 open (3 std) at exit.
==3696688== Open file descriptor 4: /dev/pts/10
==3696688== at 0x498C1BB: dup (syscall-template.S:120)
==3696688== by 0x40121D: main (bad.c:31)
==3696688==
==3696688== Open file descriptor 2: /dev/pts/10
==3696688== <inherited from parent>
==3696688==
==3696688== Open file descriptor 1: /dev/pts/10
==3696688== <inherited from parent>
==3696688==
==3696688== Open file descriptor 0: /dev/pts/10
==3696688== <inherited from parent>
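A unit test can also check for leaks in a cruder way, by counting open descriptors before and after the code under test. The sketch below illustrates that idea; count_open_fds and leaky_dup are hypothetical names, and unlike Valgrind this approach cannot tell you where the leaked descriptor was opened.

```c
#include <fcntl.h>
#include <unistd.h>

/* Count the descriptors in [0, limit) that are currently open. */
int count_open_fds(int limit)
{
    int count = 0;
    for (int fd = 0; fd < limit; fd++)
        if (fcntl(fd, F_GETFD) != -1)
            count++;
    return count;
}

/* Deliberately leaky example: the caller receives a new descriptor
 * and is expected to close it, which is easy to forget. */
int leaky_dup(int fd)
{
    return dup(fd);
}
```

If count_open_fds() returns a larger value after the tested code than before it, something was left open, and running the same test under valgrind --track-fds=yes then shows the backtrace of the leaked descriptor.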
File descriptor use after close
This work is available from Valgrind 3.24.0, which was released November 1st, 2024. Valgrind records where the file descriptor was closed and will output an error with a backtrace showing where an already closed file descriptor is used again:
close(fd);
write(fd, string, 3);
This code will make Valgrind show the following error. The backtrace lets you know where the file descriptor was originally opened, where it was closed, and where it was used again:
==3696196== File descriptor 3: /dev/pts/10 is already closed
==3696196== at 0x498BF74: close (close.c:27)
==3696196== by 0x401356: main (bad.c:54)
==3696196== Previously closed
==3696196== at 0x498BF74: close (close.c:27)
==3696196== by 0x4012D7: main (bad.c:45)
==3696196== Originally opened
==3696196== at 0x498C1BB: dup (syscall-template.S:120)
==3696196== by 0x401210: main (bad.c:30)
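A use after close is most dangerous when the descriptor number has been handed out again in the meantime, because new descriptors always get the lowest free number. The following sketch demonstrates the effect in a single-threaded program; stale_write_hits_other_file is a hypothetical name, and two pipes stand in for two unrelated files.

```c
#include <unistd.h>

/* Returns 1 if a write through a stale descriptor landed in an
 * unrelated pipe, 0 if it did not, and -1 on setup failure. */
int stale_write_hits_other_file(void)
{
    int a[2], b[2];
    char c = 0;
    int result = -1;

    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    int fd = dup(a[1]);             /* fd refers to pipe a */
    if (fd >= 0) {
        close(fd);                  /* fd is stale from here on */
        int other = dup(b[1]);      /* lowest free number: fd again */
        if (other >= 0) {
            /* Use after close: meant for pipe a, but the number
             * now names pipe b, so the byte ends up there. */
            if (other == fd && write(fd, "x", 1) == 1 &&
                read(b[0], &c, 1) == 1)
                result = (c == 'x');
            else
                result = 0;
            close(other);
        }
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return result;
}
```

The write() succeeds and reports no error, yet the data goes to the wrong place. In a multi-threaded program the same thing can happen whenever another thread opens a file between the close() and the stale use.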
Inspecting file descriptors with GDB
Valgrind can work together with the GNU Project Debugger (GDB) pretty well. It’s possible to run them in separate terminals and ask Valgrind about various things using monitor commands from GDB. (I wrote an article about that: Debug memory errors with Valgrind and GDB.) It is also possible to run Valgrind from inside GDB. I described how to do this in detail in another one of my previous posts, Valgrind and GDB in close cooperation.
To run Valgrind from inside GDB to track the file descriptors in a program called bad, use the following commands:
gdb -ex 'set remote exec-file ./bad' -ex 'set sysroot /' ./bad
(gdb) target extended-remote | vgdb --multi --vargs -q --track-fds=yes
After using those commands, Valgrind is launched and acts as a gdbserver from GDB’s point of view. To inspect the currently open file descriptors, use the v.info command:
(gdb) monitor v.info open_fds
==3698979== FILE DESCRIPTORS: 5 open (3 std) .
Open AF_UNIX socket 4: <unknown>
==3698979== at 0x498C1BB: dup (syscall-template.S:120)
==3698979== by 0x40121D: main (bad.c:31)
==3698979==
Open AF_UNIX socket 3: <unknown>
==3698979== at 0x498C1BB: dup (syscall-template.S:120)
==3698979== by 0x401210: main (bad.c:30)
Summary
Valgrind is useful not only for debugging memory-related problems, but also for debugging other kinds of problems. In a new Valgrind version (3.23.0), we extended file descriptor tracking, giving users backtraces with detailed information about when this resource was created, destroyed, and misused. Another very useful way to debug problems, not only those related to file descriptors, is using Valgrind from inside GDB.