In the first article of this series, Getting started with the debugger, I introduced the GNU Debugger (GDB) and walked you through its common startup options and processes. As I promised in Part 1, this article introduces the debugging information that is used to describe compiled code.
What is debugging information?
Put simply, debugging information (debuginfo for short) is output by compilers and other tools to tell other development tools about compiled code. Debuginfo communicates important information about a program or library, including line tables that map executable code back to the original source code; file names and included files; descriptions of functions, variables, and types in the program; and much more.
All of this information is used by other tools, including GDB, to let users step through a program's source code line by line, inspect variables and stack backtraces, and perform other essential development and debugging tasks.
DWARF is the debuginfo standard used on the majority of operating systems today, including GNU/Linux, BSD, and other Unix derivatives. It is language- and architecture-agnostic. The same debuginfo format is valid for any architecture (including x86_64, PowerPC, or RISC) or any language (including C and C++, Ada, D, or Rust).
The debugging information entries (DIEs) output by compilers and other tools provide a low-level description of the compiled source code. Every DIE consists of an identifying tag and various attributes that relay important information about the entity being represented. Various source language concepts, such as functions, may be described by several DIEs arranged in an essentially hierarchical order, with child DIEs refining the description of their parent DIE.
Reading debuginfo: A simple use case
To illustrate a (very) simple case, consider the following “Hello, World!” example in C:
 1  #include <stdio.h>
 2
 3  int
 4  main (void)
 5  {
 6    int counter;
 7
 8    for (counter = 0; counter < 10; ++counter)
 9      printf ("Hello, World (%d)!\n", counter);
10
11    return 0;
12  }
If this were saved into a file named hello.c and compiled with debugging information (see the discussion on compiler flags in the first article in this series), the output to describe this program might look like the pseudo-DIEs here:
DW_TAG_compile_unit
    DW_AT_producer: “gcc -g”
    DW_AT_language: “C”
    DW_AT_name: “hello.c”
    DW_AT_comp_dir: “/home/keiths”

    DW_TAG_base_type
        DW_AT_name: “int”

    DW_TAG_subprogram
        DW_AT_name: “main”
        DW_AT_decl_file: 1
        DW_AT_decl_line: 4
        DW_AT_type: link to DIE describing “int”

        DW_TAG_variable
            DW_AT_name: “counter”
            DW_AT_decl_file: 1
            DW_AT_decl_line: 6
            DW_AT_type: link to DIE describing “int”

... much more ...
The resulting debugging information describes a compilation unit with the name hello.c, containing a function named main that is defined on line 4 of hello.c. This function contains a variable named counter that is defined on line 6, and so on. Real DWARF output would, of course, contain much more information, such as where these entities exist in memory, but this simple example demonstrates the hierarchical nature of DWARF DIEs.
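To see the same hierarchy in real output, one option is to filter the DIE headers printed by readelf --debug-dump=info. The following is a minimal sketch rather than part of any tool: die_tree is a hypothetical helper name, and it assumes the header format binutils readelf normally prints, with the nesting depth in the first pair of angle brackets and the tag name in parentheses.

```shell
# die_tree: read "readelf --debug-dump=info" output on stdin and print
# each DIE tag indented by its nesting depth (hypothetical helper).
die_tree() {
    awk '/Abbrev Number: [0-9]+ \(DW_TAG_/ {
        depth = $0
        sub(/^ *</, "", depth)   # drop leading spaces and the first "<"
        sub(/>.*/, "", depth)    # keep only the nesting depth
        tag = $0
        sub(/.*\(/, "", tag)     # keep the text after the last "("
        sub(/\).*/, "", tag)     # drop the closing ")"
        indent = ""
        for (i = 0; i < depth * 2; i++) indent = indent " "
        print indent tag
    }'
}

# Usage (assumes gcc and binutils are installed):
#   gcc -g hello.c -o hello
#   readelf --debug-dump=info hello | die_tree
```

Each level of indentation in the result corresponds to one level of parent and child DIEs.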
Where is debuginfo stored and how can you get it?
Debugging information is stored in several places, including local and remote file systems. Depending on what you are debugging, GDB and other tools search the following locations for the desired debugging information.
1. The program's ELF sections
For programs that you write yourself, such as my “Hello, World!” example, the debugging information is stored in the ELF file itself. When the file is loaded into tools that use debuginfo, the tools will read the various ELF sections related to debugging information (.debug_abbrev, .debug_aranges, .debug_info, etc.) as needed.
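A quick way to check which of these sections a file actually carries is to filter the section table from readelf -S. This is a small sketch: debug_sections is a hypothetical helper name, and it simply prints every whitespace-separated word that starts with .debug_.

```shell
# debug_sections: read "readelf -S" output on stdin and print the names
# of the debug sections it lists (hypothetical helper).
debug_sections() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\.debug_/) print $i
    }'
}

# Usage (assumes binutils is installed and hello was built with -g):
#   readelf -S --wide hello | debug_sections
```

If nothing is printed, the file was either built without debugging information or has been stripped.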
2. Separate debug directories by name
For programs and libraries distributed by vendors such as Red Hat, all debugging information is stripped from programs and libraries and saved into separate packages. For example, the program cp is in the coreutils package, and its debugging information is in the coreutils-debuginfo package.
To install the debuginfo for any distribution-supplied package such as coreutils, use dnf debuginfo-install coreutils. This will also download the associated debugsource package containing the package’s source code.
These separate debuginfo packages are installed under /usr/lib/debug, a special directory that holds distribution-wide debuginfo. Tools such as GDB know to look in this directory when searching for any missing debugging information.
3. Separate debug directories by build ID
Build IDs are ELF note segments in the object file. A build ID is essentially a hash that uniquely identifies any given version of a program or library. Tools will look for this special note segment to find a build ID and use it to locate the debuginfo in the build ID debug directory, /usr/lib/debug/.build-id. (Later in the article, you will learn how to query an object file for its build ID.)
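As a rough illustration of how a build ID maps to a file in that directory, the sketch below assumes the layout described in the GDB manual: the first two hex digits name a subdirectory, and the remaining digits plus a .debug suffix name the file. The function name build_id_debug_path is hypothetical.

```shell
# build_id_debug_path: print where debuginfo for a given build ID would
# conventionally live under /usr/lib/debug/.build-id (hypothetical helper).
build_id_debug_path() {
    id=$1
    rest=${id#??}            # everything after the first two hex digits
    prefix=${id%"$rest"}     # the first two hex digits
    printf '/usr/lib/debug/.build-id/%s/%s.debug\n' "$prefix" "$rest"
}

# Usage:
#   ls -l "$(build_id_debug_path c1e1977d6c15f173215ce21f017c50aa577bb50d)"
```

The path only exists if the matching debuginfo package is installed.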
4. debuginfod server by build ID
Many tools, including GDB, support the use of a debuginfod server, which allows users to download debugging information on demand from centralized servers. Any debuginfo downloaded by debuginfod will be stored in a cache directory in the user’s home directory, $HOME/.cache/debuginfod_client.
Using debuginfod
To use debuginfod, set the environment variable DEBUGINFOD_URLS to point at any servers you want to check for debuginfo. The upstream federated server maintained by the elfutils project can be used to automatically access debugging information for any maintained distribution:
$ export DEBUGINFOD_URLS="https://debuginfod.elfutils.org/ $DEBUGINFOD_URLS"
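If that export line ends up in a shell start-up file that is sourced more than once, the same URL can accumulate in the list. One way to guard against that is a small wrapper like the sketch below. The name add_debuginfod_url is hypothetical, and the sketch assumes the list is space-separated, as in the export command above.

```shell
# add_debuginfod_url: prepend a server to DEBUGINFOD_URLS unless it is
# already in the space-separated list (hypothetical helper).
add_debuginfod_url() {
    url=$1
    case " ${DEBUGINFOD_URLS:-} " in
        *" $url "*) ;;   # already listed, nothing to do
        *) DEBUGINFOD_URLS="$url${DEBUGINFOD_URLS:+ $DEBUGINFOD_URLS}" ;;
    esac
    export DEBUGINFOD_URLS
}

# Usage:
#   add_debuginfod_url "https://debuginfod.elfutils.org/"
```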
GDB 12 also contains new commands to control the debuginfod client library (each command also has a corresponding show equivalent):
set debuginfod enabled on/off/ask: Enables or disables GDB's debuginfod support. If set to ask (the default when debuginfod support is available), GDB will ask the user to confirm the use of debuginfod to download missing debugging information for the current session. To permanently enable this, add set debuginfod enabled on to your .gdbinit start-up script. (For more information on start-up scripts, see the previous article in this series.)

set debuginfod urls LIST: Allows you to provide a space-separated list of URLs from which to query for debuginfo. If debuginfod support is enabled in GDB, this defaults to the DEBUGINFOD_URLS environment variable.

set debuginfod verbose on/off: Controls the verbosity of debuginfod client library messages. To suppress these messages, set this value to off.
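For the permanent setting, the line can be appended to the start-up script by hand or with a small helper such as the sketch below. The function name enable_gdb_debuginfod is hypothetical. It defaults to $HOME/.gdbinit, accepts another file as an optional argument, and only appends the line if it is not already present.

```shell
# enable_gdb_debuginfod: add "set debuginfod enabled on" to a GDB
# start-up script, once (hypothetical helper).
enable_gdb_debuginfod() {
    gdbinit=${1:-$HOME/.gdbinit}
    line='set debuginfod enabled on'
    # -x matches whole lines only, -F treats the text as a fixed string.
    if ! grep -qxF "$line" "$gdbinit" 2>/dev/null; then
        printf '%s\n' "$line" >> "$gdbinit"
    fi
}

# Usage:
#   enable_gdb_debuginfod
```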
You can also set up your own debuginfod server. See Frank Eigler’s excellent article about that for more information.
When is debuginfo loaded?
GDB loads debugging information lazily unless the --readnow option is passed on the command line. Whenever a new object file is encountered (that is, when a program is loaded into GDB or a shared library is loaded at runtime), GDB will quickly scan its debuginfo to collect important information. GDB will typically not read full symbols for any object file until it is requested by the user. For example, attempting to set a breakpoint on a function will cause GDB to expand debugging information for the given function’s compilation unit.
How to inspect debuginfo
There are a number of useful command-line tools for inspecting debugging information. These tools are typically a part of the binutils and elfutils packages on Red Hat-based systems.
To inspect build IDs, install the elfutils package. To inspect debugging information more generally, install both binutils and elfutils packages.
$ sudo dnf install binutils elfutils
Inspect the build ID
You can use the eu-unstrip program to get an object file’s build ID:
$ eu-unstrip -n -e /usr/bin/ls
0+0x23540 c1e1977d6c15f173215ce21f017c50aa577bb50d@0x378 /usr/bin/ls /usr/lib/debug/usr/bin/ls-8.32-30.fc34.x86_64.debug
The build ID is the second element up to the terminating @: c1e1977d6c15f173215ce21f017c50aa577bb50d, in this case. If the debuginfo for this program (coreutils-debuginfo) is installed, it is easy to verify it:
$ ls -l /usr/lib/.build-id/c1/e1977d6c15f173215ce21f017c50aa577bb50d
lrwxrwxrwx. 1 root root 22 Jul 7 09:15 /usr/lib/.build-id/c1/e1977d6c15f173215ce21f017c50aa577bb50d -> ../../../../usr/bin/ls
eu-unstrip also works on core files by using --core=FILENAME instead of the -e option.
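The rule for reading that output (second field, up to the @) is easy to script. The following is a minimal sketch. The name extract_build_id is hypothetical, and it assumes one module per line in the format shown in the eu-unstrip example above.

```shell
# extract_build_id: read "eu-unstrip -n" output on stdin and print the
# build ID from each line (hypothetical helper).
extract_build_id() {
    awk '{
        id = $2
        sub(/@.*/, "", id)   # drop the "@ADDRESS" suffix
        print id
    }'
}

# Usage (assumes elfutils is installed):
#   eu-unstrip -n -e /usr/bin/ls | extract_build_id
```

With --core=FILENAME, eu-unstrip prints one line per module, so the helper prints one value per line.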
Inspect the DWARF debugging information
To inspect a program's DWARF debugging information, use readelf -w (or the equivalent elfutils program, eu-readelf) with the ELF file. If the file contains a build ID, readelf will automatically find the separate debuginfo:
$ readelf -w /usr/bin/ls | head -15
/usr/bin/ls: Found separate debug info file: /usr/lib/debug//usr/bin//ls-8.32-30.fc34.x86_64.debug
/usr/lib/debug//usr/bin//ls-8.32-30.fc34.x86_64.debug: Found separate debug info file: /usr/lib/debug/usr/bin/../../.dwz/coreutils-8.32-30.fc34.x86_64
Contents of the .eh_frame section (loaded from /usr/bin/ls):
00000000 0000000000000014 00000000 CIE
  Version: 1
  Augmentation: "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data: 1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
Was my program compiled with debuginfo?
One question that comes up often on the GDB libera.chat IRC channel is whether a user has compiled their program or library with debugging information. There are several ways to check this, including inspecting readelf and objdump output, but I often find querying the DWARF compilation unit’s producer attribute useful. This gives the full list of compile flags passed to the compiler:
$ readelf -w hello | grep producer | head -1
DW_AT_producer : (indirect string, offset: 0x4b): GNU C17 11.2.1 20210728 (Red Hat 11.2.1-1) -mtune=generic -march=x86-64 -g
The output tells us exactly what compiler was used and what flags were used to compile the object file. In this case (the "Hello, World!" example program), the program was compiled with GCC 11.2.1-1, using the -g option to include debugging information. (The other -m flags are automatically added by Fedora’s GCC configuration.)
As a bonus, this query can also answer the second most commonly asked question on the GDB IRC channel: Was your program built with optimization? Since the DW_AT_producer string listed above does not contain any optimization flags (-ON), we know that this file was not compiled with any optimization.
For complex programs that contain many compilation units, it might be necessary to dump the output of readelf to a file (or pipe it to less) and search for the right compilation unit before looking at the producer attribute.
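For a first pass over many compilation units, the producer strings can also be summarized mechanically. The sketch below uses a hypothetical helper named producer_summary. For every DW_AT_producer line, it reports the last word beginning with -O, or none if there is no such word. It assumes the compiler recorded its command-line switches in the producer string, as the GCC output above does. Other compilers or configurations may record less, in which case none only means that no flag was recorded.

```shell
# producer_summary: read "readelf -w" output on stdin and report the
# optimization flag recorded in each DW_AT_producer line
# (hypothetical helper).
producer_summary() {
    awk '/DW_AT_producer/ {
        opt = "none"
        n = split($0, words, " ")
        for (i = 1; i <= n; i++)
            if (words[i] ~ /^-O/) opt = words[i]   # last -O flag wins
        print "optimization flag: " opt
    }'
}

# Usage (assumes binutils is installed):
#   readelf -w hello | producer_summary | sort | uniq -c
```

Note that a reported -O0 still means the unit was built without optimization.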
Next up in this series
In this article, I have presented the very basic what, when, where, and how of debugging information. In the next article in this series, I will return to the GNU Debugger and discuss how to work with files of all kinds, including object and source files and shared libraries.
Do you have a suggestion or tip related to debugging information or a suggestion for a future topic about how to use GDB? Are you interested in a more in-depth article on DWARF? Leave a comment on this article and share your ideas or requests.
Last updated: August 14, 2023