Querying DWARF For Fun And Profit

Debugging information provides a view into the source code of an application. It's by no means exhaustive, but many features present in the source code are present in the debugging information as well: translation units, functions, scopes, types, variables, etc. It is essential to a range of tools: GDB, SystemTap, OProfile, as well as various developer aids (such as pahole or libabigail).

So clearly this data can be a useful source of information (hence the name, after all). But the only way to access it is through a dumping tool—such as dwarfdump or {eu-,}readelf -w. Deriving any sort of relationship from a dump is a very clumsy undertaking that usually requires nontrivial amounts of scripting to glue the disparate pieces together.

A new tool that Red Hat has been working on lately addresses the need to query debugging information in a structured manner. It is called dwgrep (DWARF grep). As the name indicates, it's aimed particularly at DWARF, which is the name of the format that's generally used in Linux for representing debugging information.

There are two major families of use cases. The first is automated checking of generated DWARF. This comprises looking for instances of a known bug when trying to gauge its impact, discovery of an obscure DWARF construct to check a DWARF consumer that you are writing, or testing DWARF for structural soundness.

The second use case family is that of writing small, ad-hoc tools for deriving information from DWARF. For example, it's not hard to write a small script that dumps class inheritance information, or one that shows all typedefs and what they resolve to.

Both of these families are closely related: in each case, you present a script that describes some sort of relationship between various parts of the debugging information (or ELF in general), and the tool goes ahead to find instances of that relationship. In addition to pattern matching, dwgrep has some typical general-purpose tools, such as integer math or string formatting.

Note that this article mostly skims over some of the key DWARF concepts. I recommend reading through Michael Eager's Introduction to the DWARF Debugging Format, if you are new to DWARF. It makes a good first exposition.

shared-lib-calls-exit

Dwgrep's first version was released recently, and I went ahead and packaged it for Fedora so that it's easy for people to get their hands on it. There's a whole process around getting a package into Fedora, with a meticulous eye towards quality. One of the tests checks that a shared library doesn't call exit. It turned out that dwgrep's own library did:

dwgrep-libzwerg.x86_64: W: shared-lib-calls-exit /usr/lib64/libzwerg.so.0.1 exit@GLIBC_2.2.5

That was confusing to me: that's not the sort of interface that I'd come up with. Luckily, at that point dwgrep was mature enough that it could be used to find the offender. (Don't worry about the details of the query, we'll get to that later.)

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
        entry (name == "exit") parent @AT_decl_file'
/home/petr/proj/dwgrep/64/libzwerg/lexer.cc

Ah ha! lexer.cc is calling exit. Now, lexer.cc is a file generated from a flex source, lexer.ll, and indeed there's no open-coded call to exit in there. But in the autogenerated code, sure enough:

static void yy_fatal_error (yyconst char* msg , yyscan_t yyscanner)
{
        (void) fprintf( stderr, "%s\n", msg );
        exit( YY_EXIT_FAILURE );
}

The Language

So what are the rules of the expression language?

Dwgrep operates as a stack machine with a twist. The stack that the query operates on contains values of various types: integers and strings, but also DWARF entries, attributes and other artifacts. The DWARF file that you mention on the command line is pushed onto the stack as the initial value:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e ''
<Dwarf "./libzwerg/libzwerg.so.0.1">

The query language is a concatenative language, which means that if you have two valid expressions, you can combine them simply by placing them next to each other. Unlike e.g. Forth, there's a bit more syntax, but the overall philosophy remains.

The simplest expression, apart from an empty expression, is a mere word, e.g. child, parent, entry, or @AT_decl_file. Words take values from the stack and put values back. When several words are written in a row, they hand the stack over to one another.

The twist mentioned earlier is this: expressions can return more than once. In normal concatenative languages, you have one input stack and one output stack. Dwgrep generalizes this: each word has exactly one input stack, but zero or more output stacks.

As it turns out, this simple mechanism is very handy for depth-first exploration of DWARF files. Each word simultaneously filters and extends the search space, and the stacks that come out of the end of the query are the solution.
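To make the "one input stack, zero or more output stacks" idea concrete, here is a minimal Python sketch. It is a toy model and not how dwgrep is implemented: the words numbers, evens and double and the helper run are made up for illustration. Each word is a function from one stack to an iterable of stacks, and writing words next to each other means feeding every output stack of one word to the next word.

```python
from typing import Callable, Iterable, Iterator, Tuple

Stack = Tuple[int, ...]
Word = Callable[[Stack], Iterable[Stack]]


def numbers(stack: Stack) -> Iterator[Stack]:
    """Hypothetical word: pop an integer n, yield one stack per value 0..n-1."""
    *rest, n = stack
    for i in range(n):
        yield (*rest, i)


def evens(stack: Stack) -> Iterator[Stack]:
    """Hypothetical filter: pass the stack through if its top is even, else yield nothing."""
    if stack[-1] % 2 == 0:
        yield stack


def double(stack: Stack) -> Iterator[Stack]:
    """Hypothetical word: replace the top value with twice that value."""
    *rest, top = stack
    yield (*rest, top * 2)


def apply(word: Word, stacks: Iterable[Stack]) -> Iterator[Stack]:
    """Feed every incoming stack to one word and chain all of its output stacks."""
    for stack in stacks:
        yield from word(stack)


def run(words: Iterable[Word], initial: Stack) -> Iterable[Stack]:
    """Run words left to right; the stacks leaving the last word are the results."""
    stacks: Iterable[Stack] = [initial]
    for word in words:
        stacks = apply(word, stacks)
    return stacks
```

Calling list(run([numbers, evens, double], (6,))) collects the stacks that survive the whole pipeline. Because the stages are chained generators, each result is pulled through all words before the next one is started, which is what gives the exploration its depth-first flavour.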

As an example, take the word unit. That expects a stack whose top value is a DWARF object, and produces a number of stacks, one for each compilation unit in the program:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e 'unit'
CU 0
CU 0x493ef
CU 0x5f304
CU 0xdaae5
CU 0xf71e5
[... etc ...]

In the initial example, entry is a word that takes a DWARF object, and produces its debuginfo entries. That is to say, each input stack of entry is supposed to have a DWARF object on top. That object is discarded, and entry then produces one stack for each entry in that DWARF file, with that entry pushed on top of the stack:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e 'entry' | grep '^\['
[b] compile_unit
[29]    namespace
[34]    structure_type
[40]    member
[4b]    typedef
[56]    subprogram
[... etc ...]

name is a word as well: it takes an object, and produces its name, which could be a file name of a DWARF object, or a DW_AT_name of a DIE:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e 'entry name'
/home/petr/proj/dwgrep-older/64/libzwerg/parser.cc
std
integral_constant<bool, false>
value
value_type
operator std::integral_constant<bool, false>::value_type
[... etc ...]

name does not produce anything if there is no name associated with the object.

parent then takes a DIE and produces its parent (if any), and @AT_decl_file takes a DIE and produces the value of its DW_AT_decl_file attribute (again, if any).
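Put together, the initial query reads as a chain of "produce, filter, step, read" operations. The Python sketch below walks the same chain over a hand-built toy tree. The Die class, the helper names and the sample data are all hypothetical and only stand in for real DWARF. In particular, the toy call-site DIE carries its name directly, which glosses over how the real DIE obtains one.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)
class Die:
    """Toy stand-in for a debug info entry."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Die"] = field(default_factory=list)
    parent: Optional["Die"] = None

    def add(self, child: "Die") -> "Die":
        """Attach a child and remember its parent."""
        child.parent = self
        self.children.append(child)
        return child


def entries(root: Die) -> Iterator[Die]:
    """Like 'entry': produce every DIE below (and including) the root."""
    yield root
    for child in root.children:
        yield from entries(child)


def decl_files_of_parents(root: Die, wanted: str) -> Iterator[str]:
    """Toy version of: entry (name == WANTED) parent @AT_decl_file"""
    for die in entries(root):                       # entry
        if die.attrs.get("name") != wanted:         # (name == "..."): filter, keeps the DIE
            continue
        if die.parent is None:                      # parent: nothing for a root DIE
            continue
        decl_file = die.parent.attrs.get("decl_file")   # @AT_decl_file
        if decl_file is not None:                   # nothing if the attribute is absent
            yield decl_file
```

Note that every step may drop the candidate: a DIE without the wanted name, without a parent, or whose parent lacks a decl_file simply produces no output.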

Dwgrep actually has an in-depth tutorial, and a syntax reference that's mostly designed to be read as a tutorial as well. If you want to know more, please read through them. The syntax reference is also where you would look for an explanation of the subexpression (name == "exit"), which is called an "infix assertion".

You may also wish to look at the vocabulary of types of DWARF-related stack values and words applicable to them. That is however not designed as a tutorial at all, but rather as an exhaustive reference. So instead this article will concentrate on bringing some of that in a form that's relatively easy to stomach.

Navigating the DWARF Graph

Under typical circumstances, you operate dwgrep by putting the file name of a binary on the command line, and an object corresponding to that file is pushed onto the stack. For separate debuginfo, all you need to do is mention the name of the main file. If the debuginfo file is installed, elfutils' libdw will locate it for you automatically behind the scenes. You then operate on that file.

Apart from this approach, you could use the word dwopen to convert a string to a DWARF object. It is thus possible to open several files and do some sort of cross-querying in those. For example, one could write a query to find source files used from two different modules:

$ ./dwgrep/dwgrep '
    "dwgrep/dwgrep" dwopen unit root name
    (== "libzwerg/libzwerg.so.0.1" dwopen unit root name)'
/home/petr/proj/dwgrep-older/libzwerg/strip.cc

It turns out that the main dwgrep binary shares one source unit with the libzwerg library.

Dwgrep has fairly solid first-class support for .debug_info and .debug_abbrev, meaning that there are words for direct navigation of entities that live in these sections. Both of them contain units, entries, and attributes. In .debug_info, the concrete types of these entities are called T_CU (for compile unit), T_DIE (for debug info entry) and T_ATTR. In .debug_abbrev, the types are T_ABBREV_UNIT, T_ABBREV and T_ABBREV_ATTR.

To display relations among the types, let's use the following notation: "T₁ (W)→ T₂", which reads: given a value of type T₁, when word W is applied to it, a value of type T₂ is produced. If the operator is "→*" instead, it means "zero or more values are produced". When it is "→?", it means "at most one value is produced".

The following words are available for the motion from more high-level concepts to more low-level ones and back:

- T_DWARF (unit)→* T_CU (entry)→* T_DIE (attribute)→* T_ATTR
- T_DWARF (entry)→* T_DIE
- T_DWARF (abbrev)→* T_ABBREV_UNIT (entry)→* T_ABBREV (attribute)→* T_ABBREV_ATTR
- T_CU (root)→ T_DIE
- T_DIE (unit)→ T_CU

The following are for sideways motion, from .debug_info to .debug_abbrev.

- T_CU (abbrev)→ T_ABBREV_UNIT
- T_DIE (abbrev)→ T_ABBREV

Within .debug_info itself, the following words are available for navigating the graph:

- T_DIE (child)→* T_DIE  # For access to children of a node.
- T_DIE (parent)→? T_DIE # For access to parent of a node.
- T_DIE (root)→ T_DIE    # For access from a node to CU DIE.
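The relations listed so far can be read as a lookup table from (input type, word) to (output type, multiplicity). The following Python sketch encodes exactly the arrows from the lists above and checks whether a chain of words fits together. It is only an illustration of the notation: the table RELATIONS and the function result_type are hypothetical names, and real dwgrep words have more overloads than are listed here.

```python
from typing import Dict, List, Tuple

# (input type, word) -> (output type, multiplicity).
# "*" means zero or more values, "?" at most one, "1" a single value.
RELATIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("T_DWARF", "unit"): ("T_CU", "*"),
    ("T_DWARF", "entry"): ("T_DIE", "*"),
    ("T_DWARF", "abbrev"): ("T_ABBREV_UNIT", "*"),
    ("T_CU", "entry"): ("T_DIE", "*"),
    ("T_CU", "root"): ("T_DIE", "1"),
    ("T_CU", "abbrev"): ("T_ABBREV_UNIT", "1"),
    ("T_DIE", "attribute"): ("T_ATTR", "*"),
    ("T_DIE", "unit"): ("T_CU", "1"),
    ("T_DIE", "abbrev"): ("T_ABBREV", "1"),
    ("T_DIE", "child"): ("T_DIE", "*"),
    ("T_DIE", "parent"): ("T_DIE", "?"),
    ("T_DIE", "root"): ("T_DIE", "1"),
    ("T_ABBREV_UNIT", "entry"): ("T_ABBREV", "*"),
    ("T_ABBREV", "attribute"): ("T_ABBREV_ATTR", "*"),
}


def result_type(start: str, words: List[str]) -> str:
    """Follow the arrows for a chain of words and return the type that comes out."""
    current = start
    for word in words:
        key = (current, word)
        if key not in RELATIONS:
            raise TypeError(f"{word!r} is not listed for {current}")
        current, _multiplicity = RELATIONS[key]
    return current
```

For instance, result_type("T_DWARF", ["unit", "root", "child", "attribute"]) walks from the file to a CU, to its root DIE, to that DIE's children, and ends at attributes.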

There are more types than those mentioned, e.g. location expressions and address sets. The DWARF vocabulary reference mentioned above lists them all; we won't go through them in detail here.

Cooking The Dwarfs

One word that we haven't used so far is attribute. Let's go back to the original example, slightly modified:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "exit") ?TAG_GNU_call_site'
[551cc] GNU_call_site
    low_pc (addr)   0x37d19;
    abstract_origin (ref4)  [5f112];

And now let's dump the attributes:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "exit") ?TAG_GNU_call_site attribute'
low_pc (addr)   0x37d19;
abstract_origin (ref4)  [5f112];
external (flag_present) true;
name (strp) exit;
decl_file (data1)   /usr/include/stdlib.h;
decl_line (data2)   543;

Curiously, we end up with quite a few more attributes than the previous dump showed. The reason is that the DWARF file that we work with is cooked. .debug_info values come in two flavors: cooked and raw. Raw values remain faithful to the underlying representation, but cooked ones interpret things. Thus in the previous example, the attribute word pulled in attributes from the DIE referenced by DW_AT_abstract_origin. It would likewise pull attributes from DW_AT_specification, and it would keep pulling if the specification DIE in turn contained more such attributes. For example, consider the following case:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site'
[36f1ba]    GNU_call_site
    low_pc (addr)   0x60ed7;
    abstract_origin (ref4)  [3680b6];
    sibling (ref4)  [36f1e5];

Let's follow the rabbit down the hole:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site @AT_abstract_origin'
[3680b6]    subprogram
    abstract_origin (ref4)  [368084];
    linkage_name (strp) _ZN11overload_opC2ESt10shared_ptrI2opE17overload_instance;
    low_pc (addr)   0x600d0;
    high_pc (data8) 33;
    frame_base (exprloc)    0..0xffffffffffffffff:[0:call_frame_cfa];
    object_pointer (ref4)   [3680da];
    GNU_all_call_sites (flag_present)   true;
    sibling (ref4)  [368122];

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site
    @AT_abstract_origin @AT_abstract_origin'
[368084]    subprogram
    specification (ref4)    [34c2b1];
    inline (data1)  DW_INL_not_inlined;
    object_pointer (ref4)   [368093];
    sibling (ref4)  [3680b6];

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site
    @AT_abstract_origin @AT_abstract_origin @AT_specification'
[34c2b1]    subprogram
    external (flag_present) true;
    name (strp) overload_op;
    decl_file (data1)   /home/petr/proj/dwgrep-older/libzwerg/overload.cc;
    decl_line (data1)   205;
    accessibility (data1)   DW_ACCESS_public;
    declaration (flag_present)  true;
    object_pointer (ref4)   [34c2c1];
    sibling (ref4)  [34c2d1];

... and finally we are at the bottom. attribute will bring in all these attributes to one view, pruning duplicates (except for DW_AT_abstract_origin and DW_AT_specification themselves, which seems like a bug).

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site attribute'
low_pc (addr)   0x60ed7;
abstract_origin (ref4)  [3680b6];
sibling (ref4)  [36f1e5];
abstract_origin (ref4)  [368084];
linkage_name (strp) _ZN11overload_opC2ESt10shared_ptrI2opE17overload_instance;
high_pc (data8) 33;
frame_base (exprloc)    0..0xffffffffffffffff:[0:call_frame_cfa];
object_pointer (ref4)   [3680da];
GNU_all_call_sites (flag_present)   true;
specification (ref4)    [34c2b1];
inline (data1)  DW_INL_not_inlined;
external (flag_present) true;
name (strp) overload_op;
decl_file (data1)   /home/petr/proj/dwgrep-older/libzwerg/overload.cc;
decl_line (data1)   205;
accessibility (data1)   DW_ACCESS_public;

The word raw produces a raw view of the object it's applied to. When we use attribute on such an object, only the attributes that are actually present at the object are produced:

$ ./dwgrep/dwgrep ./libzwerg/libzwerg.so.0.1 -e '
    entry (name == "overload_op") ?DW_TAG_GNU_call_site raw attribute'
low_pc (addr)   0x60ed7;
abstract_origin (ref4)  [3680b6];
sibling (ref4)  [36f1e5];
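The difference between the two views can be sketched in a few lines of Python. This is a guess at the behaviour inferred from the dumps above, not dwgrep's actual code: the data layout and the function names are hypothetical. The cooked view follows abstract_origin and specification references, always shows those two attributes, and otherwise keeps only the first occurrence of each attribute name. The dumps also show that declaration from a referenced DIE does not appear in the merged view, so the sketch skips it there. That rule is an inference from the output as well.

```python
from typing import Dict, Iterator, List, Tuple

Attr = Tuple[str, object]        # (attribute name, value)
Dies = Dict[int, List[Attr]]     # DIE offset -> attributes stored at that DIE

REFERENCES = ("abstract_origin", "specification")
OMITTED_WHEN_FOLLOWED = ("declaration",)   # inferred from the dumps, see above


def raw_attributes(dies: Dies, offset: int) -> Iterator[Attr]:
    """Raw view: only what is physically stored at the DIE."""
    yield from dies[offset]


def cooked_attributes(dies: Dies, offset: int) -> Iterator[Attr]:
    """Cooked view: merge in attributes of referenced DIEs, pruning duplicates."""
    seen_names = set()
    visited = set()
    pending = [offset]
    while pending:
        current = pending.pop(0)
        if current in visited:          # guard against reference cycles
            continue
        visited.add(current)
        for name, value in dies[current]:
            if name in REFERENCES:
                pending.append(value)   # keep digging through this reference
                yield name, value       # references are shown every time
            elif current != offset and name in OMITTED_WHEN_FOLLOWED:
                continue
            elif name not in seen_names:
                seen_names.add(name)
                yield name, value
```

Fed with the four DIEs from the overload_op example, this ordering reproduces the sequence of attribute names in the cooked dump, including the repeated abstract_origin.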

Other words sensitive to value "doneness" are child and parent. When child encounters a DW_TAG_imported_unit tag, it suppresses yielding of this DIE, and instead recursively produces the DIEs referenced by that import point. It also makes a note of which import point the imported DIEs were brought in by.

The word parent then makes use of that note. When it traverses to a root node, it checks whether an import point was remembered, and if yes, it produces that import point's parent instead of the root DIE.
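The interplay between child and parent can be modelled as follows. This Python sketch is one possible reading of the description above and not dwgrep's implementation: the Die and CookedDie classes and both functions are hypothetical, and the sketch assumes that the DIEs "referenced by the import point" are the children of the imported unit's root DIE.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Die:
    """Toy DIE; 'imports' is only set on imported_unit DIEs."""
    tag: str
    children: List["Die"] = field(default_factory=list)
    parent: Optional["Die"] = None      # None for the root DIE of a unit
    imports: Optional["Die"] = None     # root DIE of the imported unit


@dataclass(eq=False)
class CookedDie:
    """A DIE plus the note about which import point brought it in, if any."""
    die: Die
    import_point: Optional["CookedDie"] = None


def cooked_children(node: CookedDie) -> Iterator[CookedDie]:
    """Like cooked 'child': hide import points, splice in what they import."""
    for child in node.die.children:
        if child.tag == "imported_unit" and child.imports is not None:
            point = CookedDie(child, node.import_point)
            imported_root = CookedDie(child.imports, point)
            yield from cooked_children(imported_root)   # recursion handles nested imports
        else:
            yield CookedDie(child, node.import_point)


def cooked_parent(node: CookedDie) -> Optional[CookedDie]:
    """Like cooked 'parent': at an imported unit's root, jump to the import point's parent."""
    parent = node.die.parent
    if parent is None:
        return None
    if parent.parent is None and node.import_point is not None:
        return cooked_parent(node.import_point)
    return CookedDie(parent, node.import_point)
```

With this model, a DIE that physically lives in a partial unit appears as a child of the compile unit that imports it, and asking for its parent leads back to that compile unit rather than to the partial unit's root.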

Now, What?

Should you need more information, start with the dwgrep documentation mentioned above: the tutorial, the syntax reference, and the vocabulary reference.

Other than that, let me know what you think and what you would like to see. If something doesn't work, please file issues in the issue tracker.

Last updated: April 6, 2018