Tools that analyze C and C++ programs on Linux systems have to cope with several debugging formats. This article describes the formats supported by libabigail, a library and set of tools for analyzing the application binary interface (ABI) of binaries, and explains how the redesign of libabigail supports multiple formats.
Debugging formats considered for libabigail
On Linux and most Unix systems, C and C++ programs are compiled into the Executable and Linkable Format (ELF). When the programmer asks the GCC compiler to add debugging information, that information is usually in the DWARF format, but the Compact C Type Format (CTF) is also supported.
Outside of C programs, two other interesting formats are the BPF Type Format (BTF), which is used for Berkeley Packet Filter (BPF) programs in the Linux kernel, and ABIXML, an XML format emitted by the abidw tool to represent the application binary interface (ABI) of a binary.
DWARF format design
Initially, DWARF was the only format supported by libabigail. The software pipeline architecture had a "front-end" DWARF reader that analyzed the input ELF binary and its associated debug information in the DWARF format and constructed an in-memory intermediate representation called an ABI Corpus. The ABI Corpus could be further analyzed and manipulated by "middle-end" components. The result of the analysis was then emitted in the form of reports by "back-end" components.
This architecture resembled the one used by compilers with their front-end, middle-end, and back-end components. The front end in libabigail analyzed the input binary to construct the intermediate representation.
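The three stages can be pictured with a toy pipeline. The following is only a sketch of the idea, not libabigail code: the `Corpus` struct, the whitespace-separated "input format," and the "removed function" analysis are all hypothetical stand-ins chosen to keep the example small.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the intermediate representation (the "ABI corpus").
struct Corpus {
    std::string origin;                  // which front end produced this corpus
    std::vector<std::string> functions;  // names of the exported functions
};

// Front end: reads an input and builds the intermediate representation.
// This toy version treats the input as a whitespace-separated list of names.
Corpus toy_front_end(const std::string& input) {
    Corpus corpus{"toy", {}};
    std::istringstream in(input);
    for (std::string name; in >> name;)
        corpus.functions.push_back(name);
    return corpus;
}

// Middle end: analyzes corpora without knowing where they came from.
// Here it lists the functions present in `before` but missing from `after`.
std::vector<std::string> removed_functions(const Corpus& before,
                                           const Corpus& after) {
    std::vector<std::string> removed;
    for (const auto& name : before.functions) {
        bool still_there = std::find(after.functions.begin(),
                                     after.functions.end(),
                                     name) != after.functions.end();
        if (!still_there)
            removed.push_back(name);
    }
    return removed;
}

// Back end: turns the result of the analysis into a report.
std::string report(const std::vector<std::string>& removed) {
    std::string out;
    for (const auto& name : removed)
        out += "removed function: " + name + "\n";
    return out;
}
```

Because the middle and back ends only see a `Corpus`, a second front end for another input format could be added without touching them, which is the property the rest of this article relies on.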
How the CTF front end evolved
After the 2.0 release of libabigail, members of the community expressed the need to support CTF, which called for a new front end. It's worth noting that other members of the community in the past had explored similar avenues by adding some experimental code to support BTF. So the community had already considered the need for multiple front ends.
As is often the case in our collaborative free software communities, support for CTF was added in an iterative manner.
The first iteration added a front-end CTF reader that analyzed ELF binaries and their accompanying CTF debug information, constructed the appropriate intermediate representation, and handed it over to the existing middle-end components. This project was a success and made its way into the 2.1 release.
However, several aspects could be improved. To accommodate two front ends, we made heavy use of preprocessor-based conditional compilation and code duplication. For instance, both the existing DWARF front end and the new CTF front end needed to process the ELF properties of input binaries. A lot of the ELF processing was thus duplicated in the CTF front end, which we recognized would increase our maintenance burden.
We, therefore, decided to find a better way to support several ELF-based front ends that share the same ELF-processing capabilities.
A related issue concerned how to choose the proper front end. The libabigail framework supports numerous tools that use it to analyze an input binary. These tools need to know which front end to instantiate. For instance, if the input ELF binary contains only CTF debug information, the tool needs to instantiate the CTF front end to perform the analysis. Conversely, if the ELF binary contains only DWARF debug information, the tool instead needs to instantiate the DWARF front end.
Tools, therefore, need basic ELF inspection capabilities. In libabigail parlance, tools need a basic ELF front end to know which format-specific front end to instantiate for further analysis.
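As a rough illustration of that selection step, the sketch below picks a front end from the names of the sections found in an ELF file. DWARF type information conventionally lives in the `.debug_info` section and CTF in the `.ctf` section; everything else here (the `FrontEndKind` enum, the `choose_front_end` function, and the policy of preferring DWARF when both are present) is an assumption made for the example and not a description of libabigail's actual logic.

```cpp
#include <set>
#include <string>

// Hypothetical tag for the front end a tool should instantiate.
enum class FrontEndKind { None, Dwarf, Ctf };

// Pick a front end from the section names of an ELF binary.
// Assumed policy: prefer DWARF when both kinds of debug info are present.
FrontEndKind choose_front_end(const std::set<std::string>& section_names) {
    if (section_names.count(".debug_info"))
        return FrontEndKind::Dwarf;
    if (section_names.count(".ctf"))
        return FrontEndKind::Ctf;
    return FrontEndKind::None;
}
```

A real tool would obtain the section names from the ELF file itself, which is exactly the kind of basic ELF inspection the paragraph above calls for.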
The resulting multi-front-end architecture
To accommodate these new requirements, the libabigail community came up with a new multi-front-end architecture, in which each front end implements a common front-end interface named abigail::fe_iface. For instance, the basic ELF reader that inspects the relevant ELF properties of an ELF binary and exposes them to tools is named abigail::elf::reader and inherits from abigail::fe_iface.
The generic behaviors and properties of ELF-based front ends are abstracted out in the abigail::elf_based_reader class. Logically enough, the DWARF and CTF front ends are named abigail::dwarf::reader and abigail::ctf::reader, respectively.
Note that the methods and properties specific to these two front ends are not available to client code. Rather, once created, these front ends are accessible to client code only as instances of the abigail::elf_based_reader class. This makes it possible to evolve the implementation of the actual front ends in the future while limiting the impact of those changes on client code.
The only parts of the CTF and DWARF front ends that client code accesses directly are the factory functions that create them: abigail::ctf::create_reader() and abigail::dwarf::create_reader(). These factory functions return a pointer to the abstract abigail::elf_based_reader interface, shielding the client code from the actual implementation of the DWARF or CTF reader.
Besides ELF-based input, libabigail tools deal with ABIXML files, which they treat like binary ELF input files. In that spirit, libabigail has a front-end class for ABIXML files named abigail::abixml::reader. That class inherits directly from abigail::fe_iface and can be instantiated by the factory function abigail::abixml::create_reader().
The CTF and DWARF front ends as well as the tools using them have been redesigned to rest on this new multi-front-end architecture as part of the 2.2 release.
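The shape of this hierarchy can be modeled in a few lines. The sketch below deliberately lives in a hypothetical `sketch` namespace and is not the libabigail source: the class and factory names mirror the ones mentioned above, but the single `read_corpus()` method returning a string, the constructors, and the choice to derive `elf_based_reader` from `elf::reader` are simplifying assumptions.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// Root interface shared by every front end (models abigail::fe_iface).
class fe_iface {
public:
    virtual ~fe_iface() = default;
    // Simplified: returns a description instead of a real ABI corpus.
    virtual std::string read_corpus() = 0;
};
using fe_iface_sptr = std::shared_ptr<fe_iface>;

namespace elf {
// Basic ELF reader: knows about ELF properties only.
class reader : public fe_iface {
public:
    explicit reader(std::string path) : path_(std::move(path)) {}
    const std::string& path() const { return path_; }
    std::string read_corpus() override { return "ELF symbols for " + path_; }
private:
    std::string path_;
};
} // namespace elf

// Behavior common to all ELF-based front ends.
class elf_based_reader : public elf::reader {
public:
    explicit elf_based_reader(std::string path)
        : elf::reader(std::move(path)) {}
};
using elf_based_reader_sptr = std::shared_ptr<elf_based_reader>;

namespace dwarf {
namespace {
// Concrete reader, hidden from client code in an unnamed namespace.
class reader : public elf_based_reader {
public:
    explicit reader(std::string path) : elf_based_reader(std::move(path)) {}
    std::string read_corpus() override { return "DWARF corpus for " + path(); }
};
} // namespace
// The factory is the only entry point client code sees.
elf_based_reader_sptr create_reader(const std::string& path) {
    return std::make_shared<reader>(path);
}
} // namespace dwarf

namespace ctf {
namespace {
class reader : public elf_based_reader {
public:
    explicit reader(std::string path) : elf_based_reader(std::move(path)) {}
    std::string read_corpus() override { return "CTF corpus for " + path(); }
};
} // namespace
elf_based_reader_sptr create_reader(const std::string& path) {
    return std::make_shared<reader>(path);
}
} // namespace ctf

namespace abixml {
// ABIXML input is not ELF, so this reader derives from fe_iface directly.
class reader : public fe_iface {
public:
    explicit reader(std::string path) : path_(std::move(path)) {}
    std::string read_corpus() override { return "ABIXML corpus for " + path_; }
private:
    std::string path_;
};
fe_iface_sptr create_reader(const std::string& path) {
    return std::make_shared<reader>(path);
}
} // namespace abixml

} // namespace sketch
```

Client code written against this model holds only `elf_based_reader_sptr` or `fe_iface_sptr` values, so the concrete reader classes can change freely as long as the factories keep their signatures.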
Installing libabigail
On Fedora Linux, libabigail is available from the distribution's package repositories. On Enterprise Linux distributions, you also need to enable the Extra Packages for Enterprise Linux (EPEL) repository.
Once your distribution is set up, install the libabigail development package using:
$ sudo dnf install libabigail-devel
Make sure you have libabigail 2.2 or later installed, which should be the case with any recent Fedora or Enterprise Linux distribution.
Example tool using libabigail
The libabigail tools now have the facilities to instantiate the appropriate front end in a straightforward manner, based on the content of the input file they are presented with. Let's look at a hypothetical tool named emit-abixml. Its source code is in the file emit-abixml.cc, written in C++ using libabigail.
Compile the emit-abixml.cc file with this command:
$ g++ -g -I/usr/include/libabigail -o emit-abixml emit-abixml.cc -labigail
Once the emit-abixml tool is built, we can use it, for instance, to emit the ABIXML representation of itself by doing:
$ ./emit-abixml emit-abixml
This command emits the ABIXML representation of the emit-abixml tool itself on the standard output. How cool is that?
Now, let's look at the content of emit-abixml.cc. It's heavily commented for the pleasure of astute readers.
#include <iostream>
#include <string>
#include <vector>
#include <abg-tools-utils.h>
#include <abg-writer.h>

int
main(int argc, char **argv)
{
  // The tool expects exactly one argument: the path to the input binary.
  // argv[0] is the name of the tool itself, so the input is argv[1].
  if (argc != 2)
    {
      std::cerr << "usage: " << argv[0] << " <path-to-binary>\n";
      return 1;
    }

  // This is the path to the input binary file given on the command line.
  std::string input_path = argv[1];

  // The kind of input we expect. ARTIFICIAL_ORIGIN means we'll let the
  // machinery determine it.
  abigail::corpus::origin input_kind = abigail::corpus::ARTIFICIAL_ORIGIN;

  // This is where to look for debug information, in case it is stored in
  // separate locations.
  std::vector<char**> debug_info_paths;

  // This is the environment used by the libabigail framework.
  abigail::ir::environment env;

  // Let's now instantiate the correct front end that knows how to handle
  // the input binary file.
  abigail::elf_based_reader_sptr front_end =
    abigail::tools_utils::create_best_elf_based_reader(input_path,
                                                       debug_info_paths,
                                                       env, input_kind,
                                                       /*show_all_types=*/false);

  // OK, let's use the front end to analyze the input binary file and get
  // its ABI representation.
  abigail::fe_iface::status final_status;
  abigail::corpus_sptr abi_repr = front_end->read_corpus(final_status);

  // Let's make sure the analysis went well.
  bool has_error = !(final_status & abigail::fe_iface::STATUS_OK);

  if (!has_error)
    {
      // Emit the ABIXML output.
      abigail::xml_writer::write_context_sptr writer =
        abigail::xml_writer::create_write_context(env, std::cout);
      abigail::xml_writer::write_corpus(*writer, abi_repr, /*indentation=*/0);
    }

  // Exit with 0 on success and 1 on failure.
  return has_error ? 1 : 0;
}
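The listing tests the status with a bitwise AND rather than an equality comparison, which is the usual idiom when a status value is a set of flags that can be combined. The sketch below shows that idiom in isolation. The `Status` enum, its flag names other than STATUS_OK, and the numeric values are hypothetical stand-ins for this example; they are not copied from the libabigail headers.

```cpp
#include <string>

// Toy bit-flag status type. Each flag occupies its own bit so that several
// conditions can be reported at once. Names and values are illustrative only.
enum Status : unsigned {
    STATUS_UNKNOWN = 0,
    STATUS_OK = 1,
    STATUS_DEBUG_INFO_NOT_FOUND = 1 << 1,
    STATUS_NO_SYMBOLS_FOUND = 1 << 2,
};

// True when the OK bit is set, whatever other bits accompany it.
bool succeeded(unsigned status) {
    return (status & STATUS_OK) != 0;
}

// Build a human-readable description of every bit that is set.
std::string describe(unsigned status) {
    std::string out = succeeded(status) ? "ok" : "failed";
    if (status & STATUS_DEBUG_INFO_NOT_FOUND)
        out += ", debug info not found";
    if (status & STATUS_NO_SYMBOLS_FOUND)
        out += ", no symbols found";
    return out;
}
```

With this model, a status of `STATUS_OK | STATUS_DEBUG_INFO_NOT_FOUND` still counts as a success, whereas comparing it to `STATUS_OK` with `==` would wrongly report a failure.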
The new architecture simplifies the libabigail framework
The new multi-front-end architecture makes it easier to add new kinds of front ends to the libabigail framework, and therefore to support new debugging information formats in the future. It also simplifies the code of existing tools by giving them a single way to instantiate the appropriate front end for the task.
To learn more about the overall API of the libabigail framework, read its API documentation website. You can also engage with the development community via the mailing list.