
Libabigail is a framework dedicated to analyzing changes to application binary interfaces (ABIs) in ELF binaries. Libabigail 2.0, the latest major release of the framework, was released in October of 2021.

This article is a tour of the main changes delivered in this major release. You'll learn about changes to the core library to start, then move on to updates to specific ABI analysis tools and the library's licensing terms.

New symbol table reader component

The ELF/DWARF front end of the Libabigail core library was refactored to use a new ELF symbol table reader module. This new module was factored out of the front end and designed as a standalone module in its own namespace, dubbed symtab_reader. That namespace contains various components that collaborate to provide a consistent view of a symbol table that can be loaded and queried.

One main feature of the new symtab_reader is that it transparently handles the different binary formats of symbol tables of the Linux kernel, as well as details like relocations, symbol namespaces, call frame information indirections, and differences between vmlinux binaries and kernel modules. It also incorporates Libabigail's existing support for the specifics of the PPC64 ELF symbol tables.

symtab_reader is now used by the DWARF/ELF reader front end during the analysis of binaries, and enhances the maintainability of the code base as a whole.
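To make the idea of a symbol table that can be "loaded and queried" more concrete, here is a minimal sketch in C++. It is not Libabigail's actual symtab_reader API: the Symbol and SymbolTable types and the find_by_name and find_by_address functions are hypothetical names chosen for illustration, and real ELF symbols carry many more properties than the three fields shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified symbol record.
struct Symbol {
  std::string name;
  std::uint64_t address;
  bool is_function;
};

// Hypothetical queryable view over a set of symbols.
class SymbolTable {
 public:
  // Takes ownership of the symbols and builds the lookup indexes.
  void load(std::vector<Symbol> symbols) {
    symbols_ = std::move(symbols);
    by_name_.clear();
    by_address_.clear();
    for (std::size_t i = 0; i < symbols_.size(); ++i) {
      // If a name appears twice, the first occurrence wins.
      by_name_.emplace(symbols_[i].name, i);
      by_address_[symbols_[i].address].push_back(i);
    }
  }

  // Returns the symbol with the given name, or nullptr if there is none.
  const Symbol* find_by_name(const std::string& name) const {
    auto it = by_name_.find(name);
    return it == by_name_.end() ? nullptr : &symbols_[it->second];
  }

  // Returns every symbol at the given address, so aliases come back together.
  std::vector<const Symbol*> find_by_address(std::uint64_t address) const {
    std::vector<const Symbol*> result;
    auto it = by_address_.find(address);
    if (it != by_address_.end())
      for (std::size_t i : it->second)
        result.push_back(&symbols_[i]);
    return result;
  }

 private:
  std::vector<Symbol> symbols_;
  std::unordered_map<std::string, std::size_t> by_name_;
  std::map<std::uint64_t, std::vector<std::size_t>> by_address_;
};
```

Indexing by address as well as by name is one simple way to keep aliased symbols together; the real module also has to cope with the kernel-specific and PPC64-specific details mentioned above, which this sketch leaves out entirely.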

Correctness of type canonicalization

One of the many constraints that Libabigail has to deal with is that ABI comparison has to be fast enough to be useful. In essence, however, comparing aggregate types of arbitrary size and depth is a quadratic problem, so naively attempting to compare types coming from real-life C or C++ binaries can literally take hours on the average workstation.

Libabigail uses a certain number of heuristics to complete those type comparisons in a reasonable time. One of those heuristics is the type canonicalization optimization.

Intuitively, the time taken by the comparison of two aggregate types of size N is proportional to N², and that same amount of time is spent whenever those two types are compared. The goal of the type canonicalization optimization is to transform that N² time back into a (relatively short) constant time. In terms of processing, the first comparison between those two types would still take a time proportional to N², but all subsequent comparisons would then be performed in that shorter amount of time. This type canonicalization optimization is vital because slow comparisons can make the framework effectively unusable.
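The following C++ sketch illustrates the general idea of canonicalization; it is not Libabigail's implementation. The Type and Canonicalizer classes are hypothetical, types are reduced to a name plus a list of member types, and recursive (self-referential) types are deliberately ignored. Each type is compared structurally once, when it is assigned a canonical representative; after that, comparing two types is a single pointer comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified model of an aggregate type.
struct Type {
  std::string name;
  std::vector<const Type*> members;
  const Type* canonical;  // set once by Canonicalizer::canonicalize()

  explicit Type(std::string n, std::vector<const Type*> m = {})
      : name(std::move(n)), members(std::move(m)), canonical(nullptr) {}
};

// Counts calls to the slow path, so its cost can be observed.
static std::size_t structural_comparisons = 0;

// Slow path: walks both types member by member.
// Assumption: no recursive types, otherwise this would not terminate.
bool structurally_equal(const Type& a, const Type& b) {
  ++structural_comparisons;
  if (a.name != b.name || a.members.size() != b.members.size())
    return false;
  for (std::size_t i = 0; i < a.members.size(); ++i)
    if (!structurally_equal(*a.members[i], *b.members[i]))
      return false;
  return true;
}

class Canonicalizer {
 public:
  // Gives t a canonical representative: the first structurally equal type
  // already registered under the same name, or t itself if there is none.
  const Type* canonicalize(Type& t) {
    if (t.canonical)
      return t.canonical;
    std::vector<const Type*>& bucket = by_name_[t.name];
    for (const Type* candidate : bucket)
      if (structurally_equal(t, *candidate))
        return t.canonical = candidate;
    bucket.push_back(&t);
    return t.canonical = &t;
  }

 private:
  std::unordered_map<std::string, std::vector<const Type*>> by_name_;
};

// Fast path: equality of canonicalized types is a pointer comparison.
bool canonically_equal(const Type& a, const Type& b) {
  assert(a.canonical && b.canonical);  // both must be canonicalized first
  return a.canonical == b.canonical;
}
```

With this scheme, the expensive walk happens at most once per type, inside canonicalize(). Every later call to canonically_equal() costs the same small amount regardless of how large the types are.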

In Libabigail 2.0, this optimization is more precise and robust. It now works better with anonymous types, and several general bugs have been addressed. The correctness of type change detection has thus been improved.

Support for DWARF 5

Support for version 5 of DWARF has been improved in Libabigail 2.0. As a consequence, binaries emitted by GCC 11 and Clang 13 that use DWARF 5 are now fairly well supported.

Debugging and bug fixes

A number of debugging primitives have been introduced to ease the inspection of the internal representation of types while debugging Libabigail using a debugger like GDB. This makes it significantly easier to debug issues that involve type representation in the framework.

Several bugs have been fixed throughout the framework, leading to better support for ELF aliased symbols, improved support for multi-language binaries, improved filtering of data member offset changes, better sorting of anonymous types in ABIXML files, better handling of member function changes in union types, and improved support for function parameter addition detection.

Updates to abidw

abidw now emits ABIXML files in a new format, dubbed version 2.0.

Due to a change in how this version of Libabigail interprets the DWARF DW_AT_bit_offset attribute, the offset values of data members might have changed in an incompatible way, which has driven the change in the ABIXML format. Libabigail can still read ABIXML files in the previous 1.0 format, though.
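This article does not detail what exactly changed in Libabigail's interpretation, so the following is background only. In DWARF's legacy bit-field encoding, DW_AT_bit_offset counts bits from the most significant bit of the storage unit that holds the field, which means turning it into an offset from the start of the enclosing structure depends on the target's endianness. The C++ helper below is a hypothetical sketch of that general conversion, not code from Libabigail.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Libabigail.
// Converts legacy DWARF bit-field attributes into an offset, in bits,
// from the start of the enclosing structure.
//   member_byte_offset: byte offset of the storage unit (DW_AT_data_member_location)
//   storage_byte_size:  size of the storage unit in bytes (DW_AT_byte_size)
//   bit_offset:         DW_AT_bit_offset, counted from the storage unit's
//                       most significant bit
//   bit_size:           width of the bit field (DW_AT_bit_size)
std::uint64_t bit_offset_from_struct_start(std::uint64_t member_byte_offset,
                                           std::uint64_t storage_byte_size,
                                           std::uint64_t bit_offset,
                                           std::uint64_t bit_size,
                                           bool little_endian) {
  const std::uint64_t base = member_byte_offset * 8;
  if (little_endian) {
    // The most significant bit sits at the far end of the storage unit,
    // so the field's position is measured back from there.
    return base + storage_byte_size * 8 - bit_offset - bit_size;
  }
  // On big-endian targets the most significant bit comes first in memory.
  return base + bit_offset;
}
```

For example, a 3-bit field stored at the start of a 4-byte unit on a little-endian target is typically described with a DW_AT_bit_offset of 29, which this helper maps back to bit 0 of the structure. A tool that reads the raw attribute value differently will report different data member offsets, which is why a change of interpretation can make old and new ABIXML files disagree.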

Updates to abipkgdiff

abipkgdiff should now properly show binary files that were either added to or removed from packages. It also no longer erases the working directory used for binary comparison before it has finished using that directory's contents. The source code of this tool was also fixed to prevent some compilation warnings.

Updates to abidiff

Libabigail 2.0 includes general bug fixes in abidiff. Data members are no longer qualified in diff reports. The --dump-diff-tree option now works in the leaf reporting mode.

abidiff has also been modified to accommodate the ABIXML format change. abidiff will refuse to compare a file written in ABIXML 1.0 against one written in ABIXML 2.0; the two files must have the same version number. Note that you can always regenerate an ABIXML 2.0 file from its originating ELF binary using the current abidw tool.
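If you keep ABIXML files around in a build pipeline, a quick pre-check can tell you whether two files are even comparable before you hand them to abidiff. The C++ sketch below assumes that the format version is stored in a version attribute on the abi-corpus (or abi-corpus-group) root element; treat that as an assumption to verify against your own files. The abixml_version and same_abixml_version functions are hypothetical helpers, and the regular expression is a shortcut rather than a real XML parser.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the value of the version attribute from the
// first abi-corpus (or abi-corpus-group) element found in the given text.
// Returns an empty string if no such attribute is found.
std::string abixml_version(const std::string& xml) {
  static const std::regex re(
      R"re(<abi-corpus[^>]*?\sversion\s*=\s*['"]([^'"]+)['"])re");
  std::smatch match;
  return std::regex_search(xml, match, re) ? match[1].str() : std::string();
}

// Two ABIXML documents are treated as comparable only if both declare a
// version and the two versions are identical.
bool same_abixml_version(const std::string& first, const std::string& second) {
  const std::string v1 = abixml_version(first);
  const std::string v2 = abixml_version(second);
  return !v1.empty() && v1 == v2;
}
```

When the check fails, the fix described above applies: regenerate the older file from its ELF binary with the current abidw so that both sides use the same format.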

Move to the Apache 2 license

Libabigail's codebase is now licensed under the Apache 2 license, with the LLVM exception. It was previously licensed under version 3 of the LGPL. This license change means that the Libabigail library is now compatible with most copyleft and non-copyleft code, increasing the amount of the code that can use the Libabigail library to perform ABI analysis.

Move to C++11

Libabigail's source code is now written in C++11, at long last. As we need to make sure the framework builds on the Red Hat Enterprise Linux 7 platform, we intend to stick with C++11 for the rest of the expected lifetime of that platform—meaning at least until June of 2024. After that date, we'll consider moving the source base to a more modern version of the C++ language.

Find out more about Libabigail 2.0

We hope you enjoyed this tour of the changes in Libabigail 2.0. If you are interested in learning more about the project, you can get involved with our mailing list. Libabigail users and developers are also available on the irc.oftc.net IRC network, in the #libabigail channel. We'd be happy to see you there!