How to write an ABI compliance checker using Libabigail

I've previously written about the challenges of ensuring forward compatibility for application binary interfaces (ABIs) exposed by native shared libraries. This article introduces the other side of the equation: How to verify ABI backward compatibility for upstream projects.

If you've read my previous article, you've already been introduced to Libabigail, a static-code analysis and instrumentation library for constructing, manipulating, serializing, and de-serializing ABI-relevant artifacts.

In this article, I'll show you how to build a Python-based checker that uses Libabigail to verify the backward compatibility of ABIs in a shared library. For this case, we'll focus on ABIs for shared libraries in the Executable and Linkable Format (ELF) that run on Linux-based operating systems.

Note: This tutorial assumes that you have Libabigail and its associated command-line tools, abidw and abidiff, installed and set up in your development environment. See the Libabigail documentation for a guide to getting and installing Libabigail.

Ensuring backward compatibility

If we state that the ABI of a newer version of a shared library is backward compatible, we're assuring our users that ABI changes in the newer version of the library won't affect applications linked against older versions. This means application functionality won't change or be disrupted in any way, even for users who update to the newer version of the library without recompiling their application.

To make such a statement with confidence, we need a way to compare the ABI of the newer library version against the older one. Knowing what the ABI changes are, we'll then be able to determine whether any change is likely to break backward compatibility.
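As a rough mental model only, the comparison can be pictured as a diff between two sets of exported symbol names. The sketch below is a toy: the symbol lists and both function names are invented for illustration, and a real ABI comparison such as the one Libabigail performs also examines types, sizes, and layouts, which this ignores.

```python
def compare_exported_symbols(old_symbols, new_symbols):
    """Toy ABI comparison: diff two collections of exported symbol names."""
    old_symbols, new_symbols = set(old_symbols), set(new_symbols)
    return {
        "removed": sorted(old_symbols - new_symbols),
        "added": sorted(new_symbols - old_symbols),
    }


def is_backward_compatible(diff):
    # In this toy model, only removed symbols count as breaking:
    # an application linked against the old library may still need them.
    return not diff["removed"]


# Hypothetical symbol lists for an old and a new library version.
old = {"function_1", "function_2"}
new = {"function_2", "function_3"}

diff = compare_exported_symbols(old, new)
print(diff)
print(is_backward_compatible(diff))
```

Added symbols are treated as harmless here because applications built against the older version cannot depend on them.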

The example project: libslicksoft.so

For the sake of this article, let's assume I'm the release manager for a free software project named SlickSoftware. I have convinced you (my fellow hacker) that the ABI of our library, libslicksoft.so, should be backward compatible with older versions, at least for now.  In order to ensure backward compatibility, we'll write an ABI-checking program that we can run at any point in the development cycle. The checker will help us ensure that the ABI for the current version of libslicksoft.so remains compatible with the ABI of a previous version, the baseline ABI. Once we've written the checker, we'll also be able to use it for future projects.

Here's the layout of the slick-software directory, which contains SlickSoftware's source code:

+ slick-software/
|
+ lib/
|    |
|    + file1.c
|    |
|    + Makefile
|
+ include/
|        |
|        + public-header.h
|
+ abi-ref/

Let's start by setting up our example project.

Step 1: Create a shared library

To create a shared library, we visit the slick-software/lib directory and type make. We'll call the new shared library slick-software/lib/libslicksoft.so.

Step 2: Create a representation of the reference ABI

Our next step is to create a representation of the ABI for our shared library, slick-software/lib/libslicksoft.so. Once we've done that, we'll save it in the slick-software/abi-ref/ directory, which is currently empty.

The ABI representation will serve as a reference ABI. We'll compare the ABI of all subsequent versions of libslicksoft.so against it. In theory, we could just save a copy of libslicksoft.so and use the binary itself for ABI comparisons.  We've chosen not to do that because, like many developers, we don't like storing binaries in revision-control software. Luckily Libabigail allows us to save a textual representation of the ABI.

Creating the ABI representation

To generate a textual representation of an ELF binary's ABI, all we have to do is open our favorite command-line interpreter and enter the following:

$ abidw slick-software/lib/libslicksoft.so > slick-software/abi-ref/libslicksoft.so.abi
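The same step could also be driven from Python, which is the language the checker will use later on. This is only a sketch: the helper names abidw_command and save_abi are made up for illustration, and it assumes abidw can be found on the PATH.

```python
import subprocess


def abidw_command(binary_path, abidw="abidw"):
    """Build the argument list that dumps the ABI of an ELF binary."""
    return [abidw, binary_path]


def save_abi(binary_path, abi_path, abidw="abidw"):
    """Run abidw on binary_path and save its output to abi_path."""
    # check=True raises CalledProcessError if abidw exits with a
    # non-zero status, so a failed run never overwrites the reference.
    completed = subprocess.run(abidw_command(binary_path, abidw),
                               universal_newlines=True,
                               stdout=subprocess.PIPE,
                               check=True)
    with open(abi_path, "w") as abi_file:
        abi_file.write(completed.stdout)


# Example call (needs abidw installed and the library already built):
# save_abi("slick-software/lib/libslicksoft.so",
#          "slick-software/abi-ref/libslicksoft.so.abi")
print(abidw_command("slick-software/lib/libslicksoft.so"))
```

Capturing the output first and writing the file afterwards is a deliberate choice in this sketch: a shell redirection truncates the target file even when the command fails.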

Automating the creation process

We can automate this process by adding a rule at the end of slick-software/lib/Makefile. In the future, we'll just type make abi-ref whenever we want to regenerate the textual representation of the ABI, the libslicksoft.so.abi file.

Here's the content of that Makefile:

$ cat slick-software/lib/Makefile
SRCS:=file1.c
HEADER_FILE:=../include/public-header.h
SHARED_LIB:=libslicksoft.so
SHARED_LIB_SONAME=libslicksoft
ABI_REF_DIR=../abi-ref
ABI_REF=$(ABI_REF_DIR)/$(SHARED_LIB).abi
CFLAGS:=-Wall -g -fPIC -I../include
LDFLAGS:=-shared -Wl,-soname=$(SHARED_LIB_SONAME)
ABIDW:= /usr/bin/abidw
ABIDIFF= /usr/bin/abidiff

OBJS:=$(subst .c,.o,$(SRCS))

all: $(SHARED_LIB)

%.o:%.c $(HEADER_FILE)
        $(CC) -c $(CFLAGS) -o $@ $<

$(SHARED_LIB): $(OBJS)
        $(CC) $(LDFLAGS) -o $@ $^

clean:
        rm -f *.o $(SHARED_LIB) $(ABI_REF)

abi-ref: $(SHARED_LIB)
        $(ABIDW) $< > $(ABI_REF)

Step 3: Compare ABI changes

Now that we have a reference ABI, we just need to compare newer versions of libslicksoft.so against it and analyze the changes. We can use Libabigail's abidiff program to compare the two library versions. Here's the command to invoke abidiff:

abidiff baseline.abi path/to/new-binary

This command line compares the ABI of new-binary against baseline.abi. It produces a report about the potential ABI changes, then returns a status code that tells us about the different kinds of ABI changes detected. By analyzing the status code, which is represented as a bitmap, we'll be able to tell if any of the ABI changes are likely to break backward compatibility.
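To make the bitmap idea concrete, here is a small sketch that decodes an abidiff status code into the names of the bits that are set. The bit values follow the return-value table in the abidiff manual; the decode_status helper is hypothetical and not part of Libabigail.

```python
# Bit values as documented in the abidiff manual.
ABIDIFF_BITS = {
    1: "ABIDIFF_ERROR",                    # abidiff itself failed
    2: "ABIDIFF_USAGE_ERROR",              # abidiff was invoked incorrectly
    4: "ABIDIFF_ABI_CHANGE",               # some ABI change was detected
    8: "ABIDIFF_ABI_INCOMPATIBLE_CHANGE",  # a change known to break the ABI
}


def decode_status(returncode):
    """Return the names of all bits set in an abidiff status code."""
    return [name for bit, name in ABIDIFF_BITS.items() if returncode & bit]


print(decode_status(0))
print(decode_status(4))
print(decode_status(12))
```

Because several bits can be set at once, each bit has to be tested with a bitwise AND rather than by comparing the status code against a single value.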

The Python-based ABI diff checker

Our next task is to write a program that invokes abidiff to perform the ABI check. We'll call it check-abi and place it in the new slick-software/tools directory.

I've been told Python is cool, so I want to try it out with this new checker. I am far from being a Python expert, but hey, what can go wrong?

Step 1: Spec the ABI checker

To start, let's walk through this Python-based ABI checker we want to write. We'll run it like this:

$ check-abi baseline.abi libslicksoft.so

The checker should be simple. If there are no ABI issues, it will exit with a zero (0) status code. If it finds a backward-compatibility issue, it will print a useful message and exit with a non-zero status code.

Step 2: Import dependencies

We're writing the check-abi program as a script in Python 3. The first thing we'll do is import the packages we need for this program:

#!/usr/bin/env python3

import argparse
import subprocess
import sys

Step 3: Define a parser

Next, we'll need a function that parses command-line arguments. Let's define it without bothering too much about the content for now:

def parse_command_line():
    """Parse the command line arguments.

       check-abi expects the path to the new binary and a path to the
       baseline ABI to compare against.  It can also optionally take
       the path to the abidiff program to use.
    """
    # ...

Step 4: Write the main function

In this case, I've already written the main function, so let's take a look:

def main():
    # Get the configuration of this program from the command line
    # arguments. The configuration ends up being a variable named
    # config, which has three properties:
    #
    #   config.abidiff: this is the path to the abidiff program
    #
    #   config.baseline_abi: this is the path to the baseline
    #                        ABI. It's the reference ABI that was
    #                        previously stored and that we need to
    #                        compare the ABI of the new binary
    #                        against.
    #
    #   config.new_abi: this is the path to the new binary whose ABI
    #                   is to be compared against the baseline
    #                   referred to by config.baseline_abi.
    #
    config = parse_command_line()

    # Execute the abidiff program to compare the new ABI against the
    # baseline.
    completed_process = subprocess.run([config.abidiff,
                                        "--no-added-syms",
                                        config.baseline_abi,
                                        config.new_abi],
                                       universal_newlines = True,
                                       stdout = subprocess.PIPE,
                                       stderr = subprocess.STDOUT)

    if completed_process.returncode != 0:
        # Let's define the values of the bits of the "return code"
        # returned by abidiff.  Depending on which bit is set, we know
        # what happened in terms of ABI verification.  These bits are
        # documented at
        # https://sourceware.org/libabigail/manual/abidiff.html#return-values.
        ABIDIFF_ERROR_BIT = 1
        ABI_CHANGE_BIT = 4
        ABI_INCOMPATIBLE_CHANGE_BIT = 8

        if completed_process.returncode & ABIDIFF_ERROR_BIT:
            # abidiff itself failed: show its output, but don't
            # report an ABI issue.
            print("An unexpected error happened while running abidiff:\n")
            print(completed_process.stdout)
            return 0
        elif completed_process.returncode & ABI_INCOMPATIBLE_CHANGE_BIT:
            # If this bit is set, it means we detected an ABI change
            # that breaks backwards ABI compatibility, for sure.
            print("An incompatible ABI change was detected:\n")
        elif completed_process.returncode & ABI_CHANGE_BIT:
            # If this bit is set, (and ABI_INCOMPATIBLE_CHANGE_BIT is
            # not set) then it means there was an ABI change that
            # COULD potentially break ABI backward compatibility.  To
            # be sure if this change is problematic or not, a human
            # review is necessary
            print("An ABI change that needs human review was detected:\n")

        print("%s" % completed_process.stdout)
        return completed_process.returncode

    return 0

Notes about the code

The code is heavily commented to make it easier for future programmers to understand. Here are two important highlights. First, notice how check-abi invokes abidiff with the --no-added-syms option. That option tells abidiff that added functions, global variables, and publicly defined ELF symbols (aka added ABI artifacts) should not be reported. This lets us focus our attention on ABI artifacts that have been changed or removed.

Second, notice how we've set the checker to analyze the return code generated by abidiff. You can see this detail in the if statement starting here:

if completed_process.returncode != 0:

If the first bit of that return code is set (bit value 1) then it means abidiff encountered a plumbing error while executing. In that case, check-abi will print an error message but it won't report an ABI issue.

If the fourth bit of the return code is set (bit value 8) then it means an ABI change breaks backward compatibility with the older library version. In that case, check-abi will print a meaningful message and a detailed report of the change. Recall that in this case, the checker produces a non-zero return code.

If only the third bit of the return code is set (bit value 4), and the fourth bit mentioned above is not, then it means abidiff detected an ABI change that could potentially break backward compatibility. In this case, a human review of the change is necessary. The checker will print a meaningful message and a detailed report for someone to review.

Note: If you are interested, you can find the complete details of the return code generated by abidiff in the abidiff manual, at https://sourceware.org/libabigail/manual/abidiff.html#return-values.
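The three cases described above can also be pulled out of main() into a small pure function, which makes the decision logic easy to test without running abidiff. This is a sketch under that assumption, not part of the check-abi program itself; the classify name and the wording of the first and last messages are invented.

```python
ABIDIFF_ERROR_BIT = 1
ABI_CHANGE_BIT = 4
ABI_INCOMPATIBLE_CHANGE_BIT = 8


def classify(returncode):
    """Map an abidiff return code to (message, check-abi exit status)."""
    if returncode == 0:
        return ("No ABI change was detected", 0)
    if returncode & ABIDIFF_ERROR_BIT:
        # Mirrors main(): a plumbing error is not reported as an ABI issue.
        return ("An unexpected error happened while running abidiff", 0)
    if returncode & ABI_INCOMPATIBLE_CHANGE_BIT:
        return ("An incompatible ABI change was detected", returncode)
    if returncode & ABI_CHANGE_BIT:
        return ("An ABI change that needs human review was detected",
                returncode)
    # Any other non-zero value is passed through unchanged, as in main().
    return ("abidiff returned an unrecognized status", returncode)


for code in (0, 1, 4, 12):
    print(code, classify(code))
```

Note the order of the tests: the incompatible-change bit is checked before the plain change bit, because abidiff can set both at once.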

Source code for the check-abi program

Here's the complete source code for the check-abi program:

#!/usr/bin/env python3

import argparse
import subprocess
import sys

def parse_command_line():
    """Parse the command line arguments.

       check-abi expects the path to the new binary and a path to the
       baseline ABI to compare against.  It can also optionally take
       the path to the abidiff program to use.
    """

    parser = argparse.ArgumentParser(description="Compare the ABI of a binary "
                                                 "against a baseline")
    parser.add_argument("baseline_abi",
                        help = "the path to a baseline ABI to compare against")
    parser.add_argument("new_abi",
                        help = "the path to the ABI to compare "
                               "against the baseline")
    parser.add_argument("-a",
                        "--abidiff",
                        required = False,
                        default = "/usr/bin/abidiff",
                        help = "the path to the abidiff program to use")

    return parser.parse_args()


def main():
    # Get the configuration of this program from the command line
    # arguments. The configuration ends up being a variable named
    # config, which has three properties:
    #
    #   config.abidiff: this is the path to the abidiff program
    #
    #   config.baseline_abi: this is the path to the baseline
    #                        ABI. It's the reference ABI that was
    #                        previously stored and that we need to
    #                        compare the ABI of the new binary
    #                        against.
    #
    #   config.new_abi: this is the path to the new binary whose ABI
    #                   is to be compared against the baseline
    #                   referred to by config.baseline_abi.
    #
    config = parse_command_line()

    # Execute the abidiff program to compare the new ABI against the
    # baseline.
    completed_process = subprocess.run([config.abidiff,
                                        "--no-added-syms",
                                        config.baseline_abi,
                                        config.new_abi],
                                       universal_newlines = True,
                                       stdout = subprocess.PIPE,
                                       stderr = subprocess.STDOUT)

    if completed_process.returncode != 0:
        # Let's define the values of the bits of the "return code"
        # returned by abidiff.  Depending on which bit is set, we know
        # what happened in terms of ABI verification.  These bits are
        # documented at
        # https://sourceware.org/libabigail/manual/abidiff.html#return-values.
        ABIDIFF_ERROR_BIT = 1
        ABI_CHANGE_BIT = 4
        ABI_INCOMPATIBLE_CHANGE_BIT = 8

        if completed_process.returncode & ABIDIFF_ERROR_BIT:
            # abidiff itself failed: show its output, but don't
            # report an ABI issue.
            print("An unexpected error happened while running abidiff:\n")
            print(completed_process.stdout)
            return 0
        elif completed_process.returncode & ABI_INCOMPATIBLE_CHANGE_BIT:
            # If this bit is set, it means we detected an ABI change
            # that breaks backwards ABI compatibility, for sure.
            print("An incompatible ABI change was detected:\n")
        elif completed_process.returncode & ABI_CHANGE_BIT:
            # If this bit is set, (and ABI_INCOMPATIBLE_CHANGE_BIT is
            # not set) then it means there was an ABI change that
            # COULD potentially break ABI backward compatibility.  To
            # be sure if this change is problematic or not, a human
            # review is necessary
            print("An ABI change that needs human review was detected:\n")

        print("%s" % completed_process.stdout)
        return completed_process.returncode

    return 0

if __name__ == "__main__":
    sys.exit(main())
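One small variation, offered as a sketch rather than as part of check-abi: letting the parsing function accept an explicit argument list makes it possible to try the command-line handling without a shell. The argv parameter and the "abidiff" default below are assumptions made for this example.

```python
import argparse


def parse_command_line(argv=None):
    """Same options as check-abi; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(
        description="Compare the ABI of a binary against a baseline")
    parser.add_argument("baseline_abi")
    parser.add_argument("new_abi")
    # Placeholder default for this sketch: look abidiff up on the PATH.
    parser.add_argument("-a", "--abidiff", default="abidiff")
    return parser.parse_args(argv)


config = parse_command_line(["--abidiff", "/usr/bin/abidiff",
                             "../abi-ref/libslicksoft.so.abi",
                             "libslicksoft.so"])
print(config.abidiff, config.baseline_abi, config.new_abi)
```

The two positional arguments are matched in order, so the baseline always comes first, exactly as on the check-abi command line.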

Using check-abi from the Makefile

We're done with our basic checker, but we could add a feature or two. For instance, wouldn't it be nice if we could invoke our shiny new check-abi program from the slick-software/lib directory? Then we could enter a simple make command anytime we needed to do an ABI verification.

We can set this feature up by adding a rule at the end of the slick-software/lib/Makefile:

abi-check: $(SHARED_LIB)
        $(CHECK_ABI) $(ABI_REF) $(SHARED_LIB) || echo "ABI compatibility issue detected!"

Of course, we also need to define the variable CHECK_ABI at the beginning of the Makefile:

CHECK_ABI=../tools/check-abi

Here's the complete Makefile with these changes:

SRCS:=file1.c
HEADER_FILE:=../include/public-header.h
SHARED_LIB:=libslicksoft.so
SHARED_LIB_SONAME=libslicksoft
ABI_REF_DIR=../abi-ref
ABI_REF=$(ABI_REF_DIR)/$(SHARED_LIB).abi
CFLAGS:=-Wall -g -fPIC -I../include
LDFLAGS:=-shared -Wl,-soname=$(SHARED_LIB_SONAME)
ABIDW:=/usr/bin/abidw
ABIDIFF=/usr/bin/abidiff
CHECK_ABI=../tools/check-abi

OBJS:=$(subst .c,.o,$(SRCS))

all: $(SHARED_LIB)

%.o:%.c $(HEADER_FILE)
        $(CC) -c $(CFLAGS) -o $@ $<

$(SHARED_LIB): $(OBJS)
        $(CC) $(LDFLAGS) -o $@ $^

clean:
        rm -f *.o $(SHARED_LIB) $(ABI_REF)

abi-ref: $(SHARED_LIB)
        $(ABIDW) $< > $(ABI_REF)

abi-check: $(SHARED_LIB)
        $(CHECK_ABI) $(ABI_REF) $(SHARED_LIB) || echo "ABI compatibility issue detected!"

Run the checker

We're nearly done, but let's test our new checker with a simple ABI check for backward compatibility. First, I will make a few changes to the slick-software library, so I have differences to check.

Next, I visit the slick-software/lib directory and run make abi-check. Here's what I get back:

$ make abi-check
../tools/check-abi ../abi-ref/libslicksoft.so.abi libslicksoft.so || echo "ABI compatibility issue detected!"
An incompatible ABI change was detected:

Functions changes summary: 1 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 Removed function:

  'function void function_1()'    {function_1}

ABI compatibility issue detected!
$

The ABI checker is reporting one compatibility issue, with a removed function. I guess I should put function_1() back in to avoid breaking the ABI.
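If a script needs the numbers from that report, for example to log them, the two summary lines can be parsed with a regular expression. The sketch below assumes the summary format shown in the output above (optionally with a parenthesized note after a count); real reports may vary between abidiff versions, and the parse_summary helper is hypothetical.

```python
import re

# Matches lines such as:
#   Functions changes summary: 1 Removed, 0 Changed, 0 Added function
SUMMARY_RE = re.compile(
    r"^(Functions|Variables) changes summary: "
    r"(\d+) Removed(?: \([^)]*\))?, "
    r"(\d+) Changed(?: \([^)]*\))?, "
    r"(\d+) Added",
    re.MULTILINE)


def parse_summary(report):
    """Extract the removed/changed/added counts from an abidiff report."""
    counts = {}
    for kind, removed, changed, added in SUMMARY_RE.findall(report):
        counts[kind.lower()] = {"removed": int(removed),
                                "changed": int(changed),
                                "added": int(added)}
    return counts


report = """\
Functions changes summary: 1 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
"""
print(parse_summary(report))
```

For pass/fail decisions, the status code remains the more reliable signal; parsing the text is only useful for reporting.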

Conclusion

In this article, I showed you how to write a basic ABI verifier for shared libraries in your upstream projects. To keep this project simple, I left out other features that you might want to add to the checker yourself. For instance, Libabigail has mechanisms for handling false positives, which are common in real-world projects. We are also constantly improving the quality of the analysis the tool can do. If anything about Libabigail doesn't work as you would like, please let us know on the Libabigail mailing list.

Happy hacking, and may all of your ABI incompatibilities be spotted.

Last updated: June 29, 2020