Most C compilers allow accessing an array declared extern, which has indeterminate bounds, like this:

extern int external_array[];

int
array_get (long int index)
{
  return external_array[index];
}

The definition of external_array could reside in a different translation unit and look like this:

int external_array[3] = { 1, 2, 3 };

The question is what happens if this separate definition is changed to this:

int external_array[4] = { 1, 2, 3, 4 };

Or this:

int external_array[2] = { 1, 2 };

Does either change preserve the binary interface (assuming that there is a mechanism that allows the application to determine the size of the array at run time)?
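As an illustration of such a mechanism, the following minimal sketch assumes that the library exports a companion length variable next to the array. The names external_array_length and array_get_checked are hypothetical and not part of the original example, and both sides are shown in one file only for brevity. As the rest of this article shows, a mechanism like this does not by itself make a size change safe.

```c
#include <assert.h>
#include <stddef.h>

/* Library side (normally a separate translation unit).  */
int external_array[3] = { 1, 2, 3 };

/* Hypothetical companion variable: number of elements in external_array.  */
const size_t external_array_length
  = sizeof (external_array) / sizeof (external_array[0]);

/* Client side: bounds-checked variant of array_get.  Returns 0 and stores
   the element in *value on success, -1 if index is out of range.  */
int
array_get_checked (size_t index, int *value)
{
  assert (value != NULL);
  if (index >= external_array_length)
    return -1;
  *value = external_array[index];
  return 0;
}
```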

Curiously, the answer is that on many architectures, increasing the array size breaks application binary interface (ABI) compatibility. Decreasing the array size may also cause compatibility problems. We'll look more closely at ABI compatibility in this article and explain how to avoid problems.

How the data section of an executable is linked

To understand how the array size becomes part of the binary interface, we first need to examine how the data section of an executable is linked. The details are of course architecture-specific, and here we focus on the x86-64 architecture.

The x86-64 architecture allows an instruction to embed the address of a variable as a 32-bit displacement, which means that access to a global array variable, as in the array_get function shown previously, can be compiled to a single movl instruction:

array_get:
	movl	external_array(,%rdi,4), %eax
	ret

From that, the assembler produces an object file in which the instruction is marked with an R_X86_64_32S relocation.

0000000000000000 <array_get>:
   0:	mov    0x0(,%rdi,4),%eax
			3: R_X86_64_32S	external_array
   7:	retq   

This relocation tells the link editor (ld) to fill in the appropriate location of the external_array variable at link time, when producing an executable.

This has two important consequences.

  • Because the variable offset is determined at link time, there is no run-time overhead for determining it. The only cost is the memory access itself.
  • To determine the offset, the sizes of all data variables need to be known. Otherwise, it would not be possible to compute the layout of the data section at link time.

For C implementations targeting the Executable and Linkable Format (ELF), as used on GNU/Linux, references to extern variables do not carry object sizes. For the array_get example, the size of the object is not even known to the compiler. In fact, the entire assembler file looks like this (omitting only the unwind information, which is technically required for psABI compliance, by compiling with -fno-asynchronous-unwind-tables):

	.file	"get.c"
	.text
	.p2align 4,,15
	.globl	array_get
	.type	array_get, @function
array_get:
	movl	external_array(,%rdi,4), %eax
	ret
	.size	array_get, .-array_get
	.ident	"GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)"
	.section	.note.GNU-stack,"",@progbits

There is no size information for external_array at all in this assembler file: the only reference to the symbol is on the line with the movl instruction, and the only numeric data in the instruction is the array element size (implied by movl and the scaling factor 4).

If ELF required symbol sizes for undefined variables, it would not even be possible to compile the array_get function.

How does the link editor obtain the actual symbol size? It looks at the symbol definition and uses the size information it finds there. This allows the link editor to compute the data section layout and fill out the data relocations with the appropriate offsets.

Introducing ELF shared objects

C implementations for ELF do not require the programmer to add source code markup to indicate if a function or variable is located in the current object (which can be a library or the main executable) or in a different object. The link editor and the dynamic loader are expected to take care of that transparently, without help from the programmer.

At the same time, there was a desire not to reduce performance by changing the compilation model for executables. This means that when compiling source code for a main program (i.e., without -fPIC, and in this particular case, without -fPIE as well), the array_get function is compiled to the exact same instruction sequence as before the introduction of dynamic shared objects. Furthermore, it does not matter whether the external_array variable is defined in the main executable itself or in some shared object that is loaded separately at run time. The instructions produced by the compiler are the same in both cases.

How is this possible? After all, ELF shared objects are position-independent. They are loaded at unpredictable, randomized addresses at run time. Yet the compiler generates a machine code sequence that requires that these variables are placed at a fixed offset computed at link time, long before the program even runs.

The answer is related to the fact that only one loaded object (the main executable) uses these fixed offsets. All other objects (the dynamic loader itself, the C run-time library, and any other library the program uses) are compiled and linked as fully position-independent (PIC) objects. For such objects, the compiler introduces an additional indirection, loading the actual address of each variable from the global offset table (GOT). We can see this indirection if we compile the array_get example with -fPIC, leading to this assembler code:

array_get:
	movq	external_array@GOTPCREL(%rip), %rax
	movl	(%rax,%rdi,4), %eax
	ret

As a result, the address of the external_array variable is no longer hard-coded and can be changed at run time by initializing its GOT entry accordingly. This means that at run time, the definition of external_array can be contained in the same shared object, a different shared object, or the main program. The dynamic loader will find the appropriate definition based on the ELF symbol lookup rules and bind the undefined symbol reference to its definition, by updating the GOT entry to its actual address.
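In C terms, the -fPIC code behaves roughly as if every access went through a pointer that the dynamic loader fills in at startup. The following is only a model of that indirection: got_external_array, bind_external_array, and the two definitions are hypothetical names, and a real GOT is maintained by the link editor and the dynamic loader, not by source code.

```c
#include <assert.h>
#include <stddef.h>

/* Two candidate definitions, standing in for different loaded objects.  */
int definition_in_library[3] = { 1, 2, 3 };
int definition_in_executable[3] = { 10, 20, 30 };

/* Stand-in for the GOT entry of external_array.  */
int *got_external_array;

/* Stand-in for the dynamic loader binding the symbol reference.  */
void
bind_external_array (int *definition)
{
  assert (definition != NULL);
  got_external_array = definition;
}

/* Equivalent of the two-instruction PIC sequence: load the address
   from the GOT entry, then index through it.  */
int
array_get_pic (long int index)
{
  return got_external_array[index];
}
```

Rebinding the pointer changes which definition array_get_pic reads, without any change to the function's code.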

Let's go back to the original example, where the array_get function is located in the main program, so there is no indirection for the variable address. The key idea implemented in the link editor is that the main program will provide a definition of the external_array variable even if it is actually defined in a shared object at run time. At run time, instead of pointing all shared objects to the original definition of the variable in the shared object containing it, the dynamic loader will instead pick a copy of the variable in the data section of the executable.

This has two important consequences. First of all, recall that the definition of external_array looks like this:

int external_array[3] = { 1, 2, 3 };

The definition has an initializer, and this initializer has to be applied to the definition in the main executable. To facilitate this, the main executable contains a copy relocation for the symbol. The readelf -rW command displays it as an R_X86_64_COPY relocation:

Relocation section '.rela.dyn' at offset 0x408 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000403ff0  0000000100000006 R_X86_64_GLOB_DAT      0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
0000000000403ff8  0000000200000006 R_X86_64_GLOB_DAT      0000000000000000 __gmon_start__ + 0
0000000000404020  0000000300000005 R_X86_64_COPY          0000000000404020 external_array + 0

Like other relocations, a copy relocation is processed by the dynamic loader. It involves a simple, bit-wise copy operation. The target of the copy is determined by the relocation offset (0000000000404020 in the example). The source is determined at run time, based on the symbol name (external_array) and its resolution (its value). When making the copy, the dynamic loader will also look at the size of the symbol, to obtain the number of bytes that need to be copied. To make all this possible, the external_array symbol is automatically exported from the executable, as a defined symbol, so that it is visible at run time to the dynamic loader. The dynamic symbol table (.dynsym) reflects this, as shown by the readelf -sW command:

Symbol table '.dynsym' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     3: 0000000000404020    12 OBJECT  GLOBAL DEFAULT   22 external_array

Where does the size information come from (12 bytes, in this example)? The link editor opens all shared objects on the link command line, searches for the definition, and uses the size information found there. As before, this allows the link editor to compute the layout of the data section, so that fixed offsets can be used. Also as before, the size of the definition in the main executable is fixed and cannot change at run time.

The dynamic loader also redirects symbol references in shared objects to the target of the copy relocation, in the main executable. This ensures that only a single copy of the variable exists in the entire program, as required by the C language semantics. Otherwise, if the variable is modified after initialization, updates from the main executable would not be visible to the dynamic shared objects, and vice versa.

The impact on binary compatibility

What happens if we change the definition of external_array in the shared object, without relinking (or recompiling) the main program? First, let us consider the addition of an array element.

int external_array[4] = { 1, 2, 3, 4 };

This triggers a warning at run time, from the dynamic loader:

main-program: Symbol `external_array' has different size in shared object, consider re-linking

The main program still contains a definition of external_array, which only provides space for 12 bytes. This means that the copy is incomplete: only the first three array elements are copied. Access to the array element external_array[3] is undefined as a result. This affects all code in the process, not just the main program, because all references to external_array have been redirected to the definition in the main program. This includes the shared object that provides the definition of external_array, which is probably not prepared to deal with the situation that an array element in its own definition is gone.
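The truncation can be modelled in plain C. This is a simulation and not the dynamic loader's actual code: the names library_definition, executable_copy, and apply_copy_relocation are hypothetical, and the sketch assumes that the copy is limited to the space reserved in the executable, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* New library definition with four elements.  */
int library_definition[4] = { 1, 2, 3, 4 };

/* Space reserved in the old executable at link time: three elements.  */
int executable_copy[3];

/* Model of processing the copy relocation: copy no more bytes than the
   executable reserved.  Returns the number of bytes copied.  */
size_t
apply_copy_relocation (void *target, size_t target_size,
                       const void *source, size_t source_size)
{
  assert (target != NULL && source != NULL);
  size_t n = source_size < target_size ? source_size : target_size;
  memcpy (target, source, n);
  return n;
}
```

Calling apply_copy_relocation with executable_copy as the target and library_definition as the source copies only the first three elements; the fourth element has no place in the copy that the whole process now uses.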

What about the change in the opposite direction, removing an element, like this?

int external_array[2] = { 1, 2 };

If the program avoids accessing the array element external_array[2] because it detects that the array length is only two by some mechanism, then this will work. There is a bit of unused memory after the array, but this will not break the program.

This means that we end up with the following rule:

Adding elements to a global array variable breaks binary compatibility.
Removing elements may break compatibility, unless there is a mechanism
that avoids access to the removed elements.

Unfortunately, the dynamic loader warning looks more harmless than it actually is, and for removed elements, there is no warning at all.

How to avoid this situation

Detecting the ABI change is rather easy with tools such as libabigail.

The easiest way to avoid this situation is to provide a function that returns the address of the array, using code like this:

static int local_array[3] = { 1, 2, 3 };

int *
get_external_array (void)
{
  return local_array;
}
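Callers of such an accessor no longer see the array type, so they need to obtain the length separately. A minimal sketch, assuming a second accessor with the hypothetical name get_external_array_length, plus a hypothetical client function sum_external_array to show the usage:

```c
#include <assert.h>
#include <stddef.h>

static int local_array[3] = { 1, 2, 3 };

int *
get_external_array (void)
{
  return local_array;
}

/* Hypothetical companion accessor: number of elements in the array.  */
size_t
get_external_array_length (void)
{
  return sizeof (local_array) / sizeof (local_array[0]);
}

/* Example client: sums all elements using only the two accessors.  */
long int
sum_external_array (void)
{
  const int *array = get_external_array ();
  size_t length = get_external_array_length ();
  assert (array != NULL);
  long int sum = 0;
  for (size_t i = 0; i < length; ++i)
    sum += array[i];
  return sum;
}
```

Because the client obtains both the address and the length through function calls at run time, neither value is fixed in the main program at link time.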

If the array definition cannot be made static because of the way it is used in the library, we can give it hidden visibility instead, which also prevents its export and therefore avoids the truncation issue:

int local_array[3] __attribute__ ((visibility ("hidden"))) =
  { 1, 2, 3 };

Things are considerably more complicated if the array variable needs to be exported for reasons of backward compatibility. Because the array is truncated underneath the library if an old main program with a shorter array definition is used, the accessor function will not provide access to the full array for newer client code if it is used with the same global array. Instead, the accessor function could use a separate (static or hidden) array, or perhaps a separate array for the newly added elements at the end. The downside is that it is impossible to keep everything in a contiguous array if the array variable is exported for backward compatibility. The design of the additional interface needs to reflect that.
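One way to picture the second option (a separate array for the newly added elements) is the sketch below. It is only an illustration under stated assumptions: added_elements, external_array_total_length, and external_array_element are hypothetical names, and the exported external_array is assumed to keep its original three elements forever.

```c
#include <assert.h>
#include <stddef.h>

/* Exported for old binaries; its size must never change.  */
int external_array[3] = { 1, 2, 3 };

/* Elements added later, kept out of the exported symbol.  */
static const int added_elements[2] = { 4, 5 };

#define OLD_LENGTH (sizeof (external_array) / sizeof (external_array[0]))
#define ADDED_LENGTH (sizeof (added_elements) / sizeof (added_elements[0]))

/* Total number of elements visible through the new interface.  */
size_t
external_array_total_length (void)
{
  return OLD_LENGTH + ADDED_LENGTH;
}

/* Element access that hides the split between the two arrays.  */
int
external_array_element (size_t index)
{
  assert (index < external_array_total_length ());
  if (index < OLD_LENGTH)
    return external_array[index];
  return added_elements[index - OLD_LENGTH];
}
```

The interface is element-based rather than pointer-based because the two pieces are not contiguous in memory, which matches the limitation described above.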

With symbol versioning, it is possible to export multiple versions with different sizes, never changing the size associated with a specific version. Using this model, newly linked programs will always use the latest version, presumably with the largest size. Because symbol version and size are fixed by the link editor at the same time, they are always consistent. The GNU C Library uses this approach for the historic sys_errlist and sys_siglist variables. However, this still does not provide a single, contiguous array.

All things considered, an accessor function (e.g., the get_external_array function above) is the best approach for avoiding this ABI compatibility problem.

Last updated: February 22, 2024