The location of code and data in memory can have a big effect on a program's performance. Placing functions that reference each other frequently in the same area of memory can help reduce the need to load and unload pages of memory. Similarly, pieces of data that are used a lot at the same time benefit from being located in the same region of memory.
Alternatively, it is sometimes better to group code and data not by how they are used, but by how similar they are to each other. Placing similar code and data in adjacent memory regions means that the binary image can be compressed to a greater degree, a feature which is often important for mobile applications.
Compilers attempt to address these needs via profile-directed optimizations, as well as by supporting language features like hot and cold function attributes. But sometimes developers want to experiment for themselves, and this is where linker section ordering comes into play. By placing each function or piece of data into its own section, it is possible to have the linker organize their layout in memory according to whatever plan is desired.
While in theory it is possible to create a unique linker script that places these sections into a specific order, in practice this can be time-consuming and difficult to get right. So linkers also provide a command-line option that specifies a file that contains the ordering of sections of interest to the user. (Any section not mentioned in this file is placed as normal.)
This article explains how to use this linker feature.
Linker section ordering
Let's start with an example. Given a source file like this:
cat demo.c
int a = 1, b = 2, c = 3;
int function1 (void) { return a; }
int function2 (void) { return b; }
int function3 (void) { return c; }
int main (void) { return function1()
+ function2() + function3(); }
When this file is compiled and linked, the functions end up being ordered in memory like this:
readelf -W --syms a.out | grep function | sort -k 2
20: 0000000000401106 function1
32: 0000000000401112 function2
21: 000000000040111e function3
Note that the ordering of functions is not guaranteed. The compiler and linker are free to change the order for whatever reasons they like. At the time of writing, however, using GCC 13.3 and ld version 2.40, the above listing shows the order obtained.
Also note that the order in which symbols appear in the symbol table is not guaranteed. In particular, they do not have to be in increasing address order. This is why the output of the readelf command is passed through sort before being displayed, and why the index numbers of the symbols (the first column in readelf's output) do not match up with the order in which they are displayed.
Suppose, however, that you wanted to ensure a fixed order where function3 appears at the lowest address, followed by function2, with function1 at the highest address of the three. This can be done, but first each function needs to be placed into its own section.
Both GCC and Clang have a -ffunction-sections command-line option, which places each function into its own section, named after the function itself:
gcc -ffunction-sections -c demo.c
readelf -W -S demo.o | grep function | grep PROGBITS | cut -b 1-40
[ 4] .text.function1 PROGBITS
[ 6] .text.function2 PROGBITS
[ 8] .text.function3 PROGBITS
So after compiling the code that way, the linker's section ordering feature can be used to ensure the desired order. The feature needs an input file that describes the order. For the BFD linker (supported from release 2.43 onward) the file looks like this:
cat demo.bfd.t
.text : {
*(.text.function3)
*(.text.function2)
*(.text.function1)
}
It looks like a mini linker script that just describes the mapping of input sections to output sections. Using this file the chosen ordering can be achieved:
ld.bfd demo.o -e 0 --section-ordering-file demo.bfd.t
readelf -W --syms a.out | grep function | sort -k 2
6: 0000000000401000 function3
4: 000000000040100c function2
3: 0000000000401018 function1
The same thing can be done with the GOLD linker, although the syntax of the ordering file is slightly simpler. It just uses the input section names, one per line. GOLD itself will work out the output section name.
cat demo.gold.t
.text.function3
.text.function2
.text.function1
ld.gold demo.o -e 0 --section-ordering-file demo.gold.t
readelf -W --syms a.out | grep function | sort -k 2
6: 0000000000400160 function3
7: 000000000040016c function2
8: 0000000000400178 function1
The LLD linker has a similar option, except that it orders sections based upon the symbols that they contain, rather than the sections' own names:
cat demo.lld.t
function3
function2
function1
ld.lld demo.o -e 0 --symbol-ordering-file demo.lld.t
readelf -W --syms a.out | grep function | sort -k 2
7: 00000000002011f4 function3
6: 0000000000201200 function2
5: 000000000020120c function1
Note, however, that the LLD linker does not break up sections, so if the symbols function3 and function1 happen to be in the same input section they will always appear together in the output file, regardless of how they are placed in the symbol ordering file.
At the time of writing, the MOLD linker does not have a section ordering or symbol ordering option. This may change in the future however.
The linker does not have to be invoked directly for this feature to work. Using a compiler driver and the -Wl option is also supported. For example:
gcc -fuse-ld=lld -Wl,--symbol-ordering-file,demo.lld.t -ffunction-sections demo.c
Data can be ordered as well as code, as long as separate sections are used to contain the data symbols. This is achieved by using the compiler's -fdata-sections command-line option. For example:
cat demo.data.lld.t
c
b
a
Without ordering, the symbols are in the (possibly expected) order of a, b, c:
clang -fuse-ld=lld demo.c -fdata-sections
readelf -W --syms a.out | grep -e ' [abc]$' | sort -k 2
29: 000000000020386c a
31: 0000000000203870 b
33: 0000000000203874 c
With ordering they are now in the c, b, a order:
clang -fuse-ld=lld demo.c -fdata-sections -Wl,--symbol-ordering-file,demo.data.lld.t
readelf -W --syms a.out | grep -e ' [abc]$' | sort -k 2
33: 0000000000203868 c
31: 000000000020386c b
29: 0000000000203870 a
This also works for the GOLD and BFD linkers, although of course the syntax is slightly different for each:
cat demo.data.gold.t
.data.c
.data.b
.data.a
cat demo.data.bfd.t
.data : {
*(.data.c)
*(.data.b)
*(.data.a)
}
Note that for all of these linkers the data and text ordering directives can be placed into a single file. They do not have to be kept separate, e.g.:
cat demo.both.gold.t
.text.function3
.text.function2
.text.function1
.data.c
.data.b
.data.a
The BFD and GOLD linker ordering files also support wildcard features. For example:
cat demo.gold.t
.text.function[1,3]
.text.function2
.data.[^b]
.data.b
When used, this file groups function1 and function3 before function2 (with the relative order of function1 and function3 left unspecified), and places all data sections except .data.b before .data.b:
readelf -W --syms a.out | grep -e ' [abc]$' -e function | sort -k 2
25: 0000000000400596 function1
23: 00000000004005a2 function3
24: 00000000004005ae function2
28: 0000000000402004 a
26: 0000000000402008 c
27: 000000000040200c b
The BFD linker's syntax is also unique in that it allows you to specify the exact mapping of input files and sections to output sections, so you could for example place a data item in the middle of a code section:
cat demo.mixed.bfd.t
.text : {
*(.text.function1)
*(.data.b)
*(.text.function*)
}
.data : {
*(.data.[^b])
}
gcc -fuse-ld=bfd demo.c -fdata-sections -ffunction-sections -Wl,--section-ordering-file,demo.mixed.bfd.t
readelf -W --syms a.out | grep -e ' [abc]$' -e function | sort -k 2
3: 0000000000401000 function1
2: 000000000040100c b
4: 0000000000401010 function2
6: 000000000040101c function3
11: 0000000000403000 a
5: 0000000000403004 c
The LLD linker's --symbol-ordering-file option, however, has the advantage that it works in conjunction with Clang's -fno-unique-section-names option. This modifies the -ffunction-sections and -fdata-sections options so that, instead of creating sections named after the functions and data, the compiler just uses the standard section names, multiple times. (This is allowed by the ELF standard and has the advantage of saving space in the program's string table by not creating lots of unique names.) E.g.:
clang demo.c -c -fdata-sections -ffunction-sections -fno-unique-section-names
readelf -W -S demo.o | grep PROGBITS | cut -b 1-40
[ 2] .text PROGBITS
[ 3] .text PROGBITS
[ 5] .text PROGBITS
[ 7] .text PROGBITS
[ 9] .text PROGBITS
[11] .data PROGBITS
[12] .data PROGBITS
[13] .data PROGBITS
Since the BFD and GOLD linkers need unique section names for their ordering features to work, they cannot handle this particular format.
The BFD linker, on the other hand, has the ability to work with code that has not been compiled with the -ffunction-sections or -fdata-sections options, provided that the code, or data, comes from separate files.
For example:
cat demo.multifile.bfd.t
.text : {
source1.o(.text)
source3.o(.text)
source2.o(.text)
}
This ordering file would place the code from source1.o before the code from source3.o, which in turn would appear before the code from source2.o.
Conclusion
Using a linker's section ordering feature allows experimentation with the layout of code and data in memory without the need to write full linker scripts. While each linker has its own syntax for specifying the layout, they all support the same general idea.