Last year I wrote about the new level for _FORTIFY_SOURCE and how it promises to significantly improve application security mitigation in C/C++. In this article, I will show you how an application or library developer can get the best possible fortification results from the compiler to improve the security of applications deployed on Red Hat Enterprise Linux, for instance. There are shades of previous articles about GCC here, but that just goes to show how compiler features tie in together to provide security protection at multiple levels, from prevention to mitigation. First, we should take a closer look at the potential impact of _FORTIFY_SOURCE=3 on the performance and code size of applications.
The performance impact of the new fortification level
_FORTIFY_SOURCE=3 improves fortification coverage by evaluating and passing size expressions instead of the constants seen in _FORTIFY_SOURCE=2, which generates additional code and potentially more register pressure. But the impact of that additional code appears to be trivial in practice. When I compared nearly 10,000 packages in Fedora rawhide, I found barely any impact on code size. Some binaries grew while others shrank, indicating a change in generated code, but there was no broad increase in code size.
However, given that the code did change, surely we should see side effects such as register pressure, shouldn't we? Again, in practice, that side effect turns out to be trivial. Running SPEC benchmarks with _FORTIFY_SOURCE=3 showed no slowdown compared to _FORTIFY_SOURCE=2, indicating that there is no broad-based impact on performance due to this new fortification level. The results are not entirely surprising, though, if you put them in the context of typical programs, modern processors, and how fortification works.
Does object size overhead affect performance?
At a high level, the major purpose of the _FORTIFY_SOURCE=3 feature is to estimate the size of an object passed to a library function call, ensure that the call does not perform any unsafe actions on that object, and abort if it does. The success of _FORTIFY_SOURCE as a mitigation strategy is directly linked to its ability to estimate the size of the passed object.
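To make the idea of size estimation concrete, here is a minimal sketch built on the two compiler builtins that fortification relies on: __builtin_object_size, which only yields compile-time constants, and __builtin_dynamic_object_size, which can yield a size expression (GCC 12 or later and recent Clang). The function and macro names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the dynamic builtin only where the compiler provides it; otherwise
   fall back to the constant-only builtin so the sketch still compiles.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define DYNAMIC_SIZE(p) __builtin_dynamic_object_size ((p), 0)
# endif
#endif
#ifndef DYNAMIC_SIZE
# define DYNAMIC_SIZE(p) __builtin_object_size ((p), 0)
#endif

/* Both builtins report (size_t) -1 when the size cannot be determined.  */
#define UNKNOWN_SIZE ((size_t) -1)

/* Hypothetical helper: the constant-only estimate for a heap buffer whose
   size is a run-time value.  */
size_t
constant_estimate (size_t n)
{
  char *buf = malloc (n);
  /* Yields UNKNOWN_SIZE unless the compiler can prove a constant size.  */
  size_t est = __builtin_object_size (buf, 0);
  free (buf);
  return est;
}

/* Hypothetical helper: the estimate when size expressions are allowed.  */
size_t
dynamic_estimate (size_t n)
{
  char *buf = malloc (n);
  /* With optimization enabled, this can evaluate to the expression n.  */
  size_t est = DYNAMIC_SIZE (buf);
  free (buf);
  return est;
}
```

The results depend on the compiler version and optimization level. Without optimization, both helpers typically report an unknown size. With optimization, the dynamic variant is the one that can recover n when n is not a constant.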
Now there are two main vectors for performance overhead due to this:
- Overhead of a fortified call instead of the regular call (e.g., memcpy). This is significant because, in theory, _FORTIFY_SOURCE=3 should generate many more of these than _FORTIFY_SOURCE=2.
- Overhead of the size expression that is passed to the fortified call instead of the constant in _FORTIFY_SOURCE=2.
The function call overhead isn't a big concern for two main reasons. The most important reason is that in many cases where the size of an object is visible, the compiler can determine conclusively at compile time that the access is safe. Thank the wonderful work on value range propagation that went into GCC in recent years for this. In those cases, the compiler can avoid fortifying the call and instead use the regular library function call.
In cases where the fortified call is unavoidable, the overhead will be noticeable only if the call is encountered repeatedly (i.e., it is on the hot path). Here's where modern CPUs come into the picture with their well-oiled branch predictors. The branches for access safety validation are almost always predicted correctly, since a correct program never takes the failure path, and the processor sails through them almost as if they weren't there.
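The shape of that safety branch can be sketched as follows. This is a hand-written stand-in, loosely modeled on the idea behind glibc's checked variants such as __memcpy_chk, not the library's actual implementation. The name checked_memcpy and its parameter names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a fortified memcpy.  dstsize plays the role
   of the compiler's estimate of the destination object size.  */
void *
checked_memcpy (void *dst, const void *src, size_t len, size_t dstsize)
{
  /* The safety branch: never taken in a correct program, so a branch
     predictor should learn it quickly on a hot path.  */
  if (len > dstsize)
    abort ();
  return memcpy (dst, src, len);
}
```

The only extra work on the success path is one comparison and one branch that is not taken, which is why the cost tends to disappear in practice.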
The overhead of size expressions is slightly trickier to explain but still intuitive enough. Whenever _FORTIFY_SOURCE=3 is successful in determining the size estimate for an object, it basically has access to the definition of that object, which gives it either a readily available constant or an expression to use. Additionally, any derivative arithmetic the compiler needs to generate for the object access (e.g., &buf->member.data + i) is often the same arithmetic needed to get the final size of the pointer, which the compiler appears to reliably meld together, thereby nullifying any such overhead.
Final verdict on performance impact
One might be tempted to conclude that there is absolutely no performance overhead to building applications with _FORTIFY_SOURCE=3, but it is more nuanced than that. In most cases, the performance overhead appears negligible due to the compiler being smart enough to optimize most of the overhead away. As a result, it should be safe for most application developers to simply bump up fortification to _FORTIFY_SOURCE=3 and be done with it.
Now let's look at how application developers can get the most out of _FORTIFY_SOURCE.
How to improve application fortification
The primary way to improve the success of _FORTIFY_SOURCE is to tell the compiler about the size of an object passed into a function. The compiler can evaluate simple cases where objects are plain types or structures with constant sizes almost all the time. However, objects that are dynamically allocated and whose pointers are passed to a function are tricky. There are several ways to tell the compiler about their sizes. These hints are supported by GCC and Clang, so it does not matter which of those two compilers you use. Additionally, these attributes can be applied to C and C++ functions, so this is not limited to just C.
Note that these benefits don't just improve fortification. Since they end up giving better object size information, they improve overall diagnostics, which means better warnings and often even faster code.
Using allocator functions
If your application uses allocators provided by the standard library (e.g., malloc, calloc, realloc, etc.), the compiler can automatically use the size argument passed to those functions as the object size. However, if your application has wrappers that do special things before or after allocation, or if your application has bespoke allocator functions, you could decorate those functions with the __alloc_size__ attribute to indicate which of the arguments to your allocation function is the size of the returned object.
This is how it would look:
void *my_allocator (size_t sz) __attribute__ ((__alloc_size__ (1)));
For a calloc-like allocator, it would be:
void *my_allocator (size_t nmemb, size_t size) __attribute__ ((__alloc_size__ (1, 2)));
In the first case, the compiler will see that the size of the allocated object is sz. In the second case, it will see the size as nmemb * size.
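A fuller sketch of this pattern might look like the following. The wrappers my_malloc and my_calloc and the caller make_zeroed are hypothetical names chosen for illustration. The abort-on-failure policy is an assumption of this sketch, not something the attribute requires.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator wrappers.  The attribute tells the compiler which
   argument (or product of arguments) is the size of the returned object.  */
void *my_malloc (size_t sz) __attribute__ ((__alloc_size__ (1)));
void *my_calloc (size_t nmemb, size_t size)
  __attribute__ ((__alloc_size__ (1, 2)));

void *
my_malloc (size_t sz)
{
  /* Request at least one byte so a NULL result always means failure.  */
  void *p = malloc (sz ? sz : 1);
  if (p == NULL)
    abort ();
  return p;
}

void *
my_calloc (size_t nmemb, size_t size)
{
  void *p = calloc (nmemb ? nmemb : 1, size ? size : 1);
  if (p == NULL)
    abort ();
  return p;
}

/* Hypothetical caller: the compiler can associate buf with the size n,
   so a fortified memset here can be checked against n.  */
char *
make_zeroed (size_t n)
{
  char *buf = my_malloc (n);
  memset (buf, 0, n);
  return buf;
}
```

Without the attribute, the compiler has no way to connect the pointer returned by the wrapper with n unless the wrapper happens to be inlined.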
How to use the __access__ function attribute
In C, when a pointer is passed to a function to access an array, the size of that array is typically passed as another argument to the function. If your application or library uses this pattern, then you may be able to tell the compiler about this size using the __access__ function attribute on the definition of that function. This attribute is a GCC extension, also available in Clang. The following example tells the compiler that ptr points to memory that is safe to read and write to the extent of sz:
void
__attribute__ ((__access__ (__read_write__, 1, 2)))
do_something (char *ptr, size_t sz)
{
  // Get a copy size from somewhere else.
  size_t setsize = get_size ();
  memset (ptr, 0, setsize);
}
Put the attribute on both the function declaration and the definition because the compiler uses it to validate call sites and to perform analysis and fortification within the implementation. At the call site, the value passed for sz is validated against the size of the object pointed to by ptr to ensure that do_something can safely access sz elements in ptr. Any inconsistency is flagged as a compile-time warning. Inside the function implementation, sz is assumed to be the size of ptr, and any accesses through ptr within the function are validated against sz. In the example above, _FORTIFY_SOURCE will ensure in the call to memset that setsize is less than or equal to sz, or otherwise abort.
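Here is a small sketch of the attribute placed on both a declaration and a definition. The function clear_prefix, its count parameter, and the RW_ACCESS macro are hypothetical. The macro expands to nothing on compilers that do not report support for the attribute, which is an assumption made here for portability.

```c
#include <stddef.h>
#include <string.h>

/* Expand to the attribute only where the compiler reports support.  */
#if defined(__has_attribute)
# if __has_attribute(__access__)
#  define RW_ACCESS(ptr_idx, size_idx) \
     __attribute__ ((__access__ (__read_write__, ptr_idx, size_idx)))
# endif
#endif
#ifndef RW_ACCESS
# define RW_ACCESS(ptr_idx, size_idx)
#endif

/* Declaration, as it would appear in a header: call sites are checked
   against this.  */
void clear_prefix (char *ptr, size_t sz, size_t count) RW_ACCESS (1, 2);

/* Definition with the same attribute: accesses through ptr can be
   validated against sz.  Precondition: count <= sz.  */
RW_ACCESS (1, 2) void
clear_prefix (char *ptr, size_t sz, size_t count)
{
  /* With fortification enabled, a count larger than sz is expected to
     abort here instead of writing past the buffer.  */
  memset (ptr, 0, count);
}
```

A call such as clear_prefix (buf, sizeof buf, 4) with a 16-byte buf is consistent with the annotation. Passing a size larger than the buffer is the kind of mismatch GCC may diagnose at compile time.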
An important note about a known issue with these compiler attributes (i.e., __alloc_size__ and __access__) is that they are read by the compiler only if the function they are associated with is not inlined. That is, if do_something or my_allocator are inlined, the compiler won't see their attributes anymore. In common cases, this should not matter too much because the inlining ideally should give just as much or more information about the object size. My advice is to correctly annotate all of the functions in the application or library.
The flexible array conundrum
Flexible arrays are a complex topic because of the various ways GCC and Clang support them. A flexible array is an array at the end of a structure that is dynamically allocated in the program. Before it was standardized, GCC had an extension where any array declared at the end of a structure with a subscript such as [0] or [1] was considered a flexible array. C99 then formalized this with the [] notation without any numeric subscript and further locked down semantics, ensuring that the flexible array always appeared at the end of a top-level structure.
GCC, however, continues to support the extensions and even supports flexible arrays in nested structures and unions. This makes object size computations tricky because the compiler may sometimes see the flexible arrays as zero or one-sized arrays, causing spurious crashes with _FORTIFY_SOURCE. These problems can be avoided if the application uses the standard [] notation for its flexible arrays.
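As an illustration of the standard notation, here is a minimal sketch of a structure with a C99 flexible array member and a matching allocation. The struct message type and the message_new function are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The flexible array member uses [] with no numeric subscript and is the
   last member of a top-level structure.  */
struct message
{
  size_t len;
  char data[];
};

/* Hypothetical constructor: allocates the header plus room for the text,
   including its terminating null byte.  Returns NULL on failure.  */
struct message *
message_new (const char *text)
{
  size_t len = strlen (text) + 1;
  struct message *m = malloc (sizeof (*m) + len);
  if (m == NULL)
    return NULL;
  m->len = len;
  memcpy (m->data, text, len);
  return m;
}
```

Because the array has no declared bound, the compiler has to derive the size of data from the allocation rather than from a misleading [0] or [1] subscript.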
Build your applications with _FORTIFY_SOURCE=3
This article has described the implications of building your application or library with _FORTIFY_SOURCE=3 compared to _FORTIFY_SOURCE=2. The improved fortification coverage helps to make your programs significantly safer than the current state. I have provided a GCC plugin to help you measure the fortification coverage using _FORTIFY_SOURCE=2 compared to _FORTIFY_SOURCE=3 so you can determine how much additional benefit it provides.
We hope to get closer to our goal of having safer applications deployed on RHEL. We can accomplish this goal with more applications and libraries containing good compiler annotations and built with _FORTIFY_SOURCE=3, and with more developers fixing compiler warnings. If you have questions, please comment below. We welcome your feedback.