Toward _FORTIFY_SOURCE parity between Clang and GCC

GCC combined with glibc can detect buffer overflows in calls to standard C library functions. When a user passes the -D_FORTIFY_SOURCE={1,2} preprocessor flag together with an optimization level of -O1 or higher, calls to functions such as strcpy go through an alternate, fortified implementation. Depending on the function and its inputs, this may result in a compile-time error or in a runtime error triggered upon execution. (For more info on this feature, there's an excellent blog post here on the subject.)

What about the Clang plus glibc duo? This article digs through -D_FORTIFY_SOURCE usage and discusses the patches applied to Clang to achieve feature parity.

First glance

Let's look at -D_FORTIFY_SOURCE through the compiler portability prism. As this feature relies on a preprocessor definition, it should only involve preprocessor selection. Compatibility should come for free, right?

Not quite. Following the definitions in the glibc headers, one quickly spots compiler-specific builtins that the security checks rely on. A careful study of the headers that are included only when -D_FORTIFY_SOURCE is on shows that the following builtins are used at some point:

  • __builtin_constant_p
  • __builtin_object_size
  • __builtin___memcpy_chk
  • __builtin___memmove_chk
  • __builtin___mempcpy_chk
  • __builtin___memset_chk
  • __builtin___snprintf_chk
  • __builtin___sprintf_chk
  • __builtin___stpcpy_chk
  • __builtin___strcat_chk
  • __builtin___strcpy_chk
  • __builtin___strncat_chk
  • __builtin___strncpy_chk
  • __builtin___vsnprintf_chk
  • __builtin___vsprintf_chk

These are compiler builtins, as hinted by the __builtin_ prefix, which means that either the compiler knows them and provides its own implementation or handling, or the compilation (or linking) process fails. So in order to support -D_FORTIFY_SOURCE, a compiler must support these builtins. All of them (except __builtin_constant_p and __builtin_object_size) are suffixed by _chk, which suggests they are hardened versions of the corresponding functions from libc.

Let's take a deeper look at these functions.

Required compiler builtins

The following compiler builtins are required for -D_FORTIFY_SOURCE.

__builtin_object_size(obj, type)

This builtin function is complex. The interested reader may want to check out its online documentation. As a short summary, this function tries to compute, at compile time, the number of bytes available in the object that obj points to, and returns it. If this process fails, it returns (size_t) -1 (for the type values 0 and 1, which are the ones used here).

The type argument controls details of this function's semantics. The following definition is made available in cdefs.h:

#define __bos(ptr) __builtin_object_size (ptr, __USE_FORTIFY_LEVEL > 1)
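To make the effect of the type argument more concrete, here is a minimal sketch. The struct packet type and the helper names are hypothetical and not part of glibc. With type 0, the builtin considers the whole enclosing object; with type 1, it stops at the closest enclosing subobject, which is what __bos selects when __USE_FORTIFY_LEVEL is greater than 1.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct packet {
  char header[8];
  char payload[24];
};

static struct packet pkt;

/* Type 0: bytes from pkt.header up to the end of the whole object pkt. */
size_t whole_object_size(void) {
  return __builtin_object_size(pkt.header, 0);
}

/* Type 1: bytes from pkt.header up to the end of the header member only. */
size_t subobject_size(void) {
  return __builtin_object_size(pkt.header, 1);
}

/* The object behind an arbitrary pointer parameter is generally unknown.
   Unless the optimizer can see the caller's argument (for example after
   inlining), the builtin returns (size_t) -1 here. */
size_t unknown_size(const char *p) {
  return __builtin_object_size(p, 0);
}
```

On a typical GCC or Clang build, the first helper should report the size of the whole structure and the second one the size of the header member alone. The stricter subobject view is what lets -D_FORTIFY_SOURCE=2 catch an overflow from header into payload.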

__builtin_constant_p(obj)

This function returns 1 if the value of obj is known at compile time (after optimizations) and returns 0 otherwise. The following code, extracted from stdio2.h in the glibc version 2.30, showcases an example of usage:

__fortify_function __wur char *
fgets (char *__restrict __s, int __n, FILE *__restrict __stream)
{
  if (__bos (__s) != (size_t) -1)
    {
      if (!__builtin_constant_p (__n) || __n <= 0)
        return __fgets_chk (__s, __bos (__s), __n, __stream);

      if ((size_t) __n > __bos (__s))
        return __fgets_chk_warn (__s, __bos (__s), __n, __stream);
    }
  return __fgets_alias (__s, __n, __stream);
}

This code reads as:

If the compiler can compute the object size (__bos) of the first parameter of fgets, two cases are checked. If the second parameter is not known at compile time (or is not positive), the __fgets_chk function is used, and the check happens at runtime. If the second parameter is known and greater than the object size, __fgets_chk_warn is used. Otherwise, either the object size is unknown and nothing can be checked, or we know at compile time that the call is secure; in both cases the original function is called through __fgets_alias.
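The same three-way decision can be sketched outside of glibc. The CLASSIFY_COPY macro, the check_kind labels, and the helper functions below are hypothetical names introduced for illustration; only the two builtins come from the compiler. A macro is used so that both builtins see the caller's expressions directly, much like the always-inlined glibc wrappers do.

```c
#include <stddef.h>

/* Hypothetical labels mirroring the three outcomes of the fgets wrapper. */
enum check_kind { NO_CHECK, RUNTIME_CHECK, COMPILE_TIME_OVERFLOW };

/* Decide how a copy of n bytes into dst would be handled:
   - object size unknown: nothing to check;
   - n not a compile-time constant (or not positive): defer to runtime;
   - n constant and larger than the object: overflow known at compile time;
   - otherwise: the call is known to be safe. */
#define CLASSIFY_COPY(dst, n)                                        \
  (__builtin_object_size((dst), 0) == (size_t)-1                     \
       ? NO_CHECK                                                    \
       : ((!__builtin_constant_p(n) || (n) <= 0)                     \
              ? RUNTIME_CHECK                                        \
              : ((size_t)(n) > __builtin_object_size((dst), 0)       \
                     ? COMPILE_TIME_OVERFLOW                         \
                     : NO_CHECK)))

static char small_buf[16];
static volatile int runtime_len = 4; /* volatile: never a constant */

enum check_kind classify_fits(void) {
  return CLASSIFY_COPY(small_buf, 8);
}

enum check_kind classify_overflow(void) {
  return CLASSIFY_COPY(small_buf, 32);
}

enum check_kind classify_runtime(void) {
  return CLASSIFY_COPY(small_buf, runtime_len);
}
```

When the compiler resolves the size of small_buf, the three helpers are expected to map to the safe, compile-time overflow, and runtime-check branches respectively. If the object size cannot be computed, all of them fall back to NO_CHECK, just as fgets falls back to __fgets_alias.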

__builtin___memcpy_chk(dest, src, n, dest_size)

The extra dest_size argument is used for comparison with n. dest_size can be (size_t) -1, which means its value is unknown at compile time. Otherwise, it means that "the number of allocated bytes remaining after the location pointed by dest is dest_size." When dest_size is known and lower than n, an error is emitted either at compile time or at runtime.
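As an illustration, the builtin can also be called directly. The sketch below assumes a C library that provides the __memcpy_chk runtime helper (glibc does); the checked_fill and demo_checked_fill helpers and the 16-byte buffer are hypothetical.

```c
#include <stddef.h>
#include <string.h>

static char dest[16];

/* Copy n bytes into dest through the checked builtin. The last argument
   is the compiler-computed size of dest; passing (size_t) -1 instead
   would turn the call into a plain, unchecked memcpy. */
void *checked_fill(const void *src, size_t n) {
  return __builtin___memcpy_chk(dest, src, n,
                                __builtin_object_size(dest, 0));
}

/* Returns 1 when a 6-byte copy, which fits in dest, went through. */
int demo_checked_fill(void) {
  checked_fill("hello", 6);
  return memcmp(dest, "hello", 6) == 0;
}
```

Calling checked_fill with n larger than 16 is expected to abort the program at runtime rather than silently overflow dest, because the size comparison is then performed inside __memcpy_chk.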

The other __builtin___*_chk builtins do similar checks based on the destination buffer's compiler-computed object size and the actual copy size.

Clang compatibility

A quick check shows that Clang supports all the builtins required by -D_FORTIFY_SOURCE=2. That's a nice property: It means that you can pass that preprocessor flag to Clang when compiling your C (or C++) application and it compiles just fine. As a matter of fact, Firefox already builds with Clang and that flag, so it indeed compiles fine.

But do we get the extra protection? After a deeper look at Clang's source code, the answer is more nuanced. Based on the body of Sema::checkFortifiedBuiltinMemoryFunction, the check is only performed if both the size argument and the object size are known at compile time. Otherwise, no checks are performed. This differs from GCC's behavior, where a call to __memcpy_chk is generated in that case.

A look at this snippet illustrates GCC's behavior. The size argument of the memcpy call is a runtime value, but both the destination and the source arguments have a size known at compile time. GCC's internal representation shows a call to __builtin___memcpy_chk lowered to __memcpy_chk. Clang, on the other hand, just issues a regular call to memcpy.
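A reproducer in the same spirit might look like the following sketch. The copy_prefix name and the buffer sizes are assumptions chosen for illustration; the idea is to build it with something like -O2 -D_FORTIFY_SOURCE=2 and inspect the generated code for a call to __memcpy_chk.

```c
#include <stddef.h>
#include <string.h>

/* Both buffers have a size known at compile time, but n is only known
   at run time. A fortifying compiler is expected to route this memcpy
   through __memcpy_chk with a destination size of 4. */
char copy_prefix(size_t n) {
  const char src[8] = "1234567";
  char dst[4] = {0};
  memcpy(dst, src, n); /* n > 4 overflows dst */
  return dst[0];
}
```

With a runtime check in place, calling copy_prefix with a value larger than 4 aborts the program; without it, the overflow goes unnoticed, which is the gap described above.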

Patching Clang

Digging into Clang's code reveals that whenever it meets a call to memcpy, the call is replaced by a call to LLVM's builtin llvm.memcpy. Unfortunately, what -D_FORTIFY_SOURCE={1,2} does is unguard an inline definition of memcpy with the fortified implementation. And that's the implementation Clang should use. This patch implements this extra behavior, with the mandatory extra tests.

To validate the whole approach, I wrote a minimal test suite for fortifying compilers. GCC passes it by design, and Clang 9 doesn't. However, using the top-of-tree version of Clang (346de9b6 as of this writing), the test suite now passes just fine:

(sh) make check-gcc
[...]
===== GCC OK =====
(sh) PATH=/path/to/clang:$PATH make check-clang
[...]
===== CLANG OK =====

Conclusion

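When aiming at feature parity, the devil is in the details. In the case of -D_FORTIFY_SOURCE, Clang seemingly supported the feature, but a closer look showed that no check was emitted when the size was only known at runtime. With the patch described above, we are now one step closer to feature parity.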

Last updated: February 22, 2024