Adding buffer overflow detection to string functions

This article describes the steps required to add buffer overflow protection to string functions. As a real-world example, we use the strlcpy function, which is implemented in the libbsd library on some GNU/Linux systems.

This kind of buffer overflow protection uses a GNU Compiler Collection (GCC) feature for array size tracking (“source fortification”), accessed through the __builtin_object_size GCC built-in function. In general, these checks are added in a size-checking wrapper function around the original (wrapped) function, which is strlcpy in our example.

The strlcpy function takes three arguments: a pointer to the destination buffer, a pointer to the source string, and the size of the destination buffer. strlcpy copies the source string to the destination buffer, but unlike the strcpy function, it truncates the string to fit into the destination buffer. (For more details, refer to the OpenBSD manual page for strlcpy.) The strlcpy function is supposed to be used like this:

char buf[100];
strlcpy(buf, source, sizeof(buf));
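
The truncating behavior can be made concrete with a small stand-in. This is a minimal sketch, not the libbsd source: the name sketch_strlcpy is hypothetical, and the semantics (copy at most siz - 1 bytes, always null-terminate when siz is nonzero, return the length of the source string) follow the OpenBSD manual page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that mirrors the documented strlcpy semantics.
   It is an illustration only, not the libbsd implementation. */
size_t
sketch_strlcpy(char *dst, const char *src, size_t siz)
{
  size_t srclen;

  assert(src != NULL);
  assert(dst != NULL || siz == 0);

  srclen = strlen(src);
  if (siz != 0) {
    /* Copy at most siz - 1 bytes, then always null-terminate. */
    size_t n = srclen < siz - 1 ? srclen : siz - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
  }
  /* The result is the length of src; a value >= siz means truncation. */
  return srclen;
}
```

A caller can detect truncation by comparing the return value against the size it passed in.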

There is still a risk of a buffer overflow if the programmer gets the buffer size wrong, for example, due to a typographical error:

char buf1[50];
char buf[100];
strlcpy(buf1, source, sizeof(buf));

We want to implement additional checking to catch such errors.

The following steps are required to implement such buffer-size checking:

  1. Introduce an internal alias for the wrapped strlcpy function, libbsd_strlcpy_local. This function is for library internal use only and is not exported.
  2. Implement the size-checking wrapper function, libbsd_strlcpy_chk. This function is similar to the wrapped strlcpy function but receives the compiler-determined buffer size as an additional argument (in addition to the programmer-supplied size).
  3. Add a public inline wrapper function definition of strlcpy, which obtains the buffer size using __builtin_object_size and calls libbsd_strlcpy_chk.

Our implementation requires that libbsd itself is compiled with GCC or a compatible compiler, which implements the relevant GNU extensions.  The installed header file remains compatible with non-GCC compilers.

The internal alias

The internal alias serves two purposes. First, it allows us to call the strlcpy implementation from the size-checking wrapper function even if the fortification wrapper defined in step 3 is active, which avoids infinite recursion. Second, the alias has hidden visibility, which prevents symbol interposition and helps the compiler and linker optimize the function call. (This second advantage is primarily of interest to library authors.)

To declare the function, we add a new internal-only header file to libbsd, src/local-string.h:

#ifndef LIBBSD_LOCAL_STRING_H
#define LIBBSD_LOCAL_STRING_H

#include <string.h>

size_t libbsd_strlcpy_local(char *dst, const char *src, size_t siz)
  __attribute__ ((visibility ("hidden")));
#endif

The visibility attribute is what enables the aforementioned optimization. In src/strlcpy.c, we provide a matching definition. First, we replace the #include <string.h> directive with one for the new header (which includes <string.h> itself):

#include "local-string.h"

We need to adjust the definition of strlcpy to use the internal name:

size_t
libbsd_strlcpy_local(char *dst, const char *src, size_t siz)
{

Finally, we need to provide a definition for the public name, strlcpy, at the end of the same file:

size_t strlcpy(char *dst, const char *src, size_t siz)
  __attribute__ ((alias("libbsd_strlcpy_local")));

Here we use the GCC alias feature, and GCC will make sure that a call to the strlcpy function is redirected to the libbsd_strlcpy_local function. The alias feature avoids an additional function call which would otherwise be needed to implement the indirection.
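
The hidden-name-plus-alias pattern can also be tried in isolation. The following is a minimal sketch under a few assumptions: the names sketch_len_local and sketch_len are hypothetical, the compiler is GCC or a compatible one, and the target supports symbol aliases (ELF platforms such as GNU/Linux do).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Internal name: hidden visibility keeps it out of the dynamic symbol
   table of a shared object, so it cannot be interposed. */
size_t sketch_len_local(const char *s)
  __attribute__ ((visibility ("hidden")));

size_t
sketch_len_local(const char *s)
{
  assert(s != NULL);
  return strlen(s);
}

/* Public name: a second symbol for the same code, not a forwarding call.
   The alias target must be defined in the same translation unit. */
size_t sketch_len(const char *s)
  __attribute__ ((alias ("sketch_len_local")));
```

Both names resolve to the same machine code, so calling sketch_len costs no more than calling sketch_len_local directly.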

We still need the strlcpy function for the benefit of old applications linking against libbsd, and to support compiling new applications with non-GCC compilers (which do not provide the buffer size checking functionality).

The size-checking wrapper function

This function, called libbsd_strlcpy_chk, should be put into the src/libbsd_strlcpy_chk.c file, to match the existing project convention.  Its implementation looks like this:

#include "local-string.h"
#include <stdlib.h>

size_t
libbsd_strlcpy_chk(char *dst, const char *src, size_t siz, size_t dstsiz)
{
  if (siz > dstsiz)
    abort();
  return libbsd_strlcpy_local(dst, src, siz);
}

Note the dstsiz argument for the compiler-determined buffer size, and the call to the internal function alias introduced above.

We compare the programmer-supplied buffer size against the compiler-determined buffer size and fail the check whenever the former is larger, even if the source string is short enough that no buffer overflow would actually occur. This is a trade-off between catching logic bugs and avoiding unnecessary run-time failures, and our choice favors detecting logic bugs.
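
To make the trade-off concrete, the following sketch contrasts the strict policy used here with a more lenient alternative that only rejects copies which would really write past the end of the destination. Both helper names are hypothetical, and the lenient variant is shown purely for comparison; it is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Strict policy (the one chosen in this article): reject any
   programmer-supplied size that exceeds the compiler-determined size. */
bool
sketch_strict_ok(size_t siz, size_t dstsiz)
{
  return siz <= dstsiz;
}

/* Hypothetical lenient alternative: reject only if the copy would
   actually write more than dstsiz bytes. */
bool
sketch_lenient_ok(const char *src, size_t siz, size_t dstsiz)
{
  size_t len, written;

  assert(src != NULL);
  if (siz == 0)
    return true;                /* nothing is written when siz is 0 */
  len = strlen(src);
  /* Bytes written: the copied characters plus the null terminator. */
  written = (len < siz - 1 ? len : siz - 1) + 1;
  return written <= dstsiz;
}
```

With a 50-byte destination and a supplied size of 100, the strict check fails for every source string, while the lenient one fails only for sources of 50 characters or more. The strict variant therefore reports the typo on the first call rather than on the first long input.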

If the fortify size check fails, we call the abort function to terminate the process. It could be helpful to print an error message and a backtrace (to match what the GNU C Library does on fortify failures), but care must be taken not to leak information across a trust boundary.

The size-checking wrapper is called from the fortification wrapper function below, so it needs to be exported. libbsd uses symbol versioning with an explicit list, so we have to update the src/libbsd.map linker version script:

LIBBSD_0.9 {
 libbsd_strlcpy_chk;
} LIBBSD_0.8;

The public fortification wrapper function

The final step involves translating application calls to the strlcpy function into calls to libbsd_strlcpy_chk, with an added compiler-determined buffer size.  To achieve that, we need to add an inline function to the public header file, include/bsd/string.h:

#ifdef __GNUC__
size_t libbsd_strlcpy_chk(char *dst, const char *src,
                          size_t siz, size_t dstsiz);
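/* __gnu_inline__ keeps GNU89 "extern inline" semantics in C99 and later
   modes, so no out-of-line copy of strlcpy is emitted by this header. */
extern __inline__
__attribute__((__always_inline__, __gnu_inline__, __artificial__))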
size_t
strlcpy(char *dst, const char *src, size_t siz)
{
  return libbsd_strlcpy_chk(dst, src, siz,
    __builtin_object_size(dst, 1));
}
#endif

The magic is in the __builtin_object_size call. This is not a real function, but a compiler built-in comparable to the sizeof operator. But unlike sizeof, __builtin_object_size attempts to determine the size of the buffer to which its pointer argument points, and not the size of the pointer itself (which is generally 4 or 8, so not particularly helpful). The second argument to __builtin_object_size consists of flag bits. (For details, refer to the GCC manual under Object Size Checking Built-in Functions.) The flag value 1 indicates that the immediately enclosing (sub)object determines the size. The other common choice for the flag value is 0, which indicates that the top-level object is used to determine the size. For strlcpy, the value 1 is the right choice because the function operates on null-terminated strings, and not raw byte arrays, and sub-object buffer overflows (from one struct member to the next) are not expected.

If the compiler cannot determine a bound on the object size, __builtin_object_size returns a conservative default, (size_t) -1 (that is, SIZE_MAX). That is why this code (in conjunction with the libbsd_strlcpy_chk function implementation) works even if no size information is available to the compiler, although there will be no security check in that case.
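
The difference between the two flag values, and the conservative default, can be observed with a small experiment. The sketch below uses hypothetical names (struct sketch_record and the sketch_bos functions). What the built-in returns depends on the compiler and the optimization level, so the values in the comments are what GCC typically reports, not a guarantee.

```c
#include <stddef.h>

struct sketch_record {
  char name[8];
  char note[24];
};

/* Flag 1: the bound is the enclosing subobject, the name member
   (typically 8). */
size_t
sketch_bos1_name(void)
{
  struct sketch_record r;
  return __builtin_object_size(r.name, 1);
}

/* Flag 0: the bound is the rest of the top-level object, from name
   to the end of the struct (typically 32). */
size_t
sketch_bos0_name(void)
{
  struct sketch_record r;
  return __builtin_object_size(r.name, 0);
}

/* Nothing is known about p inside this function, so the built-in
   falls back to (size_t) -1 unless the call is inlined into a context
   where the compiler can see the buffer. */
size_t
sketch_bos1_unknown(char *p)
{
  return __builtin_object_size(p, 1);
}
```

Because (size_t) -1 is larger than any real buffer size, a comparison such as siz > dstsiz can never fail when the compiler has no size information.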

The __artificial__ function attribute hides this inline function from the debugger (which will step over it). The __always_inline__ attribute forces inlining of the function; without it, __builtin_object_size would not have a chance to determine the buffer size.

The entire block is under #ifdef __GNUC__ because these features are GCC-specific.

Concluding remarks

The wrapper function is activated unconditionally for simplicity. Other fortification wrappers are only active for optimizing builds and when the _FORTIFY_SOURCE preprocessor macro is defined to a positive constant. It would also be possible to check if __builtin_object_size returns the conservative default estimate and call strlcpy instead because no additional checking can be done.
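
Both refinements can be sketched together. The code below is an illustration under stated assumptions, not part of the patch: all sketch_ names and the SKETCH_FORTIFY_ACTIVE macro are hypothetical, sketch_copy stands in for the unchecked strlcpy, and the preprocessor condition is one plausible way to express "optimizing build with _FORTIFY_SOURCE defined to a positive constant".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#if defined(__GNUC__) && defined(__OPTIMIZE__) \
    && defined(_FORTIFY_SOURCE) && _FORTIFY_SOURCE > 0
# define SKETCH_FORTIFY_ACTIVE 1
#else
# define SKETCH_FORTIFY_ACTIVE 0
#endif

/* Hypothetical stand-in for the unchecked function. */
size_t
sketch_copy(char *dst, const char *src, size_t siz)
{
  size_t len;

  assert(src != NULL);
  len = strlen(src);
  if (siz != 0) {
    size_t n = len < siz - 1 ? len : siz - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
  }
  return len;
}

/* Hypothetical stand-in for the size-checking wrapper. */
size_t
sketch_copy_chk(char *dst, const char *src, size_t siz, size_t dstsiz)
{
  if (siz > dstsiz)
    abort();
  return sketch_copy(dst, src, siz);
}

/* Inline wrapper: skip the checked variant when fortification is
   disabled or when the compiler has no size information. */
static inline __attribute__((__always_inline__))
size_t
sketch_copy_fortified(char *dst, const char *src, size_t siz)
{
  size_t dstsiz = __builtin_object_size(dst, 1);
  if (!SKETCH_FORTIFY_ACTIVE || dstsiz == (size_t) -1)
    return sketch_copy(dst, src, siz);
  return sketch_copy_chk(dst, src, siz, dstsiz);
}
```

Since both conditions are compile-time constants after inlining, the compiler can drop the unused branch, so the fallback adds no run-time cost.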

Would it make sense to add this to a future libbsd release? This is a much more difficult question. Most projects that use strlcpy today provide their own implementation and do not use libbsd at all. All these projects would have to incorporate a similar patch. It would be a simple matter to add strlcpy (and strlcat) with fortification to the GNU C Library (glibc), but existing patches have not been accepted by the community.

The code above has some repetition. String function fortification in glibc uses macros and the __typeof__ keyword to streamline the wrapper implementation and the symbol redirection.  We deliberately did not do this here because it tends to obscure what is actually going on.  Furthermore, it is important not to use internal GNU C Library macros such as __bos0 in the implementation of your own fortify functionality because these macros could change their meaning between glibc releases.


The complete patch against libbsd 0.8.3 is available here.

