Introduction to using libFuzzer with llvm-toolset

"Fuzzing" an application is a great way to find bugs that may be missed by other testing methods. Fuzzers test programs by generating random string inputs and feeding them into an application. Any program that accepts arbitrary inputs from its users is a good candidate for fuzzing. This includes compilers, interpreters, web applications, JSON or YAML parsers, and many more types of programs.

libFuzzer is a library to assist with the fuzzing of applications and libraries. It is integrated into the Clang C compiler and can be enabled for your application with the addition of a compile flag and by adding a fuzzing target to your code. libFuzzer has been used successfully to find bugs in many programs, and in this article, I will show how you can integrate libFuzzer into your own applications.

To get started with libFuzzer on Red Hat Enterprise Linux (RHEL) 7, you need to install the llvm-toolset-6.0 package, part of the LLVM Toolset software collection. LLVM Toolset includes Clang. To install LLVM Toolset, you must first enable a few additional repositories:

$ sudo subscription-manager repos --enable rhel-7-server-optional-rpms \
    --enable rhel-server-rhscl-7-rpms \
    --enable rhel-7-server-devtools-rpms

(See How to enable sudo on RHEL if sudo isn't set up on your system.)

Next, install llvm-toolset-6.0:

$ sudo yum install llvm-toolset-6.0

Since LLVM Toolset is delivered as a Red Hat Software Collection (RHSCL), you need to use scl enable to launch a new shell with the llvm-toolset-6.0 collection added to your path.

$ scl enable llvm-toolset-6.0 bash

Alternatively, you could permanently add the llvm-toolset-6.0 collection to your profile. For more information, see the article How to install Clang/LLVM 6 and GCC 8 on Red Hat Enterprise Linux 7.

Let's start fuzzing

We'll begin by fuzzing a simple C function that returns the first capital letter in a word:

#include <stddef.h>

char get_first_cap(const char *in, int size) {
  const char *first_cap = NULL;

  if (size == 0)
    return ' ';
  for ( ; *in != 0; in++) {
    if (*in >= 'A' && *in <= 'Z') {
      first_cap = in;
      break;
    }
  }
  return *first_cap;
}

int LLVMFuzzerTestOneInput(const char *Data, long long Size) {
  get_first_cap(Data, Size);
  return 0;
}

In this C file, we have the function we want to test (get_first_cap) along with a target function (LLVMFuzzerTestOneInput) that the fuzzer will call to pass its input to the function.
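The listing above declares the target with const char * and long long parameters. The libFuzzer documentation gives the prototype as taking const uint8_t *Data and size_t Size, so a target written in that form might look like the following minimal sketch. Note that count_caps is a hypothetical stand-in for the code under test, not a function from this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the code under test: counts the capital
 * letters in the first `size` bytes of `in`. */
static size_t count_caps(const char *in, size_t size) {
  size_t caps = 0;
  for (size_t i = 0; i < size; i++) {
    if (in[i] >= 'A' && in[i] <= 'Z')
      caps++;
  }
  return caps;
}

/* Fuzz target using the prototype from the libFuzzer documentation.
 * libFuzzer calls this once per generated input. Data holds exactly
 * Size bytes and is not guaranteed to be null-terminated. */
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  count_caps((const char *)Data, Size);
  return 0;
}
```

The cast to const char * is needed only because the stand-in function takes a character pointer; the length is passed through unchanged so the function never has to guess where the buffer ends.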

Now we can compile this function using clang to create a fuzzable binary:

$ clang -g -fsanitize=fuzzer first-cap.c -o fuzz-first-cap

With the -fsanitize=fuzzer flag, clang will automatically link our program against the fuzzer library, which includes its own main function. We now have an executable, fuzz-first-cap, that we can use to fuzz the get_first_cap function.

If we run our fuzz-first-cap program with no arguments, libFuzzer will generate random inputs to test our program. We can also provide a corpus of legal inputs to help libFuzzer to be smarter about the kinds of inputs it generates.

$ mkdir corpus
$ echo "Apple" > corpus/Apple.txt
$ echo "aPple" > corpus/aPple.txt
$ echo "apPle" > corpus/apPle.txt

Now if we run our program with this corpus, we will see that libFuzzer identifies a problem right away (the -seed=1 option is not required; we use it only to get reproducible output):

$ ./fuzz-first-cap -seed=1 corpus

INFO: Seed: 1
INFO: Loaded 1 modules   (8 inline 8-bit counters): 8 [0x670fa0, 0x670fa8), 
INFO: Loaded 1 PC tables (8 PCs): 8 [0x45fd48,0x45fdc8), 
INFO:        3 files found in corpus
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: seed corpus: files: 3 min: 6b max: 6b total: 18b rss: 33Mb
#4      INITED cov: 5 ft: 7 corp: 2/12b exec/s: 0 rss: 34Mb
#6      REDUCE cov: 5 ft: 7 corp: 2/10b exec/s: 0 rss: 34Mb L: 4/6 MS: 2 ChangeBinInt-EraseBytes-
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==15554==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000045473a bp 0x7ffc2eacb4d0 sp 0x7ffc2eacb4a0 T15554)
==15554==The signal is caused by a READ memory access.
==15554==Hint: address points to the zero page.
    #0 0x454739 in get_first_cap /first-cap.c:13:11
    #1 0x4547b6 in LLVMFuzzerTestOneInput /first-cap.c:17:3
    #2 0x415b99 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (fuzz-first-cap+0x415b99)
    #3 0x418954 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) (fuzz-first-cap+0x418954)
    #4 0x41a1f7 in fuzzer::Fuzzer::MutateAndTestOne() (fuzz-first-cap+0x41a1f7)
    #5 0x41a9af in fuzzer::Fuzzer::Loop(std::vector<std::string, fuzzer::fuzzer_allocator > const&) (fuzz-first-cap+0x41a9af)
    #6 0x410193 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (fuzz-first-cap+0x410193)
    #7 0x406562 in main (fuzz-first-cap+0x406562)
    #8 0x7fcb5a8a53d4 in __libc_start_main (/lib64/libc.so.6+0x223d4)
    #9 0x4065aa in _start (fuzz-first-cap+0x4065aa)

UndefinedBehaviorSanitizer can not provide additional info.
==15554==ABORTING
MS: 4 ShuffleBytes-ShuffleBytes-ChangeBit-EraseBytes-; base unit: 7469c22975699536c6c6d00767e773b5429fefc6
0x65,0x0,
e\x00
artifact_prefix='./'; Test unit written to ./crash-36282fac116d9fd6b37cc425310e1a8510f08a53
Base64: ZQA=

The most relevant parts of the output are the stack trace, which shows that a segmentation fault occurred in get_first_cap, and the description of the crashing input at the end of the output. In this case, we crashed on a 2-byte input with no capital letters: e, \x00. Because the loop never found a capital letter, first_cap was still NULL when the function dereferenced it.

There has also been a new file added to our corpus directory:

$ cat corpus/7469c22975699536c6c6d00767e773b5429fefc6
apP

This is a "good" input that libFuzzer generated while fuzzing our program: it exercised new code coverage without causing a crash. libFuzzer adds every such input it finds to the corpus directory.

So let's fix this bug and then try again:

#include <stddef.h>

char get_first_cap(const char *in, int size) {
  const char *first_cap = NULL;

  if (size == 0)
    return ' ';
  for ( ; *in != 0; in++) {
    if (*in >= 'A' && *in <= 'Z') {
      first_cap = in;
      break;
    }
  }
  if (first_cap)
    return *first_cap;
  else
    return ' ';
}

int LLVMFuzzerTestOneInput(const char *Data, long long Size) {
  get_first_cap(Data, Size);
  return 0;
}

Then recompile:

$ clang -g -fsanitize=fuzzer first-cap.c -o fuzz-first-cap

And run:

$ ./fuzz-first-cap  -seed=1 corpus

This time, libFuzzer did not find any issues after running for around 30 seconds. By default, libFuzzer runs until it finds a bug, but you can limit the number of test runs with the -runs=X flag.
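To guard against regressions, the crashing bytes reported earlier (0x65, 0x00) can be replayed directly against the fixed function. The following is a minimal sketch rather than part of the original program: crash_input_is_fixed is a hypothetical helper name, and get_first_cap is copied from the fixed listing so the snippet compiles on its own.

```c
#include <stddef.h>

/* The fixed function from the article, reproduced unchanged. */
char get_first_cap(const char *in, int size) {
  const char *first_cap = NULL;

  if (size == 0)
    return ' ';
  for ( ; *in != 0; in++) {
    if (*in >= 'A' && *in <= 'Z') {
      first_cap = in;
      break;
    }
  }
  if (first_cap)
    return *first_cap;
  else
    return ' ';
}

/* Hypothetical helper: replays the two bytes libFuzzer reported for the
 * crash and returns 1 if the fixed function now returns ' ' for them. */
int crash_input_is_fixed(void) {
  const char crash_input[] = {0x65, 0x00};
  return get_first_cap(crash_input, (int)sizeof crash_input) == ' ';
}
```

A libFuzzer binary also accepts individual files as arguments, so running ./fuzz-first-cap ./crash-36282fac116d9fd6b37cc425310e1a8510f08a53 replays the saved crash file through the fuzz target without starting a new fuzzing session.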

So far, our fuzzer has been used to detect segmentation faults, but you can also pair it with one of the Clang sanitizers to check for other kinds of errors. For example, we can take our program and compile it with the address sanitizer enabled:

$ clang -g -fsanitize=fuzzer,address first-cap.c -o fuzz-first-cap

Now when we run the program, we see a new error:

==15569==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000f1 at pc 0x000000558507 bp 0x7fff78272c30 sp 0x7fff78272c28
READ of size 1 at 0x6020000000f1 thread T0
    #0 0x558506 in get_first_cap /first-cap.c:8:11
    #1 0x558766 in LLVMFuzzerTestOneInput /first-cap.c:21:3
    #2 0x42cea9 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (fuzz-first-cap+0x42cea9)
    #3 0x42fc64 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) (fuzz-first-cap+0x42fc64)
    #4 0x4317df in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<std::string, fuzzer::fuzzer_allocator > const&) (fuzz-first-cap+0x4317df)
    #5 0x431b72 in fuzzer::Fuzzer::Loop(std::vector<std::string, fuzzer::fuzzer_allocator > const&) (fuzz-first-cap+0x431b72)
    #6 0x4274a3 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (fuzz-first-cap+0x4274a3)
    #7 0x41d852 in main (fuzz-first-cap+0x41d852)
    #8 0x7f88abaca3d4 in __libc_start_main (/lib64/libc.so.6+0x223d4)
    #9 0x41d8bb in _start (fuzz-first-cap+0x41d8bb)
...

Here the fuzzer has triggered a heap buffer overflow, which was caught by the address sanitizer. In this case, the input was a string without a null terminator, and it exposed a bug in our program: get_first_cap assumes its input is null terminated, so it keeps reading past the end of the buffer.
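One way to address this, shown here only as a sketch, is to treat the size argument as the authoritative length and never read beyond it. The name get_first_cap_bounded and the switch from int to size_t are assumptions made for this illustration; the loop still stops at a null byte so that null-terminated strings behave as before.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a bounds-checked variant (hypothetical name): scans at most
 * `size` bytes, stops early at a null byte, and returns ' ' when no
 * capital letter is found. */
char get_first_cap_bounded(const char *in, size_t size) {
  for (size_t i = 0; i < size && in[i] != '\0'; i++) {
    if (in[i] >= 'A' && in[i] <= 'Z')
      return in[i];
  }
  return ' ';
}

/* Fuzz target that passes the exact buffer length through. */
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  get_first_cap_bounded((const char *)Data, Size);
  return 0;
}
```

Because every read is guarded by the i < size check, an input without a null terminator can no longer push the loop past the end of the buffer.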

Besides the address sanitizer, you can also use libFuzzer with LLVM's undefined behavior sanitizer (UBSAN).

There is a lot more you can do with libFuzzer beyond what is shown in this simple introduction. For more information, see the libFuzzer documentation.

Last updated: November 1, 2023
