Improvements to static analysis in the GCC 13 compiler

May 31, 2023
David Malcolm
Related topics:
C, C#, C++, Linux
Related products:
Red Hat Enterprise Linux

    I work at Red Hat on GCC, the GNU Compiler Collection. For the last four releases of GCC, I've been working on -fanalyzer, a static analysis pass that tries to identify various problems at compile-time, rather than at runtime. It performs "symbolic execution" of C source code—effectively simulating the behavior of the code along the various possible paths of execution through it (with some caveats that we'll discuss).

    This article summarizes what's new with -fanalyzer in GCC 13, which has just been released.


    New warnings

    I first added the analyzer to GCC in GCC 10, with 15 new warnings for the compiler, and we've added more in each subsequent release (Table 1).

    Table 1: GCC warnings controlled by -fanalyzer by release

    Release   New warnings   Cumulative warnings
    GCC 10    15             15
    GCC 11    7              22
    GCC 12    5              27
    GCC 13    20             47

    As you can see in Table 1, GCC 13 is a big release for -fanalyzer, adding 20 new warnings. Let's take a look at some of them.

    Track dynamic buffer size

    Can you spot the bug in the following C code?

    #include <stdlib.h>
    #include <string.h>
    
    struct str {
      size_t len;
      char data[];
    };
    
    struct str *
    make_str_badly (const char *src)
    {
      size_t len = strlen(src);
      struct str *str = malloc(sizeof(str) + len);
      if (!str)
        return NULL;
      str->len = len;
      memcpy(str->data, src, len);
      str->data[len] = '\0';
      return str;
    }
    

    The above example makes the common mistake with C-style strings of forgetting the null terminator when computing how much space to allocate for str.

    GCC 13's -fanalyzer option now keeps track of the sizes of dynamically allocated buffers, and for many cases it checks the simulated memory reads and writes against the sizes of the relevant buffers. With this new work it detects the above problem by emitting this new warning:

    <source>: In function 'make_str_badly':
    <source>:18:18: warning: heap-based buffer overflow [CWE-122] [-Wanalyzer-out-of-bounds]
       18 |   str->data[len] = '\0';
          |   ~~~~~~~~~~~~~~~^~~~~~
      'make_str_badly': events 1-4
        |
        |   13 |   struct str *str = malloc(sizeof(str) + len);
        |      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~
        |      |                     |
        |      |                     (1) capacity: 'len + 8' bytes
        |   14 |   if (!str)
        |      |      ~               
        |      |      |
        |      |      (2) following 'false' branch (when 'str' is non-NULL)...
        |   15 |     return NULL;
        |   16 |   str->len = len;
        |      |   ~~~~~~~~~~~~~~     
        |      |            |
        |      |            (3) ...to here
        |   17 |   memcpy(str->data, src, len);
        |   18 |   str->data[len] = '\0';
        |      |   ~~~~~~~~~~~~~~~~~~~~~
        |      |                  |
        |      |                  (4) write of 1 byte at offset 'len + 8' exceeds the buffer
        |
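
    As a sketch of one possible fix (the name make_str is hypothetical, not from the original example), the allocation can reserve one extra byte for the terminator. Note also that in the original, sizeof(str) inside the initializer measures the pointer variable str rather than struct str; the two happen to be the same size on typical 64-bit targets, which is why the analyzer reports a capacity of 'len + 8'. The sketch spells out sizeof (struct str) to avoid relying on that coincidence.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct str {
  size_t len;
  char data[];
};

/* Hypothetical corrected counterpart to make_str_badly.  */
struct str *
make_str (const char *src)
{
  assert (src != NULL);  /* precondition assumed by this sketch */
  size_t len = strlen (src);
  /* Header, plus LEN bytes of text, plus 1 byte for the '\0'.  */
  struct str *str = malloc (sizeof (struct str) + len + 1);
  if (!str)
    return NULL;
  str->len = len;
  memcpy (str->data, src, len);
  str->data[len] = '\0';  /* now inside the allocation */
  return str;
}
```

    With the extra byte, the write at offset len of data falls within the buffer, so this particular out-of-bounds write no longer exists for the analyzer to report.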

    I want to thank Tim Lange, who implemented this warning as part of Google Summer of Code 2022 (along with two other new warnings: -Wanalyzer-allocation-size and -Wanalyzer-imprecise-fp-arithmetic).

    Check if NULL is dereferenced

    Here's an example of another new warning—what's wrong with the following C code?

    #include <assert.h>
    #include <stdio.h>
    
    extern FILE *logfile;
    
    struct obj
    {
      const char *name;  
      int x;
      int y;
    };
    
    int is_within_boundary (struct obj *p, int radius_squared)
    {
      fprintf (logfile, "%s: (%i, %i)\n", p->name, p->x, p->y);
      if (!p)
        return 0;
      return (p->x * p->x) + (p->y * p->y) < radius_squared;
    }
    

    The issue is that the code is unclear about whether p can be NULL: it's dereferenced unconditionally at the fprintf call, but then checked for NULL later on. A compiler can assume that a pointer that's unconditionally dereferenced is non-NULL, and thus it can potentially optimize away the check against NULL. That is probably not what you want, but the compiler has no way to know what you meant.

    As of GCC 13, the -fanalyzer option now detects the above by emitting this warning:

    <source>: In function 'is_within_boundary':
    <source>:16:6: warning: check of 'p' for NULL after already dereferencing it [-Wanalyzer-deref-before-check]
       16 |   if (!p)
          |      ^
      'is_within_boundary': events 1-2
        |
        |   15 |   fprintf (logfile, "%s: (%i, %i)\n", p->name, p->x, p->y);
        |      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |      |   |
        |      |   (1) pointer 'p' is dereferenced here
        |   16 |   if (!p)
        |      |      ~
        |      |      |
        |      |      (2) pointer 'p' is checked for NULL here but it was already dereferenced at (1)
        |
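
    Either reading of the intent can be made explicit. The sketch below shows both; the function names are hypothetical, and the log stream is passed as a parameter (an assumption made here so the example is self-contained, where the original uses a global logfile).

```c
#include <assert.h>
#include <stdio.h>

struct obj
{
  const char *name;
  int x;
  int y;
};

/* Option 1: NULL is a valid input, so test it before any dereference.  */
int
is_within_boundary_nullable (FILE *log, const struct obj *p,
                             int radius_squared)
{
  if (!p)
    return 0;
  if (log)
    fprintf (log, "%s: (%i, %i)\n", p->name, p->x, p->y);
  return (p->x * p->x) + (p->y * p->y) < radius_squared;
}

/* Option 2: NULL is a caller bug, so say so and drop the later check.  */
int
is_within_boundary_nonnull (FILE *log, const struct obj *p,
                            int radius_squared)
{
  assert (p != NULL);
  if (log)
    fprintf (log, "%s: (%i, %i)\n", p->name, p->x, p->y);
  return (p->x * p->x) + (p->y * p->y) < radius_squared;
}
```

    In both variants the order of check and dereference matches the stated contract, which is the ambiguity the warning points at.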
    

    Other new warnings

    I don't have space in this article to give examples of every new warning added in GCC 13, but here's a round-up of the others.

    I added support to -fanalyzer for tracking the state of <stdarg.h>:

    • -Wanalyzer-va-list-leak for complaining about missing va_end after a va_start or va_copy
    • -Wanalyzer-va-list-use-after-va-end for complaining about va_arg or va_copy used on a va_list that's had va_end called on it
    • -Wanalyzer-va-arg-type-mismatch for type-checking of va_arg usage in interprocedural execution paths against the types of the parameters that were actually passed to the variadic call
    • -Wanalyzer-va-list-exhausted for complaining in interprocedural execution paths if va_arg is used too many times on a va_list
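
    For reference, here is a minimal sketch of <stdarg.h> usage that stays clear of all four problems listed above. The helper sum_ints is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums COUNT arguments, each of which must be an int.  */
int
sum_ints (int count, ...)
{
  assert (count >= 0);
  va_list ap;
  int total = 0;

  va_start (ap, count);
  /* Read exactly COUNT values, each as the type the caller passed (int),
     so the list is neither exhausted nor read with a mismatched type.  */
  for (int i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);  /* pairs with va_start; AP is not used after this point */

  return total;
}
```

    A call such as sum_ints (3, 1, 2, 3) passes three int values to match the count; passing fewer arguments, or passing a double, is the kind of interprocedural mismatch the last two warnings are described as targeting.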

    Immad Mir implemented tracking of file descriptors within the analyzer as part of Google Summer of Code 2022. We added seven new warnings relating to this in GCC 13:

    • -Wanalyzer-fd-access-mode-mismatch
    • -Wanalyzer-fd-double-close
    • -Wanalyzer-fd-leak
    • -Wanalyzer-fd-phase-mismatch (e.g. calling accept on a socket before calling listen on it)
    • -Wanalyzer-fd-type-mismatch (e.g. using a stream socket operation on a datagram socket)
    • -Wanalyzer-fd-use-after-close
    • -Wanalyzer-fd-use-without-check

    along with attributes for marking int function arguments as being file descriptors.
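
    The sketch below shows file descriptor handling that checks the result of open, uses the descriptor only while it is open, and closes it exactly once. It also marks a parameter with the fd_arg_read attribute, one of the attributes GCC 13 documents for this purpose (alongside fd_arg and fd_arg_write). The helper names and the FD_ARG_READ macro are hypothetical; the macro expands to nothing on compilers that lack the attribute.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Use the attribute only where the compiler reports support for it.  */
#if defined(__has_attribute)
# if __has_attribute (fd_arg_read)
#  define FD_ARG_READ(N) __attribute__ ((fd_arg_read (N)))
# endif
#endif
#ifndef FD_ARG_READ
# define FD_ARG_READ(N)
#endif

/* Hypothetical helper: argument 1 is a file descriptor that is read from.  */
FD_ARG_READ (1)
ssize_t read_some (int fd, char *buf, size_t len);

ssize_t
read_some (int fd, char *buf, size_t len)
{
  return read (fd, buf, len);
}

/* Hypothetical helper: reads up to LEN bytes from the start of PATH.
   Returns the byte count, or -1 on failure.  */
ssize_t
read_prefix (const char *path, char *buf, size_t len)
{
  assert (path != NULL && buf != NULL);
  int fd = open (path, O_RDONLY);
  if (fd < 0)  /* check the descriptor before using it */
    return -1;
  ssize_t n = read_some (fd, buf, len);
  close (fd);  /* closed once, on the only path where it was opened */
  return n;
}
```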

    Finally, I implemented various other warnings:

    • -Wanalyzer-exposure-through-uninit-copy (for detecting "infoleaks" in the Linux kernel)
    • -Wanalyzer-infinite-recursion
    • -Wanalyzer-jump-through-null
    • -Wanalyzer-putenv-of-auto-var
    • -Wanalyzer-tainted-assertion
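
    To illustrate one of these, the sketch below contrasts a recursion that makes no progress with one that does. GCC's documentation describes -Wanalyzer-infinite-recursion as comparing the state at entry to successive frames of the same function; the buggy variant is the kind of pattern that description suggests it is aimed at, though whether a given build reports it is not verified here. All names are hypothetical, and the buggy function is compiled only when DEMONSTRATE_INFINITE_RECURSION is defined.

```c
#include <assert.h>
#include <stddef.h>

struct node
{
  struct node *next;
};

#ifdef DEMONSTRATE_INFINITE_RECURSION
/* Buggy: recurses on N itself, so nothing changes between frames.  */
size_t
list_length_badly (const struct node *n)
{
  if (!n)
    return 0;
  return 1 + list_length_badly (n);
}
#endif

/* Each call moves one node closer to the NULL at the end of the list.  */
size_t
list_length (const struct node *n)
{
  if (!n)
    return 0;
  return 1 + list_length (n->next);
}

/* Small self-check: a three-node list has length 3.  */
void
list_length_selfcheck (void)
{
  struct node c = { NULL };
  struct node b = { &c };
  struct node a = { &b };
  assert (list_length (&a) == 3);
}
```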

    SARIF output

    In GCC 9 I added an option -fdiagnostics-format=json to provide machine-readable output for GCC's diagnostics. This is a custom JSON-based format that closely follows GCC's own internal representation.

    In the meantime, another JSON-based format has emerged as the standard in this space: SARIF (the Static Analysis Results Interchange Format). This file format is suited for capturing the results of static analysis tools (like GCC's -fanalyzer), but it can also be used for plain GCC warnings and errors.

    So for GCC 13 I've extended -fdiagnostics-format= to add two new options implementing SARIF support: -fdiagnostics-format=sarif-stderr and -fdiagnostics-format=sarif-file. I've also joined the technical committee overseeing the standard.

    By producing data in an industry-standard format, we benefit from interoperability with existing consumers of SARIF data. Figure 1 is a simple example, showing VS Code (with a SARIF plugin) viewing a SARIF file generated by GCC. The IDE is able to annotate the source code, adding squiggly lines under code where GCC finds problems. Here I've clicked on a line where -fanalyzer reported a double-free bug, and the IDE is showing the path of execution through the code that GCC predicted will trigger the problem.

    Figure 1: GCC SARIF output in VS Code.
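
    To try something similar, a small file with a deliberate double-free is enough. The sketch below is not the code shown in Figure 1; the file name and function names are hypothetical, and the buggy function is compiled only when DEMONSTRATE_DOUBLE_FREE is defined.

```c
/* Build sketch (the file name double_free.c is hypothetical):

     gcc -c -fanalyzer -fdiagnostics-format=sarif-file \
         -DDEMONSTRATE_DOUBLE_FREE double_free.c

   GCC's documentation describes sarif-file as writing the results to a
   file named after the source file with a .sarif suffix.  */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef DEMONSTRATE_DOUBLE_FREE
/* Buggy: BUF is freed twice on the same path.  */
void
process_badly (size_t n)
{
  char *buf = malloc (n);
  if (!buf)
    return;
  memset (buf, 0, n);
  free (buf);
  free (buf);  /* second free of the same pointer */
}
#endif

/* Corrected: one malloc, one free.  Returns 0 on success, or -1 if the
   allocation fails.  */
int
process (size_t n)
{
  assert (n > 0);
  char *buf = malloc (n);
  if (!buf)
    return -1;
  memset (buf, 0, n);
  free (buf);
  return 0;
}
```

    The resulting .sarif file can then be opened in any SARIF viewer, such as the VS Code plugin mentioned above.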

    Fixing false positives

    Static analyzers are not perfect—it's impossible to reason perfectly about the most interesting properties of source code. The GCC analyzer performs a crude simulation of the state of the inside of the program, and I've made many tradeoffs to try to make it fast enough to use when working on code. I receive anecdotal reports that people are using it and it's finding bugs for them earlier than they would have found them otherwise, but there will be false positives and false negatives. The analyzer is a bug-finding tool, rather than a tool for proving program correctness (and, alas, sometimes bugs lead to it being too slow). In technical terms, it's neither "sound" nor "complete." 

    I've spent the first few months of this year trying to reduce "spam" from the analyzer for GCC 13. I created an integration testing suite: I picked various real-world C projects, including Doom, the Linux kernel, and qemu. I've been building them with their standard options, but with -fanalyzer added to the build flags, examining the warnings emitted, and trying to fix the false positives.

    I made a lot of fixes to the analyzer; Table 2 shows some before and after numbers for the warnings that were most improved by this work, where FP means a "false positive" (a bogus warning about a non-problem) and TP means a "true positive" (a valid warning about a real problem in the source code).

    Table 2: Improved warnings.

    Warning                                  FP before   FP after   TP before   TP after
    -Wanalyzer-deref-before-check            63          12         1           1
    -Wanalyzer-malloc-leak                   78          50         0           61
    -Wanalyzer-use-of-uninitialized-value    998         125        0           0

    You can see that I eliminated most (but not all) of the false positives from -Wanalyzer-deref-before-check, and that I reduced the number of FPs from -Wanalyzer-malloc-leak whilst fixing it so that it correctly detected a bunch of real memory leaks that it had previously missed (in Doom's initialization logic, as it happens). Unfortunately, -Wanalyzer-use-of-uninitialized-value is still the "spammiest" warning, despite my making a big dent in its number of FPs; it seems to be most prone to exploring paths through the code that can't happen in practice, where the analyzer doesn't have enough high-level information about invariants in the code to figure that out.

    Trying it out

    GCC 13 has been released upstream, and is the system compiler in the recently released Fedora 38.

    For simple C examples, you can play around with the new GCC online at the Compiler Explorer site. Select GCC 13.1 and add -fanalyzer to the compiler options to run static analysis.

    As noted above, the analyzer isn't perfect, but I hope it's helpful. Given that every compiler and analyzer finds a slightly different subset of bugs, it's usually a good idea to run your code through more than one toolchain to see what shakes out.

    Finally, if you're interested in getting involved in compiler development, I've written a guide to getting started as a GCC contributor. It includes lots of ideas for new warnings and features in GCC's Bugzilla.

    Have fun!

    Last updated: December 5, 2023
