
Profile-guided optimization in Clang: Dealing with modified sources

July 6, 2020
Serge Guelton
Related topics:
Linux

    Profile-guided optimization (PGO), sometimes pronounced "pogo," is a now-common compiler technique: a first version of the binary is run to collect a profile, through instrumentation or sampling, and that profile is then used to guide a second compilation.

    Profile-guided optimization helps the compiler make better decisions, for instance about inlining or block ordering. In some setups, it also means compiling with obsolete profile information. For reasons that I will explain, tolerating stale profiles can benefit large projects, but it puts the burden on the compiler implementation to detect and handle inconsistencies.

    This article focuses on how the Clang compiler implements PGO, and specifically, how it instruments binaries. We will look at what happens when Clang instruments source code during the compilation step to collect profile information during execution. Then, I'll introduce a real-world bug that demonstrates the pitfalls of the current approach to PGO.

    Note: To learn more about PGO for Clang, see the Clang Compiler User's Manual.

    Instrumenting code in Clang

    In Clang, the -fprofile-instr-generate flag instructs the compiler to instrument code at the frontend (source) level, and the -fprofile-generate flag instructs the compiler to instrument code at the LLVM intermediate representation (IR) level. Both approaches share a design philosophy and differ mostly in granularity. Our topic is -fprofile-instr-generate and the way it interacts with source code changes between profiling and recompilation.

    Consider the following scenario:

    1. Compile a code sample (C0) with -fprofile-instr-generate.
    2. Run it to collect profile information (P0).
    3. Edit the C0 sample and turn it into a new version, C1.
    4. Compile C1 using the original P0 profile information.

    How Clang deals with code modification

    The scenario of using somewhat obsolete profile information might seem odd, because we usually compile, profile, and recompile. The profiling step can be quite time-consuming, however, so big projects are tempted to provide downloadable profile information based on a source snapshot. Users can then recompile the code with that profile, without the pain of collecting a new one every time. (The dotnet runtime takes this approach.)

    Furthermore, for projects with a high commit rate, it could be infeasible to provide profile information for each commit. As a result, slight changes to the code might not be reflected in the profile used for recompilation. So, how does Clang cope with that?

    The trivial answer of "compare checksums for the whole file" is not satisfying, because a slight change would invalidate the whole compilation unit. The actual mechanism relies on the same idea, applied per function: Clang computes a checksum over each function's abstract syntax tree (AST), based on the tree structure. That way, changing one function doesn't invalidate the profile information collected for the others. Of course, this approach has limitations. Removing a call site changes the number of times the callee runs, and thus its hotness. But at least it prevents having profile information that points to code that no longer exists, and vice versa.
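    Clang's real implementation hashes the structure of each function's AST, but the per-function idea can be modeled with ordinary file checksums. The following is a toy sketch under that analogy, not Clang's actual algorithm; the file names and function bodies are made up:

```shell
# Toy model of per-function checksums: hash each function body separately,
# so editing one function leaves the other functions' checksums untouched.
printf 'void foo() { if (bar) { } }\n'              > foo.v1
printf 'void foo() { if (bar) { if (bar) { } } }\n' > foo.v2   # foo was edited
printf 'void baz() { }\n'                           > baz.v1   # baz untouched

checksum() { md5sum "$1" | cut -d' ' -f1; }

# foo's checksum changed, so its stale profile entry would be ignored;
# baz's checksum is unchanged, so its profile entry remains usable.
[ "$(checksum foo.v1)" != "$(checksum foo.v2)" ] && echo "foo changed: profile entry ignored"
[ "$(checksum baz.v1)"  = "$(checksum baz.v1)" ] && echo "baz unchanged: profile entry reused"
```

    A whole-file checksum, by contrast, would change in both cases and discard the profile for every function in the translation unit.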

    Currently, if such outdated profile information is used, the Clang compiler ignores it and prints a warning:

    > echo 'int main() { return 0; }' > a.c && clang -fprofile-instr-generate a.c && LLVM_PROFILE_FILE=a.profraw ./a.out
    > llvm-profdata merge -output=a.profdata a.profraw
    > printf '#include <stdio.h>\nint main() { if(1) puts("hello"); return 0; }' > a.c && clang -fprofile-instr-use=a.profdata a.c
    warning: profile data may be out of date: of 1 function, 1 has mismatched data
          that will be ignored [-Wprofile-instr-out-of-date]
    1 warning generated.
    

    When the improbable happens

    Recently, I was tasked with debugging a Clang segmentation fault (segfault), reported in Red Hat Bugzilla as Bug 1827282. After debugging, I ended up with two different functions that produced the same checksum:

    extern int bar;
    
    // first version
    void foo() {
        if (bar) { }
        if (bar) { }
        if (bar) { if (bar) { } }
    }
    
    // second version
    void foo() {
        if (bar) { }
        if (bar) { }
        if (bar) { if (bar) { if (bar) { } } }
    }
    

    That's a strange outcome, because the checksum algorithm used in Clang relies on MD5, so the chance of a collision should be vanishingly low. Did the improbable happen?

    It turns out that it didn't. The collision was due to a subtle bug in the way the hash was finalized, which we fixed with a patch (D79961). When computing the hash, a uint64_t buffer is filled; once full, it is converted to an array of bytes and fed to the hashing routine. In the finalization step, however, the remaining uint64_t was passed directly to the routine and implicitly converted to a uint8_t, silently dropping its upper bytes and, with them, the trailing nodes of the AST. Alongside the fix, we added a test case that checks that a small change to a function body is reflected in its hash value.
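    The effect of that implicit narrowing can be reproduced with shell arithmetic. This is only an illustration; the values below are invented, not Clang's actual hash words. Two distinct 64-bit words whose low bytes coincide become indistinguishable once truncated to a single byte:

```shell
# Two different 64-bit "hash" words that happen to share the same low byte (0x42).
a=$((0x1111111111111142))
b=$((0x2222222222222242))

echo $(( a == b ))                    # 0: the full 64-bit words differ
echo $(( (a & 0xFF) == (b & 0xFF) )) # 1: truncated to one byte, they collide
```

    In the buggy finalization, two ASTs that differed only in nodes hashed into those upper bytes produced the same final checksum, which is exactly what happened with the two versions of foo above.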

    The patch works, but it changes the hash of most existing functions, namely those that had more than one element in their last buffer. That is an important side effect, because changing the hash invalidates most of the existing cached profile information. Fortunately, the patch doesn't impact the typical "compile, profile, recompile" scenario, but it could be an issue for large build systems that pre-compute profile data for clients to download as part of the build process.

    Conclusion

    Clang and GCC both support using obsolete profile information to guide the compilation process; when a function body has changed, its stale entry is ignored. This feature is valuable for large projects, where gathering profile information is costly, but it puts an extra burden on the compiler implementation to detect and handle inconsistencies, which in turn increases the likelihood of a compiler bug.

    Last updated: June 30, 2020
