Skip to main content
Redhat Developers  Logo
  • Products

    Platforms

    • Red Hat Enterprise Linux
      Red Hat Enterprise Linux Icon
    • Red Hat AI
      Red Hat AI
    • Red Hat OpenShift
      Openshift icon
    • Red Hat Ansible Automation Platform
      Ansible icon
    • View All Red Hat Products

    Featured

    • Red Hat build of OpenJDK
    • Red Hat Developer Hub
    • Red Hat JBoss Enterprise Application Platform
    • Red Hat OpenShift Dev Spaces
    • Red Hat OpenShift Local
    • Red Hat Developer Sandbox

      Try Red Hat products and technologies without setup or configuration fees for 30 days with this shared Openshift and Kubernetes cluster.
    • Try at no cost
  • Technologies

    Featured

    • AI/ML
      AI/ML Icon
    • Linux
      Linux Icon
    • Kubernetes
      Cloud icon
    • Automation
      Automation Icon showing arrows moving in a circle around a gear
    • View All Technologies
    • Programming Languages & Frameworks

      • Java
      • Python
      • JavaScript
    • System Design & Architecture

      • Red Hat architecture and design patterns
      • Microservices
      • Event-Driven Architecture
      • Databases
    • Developer Productivity

      • Developer productivity
      • Developer Tools
      • GitOps
    • Automated Data Processing

      • AI/ML
      • Data Science
      • Apache Kafka on Kubernetes
    • Platform Engineering

      • DevOps
      • DevSecOps
      • Ansible automation for applications and services
    • Secure Development & Architectures

      • Security
      • Secure coding
  • Learn

    Featured

    • Kubernetes & Cloud Native
      Openshift icon
    • Linux
      Rhel icon
    • Automation
      Ansible cloud icon
    • AI/ML
      AI/ML Icon
    • View All Learning Resources

    E-Books

    • GitOps Cookbook
    • Podman in Action
    • Kubernetes Operators
    • The Path to GitOps
    • View All E-books

    Cheat Sheets

    • Linux Commands
    • Bash Commands
    • Git
    • systemd Commands
    • View All Cheat Sheets

    Documentation

    • Product Documentation
    • API Catalog
    • Legacy Documentation
  • Developer Sandbox

    Developer Sandbox

    • Access Red Hat’s products and technologies without setup or configuration, and start developing quicker than ever before with our new, no-cost sandbox environments.
    • Explore Developer Sandbox

    Featured Developer Sandbox activities

    • Get started with your Developer Sandbox
    • OpenShift virtualization and application modernization using the Developer Sandbox
    • Explore all Developer Sandbox activities

    Ready to start developing apps?

    • Try at no cost
  • Blog
  • Events
  • Videos

The new oracles of GCC

December 20, 2023
Andrew MacLeod
Related topics:
Compilers
Related products:
Red Hat Enterprise Linux

Share:

    A new infrastructure has been added to GCC optimizers over the past few years. This was nominally in support of Ranger, the new range generation engine that I have been working on. The primary goal is to modernize the code base and provide flexible usage options throughout the rest of the compiler.

    One of the major changes was moving to an on-demand model for calculating ranges. Traditionally range information was created by the VRP (Value Range Propagation) pass and some basic global information was then made available outside this pass. The more accurate context sensitive ranges were only available within the pass. There were some attempts to integrate the old VRPs mechanisms into other passes, but these were awkward and ultimately unsatisfactory.

    Ranger however can be enabled in any pass and queried for accurate range information without any additional requirements. It has virtually no startup overhead, and performs a minimal amount of work when a query is made. This makes it quite inexpensive for most uses. In GCC 14 there are at least nine different passes which are now using range information via Ranger, and we continue to find new opportunities to use it.

    Ranger as an oracle

    Ranger has been changed to operate within GCC as an oracle. Oracles are commonly understood to provide expert answers to questions without the client needing to understand the subject matter. A client pass can ask for the range of a variable at any specific statement, edge, or block and get a definitive answer. The client has no need to understand how it happens or enable anything else. The oracle takes care of lookup, processing, and then collating relevant information into a consumable form.

    Much of the other infrastructure Ranger uses has also been converted to a similar oracle model, providing expert knowledge of its subject matter. They each have a specific API and no prerequisites on the other oracles.

    I have made all the new oracles always queryable, but not necessarily active. This means the client code can always ask questions of an omnipresent oracle. The quality of the answer provided can be adjusted by how much work the oracle is allowed to do. If it is disabled, we simply return an answer indicating it doesn’t know much. For a range, it might be [-INF, +INF]. For a relation, it might be NO_RELATION. Clients can get better answers by simply enabling a smarter oracle (at a higher cost) to answer the question. There are different variations of the Ranger oracle for exactly this reason and they all share the identical query API.

    This is a list of numerous advantages to this model:

    • Functionality - You can tailor functionality by only enabling the oracles you care about.
    • Performance - Oracles also have a cost, so you can also tailor performance by not enabling those which are expensive or unlikely to be profitable.
    • Sharable - The information in the oracle can be queried by more than just Ranger. The client pass can also make queries from an oracle if needed.
    • Independence - A desired oracle can be turned on without Ranger and utilized independently.
    • Expandable - Trivial to add new oracles and use them. It doesn’t change any existing APIs or members in existing classes.

    The new oracles

    There are other new oracles created to support Ranger in the past few releases. Many of them started as private internal components, but have been adjusted as oracles available throughout the compiler. The primary motivation for this change has been the sharing and independence aspects so they don’t have to be exported from a Ranger instance for use.

    Static edge range oracle

    This is the simplest oracle and provides the static range implied by an edge from a branch. Primarily it exists because there is no simple way in GCC to map switch edges to the range of the cases it leads to. For a conditional branch (i.e., if (x)), this is the value of TRUE or FALSE.

    Switches are much more complicated. The oracle will analyze all the edges in the switch and the value of the case labels they lead to. It then builds ranges for those labels and edges and caches them such that any further queries will be a simple lookup. This is primarily a performance oriented oracle to avoid multiple expensive recalculations. The various switch optimization passes can also utilize this oracle.

    Inferred range oracle

    An inferred range is a side effect of executing a statement. Take a pointer de-reference such as a = *b, for instance. We can usually infer that after this is executed b will be non-null. Otherwise an exception would have been triggered.

    Similarly, if we see a = b / c we can assume that c is non-zero afterwards or a divide-by-zero trap would have triggered.

    A third case could be z = x << y. The language standards say that shift values that are out of range for the type are undefined behavior. This allows us to assume the range of y after this statement might be something like [0, 64].

    The oracle for inferred ranges manages the results of statements as they are processed by the independent range folding machinery and makes them generally available. We use this for data flow analysis to propagate the range adjustments through the flow graph. How is this different from a cache, you might ask? Well, it is very similar. But the data is collected in the background by using the folding components. It also has a mode where it will follow use-def chains to see if there are any inferred ranges in the block that it does not know about, so it is able to find the answer on its own at a higher cost.

    Relation oracle

    GCC 13 added an oracle that is used to find relations between values. The obvious cases are branches like if (x > y). The x will be greater than y in all blocks dominated by the TRUE edge, and less than or equal to y on the FALSE side. There are many other relations which can be added, however.

    For example, with a = b + c, if c is known to be non-negative, then we know a > b. Likewise if c is known to be negative, then a < b is true. Other arithmetic operations can generate similar relations.

    There are other calculations we can make. The following is a common idiom with unsigned values (which wrap):

      x = y + 1
      If (x < y)
        Overflow (y)

    At the call to overflow(), the oracle indicates x < y, and Ranger can use this information to re-evaluate x = y + 1  and determine that y must be 0xFFFFFFFF because that is the only possible way the relation can be true.

    Before the relation oracle, all relation tracking in GCC was very ad-hoc, with different passes rolling their own solution for their specific needs. This was often achieved by hard-coded recognition of specific code patterns. As situations became more complex, it was hard to get everything right.

    Although we haven’t extended direct use of this oracle much in GCC 14 because there never seems to be enough time to do everything, it is used indirectly by many passes through the use of the Ranger oracle which enables it. We intend to push its individual use into the remaining passes to replace their custom solutions.

    PHI analyzer

    GCC 14 adds a new oracle, the PHI analyzer. I won’t go into details about what the SSA form is, but it includes statements referred to as PHI nodes. These are just fancy copies that allow the value to be chosen depending on which flow graph edge was taken to arrive at the PHI node.

      a_2 = PHI <0(2), a_6(3), b_2(5)>

    The a_2 value is assigned the value of 0, a_6, or b_2 depending on whether the execution arrived via the edge from block 2, block 3, or block 5.

    Loop analysis uses PHI nodes to find loop induction variables, but there are many other classes of PHI usage patterns with which it does not help. The PHI analyzer currently identifies groups of PHI nodes which mostly consume each other but are only updated by one statement. This allows us to make observations about the entire group of values.

      <bb 4> :
      qa_20 = qa_10 + 1;
      <bb 5> :
      # qa_9 = PHI <qa_10(3), qa_20(4)>
      <bb 6> :
      # qa_10 = PHI <0(2), qa_9(5)>

    The qa_9, qa_10, and qa_20 values form a group whose only initialization value is 0, and the only modifier is qa_20 = qa_10 + 1. With this information the oracle can determine that qa_9, qa_10, and qa_20 start with 0 and always increase in value. Therefore, it must be within the range of [0, +INF]. If we happen to have loop information available, we may also be able to determine that the statement executes 50 times at the most. Then we can further conclude the range is [0, 50] based on simulating the execution of the modification statement that many times. This is one simple example of what the PHI analyzer can do, and I anticipate expanding its functionality in future releases.

    Summary

    Although the concept of an oracle is not new, my increasing use of it in GCC is. A significant part of their design ensures their performance is similar to a more tightly integrated traditional solution. Now, I try to add new components using this model when it can be of use elsewhere. This was not my original intention, but the rapid adoption of Ranger into the other passes is, in part, due to its ease of use. Developers are more likely to utilize it when they discover there is virtually no learning curve, and that is a novelty in the world of GCC’s internals.

    Related Posts

    • Value range propagation in GCC with Project Ranger

    • An upside-down approach to GCC optimizations

    • Understanding GCC warnings

    • How we expanded GCC value range propagation to floats

    Recent Posts

    • A deep dive into Apache Kafka's KRaft protocol

    • Staying ahead of artificial intelligence threats

    • Strengthen privacy and security with encrypted DNS in RHEL

    • How to enable Ansible Lightspeed intelligent assistant

    • Why some agentic AI developers are moving code from Python to Rust

    What’s up next?

    Discover how edge computing and automation can help your organization improve scalability, security, agility, and overall efficiency. Automation at the edge illustrates the benefits of edge automation in seven industry use cases.

    Get the e-book
    Red Hat Developers logo LinkedIn YouTube Twitter Facebook

    Products

    • Red Hat Enterprise Linux
    • Red Hat OpenShift
    • Red Hat Ansible Automation Platform

    Build

    • Developer Sandbox
    • Developer Tools
    • Interactive Tutorials
    • API Catalog

    Quicklinks

    • Learning Resources
    • E-books
    • Cheat Sheets
    • Blog
    • Events
    • Newsletter

    Communicate

    • About us
    • Contact sales
    • Find a partner
    • Report a website issue
    • Site Status Dashboard
    • Report a security problem

    RED HAT DEVELOPER

    Build here. Go anywhere.

    We serve the builders. The problem solvers who create careers with code.

    Join us if you’re a developer, software engineer, web designer, front-end designer, UX designer, computer scientist, architect, tester, product manager, project manager or team lead.

    Sign me up

    Red Hat legal and privacy links

    • About Red Hat
    • Jobs
    • Events
    • Locations
    • Contact Red Hat
    • Red Hat Blog
    • Inclusion at Red Hat
    • Cool Stuff Store
    • Red Hat Summit
    © 2025 Red Hat

    Red Hat legal and privacy links

    • Privacy statement
    • Terms of use
    • All policies and guidelines
    • Digital accessibility

    Report a website issue