Featured image for: Value range propagation in GCC with Project Ranger.

One of the optimizations GNU Compiler Collection (GCC) performs on C and C++ programs is value range propagation (VRP). VRP determines what subranges a variable can contain and uses that information to eliminate redundant calculations. This in turn makes programs smaller and run faster.

As a simple example, consider the two nested if statements in the following C snippet:

if (a < 100 && b < 100)
  {
    c = a + b;
    if (c < 200)
      foo (c);
  }

We know that, if the outer if statement runs its block, a + b cannot be larger than 99 + 99, or 198. This means we don’t need to check if (c < 200) and can rewrite the snippet to be simply:

if (a < 100 && b < 100)
  foo (a + b);

Branches are expensive in programs, so this sort of thing is a big deal. It's an even bigger deal in the modern world where we are concerned about security issues like buffer overflows (accessing data outside of expected bounds). The compiler may add range checks or do other verification to harden the code, but this comes at a cost. VRP can determine that some of those checks are not needed. This produces a faster program that is still safe. We can also use range information to identify dangerous segments of code and alert the user with a warning. The list of potential uses for VRP regularly grows.

The initial version of VRP in GCC was created in 2006 when static single assignment (SSA) was a new infrastructure in the compiler. With a renewed interest in utilizing range information, GCC’s original version of VRP simply isn't flexible enough. Numerous improvements have been made over the past decade, but there is still a long list of outstanding change requests. The underlying infrastructure simply has too many shortcomings to address current needs. The oldest request originates in 2007!

Enter Project Ranger

We’ve been working on Project Ranger for more than three years now. It is designed from the ground up to resolve all the outstanding issues and replace the current VRP pass. Other passes in the compiler will also be able to access this range information. This upgrade is not without difficulties, as anyone who has tried to revamp a 30-year-old codebase may have experienced. An earlier article covers some of the new technology in this project.

GCC 10 (the current release) contains an object-oriented replacement for the underlying range calculations we call range-ops. Range-ops is responsible for solving various kinds of range equations based on each statement. It is also pre-wired to support a new representation of ranges. Range-ops was the first stage of Ranger to make it into GCC. This stage has been live for almost a year and continues to perform well.

The upcoming GCC 11 release will contain:

  • A new implementation of ranges that allows more precise representation.
  • A new Ranger VRP infrastructure that can make use of these precise ranges.
  • A hybrid VRP pass for this release only, which contains both the original and the new version of VRP.

A couple of other passes are also utilizing Project Ranger to provide some combination of improved speed and better results.

Why is there a hybrid pass?

Development of Project Ranger will continue into GCC 12, where we expect to see the project fully realized. Without the full suite of features, we cannot fully replace the original VRP pass.

The hybrid pass queries both the original and the new VRP engines and combines the results. This hybrid pass helps us optimize everything we used to do, as well as getting some new opportunities. It also allows us to easily track differences between the old and new approaches. Not to worry—the hybrid pass is still quite fast.

The primary missing component is relation processing. The initial example in this article showed how ranges can be used to remove conditions. There are times when a relationship between operands must be known in order to remove the condition. An example follows:

if (a == b)
  if (b == c)
    if (a < c) // This can never be true
      call (b)

The final if condition is always false, but we can't determine that based on ranges alone. There might be no discernible ranges for any of these variables. To detect the value of the expression, we need to introduce a thorough knowledge of relations. If we know that a == b and b == c, we can also conclude that a == c, and therefore a < c can never be true. It seems easy and obvious, but it is not quite so natural in the compiler.

The original version of VRP has numerous ad-hoc additions that allow it to perform this optimization sometimes. This is why GCC 11 has a hybrid pass. These important optimizations can still be performed while benefiting from the other features Ranger provides. We also get a full release cycle to compare the new version to the old one and ensure we miss nothing.

This new relation processing code is ready for the start of the GCC 12 development cycle. We are leveraging the new range-ops infrastructure to also evaluate relationships. This enhancement will allow any kind of statement that may calculate a range to also provide relationships between operands:

a = b + 2    // Establishes a > b
if (a < b)   // We can fold this away since it can never be true

We can conclude that a > b after the addition, and eliminate the if statement. We expect to expose and exploit a lot more of these implied relationships in GCC 12.

Into the future

Project Ranger should be the only VRP implementation required in GCC 12. We can remove the legacy code and the various compatibility layers it required. This will improve compilation speed as well as reduce maintenance effort. Finally, we expect there will be additional passes making use of the more precise ranges that are easily available.

Building on this into the future, we also have plans for:

  • Handling ranges for types other than integral values (floats, complex numbers, etc.).
  • Bit value tracking (known 0 and 1 bits).
  • Additional passes making use of this infrastructure.
  • Continued improvement in memory usage and compilation speed.

You may not be able to tell when you compile a program, but there certainly is a lot of new work going on under the covers. GCC 11 has some range improvements, but we hope to resolve most, if not all, of the 26 outstanding deficiencies and feature requests when GCC 12 comes out.

Last updated: April 27, 2021

Comments