Value range propagation (VRP) is an optimization technique used in compilers. The article "Value range propagation in GCC with Project Ranger" describes how this optimization works and how the GCC team implements it for C and C++ programs. This article explains how we expanded VRP beyond integers and pointers to other data types, particularly floating-point numbers.
Expanding data types in range tracking
A generic, type-agnostic way to keep track of ranges was part of our original goals for the Ranger project. At first, VRP tracked just integers and pointers, but we also wanted to apply it to floats, strings, and other data types.
We spent a good chunk of the GCC 13 release ridding Ranger of any dependencies on integers baked in the original VRP implementation. We made Ranger work with a generic vrange class instead of the irange that was specific to integers and pointers. Then we developed an infrastructure to declare typeless temporaries that could be used in intermediate computations within Ranger. When all was said and done, we had a core that could provide VRP, jump threading, and other range-aware optimizations on anything we could express in terms of vrange.
For the curious, vrange is nothing more than an abstract class that describes operations on properties (ranges, in our case). It is basically a class to make it easy for Ranger to do operations on sets (union, intersect, etc.):
    class vrange
    {
    public:
      virtual void set (tree, tree, value_range_kind = VR_RANGE);
      virtual tree type () const;
      virtual bool supports_type_p (const_tree type) const;
      virtual void set_varying (tree type);
      virtual void set_undefined ();
      virtual bool union_ (const vrange &);
      virtual bool intersect (const vrange &);
      virtual bool singleton_p (tree *result = NULL) const;
      virtual bool contains_p (tree cst) const;
      virtual bool zero_p () const;
      virtual bool nonzero_p () const;
      virtual void set_nonzero (tree type);
      virtual void set_zero (tree type);
      virtual void set_nonnegative (tree type);
      virtual bool fits_p (const vrange &r) const;
      …
      …
    };
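To make the "operations on sets" idea concrete, here is a minimal standalone sketch of an abstract range interface with one concrete implementation. It is not GCC code: the names toy_range and toy_int_range are hypothetical, the interval is a single pair of long endpoints rather than GCC trees, and union is approximated by the convex hull of the two intervals.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical stand-in for an abstract range interface. These names are
// illustrative only and are not GCC's declarations.
class toy_range
{
public:
  virtual ~toy_range () = default;
  virtual void set_varying () = 0;    // every value of the type is possible
  virtual void set_undefined () = 0;  // the empty set
  virtual bool undefined_p () const = 0;
  // Set operations; each returns true if *this changed.
  virtual bool union_ (const toy_range &other) = 0;
  virtual bool intersect (const toy_range &other) = 0;
};

// One concrete property: a single closed interval [lo, hi] of longs.
class toy_int_range : public toy_range
{
public:
  toy_int_range (long lo, long hi) : m_lo (lo), m_hi (hi)
  {
    assert (lo <= hi);
  }

  void set_varying () override { m_lo = LONG_MIN; m_hi = LONG_MAX; }
  void set_undefined () override { m_lo = 1; m_hi = 0; }  // lo > hi means empty
  bool undefined_p () const override { return m_lo > m_hi; }

  bool union_ (const toy_range &other) override
  {
    // Assumption of this sketch: both operands have the same dynamic type.
    const auto &o = dynamic_cast<const toy_int_range &> (other);
    if (o.undefined_p ())
      return false;
    if (undefined_p ())
      {
        m_lo = o.m_lo;
        m_hi = o.m_hi;
        return true;
      }
    // Convex hull: an over-approximation when the intervals are disjoint.
    long lo = std::min (m_lo, o.m_lo);
    long hi = std::max (m_hi, o.m_hi);
    bool changed = lo != m_lo || hi != m_hi;
    m_lo = lo;
    m_hi = hi;
    return changed;
  }

  bool intersect (const toy_range &other) override
  {
    const auto &o = dynamic_cast<const toy_int_range &> (other);
    if (undefined_p ())
      return false;
    long lo = std::max (m_lo, o.m_lo);
    long hi = std::min (m_hi, o.m_hi);
    if (lo > hi)
      {
        set_undefined ();
        return true;
      }
    bool changed = lo != m_lo || hi != m_hi;
    m_lo = lo;
    m_hi = hi;
    return changed;
  }

  long lo () const { return m_lo; }
  long hi () const { return m_hi; }

private:
  long m_lo;
  long m_hi;
};
```

Code that only needs union and intersection can be written against toy_range and never has to know which kind of value is being tracked, which is the property the paragraph above describes for vrange.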
Current support for floating-point numbers
Once the vrange class was in place, our next step was to provide a barebones implementation of frange (floating point range) that would give us enough tools to fold conditionals and perhaps keep track of "not a number" values (NaNs). We accomplished this task in the simplest way we could find, through a class that keeps track of endpoints and whether a positive or negative NaN is possible:
    class frange : public vrange
    {
      …
      …
    private:
      tree m_type;
      REAL_VALUE_TYPE m_min;
      REAL_VALUE_TYPE m_max;
      bool m_pos_nan;
      bool m_neg_nan;
    };
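The sketch below shows how a representation of "two endpoints plus two NaN flags" can answer membership queries and support union and intersection. It is a simplified model under stated assumptions, not the frange implementation: toy_frange and all of its members are hypothetical, plain double replaces REAL_VALUE_TYPE, and signed zeros are not distinguished.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical toy model: a closed interval plus one flag per NaN sign.
class toy_frange
{
public:
  toy_frange (double lo, double hi, bool pos_nan, bool neg_nan)
    : m_min (lo), m_max (hi), m_pos_nan (pos_nan), m_neg_nan (neg_nan)
  {
    // NaN never appears as an endpoint; it is tracked only by the flags.
    assert (!std::isnan (lo) && !std::isnan (hi));
  }

  // lo > hi encodes "no numeric values at all" in this sketch.
  bool numeric_empty_p () const { return m_min > m_max; }
  bool maybe_nan_p () const { return m_pos_nan || m_neg_nan; }

  bool contains_p (double v) const
  {
    if (std::isnan (v))
      return std::signbit (v) ? m_neg_nan : m_pos_nan;
    return m_min <= v && v <= m_max;
  }

  void union_ (const toy_frange &o)
  {
    if (numeric_empty_p ())
      {
        m_min = o.m_min;
        m_max = o.m_max;
      }
    else if (!o.numeric_empty_p ())
      {
        m_min = std::min (m_min, o.m_min);
        m_max = std::max (m_max, o.m_max);
      }
    // A NaN is possible after the union if either side allowed it.
    m_pos_nan = m_pos_nan || o.m_pos_nan;
    m_neg_nan = m_neg_nan || o.m_neg_nan;
  }

  void intersect (const toy_frange &o)
  {
    m_min = std::max (m_min, o.m_min);
    m_max = std::min (m_max, o.m_max);
    // A NaN survives the intersection only if both sides allowed it.
    m_pos_nan = m_pos_nan && o.m_pos_nan;
    m_neg_nan = m_neg_nan && o.m_neg_nan;
  }

  double min () const { return m_min; }
  double max () const { return m_max; }

private:
  double m_min;
  double m_max;
  bool m_pos_nan;
  bool m_neg_nan;
};
```

Keeping the NaN flags separate from the endpoints means the interval arithmetic never has to compare against a NaN, which is one plausible reason for the split shown in the class above.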
After much more work than anticipated (because floats are painfully hard, and things never go according to plan), here are a handful of things we now support in GCC 13:
- Folding of symbolic relational operators:
    if (x > y)
      {
        if (x == y)
          link_error ();
      }
- Propagation of NaNs and infinities:
    if (x > y)
      {
        // x is not a NAN
        // y is not a NAN
        // x is not -INF
        // y is not +INF
      }
- Intervals for ranges:
    if (x >= 5.0)
      {
        // x is not a NAN
        // x is [5.0, +INF]
      }
    else
      {
        // x is [-INF, 4.99999952316] U [NAN]
      }
- Signed zeros:
    if (x == 0.0)
      {
        if (__builtin_signbit (x))
          y = x;  // y = -0.0
        else
          z = x;  // z = +0.0
      }
- A handful of operations, including +, -, *, /, negate, abs, relational operators, unordered relational operators, etc.
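The interval example in the list above can be reproduced with a short standalone sketch. It assumes single-precision float and models the two outgoing edges of `x >= c`: the true edge excludes NaN because ordered comparisons with NaN are false, and the false edge ends at the largest float strictly below c. The type toy_edge_range and both functions are hypothetical names, and this is one way to arrive at the 4.99999952316 bound, not a description of how Ranger computes it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical result type: a closed interval plus a "NaN possible" flag.
struct toy_edge_range
{
  float min;
  float max;
  bool maybe_nan;
};

// Range of x where `x >= c` is true. NaN is excluded because every
// ordered comparison involving a NaN evaluates to false.
toy_edge_range ge_true_edge (float c)
{
  assert (std::isfinite (c));  // this sketch only handles a finite bound
  return { c, std::numeric_limits<float>::infinity (), false };
}

// Range of x where `x >= c` is false: either x < c or x is a NaN.
// The closed upper bound is the float immediately below c.
toy_edge_range ge_false_edge (float c)
{
  assert (std::isfinite (c));
  const float neg_inf = -std::numeric_limits<float>::infinity ();
  return { neg_inf, std::nextafter (c, neg_inf), true };
}
```

For c = 5.0f the upper bound of the false edge is 5 - 2^-21, roughly 4.99999952316, which matches the value in the list. Expressing `x < 5.0` as a closed interval this way lets both edges use the same endpoint representation.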
Even with this initial implementation, we have been able to close quite a few long-standing PRs (bug reports and enhancement requests) related to floats, including PR24021 (VRP does not work with floating points), which has been with us for 17 years!
Using optimized processor instructions
The goal of this work is to provide an infrastructure to increasingly flesh out floating point operations that can help us do better range propagation, as well as aid other optimizations that generate better code. For example, some architectures provide a cheap sqrt instruction that works on positive numbers but not on NaNs. With floating point ranges, the instruction selection pass might determine that the argument to sqrt is neither a negative value nor a NaN and can replace an expensive sqrt instruction with a cheaper one.
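As a rough illustration of that decision, the sketch below picks between two hypothetical sqrt variants based on a known argument range. Everything here is assumed for the example: toy_arg_range, sqrt_choice, and select_sqrt are invented names, and real instruction selection in GCC works on its own internal representation rather than on a struct like this.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical summary of what is known about the argument to sqrt.
struct toy_arg_range
{
  double min;
  double max;
  bool maybe_nan;
};

enum class sqrt_choice
{
  cheap,  // assumed valid only for non-negative, non-NaN inputs
  full    // handles negative inputs and NaNs
};

// Choose the cheap variant only when the range proves it is safe.
sqrt_choice select_sqrt (const toy_arg_range &r)
{
  // Endpoints are never NaN in this sketch; NaN is tracked by the flag.
  assert (!std::isnan (r.min) && !std::isnan (r.max));
  // Note: -0.0 >= 0.0 is true, so a range starting at -0.0 is accepted.
  // Whether a given cheap instruction handles -0.0 is target-specific.
  if (!r.maybe_nan && r.min >= 0.0)
    return sqrt_choice::cheap;
  return sqrt_choice::full;
}
```

The check is conservative: if the range says a NaN or a negative value is merely possible, the full variant is kept.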
We also hope to use this work with glibc in the next release to provide entry points into libm for functions like sin() or cos() when their operands are known to be in a certain range and free of NaNs and INFs. This improvement will allow us to generate faster code because the compiler can choose cheaper versions of those functions.