Possible issues with debugging and inspecting compiler-optimized binaries
Developers think of their programs as a serial sequence of operations running as written in the original source code. However, program source code is just a specification for computations. The compiler analyzes the source code and determines if changes to the specified operations will yield the same visible results but be more efficient. It will eliminate operations that are ultimately not visible, and rearrange operations to extract more parallelism and hide latency. These differences between the original program’s source code and the optimized binary that actually runs might be visible when inspecting the execution of the optimized binary via tools like GDB and SystemTap.
To aid with the debugging and instrumentation of binaries, the compiler generates debug information that maps between the source code and the executable binary. The debug information records which line of source code each machine instruction is associated with, where the variables are located, and how to unwind the stack to get a backtrace of function calls. However, even with the compiler generating this information, a number of non-intuitive effects might be observed when instrumenting a compiler-optimized binary:
- Expected probe points in code are missing.
- Variable values might not be available at some locations.
- A variable might have multiple values at a location.
- Multiple variables have a mixture of old and new values existing at a location.
- Multiple entirely different variables with the same name exist at a location.
These unexpected behaviors are some of the reasons that developers are encouraged to use -Og rather than enabling compiler optimization with -O3. However, there are cases where developers might not have the option to recompile the code with more debugging-friendly options. Understanding why these situations happen might save you time and frustration when investigating a misbehaving program. Let’s take a look.
Missing probe points because lines are eliminated
The binary code for a particular line of source code might be removed by the compiler because it has no effect on the later results. This removal might happen when the compiler data and control flow analysis for the function determines that while the code on the line is on a control flow path that could be executed, the values computed are never used. The debugging information that maps the instructions back to source code would have no entries for those eliminated lines. GDB and SystemTap would not be able to inspect the state of the program at those exact source code lines because they no longer exist in the binary.
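As a minimal sketch of this effect, consider the hypothetical function below (the name sum_values and its contents are illustrative, not taken from any real code base). The line that computes scaled is on a path that always runs, but its result is never used, so an optimizing compiler is free to emit no instructions for it.

```c
#include <stddef.h>

/* Hypothetical example: sums an array of integers. */
long sum_values(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; i++) {
        /* Computed on every iteration but never used afterwards.
         * With optimization enabled the compiler may remove this
         * line entirely, leaving no instructions to probe. */
        long scaled = (long)values[i] * 100;
        (void)scaled; /* silences the unused-variable warning only */

        total += values[i];
    }
    return total;
}
```

When a file like this is built with optimization and debug information enabled (for example, gcc -O2 -g), the line table may contain no entry for the scaled line. A breakpoint requested on that line in GDB is then typically placed on the next line that still has code, which might not be where you expected to stop.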
These missing source lines become more problematic when a function is inlined in multiple places: some instances of the inlined function have a line optimized out, while others keep that same line. This situation can lead to paths that should be instrumented being missed. SystemTap does have logic to place a probe point near a missing line when that line isn’t available at all. However, when the line survives in some instances of the generated code, SystemTap probes only those instances and might miss the ones where the line was removed.
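The sketch below shows how one source line can survive in one inlined copy and disappear from another. The function names clamp_and_count and process are hypothetical. In the first call the arguments are constants, so the compiler can prove that the branch is never taken and drop it. In the second call the value is unknown, so the branch has to stay.

```c
/* Hypothetical helper: clamps value to limit and counts how often it did. */
static inline int clamp_and_count(int value, int limit, int *clamp_count)
{
    if (value > limit) {
        *clamp_count += 1; /* may exist in only some inlined copies */
        value = limit;
    }
    return value;
}

int process(int input, int *clamp_count)
{
    /* Constant arguments: 5 > 10 is known to be false at compile time,
     * so this inlined copy may contain no code for the branch body. */
    int small = clamp_and_count(5, 10, clamp_count);

    /* Unknown argument: this inlined copy must keep the branch. */
    int large = clamp_and_count(input, 10, clamp_count);

    return small + large;
}
```

A probe on the *clamp_count += 1 line might then resolve to a single address inside process, even though the source suggests two call sites. Listing the probe points that SystemTap can resolve (the stap -L option does this) is one way to check which instances actually survived.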
The DWARF debug information specification includes a flag marking the start of a basic block that can help identify the other nearby lines on a per basic block basis. However, in the generated code examined so far, this flag doesn’t seem to ever be set by the GCC or Clang compilers.
The value for a variable might not exist at a point in the code
Compilers try to be efficient and store values in places that have the lowest cost to access. On modern processors, the registers can be accessed with the least amount of delay. However, on most processors, there are only a limited number of registers to store values, making them a scarce resource.
The compiler’s register allocation code attempts to maximize register utilization by using the same registers to hold different variables at different times. The compiler may determine that the value of a variable is no longer needed past some point in the binary and reuse its register to hold another variable. The old value is lost once the new variable is written. Thus, a particular variable might not have any value available at a particular location in the machine code.
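A small sketch of a function where this can happen is shown below. The name combine and the arithmetic are purely illustrative. The variable first is not used after partial is computed, so the register that held it is a candidate for reuse.

```c
/* Hypothetical example: 'first' is dead after 'partial' is computed. */
int combine(int x, int y)
{
    int first = x * 3 + 1;
    int partial = first * first; /* last use of 'first' */

    /* From here on the compiler may reuse the register that held
     * 'first', for example to hold 'second'. */
    int second = y - 7;

    return partial + second * second;
}
```

If you stop on the return line of an optimized build and ask GDB to print first, it will commonly report <optimized out>, because the debug information says that no location holds the value at that address. Whether this happens depends on the compiler, target, and optimization level.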
Multiple values for a particular variable might exist at a probe point
Developers might reuse a single variable to hold values that are not dependent on each other at different places in a function. The compiler might reorder the operations related to that variable in the binary so that those multiple values are live at the same time, in order to make use of the processor’s ability to execute instructions in parallel or move operations earlier so they do not delay later dependent operations.
This effect might occur with the local variables of a function inlined multiple times in another function. The instructions for the multiple instances of the inlined function are reordered so that the different instances of the local variable from the inlined function are live at the same time. This effect might also happen with loop unrolling, where multiple iterations of a loop are scheduled together. The instances of a local variable a for iterations i through i+3 might all have values at the same point in the binary. With aggressive vectorization, this issue might become more common.
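The loop below is a minimal sketch of code that is a candidate for unrolling or vectorization. The function name scale_add and its parameters are hypothetical. In the source there is one a per iteration, but an optimized build may compute several of them side by side.

```c
#include <stddef.h>

/* Hypothetical example: out[i] = in[i] * k + 1 for each element. */
void scale_add(float *out, const float *in, float k, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* One 'a' per iteration in the source. After unrolling or
         * vectorization, the values of 'a' for several consecutive
         * iterations may be held in registers at the same time. */
        float a = in[i] * k;
        out[i] = a + 1.0f;
    }
}
```

A probe inside the loop body of such a build may therefore fire once for a group of iterations rather than once per iteration, and the single a that the debug information describes may correspond to only one of the values in flight.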
Values for multiple variables might not be in a coherent state at a particular location
As mentioned earlier, the compiler might interleave or change the order of operations. Below is a simple example where one might want to probe on the line that computes d, with the expectation that the current value of a will be available:
    a = b + c   /* source line 1 */
    d = e * f   /* source line 2 */
    g = d + a   /* source line 3 */
However, the compiler might reorder the operations like the code below in order to provide more time between the calculation of d and its use to compute g. If the programmer inspects values immediately after running line two, the value of d is available, but the value for a has not been computed (unlike the original source code above):
    d = e * f   /* source line 2 */
    a = b + c   /* source line 1 */
    g = d + a   /* source line 3 */
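For readers who want to try this, here is the same three-line computation wrapped in a compilable function. The function name compute_g and the int types are assumptions added for illustration. The compiler remains free to issue the multiplication before the addition, because the returned result is identical either way.

```c
/* Hypothetical wrapper around the three source lines discussed above. */
int compute_g(int b, int c, int e, int f)
{
    int a = b + c; /* source line 1 */
    int d = e * f; /* source line 2: may be scheduled before line 1 */
    int g = d + a; /* source line 3: needs both 'a' and 'd' */

    return g;
}
```

Whether the two statements are actually swapped depends on the compiler, the target processor, and the optimization level, so treat this as a candidate for reordering rather than a guaranteed reproduction.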
Completely different variables with the same name
There may be multiple variables with the same name at a location in the executable. How many times have you seen the variable i used as a loop iteration variable and p used for pointers? For example, the variable used as an argument for a function call might have the same name as the parameter of an inlined function, or the inlined function might have local variable names that are the same as those in the calling function. This setup could lead to confusion as to which function’s variable p you are examining.
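The following sketch shows how easily this arises. Both functions are hypothetical, and both use p for a pointer and i for a loop counter. Once count_nonzero is inlined into count_in_rows, a single address inside the inner loop has two variables named i and two named p in scope.

```c
#include <stddef.h>

/* Hypothetical helper: counts the nonzero entries in p[0..n-1]. */
static inline size_t count_nonzero(const int *p, size_t n)
{
    size_t i, hits = 0; /* the helper's own 'i' */

    for (i = 0; i < n; i++)
        if (p[i] != 0) /* the helper's 'p' points at one row */
            hits++;
    return hits;
}

/* Hypothetical caller: treats p as a rows-by-cols matrix. */
size_t count_in_rows(const int *p, size_t rows, size_t cols)
{
    size_t i, total = 0; /* the caller's 'i', a different variable */

    for (i = 0; i < rows; i++)
        total += count_nonzero(p + i * cols, cols);
    return total;
}
```

GDB presents inlined functions as separate frames in a backtrace, so selecting the frame you are interested in before printing i or p is one way to be sure which variable you are looking at.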
Debugging is hard
Brian Kernighan wrote in The Elements of Programming Style, 2nd edition, Chapter 2:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Now, with clever compilers, debugging can be even harder! Make the code as simple as possible and consider using these techniques to make the code more transparent and obvious.