C++ support in libcc1: A comprehensive update

GDB relies on libcc1’s GCC and GDB plugins to implement the “compile code” feature, now extended to support the C++ language.

The Compile and Execute machinery enables GDB users to compile and execute code snippets within the context of an existing process. This allows users to perform inspection and modification of the program state using the target language well beyond the feature set historically exposed by symbolic debuggers. Almost anything that can be expressed in C, and now also in C++, can be compiled, loaded into the running program, and executed on the spot! It is envisioned that this machinery may also be used in the future to speed up conditional breakpoints, and as a foundation for more advanced features such as “Edit and Continue”.

The libcc1 module offers plugins for GDB and GCC that allow GDB to start GCC to compile a user-supplied code snippet. The plugins combine GDB and GCC into a single multi-process program. Through the plugins, GCC can query GDB about the meaning, in the target program, of names encountered in the snippet, and GDB can incrementally inform GCC about variables, functions, types and other constructs present in the program.
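The exchange described above can be pictured with a toy model. The sketch below is not the libcc1 API: ToyCompiler, SymbolInfo, the oracle callback and make_toy_session are hypothetical names chosen for illustration, and a real session spans two processes rather than one. It only shows the shape of the conversation, in which the compiler asks about a name it does not know and the debugger answers by defining it.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy model only: none of these names belong to libcc1.
struct SymbolInfo {
  std::string type;       // spelled-out type, e.g. "int"
  unsigned long address;  // run-time address in the target program
};

class ToyCompiler {
 public:
  // The oracle stands in for the debugger side of the conversation.
  using Oracle = std::function<void(ToyCompiler&, const std::string&)>;

  explicit ToyCompiler(Oracle oracle) : oracle_(std::move(oracle)) {
    assert(oracle_ && "an oracle callback is required");
  }

  // Debugger -> compiler: introduce a symbol found in the program.
  void define(const std::string& name, SymbolInfo info) {
    known_[name] = std::move(info);
  }

  // Compiler -> debugger: ask about any name that has not been defined yet.
  const SymbolInfo* lookup(const std::string& name) {
    if (known_.count(name) == 0) {
      ++queries_;
      oracle_(*this, name);
    }
    auto it = known_.find(name);
    return it == known_.end() ? nullptr : &it->second;
  }

  int queries() const { return queries_; }

 private:
  Oracle oracle_;
  std::map<std::string, SymbolInfo> known_;
  int queries_ = 0;
};

// A stand-in debugger that knows about a single variable named "x".
ToyCompiler make_toy_session() {
  return ToyCompiler([](ToyCompiler& cc, const std::string& name) {
    if (name == "x") cc.define("x", SymbolInfo{"int", 0x1000UL});
  });
}
```

In this model, the first lookup of "x" triggers one query and later lookups are answered from the table the debugger has already filled in. A name the debugger cannot resolve yields a null result, which a real compiler would report as an undeclared identifier.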

For example, when a GDB user stops a program within a C++ function and enters the command:

(gdb) compile code
>   for (auto i = x.begin(); i != x.end(); i++)
>     if (i->counter > 50)
>       cout << i->name << endl;
> end

… GDB will start an instance of GCC’s C++ compiler to compile the code snippet. GCC will find a reference to symbol x, and ask GDB for its meaning. GDB will, in response, proceed to define the type of x, and then the locations (source-code line and run-time memory address) of the variable.

According to C++ scoping rules, GCC and GDB together might resolve the name to an automatic or static variable in a local scope of the active function, or of an enclosing function; to a member of an enclosing namespace scope; or to a static or non-static data member of an enclosing class.
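To make those alternatives concrete, here is a small sketch of a program in which the name x exists in several of the scopes listed above. All names (app, Widget, the member functions) are hypothetical. Which declaration an unqualified x refers to depends on where the program is stopped, and that is the answer GCC and GDB have to reproduce for the snippet.

```cpp
#include <cassert>

namespace app {

int x = 1;                       // member of an enclosing namespace scope

struct Widget {
  static int x;                  // static data member
  int y = 3;                     // non-static data member

  int in_member() const {
    return x + y;                // x is Widget::x here, not app::x
  }

  int in_member_with_local() const {
    int x = 10;                  // automatic variable in the local scope
    return x + y;                // the local hides Widget::x and app::x
  }
};

int Widget::x = 2;

int in_free_function() {
  static int calls = 0;          // static variable in a local scope
  ++calls;
  return x + calls;              // x is app::x here
}

}  // namespace app

// Usage example: the same spelling resolves differently in each function.
void check_scopes() {
  app::Widget w;
  assert(w.in_member() == 5);              // Widget::x (2) + y (3)
  assert(w.in_member_with_local() == 13);  // local x (10) + y (3)
}
```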

Once type and name are defined, GCC can compile calls to the begin and end member functions, to the point of deducing the auto type of i, presumably an iterator type. The debugger supplies enough information for the compiler to behave as if x and its type had been defined in the same translation unit as the code snippet.

If the iterator type operates, as expected, as a pointer to another type, GDB will have introduced to GCC this other type before defining the iterator type. It will also have defined the operator functions that make the iterator type behave like a pointer. This enables the data members of the other type, namely counter and name, to be referenced through the iterator object i.
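As an illustration, the program being debugged might contain definitions along the following lines. This is only a sketch: Record, RecordIter, Records and names_over are hypothetical names, and a real program would more likely use a standard container. It shows the pieces the debugger has to describe to the compiler: the pointed-to type, the iterator type, and the operator functions that let i->counter and i->name compile.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Record {                  // the "other type" the iterator points to
  int counter;
  std::string name;
};

class RecordIter {               // behaves like a pointer to Record
 public:
  explicit RecordIter(Record* p) : p_(p) {}
  Record& operator*() const { assert(p_ != nullptr); return *p_; }
  Record* operator->() const { return p_; }
  RecordIter& operator++() { ++p_; return *this; }
  RecordIter operator++(int) { RecordIter old(*this); ++p_; return old; }
  bool operator!=(const RecordIter& other) const { return p_ != other.p_; }

 private:
  Record* p_;
};

class Records {
 public:
  explicit Records(std::vector<Record> items) : items_(std::move(items)) {}
  RecordIter begin() { return RecordIter(items_.data()); }
  RecordIter end() { return RecordIter(items_.data() + items_.size()); }

 private:
  std::vector<Record> items_;
};

// Same loop as the snippet, collecting names instead of printing them.
std::vector<std::string> names_over(Records& x, int limit) {
  std::vector<std::string> out;
  for (auto i = x.begin(); i != x.end(); i++)
    if (i->counter > limit)
      out.push_back(i->name);
  return out;
}
```

With these definitions, auto deduces RecordIter for i, and each member access through i goes through RecordIter::operator->.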

Finally, when GCC encounters cout, and later endl, it will query GDB, which will define their types and locations in the running program. The object code generated from the snippet and loaded into the running program will therefore operate on the same cout and endl that the program already uses, whether those are the standard definitions in the ::std namespace or anything else in scope at the point where the program stopped.

Features

Although C++ shares a number of features with the previously-supported C language, even some shared features required a significantly different interface in the libcc1 extensions for C++.

For example, both languages support tagged types such as struct, union and enum, but C’s name space for tags is flat, whereas C++ supports nested classes and separate namespaces. Care must therefore be taken to define each tagged type in the appropriate scope, and the ability to separate the declaration of a class type from its definition becomes strictly necessary to support some scenarios involving member classes.
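A short sketch of both points, using hypothetical names (gui, net, Graph): the same tag can name unrelated types in different scopes, and two member classes that refer to each other cannot be defined unless one of them is declared first.

```cpp
#include <cassert>
#include <type_traits>

namespace gui { struct Node { int id; }; }
namespace net { struct Node { long id; }; }

// Same tag, different scopes: two unrelated types.
static_assert(!std::is_same<gui::Node, net::Node>::value,
              "tags in different namespaces name different types");

class Graph {
 public:
  struct Edge;             // declared here, defined further down
  struct Vertex {
    int id;
    Edge* first_edge;      // needs Graph::Edge to be declared already
  };
  struct Edge {
    Vertex* target;
    Edge* next;
  };
};

// Count the edges leaving a vertex by walking its edge list.
int out_degree(const Graph::Vertex& v) {
  int n = 0;
  for (const Graph::Edge* e = v.first_edge; e != nullptr; e = e->next) ++n;
  return n;
}

// Usage example: vertex a has two outgoing edges, vertex b has none.
void check_graph() {
  Graph::Vertex a{1, nullptr};
  Graph::Vertex b{2, nullptr};
  Graph::Edge second{&a, nullptr};
  Graph::Edge first{&b, &second};
  a.first_edge = &first;
  assert(out_degree(a) == 2);
  assert(out_degree(b) == 0);
}
```

A debugger describing Graph to the compiler faces the same ordering constraint as the source code: Edge has to be introduced as a declaration inside Graph before Vertex can be completed.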

The newly-introduced interface for C++ support enables GDB (a) to navigate in namespace, class, and function scopes; (b) to forward-declare class types and templates; and (c) to define template specializations; classes; class inheritance; static and nonstatic possibly mutable data members; static, virtual and nonvirtual member functions; and regular, explicit, defaulted and deleted constructors, destructors, and overloaded operators.

Although member access controls (public, protected and private) are specified for class members, GDB has historically enabled expressions to bypass access controls. We have continued this tradition in libcc1’s C++ support. After all, what good would a symbolic debugger be that couldn’t inspect private class members? Therefore, access to private and protected members is allowed in the user-supplied code snippet as if the snippet were in a friend function of every class type. This does not, however, affect overload resolution and argument deduction for function templates, nor even template constructs designed specifically to tell whether a certain member is accessible, a possibility introduced in C++11. (That is, the code snippet itself may bypass access controls, but that won’t enable template function specializations named by it to do so.)
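The distinction can be reproduced in ordinary C++. In the sketch below, all names are hypothetical: the friend function peek plays the role of the code snippet, and has_accessible_secret is a C++11-style detection trait. The friend can read the private member directly, yet the trait, which is checked with its own access rights, still reports the member as inaccessible.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class Account {
  int secret = 42;                    // private by default
  friend int peek(const Account&);    // plays the role of the snippet
};

struct Open { int secret = 7; };      // public member, for contrast

// Primary template: assume 'secret' is not accessible.
template <typename T, typename = void>
struct has_accessible_secret : std::false_type {};

// Selected only when 't.secret' is well-formed and accessible to the trait.
template <typename T>
struct has_accessible_secret<T, decltype(void(std::declval<T&>().secret))>
    : std::true_type {};

static_assert(has_accessible_secret<Open>::value,
              "a public member is visible to the trait");
static_assert(!has_accessible_secret<Account>::value,
              "a private member is not, friend functions notwithstanding");

// Direct access works here because peek is a friend of Account.
// Naming the trait from inside peek would not change the trait's answer.
int peek(const Account& a) { return a.secret; }

// Usage example.
void check_access() {
  Account a;
  assert(peek(a) == 42);
}
```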

Reference and closure types, the null pointer type and constant, exception specifications, and types for pointers to data members and to member functions are supported by the libcc1 interface for C++, so GDB can inform GCC about symbols in the user program that involve any of these features.
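For readers less familiar with some of these constructs, the following sketch gathers them in one place. The names (Gauge, level_ptr, read_ptr, nothing, bump, read_through_pointers) are hypothetical. It says nothing about how libcc1 represents them, only what they look like in a user program.

```cpp
#include <cassert>
#include <cstddef>

struct Gauge {
  int level;
  int read() const noexcept { return level; }    // exception specification
};

int Gauge::*level_ptr = &Gauge::level;           // pointer to data member
int (Gauge::*read_ptr)() const = &Gauge::read;   // pointer to member function
std::nullptr_t nothing = nullptr;                // null pointer type and constant

int bump(int& value) { return ++value; }         // lvalue reference parameter

int read_through_pointers() {
  assert(nothing == nullptr);
  Gauge g{40};
  int& level = g.*level_ptr;                     // reference to g.level
  auto add = [&level](int n) { level += n; };    // object of a closure type
  add(1);                                        // g.level becomes 41
  bump(level);                                   // g.level becomes 42
  return (g.*read_ptr)();                        // calls g.read()
}
```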

Default arguments for functions and templates can involve elaborate expressions in the user program. GDB can construct such expressions incrementally, using the same expression-building interfaces that serve to construct the arbitrarily complex expressions that may appear in template function signatures. Alas, not all of these features are currently usable by GDB. Default arguments, for example, are not always represented in debug information, yet debug information is the only source besides symbol names that GDB could rely on to reconstruct symbolic information from the running program.

Other debug information shortcomings have largely guided our approach to templates. Since generic definitions are not represented in debug information, specializations not present in the user program cannot be used in the code snippet, and the libcc1 interface for C++ enables GDB to forward-declare but not to define templates. Definitions are only allowed for full specializations of template classes and functions: implicit and explicit specializations are introduced in the same way through libcc1. This design spares the interface from supporting template class partial specializations and nested templates.
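The sketch below, with hypothetical names (Box, twice, use_specializations), shows the two kinds of specialization the paragraph refers to. According to the description above, a debugger could tell the compiler about Box<int>, Box<bool> and twice<int>, because the program contains them, but not about the generic Box or about a specialization such as Box<double> that the program never uses.

```cpp
#include <cassert>

template <typename T>
struct Box {                     // generic definition
  T value;
  int kind() const { return 0; }
};

template <>
struct Box<bool> {               // explicit (full) specialization
  bool value;
  int kind() const { return 1; }
};

template <typename T>
T twice(T v) { return v + v; }

// Using Box<int> and twice<int> makes the compiler instantiate them
// implicitly, so the program ends up containing those specializations.
int use_specializations() {
  Box<int> b{21};
  assert(b.kind() == 0);
  return twice(b.value);
}
```

Whether a specialization was written out by the programmer or instantiated by the compiler makes no difference to code that merely uses it, which is consistent with the decision to introduce both kinds through the same path.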

Unfortunately, absent generic template definitions, we cannot represent members thereof. Thus, if the user program declares a member of a template class as a friend, the library client has to emulate this by declaring the corresponding member of each specialization of the template class as a friend instead. We have arranged for GCC to emit debug information that makes every specialization denoted by a friend declaration appear as if it had been individually declared as a friend. GDB is thereby spared this emulation. It is also spared from telling apart the cases in which the language distinguishes the friendship of, e.g., a member of a member class of implicit versus explicit specializations of a template class, which it could not do anyway, since current debug information does not record whether a specialization is implicit.
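The emulation can be written out in plain C++. In this sketch, Vault and Inspector are hypothetical names. The comment shows the source-level form that befriends a member of every specialization, and the two declarations below it show the per-specialization form that takes its place.

```cpp
#include <cassert>

class Vault;

template <typename T>
struct Inspector {
  static int read(const Vault& v);
};

class Vault {
  int code = 1234;
  // Source-level form, befriending the member of every specialization:
  //   template <typename T> friend int Inspector<T>::read(const Vault&);
  // Emulated form, one declaration per specialization the program uses:
  friend int Inspector<int>::read(const Vault&);
  friend int Inspector<char>::read(const Vault&);
};

template <typename T>
int Inspector<T>::read(const Vault& v) { return v.code; }

// Usage example: both befriended specializations can read the private member.
void check_friends() {
  Vault v;
  assert(Inspector<int>::read(v) == 1234);
  assert(Inspector<char>::read(v) == 1234);
}
```

With only the emulated declarations in place, a specialization that was not listed, such as Inspector<long>, would fail to compile if its read member were used.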

Another design decision that might surprise clients of the libcc1 interface for C++ is the treatment given to constructors and destructors. For each source-level constructor or destructor, C++ compilers may define up to four separate entry points, each for slightly different use cases. Since defining a member function through the libcc1 interface amounts to little more than telling GCC the address of the member function’s entry point in the running program, the debugger needs some way to inform GCC about the address of each of these so-called clones (although they really aren’t identical), so that GCC can use the appropriate constructors and destructors, depending on the use case and the available definitions. This is why these member functions are defined through a slightly different API that separates declaration of the source-level member function from definition of each of the corresponding entry points.
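Virtual inheritance gives a simple example of why one source-level constructor can need more than one entry point. The sketch below uses hypothetical names (Base, Left, Right, Both). As background that is not stated above: under the Itanium C++ ABI, which GCC follows on many targets, the two roles shown here are served by the complete-object constructor and the base-object constructor, which are distinct symbols.

```cpp
#include <cassert>

int base_ctor_runs = 0;

struct Base {
  Base() { ++base_ctor_runs; }
  virtual ~Base() {}
};

struct Left : virtual Base { Left() {} };
struct Right : virtual Base { Right() {} };
struct Both : Left, Right { Both() {} };

// Left as the complete object: constructing it must also construct Base.
int base_runs_for_left() {
  base_ctor_runs = 0;
  Left l;
  (void)l;
  return base_ctor_runs;
}

// Left and Right as bases of Both: Both constructs the shared Base once,
// and the Left and Right constructors must not construct it again.
int base_runs_for_both() {
  base_ctor_runs = 0;
  Both b;
  (void)b;
  assert(static_cast<Left*>(&b) != nullptr);
  return base_ctor_runs;
}
```

In both functions Base is constructed exactly once. The single source-level Left::Left() therefore has to behave in two ways, constructing the virtual base in one case and skipping it in the other, and a compiler that calls into the running program needs the address of the right variant for each situation.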

Status

A work-in-progress implementation of the libcc1 interface for C++ is available in the GCC Git repository, in the aoliva/libcp1 branch, in the libcc1 directory, mainly in libcp1* files. We aim for inclusion in the upcoming GCC 7.

The corresponding GDB implementation is also underway, in the GDB Git repository, branch users/pmuldoon/c++compile, in the gdb/compile directory, mainly in compile-cplus-* files.

The interface specification and documentation are in the top-level include directory, in header files named gcc-cp-*, in both projects’ trees.

