debug with delve

A stripped binary is generally devoid of symbol and debug information. There could be many reasons why one would need to generate a stripped binary. Symbol information takes up a lot of storage space, so when there are space constraints, it makes sense to work with stripped binaries.

In the case of Go, we can create a stripped binary either by building with go build -ldflags="-s -w" or by running the strip command on the resulting binary. Debugging is the other side of development, and debugging a stripped binary can be challenging. Unlike binaries produced by many other toolchains, Go binaries retain enough metadata that we can still recover some symbol information from them. This article focuses on how we support stripped binaries in Delve, the Go debugger.

Finding symbols in stripped binaries

Go retains some information on symbols even within a stripped binary. This information is encoded in a structure called the “PC Line Table”, which is stored in the .gopclntab section in ELF binaries. This table contains metadata for all functions in the final binary, including information for symbolic stack traces and data for the garbage collector such as stack maps. This section remains even after the DWARF debug information has been removed, whether by passing -ldflags="-s -w" to the linker or by running the system strip command.
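To make this concrete, the following minimal sketch uses the standard library debug/elf package to report whether an ELF binary still carries a symbol table, DWARF data, and the .gopclntab section. The type sectionReport and the functions classify and inspect are hypothetical names chosen for this illustration; they are not part of Delve.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// sectionReport records which sections of interest were found.
// Hypothetical helper type for this illustration only.
type sectionReport struct {
	HasSymtab    bool // .symtab, the regular ELF symbol table
	HasDWARF     bool // .debug_info (or its compressed form .zdebug_info)
	HasGopclntab bool // .gopclntab, the Go PC line table
}

// classify inspects a list of section names and fills in the report.
func classify(names []string) sectionReport {
	var r sectionReport
	for _, n := range names {
		switch n {
		case ".symtab":
			r.HasSymtab = true
		case ".debug_info", ".zdebug_info":
			r.HasDWARF = true
		case ".gopclntab":
			r.HasGopclntab = true
		}
	}
	return r
}

// inspect opens an ELF file and classifies its section names.
func inspect(path string) (sectionReport, error) {
	f, err := elf.Open(path)
	if err != nil {
		return sectionReport{}, err
	}
	defer f.Close()

	names := make([]string, 0, len(f.Sections))
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}
	return classify(names), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: inspect <elf-binary>")
		return
	}
	r, err := inspect(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", r)
}
```

Running this against a binary built with -ldflags="-s -w" should typically report the .gopclntab section as present while the symbol table and DWARF sections are absent.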

Setting breakpoints in stripped binaries

The first step towards enabling support for stripped binaries is to ensure the basic functionality of a debugger still works. One of those essential pieces is the ability to set a breakpoint using a function name. To do this, the debugger needs to translate the symbol name into an address in memory. One way of achieving this is by using the DWARF debug information, which is encoded into the binary in various sections. However, in a stripped binary that DWARF information is not present. Even so, the debugger has options. If the system supports it, you may download separate debug information manually or by using a service such as debuginfod. If that is not viable, we can instead use the .gopclntab section described in the previous section.

When Delve detects the absence of debug information and is unable to obtain it anywhere else, it falls back to this section. Whereas Delve has custom parsers for DWARF information, the Go standard library gives us access to the information stored in the .gopclntab section. Using the debug/gosym package and knowledge of where the .gopclntab section lives within the binary, we can translate a function name such as main.main to a memory address, and voila! With that in hand, Delve is able to write the breakpoint instruction to that memory address and we have breakpoints. However, as we shall see later in this article, in some cases not all the information we need is readily available from the .gopclntab section, and we would still need to fall back on the DWARF sections to get all the answers.
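The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not Delve's actual code: the function name lookupFuncAddr is hypothetical, and the sketch assumes an ELF binary whose sections are named .gopclntab and .text.

```go
package main

import (
	"debug/elf"
	"debug/gosym"
	"errors"
	"fmt"
	"os"
)

// lookupFuncAddr resolves a Go function name (for example "main.main")
// to its entry address using only the .gopclntab section.
// Hypothetical helper for illustration; Delve's own code differs.
func lookupFuncAddr(path, name string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	pclnSec := f.Section(".gopclntab")
	textSec := f.Section(".text")
	if pclnSec == nil || textSec == nil {
		return 0, errors.New("missing .gopclntab or .text section")
	}
	pclnData, err := pclnSec.Data()
	if err != nil {
		return 0, err
	}

	// The Go symbol table argument can be empty: for modern Go binaries
	// the function list is rebuilt from the PC line table itself.
	lineTable := gosym.NewLineTable(pclnData, textSec.Addr)
	table, err := gosym.NewTable(nil, lineTable)
	if err != nil {
		return 0, err
	}

	fn := table.LookupFunc(name)
	if fn == nil {
		return 0, fmt.Errorf("function %q not found", name)
	}
	return fn.Entry, nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: lookup <elf-binary> <function-name>")
		return
	}
	addr, err := lookupFuncAddr(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s is at %#x\n", os.Args[2], addr)
}
```

Note that LookupFunc only knows about functions that have their own entry in the table, which is why inlined functions need the extra handling discussed later in this article.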

Now, for functions which have not been inlined, this works great. But what about those which the compiler has chosen to optimize and inline? An inlined function presents different and unique problems for a debugger, which are typically easily overcome. Rather than resolving to a single memory address, the function must be mapped to each location it was inlined into. With DWARF present, this is rather easy: DWARF encodes this information in a standardized way, allowing Delve to parse and retrieve it and use it during a debug session. However, can we get that information out of the .gopclntab section? It turns out the answer is yes, but it’s not as straightforward as one might imagine.

Inline function support in stripped binaries

Being able to place a breakpoint on a function is a given with any debugger, and Delve supports this feature even in a stripped binary. However, if the function is inlined, additional steps are needed to extract the inline information so that Delve can place the breakpoint in that function. Inlining is a pervasive optimization that the Go compiler carries out frequently, so it is imperative that the debugger supports it adequately.

Figure 1 depicts the steps added to provide this support in Delve, which are described in the following sections.

Figure 1: Flowchart of the steps to provide Delve inline function support in stripped binaries.

A: Placing breakpoint on a function in the regular case

Usually in Delve, placing a breakpoint means first identifying the location of the requested breakpoint and then writing a breakpoint at that location. When we place a breakpoint on a function, findFuncCandidates is the method invoked to find that location. Internally, it queries the list of functions held in BI (binary information), which encodes this information. This works even for a stripped binary because of the use of the .gopclntab section data.

B: If the stripped binary contains inlined functions

If the stripped binary contains inlined functions, the information exposed via the standard library debug/gosym package does not include them. This information is present in the .gopclntab section in the form of an inline tree, but that package does not parse it. The solution is to adapt the code of the gosym package from govulncheck, which does parse and expose the information stored in the inline tree. This information is used to build the list of inlined functions that was previously missing from the BI functions data structure. Once this missing information has been added to the original list of functions, it is easy to identify the location(s) of the inlined function that the debugger was asked about by name, so that the breakpoint(s) can be placed. This is what is depicted as B in the flowchart.
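The merging step can be sketched as follows. This is a simplified illustration under stated assumptions: the types inlinedCall and funcIndex and the method addInlined are hypothetical and stand in for Delve's real BI functions structure and for the records produced by the inline tree parser.

```go
package main

import (
	"fmt"
	"sort"
)

// inlinedCall is a hypothetical, simplified record of one inline-tree
// entry: which function was inlined and where its body was placed.
type inlinedCall struct {
	Name string // name of the inlined function, e.g. "main.add"
	PC   uint64 // address inside the caller where the inlined body starts
}

// funcIndex maps a function name to every address where a breakpoint
// on that function should be placed. Hypothetical stand-in type.
type funcIndex map[string][]uint64

// addInlined merges inlined call sites into the index. A function may
// have an out-of-line copy and several inlined copies, so each name can
// end up with more than one address.
func (idx funcIndex) addInlined(calls []inlinedCall) {
	for _, c := range calls {
		addrs := idx[c.Name]
		seen := false
		for _, a := range addrs {
			if a == c.PC {
				seen = true
				break
			}
		}
		if !seen {
			idx[c.Name] = append(addrs, c.PC)
		}
	}
	// Keep each address list sorted so results are deterministic.
	for _, addrs := range idx {
		sort.Slice(addrs, func(i, j int) bool { return addrs[i] < addrs[j] })
	}
}

func main() {
	// Functions with their own entry in the table (made-up addresses).
	idx := funcIndex{"main.main": {0x401000}}

	// Call sites recovered from the inline tree (made-up addresses).
	idx.addInlined([]inlinedCall{
		{Name: "main.add", PC: 0x401040},
		{Name: "main.add", PC: 0x401010},
		{Name: "main.add", PC: 0x401040}, // duplicate, ignored
	})

	for _, pc := range idx["main.add"] {
		fmt.Printf("breakpoint for main.add at %#x\n", pc)
	}
}
```

With an index like this, a breakpoint request on a function name resolves to every address in the list, which matches the behavior described for inlined functions.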

B.1: Use a modified debug/gosym package

The govulncheck tool detects uses of known vulnerabilities in Go programs, so it needs to track which vulnerable functions and methods a program calls. For our purposes, we need to adapt the code mainly in gosym/pclntab.go, gosym/symtab.go and gosym/additions.go. The code in gosym/symtab.go and gosym/pclntab.go helps to construct the line table, which is the central data structure needed to access inlining information. The main method that extracts the inlining information into a data structure is InlineTree, which is implemented in gosym/additions.go.

If we look at the writeFuncs method in pcln.go in the upstream Go linker code, all the function information (including that of inlined functions) is emitted relative to the base symbol go:func.*. The key to getting InlineTree to work is to correctly find the offset of the go:func.* symbol in a stripped binary. The offset of this symbol is important because the inline tree is stored relative to it within the binary. We will focus on this in section B.2 of this post.

B.2: Access InlineTree stored in the pclntab table

The code in the govulncheck gosym package would work as is for a binary that is not stripped, since it assumes the presence of the .symtab section and works its way from there to access symbol information pertaining to go:func.* and get the necessary offset. However, in the case of a stripped binary, the .symtab section is not present, requiring us to jump through a few more hoops to get the information we need. 

For this purpose, we access the Go rodata section, which is the read-only data section holding constant values. All information pertaining to functions is reachable from the moduledata structure, of which there is one per module, and these structures are linked together. The moduledata is organized in the following format, wherein the gofunc field is stored immediately after the rodata field. We can therefore access the gofunc value from the image by doing a simple pointer arithmetic operation to reach the next member:

type moduledata struct {
    ...
    rodata uintptr
    gofunc uintptr
    ...
}

Once we have access to the go:func.* symbol, it is easy to construct the other parameters required to invoke InlineTree: the go:func.* symbol value (goFuncValue) read from the moduledata structure, baseaddr (the address of the memory region containing goFuncValue), and a reader positioned at the start of that region. Invoking InlineTree with these parameters then gives us all the missing functions that were inlined. This completes the BI functions structure, which can then be queried for the location(s) at which to place the breakpoint.

You can find more details on GitHub: https://github.com/go-delve/delve/pull/3549