A stripped binary is generally devoid of symbol and debug information. There could be many reasons why one would need to generate a stripped binary. Symbol information takes up a lot of storage space, so when there are space constraints, it makes sense to work with stripped binaries.
In the case of Go, we can create a stripped binary either by using go build -ldflags="-s -w" or by using the strip command. Debugging is the other side of development, and debugging a stripped binary can be challenging. Go is unusual in that, unlike many other languages, its binaries retain enough metadata for us to recover some symbol information even after stripping. This article focuses on how we support stripped binaries in Delve, the Go debugger.
Finding symbols in stripped binaries
Go retains some information on symbols even within a stripped binary. This information is encoded in a section of the binary called the “PC Line Table” and can be found in the .gopclntab section in ELF binaries. This table contains metadata for all functions in the final binary, including information for symbolic stack traces and data for the garbage collector such as stack maps. The section survives even when DWARF debug information is removed, whether by passing -ldflags="-s -w" to the linker or by running the system strip command.
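One way to see this for yourself is to list which sections a binary still carries. The following is a minimal sketch using the standard debug/elf package. The helper names describeSections and sectionNames are ours, the check only applies to ELF binaries, and the alternative section name handled for position-independent builds is an assumption about how some toolchains name the table.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// describeSections reports, from a list of ELF section names, whether DWARF
// data and the Go PC line table are present. It is a pure helper so it can
// be exercised without a binary on disk.
func describeSections(names []string) (hasDWARF, hasPclntab bool) {
	for _, n := range names {
		switch n {
		case ".debug_info", ".zdebug_info":
			hasDWARF = true
		case ".gopclntab", ".data.rel.ro.gopclntab": // second name: assumed PIE variant
			hasPclntab = true
		}
	}
	return hasDWARF, hasPclntab
}

// sectionNames opens an ELF file and returns the names of its sections.
func sectionNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	names := make([]string, 0, len(f.Sections))
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: sections <elf-binary>")
		return
	}
	names, err := sectionNames(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	dwarf, pcln := describeSections(names)
	fmt.Printf("DWARF present: %v, .gopclntab present: %v\n", dwarf, pcln)
}
```

Running this against the same program built with and without -ldflags="-s -w" should show the DWARF sections disappearing while the PC line table stays.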
Setting breakpoints in stripped binaries
The first step towards enabling support for stripped binaries is to ensure the basic functionality of a debugger still works. One of those essential pieces is the ability to set a breakpoint using a function name. To do this, the debugger needs to translate the symbol name into an address in memory. One way of achieving this is by using the DWARF debug information, which is encoded into the binary in various sections. In a stripped binary, however, that DWARF information is not present. Even so, the debugger has options. If the system supports it, you may download separate debug information manually or by using a service such as debuginfod. If that is not a viable option, we can instead use the .gopclntab section described in the previous section.
When Delve detects the absence of debug information and is unable to obtain it anywhere else, it falls back to this section. Whereas Delve has custom parsers for DWARF information, the Go standard library gives us access to the information stored in the .gopclntab section. Using the debug/gosym package and knowledge of where the .gopclntab section lives within the binary, we can translate a function name such as main.main to a memory address. With that in hand, Delve is able to write the breakpoint instruction to that memory address, and we have breakpoints. However, as we shall see later in this post, in some cases not all of the information we need is available in the .gopclntab section, and we will still need to fall back on the DWARF sections to get all the answers.
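The name-to-address translation described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Delve's actual code: funcEntry is a name we made up, only ELF is handled, and the address returned is the one recorded in the file, so a position-independent executable would still need its load offset applied.

```go
package main

import (
	"debug/elf"
	"debug/gosym"
	"errors"
	"fmt"
	"os"
)

// funcEntry resolves a function name to its entry address using only the
// .gopclntab section of an ELF binary, without DWARF or .symtab.
func funcEntry(path, name string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	pclnSec := f.Section(".gopclntab")
	if pclnSec == nil {
		return 0, errors.New("no .gopclntab section found")
	}
	pclnData, err := pclnSec.Data()
	if err != nil {
		return 0, err
	}
	textSec := f.Section(".text")
	if textSec == nil {
		return 0, errors.New("no .text section found")
	}

	// An empty symbol table is acceptable here: for modern Go binaries the
	// function list is recovered from the line table itself.
	table, err := gosym.NewTable(nil, gosym.NewLineTable(pclnData, textSec.Addr))
	if err != nil {
		return 0, err
	}
	fn := table.LookupFunc(name)
	if fn == nil {
		return 0, fmt.Errorf("function %q not found", name)
	}
	return fn.Entry, nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: lookup <elf-binary> <function>")
		return
	}
	addr, err := funcEntry(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s is at %#x\n", os.Args[2], addr)
}
```

Invoked on a stripped Go binary with main.main as the second argument, it should print the address at which a debugger would write its breakpoint instruction.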
For functions which have not been inlined, this works great. But what about those which the compiler has chosen to optimize and inline? An inlined function presents a different problem for a debugger, one that is easily overcome when DWARF is available. Instead of resolving to a single memory address, the function must be mapped to each location it was inlined into. DWARF encodes this information in a standardized way, allowing Delve to parse it and use it during a debug session. However, can we get that information out of the .gopclntab section? It turns out the answer is yes, but it’s not as straightforward as one might imagine.
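To make the problem concrete, here is a small illustrative program; the function names are ours. Building it with go build -gcflags=-m asks the compiler to report its inlining decisions. If add is inlined, its body is copied into both call sites in main, so a breakpoint on add has to cover two locations rather than one, while addNoInline keeps a single entry address.

```go
package main

import "fmt"

// add is small enough that the compiler will usually choose to inline it.
func add(a, b int) int {
	return a + b
}

// addNoInline does the same work, but the directive below forbids inlining,
// so it keeps a single entry address that a breakpoint can target.
//
//go:noinline
func addNoInline(a, b int) int {
	return a + b
}

func main() {
	x := add(1, 2)         // first potential inlined copy of add
	y := add(x, 3)         // second potential inlined copy of add
	z := addNoInline(y, 4) // always a real call
	fmt.Println(x, y, z)
}
```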
Inline function support in stripped binaries
Being able to place a breakpoint on a function is a given with any debugger, and Delve supports this even in a stripped binary. However, if the function is inlined, Delve needs to take additional steps to extract the inline information before it can place the breakpoint in that function. Inlining is a pervasive optimization in the Go compiler, so it is imperative that the debugger supports it adequately.
Figure 1 depicts the steps added to Delve to provide this support, which are described in the following sections.
A: Placing breakpoint on a function in the regular case
Usually in Delve, the process to place a breakpoint is to first identify the location of the breakpoint requested and then put a breakpoint at that location. When we are trying to place a breakpoint on a function, findFuncCandidates is the method that is invoked to get this information. Internally, it queries the BI (binary information) functions list, which encodes this information. This works even in the case of a stripped binary because of the use of the .gopclntab section data.
B: If the stripped binary contains inlined functions
If the stripped binary contains inlined functions, the information exposed via the standard library debug/gosym package does not include them. The information is present in the .gopclntab section in the form of an inline tree, but it isn’t parsed by that package. The solution is to adapt the code of the gosym package in govulncheck, which does parse and expose the information stored in the inline tree. This information is used to build the list of inlined functions that was missing from the BI functions data structure. Once the missing entries have been added to the original list of functions, it is easy to identify the location(s) of the inlined function that the debugger was asked about by name, so that the breakpoint(s) can be placed. This is depicted as B in the flowchart.
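The merging step can be pictured with a simplified sketch. The types inlinedCall and funcIndex below are hypothetical stand-ins invented for illustration; they are not Delve's BI structures or the govulncheck types, and the addresses are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// inlinedCall is a hypothetical, simplified record of one inline tree entry:
// the name of the inlined function and the PC inside the enclosing function
// where the inlined body begins.
type inlinedCall struct {
	Name string
	PC   uint64
}

// funcIndex maps a function name to every address that a breakpoint on that
// name should cover.
type funcIndex map[string][]uint64

// addInlined merges inline tree entries into the index, so that an inlined
// function becomes queryable by name like any other function.
func (idx funcIndex) addInlined(calls []inlinedCall) {
	for _, c := range calls {
		idx[c.Name] = append(idx[c.Name], c.PC)
	}
	for name := range idx {
		pcs := idx[name]
		sort.Slice(pcs, func(i, j int) bool { return pcs[i] < pcs[j] })
	}
}

func main() {
	// Functions that the line table already knows about.
	idx := funcIndex{"main.main": {0x401000}}

	// Entries recovered from the inline tree (illustrative values).
	idx.addInlined([]inlinedCall{
		{Name: "main.add", PC: 0x401020},
		{Name: "main.add", PC: 0x401010},
	})

	for _, pc := range idx["main.add"] {
		fmt.Printf("breakpoint for main.add at %#x\n", pc)
	}
}
```

The point of the design is that the lookup by name stays unchanged: once the inlined entries are in the same index, one query returns every location that needs a breakpoint.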
B.1: Use a modified debug/gosym package
The govulncheck package is used to detect uses of known vulnerabilities in Go programs, hence it needs to track which vulnerable functions and methods a program calls. For our purposes, we need to adapt the code mainly in gosym/pclntab.go, gosym/symtab.go and gosym/additions.go. The code in gosym/symtab.go and gosym/pclntab.go constructs the line table, which is the central data structure needed to access inlining information. The method that extracts the inlining information into a data structure is InlineTree, which is implemented in gosym/additions.go.
If we look at the writeFuncs method in the upstream Go linker code (pcln.go), all the function information (including that of inlined functions) is emitted relative to the base symbol go:func.*. The key to getting InlineTree to work is to correctly find the offset of the go:func.* symbol in a stripped binary. The offset of this symbol is important because the inline tree is stored relative to it within the binary. We will focus on this in section B.2 of this post.
B.2: Access InlineTree stored in the pclntab Table
The code in the govulncheck gosym package would work as is for a binary that is not stripped, since it assumes the presence of the .symtab section and works its way from there to access symbol information pertaining to go:func.* and get the necessary offset. However, in the case of a stripped binary, the .symtab section is not present, requiring us to jump through a few more hoops to get the information we need.
For this purpose, we access the Go rodata section, which is the read-only data section holding constant values. Information about the layout of the image, including where function metadata lives, is recorded in the runtime's moduledata structure, with one instance per module and the instances linked together in a list. The moduledata is organized in the following format, wherein the gofunc symbol is stored right next to the rodata address. We can now access the gofunc symbol from the image with a simple pointer arithmetic operation to reach the next member:
type moduledata struct {
	...
	rodata uintptr // start of the read-only data
	gofunc uintptr // go:func.*, stored immediately after rodata
	...
}
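The pointer arithmetic amounts to reading one pointer-sized word located immediately after the rodata field. Below is a minimal sketch that operates on raw bytes; readGoFunc is a hypothetical helper, and both the way the moduledata bytes are obtained and the offset of the rodata field are assumptions left to the caller, since the layout can differ between Go versions.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readGoFunc reads the gofunc field from raw moduledata bytes. rodataOff is
// the byte offset of the rodata field within md (supplied by the caller),
// and gofunc is assumed to be the very next pointer-sized member.
func readGoFunc(md []byte, rodataOff, ptrSize int, bigEndian bool) (uint64, error) {
	if ptrSize != 4 && ptrSize != 8 {
		return 0, fmt.Errorf("unsupported pointer size %d", ptrSize)
	}
	off := rodataOff + ptrSize // step over rodata to the next member
	if rodataOff < 0 || off+ptrSize > len(md) {
		return 0, errors.New("moduledata bytes too short for gofunc field")
	}
	var order binary.ByteOrder = binary.LittleEndian
	if bigEndian {
		order = binary.BigEndian
	}
	if ptrSize == 4 {
		return uint64(order.Uint32(md[off:])), nil
	}
	return order.Uint64(md[off:]), nil
}

func main() {
	// Synthetic stand-in for a slice of moduledata: one unrelated word,
	// then rodata, then gofunc. The values are made up for illustration.
	md := make([]byte, 24)
	binary.LittleEndian.PutUint64(md[8:], 0x4a0000)  // rodata
	binary.LittleEndian.PutUint64(md[16:], 0x4b1234) // gofunc

	v, err := readGoFunc(md, 8, 8, false)
	fmt.Printf("gofunc = %#x, err = %v\n", v, err)
}
```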
Once we access the go:func.* symbol, it is easy to construct all the other parameters required to invoke InlineTree: the go:func.* symbol value (goFuncValue) from the moduledata structure, baseaddr (the address of the memory region containing goFuncValue), and a reader positioned at the start of that region. Invoking InlineTree with these parameters then gives us all the missing functions that were inlined, completing the BI functions structure, which can then be queried for the location(s) at which to place the breakpoint.
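Deriving the base address and the reader from goFuncValue can be sketched as a search over the loaded regions of the binary. The region type and the locate function are hypothetical names used only for this illustration; in practice the regions would come from the sections of the executable, and the result would be handed to InlineTree rather than printed.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// region is a hypothetical stand-in for one loaded section of the binary:
// its name, its start address, and its contents.
type region struct {
	Name string
	Addr uint64
	Data []byte
}

// locate finds the region containing addr. It returns the region's base
// address, a reader positioned at the start of the region, and the offset
// of addr within the region.
func locate(regions []region, addr uint64) (uint64, *bytes.Reader, uint64, error) {
	for _, reg := range regions {
		if addr >= reg.Addr && addr-reg.Addr < uint64(len(reg.Data)) {
			return reg.Addr, bytes.NewReader(reg.Data), addr - reg.Addr, nil
		}
	}
	return 0, nil, 0, errors.New("address is not inside any known region")
}

func main() {
	// Illustrative layout; the addresses and sizes are made up.
	regions := []region{
		{Name: ".text", Addr: 0x401000, Data: make([]byte, 0x1000)},
		{Name: ".rodata", Addr: 0x4a0000, Data: make([]byte, 0x20000)},
	}
	goFuncValue := uint64(0x4b1234)

	base, r, off, err := locate(regions, goFuncValue)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("baseaddr = %#x, offset of go:func.* = %#x, region size = %d bytes\n",
		base, off, r.Len())
}
```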
You can find more details on GitHub: https://github.com/go-delve/delve/pull/3549