How debugging Go programs with Delve and eBPF is faster

February 13, 2023
Derek Parker
Related topics:
Developer tools, Go
Related products:
Red Hat Enterprise Linux

    In this article, I will explain how to use Delve to trace your Go programs and how Delve leverages eBPF under the hood to maximize efficiency and speed. The goal of Delve is to provide developers with a pleasant and efficient Go debugging experience. In that vein, this article focuses on how we optimized the function tracing subsystem so you can inspect your programs and get to root-cause analysis quicker. Delve has two different backends for its tracing implementation: one is ptrace-based, while the other uses eBPF. If you’re unfamiliar with any of these terms, don’t worry, I will explain along the way.

    What is program tracing?

    Tracing is a technique that allows a developer to see what the program is doing during execution. As opposed to typical debugging techniques, this method does not require direct user interaction. One of the most well-known tracing tools is strace, which allows developers to see which system calls their program makes during execution.

    While strace is useful for gaining insight into system calls, the Delve trace command lets you see what is happening in user space within your Go programs. You can trace arbitrary functions in your program and see the inputs and outputs of those functions. You can also use this tool to follow the control flow of your program without the overhead of an interactive debugging session, because it also displays which goroutine is executing the function. For highly concurrent programs, this can be a quicker way to gain insight into your program's execution than starting a full interactive debugging session.

    How to trace Go programs with Delve

    Delve allows you to trace your Go programs by invoking the dlv trace subcommand. The subcommand accepts a regular expression and will execute your program, setting a tracepoint on each function that matches the regular expression and displaying the results in real time.

    The following program is an example:

    package main
    
    
    import "fmt"
    
    func foo(x, y int) (z int) {
            fmt.Printf("x=%d, y=%d, z=%d\n", x, y, z)
            z = x + y
    
            return
    }
    
    func main() {
            x := 99
            y := x * x
            z := foo(x, y)
    
            fmt.Printf("z=%d\n", z)
    }

    Tracing this program will give you the following output:

    $ dlv trace foo
    > goroutine(1): main.foo(99, 9801)
    x=99, y=9801, z=0
    >> goroutine(1): => (9900)
    z=9900
    Process 583475 has exited with status 0

    As you can see, we supplied foo as the regexp, which in this case matched the function of the same name in the main package. The output prefixed with > denotes the function being called and shows the arguments the function was called with, while the output prefixed with >> denotes the return from the function and the return value associated with it. All input and output lines are prefixed with the goroutine executing at the time.
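
    To make the matching step concrete, here is a minimal sketch of selecting functions by regular expression. It assumes the pattern is applied unanchored to fully qualified function names; the symbol list and the matchFunctions helper are hypothetical and are not Delve's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchFunctions returns the names in symbols that match pattern.
// The match is unanchored, so "foo" matches "main.foo" and "main.foobar".
func matchFunctions(pattern string, symbols []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var matched []string
	for _, name := range symbols {
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	// Hypothetical symbol list; a debugger would read the real one
	// from the binary's debug information.
	symbols := []string{"main.main", "main.foo", "main.foobar", "fmt.Printf"}

	for _, pattern := range []string{"foo", `^main\.foo$`} {
		matched, err := matchFunctions(pattern, symbols)
		if err != nil {
			fmt.Println("bad pattern:", err)
			continue
		}
		fmt.Printf("%q -> %v\n", pattern, matched)
	}
}
```

    Under that assumption, a short pattern such as foo can select more functions than intended in a larger program, so anchoring the expression is a reasonable habit when you want exactly one function.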

    By default, the dlv trace command uses the ptrace-based backend; adding the --ebpf flag enables the experimental eBPF-based backend. Using the previous example, we could invoke the trace subcommand as follows:

    $ dlv trace --ebpf foo

    We would receive similar output. However, what happens behind the scenes is much different and significantly more efficient.

    The inefficiencies of ptrace

    By default, Delve uses the ptrace syscall to implement the tracing feature. ptrace is a syscall that allows programs to observe and manipulate other programs on the same machine. In fact, on Unix systems, Delve uses ptrace to implement many of the low-level features the debugger provides, such as reading and writing memory, controlling execution, and more.

    While ptrace is a useful and powerful mechanism, it suffers from inherent inefficiencies. First, the fact that ptrace is a syscall means that we must cross the user space/kernel space boundary, which adds overhead every time the function is used. This is compounded by the number of times we have to invoke ptrace in order to achieve the desired results. Considering the previous example, the following is a rough outline of the tracing implementation steps using ptrace:

    1. Start the program and attach the debugger using `ptrace(PT_ATTACH)`.
    2. Set a breakpoint at each function that matches the provided regular expression, using `ptrace` to insert the breakpoint instruction into the traced process's executable memory.
    3. Additionally, set a breakpoint at each return instruction for that function.
    4. Continue the program, again using `ptrace(PT_CONT)`.
    5. Hit the breakpoint at function entry and read the function arguments. This step can involve many `ptrace` calls as we read CPU registers, memory on the stack, and memory in the heap if we must dereference a pointer.
    6. Continue the program again using `ptrace(PT_CONT)`.
    7. Hit the breakpoint at function return and go through the same process to read the return values, potentially involving many more calls to `ptrace` to read registers and memory.
    8. Continue the program again using `ptrace(PT_CONT)`.
    9. Repeat until the program ends.
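
    Steps 2 and 3 above come down to patching bytes in the traced process. The following sketch shows the save, patch, and restore pattern of a software breakpoint on x86-64, where the breakpoint instruction is the single byte 0xCC (INT3). It operates on a plain byte slice that stands in for the process's memory, so it makes no ptrace calls; the helper names and the sample machine code are illustrative assumptions.

```go
package main

import "fmt"

// int3 is the one-byte breakpoint instruction on x86-64.
const int3 = 0xCC

// breakpoint remembers the byte that was overwritten so it can be restored.
type breakpoint struct {
	addr     int
	original byte
}

// setBreakpoint overwrites the byte at addr with int3 and returns the saved
// original. mem stands in for the traced process's memory; a real debugger
// would perform this read and write through ptrace.
func setBreakpoint(mem []byte, addr int) (breakpoint, error) {
	if addr < 0 || addr >= len(mem) {
		return breakpoint{}, fmt.Errorf("address %d out of range", addr)
	}
	bp := breakpoint{addr: addr, original: mem[addr]}
	mem[addr] = int3
	return bp, nil
}

// clearBreakpoint puts the original instruction byte back.
func clearBreakpoint(mem []byte, bp breakpoint) {
	mem[bp.addr] = bp.original
}

func main() {
	// Hypothetical function body: push rbp; mov rbp, rsp; pop rbp; ret
	text := []byte{0x55, 0x48, 0x89, 0xE5, 0x5D, 0xC3}

	entry, err := setBreakpoint(text, 0) // function entry
	if err != nil {
		fmt.Println(err)
		return
	}
	ret, err := setBreakpoint(text, 5) // the ret instruction
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("patched:  % x\n", text)

	clearBreakpoint(text, ret)
	clearBreakpoint(text, entry)
	fmt.Printf("restored: % x\n", text)
}
```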

    Obviously, the more arguments and return values the function has, the more expensive every stop becomes. All the time the debugger spends making these `ptrace` syscalls, the program we are tracing is paused and not executing any instructions. From the user's perspective, this makes the program run significantly slower than it otherwise would. Now, for development and debugging, maybe this isn’t such a big deal, but time is precious, and we should endeavor to do things as quickly as possible. The quicker your program runs while tracing, the quicker you can get to the root cause of the issue you’re trying to debug.
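
    A rough back-of-the-envelope model shows how this cost grows. The sketch below assumes that each stop needs one call to read registers, one call per memory read, and one call to continue. These numbers are illustrative assumptions, not measurements of Delve; the model also ignores setup and the extra work of stepping past a breakpoint, so treat it as a lower bound.

```go
package main

import "fmt"

// ptraceCallsPerStop estimates the ptrace calls needed to service one
// breakpoint stop: one to read registers, one per memory read, and one
// to continue the program. This is an assumed model, not Delve's real count.
func ptraceCallsPerStop(memReads int) int {
	return 1 + memReads + 1
}

// ptraceCallsForTrace estimates the total for a traced function that is
// invoked calls times, with one stop at entry and one stop at return.
func ptraceCallsForTrace(calls, entryReads, returnReads int) int {
	return calls * (ptraceCallsPerStop(entryReads) + ptraceCallsPerStop(returnReads))
}

func main() {
	const calls = 1000 // hypothetical number of invocations of the traced function

	for _, reads := range []int{0, 2, 8} {
		total := ptraceCallsForTrace(calls, reads, reads)
		fmt.Printf("%d memory reads per stop: at least %d ptrace calls\n", reads, total)
	}
}
```

    Even with no memory reads at all, every invocation of the traced function costs several syscalls in this model, and the traced program is paused for each of them.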

    Now, the question becomes, how can we make this better? In the next section, we discuss the new eBPF based backend and how it improves upon this approach.

    How eBPF is faster than ptrace

    One of the biggest speed and efficiency improvements we can make is to avoid a lot of the syscall overhead altogether. This is where eBPF comes into play because instead of setting breakpoints on each function, we can instead set uprobes on function entry and exit and attach small eBPF programs to them. Delve uses the Cilium eBPF Go library to load and interact with the eBPF programs.

    Each time the probe is hit, the kernel will invoke our eBPF program and then continue the main program once it has completed. The small eBPF program we write will handle all of the steps listed above at function entry and exit but without all the syscall context switching because the program executes directly within kernel space. Our eBPF program is able to communicate with the debugger in user space via eBPF ring buffer and map data structures, allowing Delve to collect all of the information it needs.

    The benefit of this approach is that the time the program we are tracing needs to be paused is significantly decreased. Running our eBPF program when a probe is hit is much quicker than invoking multiple syscalls at function entry and exit.
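
    The hand-off between the kernel-side program and the debugger can be pictured as fixed-size binary records flowing through a buffer. The sketch below simulates that in plain Go: a goroutine plays the role of the eBPF program and a channel stands in for the ring buffer. The traceEvent layout, the field names, and the address are hypothetical; Delve's real record format differs.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

const (
	kindEntry  uint32 = 0
	kindReturn uint32 = 1
)

// traceEvent is a hypothetical fixed-size record, 40 bytes when encoded.
type traceEvent struct {
	GoroutineID int64
	FnAddr      uint64
	Kind        uint32
	NumValues   uint32
	Values      [2]int64
}

// encode serializes an event the way the producer side would write it.
func encode(ev traceEvent) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, ev); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode parses a raw record on the consumer (debugger) side.
func decode(raw []byte) (traceEvent, error) {
	var ev traceEvent
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &ev)
	return ev, err
}

func main() {
	ring := make(chan []byte, 8) // stand-in for the eBPF ring buffer

	// Producer: stands in for the eBPF program run at entry and return.
	go func() {
		defer close(ring)
		events := []traceEvent{
			{GoroutineID: 1, FnAddr: 0x1000, Kind: kindEntry, NumValues: 2, Values: [2]int64{99, 9801}},
			{GoroutineID: 1, FnAddr: 0x1000, Kind: kindReturn, NumValues: 1, Values: [2]int64{9900}},
		}
		for _, ev := range events {
			raw, err := encode(ev)
			if err != nil {
				return
			}
			ring <- raw
		}
	}()

	// Consumer: the debugger drains records without stopping the producer.
	for raw := range ring {
		ev, err := decode(raw)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("goroutine(%d) kind=%d values=%v\n", ev.GoroutineID, ev.Kind, ev.Values[:ev.NumValues])
	}
}
```

    The point of the design is that the consumer only reads completed records, so the traced program does not wait on the debugger for each register or memory read.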

    The flow of tracing and debugging using eBPF

    Again, using the previous example, the following is a rough outline of the tracing implementation steps using eBPF:

    1. Start the program and attach using `ptrace(PT_ATTACH)`.
    2. Load all uprobes into the kernel for each function to trace.
    3. Continue the program using `ptrace(PT_CONT)`.
    4. Hit uprobes at function entry / exit. In kernel space, each time a probe is hit, the kernel runs our eBPF program, which gathers function arguments or return values and sends them back to user space. In user space, read from the eBPF ring buffer as function arguments and return values are sent.
    5. Repeat until the program ends.

    Using this method, Delve is able to trace a program in significantly less time than with the default ptrace implementation. Now, you may ask, if it is so much more efficient to use this method, why not make it the default? Eventually, it likely will be made the default. But for the time being, development is still ongoing to improve this eBPF-based backend and ensure it has parity with the ptrace-based one. However, you can still use it today by supplying the `--ebpf` flag during the `dlv trace` invocation.

    To give a sense of how much more efficient this method is, I measured a different example program running by itself and then under the different tracing methods with the following results.

    Program execution: 23.7µs
    
    With eBPF trace: 683.1µs
    
    With ptrace tracing: 2.3s

    The numbers speak for themselves!
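
    To put those three measurements side by side, the short sketch below computes the ratios between them. The durations are the ones reported above; the slowdown helper is just an illustrative name.

```go
package main

import (
	"fmt"
	"time"
)

// slowdown returns how many times longer traced took than baseline.
func slowdown(traced, baseline time.Duration) float64 {
	return float64(traced) / float64(baseline)
}

func main() {
	const (
		untraced = 23700 * time.Nanosecond  // 23.7µs, program by itself
		ebpf     = 683100 * time.Nanosecond // 683.1µs, eBPF backend
		ptrace   = 2300 * time.Millisecond  // 2.3s, ptrace backend
	)

	fmt.Printf("eBPF vs. untraced:   %.0fx\n", slowdown(ebpf, untraced))
	fmt.Printf("ptrace vs. untraced: %.0fx\n", slowdown(ptrace, untraced))
	fmt.Printf("ptrace vs. eBPF:     %.0fx\n", slowdown(ptrace, ebpf))
}
```

    For this one example program, eBPF tracing is roughly 29 times slower than running untraced, while ptrace tracing is on the order of 97,000 times slower, which makes the eBPF backend roughly 3,400 times faster than the ptrace backend.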

    Why not use uretprobes?

    If you're familiar with eBPF and uprobes / uretprobes, you may be asking yourself why we use uprobes for everything as opposed to simply using uretprobes to capture return values. The explanation for this gets relatively complex, but the short version is that the Go runtime needs to inspect the call stack at various times during the execution of a Go program. When uretprobes are attached to a function, they overwrite the return address of that function on the stack. When the Go runtime then inspects the stack, it finds an unexpected return address for the function and ends up fatally exiting the program. To work around this, we use uprobes only and leverage Delve's ability to inspect the machine instructions of the program to set probes at each return instruction for a function.
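
    The failure mode can be illustrated with a toy model. In the sketch below, a lookup table stands in for the runtime's knowledge of valid code addresses, and a slice stands in for the return addresses saved on the stack. All addresses, names, and the unwind helper are hypothetical; the real Go runtime's stack walking is far more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// funcTable maps known return addresses to the functions containing them.
var funcTable = map[uint64]string{
	0x401000: "main.main",
	0x402000: "runtime.main",
}

// trampoline is a made-up address of the kind a uretprobe would write
// over a saved return address.
const trampoline uint64 = 0x7fff0000

var errUnknownPC = errors.New("unexpected return address")

// unwind resolves each saved return address, failing on the first one
// that does not belong to any known function.
func unwind(returnAddrs []uint64) ([]string, error) {
	var frames []string
	for _, pc := range returnAddrs {
		name, ok := funcTable[pc]
		if !ok {
			return frames, fmt.Errorf("%w: %#x", errUnknownPC, pc)
		}
		frames = append(frames, name)
	}
	return frames, nil
}

func main() {
	stack := []uint64{0x401000, 0x402000}
	frames, err := unwind(stack)
	fmt.Println("before uretprobe:", frames, err)

	// A uretprobe replaces the return address of the probed function.
	stack[0] = trampoline
	frames, err = unwind(stack)
	fmt.Println("after uretprobe: ", frames, err)
}
```

    Placing ordinary uprobes on the return instructions avoids this, because nothing on the stack is rewritten.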

    Delve debugs Go code faster with eBPF

    The overall goal of Delve is to help developers find bugs in their Go code as quickly as possible. To do this, we leverage the latest methods and tech available and try to push the boundaries of what a debugger can accomplish. Delve leverages eBPF under the hood to maximize efficiency and speed. User space tracing is a great tool for any engineer to have in their toolbox, and we aim to make it efficient and easy to use.
