
Debuginfod project update: New clients and metrics

February 25, 2021
Aaron Merey and Frank Eigler
Related topics:
Developer Tools, Linux


    It's been about a year since our last update about debuginfod, an HTTP file server that serves debugging resources to debugger-like tools. Since then, we've been busy integrating clients across a range of developer tools and improving the server's available metrics. This article covers the features and improvements we've added to debuginfod since our last update.

    Note: For an introduction to debuginfod and how to use it, check out our first article introducing debuginfod and the follow-up explaining how to set up your own debuginfod services.

    New debuginfod clients

    Debuginfod is part of the elfutils project. Tools that already use elfutils to find or analyze debugging resources automatically inherit debuginfod support; SystemTap, Libabigail, and dwgrep all gain it this way. In SystemTap, for example, debuginfod offers new ways to specify which processes to probe. Previously, if you wanted to explore a running user process, you had to provide either a process identifier (PID) or the executable path. With debuginfod, SystemTap can also probe processes by build-id, so you can investigate a specific version of a binary regardless of where the corresponding executable file lives.

    Debuginfod includes a client library (libdebuginfod) that lets other tools easily query debuginfod servers for source files, executables, and, of course, debuginfo, generally in the DWARF (Debugging With Attributed Record Formats) format. Since last year, a variety of developer tools have integrated debuginfod clients. As of version 2.34, Binutils includes debuginfod support for its components that use separate debuginfo (readelf and objdump). Starting in version 9.03, the Annobin project contains debuginfod support for fetching separate debuginfo files, and support for Dyninst is planned in version 10.3.

    GDB 10.1 was recently released with debuginfod support, making it easy to download missing debuginfo or source files on the fly as you debug, whether the files belong to the executable being debugged or to any shared libraries it uses. GDB also takes advantage of improvements to the libdebuginfod API, including programmable progress updates, as shown in the following example (output abridged for clarity):

    $ gdb /usr/bin/python
    Reading symbols from /usr/bin/python...
    Downloading separate debug info for /usr/bin/python
    (gdb) list
    Downloading source file /usr/src/debug/python3-3.8.6-1.fc32.x86_64/Programs/python.c...
    8 wmain (int argc, wchar_t **argv)
    9 {
    10 return Py_Main(argc, argv);
    11 }
    (gdb) break main
    Breakpoint 1 at 0x1140: file /usr/src/debug/python3-3.8.6-1.fc32.x86_64/Programs/python.c, line 16.
    (gdb) run
    Starting program: /usr/bin/python
    Downloading separate debug info for /lib64/ld-linux-x86-64.so.2...
    Downloading separate debug info for /lib64/libc.so.6...
    Downloading separate debug info for /lib64/libpthread.so.0...
    [...]
    

    Configuring debuginfod to supply all of these tools with debugging resources is as simple as setting an environment variable, DEBUGINFOD_URLS, to the URLs of one or more debuginfod servers. If you don't want to set up your own server, we also provide public servers that carry debugging resources for many common Fedora, CentOS, Ubuntu, Debian, and openSUSE packages. For more information, explore the elfutils debuginfod page.
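    As a concrete sketch, the client-side setup can look like this. The debuginfod-find commands (shipped with elfutils) are shown commented out because they require network access and a real hex build-id in place of the BUILDID placeholder:

```shell
# Point debuginfod-aware tools (GDB, readelf, objdump, stap, ...) at a server.
# Multiple URLs may be listed, separated by spaces.
export DEBUGINFOD_URLS="https://debuginfod.elfutils.org/"

# debuginfod-find queries the servers directly and prints the local cache
# path of the fetched file. BUILDID is a placeholder for a real build-id
# (obtainable, e.g., with `eu-readelf -n /bin/ls`), so these stay commented:
#   debuginfod-find debuginfo BUILDID
#   debuginfod-find executable BUILDID
#   debuginfod-find source BUILDID /usr/src/debug/path/to/file.c

echo "clients will query: $DEBUGINFOD_URLS"
```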

    New debuginfod server metrics

    Operating a debuginfod server for other people is both a pleasure and a chore. Once you have users, they will expect the service to stay up. While debuginfod is a simple server, it still needs monitoring and management. With that in mind, debuginfod comes with the usual logging-to-stderr flags, which are tailor-made for container or systemd operation. (Add another -v for more verbosity.) Additionally, debuginfod offers a web API that shares a variety of metrics about its internal operations. These metrics are exported in the Prometheus exposition format, an industry standard that is human-readable and supported by numerous consumer and processing tools. The metrics are designed to let you see what the server's various threads are doing, how they're progressing with their workloads, and what types of errors they've encountered. When archived in a time-series database and lightly analyzed, the metrics can help you derive all sorts of useful quantities to guide resource allocation.

    Configuring Prometheus for debuginfod

    To configure a Prometheus server to scrape debuginfod metrics, add a clause for HTTP or HTTPS to the prometheus.yml configuration file, as shown here:

         scrape_configs:
           - job_name: 'debuginfod'
             scheme: http
             static_configs:
             - targets: ['localhost:8002']
           - job_name: 'debuginfod-https'
             scheme: https
             static_configs:
             - targets: ['debuginfod.elfutils.org'] # adjust
    

    Adjust the global scrape_interval if you like; debuginfod handles /metrics queries quickly. Let it run for a while, then take a tour of the metrics.

    Visualizing debuginfod metrics

    When debuginfod is directed to scan a large directory of archives or files for the first time, it uses a pool of threads (-c option) to decompress and parse them. This activity can be I/O and CPU intensive, and ideally both! How can we tell? Look at the scanned_bytes_total metric, which tabulates the total size of input files debuginfod processed. When converted to a rate, it is close to the read throughput of the source filesystem.
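    The rate conversion mentioned above is a one-liner in PromQL (a sketch; adjust the range window to taste):

```promql
# Bytes scanned per second, averaged over 5-minute windows:
rate(scanned_bytes_total[5m])
```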

    Note: The following screenshots were generated from built-in Prometheus graphs, but you could use another visualizer like Grafana.

    Measuring total bytes scanned

    The graph in Figure 1 represents an intensive scan job where a remote NFS server fed debuginfod at a steady 50 MB/s for some time, then at a less impressive 10 MB/s later on. We believe Monday's arrival was the likely cause of this drop in scanning performance: developers returned from the weekend, and debuginfod had to share NFS capacity.

    The graph shows a sudden drop in scanning performance.
    Figure 1: Results from debuginfod's scanned_bytes_total metric displayed in a Prometheus graph.

    As you can see, the initial scan goes on and on. Developers keep developing, but the NFS server runs slower and slower. To analyze that, we can look at the thread_work_pending metric.

    Measuring thread activity

    The thread_work_pending metric jumps whenever a periodic traversal pass starts (the -t option and SIGUSR1) and winds back down to zero as the scanner threads do their work. The graph in Figure 2 represents a five-day period during which a multi-terabyte Red Hat Enterprise Linux 8 RPM dataset was scanned. The gently sloping periods corresponded to a few packages with a unique combination of enormous RPM sizes and many builds (kernel, RT kernel, Ceph, LibreOffice). Sharp upticks and downticks corresponded to concurrent re-traversals that were immediately dismissed because the indexed data was still fresh. When the line touches zero, scanning is done; after that, only brief pulses should appear.

    This graph shows sharp upticks and downticks.
    Figure 2: Results from debuginfod's thread_work_pending metric displayed in a Prometheus graph.

    Even before all the scanning is finished, the server is ready to answer queries. This is what it's all about, after all—letting developers enjoy that sweet nectar of debuginfo. But how many are using it, and at what cost? Let's check the http_responses_total metric, which counts and classifies web API requests.

    Measuring HTTP responses

    The graph in Figure 3 shows a small peak of errors (unknown build-ids), a large number of successes (content extracted from .rpm archives), and a very small number of other successes (served from the fdcache). This was the workload from a bulk, distro-wide debuginfod scan that could not take advantage of any serious caching or prefetching.

    The graph shows a sharp incline and a gradual decline.
    Figure 3: Results from debuginfod's http_responses_total metric displayed in a Prometheus graph.

    Let's take a look at the cost, too. If you measure cost in bytes of network data, pull up the http_responses_transfer_bytes pair of metrics; if you measure it in CPU time, pull up the http_responses_duration_milliseconds pair. With a little bit of PromQL, you can compute the average data transfer or processing time.
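    A sketch of that PromQL, assuming the usual _sum/_count pair naming for these metrics:

```promql
# Average bytes transferred per response, over 5-minute windows:
rate(http_responses_transfer_bytes_sum[5m])
  / rate(http_responses_transfer_bytes_count[5m])

# Average service time per response, in milliseconds:
rate(http_responses_duration_milliseconds_sum[5m])
  / rate(http_responses_duration_milliseconds_count[5m])
```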

    Measuring processing time, groom statistics, and error counts

    The graph in Figure 4 shows the duration variant over the same time frame as Figure 3. It reveals that the inability to cache or prefetch results sometimes led to tens of seconds of service time, probably from the same large archives that took so long to scan. Configuring more aggressive caching could help produce more typical access patterns; see the metrics that mention fdcache.

    Figure 4: Measuring processing time with debuginfod metrics in Prometheus.

    Now that your server is up, it will also periodically groom its index (-g option and SIGUSR2). As part of each groom cycle, another set of metrics is updated to provide an overview of the entire index. The last few numbers below give an idea of the storage requirements of a fairly large installation: 6.58 TB of RPMs indexed in 76.6 GB of index data:

            groom{statistic="archive d/e"} 11837375
            groom{statistic="archive sdef"} 152188513
            groom{statistic="archive sref"} 2636847754
            groom{statistic="buildids"} 11477232
            groom{statistic="file d/e"} 0
            groom{statistic="file s"} 0
            groom{statistic="filenames"} 163330844
            groom{statistic="files scanned (#)"} 579264
            groom{statistic="files scanned (mb)"} 6583193
            groom{statistic="index db size (mb)"} 76662
    

    The error_count metrics track errors from the various subsystems of debuginfod. Here, you can see how the errors are categorized by subsystem and type. We hope that increases in these metrics can signal gradual degradation or outright failure, so we recommend attaching alerts to them.

            error_count{libc="Connection refused"}  3
            error_count{libc="No such file or directory"}   1
            error_count{libc="Permission denied"}   33
            error_count{libarchive="cannot extract file"}   1
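    As a sketch of that alerting recommendation, a Prometheus alerting rule could look like this (the rule name, threshold, and windows are illustrative assumptions, not part of debuginfod):

```yaml
# Fire when any debuginfod error counter grows over the past hour.
groups:
  - name: debuginfod
    rules:
      - alert: DebuginfodErrorsIncreasing
        expr: sum(increase(error_count[1h])) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "debuginfod error counters increased in the last hour"
```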
    

    Finally, you can use Grafana to scrape the debuginfod Prometheus server to prepare informative and stylish dashboards, such as the one shown in Figure 5.

    The dashboard displays a variety of debuginfod metrics.
    Figure 5: Debuginfod metrics displayed on a Grafana dashboard.

    Conclusion

    This article was an overview of the new client support and metrics available in debuginfod. We didn't cover all of the available metrics, so feel free to explore them for yourself. If you think of more useful metrics for debuginfod, please get in touch with our developers at elfutils-devel@sourceware.org.

    Last updated: February 23, 2021
