Limitations of frame pointer unwinding

October 30, 2024
Serhei Makarov
Related topics:
Linux
Related products:
Red Hat Enterprise Linux


    Recent versions of commonly used Linux distributions including Fedora and Ubuntu have disabled frame pointer optimizations with the goal of allowing profiling tools to produce stack traces without needing to include a call-frame information interpreter. In this article, I will explain some overlooked limitations of unwinding with frame pointers and why enabling frame pointers does not constitute a full solution to enable profiling. I will also list some initiatives that aim to enable system-wide profiling without the need for frame pointers.

    Overview

    Several recent articles have discussed the interaction of frame pointer optimization defaults and profiling, including Guinevere Larsen’s overview of the issue, Will Cohen’s article on call-frame information and unwinding, and my own article on profiling frame pointer-less code with eu-stacktrace.

    In short, modern compilers can produce code either with or without a frame pointer register indicating the beginning of the current stack frame. With frame pointers enabled, the structure of the call stack is trivial to analyze; without frame pointers, an additional register becomes available for general computation. Since around 2011 with GCC version 4.6, the default has been to omit the frame pointer register, which means that debugging and profiling tools must use call-frame information to produce stack traces. In user space, call-frame information in the DWARF-based .eh_frame format is universally available.
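
    As a minimal sketch of why the frame pointer case is easy to analyze, the following Python model walks a chain of saved frame pointers. The stack layout (saved frame pointer at [fp], return address at [fp + 8]) follows the usual x86-64 convention; the function name, the memory dictionary, and every address are made up for illustration.

```python
WORD = 8  # bytes per stack slot on a 64-bit machine


def unwind_with_frame_pointers(memory, pc, fp, max_depth=64):
    """Return the code addresses in the call chain, innermost first.

    memory maps a stack address to the 64-bit word stored there.
    Assumed layout: [fp] holds the caller's saved frame pointer and
    [fp + WORD] holds the return address into the caller.
    """
    trace = [pc]
    while fp != 0 and len(trace) < max_depth:
        trace.append(memory[fp + WORD])  # return address into the caller
        fp = memory[fp]                  # follow the chain to the caller's frame
    return trace


# Toy stack for main -> f -> g, sampled in the body of g (illustrative addresses).
memory = {
    0x7000: 0x0,     # main's frame: saved fp (0 terminates the chain)
    0x7008: 0x1000,  # return address into the startup code
    0x6FE0: 0x7000,  # f's frame: saved fp of main
    0x6FE8: 0x2010,  # return address into main
    0x6FC0: 0x6FE0,  # g's frame: saved fp of f
    0x6FC8: 0x3020,  # return address into f
}
trace = unwind_with_frame_pointers(memory, pc=0x4008, fp=0x6FC0)
print([hex(address) for address in trace])
```

    The whole unwinder is two memory reads per frame, which is why it is simple enough to run inside the kernel.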

    Unfortunately, it has not been feasible to include a full interpreter for DWARF and .eh_frame bytecode in the Linux kernel. Thus, the kernel’s perf_events framework can only use frame pointer unwinding for user-space code, which has affected profiling tools based on this framework and made them non-functional on most Linux distributions. This led to widespread calls to recompile distributions with frame pointers enabled, in the hopes of enabling system-wide stack trace profiling based on perf_events.

    However, several issues with frame pointer unwinding have been overlooked in the recent discussions:

    1. Uneven distribution of performance gains and losses
    2. Function prologues and epilogues
    3. Assembly-code functions in libraries

    Uneven distribution of performance gains and losses

    First, the users most impacted by the slowdown due to frame pointers are different from the users who benefit from profiling-driven fixes. This creates a win-lose tradeoff that is hard to resolve to everyone's satisfaction.

    In general, one group of users is concerned with performance losses on systems with large numbers of interacting components. Such systems can exhibit issues due to mistuning, and fixing these issues can yield a large performance gain when a profiler is available.

    Another group of users is concerned with raw computational capacity and with obtaining the maximum degree of optimization from their compiler. There is no low-hanging fruit for such users to find; all that they get from re-enabling frame pointers is a 1-2% performance loss, which translates to the loss of about 1 or 2 years of compiler improvements.

    It is never good for an upstream project to be forced to decide which group of users is more important. This is especially true for projects whose user base is as large and diverse as that of a compiler or Linux distribution. Thus, there is a lot of motivation to develop a profiling solution that does not require frame pointer optimizations to be disabled.

    Function prologues and epilogues

    Second, the profiles produced by frame pointer unwinding will inevitably exhibit gaps around function prologues and epilogues and in procedure linkage table (PLT) sections. In these portions of an executable, the frame pointer register does not accurately reflect the current stack frame: it still describes the caller's frame, so the frame pointer unwinder skips the immediate caller of the sampled function. In particular, this affects the validity of the profile for evaluating code-locality optimizations.
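
    The gap can be reproduced in a small Python model. This is a sketch under assumed conditions, not a trace from a real program: the addresses, the SYMBOLS table, and the fp_unwind helper are hypothetical, and the stack layout follows the usual x86-64 convention. The same call chain (main calls f, f calls g) is sampled twice, once on the first instruction of g and once after g's prologue has run.

```python
WORD = 8  # bytes per stack slot on a 64-bit machine

# Hypothetical code addresses and the functions they belong to.
SYMBOLS = {0x1000: "_start", 0x2010: "main", 0x3020: "f",
           0x4000: "g", 0x4008: "g"}


def fp_unwind(memory, pc, fp):
    """Walk the saved-frame-pointer chain, innermost function first."""
    trace = [SYMBOLS[pc]]
    while fp != 0:
        trace.append(SYMBOLS[memory[fp + WORD]])
        fp = memory[fp]
    return trace


# Stack contents for main -> f -> g at the moment g is entered.
memory = {
    0x7000: 0x0,     0x7008: 0x1000,  # main's frame, returns into _start
    0x6FE0: 0x7000,  0x6FE8: 0x2010,  # f's frame, returns into main
    0x6FC8: 0x3020,                   # return address into f, pushed by the call
}

# Sample 1: first instruction of g. The prologue has not run yet, so the
# frame pointer register still holds the address of f's frame.
in_prologue = fp_unwind(memory, pc=0x4000, fp=0x6FE0)

# Sample 2: g's prologue has saved f's frame pointer at 0x6FC0 and
# pointed the frame pointer register at that slot.
memory[0x6FC0] = 0x6FE0
after_prologue = fp_unwind(memory, pc=0x4008, fp=0x6FC0)

print(in_prologue)     # ['g', 'main', '_start']
print(after_prologue)  # ['g', 'f', 'main', '_start']
```

    In this model the first sample loses f entirely, because the return address into f sits on the stack in a place the frame pointer chain never visits.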

    A lower-bound estimate of the problem can be obtained by a method suggested by Will Cohen. The minimum size of an x86 function prologue is 8 bytes. We can use the following perf command to check the number of samples that fall into the first 8 bytes of a function and are thus guaranteed to have an inaccurate frame pointer:

    # perf report --sort=sample,symoff |
        grep -E '0x[0-7]$' |               # first 8 bytes only
        grep -vF '[k]' | grep -v '@plt'    # user space only, excluding PLT stubs

    When tested on x86, this analysis showed about 5.2% of samples falling into the first 8 bytes of a function. Therefore, at least that proportion of samples will have an inaccurate stack trace when frame pointer unwinding is used. The actual proportion is likely to be greater, since compiler optimizations may expand the prologue with additional initialization code. Similarly, on aarch64, where the minimal function prologue size is 12 bytes, at least 6.0% of samples were found to be inaccurate. The frequent occurrence of samples early in a function may be a result of sampling being more likely to happen immediately after code is loaded into cache after a TLB miss.
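
    The arithmetic behind this lower bound is a weighted count, sketched here in Python. The input format (pairs of sample count and byte offset within the symbol) and the prologue_fraction helper are assumptions made for illustration, and the toy numbers are invented to exercise the function; they are not the measurements quoted above.

```python
def prologue_fraction(samples, prologue_bytes=8):
    """Fraction of samples whose offset lies inside the minimal prologue.

    samples is an iterable of (sample_count, offset_within_symbol) pairs.
    """
    samples = list(samples)
    total = sum(count for count, _ in samples)
    early = sum(count for count, offset in samples if offset < prologue_bytes)
    return early / total if total else 0.0


# Invented counts, not measured data.
toy_samples = [(3, 0x0), (2, 0x4), (60, 0x2A), (35, 0x113)]
print(f"{prologue_fraction(toy_samples):.1%}")
```

    Passing prologue_bytes=12 gives the corresponding bound for the aarch64 prologue size mentioned above.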

    In any case, this means that even with frame pointers enabled, call-frame information is still required to obtain an accurate profile.

    Assembly-code functions in libraries

    Third, the existence of hand-written assembly-code functions in commonly used libraries, particularly the glibc string and memory manipulation functions, introduces another source of inaccuracy. Again, the assembly-code sections do not maintain the frame pointer register the same way that an ordinary function call does.

    In the best case, the frame pointer unwind will skip the caller of the assembly-code function. That is, if function f calls function g which calls strcpy, the resulting stack trace will claim that function f called strcpy directly. In the worst case, if the assembly-code function uses the frame pointer register for general-purpose computation, the unwind will not be able to proceed at all.

    On the other hand, a Call Frame Information (CFI) unwinder will be able to unwind the call correctly. Since around 2003, the glibc assembly code has been hand-annotated with CFI directives, which document how the canonical frame address can be calculated relative to the stack pointer, or relative to a value spilled to memory, or otherwise. Any other library that includes similar annotations can enable accurate CFI unwinding of its assembly code.
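
    A heavily simplified Python sketch of CFI-style unwinding follows. It assumes one rule per function of the form "canonical frame address = stack pointer + offset", with the return address stored just below the canonical frame address; real .eh_frame rules change from instruction to instruction and may be relative to other registers. The table, the addresses, and the strcpy entry (a stand-in for an assembly routine that never sets up a frame pointer) are all hypothetical.

```python
WORD = 8  # bytes per stack slot on a 64-bit machine

# (start_pc, end_pc, name, cfa_offset_from_sp); None marks the outermost frame.
CFI_TABLE = [
    (0x1000, 0x1100, "_start", None),
    (0x2000, 0x2100, "main", 0x10),
    (0x3000, 0x3100, "f", 0x20),
    (0x4000, 0x4100, "g", 0x20),
    (0x5000, 0x5100, "strcpy", 0x08),  # stand-in for an assembly routine
]


def lookup(pc):
    """Find the unwind rule that covers a code address."""
    for start, end, name, offset in CFI_TABLE:
        if start <= pc < end:
            return name, offset
    raise KeyError(f"no call-frame information for pc {pc:#x}")


def cfi_unwind(memory, pc, sp):
    """Unwind using only the stack pointer and the rule table."""
    names = []
    while True:
        name, offset = lookup(pc)
        names.append(name)
        if offset is None:
            return names
        cfa = sp + offset        # canonical frame address of this frame
        pc = memory[cfa - WORD]  # return address into the caller
        sp = cfa                 # caller's stack pointer at the call site


# Return addresses on the stack for main -> f -> g -> strcpy.
memory = {0x6FB8: 0x4030, 0x6FD8: 0x3020, 0x6FF8: 0x2010, 0x7008: 0x1000}
names = cfi_unwind(memory, pc=0x5004, sp=0x6FB8)
print(names)  # ['strcpy', 'g', 'f', 'main', '_start']
```

    The frame pointer register is never consulted, so g stays in the trace even though the routine at the top of the stack does not maintain a frame pointer.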

    Modifying the glibc assembly-code functions to support frame pointer unwinding, so that they stop using the frame pointer register for computation and imitate the frame pointer-enabled code emitted by a compiler, is not a likely prospect. In addition to the volume of work required, only a subset of glibc users would find such a change desirable.

    Alternatives to frame pointer unwinding

    Fortunately, there are signs that frame pointer enablement in current Linux distributions is only a stopgap measure. Several initiatives are underway, each of which would make it feasible to obtain profiles via perf_events without relying on a frame pointer unwinder:

    1. My own eu-stacktrace project was described in a prior article. As of the time of writing, an initial version has been merged upstream into elfutils release 0.192 and can be enabled by compiling elfutils with the --enable-stacktrace option, as described in the README.
    2. The SFrame project defines a simplified call-frame information format with stronger efficiency guarantees than .eh_frame, albeit with slightly less flexibility. As of the time of writing, a patchset to implement SFrame unwinding for perf_events is being reviewed for inclusion in the Linux kernel. After that, SFrame support will need to be added to elfutils and then major distributions will consider compiling their packages with .sframe sections by default.
    3. New generations of hardware will include shadow stack support. Shadow stacks are a security feature which uses hardware assistance to track the structure of the call stack and monitor its integrity. This would also allow stack traces to be obtained without relying on call-frame information or on frame pointers.
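
    The shadow stack idea in the last item can be sketched as a toy Python model. This is an assumption-laden illustration rather than a description of any specific hardware: the ShadowStackCPU class and its methods are hypothetical, and real implementations enforce the protection in the processor.

```python
class ShadowStackCPU:
    """Toy model: every call records its return address in two places."""

    def __init__(self):
        self.stack = []   # ordinary stack, writable by the program
        self.shadow = []  # shadow stack, modified only by call and ret

    def call(self, return_address):
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def ret(self):
        expected = self.shadow.pop()
        actual = self.stack.pop()
        if actual != expected:
            raise RuntimeError("return address does not match the shadow stack")
        return actual

    def stack_trace(self, pc):
        """Current pc followed by the return addresses, innermost first."""
        return [pc] + self.shadow[::-1]


cpu = ShadowStackCPU()
cpu.call(0x2010)  # main calls f (illustrative addresses)
cpu.call(0x3020)  # f calls g
trace = cpu.stack_trace(pc=0x4008)
print([hex(address) for address in trace])
```

    Because the shadow stack holds nothing but return addresses, reading it yields a stack trace without consulting frame pointers or call-frame information.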

    Overall, the accuracy of a stack trace profile depends on a number of subtle details that are easy to overlook. Fortunately, the current projects seem to be on track to improve the quality of profile information over what has been available in the past.

    Last updated: November 6, 2024
