Debugging Python C extensions with GDB

September 8, 2021
Victor Stinner
Related topics:
C, C#, C++, Linux, Python
Related products:
Red Hat Enterprise Linux


    Many popular Python modules are written in the C language, and bugs in C extensions can cause nasty crashes that Python's exception handling cannot catch. Fortunately, numerous powerful debuggers, notably the GNU Project Debugger (GDB), were designed for the C language. In Python 3.9, developers can use these to debug Python programs, and particularly the C extensions included in Python programs.

    This article shows how to use the improved Python debug build in Python 3.9. I'll first discuss how we adapted Python to allow developers to use traditional C debuggers, then show you how to use the debug build and GDB to debug C extensions in a Python program.

    Getting started with Python 3.9

    Python 3.9 is now provided in the Red Hat Enterprise Linux 8.4 AppStream. The command to install the new version is:

    $ sudo yum install python3.9
    

    Python 3.9 brings many new features:

    • PEP 584: Union operators added to dict.
    • PEP 585: Type hinting generics in standard collections.
    • PEP 614: Relaxed grammar restrictions on decorators.
    • PEP 616: String methods to remove prefixes and suffixes.
    • PEP 593: Flexible function and variable annotations.
    • A new os.pidfd_open() call that allows process management without races and signals.
    • PEP 615: Relocation of the IANA Time Zone Database to the standard library in the zoneinfo module.
    • An implementation of a topological sort of a graph in the new graphlib module.

    See What’s New In Python 3.9 for the full list of changes.
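
    A few of these features can be tried in a short script. This is a minimal sketch that needs Python 3.9 or newer; the dictionary contents and step names are made up for illustration.

```python
from graphlib import TopologicalSorter

# PEP 584: the | operator merges two dicts; the right-hand operand wins on duplicate keys.
defaults = {"debug": False, "level": 1}
overrides = {"debug": True}
merged = defaults | overrides

# PEP 585: built-in collections can be used directly as generic types.
executables: list[str] = ["python3.9", "python3.9d"]

# PEP 616: str.removeprefix() strips a prefix only if it is present.
suffix = "python39-debug".removeprefix("python39-")

# graphlib: each key maps to the set of steps that must come before it.
steps = {"install": {"enable"}, "debug": {"install"}}
order = list(TopologicalSorter(steps).static_order())

print(merged)
print(suffix)
print(order)
```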

    Using C debuggers in Python

    When a Python executable is highly optimized, such as the one shipped in RHEL, a typical C debugger doesn't work well. The debugger can't read many helpful pieces of information, such as function arguments, type information, and local variables.

    Python does have a built-in faulthandler module that prints the Python traceback when a crash occurs. But when a Python object is corrupted (by a buffer overflow or for any other reason), the executable can continue for a long time before crashing. In this case, knowing the crash location does not help, because the crash happens far from the code that corrupted the object. Usually, the crash occurs during a garbage collection, when Python visits all Python objects. It's therefore hard to guess how the object was corrupted.
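
    For reference, here is a minimal sketch of enabling the faulthandler module from code. The temporary log file is an assumption made for this example; by default the traceback goes to sys.stderr, and the same handler can be enabled without code changes with python3.9 -X faulthandler.

```python
import faulthandler
import tempfile

# Keep this file object alive for the lifetime of the process:
# faulthandler writes to its file descriptor when a fatal signal arrives.
crash_log = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)

# Dump the Python traceback of all threads on SIGSEGV, SIGFPE, SIGABRT, SIGBUS, and SIGILL.
faulthandler.enable(file=crash_log, all_threads=True)

# The same traceback format can be produced on demand, without a crash.
faulthandler.dump_traceback(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
print("traceback log:", crash_log.name)
```

    As the paragraph above explains, this traceback only shows where the process crashed, not where the object was corrupted.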

    Unfortunately, for various reasons, some bugs can be reproduced only on production systems, not on developers' workstations. This adds to the importance of a good debugger.

    Python can be built in debug mode, which adds many runtime checks. It helps to detect bugs such as corrupted Python objects. Prior to Python 3.9, a major usability issue was the need to rebuild C extensions in debug mode so they could run with a debug build of Python.

    How we improved the Python debug build

    I have been working for three years on the Python debugging experience to make it easier to use a C-language debugger such as GDB on Python. This section discusses the changes to Python that were required.

    ABI compatibility

    The first practical issue was that C extensions needed to be rebuilt in debug mode to be able to use a Python debug build.

    I made the Python debug build compatible at an application binary interface (ABI) level with the Python release build in Python issue 36465. The main PyObject C structure is now the same in release and debug builds.

    The debug build no longer defines the Py_TRACE_REFS macro, which caused the ABI incompatibility. If you want the macro, you need to explicitly request it through the ./configure --with-trace-refs build option. See the commit for more details.

    C extensions are no longer linked to libpython

    Another issue was that C extensions were linked to libpython. When a C extension was built in release mode and imported into a Python executable that was built in debug mode, the extension pulled in a version of libpython built in release mode, which was incompatible.

    Python functions such as PyLong_FromLong() are already loaded in the running Python process. C extensions inherit these symbols when their dynamic libraries are loaded. Therefore, linking C extensions to libpython explicitly is not strictly required.
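
    The point that these symbols already live in the running process can be observed from Python itself. The following sketch uses ctypes.pythonapi, which looks up C API functions in the Python runtime loaded in the current process. It is only an illustration of symbol resolution, not how C extensions are built.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the Python runtime loaded in this process.
PyLong_FromLong = ctypes.pythonapi.PyLong_FromLong
PyLong_FromLong.argtypes = [ctypes.c_long]
# py_object tells ctypes that the function returns a PyObject * (a Python object).
PyLong_FromLong.restype = ctypes.py_object

value = PyLong_FromLong(42)
print(value, type(value))
```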

    I modified how C extensions are built in Python 3.8 so the extensions are no longer linked to libpython: See Python issue 21536. Some RHEL packages contained C extensions that linked to libpython manually; these had to be modified further.

    Compiler optimizations disabled in the debug build

    Last but not least, the Python package was modified to build Python in debug mode with gcc -O0 rather than gcc -Og. The -Og option is meant to allow some optimizations that don't interfere with debug information. In practice, GDB is fully usable only on an executable built with -O0, which disables all compiler optimizations.
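
    To see which optimization level a given interpreter was compiled with, you can inspect its recorded build flags. This is a small sketch; optimization_flags() is a helper written for this example, and the CFLAGS configuration variable may be missing on platforms that do not use a Makefile-based build.

```python
import sysconfig


def optimization_flags(flags: str) -> list[str]:
    """Return the -O options found in a compiler flags string, in order."""
    return [flag for flag in flags.split() if flag.startswith("-O")]


# CFLAGS records the compiler flags used to build this interpreter.
cflags = sysconfig.get_config_var("CFLAGS") or ""
print(optimization_flags(cflags))
```

    When several -O options are listed, GCC applies the last one.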

    Debugging with GDB in Python 3.9

    The Python 3.9 debug build shipped with RHEL 8.4 combines all of these enhancements and is now usable with debuggers. A Python 3.9 executable built in debug mode can import C extensions built in release mode. In short, the python3.9d executable can be used as a seamless drop-in replacement for the usual python3.9 to help you run a debug session.
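
    If you are unsure which build a script is running on, a quick check is possible from Python. This sketch relies on sys.gettotalrefcount(), which is only present in debug builds; is_debug_build() is a helper name chosen for this example.

```python
import sys
import sysconfig


def is_debug_build() -> bool:
    """Return True when the running interpreter was built in debug mode."""
    # sys.gettotalrefcount() only exists when Python is built with Py_DEBUG.
    return hasattr(sys, "gettotalrefcount")


print("executable:", sys.executable)
print("debug build:", is_debug_build())
# On Unix, the ABI flags contain "d" for a debug build (the attribute is absent on Windows).
print("ABI flags:", getattr(sys, "abiflags", ""))
print("Py_DEBUG:", sysconfig.get_config_var("Py_DEBUG"))
```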

    A special debug build of Python can work with a C debugger pretty much like a C program. This section shows how to use GDB to debug a Python program, plus some special debugger commands Python provides.

    Before: Trying GDB on a Python release build

    Before showing how debugging works better with the new Python 3.9 debug build, let's start with the release build, which is not usable with GDB.

    First, install GDB and the Python 3.9 debug symbols:

    $ sudo yum install gdb
    $ sudo yum debuginfo-install python39
    

    Create a simple Python program named slow.py to play with GDB:

    import time
    def slow_function():
        print("Slow function...")
        x = 3
        time.sleep(60 * 10)
    slow_function()
    

    Debug slow.py in GDB and interrupt it with Ctrl+C:

    $ gdb -args python3.9 slow.py
    (gdb) run
    Slow function...
    ^C
    
    Program received signal SIGINT, Interrupt.
    0x00007ffff7b790e7 in select () from /lib64/libc.so.6
    
    (gdb) where
    #0  select () from /lib64/libc.so.6
    #1  pysleep (secs=<optimized out>) at .../Modules/timemodule.c:2036
    #2  time_sleep (self=<optimized out>, obj=<optimized out>, self=<optimized out>,
        obj=<optimized out>) at .../Modules/timemodule.c:365
    (...)
    #7  _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>,
        throwflag=<optimized out>) at .../Python/ceval.c:3487
    3487     res = call_function(tstate, &sp, oparg, NULL);
    (...)
    

    Note: The previous GDB output was reformatted and truncated to make it easier to read.

    If you try to explore the problem, you find that GDB fails to read the function arguments in pysleep():

    (gdb) frame 1
    #1  0x00007ffff757769a in pysleep (secs=<optimized out>)
        at .../Modules/timemodule.c:2036
    2036     err = select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &timeout);
    (gdb) p secs
    $1 = <optimized out>
    

    GDB also fails to read _PyEval_EvalFrameDefault() local variables:

    (gdb) frame 7
    #7  _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>,
        throwflag=<optimized out>)
        at .../Python/ceval.c:3487
    3487                res = call_function(tstate, &sp, oparg, NULL);
    (gdb) p opcode
    $11 = <optimized out>
    (gdb) p oparg
    $10 = <optimized out>
    

    In the previous output, GDB displays <optimized out>, rather than expected values. Usually, this means that CPU registers are used for these values. Since CPU registers are used for multiple purposes, GDB cannot guess whether the register currently contains the specified function argument or variable or something else.

    In addition, the python3.9 executable is built in release mode with link time optimization (LTO), profile guided optimization (PGO), and gcc -O2 optimizations. Because of these optimizations, when debugged functions get inlined by the compiler, GDB's where command can display invalid call stacks.

    After: Using GDB on the new debug build

    Now install the new Python 3.9 debug build:

    $ sudo yum module enable --enablerepo=rhel-CRB python39-devel
    $ sudo yum install --enablerepo=rhel-CRB python39-debug
    $ sudo yum debuginfo-install python39-debug
    

    These commands enable the python39-devel module, install the python39-debug package from this module, and then install debug symbols. The Red Hat CodeReady Linux Builder repository is enabled in these commands to get the python39-devel module.

    Now, run GDB again to debug the same slow.py program, but using python3.9d. Again, interrupt the program with Ctrl+C:

    $ gdb -args python3.9d slow.py
    (gdb) run
    Slow function...
    ^C
    
    Program received signal SIGINT, Interrupt.
    select () from /lib64/libc.so.6
    
    (gdb) where
    #0  select () from /lib64/libc.so.6
    #1  pysleep (secs=600000000000) at .../Modules/timemodule.c:2036
    #2  time_sleep (self=<module at remote 0x7ffff7eb73b0>, obj=600)
        at .../Modules/timemodule.c:365
    (...)
    #7  _PyEval_EvalFrameDefault (tstate=0x55555575a7e0,
            f=Frame 0x7ffff7ecb850, for file slow.py, line 5, in slow_function (x=3),
            throwflag=0) at .../Python/ceval.c:3487
    (...)
    

    Reading the pysleep() function arguments now gives the expected values:

    (gdb) frame 1
    #1  0x00007ffff754c156 in pysleep (secs=600000000000) at .../Modules/timemodule.c:2036
    2036        err = select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &timeout);
    (gdb) p secs
    $1 = 600000000000
    

    Reading _PyEval_EvalFrameDefault() local variables now also gives the expected values:

    (gdb) frame 7
    #7  _PyEval_EvalFrameDefault (...)
    3487                res = call_function(tstate, &sp, oparg, NULL);
    (gdb) p opcode
    $2 = 161
    (gdb) p oparg
    $3 = 1
    

    As you can see, the <optimized out> messages are gone. GDB works as expected thanks to the new executable built without compiler optimizations.

    Python commands in GDB

    Python comes with a libpython3.9(...)-gdb.py GDB extension (implemented in Python) that adds GDB commands prefixed by py-. Expanding this prefix with the tab key shows the available commands:

    (gdb) py-<tab><tab>
    py-bt  py-bt-full  py-down  py-list  py-locals  py-print  py-up
    

    The py-bt command displays the Python call stack:

    (gdb) py-bt
    Traceback (most recent call first):
      File "slow.py", line 5, in slow_function
        time.sleep(60 * 10)
      File "slow.py", line 6, in <module>
        slow_function()
    

    The py-locals command lists Python local variables:

    (gdb) py-locals
    x = 3
    

    The py-print command gets the value of a Python variable:

    (gdb) py-print x
    local 'x' = 3
    

    Additional debug checks

    Before the program even runs its first statement, a debug build of Python can detect potential problems. When Python is built in debug mode, many debug checks are executed at runtime to detect bugs in C extensions. For example:

    • Debug hooks are installed on memory allocators to detect buffer overflows and other memory errors.
    • Assertions are made on various function arguments.
    • The garbage collector (gc.collect() function) runs some checks on objects' consistency.

    See the Python debug build web page for more details.
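
    Some of these checks can also be switched on in a release build through the Python Development Mode (the -X dev option), which installs the debug hooks on memory allocators and enables faulthandler. It does not replace the debug build, since the C-level assertions are only compiled into a debug build. The sketch below starts a child interpreter in development mode and reports what was enabled; the probe string is written for this example.

```python
import subprocess
import sys

# Code executed by the child interpreter: report whether development mode
# and faulthandler are active.
probe = "import sys, faulthandler; print(sys.flags.dev_mode, faulthandler.is_enabled())"

result = subprocess.run(
    [sys.executable, "-X", "dev", "-c", probe],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```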

    Red Hat contributions to the Python debug build

    Red Hat contributed the following changes to Python upstream to enhance the Python debug build:

    • Adding assertions in the garbage collection module to make debugging easier with corrupted Python objects: See Python issue 9263. These enhancements were written by Dave Malcolm, maintained as downstream patches in Red Hat Enterprise Linux and Fedora, and pushed upstream in Python 3.8 in 2018. The change adds a new _PyObject_ASSERT() function that dumps the Python object that caused the assertion failure.
    • Detecting freed memory to avoid crashes when debugging Python: I added _PyObject_IsFreed() and _PyMem_IsFreed() functions. The visit_decref() function used by the Python garbage collector now detects freed memory and dumps the parent object on an attempt to access that memory: see Python issue 9263.
    • Maintenance of python-gdb.py and associated test_gdb regression tests: See Python issue 34989.

    Conclusion

    Python now works quite well with powerful open source debuggers such as GDB. We suggest you try out a Python debug build and GDB when you encounter a problem, especially a segmentation fault caused by a C extension to Python.

    Last updated: October 8, 2024
