Red Hat Enterprise Linux 8.2 brings faster Python 3.8 run speeds

June 25, 2020
Tomas Orsava, Victor Stinner, Petr Viktorin
Related topics:
C, C#, C++, Linux, Python
Related products:
Red Hat Enterprise Linux


    The Python interpreter shipped with Red Hat Enterprise Linux (RHEL) 8 is version 3.6, which was released in 2016. While Red Hat is committed to supporting the Python 3.6 interpreter for the lifetime of Red Hat Enterprise Linux 8, it is becoming a bit old for some use cases.

    For developers who need the new Python features—and who can live with the inevitable compatibility-breaking changes—Red Hat Enterprise Linux 8.2 also includes Python 3.8. Besides providing new features, packaging Python 3.8 with RHEL 8.2 allows us to release performance and packaging improvements more quickly than we could in the rock-solid python3 module.

    This article focuses on one specific performance improvement in the python38 package. As we'll explain, Python 3.8 is built with the GNU Compiler Collection (GCC)'s -fno-semantic-interposition flag. Enabling this flag disables semantic interposition, which can increase run speed by as much as 30%.

    Note: The python38 package joins other Python interpreters shipped in RHEL 8.2, including the python2 and python3 packages (which we described in a previous article, Python in RHEL 8). You can install Python 3.8 alongside the other Python interpreters so that it won't interfere with the existing Python stack.

    Where have I seen this before?

    Writing this article feels like taking credit for others' achievements. So, let us set this straight: The performance improvements we're discussing are others' achievements. As RHEL packagers, our role is similar to that of a gallery curator, rather than a painter: It is not our job to create features, but to seek out the best ones from the upstream Python project and combine them into a pleasing experience for developers after they've gone through review, integration, and testing in Fedora.

    Note that we do have "painter" roles on the team. But just as fresh paint does not belong in an exhibition hall, original contributions go to the broader community first and only appear in RHEL when they're well-tested (that is, somewhat boring and obvious).

    The discussions leading to the change we describe in this article include an initial naïve proposal by Red Hat's Python maintainers, a critique, a better idea by C expert Jan Kratochvil, and refining that idea. All of this back-and-forth happened openly on the Fedora development mailing list, with input from both Red Hatters and the wider community.

    Disabling semantic interposition in Python 3.8

    As we've mentioned, the most significant performance improvement in our RHEL 8.2 python38 package comes from building with GCC's -fno-semantic-interposition flag enabled. It increases run speed by as much as 30%, with little change to the semantics.

    How is that possible? There are a few layers to it, so let us explain.

    Python's C API

    All of Python's functionality is exposed in its extensive C API. A large part of Python's success comes from the C API, which makes it possible to extend and embed Python. Extensions are modules written in a language like C, which can provide functionality to Python programs. A classic example is NumPy, a library written in languages like C and Fortran that manipulates Python objects. Embedding means using Python from within a larger application. Applications like Blender or GIMP embed Python to allow scripting.

    Python (or more correctly, CPython, the reference implementation of the Python language) uses the C API internally: Every attribute access goes through a call to the PyObject_GetAttr function, every addition is a call to PyNumber_Add, and so on.

    Python's dynamic library

    Python can be built in two modes: static, where all code lives in the Python executable, or shared, where the Python executable is linked to its dynamic library called libpython. In Red Hat Enterprise Linux, Python is built in shared mode, because applications that embed Python, like Blender, use the Python C API of libpython.

    The python3.8 command is a minimalist example of embedding: It only calls the Py_BytesMain() function:

    int
    main(int argc, char **argv)
    {
        return Py_BytesMain(argc, argv);
    }
    

    All the code lives in libpython. For example, on RHEL 8.2, the size of /usr/bin/python3.8 is just around 8 KiB, whereas the size of the /usr/lib64/libpython3.8.so.1.0 library is around 3.6 MiB.

    Semantic interposition

    When executing a program, the dynamic loader allows you to override any symbol (such as a function) of the dynamic libraries that will be used in the program. You implement the override by setting the LD_PRELOAD environment variable to a shared library that defines the replacement symbol. This technique is called ELF symbol interposition, and GCC assumes by default that it may happen (that is, semantic interposition is enabled by default).

    Note: In Clang, semantic interposition is disabled by default.

    This feature is commonly used, among other things, to trace memory allocation (by overriding the libc malloc and free functions) or to change a single application's clocks (by overriding the libc time function). Semantic interposition is implemented using a procedure linkage table (PLT). Any function that can be overridden with LD_PRELOAD is looked up in a table before it is called.
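    As a minimal sketch of the clock example above, an interposer library can be as small as a single function. The file name fake_time.c and the fixed timestamp are assumptions chosen purely for illustration; only the time() signature comes from the C library.

```c
/* fake_time.c: hypothetical interposer library (illustration only). */
#include <stddef.h>
#include <time.h>

/* Arbitrary fixed timestamp picked for this sketch:
 * 2020-06-25 00:00:00 UTC, expressed as seconds since the epoch. */
#define FAKE_EPOCH ((time_t)1593043200)

/*
 * Same name and signature as the libc function. When this definition is
 * found before libc's, callers that resolve time() dynamically get this
 * version instead of the real clock.
 */
time_t time(time_t *tloc)
{
    if (tloc != NULL)
        *tloc = FAKE_EPOCH;
    return FAKE_EPOCH;
}
```

    Built as a shared object (for example, with gcc -shared -fPIC fake_time.c -o libfaketime.so) and named in LD_PRELOAD, a library like this would be expected to take precedence over libc's time() in a dynamically linked program. Programs that read the clock through other functions, such as clock_gettime(), would not be affected by this particular override.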

    Python calls libpython functions from other libpython functions. To respect semantic interposition, all of these calls must be looked up in the PLT. While this activity does introduce some overhead, the slowdown is negligible compared to the time spent in the called functions.

    Note: To trace memory allocations made by Python itself, you can use Python's tracemalloc module rather than an LD_PRELOAD override.

    LTO and function inlining

    In recent years, GCC has enhanced link-time optimization (LTO) to produce even more efficient code. One common optimization is to inline function calls, which means replacing a function call with a copy of the function's code. Once a function call is inlined, the compiler can go even further in terms of optimizations.

    However, it is not possible to inline functions that are looked up in the PLT. If the function can be swapped out entirely using LD_PRELOAD, the compiler cannot apply assumptions and optimizations based on what that function does.
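    The reason can be pictured with a small conceptual model in plain C. This is only an analogy: a real PLT is generated by the toolchain and filled in by the dynamic loader, and every name below (get_flag_slot, install_replacement, and so on) is hypothetical.

```c
/* Conceptual model only: not how a real PLT is implemented. */

typedef int (*get_flag_fn)(void);

/* The function the library author wrote. */
static int default_get_flag(void)
{
    return 0;
}

/* A replacement, standing in for a function supplied via LD_PRELOAD. */
static int replacement_get_flag(void)
{
    return 1;
}

/* Stand-in for a PLT slot: its target is only known at run time. */
static get_flag_fn get_flag_slot = default_get_flag;

/* Stand-in for the loader binding the slot to the preloaded symbol. */
void install_replacement(void)
{
    get_flag_slot = replacement_get_flag;
}

/* Indirect call: the compiler cannot assume which function will run,
 * so it cannot copy the callee's body into this function. */
int check_via_slot(void)
{
    return get_flag_slot() != 0;
}

/* Direct call to a known function: the compiler is free to inline it
 * and then simplify the comparison. */
int check_direct(void)
{
    return default_get_flag() != 0;
}
```

    After install_replacement() runs, check_via_slot() observes the replacement while check_direct() does not. That difference is the trade-off between keeping a call interposable and letting the compiler optimize it.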

    GCC 5.3 introduced the -fno-semantic-interposition flag, which disables semantic interposition. With this flag, functions in libpython that call other libpython functions don't have to go through the PLT indirection anymore. As a result, they can be inlined and optimized with LTO.

    So, that's what we did. We enabled the -fno-semantic-interposition flag in Python 3.8.
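    To experiment with the flag outside of Python, here is a minimal sketch of a shared library in which one exported function calls another, which is the same shape as libpython calling into itself. The file name libdemo.c and the demo_* functions are hypothetical and are not part of CPython.

```c
/* libdemo.c: hypothetical two-function library, not CPython code. */

/* Internal state, standing in for something like a pending error. */
static int demo_flag = 0;

/* Exported setter so that callers can change the state. */
void demo_set_flag(int value)
{
    demo_flag = value;
}

/* Exported getter: small enough to be an obvious inlining candidate. */
int demo_get_flag(void)
{
    return demo_flag;
}

/*
 * Exported caller. In a shared library built with semantic interposition
 * enabled, the call to demo_get_flag() must stay replaceable. With
 * -fno-semantic-interposition, the compiler may inline it instead.
 */
int demo_check(int result)
{
    int flag_set = (demo_get_flag() != 0);
    return flag_set && result == 0;
}
```

    One way to compare the two builds, assuming GCC and binutils are installed, is to compile once with gcc -O2 -fPIC -shared libdemo.c -o libdemo.so and once more with -fno-semantic-interposition added, then inspect demo_check in each result with objdump -d. You would typically expect a call through demo_get_flag@plt in the first build and a direct memory load in the second, although the exact output depends on the compiler version and target.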

    Drawbacks of -fno-semantic-interposition

    The main drawback of building Python with -fno-semantic-interposition enabled is that we can no longer override libpython functions using LD_PRELOAD. However, the impact is limited to libpython. It is still possible, for example, to override malloc/free from libc to trace memory allocations.

    Even so, this is an incompatibility: We do not know whether developers are using LD_PRELOAD with Python on RHEL 8 in a way that would break with -fno-semantic-interposition. That is why we only enabled the change in the new Python 3.8, while Python 3.6 (the default python3) continues to work as before.

    Performance comparison

    To see the -fno-semantic-interposition optimization in practice, let's take a look at the _Py_CheckFunctionResult() function. This function is used by Python to check whether a C function either returned a result (is not NULL) or raised an exception.

    Here is the simplified C code:

    PyObject*
    PyErr_Occurred(void)
    {
        PyThreadState *tstate = _PyRuntime.gilstate.tstate_current;
        return tstate->curexc_type;
    }
    
    PyObject*
    _Py_CheckFunctionResult(PyObject *callable, PyObject *result,
                            const char *where)
    {
        int err_occurred = (PyErr_Occurred() != NULL);
        ...
    }
    

    Assembly code with semantic interposition enabled

    Let's first take a look at Python 3.6 in Red Hat Enterprise Linux 7, which has not been built with -fno-semantic-interposition. Here is an extract of the assembly code (as shown by GDB's disassemble command):

    Dump of assembler code for function _Py_CheckFunctionResult:
    (...)
    callq  0x7ffff7913d50 <PyErr_Occurred@plt>
    (...)
    

    As you can see, _Py_CheckFunctionResult() calls PyErr_Occurred(), and the call has to go through a PLT indirection.

    Assembly code with semantic interposition disabled

    Now let's look at an extract of the same assembly code after disabling semantic interposition:

    Dump of assembler code for function _Py_CheckFunctionResult:
    (...)
    mov 0x40f7fe(%rip),%rcx # rcx = &_PyRuntime
    mov 0x558(%rcx),%rsi    # rsi = tstate = _PyRuntime.gilstate.tstate_current
    (...)
    mov 0x58(%rsi),%rdi     # rdi = tstate->curexc_type
    (...)
    

    In this case, GCC inlined the PyErr_Occurred() function call. As a result, _Py_CheckFunctionResult() gets the tstate directly from _PyRuntime, and then reads its member tstate->curexc_type directly. There is no function call and no PLT indirection, which results in faster performance.

    Note: In more complex situations, the GCC compiler is free to optimize the inlined function even more, according to the context in which it is called.

    Try it for yourself!

    In this article, we focused on one specific improvement on the performance side, leaving new features to the upstream documents What's New In Python 3.7 and What's New In Python 3.8. If you are intrigued by the new compiler performance possibilities in Python 3.8, grab the python38 package from the Red Hat Enterprise Linux 8 repository and try it out. We hope you will enjoy the run speed-up, as well as a host of other new features that you will discover for yourself.

    Last updated: February 5, 2024
