Recent improvements to concurrent code in glibc

January 28, 2015
Torvald Riegel

    In this post, I will give examples of recent improvements to concurrent code in glibc, the GNU C library, in the upstream community project. Concurrent code is code that can be executed by multiple threads at the same time and has to coordinate accesses to shared data using synchronization. While some of these improvements are user-visible, many of them are not, but they can serve as examples of how concurrent code in other code bases can be improved.

    One of the user-visible improvements is a new implementation of Pthreads semaphores that I contributed. It places fewer requirements on when a program can destroy a semaphore. Previously, programs had to wait for all calls to sem_wait or sem_post to return before they were allowed to call sem_destroy; now, under certain conditions, a thread that returned from sem_wait can call sem_destroy immediately, even though the matching sem_post call has woken this thread but has not returned yet. This works if, for example, the semaphore is effectively a reference counter for itself; specifically, the program must still ensure that there are no other concurrent, in-flight sem_wait calls or sem_post calls that are yet to increment the semaphore. The new semaphore implementation is portable code because it is based on C11 atomic operations (see below), and it replaces several architecture-specific implementations.
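
    To make the destruction rule concrete, here is a minimal sketch, assuming a glibc that contains the new semaphore implementation. The struct job type and the worker and run_job functions are hypothetical names introduced for illustration; only the sem_* and pthread_* calls are real POSIX interfaces. The semaphore lives inside the object it guards, the worker's sem_post is its last access to that object, and the waiter destroys the semaphore as soon as sem_wait returns.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical job object: the semaphore that signals completion is
   stored inside the object whose lifetime it controls.  */
struct job {
  sem_t done;
  int result;
};

static void *worker(void *arg) {
  struct job *j = arg;
  j->result = 42;
  /* This must be the worker's last access to *j: once the waiter is
     woken, it may destroy the semaphore and free the object, possibly
     before this sem_post call has returned.  */
  sem_post(&j->done);
  return NULL;
}

/* Returns the worker's result, or -1 if setup fails.  */
int run_job(void) {
  struct job *j = malloc(sizeof *j);
  if (j == NULL)
    return -1;
  if (sem_init(&j->done, 0, 0) != 0) {
    free(j);
    return -1;
  }
  pthread_t t;
  if (pthread_create(&t, NULL, worker, j) != 0) {
    sem_destroy(&j->done);
    free(j);
    return -1;
  }
  /* Retry if a signal handler interrupts the wait.  */
  while (sem_wait(&j->done) != 0)
    assert(errno == EINTR);
  int result = j->result;
  /* No other sem_wait or sem_post on this semaphore is in flight or
     still to come, so it can be destroyed right away.  */
  sem_destroy(&j->done);
  free(j);
  pthread_join(t, NULL);
  return result;
}
```

    Calling run_job() from main should return the worker's result. Note that the semaphore is destroyed before pthread_join, which is the ordering that programs previously had to avoid.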

    Another recent improvement, contributed by others in the glibc community, is support for transactional lock elision of Pthreads mutexes on PowerPC; this can improve the performance of critical sections by using Hardware Transactional Memory to execute the code speculatively in parallel, only falling back to using locks when necessary. This complements the existing lock elision support on s390 and Intel systems. Lock elision needs to be explicitly enabled when building glibc.

    I have also worked on improving glibc's internal interfaces around futexes. Futexes are an abstraction offered by the Linux kernel that allows a thread to block, with the help of the operating system, until it is woken up by another thread or a timeout expires (i.e., unlike when spin-waiting in a busy loop, the OS is aware of the blocking relationship and can, for example, execute another thread on this CPU while the original thread is blocked). This is ongoing work, and also involves collaboration with the kernel community, which is currently improving the documentation of the futex operations offered by the kernel. Doing so is important because futexes are very useful - yet have not been specified in full detail previously. The efforts in the kernel and the glibc communities should help make futexes easier to use in other programs.
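
    The blocking behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not glibc's internal interface: futex_wait, futex_wake, signaler, and wait_for_event are hypothetical helpers, the code is Linux-specific, and it invokes the kernel through syscall() because glibc exports no public futex function. It also assumes that atomic_uint is a plain 32-bit word, as the futex operations require.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: block while *addr still equals expected.  The
   kernel rechecks the value before sleeping, so a wake-up that happens
   between our check and the system call is not lost.  */
static long futex_wait(atomic_uint *addr, unsigned int expected) {
  return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

/* Hypothetical wrapper: wake up to count threads blocked on addr.  */
static long futex_wake(atomic_uint *addr, int count) {
  return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, count, NULL, NULL, 0);
}

static atomic_uint event; /* 0 = not signaled, 1 = signaled */

static void *signaler(void *arg) {
  (void) arg;
  atomic_store_explicit(&event, 1, memory_order_release);
  futex_wake(&event, INT_MAX);
  return NULL;
}

/* Blocks until the event is signaled; returns the observed flag value,
   or 0 if the signaler thread could not be created.  */
unsigned int wait_for_event(void) {
  pthread_t t;
  atomic_store_explicit(&event, 0, memory_order_relaxed);
  if (pthread_create(&t, NULL, signaler, NULL) != 0)
    return 0;
  /* futex_wait can return early (value already changed, signal, or a
     spurious wake-up), so always recheck the flag in a loop.  */
  while (atomic_load_explicit(&event, memory_order_acquire) == 0)
    futex_wait(&event, 0);
  pthread_join(t, NULL);
  return atomic_load_explicit(&event, memory_order_relaxed);
}
```

    Unlike a busy loop, the waiting thread in wait_for_event() consumes no CPU time while it is blocked inside the kernel.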

    I want to conclude this overview with a very low-level improvement, which I consider very important: We have started to transition glibc to using the C11 memory model. In a nutshell, a memory model defines the behavior of a multi-threaded program, in particular how the sequential instruction stream in each thread communicates with other threads through reads and writes to shared memory. A previous post explains the C11/C++11 memory model in some more detail (note that while the interfaces that C11 and C++11 provide for synchronization differ, the model itself is intentionally the same).

    I mention this because the memory model really is the foundation on top of which other concurrency abstractions such as mutexes or semaphores can be implemented in glibc. The C11 memory model is a programming language's memory model, and can be implemented on top of the memory models of the various hardware architectures supported by glibc.
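
    As a small illustration of what the model specifies, the following sketch passes a message between two threads using standard C11 atomics from <stdatomic.h>. The names data, ready, writer, and read_message are hypothetical and chosen only for this example. The release store and the acquire load that reads from it establish a happens-before relation, which is what makes the plain, non-atomic access to data well-defined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int data;         /* plain, non-atomic payload */
static atomic_int ready; /* flag used for synchronization */

static void *writer(void *arg) {
  (void) arg;
  data = 2015;
  /* Release store: the write to data above happens-before anything
     that follows an acquire load which reads this value.  */
  atomic_store_explicit(&ready, 1, memory_order_release);
  return NULL;
}

/* Returns the payload published by the writer, or -1 if the writer
   thread could not be created.  */
int read_message(void) {
  pthread_t t;
  atomic_store_explicit(&ready, 0, memory_order_relaxed);
  if (pthread_create(&t, NULL, writer, NULL) != 0)
    return -1;
  /* Acquire load: once it reads 1, the writer's store to data is
     guaranteed to be visible to this thread.  */
  while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
    ; /* spin until the flag is set */
  int seen = data;
  pthread_join(t, NULL);
  return seen;
}
```

    If both accesses to ready used memory_order_relaxed instead, the read of data would be a data race and the program's behavior would be undefined under the model.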

    Previously - and still in the vast majority of glibc concurrent code that has not been changed to use the C11 model - we were using an insufficiently documented memory model and a mixture of normal memory accesses and architecture-specific assembly implementations of atomic operations to synchronize in shared memory. This allowed experts to write correct concurrent code, but using the C11 memory model and atomic operations has advantages:

    • Over time, more programmers will become familiar with the C11 memory model; using it lowers the learning curve for new glibc developers. Also, we get access to the existing and future tool support for this model. For example, the cppmem tool is a great way to explore all possible executions of small snippets of C11-like concurrent code; it runs in your browser, and can be really helpful to understand the model by example - and interactively!
    • It puts glibc's interactions with, and expectations on, the compiler on a well-specified foundation, namely the C11 memory model. Prior to C11, C didn't define behavior of multi-threaded programs, so one often essentially relied on knowledge about compilers and their specific implementation to write working concurrent code.
    • Requiring code to be data-race free is necessary to tell the compiler which memory accesses are part of synchronization and must not be optimized like sequential code. A useful side effect of this is that it becomes easy to spot which accesses to memory are actually part of synchronization, because either the atomic operation or the data type is visible in the code; in other words, it makes potentially complex concurrent code stand out more.
    • In glibc's case, we don't lose anything in terms of performance, at least when a decent compiler is used to build glibc. The previous atomic operations are a subset of what is offered by C11, and in a few cases we were able to select the required hardware barriers more carefully.
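
    The kind of small snippet that tools like cppmem explore can also be written as ordinary C11. The following sketch is the classic store-buffering test; thread_a, thread_b, and count_both_zero are hypothetical names used only here. With sequentially consistent atomics, the model forbids the outcome in which both threads read 0, whereas weaker memory orders would allow it. Seeing that outcome never occur in a test run does not prove anything by itself; the guarantee comes from the model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;
static int r1, r2; /* written by the threads, read after joining them */

static void *thread_a(void *arg) {
  (void) arg;
  atomic_store_explicit(&x, 1, memory_order_seq_cst);
  r1 = atomic_load_explicit(&y, memory_order_seq_cst);
  return NULL;
}

static void *thread_b(void *arg) {
  (void) arg;
  atomic_store_explicit(&y, 1, memory_order_seq_cst);
  r2 = atomic_load_explicit(&x, memory_order_seq_cst);
  return NULL;
}

/* Runs the test `rounds` times and returns how often both threads read
   0, or -1 if a thread could not be created.  */
int count_both_zero(int rounds) {
  int both_zero = 0;
  for (int i = 0; i < rounds; i++) {
    pthread_t a, b;
    /* Reset before the threads start; pthread_create orders these
       stores before everything the new threads do.  */
    atomic_store_explicit(&x, 0, memory_order_relaxed);
    atomic_store_explicit(&y, 0, memory_order_relaxed);
    if (pthread_create(&a, NULL, thread_a, NULL) != 0)
      return -1;
    if (pthread_create(&b, NULL, thread_b, NULL) != 0) {
      pthread_join(a, NULL);
      return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    if (r1 == 0 && r2 == 0)
      both_zero++;
  }
  return both_zero;
}
```

    Because the atomic operations and the atomic_int type are visible in the code, it is immediately clear that x and y are used for synchronization, while r1 and r2 are ordinary data that is only read after the threads have been joined.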

    Of course, transforming all concurrent code in a code base as big as glibc is not a simple change but rather a multi-phase process with incremental steps. We made a few trade-offs to ease transitioning to the C11 model, which are explained in more detail on a glibc wiki page. Here are a few choices we made that might be useful approaches for your code bases too:

    • We introduce the new, C11-like atomic operations alongside the old atomic operations, allowing us to transition one cluster of atomic code in glibc at a time instead of having to switch everything at once. The C11-like atomics do have the same semantics as their counterparts in C11 but different names, so that we do not conflict with any actual C11 code.
    • We require the memory order of an atomic operation to always be explicitly specified. We want efficient code, so programmers should make a conscious choice.
    • We do not use explicitly atomic types for data accessed by atomic operations (e.g., equivalents of C11's atomic types). Note that this is not ideal, and I would not recommend it for new code. Nonetheless, we know that the existing data type declarations in glibc work correctly, so things like requirements on alignment to make atomic operations work are already taken care of. Perhaps more importantly, we require all accesses to a certain variable to use atomic operations if just one atomic access to it exists; thus, the compilers we support can tell that the variable is in fact used for synchronization.
    • The documentation guidelines request that concurrent code should be documented using the terms and semantics specified in the C11 memory model (e.g., relations such as happens-before or reads-from). There should not be a disconnect between the model the code is based on and the terminology used to document the code.
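
    The choices above can be sketched in a few lines. The my_atomic_* macros below are hypothetical stand-ins that follow the same idea and are not glibc's internal definitions; they assume a GCC-compatible compiler that provides the __atomic built-ins. Each macro encodes the memory order in its name, operates on a variable of plain type, and uses a name that cannot collide with C11's own atomic_* functions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical C11-like wrappers with the memory order in the name.  */
#define my_atomic_load_relaxed(mem) \
  __atomic_load_n((mem), __ATOMIC_RELAXED)
#define my_atomic_store_relaxed(mem, val) \
  __atomic_store_n((mem), (val), __ATOMIC_RELAXED)
#define my_atomic_fetch_add_relaxed(mem, val) \
  __atomic_fetch_add((mem), (val), __ATOMIC_RELAXED)

enum { NTHREADS = 4, INCREMENTS = 10000 };

/* Plain type, not an _Atomic one.  By convention, every access to this
   variable goes through the wrappers above.  */
static unsigned int hits;

static void *bump(void *arg) {
  (void) arg;
  /* Relaxed order suffices: the counter publishes no other data, and
     only the atomicity of each increment matters.  */
  for (int i = 0; i < INCREMENTS; i++)
    my_atomic_fetch_add_relaxed(&hits, 1u);
  return NULL;
}

/* Returns the final counter value, or 0 if a thread could not be
   created.  */
unsigned int count_hits(void) {
  pthread_t t[NTHREADS];
  int created = 0;
  my_atomic_store_relaxed(&hits, 0u);
  while (created < NTHREADS
         && pthread_create(&t[created], NULL, bump, NULL) == 0)
    created++;
  for (int i = 0; i < created; i++)
    pthread_join(t[i], NULL);
  if (created < NTHREADS)
    return 0;
  /* Each pthread_join synchronizes with the end of the joined thread,
     so all increments happen-before this load.  */
  return my_atomic_load_relaxed(&hits);
}
```

    The comments describe the synchronization in terms of the model (atomicity, happens-before), in the spirit of the documentation guideline in the last item.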

    I hope that this look at what has been happening recently in the upstream glibc project was interesting for you. Feel free to leave comments if you have further questions.

    Last updated: April 6, 2018
