
Implementing C++20 semaphores

April 18, 2023
Thomas Rodgers

    C++20 introduces counting_semaphore and binary_semaphore, which support blocking acquire() and non-blocking try_acquire() as well as timed try_acquire_for() and try_acquire_until(). On platforms that support __platform_wait()/__platform_notify(), we select an implementation strategy based on atomics; otherwise, we attempt to use POSIX semaphores.
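
    As a quick usage refresher before digging into the library internals, a counting_semaphore is typically used to bound concurrent access to a shared resource. Here is a minimal caller-side sketch (the pool size of 4 and the 16 worker threads are arbitrary values chosen for illustration):

        #include <semaphore>
        #include <thread>
        #include <vector>

        // At most 4 threads may hold a slot at once.
        std::counting_semaphore<4> pool(4);

        void worker()
        {
          pool.acquire();   // blocks while the internal counter is zero
          // ... use one of the 4 pooled resources ...
          pool.release();   // increments the counter, waking a blocked acquirer if any
        }

        int main()
        {
          std::vector<std::jthread> threads;
          for (int i = 0; i < 16; ++i)
            threads.emplace_back(worker);
        }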

    If you missed the previous article, read it here: Implementing C++20 atomic waiting in libstdc++

    Semaphores in C++

    Here's what the ISO C++ Standard has to say on the matter of semaphores:

    1 Class template counting_semaphore maintains an internal counter that is initialized when the semaphore is created. The counter is decremented when a thread acquires the semaphore, and is incremented when a thread releases the semaphore. If a thread tries to acquire the semaphore when the counter is zero, the thread will block until another thread increments the counter by releasing the semaphore.

    2 least_max_value shall be non-negative; otherwise the program is ill-formed.

    3 Concurrent invocations of the member functions of counting_semaphore, other than its destructor, do not introduce data races.

    void release(ptrdiff_t update = 1);

    8 Preconditions: update >= 0 is true, and update <= max() - counter is true.

    9 Effects: Atomically execute counter += update. Then, unblocks any threads that are waiting for counter to be greater than zero.

    10 Synchronization: Strongly happens before invocations of try_acquire that observe the result of the effects.

    11 Throws: system_error when an exception is required (32.2.2).

    12 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).

    bool try_acquire() noexcept;

    13 Effects: Attempts to atomically decrement counter if it is positive, without blocking. If counter is not decremented, there is no effect and try_acquire immediately returns. An implementation may fail to decrement counter even if it is positive. [Note 1 : This spurious failure is normally uncommon, but allows interesting implementations based on a simple compare and exchange (Clause 31). — end note] An implementation should ensure that try_acquire does not consistently return false in the absence of contending semaphore operations.

    14 Returns: true if counter was decremented, otherwise false.

    void acquire();

    15 Effects: Repeatedly performs the following steps, in order:

    • (15.1) — Evaluates try_acquire. If the result is true, returns.
    • (15.2) — Blocks on *this until counter is greater than zero.

    16 Throws: system_error when an exception is required (32.2.2).

    17 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).

    template<class Rep, class Period> bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
    template<class Clock, class Duration> bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);

    18 Effects: Repeatedly performs the following steps, in order:

    • (18.1) — Evaluates try_acquire(). If the result is true, returns true.
    • (18.2) — Blocks on *this until counter is greater than zero or until the timeout expires. If it is unblocked by the timeout expiring, returns false. The timeout expires (32.2.4) when the current time is after abs_time (for try_acquire_until) or when at least rel_time has passed from the start of the function (for try_acquire_for).

    19 Throws: Timeout-related exceptions (32.2.4), or system_error when a non-timeout-related exception is required (32.2.2).

    20 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
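
    To make the quoted wording concrete, here is a small caller-side sketch of the non-blocking and timed members (the 100ms timeout is an arbitrary value for illustration):

        #include <chrono>
        #include <iostream>
        #include <semaphore>

        int main()
        {
          using namespace std::chrono_literals;
          std::binary_semaphore sem(0);    // counter starts at zero

          // Nothing has called release() yet, so this blocks ~100ms and returns false.
          if (!sem.try_acquire_for(100ms))
            std::cout << "timed out\n";

          sem.release();                   // counter becomes 1
          // The counter is positive, so this normally succeeds without blocking
          // (the standard permits spurious failures of try_acquire).
          if (sem.try_acquire())
            std::cout << "acquired\n";
        }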

    OK, so how to implement it?

    If we have POSIX semaphores available, the following type will be defined:

        struct __platform_semaphore
        {
          using __clock_t = chrono::system_clock;
      #ifdef SEM_VALUE_MAX
          static constexpr ptrdiff_t _S_max = SEM_VALUE_MAX;
      #else
          static constexpr ptrdiff_t _S_max = _POSIX_SEM_VALUE_MAX;
      #endif
    
          explicit __platform_semaphore(ptrdiff_t __count) noexcept
          { sem_init(&_M_semaphore, 0, __count); }
    
          ~__platform_semaphore()
          { sem_destroy(&_M_semaphore); }
    
          // ...
        };
    

    And implements acquire/try_acquire/release:

        struct __platform_semaphore
        {
          // ...
    
          void
          _M_acquire() noexcept
          {
            for (;;)
            {
              auto __err = sem_wait(&_M_semaphore);
              if (__err && (errno == EINTR))
                continue;
              else if (__err)
                std::terminate();
              else
                break;
            }
          }
    
          _GLIBCXX_ALWAYS_INLINE bool
          _M_try_acquire() noexcept
          {
            for (;;)
            {
              auto __err = sem_trywait(&_M_semaphore);
              if (__err && (errno == EINTR))
                continue;
              else if (__err && (errno == EAGAIN))
                return false;
              else if (__err)
                std::terminate();
              else
                break;
            }
            return true;
          }
    
          _GLIBCXX_ALWAYS_INLINE void
          _M_release(std::ptrdiff_t __update) noexcept
          {
            for (; __update != 0; --__update)
            {
              auto __err = sem_post(&_M_semaphore);
              if (__err)
                std::terminate();
            }
          }
          // ...
        };
    

    For the timed wait operations, we have to worry about what the "basis" clock is and how to convert the user-supplied clock. For POSIX semaphores, this is handled as follows:

        struct __platform_semaphore
        {
          using __clock_t = chrono::system_clock;
          // ...
    
          bool
          _M_try_acquire_until_impl(const chrono::time_point<__clock_t>& __atime)
            noexcept
          {
    
            auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
            auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

            struct timespec __ts =
            {
              static_cast<std::time_t>(__s.time_since_epoch().count()),
              static_cast<long>(__ns.count())
            };

            for (;;)
            {
              if (auto __err = sem_timedwait(&_M_semaphore, &__ts))
                {
                  if (errno == EINTR)
                    continue;
                  else if (errno == ETIMEDOUT || errno == EINVAL)
                    return false;
                  else
                    std::terminate();
                }
              else
                break;
            }
            return true;
          }

          template<typename _Clock, typename _Duration>
            bool
            _M_try_acquire_until(const chrono::time_point<_Clock,
                                 _Duration>& __atime) noexcept
            {
              if constexpr (std::is_same_v<__clock_t, _Clock>)
                {
                  return _M_try_acquire_until_impl(__atime);
                }
              else
                {
                  const typename _Clock::time_point __c_entry = _Clock::now();
                  const auto __s_entry = __clock_t::now();
                  const auto __delta = __atime - __c_entry;
                  const auto __s_atime = __s_entry + __delta;
                  if (_M_try_acquire_until_impl(__s_atime))
                    return true;
    
                  // We got a timeout when measured against __clock_t but
                  // we need to check against the caller-supplied clock
                  // to tell whether we should return a timeout.
                  return (_Clock::now() < __atime);
                }
            }
        };
    

    If we have support for atomic::wait()/notify_one()/notify_all(), then the following type will be defined:

        struct __atomic_semaphore
        {
          static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits<int>::__max;
          explicit __atomic_semaphore(__detail::__platform_wait_t __count) noexcept
            : _M_counter(__count)
          {
            __glibcxx_assert(__count >= 0 && __count <= _S_max);
          }
    
    
          static _GLIBCXX_ALWAYS_INLINE bool
          _S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
          {
            auto __old = __atomic_impl::load(__counter, memory_order::acquire);
            if (__old == 0)
              return false;
    
            return __atomic_impl::compare_exchange_strong(__counter,
                                                          __old, __old - 1,
                                                          memory_order::acquire,
                                                          memory_order::relaxed);
          }
    
          _GLIBCXX_ALWAYS_INLINE void
          _M_acquire() noexcept
          {
            auto const __pred = [this]
                                {return _S_do_try_acquire(&this->_M_counter);};
            std::__atomic_wait_address(&_M_counter, __pred);
          }
    
          bool
          _M_try_acquire() noexcept
          {
            auto const __pred = [this]
                                { return _S_do_try_acquire(&this->_M_counter); };
            return std::__detail::__atomic_spin(__pred);
          }
    
          // ...
        };
    

    This expects an implementation of __atomic_wait_address that can accept a predicate; however, we have only defined __atomic_wait_address_v to this point (see part 1).

      template<typename _EntersWait>
        struct __waiter
        {
          // ...
    
          template<typename _Pred>
            void
            _M_do_wait(_Pred __pred) noexcept
            {
              do
               {
                 __platform_wait_t __val;
                 if (__base_type::_M_do_spin(__pred, __val))
                   return;
                 __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
               }
              while (!__pred());
            }
        };
    
        template<typename _Tp, typename _Pred>
          void
          __atomic_wait_address(const _Tp* __addr, _Pred __pred) noexcept
          {
            __detail::__enters_wait __w(__addr);
            __w._M_do_wait(__pred);
          }
     

    There is a problem with this formulation: __enters_wait will atomically increment and decrement a waiter count, which __atomic_semaphore does not need because it inherently tracks the presence of waiters. In the previous article, we introduced the notion of __enters_wait and __bare_wait types to be used by waiters and notifiers, respectively. We now extend that notion to also support "bare" waiters.

      // This call is to be used by atomic types which track contention externally
      template<typename _Pred>
        void
        __atomic_wait_address_bare(const __detail::__platform_wait_t* __addr,
                                   _Pred __pred) noexcept
        {
    #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
          do
            {
              __detail::__platform_wait_t __val;
              if (__detail::__bare_wait::_S_do_spin(__addr, __pred, __val))
                return;
              __detail::__platform_wait(__addr, __val);
            }
          while (!__pred());
    #else // !_GLIBCXX_HAVE_PLATFORM_WAIT
          __detail::__bare_wait __w(__addr);
          __w._M_do_wait(__pred);
    #endif
        }
    

    We also need to introduce a corresponding "bare" notify, which skips checks for waiters:

        struct __waiter_pool
        {
          // ...
    
          void
          _M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept
          {
            if (!(__bare || _M_waiting()))
              return;
    
    #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
            __platform_notify(__addr, __all);
    #else
            if (__all)
              _M_cv.notify_all();
            else
              _M_cv.notify_one();
    #endif
          }
        };
    

    Supporting timed waits on __atomic_semaphore

    The C++ Standard has extensive support for durations and time points and specifies waiting operations in terms of those types:

      template<class Rep, class Period>
        bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
    
      template<class Clock, class Duration>
        bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
    

    However, <chrono> is not supported in the freestanding (non-hosted) subset of the standard library, but <atomic> is. This means we can't inject a dependency on <chrono> into <atomic> or its underlying bits/atomic_base.h header. For this reason, all timed waiting functionality is split out into bits/atomic_timed_wait.h. We preferentially use std::chrono::steady_clock as our basis clock on platforms that have futexes or pthread_cond_clockwait. Otherwise, we fall back to system_clock and convert all user-supplied time points to this clock:

        #if defined(_GLIBCXX_HAVE_LINUX_FUTEX) || defined(_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT)
          using __wait_clock_t = chrono::steady_clock;
        #else
          using __wait_clock_t = chrono::system_clock;
        #endif
    
        template<typename _Clock, typename _Dur>
          __wait_clock_t::time_point
          __to_wait_clock(const chrono::time_point<_Clock, _Dur>& __atime) noexcept
          {
            const typename _Clock::time_point __c_entry = _Clock::now();
            const __wait_clock_t::time_point __w_entry = __wait_clock_t::now();
            const auto __delta = __atime - __c_entry;
            using __w_dur = typename __wait_clock_t::duration;
            return __w_entry + chrono::ceil<__w_dur>(__delta);
          }
    
        template<typename _Dur>
          __wait_clock_t::time_point
          __to_wait_clock(const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
          {
            using __w_dur = typename __wait_clock_t::duration;
            return chrono::ceil<__w_dur>(__atime);
          }
    

    If a platform supports an efficient __platform_wait(), it is also expected to provide an efficient timed wait version of the same called __platform_wait_until(). For Linux Futexes, we implement this as:

      // returns true if wait ended before timeout
      template<typename _Dur>
        bool
        __platform_wait_until_impl(const __platform_wait_t* __addr,
                                   __platform_wait_t __old,
                                   const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
        {
          auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
          auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

          struct timespec __rt =
          {
            static_cast<std::time_t>(__s.time_since_epoch().count()),
            static_cast<long>(__ns.count())
          };

          auto __e = syscall (SYS_futex, __addr,
                              static_cast<int>(__futex_wait_flags::__wait_bitset_private),
                              __old, &__rt, nullptr,
                              static_cast<int>(__futex_wait_flags::__bitset_match_any));

          if (__e)
          {
            if (errno == ETIMEDOUT)
              return false;
            if (errno != EINTR && errno != EAGAIN)
              __throw_system_error(errno);
          }
          return true;
        }
    
        // returns true if wait ended before timeout
        template<typename _Clock, typename _Dur>
          bool
          __platform_wait_until(const __platform_wait_t* __addr, __platform_wait_t __old,
          const chrono::time_point<_Clock, _Dur>& __atime)
          {
            if constexpr (is_same_v<__wait_clock_t, _Clock>)
              {
                return __platform_wait_until_impl(__addr, __old, __atime);
              }
            else
              {
                if (!__platform_wait_until_impl(__addr, __old,
                                                __to_wait_clock(__atime)))
                {
                  // We got a timeout when measured against __clock_t but
                  // we need to check against the caller-supplied clock
                  // to tell whether we should return a timeout.
                  if (_Clock::now() < __atime)
                    return true;
                }
                return false;
              }
          }
    

    As with __platform_wait(), for platforms that do not provide __platform_wait_until(), we fall back to using a mutex and condition variable, and we define the following support functions:

        // Returns true if wait ended before timeout.
        // _Clock must be either steady_clock or system_clock.
        template<typename _Clock, typename _Dur>
          bool
          __cond_wait_until_impl(__condvar& __cv, mutex& __mx,
                                 const chrono::time_point<_Clock, _Dur>& __atime)
          {
            static_assert(std::__is_one_of<_Clock,
                                           chrono::steady_clock,
                                           chrono::system_clock>::value);
    
            auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
            auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
    
            __gthread_time_t __ts =
            {
              static_cast<std::time_t>(__s.time_since_epoch().count()),
              static_cast<long>(__ns.count())
            };
    
            // if we have a pthreads implementation that supports CLOCK_MONOTONIC
            // and the caller's clock is steady_clock, use that
        #ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
            if constexpr (is_same_v<chrono::steady_clock, _Clock>)
              __cv.wait_until(__mx, CLOCK_MONOTONIC, __ts);
            else
        #endif
            __cv.wait_until(__mx, __ts);
            return _Clock::now() < __atime;
          }
    
        // returns true if wait ended before timeout
        template<typename _Clock, typename _Dur>
          bool
          __cond_wait_until(__condvar& __cv, mutex& __mx,
          const chrono::time_point<_Clock, _Dur>& __atime)
          {
        #ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
            if constexpr (is_same_v<_Clock, chrono::steady_clock>)
              return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
            else
        #endif
            if constexpr (is_same_v<_Clock, chrono::system_clock>)
              return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
            else
              {
                if (__cond_wait_until_impl(__cv, __mx, __to_wait_clock(__atime)))
                  {
                    // We got a timeout when measured against __clock_t but
                    // we need to check against the caller-supplied clock
                    // to tell whether we should return a timeout.
                    if (_Clock::now() < __atime)
                      return true;
                  }
                return false;
              }
          }
    

    The formulation of __waiter_pool from part 1 assumes that we are in the market for indefinitely blocking waits. However, in this case, we only care about timed waiting. The actual implementation in bits/atomic_wait.h is:

          struct __waiter_pool_base
          {
            // ...
            alignas(_S_align) __platform_wait_t _M_wait = 0;
          #ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
            mutex _M_mtx;
          #endif
            alignas(_S_align) __platform_wait_t _M_ver = 0;
          #ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
            __condvar _M_cv;
          #endif
    
            void
            _M_enter_wait() noexcept;
    
            void
            _M_leave_wait() noexcept;
    
            bool
            _M_waiting() const noexcept;
    
            void
            _M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept;
    
            static __waiter_pool_base&
            _S_for(const void* __addr) noexcept;
          };
    

    For indefinite atomic waits, the derived type __waiter_pool is used:

          struct __waiter_pool : __waiter_pool_base
          {
            void
            _M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept;
          };
    

    For timed atomic waits, the derived type __timed_waiter_pool is used:

          struct __timed_waiter_pool : __waiter_pool_base
          {
            // returns true if wait ended before timeout
            template<typename _Clock, typename _Dur>
              bool
              _M_do_wait_until(__platform_wait_t* __addr, __platform_wait_t __old,
              const chrono::time_point<_Clock, _Dur>& __atime)
              {
          #ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
                return __platform_wait_until(__addr, __old, __atime);
          #else
                __platform_wait_t __val;
                __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
                if (__val == __old)
                {
                  lock_guard __l(_M_mtx);
                  return __cond_wait_until(_M_cv, _M_mtx, __atime);
                }
                else
                  return true;
          #endif // _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
              }
          };
    

    Similarly, __waiter is split into a __waiter_base type:

          template<typename _Tp>
            struct __waiter_base
            {
              using __waiter_type = _Tp;
    
              __waiter_type& _M_w;
              __platform_wait_t* _M_addr;
    
              template<typename _Up>
              static __platform_wait_t*
              _S_wait_addr(const _Up* __a, __platform_wait_t* __b);
    
              static __waiter_type&
              _S_for(const void* __addr) noexcept
              {
                static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
                auto& res = __waiter_pool_base::_S_for(__addr);
                return reinterpret_cast<__waiter_type&>(res);
              }
    
              template<typename _Up>
              explicit __waiter_base(const _Up* __addr) noexcept
                : _M_w(_S_for(__addr))
                , _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
              { }
    
              void
              _M_notify(bool __all, bool __bare = false);
    
              template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
                static bool
                _S_do_spin_v(__platform_wait_t* __addr, const _Up& __old, _ValFn __vfn,
                             __platform_wait_t& __val, _Spin __spin = _Spin{ });
    
              template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
                bool
                _M_do_spin_v(const _Up& __old, _ValFn __vfn, __platform_wait_t& __val,
                             _Spin __spin = _Spin{ });
    
              template<typename _Pred, typename _Spin = __default_spin_policy>
                static bool
                _S_do_spin(const __platform_wait_t* __addr, _Pred __pred,
                           __platform_wait_t& __val, _Spin __spin = _Spin{ });
    
              template<typename _Pred, typename _Spin = __default_spin_policy>
                bool
                _M_do_spin(_Pred __pred, __platform_wait_t& __val, _Spin __spin = _Spin{ });
            };
    

    From which __waiter is derived:

          template<typename _EntersWait>
            struct __waiter : __waiter_base<__waiter_pool>
            {
              using __base_type = __waiter_base<__waiter_pool>;
    
              template<typename _Tp>
                __waiter(const _Tp* __addr) noexcept;
    
              // ...
    
              template<typename _Tp, typename _ValFn>
                void
                _M_do_wait_v(_Tp __old, _ValFn __vfn);
    
              template<typename _Pred>
                void
                _M_do_wait(_Pred __pred) noexcept;
            };
    
          using __enters_wait = __waiter<std::true_type>;
          using __bare_wait = __waiter<std::false_type>;
    

    __timed_waiter is similarly derived:

          template<typename _EntersWait>
            struct __timed_waiter : __waiter_base<__timed_waiter_pool>
            {
    
              using __base_type = __waiter_base<__timed_waiter_pool>;
    
              template<typename _Tp>
                 __timed_waiter(const _Tp* __addr) noexcept;
    
              // ...
    
              // returns true if wait ended before timeout
              template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur>
                bool
                _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
                const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
    
              // returns true if wait ended before timeout
              template<typename _Pred,
                       typename _Clock, typename _Dur>
                bool
                _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
                const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
    
              // returns true if wait ended before timeout
              template<typename _Pred,
                       typename _Clock, typename _Dur>
                bool
                _M_do_wait_until(_Pred __pred,
                                 const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
    
              template<typename _Tp, typename _ValFn,
                       typename _Rep, typename _Period>
                bool
                _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
                                 const chrono::duration<_Rep, _Period>& __rtime) noexcept;
    
              template<typename _Pred,
                       typename _Rep, typename _Period>
                bool
                _M_do_wait_for(_Pred __pred,
                               const chrono::duration<_Rep, _Period>& __rtime) noexcept;
          };
    
          using __enters_timed_wait = __timed_waiter<std::true_type>;
          using __bare_timed_wait = __timed_waiter<std::false_type>;
    

    As with __waiter, there are top-level __atomic_wait_address_until/for wrappers that __atomic_semaphore calls into, all of which follow the general form of:

          // returns true if wait ended before timeout
          template<typename _Tp, typename _ValFn,
                   typename _Clock, typename _Dur>
            bool
            __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
                                          const chrono::time_point<_Clock, _Dur>& __atime) noexcept
            {
              __detail::__enters_timed_wait __w{__addr};
              return __w._M_do_wait_until_v(__old, __vfn, __atime);
            }
    
          template<typename _Tp, typename _Pred,
                   typename _Clock, typename _Dur>
            bool
            __atomic_wait_address_until(const _Tp* __addr, _Pred __pred,
                                        const chrono::time_point<_Clock, _Dur>& __atime) noexcept
            {
              __detail::__enters_timed_wait __w{__addr};
              return __w._M_do_wait_until(__pred, __atime);
            }
    

    There are also versions for "bare" waits, as well as _for_v/_for variants that wait for a supplied duration. The actual waiting entry points on __timed_waiter are implemented as follows:

          struct __timed_waiter
          {
            // ...
    
            template<typename _Tp, typename _ValFn,
                     typename _Clock, typename _Dur>
              bool
              _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
                                 const chrono::time_point<_Clock, _Dur>& __atime) noexcept
              {
                __platform_wait_t __val;
                if (_M_do_spin_v(__old, std::move(__vfn), __val,
                                 __timed_backoff_spin_policy(__atime)))
                  return true;
                return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime);
              }
    
            template<typename _Pred,
                     typename _Clock, typename _Dur>
              bool
              _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
                               const chrono::time_point<_Clock, _Dur>& __atime) noexcept
              {
                for (auto __now = _Clock::now(); __now < __atime; __now = _Clock::now())
                {
                  if (__base_type::_M_do_spin(__pred, __val,
                                              __timed_backoff_spin_policy(__atime, __now)))
                    return true;
    
                  if (__base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime)
                        && __pred())
                    return true;
                }
                return false;
              }
          };
    

    Aside from the usual differences between the _v and predicate forms of wait, the timed waits introduce a custom spin policy:

          struct __timed_backoff_spin_policy
          {
            __wait_clock_t::time_point _M_deadline;
            __wait_clock_t::time_point _M_t0;
    
            template<typename _Clock, typename _Dur>
               __timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur> __deadline = _Clock::time_point::max(),
                                           chrono::time_point<_Clock, _Dur> __t0 = _Clock::now()) noexcept
              : _M_deadline(__to_wait_clock(__deadline))
              , _M_t0(__to_wait_clock(__t0))
            { }
    
            bool
            operator()() const noexcept
            {
              using namespace literals::chrono_literals;
              auto __now = __wait_clock_t::now();
              if (_M_deadline <= __now)
                return false;
    
              auto __elapsed = __now - _M_t0;
              if (__elapsed > 128ms)
              {
                this_thread::sleep_for(64ms);
              }
              else if (__elapsed > 64us)
              {
                this_thread::sleep_for(__elapsed / 2);
              }
              else if (__elapsed > 4us)
              {
                __thread_yield();
              }
              else
                return false;
              return true;
            }
          };
    

    After the usual spin completes unsatisfied, this will begin sleeping the current thread as long as the deadline hasn't been reached. The relative wait_for variants are implemented in terms of the absolute wait_until members:

        struct __timed_waiter
        {
          // ...
    
          template<typename _Tp, typename _ValFn,
                   typename _Rep, typename _Period>
            bool
            _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
                             const chrono::duration<_Rep, _Period>& __rtime) noexcept
            {
              __platform_wait_t __val;
              if (_M_do_spin_v(__old, std::move(__vfn), __val))
                return true;
    
              if (!__rtime.count())
                return false; // no rtime supplied, and spin did not acquire
    
              auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
    
              return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr,
                                                        __val,
                                                        chrono::steady_clock::now() + __reltime);
            }
    
            template<typename _Pred,
                     typename _Rep, typename _Period>
              bool
              _M_do_wait_for(_Pred __pred,
              const chrono::duration<_Rep, _Period>& __rtime) noexcept
              {
                __platform_wait_t __val;
                if (__base_type::_M_do_spin(__pred, __val))
                  return true;
    
                if (!__rtime.count())
                  return false; // no rtime supplied, and spin did not acquire
    
                auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
    
                return _M_do_wait_until(__pred, __val, chrono::steady_clock::now() + __reltime);
              }
        };
    

    The rest of __atomic_semaphore

    With support for atomic timed waits in place, we can define the remaining members of __atomic_semaphore:

      struct __atomic_semaphore
      {
        // ...
    
        template<typename _Clock, typename _Duration>
          _GLIBCXX_ALWAYS_INLINE bool
          _M_try_acquire_until(const chrono::time_point<_Clock,
                               _Duration>& __atime) noexcept
          {
            auto const __pred =
            [this] { return _S_do_try_acquire(&this->_M_counter); };
    
            return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
          }
    
          template<typename _Rep, typename _Period>
            _GLIBCXX_ALWAYS_INLINE bool
            _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime) noexcept
            {
              auto const __pred =
              [this] { return _S_do_try_acquire(&this->_M_counter); };
    
              return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
            }
        };
    

    You might have observed that there are more __atomic_wait_address_for/until variations than are actually used by __atomic_semaphore. C++26 will likely add timed versions of wait() to atomic as well as a version that accepts a predicate. The underlying support for these operations is already present but not yet exposed via the interface of atomic, and the details are subject to change in a future version of GCC.

    The rest of <semaphore>

    The semaphore implementation to use is conditionally chosen based on the presence of the atomic wait feature-test macro:

          #if defined __cpp_lib_atomic_wait
          using __semaphore_impl = __atomic_semaphore;
          #elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
          using __semaphore_impl = __platform_semaphore;
          #endif
    

    We then implement std::counting_semaphore in terms of __semaphore_impl:

        template<ptrdiff_t __least_max_value = __semaphore_impl::_S_max>
          class counting_semaphore
          {
            static_assert(__least_max_value >= 0);
            static_assert(__least_max_value <= __semaphore_impl::_S_max);
    
            __semaphore_impl _M_sem;
    
            public:
            explicit counting_semaphore(ptrdiff_t __desired) noexcept
              : _M_sem(__desired)
              { }
    
            ~counting_semaphore() = default;
    
            counting_semaphore(const counting_semaphore&) = delete;
            counting_semaphore& operator=(const counting_semaphore&) = delete;
    
            static constexpr ptrdiff_t
            max() noexcept
            { return __least_max_value; }
    
            void
            release(ptrdiff_t __update = 1) noexcept(noexcept(_M_sem._M_release(1)))
            { _M_sem._M_release(__update); }
    
            void
            acquire() noexcept(noexcept(_M_sem._M_acquire()))
            { _M_sem._M_acquire(); }
    
            bool
            try_acquire() noexcept(noexcept(_M_sem._M_try_acquire()))
            { return _M_sem._M_try_acquire(); }
    
            template<typename _Rep, typename _Period>
              bool
              try_acquire_for(const std::chrono::duration<_Rep, _Period>& __rtime)
              { return _M_sem._M_try_acquire_for(__rtime); }
    
            template<typename _Clock, typename _Dur>
              bool
              try_acquire_until(const std::chrono::time_point<_Clock, _Dur>& __atime)
              { return _M_sem._M_try_acquire_until(__atime); }
          };
    

    And std::binary_semaphore is just a type alias:

          using binary_semaphore = std::counting_semaphore<1>;
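
    A common use of binary_semaphore is as a lightweight one-shot signal between two threads. A minimal sketch:

        #include <iostream>
        #include <semaphore>
        #include <thread>

        std::binary_semaphore ready(0);   // starts "unsignalled"

        int main()
        {
          std::jthread worker([]
          {
            ready.acquire();              // blocks until main() calls release()
            std::cout << "signalled\n";
          });

          // ... prepare whatever state the worker needs ...
          ready.release();                // wakes the worker
        }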
    

    Next time

    With all of the waiting detail and semaphores out of the way, the next installment will look at <latch> and <barrier>.

    Last updated: May 30, 2024
