C++20 introduces counting_semaphore and binary_semaphore, which support blocking acquire() and non-blocking try_acquire(), as well as the timed try_acquire_for() and try_acquire_until(). On platforms that support __platform_wait()/__platform_notify(), we select an implementation strategy based on atomic operations; otherwise, we attempt to use POSIX semaphores.
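To fix ideas before diving into the implementation, here is a small usage sketch of the interface described above (the slot count and timeout are arbitrary choices for illustration):

#include <chrono>
#include <semaphore>

using namespace std::chrono_literals;

// At most 4 threads may hold a slot at any one time.
std::counting_semaphore<4> slots{4};

void worker()
{
  slots.acquire();                    // block until a slot is available
  // ... use the shared resource ...
  slots.release();                    // return the slot

  if (slots.try_acquire())            // non-blocking attempt
    slots.release();

  if (slots.try_acquire_for(10ms))    // give up after roughly 10 ms
    slots.release();
}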
If you missed the previous article, read it here: Implementing C++20 atomic waiting in libstdc++
Semaphores in C++
Here's what the ISO C++ Standard has to say on the matter of semaphores:
1 Class template counting_semaphore maintains an internal counter that is initialized when the semaphore is created. The counter is decremented when a thread acquires the semaphore, and is incremented when a thread releases the semaphore. If a thread tries to acquire the semaphore when the counter is zero, the thread will block until another thread increments the counter by releasing the semaphore.
2 least_max_value shall be non-negative; otherwise the program is ill-formed.
3 Concurrent invocations of the member functions of counting_semaphore, other than its destructor, do not introduce data races.
void release(ptrdiff_t update = 1);
8 Preconditions: update >= 0 is true, and update <= max() - counter is true.
9 Effects: Atomically execute counter += update. Then, unblocks any threads that are waiting for counter to be greater than zero.
10 Synchronization: Strongly happens before invocations of try_acquire that observe the result of the effects.
11 Throws: system_error when an exception is required (32.2.2).
12 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
bool try_acquire() noexcept;
13 Effects: Attempts to atomically decrement counter if it is positive, without blocking. If counter is not decremented, there is no effect and try_acquire immediately returns. An implementation may fail to decrement counter even if it is positive. [Note 1 : This spurious failure is normally uncommon, but allows interesting implementations based on a simple compare and exchange (Clause 31). — end note] An implementation should ensure that try_acquire does not consistently return false in the absence of contending semaphore operations.
14 Returns: true if counter was decremented, otherwise false.
void acquire();
15 Effects: Repeatedly performs the following steps, in order:
- (15.1) — Evaluates try_acquire. If the result is true, returns.
- (15.2) — Blocks on *this until counter is greater than zero.
16 Throws: system_error when an exception is required (32.2.2).
17 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
template<class Rep, class Period> bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template<class Clock, class Duration> bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
18 Effects: Repeatedly performs the following steps, in order:
- (18.1) — Evaluates try_acquire(). If the result is true, returns true.
- (18.2) — Blocks on *this until counter is greater than zero or until the timeout expires. If it is unblocked by the timeout expiring, returns false. The timeout expires (32.2.4) when the current time is after abs_time (for try_acquire_until) or when at least rel_time has passed from the start of the function (for try_acquire_for).
19 Throws: Timeout-related exceptions (32.2.4), or system_error when a non-timeout-related exception is required (32.2.2).
20 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
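The note in paragraph 13 is worth dwelling on: because try_acquire is allowed to fail spuriously, it can be implemented as a single load followed by one compare-and-exchange, with no retry loop. Here is a minimal, purely illustrative sketch of that idea (the actual libstdc++ version appears later as _S_do_try_acquire):

#include <atomic>
#include <cstdint>

// Illustrative only: one compare-exchange, no retry loop. If another
// thread touches the counter between the load and the CAS, we simply
// report failure -- the "spurious failure" the standard permits.
bool try_acquire_once(std::atomic<std::int32_t>& counter) noexcept
{
  auto old = counter.load(std::memory_order_acquire);
  if (old == 0)
    return false;                         // nothing to acquire
  return counter.compare_exchange_strong(old, old - 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
}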
OK, so how to implement it?
If we have POSIX semaphores available, the following type will be defined:
struct __platform_semaphore
{
  using __clock_t = chrono::system_clock;

#ifdef SEM_VALUE_MAX
  static constexpr ptrdiff_t _S_max = SEM_VALUE_MAX;
#else
  static constexpr ptrdiff_t _S_max = _POSIX_SEM_VALUE_MAX;
#endif

  explicit __platform_semaphore(ptrdiff_t __count) noexcept
  {
    sem_init(&_M_semaphore, 0, __count);
  }

  ~__platform_semaphore()
  { sem_destroy(&_M_semaphore); }

  // ...
};
And implements acquire/try_acquire/release:
struct __platform_semaphore
{
  // ...

  void
  _M_acquire() noexcept
  {
    for (;;)
      {
        auto __err = sem_wait(&_M_semaphore);
        if (__err && (errno == EINTR))
          continue;
        else if (__err)
          std::terminate();
        else
          break;
      }
  }

  _GLIBCXX_ALWAYS_INLINE bool
  _M_try_acquire() noexcept
  {
    for (;;)
      {
        auto __err = sem_trywait(&_M_semaphore);
        if (__err && (errno == EINTR))
          continue;
        else if (__err && (errno == EAGAIN))
          return false;
        else if (__err)
          std::terminate();
        else
          break;
      }
    return true;
  }

  _GLIBCXX_ALWAYS_INLINE void
  _M_release(std::ptrdiff_t __update) noexcept
  {
    for (; __update != 0; --__update)
      {
        auto __err = sem_post(&_M_semaphore);
        if (__err)
          std::terminate();
      }
  }

  // ...
};
For the timed wait operations, we have to worry about what the "basis" clock is and how to convert the user-supplied clock. For POSIX semaphores, this is handled as follows:
struct __platform_semaphore
{
  using __clock_t = chrono::system_clock;
  // ...

  bool
  _M_try_acquire_until_impl(const chrono::time_point<__clock_t>& __atime) noexcept
  {
    auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
    auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

    struct timespec __ts =
      {
        static_cast<std::time_t>(__s.time_since_epoch().count()),
        static_cast<long>(__ns.count())
      };

    for (;;)
      {
        if (auto __err = sem_timedwait(&_M_semaphore, &__ts))
          {
            if (errno == EINTR)
              continue;
            else if (errno == ETIMEDOUT || errno == EINVAL)
              return false;
            else
              std::terminate();
          }
        else
          break;
      }
    return true;
  }

  template<typename _Clock, typename _Duration>
    bool
    _M_try_acquire_until(const chrono::time_point<_Clock, _Duration>& __atime) noexcept
    {
      if constexpr (std::is_same_v<__clock_t, _Clock>)
        {
          return _M_try_acquire_until_impl(__atime);
        }
      else
        {
          const typename _Clock::time_point __c_entry = _Clock::now();
          const auto __s_entry = __clock_t::now();
          const auto __delta = __atime - __c_entry;
          const auto __s_atime = __s_entry + __delta;
          if (_M_try_acquire_until_impl(__s_atime))
            return true;

          // We got a timeout when measured against __clock_t but
          // we need to check against the caller-supplied clock
          // to tell whether we should return a timeout.
          return (_Clock::now() < __atime);
        }
    }
};
If we have support for atomic::wait/notify_one/notify_all, then the following type will be defined:
struct __atomic_semaphore
{
  static constexpr ptrdiff_t _S_max =
    __gnu_cxx::__int_traits<__detail::__platform_wait_t>::__max;

  explicit __atomic_semaphore(__detail::__platform_wait_t __count) noexcept
    : _M_counter(__count)
  {
    __glibcxx_assert(__count >= 0 && __count <= _S_max);
  }

  static _GLIBCXX_ALWAYS_INLINE bool
  _S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
  {
    auto __old = __atomic_impl::load(__counter, memory_order::acquire);
    if (__old == 0)
      return false;

    return __atomic_impl::compare_exchange_strong(__counter,
                                                  __old, __old - 1,
                                                  memory_order::acquire,
                                                  memory_order::relaxed);
  }

  _GLIBCXX_ALWAYS_INLINE void
  _M_acquire() noexcept
  {
    auto const __pred =
      [this] { return _S_do_try_acquire(&this->_M_counter); };
    std::__atomic_wait_address(&_M_counter, __pred);
  }

  bool
  _M_try_acquire() noexcept
  {
    auto const __pred =
      [this] { return _S_do_try_acquire(&this->_M_counter); };
    return std::__detail::__atomic_spin(__pred);
  }

  // ...
};
This expects an implementation of __atomic_wait_address that can accept a predicate; however, we have only defined __atomic_wait_address_v up to this point (see part 1).
template<typename _EntersWait>
  struct __waiter
  {
    // ...

    template<typename _Pred>
      void
      _M_do_wait(_Pred __pred) noexcept
      {
        do
          {
            __platform_wait_t __val;
            if (__base_type::_M_do_spin(__pred, __val))
              return;
            __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
          }
        while (!__pred());
      }
  };

template<typename _Tp, typename _Pred>
  void
  __atomic_wait_address(const _Tp* __addr, _Pred __pred) noexcept
  {
    __detail::__enters_wait __w(__addr);
    __w._M_do_wait(__pred);
  }
There is a problem with this formulation, in that __enters_wait will atomically increment and decrement a waiter count, which __atomic_semaphore does not need, as it inherently tracks the presence of waiters. In the previous article, we introduced the notion of __enters_wait and __bare_wait types to be used by waiters and notifiers, respectively. We extend that notion here to also support "bare" waiters.
// This call is to be used by atomic types which track contention externally
template<typename _Pred>
  void
  __atomic_wait_address_bare(const __detail::__platform_wait_t* __addr,
                             _Pred __pred) noexcept
  {
#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
    do
      {
        __detail::__platform_wait_t __val;
        if (__detail::__bare_wait::_S_do_spin(__addr, __pred, __val))
          return;
        __detail::__platform_wait(__addr, __val);
      }
    while (!__pred());
#else // !_GLIBCXX_HAVE_PLATFORM_WAIT
    __detail::__bare_wait __w(__addr);
    __w._M_do_wait(__pred);
#endif
  }
We also need to introduce a corresponding "bare" notify, which skips the check for waiters:
struct __waiter_pool
{
  // ...

  void
  _M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept
  {
    if (!(__bare || _M_waiting()))
      return;

#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
    __platform_notify(__addr, __all);
#else
    if (__all)
      _M_cv.notify_all();
    else
      _M_cv.notify_one();
#endif
  }
};
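To see why the bare paths matter, consider the release side of __atomic_semaphore. The following is a simplified sketch rather than the verbatim libstdc++ source; in particular, the helper names __atomic_impl::fetch_add and __atomic_notify_address_bare, and the choice of when to notify all waiters, are assumptions made here for illustration. The key point is that the semaphore's own counter already tells us whether anyone could be blocked, so no separate waiter count needs to be consulted:

struct __atomic_semaphore
{
  // ...

  _GLIBCXX_ALWAYS_INLINE void
  _M_release(ptrdiff_t __update) noexcept
  {
    // Publish the new count; release ordering pairs with the acquire
    // in _S_do_try_acquire.
    if (0 < __atomic_impl::fetch_add(&_M_counter, __update,
                                     memory_order::release))
      return;  // counter was already positive, so nobody could be blocked

    // Counter was zero, so wake waiter(s) through the "bare" notification
    // path -- the semaphore's counter has already tracked the contention.
    // (Sketch: notify all waiters when more than one unit was released.)
    __atomic_notify_address_bare(&_M_counter, /* __all = */ __update > 1);
  }
};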
Supporting timed waits on __atomic_semaphore
The C++ Standard has extensive support for durations and time points and specifies waiting operations in terms of those types:
template<class Rep, class Period>
  bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);

template<class Clock, class Duration>
  bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
However, <chrono> is not supported in the freestanding (non-hosted) subset of the standard library, but <atomic> is. This means we can't inject a dependency on <chrono> into <atomic> or into its underlying bits/atomic_base.h header. For this reason, all timed waiting functionality is split out into bits/atomic_timed_wait.h. We preferentially use std::chrono::steady_clock as our basis clock on platforms that have futexes or pthread_cond_clockwait. Otherwise, we fall back to system_clock and convert all user-supplied time points to this clock:
#if defined(_GLIBCXX_HAVE_LINUX_FUTEX) || defined(_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT)
  using __wait_clock_t = chrono::steady_clock;
#else
  using __wait_clock_t = chrono::system_clock;
#endif

template<typename _Clock, typename _Dur>
  __wait_clock_t::time_point
  __to_wait_clock(const chrono::time_point<_Clock, _Dur>& __atime) noexcept
  {
    const typename _Clock::time_point __c_entry = _Clock::now();
    const __wait_clock_t::time_point __w_entry = __wait_clock_t::now();
    const auto __delta = __atime - __c_entry;
    using __w_dur = typename __wait_clock_t::duration;
    return __w_entry + chrono::ceil<__w_dur>(__delta);
  }

template<typename _Dur>
  __wait_clock_t::time_point
  __to_wait_clock(const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
  {
    using __w_dur = typename __wait_clock_t::duration;
    return chrono::ceil<__w_dur>(__atime);
  }
If a platform supports an efficient __platform_wait(), it is also expected to provide an efficient timed version of the same, called __platform_wait_until(). For Linux futexes, we implement this as:
// returns true if wait ended before timeout
template<typename _Dur>
  bool
  __platform_wait_until_impl(const __platform_wait_t* __addr,
                             __platform_wait_t __old,
                             const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
  {
    auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
    auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

    struct timespec __rt =
      {
        static_cast<std::time_t>(__s.time_since_epoch().count()),
        static_cast<long>(__ns.count())
      };

    auto __e = syscall (SYS_futex, __addr,
                        static_cast<int>(__futex_wait_flags::__wait_bitset_private),
                        __old, &__rt, nullptr,
                        static_cast<int>(__futex_wait_flags::__bitset_match_any));

    if (__e)
      {
        if (errno == ETIMEDOUT)
          return false;
        if ((errno != EINTR) && (errno != EAGAIN))
          __throw_system_error(errno);
      }
    return true;
  }

// returns true if wait ended before timeout
template<typename _Clock, typename _Dur>
  bool
  __platform_wait_until(const __platform_wait_t* __addr, __platform_wait_t __old,
                        const chrono::time_point<_Clock, _Dur>& __atime)
  {
    if constexpr (is_same_v<__wait_clock_t, _Clock>)
      {
        return __platform_wait_until_impl(__addr, __old, __atime);
      }
    else
      {
        if (!__platform_wait_until_impl(__addr, __old, __to_wait_clock(__atime)))
          {
            // We got a timeout when measured against __clock_t but
            // we need to check against the caller-supplied clock
            // to tell whether we should return a timeout.
            if (_Clock::now() < __atime)
              return true;
          }
        return false;
      }
  }
As with __platform_wait(), for platforms that do not provide __platform_wait_until(), we fall back to using a mutex and condition variable, defining the following support functions:
// Returns true if wait ended before timeout.
// _Clock must be either steady_clock or system_clock.
template<typename _Clock, typename _Dur>
  bool
  __cond_wait_until_impl(__condvar& __cv, mutex& __mx,
                         const chrono::time_point<_Clock, _Dur>& __atime)
  {
    static_assert(std::__is_one_of<_Clock, chrono::steady_clock,
                                   chrono::system_clock>::value);

    auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
    auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);

    __gthread_time_t __ts =
      {
        static_cast<std::time_t>(__s.time_since_epoch().count()),
        static_cast<long>(__ns.count())
      };

    // if we have a pthreads implementation that supports CLOCK_MONOTONIC
    // and the caller's clock is steady_clock, use that
#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
    if constexpr (is_same_v<chrono::steady_clock, _Clock>)
      __cv.wait_until(__mx, CLOCK_MONOTONIC, __ts);
    else
#endif
      __cv.wait_until(__mx, __ts);

    return _Clock::now() < __atime;
  }

// returns true if wait ended before timeout
template<typename _Clock, typename _Dur>
  bool
  __cond_wait_until(__condvar& __cv, mutex& __mx,
                    const chrono::time_point<_Clock, _Dur>& __atime)
  {
#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
    if constexpr (is_same_v<_Clock, chrono::steady_clock>)
      return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
    else
#endif
    if constexpr (is_same_v<_Clock, chrono::system_clock>)
      return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
    else
      {
        if (__cond_wait_until_impl(__cv, __mx, __to_wait_clock(__atime)))
          {
            // We got a timeout when measured against __clock_t but
            // we need to check against the caller-supplied clock
            // to tell whether we should return a timeout.
            if (_Clock::now() < __atime)
              return true;
          }
        return false;
      }
  }
The formulation of __waiter_pool from part 1 assumes that we are in the market for indefinitely blocking waits. However, in this case, we only care about timed waiting. The actual implementation in bits/atomic_wait.h is:
struct __waiter_pool_base
{
  // ...

  alignas(_S_align) __platform_wait_t _M_wait = 0;

#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
  mutex _M_mtx;
#endif

  alignas(_S_align) __platform_wait_t _M_ver = 0;

#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
  __condvar _M_cv;
#endif

  void _M_enter_wait() noexcept;
  void _M_leave_wait() noexcept;

  bool _M_waiting() const noexcept;

  void _M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept;

  static __waiter_pool_base& _S_for(const void* __addr) noexcept;
};
For indefinite atomic waits, the derived type __waiter_pool is used:
struct __waiter_pool : __waiter_pool_base
{
  void _M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept;
};
For timed atomic waits, the derived type __timed_waiter_pool is used:
struct __timed_waiter_pool : __waiter_pool_base
{
  // returns true if wait ended before timeout
  template<typename _Clock, typename _Dur>
    bool
    _M_do_wait_until(__platform_wait_t* __addr, __platform_wait_t __old,
                     const chrono::time_point<_Clock, _Dur>& __atime)
    {
#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
      return __platform_wait_until(__addr, __old, __atime);
#else
      __platform_wait_t __val;
      __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
      if (__val == __old)
        {
          lock_guard __l(_M_mtx);
          return __cond_wait_until(_M_cv, _M_mtx, __atime);
        }
      else
        return true; // the value has already changed, no need to wait
#endif // _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
    }
};
Similarly, __waiter is split into a __waiter_base type:
template<typename _Tp>
  struct __waiter_base
  {
    using __waiter_type = _Tp;

    __waiter_type& _M_w;
    __platform_wait_t* _M_addr;

    template<typename _Up>
      static __platform_wait_t*
      _S_wait_addr(const _Up* __a, __platform_wait_t* __b);

    static __waiter_type&
    _S_for(const void* __addr) noexcept
    {
      static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
      auto& res = __waiter_pool_base::_S_for(__addr);
      return reinterpret_cast<__waiter_type&>(res);
    }

    template<typename _Up>
      explicit __waiter_base(const _Up* __addr) noexcept
        : _M_w(_S_for(__addr))
        , _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
      { }

    void _M_notify(bool __all, bool __bare = false);

    template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
      static bool
      _S_do_spin_v(__platform_wait_t* __addr, const _Up& __old, _ValFn __vfn,
                   __platform_wait_t& __val, _Spin __spin = _Spin{ });

    template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
      bool
      _M_do_spin_v(const _Up& __old, _ValFn __vfn,
                   __platform_wait_t& __val, _Spin __spin = _Spin{ });

    template<typename _Pred, typename _Spin = __default_spin_policy>
      static bool
      _S_do_spin(const __platform_wait_t* __addr, _Pred __pred,
                 __platform_wait_t& __val, _Spin __spin = _Spin{ });

    template<typename _Pred, typename _Spin = __default_spin_policy>
      bool
      _M_do_spin(_Pred __pred, __platform_wait_t& __val, _Spin __spin = _Spin{ });
  };
From which __waiter is derived:
template<typename _EntersWait>
  struct __waiter : __waiter_base<__waiter_pool>
  {
    using __base_type = __waiter_base<__waiter_pool>;

    template<typename _Tp>
      __waiter(const _Tp* __addr) noexcept;

    // ...

    template<typename _Tp, typename _ValFn>
      void _M_do_wait_v(_Tp __old, _ValFn __vfn);

    template<typename _Pred>
      void _M_do_wait(_Pred __pred) noexcept;
  };

using __enters_wait = __waiter<std::true_type>;
using __bare_wait = __waiter<std::false_type>;
__timed_waiter is similarly derived:
template<typename _EntersWait>
  struct __timed_waiter : __waiter_base<__timed_waiter_pool>
  {
    using __base_type = __waiter_base<__timed_waiter_pool>;

    template<typename _Tp>
      __timed_waiter(const _Tp* __addr) noexcept;

    // ...

    // returns true if wait ended before timeout
    template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur>
      bool
      _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
                         const chrono::time_point<_Clock, _Dur>& __atime) noexcept;

    // returns true if wait ended before timeout
    template<typename _Pred, typename _Clock, typename _Dur>
      bool
      _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
                       const chrono::time_point<_Clock, _Dur>& __atime) noexcept;

    // returns true if wait ended before timeout
    template<typename _Pred, typename _Clock, typename _Dur>
      bool
      _M_do_wait_until(_Pred __pred,
                       const chrono::time_point<_Clock, _Dur>& __atime) noexcept;

    template<typename _Tp, typename _ValFn, typename _Rep, typename _Period>
      bool
      _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
                       const chrono::duration<_Rep, _Period>& __rtime) noexcept;

    template<typename _Pred, typename _Rep, typename _Period>
      bool
      _M_do_wait_for(_Pred __pred,
                     const chrono::duration<_Rep, _Period>& __rtime) noexcept;
  };

using __enters_timed_wait = __timed_waiter<std::true_type>;
using __bare_timed_wait = __timed_waiter<std::false_type>;
As with __waiter, there are top-level __atomic_wait_address_until/_for wrappers that __atomic_semaphore calls into, all of which follow this general form:
// returns true if wait ended before timeout
template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur>
  bool
  __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
                                const chrono::time_point<_Clock, _Dur>& __atime) noexcept
  {
    __detail::__enters_timed_wait __w{__addr};
    return __w._M_do_wait_until_v(__old, __vfn, __atime);
  }

template<typename _Tp, typename _Pred, typename _Clock, typename _Dur>
  bool
  __atomic_wait_address_until(const _Tp* __addr, _Pred __pred,
                              const chrono::time_point<_Clock, _Dur>& __atime) noexcept
  {
    __detail::__enters_timed_wait __w{__addr};
    return __w._M_do_wait_until(__pred, __atime);
  }
There are also versions for "bare" waits, as well as _for_v/_for variants that wait for a supplied duration. The actual waiting entry points on __timed_waiter are implemented as follows:
struct __timed_waiter
{
  // ...

  template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur>
    bool
    _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
                       const chrono::time_point<_Clock, _Dur>& __atime) noexcept
    {
      __platform_wait_t __val;
      if (_M_do_spin_v(__old, std::move(__vfn), __val,
                       __timed_backoff_spin_policy(__atime)))
        return true;
      return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime);
    }

  template<typename _Pred, typename _Clock, typename _Dur>
    bool
    _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
                     const chrono::time_point<_Clock, _Dur>& __atime) noexcept
    {
      for (auto __now = _Clock::now(); __now < __atime;
           __now = _Clock::now())
        {
          if (__base_type::_M_do_spin(__pred, __val,
                                      __timed_backoff_spin_policy(__atime, __now)))
            return true;

          if (__base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime)
              && __pred())
            return true;
        }
      return false;
    }
};
Aside from the usual differences between the _v and predicate forms of wait, the timed waits introduce a custom spin policy:
struct __timed_backoff_spin_policy
{
  __wait_clock_t::time_point _M_deadline;
  __wait_clock_t::time_point _M_t0;

  template<typename _Clock, typename _Dur>
    __timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur> __deadline = _Clock::time_point::max(),
                                chrono::time_point<_Clock, _Dur> __t0 = _Clock::now()) noexcept
      : _M_deadline(__to_wait_clock(__deadline))
      , _M_t0(__to_wait_clock(__t0))
    { }

  bool
  operator()() const noexcept
  {
    using namespace literals::chrono_literals;
    auto __now = __wait_clock_t::now();
    if (_M_deadline <= __now)
      return false;

    auto __elapsed = __now - _M_t0;
    if (__elapsed > 128ms)
      {
        this_thread::sleep_for(64ms);
      }
    else if (__elapsed > 64us)
      {
        this_thread::sleep_for(__elapsed / 2);
      }
    else if (__elapsed > 4us)
      {
        __thread_yield();
      }
    else
      return false;
    return true;
  }
};
After the usual spin completes unsatisfied, this will begin sleeping the current thread as long as the deadline hasn't been reached. The relative wait_for variants are implemented in terms of the absolute wait_until members:
struct __timed_waiter
{
  // ...

  template<typename _Tp, typename _ValFn, typename _Rep, typename _Period>
    bool
    _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
                     const chrono::duration<_Rep, _Period>& __rtime) noexcept
    {
      __platform_wait_t __val;
      if (_M_do_spin_v(__old, std::move(__vfn), __val))
        return true;

      if (!__rtime.count())
        return false; // no rtime supplied, and spin did not acquire

      auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);

      return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val,
                                                chrono::steady_clock::now() + __reltime);
    }

  template<typename _Pred, typename _Rep, typename _Period>
    bool
    _M_do_wait_for(_Pred __pred,
                   const chrono::duration<_Rep, _Period>& __rtime) noexcept
    {
      __platform_wait_t __val;
      if (__base_type::_M_do_spin(__pred, __val))
        return true;

      if (!__rtime.count())
        return false; // no rtime supplied, and spin did not acquire

      auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);

      return _M_do_wait_until(__pred, __val,
                              chrono::steady_clock::now() + __reltime);
    }
};
The rest of __atomic_semaphore
With support for atomic timed waits in place, we can define the remaining members of __atomic_semaphore:
struct __atomic_semaphore
{
  // ...

  template<typename _Clock, typename _Duration>
    _GLIBCXX_ALWAYS_INLINE bool
    _M_try_acquire_until(const chrono::time_point<_Clock, _Duration>& __atime) noexcept
    {
      auto const __pred =
        [this] { return _S_do_try_acquire(&this->_M_counter); };

      return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
    }

  template<typename _Rep, typename _Period>
    _GLIBCXX_ALWAYS_INLINE bool
    _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime) noexcept
    {
      auto const __pred =
        [this] { return _S_do_try_acquire(&this->_M_counter); };

      return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
    }
};
You might have observed that there are more __atomic_wait_address_for/until variations than are actually used by __atomic_semaphore. C++26 will likely add timed versions of wait() to atomic, as well as a version that accepts a predicate. The underlying support for these operations is already present, but not yet exposed via the interface of atomic, and the details are subject to change in a future version of GCC.
The rest of <semaphore>
The semaphore implementation to use is conditionally chosen based on the presence of the atomic wait feature-test macro:
#if defined __cpp_lib_atomic_wait
using __semaphore_impl = __atomic_semaphore;
#elif defined _GLIBCXX_HAVE_POSIX_SEMAPHORE
using __semaphore_impl = __platform_semaphore;
#endif
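From the user's point of view, the corresponding feature-test macro is __cpp_lib_semaphore; which of the two strategies backs it is an internal detail of the library build. A brief, illustrative way to check availability in portable code (the fallback comment is just a placeholder):

#include <version>

#if defined __cpp_lib_semaphore
# include <semaphore>
#endif

int main()
{
#if defined __cpp_lib_semaphore
  // <semaphore> is available; whether it sits on top of __atomic_semaphore
  // or __platform_semaphore is not observable from here.
  std::counting_semaphore<4> sem{4};
  sem.acquire();
  sem.release();
#else
  // Fall back to a mutex/condition_variable-based design, or fail the build.
#endif
}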
We then implement std::counting_semaphore in terms of __semaphore_impl:
template<ptrdiff_t __least_max_value = __semaphore_impl::_S_max>
  class counting_semaphore
  {
    static_assert(__least_max_value >= 0);
    static_assert(__least_max_value <= __semaphore_impl::_S_max);

    __semaphore_impl _M_sem;

  public:
    explicit counting_semaphore(ptrdiff_t __desired) noexcept
      : _M_sem(__desired)
    { }

    ~counting_semaphore() = default;

    counting_semaphore(const counting_semaphore&) = delete;
    counting_semaphore& operator=(const counting_semaphore&) = delete;

    static constexpr ptrdiff_t
    max() noexcept
    { return __least_max_value; }

    void
    release(ptrdiff_t __update = 1) noexcept(noexcept(_M_sem._M_release(1)))
    { _M_sem._M_release(__update); }

    void
    acquire() noexcept(noexcept(_M_sem._M_acquire()))
    { _M_sem._M_acquire(); }

    bool
    try_acquire() noexcept(noexcept(_M_sem._M_try_acquire()))
    { return _M_sem._M_try_acquire(); }

    template<typename _Rep, typename _Period>
      bool
      try_acquire_for(const std::chrono::duration<_Rep, _Period>& __rtime)
      { return _M_sem._M_try_acquire_for(__rtime); }

    template<typename _Clock, typename _Dur>
      bool
      try_acquire_until(const std::chrono::time_point<_Clock, _Dur>& __atime)
      { return _M_sem._M_try_acquire_until(__atime); }
  };
And std::binary_semaphore is just a type alias:
using binary_semaphore = std::counting_semaphore<1>;
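A binary semaphore initialized to zero makes a handy one-shot signal between threads. A brief, illustrative example (the payload variable and printed message are arbitrary):

#include <iostream>
#include <semaphore>
#include <thread>

std::binary_semaphore ready{0};   // starts with no permits available
int payload = 0;

int main()
{
  std::jthread worker([] {
    ready.acquire();              // blocks until main() calls release()
    std::cout << "received " << payload << '\n';
  });

  payload = 42;                   // publish data before signalling
  ready.release();                // wake the worker
}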
Next time
With all of the waiting detail and semaphores out of the way, the next installment will look at <latch> and <barrier>.