C++20 introduces counting_semaphore and binary_semaphore, which support blocking acquire() and non-blocking try_acquire(), as well as timed try_acquire_for() and try_acquire_until(). On platforms that support __platform_wait()/__platform_notify(), we select an implementation strategy based on atomic waiting operations; otherwise, we attempt to use POSIX semaphores.
If you missed the previous article, read it here: Implementing C++20 atomic waiting in libstdc++
Semaphores in C++
Here's what the ISO C++ Standard has to say on the matter of semaphores:
1 Class template counting_semaphore maintains an internal counter that is initialized when the semaphore is created. The counter is decremented when a thread acquires the semaphore, and is incremented when a thread releases the semaphore. If a thread tries to acquire the semaphore when the counter is zero, the thread will block until another thread increments the counter by releasing the semaphore.
2 least_max_value shall be non-negative; otherwise the program is ill-formed.
3 Concurrent invocations of the member functions of counting_semaphore, other than its destructor, do not introduce data races.
void release(ptrdiff_t update = 1);
8 Preconditions: update >= 0 is true, and update <= max() - counter is true.
9 Effects: Atomically execute counter += update. Then, unblocks any threads that are waiting for counter to be greater than zero.
10 Synchronization: Strongly happens before invocations of try_acquire that observe the result of the effects.
11 Throws: system_error when an exception is required (32.2.2).
12 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
bool try_acquire() noexcept;
13 Effects: Attempts to atomically decrement counter if it is positive, without blocking. If counter is not decremented, there is no effect and try_acquire immediately returns. An implementation may fail to decrement counter even if it is positive. [Note 1 : This spurious failure is normally uncommon, but allows interesting implementations based on a simple compare and exchange (Clause 31). — end note] An implementation should ensure that try_acquire does not consistently return false in the absence of contending semaphore operations.
14 Returns: true if counter was decremented, otherwise false.
void acquire();
15 Effects: Repeatedly performs the following steps, in order:
- (15.1) — Evaluates try_acquire(). If the result is true, returns.
- (15.2) — Blocks on *this until counter is greater than zero.
16 Throws: system_error when an exception is required (32.2.2).
17 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
template<class Rep, class Period> bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template<class Clock, class Duration> bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
18 Effects: Repeatedly performs the following steps, in order:
- (18.1) — Evaluates try_acquire(). If the result is true, returns true.
- (18.2) — Blocks on *this until counter is greater than zero or until the timeout expires. If it is unblocked by the timeout expiring, returns false. The timeout expires (32.2.4) when the current time is after abs_time (for try_acquire_until) or when at least rel_time has passed from the start of the function (for try_acquire_for).
19 Throws: Timeout-related exceptions (32.2.4), or system_error when a non-timeout-related exception is required (32.2.2).
20 Error conditions: Any of the error conditions allowed for mutex types (32.5.4.2).
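Before turning to the implementation, here is a small illustrative sketch of the interface described above; the semaphore capacity and timeout values are arbitrary choices for the example:
#include <chrono>
#include <semaphore>

std::counting_semaphore<4> pool{4}; // at most four concurrent holders

void worker()
{
  pool.acquire();   // blocks while the counter is zero
  // ... use the shared resource ...
  pool.release();   // increments the counter, waking a blocked acquirer if any
}

bool worker_with_deadline()
{
  using namespace std::chrono_literals;
  if (!pool.try_acquire_for(50ms)) // give up if no slot frees up within ~50ms
    return false;
  // ... use the shared resource ...
  pool.release();
  return true;
}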
OK, so how to implement it?
If we have POSIX semaphores available, the following type will be defined:
struct __platform_semaphore
{
using __clock_t = chrono::system_clock;
#ifdef SEM_VALUE_MAX
static constexpr ptrdiff_t _S_max = SEM_VALUE_MAX;
#else
static constexpr ptrdiff_t _S_max = _POSIX_SEM_VALUE_MAX;
#endif
explicit __platform_semaphore(ptrdiff_t __count) noexcept
{ sem_init(&_M_semaphore, 0, __count); }
~__platform_semaphore()
{ sem_destroy(&_M_semaphore); }
// ...
};
It implements acquire, try_acquire, and release as follows:
struct __platform_semaphore
{
// ...
void
_M_acquire() noexcept
{
for (;;)
{
auto __err = sem_wait(&_M_semaphore);
if (__err && (errno == EINTR))
continue;
else if (__err)
std::terminate();
else
break;
}
}
_GLIBCXX_ALWAYS_INLINE bool
_M_try_acquire() noexcept
{
for (;;)
{
auto __err = sem_trywait(&_M_semaphore);
if (__err && (errno == EINTR))
continue;
else if (__err && (errno == EAGAIN))
return false;
else if (__err)
std::terminate();
else
break;
}
return true;
}
_GLIBCXX_ALWAYS_INLINE void
_M_release(std::ptrdiff_t __update) noexcept
{
for(; __update != 0; --__update)
{
auto __err = sem_post(&_M_semaphore);
if (__err)
std::terminate();
}
}
// ...
};
For the timed wait operations, we have to worry about what the "basis" clock is and how to convert the user-supplied clock. For POSIX semaphores, this is handled as follows:
struct __platform_semaphore
{
using __clock_t = chrono::system_clock;
// ...
bool
_M_try_acquire_until_impl(const chrono::time_point<__clock_t>& __atime)
noexcept
{
auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
struct timespec __ts =
{
static_cast<std::time_t>(__s.time_since_epoch().count()),
static_cast<long>(__ns.count())
};
for (;;)
{
if (auto __err = sem_timedwait(&_M_semaphore, &__ts))
{
if (errno == EINTR)
continue;
else if (errno == ETIMEDOUT || errno == EINVAL)
return false;
else
std::terminate();
}
else
break;
}
return true;
}
template<typename _Clock, typename _Duration>
bool
_M_try_acquire_until(const chrono::time_point<_Clock,
_Duration>& __atime) noexcept
{
if constexpr (std::is_same_v<__clock_t, _Clock>)
{
return _M_try_acquire_until_impl(__atime);
}
else
{
const typename _Clock::time_point __c_entry = _Clock::now();
const auto __s_entry = __clock_t::now();
const auto __delta = __atime - __c_entry;
const auto __s_atime = __s_entry + __delta;
if (_M_try_acquire_until_impl(__s_atime))
return true;
// We got a timeout when measured against __clock_t but
// we need to check against the caller-supplied clock
// to tell whether we should return a timeout.
return (_Clock::now() < __atime);
}
}
};
If we have support for atomic wait/notify (atomic::wait()/notify_one()/notify_all()), then the following type will be defined:
struct __atomic_semaphore
{
static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits<__detail::__platform_wait_t>::__max;
explicit __atomic_semaphore(__detail::__platform_wait_t __count) noexcept
: _M_counter(__count)
{
__glibcxx_assert(__count >= 0 && __count <= _S_max);
}
static _GLIBCXX_ALWAYS_INLINE bool
_S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
{
auto __old = __atomic_impl::load(__counter, memory_order::acquire);
if (__old == 0)
return false;
return __atomic_impl::compare_exchange_strong(__counter,
__old, __old - 1,
memory_order::acquire,
memory_order::relaxed);
}
_GLIBCXX_ALWAYS_INLINE void
_M_acquire() noexcept
{
auto const __pred = [this]
{return _S_do_try_acquire(&this->_M_counter);};
std::__atomic_wait_address(&_M_counter, __pred);
}
bool
_M_try_acquire() noexcept
{
auto const __pred = [this]
{ return _S_do_try_acquire(&this->_M_counter); };
return std::__detail::__atomic_spin(__pred);
}
// ...
};
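The core of _S_do_try_acquire is a single compare-and-exchange that is only attempted when the counter is observed to be positive; a failed exchange simply reports false, which acquire() and try_acquire() treat as "not acquired". The same step expressed with the public std::atomic API looks roughly like this (a sketch, not the libstdc++ code itself):
#include <atomic>

bool try_acquire_once(std::atomic<int>& counter) noexcept
{
  int old = counter.load(std::memory_order_acquire);
  if (old == 0)
    return false; // nothing to acquire right now
  // One CAS attempt: succeeds only if no other thread changed the counter;
  // a spurious failure here is permitted by the Standard's try_acquire wording.
  return counter.compare_exchange_strong(old, old - 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
}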
This expects an implementation of __atomic_wait_address that can accept a predicate; however, we have only defined __atomic_wait_address_v to this point (see part 1).
template<typename _EntersWait>
struct __waiter
{
// ...
template<typename _Pred>
void
_M_do_wait(_Pred __pred) noexcept
{
do
{
__platform_wait_t __val;
if (__base_type::_M_do_spin(__pred, __val))
return;
__base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
}
while (!__pred());
}
};
template<typename _Tp, typename _Pred>
void
__atomic_wait_address(const _Tp* __addr, _Pred __pred) noexcept
{
__detail::__enters_wait __w(__addr);
__w._M_do_wait(__pred);
}
There is a problem with this formulation: __enters_wait atomically increments and decrements a waiter count, which __atomic_semaphore does not need because it inherently tracks the presence of waiters. In the previous article, we introduced the __enters_wait and __bare_wait types, used by waiters and notifiers respectively; we now extend that notion to also support "bare" waiters.
// This call is to be used by atomic types which track contention externally
template<typename _Pred>
void
__atomic_wait_address_bare(const __detail::__platform_wait_t* __addr,
_Pred __pred) noexcept
{
#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
do
{
__detail::__platform_wait_t __val;
if (__detail::__bare_wait::_S_do_spin(__addr, __pred, __val))
return;
__detail::__platform_wait(__addr, __val);
}
while (!__pred());
#else // !_GLIBCXX_HAVE_PLATFORM_WAIT
__detail::__bare_wait __w(__addr);
__w._M_do_wait(__pred);
#endif
}
We also need to introduce a corresponding "bare" notify, which skips checks for waiters:
struct __waiter_pool
{
// ...
void
_M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept
{
if (!(__bare || _M_waiting()))
return;
#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
__platform_notify(__addr, __all);
#else
if (__all)
_M_cv.notify_all();
else
_M_cv.notify_one();
#endif
}
};
Supporting timed waits on __atomic_semaphore
The C++ Standard has extensive support for durations and time points and specifies waiting operations in terms of those types:
template<class Rep, class Period>
bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template<class Clock, class Duration>
bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
However, <chrono> is not supported in the freestanding (non-hosted) subset of the standard library, but <atomic> is. This means we can't introduce a dependency on <chrono> in the <atomic> header or in its underlying bits/atomic_base.h header. For this reason, all timed waiting functionality is split out into bits/atomic_timed_wait.h. We preferentially use std::chrono::steady_clock as our basis clock on platforms that have futexes or pthread_cond_clockwait; otherwise, we fall back to system_clock and convert all user-supplied time points to that clock:
#if defined(_GLIBCXX_HAVE_LINUX_FUTEX) || defined(_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT)
using __wait_clock_t = chrono::steady_clock;
#else
using __wait_clock_t = chrono::system_clock;
#endif
template<typename _Clock, typename _Dur>
__wait_clock_t::time_point
__to_wait_clock(const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
const typename _Clock::time_point __c_entry = _Clock::now();
const __wait_clock_t::time_point __w_entry = __wait_clock_t::now();
const auto __delta = __atime - __c_entry;
using __w_dur = typename __wait_clock_t::duration;
return __w_entry + chrono::ceil<__w_dur>(__delta);
}
template<typename _Dur>
__wait_clock_t::time_point
__to_wait_clock(const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
{
using __w_dur = typename __wait_clock_t::duration;
return chrono::ceil<__w_dur>(__atime);
}
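The conversion technique is worth seeing in isolation: sample both clocks at entry, compute the time remaining against the caller's clock, and re-anchor that duration on the wait clock, rounding up so we never wake early. A standalone sketch using only public <chrono> facilities (the function name is mine, not libstdc++'s):
#include <chrono>

template<typename Clock, typename Dur>
std::chrono::steady_clock::time_point
to_steady(const std::chrono::time_point<Clock, Dur>& atime)
{
  const auto c_now = Clock::now();                      // caller's clock, now
  const auto s_now = std::chrono::steady_clock::now();  // target clock, now
  const auto delta = atime - c_now;                     // time still remaining
  // chrono::ceil rounds up, so the converted deadline is never earlier
  // than the one requested.
  return s_now + std::chrono::ceil<std::chrono::steady_clock::duration>(delta);
}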
If a platform supports an efficient __platform_wait(), it is also expected to provide an efficient timed wait version of the same called __platform_wait_until(). For Linux Futexes, we implement this as:
// returns true if wait ended before timeout
template<typename _Dur>
bool
__platform_wait_until_impl(const __platform_wait_t* __addr, __platform_wait_t __old,
const chrono::time_point<__wait_clock_t, _Dur>& __atime) noexcept
{
auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
struct timespec __rt =
{
static_cast<std::time_t>(__s.time_since_epoch().count()),
static_cast<long>(__ns.count())
};
auto __e = syscall (SYS_futex, __addr,
static_cast<int>(__futex_wait_flags::__wait_bitset_private),
__old, &__rt, nullptr,
static_cast<int>(__futex_wait_flags::__bitset_match_any));
if (__e)
{
if ((errno != ETIMEDOUT) && (errno != EINTR) && (errno != EAGAIN))
__throw_system_error(errno);
return true;
}
return false;
}
// returns true if wait ended before timeout
template<typename _Clock, typename _Dur>
bool
__platform_wait_until(const __platform_wait_t* __addr, __platform_wait_t __old,
const chrono::time_point<_Clock, _Dur>& __atime)
{
if constexpr (is_same_v<__wait_clock_t, _Clock>)
{
return __platform_wait_until_impl(__addr, __old, __atime);
}
else
{
if (!__platform_wait_until_impl(__addr, __old,
__to_wait_clock(__atime)))
{
// We got a timeout when measured against __clock_t but
// we need to check against the caller-supplied clock
// to tell whether we should return a timeout.
if (_Clock::now() < __atime)
return true;
}
return false;
}
}
As with __platform_wait(), for platforms that do not provide __platform_wait_until(), we fall back to using a mutex and condition variable and defining the following support functions:
// Returns true if wait ended before timeout.
// _Clock must be either steady_clock or system_clock.
template<typename _Clock, typename _Dur>
bool
__cond_wait_until_impl(__condvar& __cv, mutex& __mx,
const chrono::time_point<_Clock, _Dur>& __atime)
{
static_assert(std::__is_one_of<_Clock,
chrono::steady_clock,
chrono::system_clock>::value);
auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
__gthread_time_t __ts =
{
static_cast<std::time_t>(__s.time_since_epoch().count()),
static_cast<long>(__ns.count())
};
// if we have a pthreads implementation that supports CLOCK_MONOTONIC
// and the caller's clock is steady_clock, use that
#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
if constexpr (is_same_v<chrono::steady_clock, _Clock>)
__cv.wait_until(__mx, CLOCK_MONOTONIC, __ts);
else
#endif
__cv.wait_until(__mx, __ts);
return _Clock::now() < __atime;
}
// returns true if wait ended before timeout
template<typename _Clock, typename _Dur>
bool
__cond_wait_until(__condvar& __cv, mutex& __mx,
const chrono::time_point<_Clock, _Dur>& __atime)
{
#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
if constexpr (is_same_v<_Clock, chrono::steady_clock>)
return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
else
#endif
if constexpr (is_same_v<_Clock, chrono::system_clock>)
return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
else
{
if (__cond_wait_until_impl(__cv, __mx, __to_wait_clock(__atime)))
{
// We got a timeout when measured against __clock_t but
// we need to check against the caller-supplied clock
// to tell whether we should return a timeout.
if (_Clock::now() < __atime)
return true;
}
return false;
}
}
The formulation of __waiter_pool from part 1 assumes that we want indefinitely blocking waits; here, however, we also need waits that can time out. The actual implementation in bits/atomic_wait.h therefore splits the common state into a base type:
struct __waiter_pool_base
{
// ...
alignas(_S_align) __platform_wait_t _M_wait = 0;
#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
mutex _M_mtx;
#endif
alignas(_S_align) __platform_wait_t _M_ver = 0;
#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
__condvar _M_cv;
#endif
void
_M_enter_wait() noexcept;
void
_M_leave_wait() noexcept;
bool
_M_waiting() const noexcept;
void
_M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) noexcept;
static __waiter_pool_base&
_S_for(const void* __addr) noexcept;
};
For indefinite atomic waits, the derived type __waiter_pool is used:
struct __waiter_pool : __waiter_pool_base
{
void
_M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept;
};
For timed atomic waits, the derived type __timed_waiter_pool is used:
struct __timed_waiter_pool : __waiter_pool_base
{
// returns true if wait ended before timeout
template<typename _Clock, typename _Dur>
bool
_M_do_wait_until(__platform_wait_t* __addr, __platform_wait_t __old,
const chrono::time_point<_Clock, _Dur>& __atime)
{
#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
return __platform_wait_until(__addr, __old, __atime);
#else
__platform_wait_t __val;
__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
if (__val == __old)
{
lock_guard<mutex> __l(_M_mtx);
return __cond_wait_until(_M_cv, _M_mtx, __atime);
}
else
return true;
#endif // _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
}
};
Similarly, __waiter is split into a __waiter_base type:
template<typename _Tp>
struct __waiter_base
{
using __waiter_type = _Tp;
__waiter_type& _M_w;
__platform_wait_t* _M_addr;
template<typename _Up>
static __platform_wait_t*
_S_wait_addr(const _Up* __a, __platform_wait_t* __b);
static __waiter_type&
_S_for(const void* __addr) noexcept
{
static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
auto& res = __waiter_pool_base::_S_for(__addr);
return reinterpret_cast<__waiter_type&>(res);
}
template<typename _Up>
explicit __waiter_base(const _Up* __addr) noexcept
: _M_w(_S_for(__addr))
, _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
{ }
void
_M_notify(bool __all, bool __bare = false);
template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
static bool
_S_do_spin_v(__platform_wait_t* __addr, const _Up& __old, _ValFn __vfn,
__platform_wait_t& __val, _Spin __spin = _Spin{ });
template<typename _Up, typename _ValFn, typename _Spin = __default_spin_policy>
bool
_M_do_spin_v(const _Up& __old, _ValFn __vfn, __platform_wait_t& __val,
_Spin __spin = _Spin{ });
template<typename _Pred, typename _Spin = __default_spin_policy>
static bool
_S_do_spin(const __platform_wait_t* __addr, _Pred __pred,
__platform_wait_t& __val, _Spin __spin = _Spin{ });
template<typename _Pred, typename _Spin = __default_spin_policy>
bool
_M_do_spin(_Pred __pred, __platform_wait_t& __val, _Spin __spin = _Spin{ });
};
From which __waiter is derived:
template<typename _EntersWait>
struct __waiter : __waiter_base<__waiter_pool>
{
using __base_type = __waiter_base<__waiter_pool>;
template<typename _Tp>
__waiter(const _Tp* __addr) noexcept;
// ...
template<typename _Tp, typename _ValFn>
void
_M_do_wait_v(_Tp __old, _ValFn __vfn);
template<typename _Pred>
void
_M_do_wait(_Pred __pred) noexcept;
};
using __enters_wait = __waiter<std::true_type>;
using __bare_wait = __waiter<std::false_type>;
__timed_waiter is similarly derived:
template<typename _EntersWait>
struct __timed_waiter : __waiter_base<__timed_waiter_pool>
{
using __base_type = __waiter_base<__timed_waiter_pool>;
template<typename _Tp>
__timed_waiter(const _Tp* __addr) noexcept;
// ...
// returns true if wait ended before timeout
template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur>
bool
_M_do_wait_until_v(_Tp __old, _ValFn __vfn,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
// returns true if wait ended before timeout
template<typename _Pred,
typename _Clock, typename _Dur>
bool
_M_do_wait_until(_Pred __pred, __platform_wait_t __val,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
// returns true if wait ended before timeout
template<typename _Pred,
typename _Clock, typename _Dur>
bool
_M_do_wait_until(_Pred __pred,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept;
template<typename _Tp, typename _ValFn,
typename _Rep, typename _Period>
bool
_M_do_wait_for_v(_Tp __old, _ValFn __vfn,
const chrono::duration<_Rep, _Period>& __rtime) noexcept;
template<typename _Pred,
typename _Rep, typename _Period>
bool
_M_do_wait_for(_Pred __pred,
const chrono::duration<_Rep, _Period>& __rtime) noexcept;
};
using __enters_timed_wait = __timed_waiter<std::true_type>;
using __bare_timed_wait = __timed_waiter<std::false_type>;
As with __waiter, there are top-level __atomic_wait_address_until/for wrappers that __atomic_semaphore calls into, all of which follow the general form of:
// returns true if wait ended before timeout
template<typename _Tp, typename _ValFn,
typename _Clock, typename _Dur>
bool
__atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
__detail::__enters_timed_wait __w{__addr};
return __w._M_do_wait_until_v(__old, __vfn, __atime);
}
template<typename _Tp, typename _Pred,
typename _Clock, typename _Dur>
bool
__atomic_wait_address_until(const _Tp* __addr, _Pred __pred,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
__detail::__enters_timed_wait __w{__addr};
return __w._M_do_wait_until(__pred, __atime);
}
There are also versions for "bare" waits, as well as _for_v/_for variants that wait for a supplied duration. The actual waiting entry points on __timed_waiter are implemented as follows:
struct __timed_waiter
{
// ...
template<typename _Tp, typename _ValFn,
typename _Clock, typename _Dur>
bool
_M_do_wait_until_v(_Tp __old, _ValFn __vfn,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
__platform_wait_t __val;
if (_M_do_spin_v(__old, std::move(__vfn), __val,
__timed_backoff_spin_policy(__atime)))
return true;
return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime);
}
template<typename _Pred,
typename _Clock, typename _Dur>
bool
_M_do_wait_until(_Pred __pred, __platform_wait_t __val,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
for (auto __now = _Clock::now(); __now < __atime; __now = _Clock::now())
{
if (__base_type::_M_do_spin(__pred, __val,
__timed_backoff_spin_policy(__atime, __now)))
return true;
if (__base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime)
&& __pred())
return true;
}
return false;
}
};
Aside from the usual differences between the _v and predicate forms of wait, the timed waits introduce a custom spin policy:
struct __timed_backoff_spin_policy
{
__wait_clock_t::time_point _M_deadline;
__wait_clock_t::time_point _M_t0;
template<typename _Clock, typename _Dur>
__timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur> __deadline = _Clock::time_point::max(),
chrono::time_point<_Clock, _Dur> __t0 = _Clock::now()) noexcept
: _M_deadline(__to_wait_clock(__deadline))
, _M_t0(__to_wait_clock(__t0))
{ }
bool
operator()() const noexcept
{
using namespace literals::chrono_literals;
auto __now = __wait_clock_t::now();
if (_M_deadline <= __now)
return false;
auto __elapsed = __now - _M_t0;
if (__elapsed > 128ms)
{
this_thread::sleep_for(64ms);
}
else if (__elapsed > 64us)
{
this_thread::sleep_for(__elapsed / 2);
}
else if (__elapsed > 4us)
{
__thread_yield();
}
else
return false;
return true;
}
};
After the usual spin completes unsatisfied, this will begin sleeping the current thread as long as the deadline hasn't been reached. The relative wait_for variants are implemented in terms of the absolute wait_until members:
struct __timed_waiter
{
// ...
template<typename _Tp, typename _ValFn,
typename _Rep, typename _Period>
bool
_M_do_wait_for_v(_Tp __old, _ValFn __vfn,
const chrono::duration<_Rep, _Period>& __rtime) noexcept
{
__platform_wait_t __val;
if (_M_do_spin_v(__old, std::move(__vfn), __val))
return true;
if (!__rtime.count())
return false; // no rtime supplied, and spin did not acquire
auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr,
__val,
chrono::steady_clock::now() + __reltime);
}
template<typename _Pred,
typename _Rep, typename _Period>
bool
_M_do_wait_for(_Pred __pred,
const chrono::duration<_Rep, _Period>& __rtime) noexcept
{
__platform_wait_t __val;
if (__base_type::_M_do_spin(__pred, __val))
return true;
if (!__rtime.count())
return false; // no rtime supplied, and spin did not acquire
auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
return _M_do_wait_until(__pred, __val, chrono::steady_clock::now() + __reltime);
}
};
The rest of __atomic_semaphore
With support for atomic timed waits in place, we can define the remaining members of __atomic_semaphore:
struct __atomic_semaphore
{
// ...
template<typename _Clock, typename _Duration>
_GLIBCXX_ALWAYS_INLINE bool
_M_try_acquire_until(const chrono::time_point<_Clock,
_Duration>& __atime) noexcept
{
auto const __pred =
[this] { return _S_do_try_acquire(&this->_M_counter); };
return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
}
template<typename _Rep, typename _Period>
_GLIBCXX_ALWAYS_INLINE bool
_M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime) noexcept
{
auto const __pred =
[this] { return _S_do_try_acquire(&this->_M_counter); };
return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
}
};
You might have observed that there are more __atomic_wait_address_for/until variations than are actually used by __atomic_semaphore. C++26 will likely add timed versions of wait() to atomic as well as a version that accepts a predicate. The underlying support for these operations is already present but not yet exposed via the interface of atomic, and the details are subject to change in a future version of GCC.
The rest of <semaphore>
The semaphore implementation to use is conditionally chosen based on the presence of the atomic wait feature-test macro:
#if defined __cpp_lib_atomic_wait
using __semaphore_impl = __atomic_semaphore;
#elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
using __semaphore_impl = __platform_semaphore;
#endif
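User code can make a similar decision with the library feature-test macros; for example, a sketch checking for the semaphore feature itself:
#include <version>

#if defined __cpp_lib_semaphore
# include <semaphore>
  // std::counting_semaphore and std::binary_semaphore are available
#else
  // fall back to, e.g., a mutex/condition_variable-based substitute
#endif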
We then implement std::counting_semaphore in terms of __semaphore_impl:
template<ptrdiff_t __least_max_value = __semaphore_impl::_S_max>
class counting_semaphore
{
static_assert(__least_max_value >= 0);
static_assert(__least_max_value <= __semaphore_impl::_S_max);
__semaphore_impl _M_sem;
public:
explicit counting_semaphore(ptrdiff_t __desired) noexcept
: _M_sem(__desired)
{ }
~counting_semaphore() = default;
counting_semaphore(const counting_semaphore&) = delete;
counting_semaphore& operator=(const counting_semaphore&) = delete;
static constexpr ptrdiff_t
max() noexcept
{ return __least_max_value; }
void
release(ptrdiff_t __update = 1) noexcept(noexcept(_M_sem._M_release(1)))
{ _M_sem._M_release(__update); }
void
acquire() noexcept(noexcept(_M_sem._M_acquire()))
{ _M_sem._M_acquire(); }
bool
try_acquire() noexcept(noexcept(_M_sem._M_try_acquire()))
{ return _M_sem._M_try_acquire(); }
template<typename _Rep, typename _Period>
bool
try_acquire_for(const std::chrono::duration<_Rep, _Period>& __rtime)
{ return _M_sem._M_try_acquire_for(__rtime); }
template<typename _Clock, typename _Dur>
bool
try_acquire_until(const std::chrono::time_point<_Clock, _Dur>& __atime)
{ return _M_sem._M_try_acquire_until(__atime); }
};
And std::binary_semaphore is just a type alias:
using binary_semaphore = std::counting_semaphore<1>;
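binary_semaphore is frequently used as a lightweight one-shot signal between two threads; an illustrative sketch:
#include <semaphore>

std::binary_semaphore ready{0}; // starts unavailable

void producer()
{
  // ... prepare the data ...
  ready.release(); // signal the consumer
}

void consumer()
{
  ready.acquire(); // blocks until producer() calls release()
  // ... consume the data ...
}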
Next time
With all of the waiting detail and semaphores out of the way, the next installment will look at <latch> and <barrier>.