Red Hat at the ISO C++ Standards Meeting (Nov 2014): Parallelism and Concurrency
Several Red Hat engineers attended the JTC1/SC22/WG21 C++ Standards Committee meetings in November 2014 at Urbana-Champaign, IL, USA. This post focuses on the sessions of SG1, the study group on parallelism and concurrency, which met for the whole week to discuss proposals and work on the technical specifications (TS) for both parallelism and concurrency.
SG1 mostly worked on finalizing the first revision of the Parallelism TS, and continued to review and accept proposals for the Concurrency TS. The Transactional Memory proposal is also making progress toward becoming a TS.
The Parallelism TS is currently being reviewed and voted on by the National Bodies. They submitted a few comments, which SG1 addressed, including by adding a few parallel algorithms.
SG1 also discussed proposals for SIMD (vector) parallelism, exposed either through library interfaces or as language constructs. One noteworthy part of this was the discussion of three SIMD execution models and how they differ:
- Lockstep execution is special in that it requires vectorization, whereas the other two models also allow sequential execution (which permits different implementations).
- Wavefront execution has implicit wavefront barriers (e.g., at every sequence point), which require previous vector “lanes” to have completed an operation before code after the barrier is executed.
- Explicit barrier execution is equivalent to task-parallel execution by default (e.g., each iteration of a vectorized loop can be executed by one GPU thread), but can be constrained by barriers (e.g., wavefront barriers) that the programmer explicitly places in the code. This last model allows compilers to optimize most aggressively, and it maps well to SIMD CPUs and GPUs.
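To make the difference between lockstep and sequential execution more concrete, here is a toy simulation in plain C++. It is only a sketch under simplifying assumptions, not the semantics defined in any of the proposals: the function names, the two-operation loop body, and the `width` parameter are all hypothetical, and "lockstep" is modeled simply as "every lane in a chunk performs one operation before any lane performs the next."

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sequential execution: each iteration runs to completion before the next starts.
std::vector<int> run_sequential(std::vector<int> a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int t = a[i - 1];  // operation 1: load the left neighbor
        a[i] = t + 1;      // operation 2: store the incremented value
    }
    return a;
}

// Toy lockstep model (an assumption for illustration): within each chunk of
// `width` lanes, all lanes perform operation 1 before any lane performs
// operation 2.
std::vector<int> run_lockstep(std::vector<int> a, std::size_t width) {
    if (width == 0) width = 1;  // guard against a zero-width "vector"
    std::vector<int> t(width);
    for (std::size_t base = 1; base < a.size(); base += width) {
        const std::size_t lanes = std::min(width, a.size() - base);
        for (std::size_t lane = 0; lane < lanes; ++lane)
            t[lane] = a[base + lane - 1];      // operation 1 in every lane
        for (std::size_t lane = 0; lane < lanes; ++lane)
            a[base + lane] = t[lane] + 1;      // operation 2 in every lane
    }
    return a;
}
```

For the input `{0, 0, 0, 0, 0}`, `run_sequential` yields `{0, 1, 2, 3, 4}`, while `run_lockstep` with a width of 4 yields `{0, 1, 1, 1, 1}`, because all four lanes load their neighbor before any lane stores. In this toy model, a loop whose meaning depends on lockstep behavior cannot simply be run sequentially, which is one way to read the remark that lockstep semantics require vectorization.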
For the Concurrency TS, the major discussion topics and proposals were (1) resumable functions and coroutines, (2) synchronization mechanisms such as latches and a spin-wait-and-blocking abstraction, and (3) improvements to the memory model and atomics. The spin-wait-and-blocking abstraction is interesting from a Linux perspective because it provides a feature similar to futexes. In addition, two proposals for executors were presented; executors are abstractions for executing parallel or concurrent tasks that encapsulate policies for how and when to execute them.
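As a rough illustration of what a latch combined with spin-then-block waiting might look like, here is a minimal sketch built only on C++11 primitives. It is not the interface that was proposed for the Concurrency TS: the class name `SpinThenBlockLatch`, the `kSpinLimit` constant, and the use of a mutex and condition variable as the blocking fallback (where a Linux implementation might use a futex) are all assumptions made for this example.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical single-use latch: waiters spin briefly, then block.
// The latch must outlive every call to count_down() and wait().
class SpinThenBlockLatch {
public:
    explicit SpinThenBlockLatch(std::ptrdiff_t count) : count_(count) {}

    // Decrement the counter; the call that reaches zero wakes blocked waiters.
    void count_down() {
        if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            // Taking the mutex ensures a waiter cannot miss the notification
            // between checking the counter and going to sleep.
            std::lock_guard<std::mutex> lock(mutex_);
            cv_.notify_all();
        }
    }

    // Non-blocking check: has the counter reached zero?
    bool try_wait() const {
        return count_.load(std::memory_order_acquire) == 0;
    }

    // Spin for a bounded number of attempts, then fall back to blocking.
    void wait() {
        for (int i = 0; i < kSpinLimit; ++i) {
            if (try_wait()) return;
            std::this_thread::yield();
        }
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return try_wait(); });
    }

private:
    static constexpr int kSpinLimit = 1000;  // arbitrary value for the sketch
    std::atomic<std::ptrdiff_t> count_;
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

The spin phase avoids the cost of blocking when the latch opens quickly, and the blocking phase avoids burning CPU time when it does not. A suitable spin limit depends heavily on the workload and the platform, so the value here should be treated as a placeholder.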
I presented two proposals about light-weight execution agents and the terminology to use around threads. I have already described execution agents in a prior blog post; the main new points in this revision of the proposal are:
- A base definition of what forward progress actually means in the execution of a C++ program,
- A notion of boost-blocking, which basically is a guarantee that abstractions like parallel loops can make to ensure progress, even if an execution agent with strong forward progress guarantees is blocked waiting on execution agents with weaker forward progress guarantees,
- A proposal for how, and to what extent, thread-local state (TLS) could be supported by different kinds of execution agents; having to support TLS as if, for example, every parallel task were a thread can be costly (e.g., due to having to execute constructors and destructors for TLS at the start and end of each parallel task).
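The TLS cost mentioned in the last point can be observed with standard C++11 `thread_local` objects. The following sketch counts how often a thread-local object is constructed when every task gets its own thread, compared with running all tasks on a single worker thread. All names (`TaskLocalState`, `run_task`, and the two measurement functions) are hypothetical and exist only for this illustration; it does not model the light-weight execution agents from the proposal.

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> g_constructions{0};
std::atomic<int> g_destructions{0};

// Hypothetical per-thread state with a non-trivial constructor and destructor.
struct TaskLocalState {
    TaskLocalState() { ++g_constructions; }
    ~TaskLocalState() { ++g_destructions; }
    int scratch = 0;
};

thread_local TaskLocalState tls_state;

// A task that touches the thread-local state, forcing its construction
// in whichever thread runs the task first.
void run_task(int id) { tls_state.scratch += id; }

// Every task is run as if it were a thread: one TLS construction per task.
int constructions_with_thread_per_task(int tasks) {
    const int before = g_constructions.load();
    std::vector<std::thread> threads;
    for (int i = 0; i < tasks; ++i) threads.emplace_back(run_task, i);
    for (auto& t : threads) t.join();
    return g_constructions.load() - before;
}

// All tasks share one worker thread: a single TLS construction in total.
int constructions_with_one_worker(int tasks) {
    const int before = g_constructions.load();
    std::thread worker([tasks] {
        for (int i = 0; i < tasks; ++i) run_task(i);
    });
    worker.join();
    return g_constructions.load() - before;
}
```

With four tasks, the first function reports four constructions and the second reports one. An execution agent that is not required to provide full thread-like TLS could avoid the per-task construction and destruction work, which is the trade-off the proposal explores.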
Read the related article about C++ Standards: Core.
We’re always interested to hear what matters to Red Hat Enterprise Linux developers, so if you have comments or questions on any aspects of the upcoming C++ standards – in the concurrency area, or otherwise – please feel free to get in touch with us at rheldevelop AT redhat DOT com or Tweet @RHELdevelop.