Red Hat at the ISO C++ Standards Meeting (March 2017): Parallelism and Concurrency

Several Red Hat engineers attended the JTC1/SC22/WG21 C++ Standards Committee meetings in March 2017. This post focuses on the sessions of SG1, the study group on parallelism and concurrency. The main topics of the week were (1) further polishing the parallel algorithms in the C++17 draft, (2) making progress on the executors proposal (which provides mechanisms to control how parallel work is executed, for example on which resources), and (3) continuing work on proposals targeting version 2 of the Concurrency Technical Specification. We also discussed an important aspect of enabling standard C++ code to execute on GPUs, a topic in which several people in SG1, myself included, have a lot of interest.

While the parallel algorithms in the C++17 draft did not need major changes, a few minor aspects still needed improvement. For example, we refined which kinds of iterators are required and in which cases the input to parallel algorithms may be copied, and we tweaked a few of the algorithms so that more parallelism can be exploited.
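For readers who have not used the parallel algorithms yet, here is a minimal sketch of what a call looks like from the user's side. It assumes a C++17 standard library that implements the execution policies in <execution>; some implementations build them on an external backend (GCC's libstdc++ uses Intel TBB, for example), which may then have to be linked. The helper name parallel_sum is made up for this illustration.

```cpp
#include <cassert>
#include <execution>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the values with the C++17 parallel std::reduce.
// std::execution::par permits, but does not require, the library to spread
// the work across several threads.
long long parallel_sum(const std::vector<long long>& values) {
    return std::reduce(std::execution::par, values.begin(), values.end(), 0LL);
}

// Usage: the sum of 1..1000 is 500500 regardless of how the work is split.
void parallel_sum_example() {
    std::vector<long long> values(1000);
    std::iota(values.begin(), values.end(), 1LL);
    assert(parallel_sum(values) == 500500LL);
}
```

Because std::reduce may group and reorder the additions, it is only a safe replacement for std::accumulate when the operation is associative and commutative, as integer addition is here.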

We also spent a lot of time on the executors proposal (P0443R1). It is considered an important piece of the C++ support for parallelism because it is intended to let programs control in detail how parallel work is executed, and because it is supposed to work not just in combination with the parallel algorithms but also with the features specified in the Networking Technical Specification. Thus, there are a lot of technical questions and use cases to consider. While we are still far from having resolved all of them, we made quite a bit of progress. One outcome that I am particularly glad about is that we were able to determine (and agree on) the minimal set of features required for an Executors Technical Specification and for future inclusion in the C++ standard. This means that once these features are ready, we should be able to ship them, with other features being optional and included only if they are ready.
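To make the idea of an executor a little more concrete, here is a rough sketch of the general shape: generic code hands work to an executor object, and the executor decides where and when that work runs. This is not the interface from P0443R1; the types inline_executor and deferred_executor and the function submit_each are hypothetical names invented for this illustration, and the sketch stays single-threaded to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical executor that runs each task immediately on the calling thread.
struct inline_executor {
    template <class F>
    void execute(F&& f) const { std::forward<F>(f)(); }
};

// Hypothetical executor that only records tasks; they run later, in
// submission order, when the owner calls run_all().
class deferred_executor {
public:
    template <class F>
    void execute(F&& f) { tasks_.emplace_back(std::forward<F>(f)); }

    void run_all() {
        // Take the queue first: work submitted by a running task is kept
        // for the next call to run_all().
        std::vector<std::function<void()>> pending = std::move(tasks_);
        tasks_.clear();
        for (auto& task : pending) task();
    }

    std::size_t pending() const { return tasks_.size(); }

private:
    std::vector<std::function<void()>> tasks_;
};

// Generic code names only the execute() operation; the executor decides
// where and when each call to fn actually happens.
template <class Executor, class Fn>
void submit_each(Executor& ex, const std::vector<int>& items, Fn fn) {
    for (int item : items) ex.execute([fn, item] { fn(item); });
}

// Usage: the same call site behaves differently depending on the executor.
void executor_example() {
    int sum = 0;
    deferred_executor later;
    submit_each(later, {1, 2, 3}, [&sum](int x) { sum += x; });
    assert(sum == 0);   // nothing has run yet
    later.run_all();
    assert(sum == 6);   // all three tasks have now run
}
```

The point of the design is that submit_each never mentions threads, queues, or devices. Swapping the executor changes the execution strategy without touching the algorithm, which is the kind of separation the proposal aims for.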

In the area of concurrency, we primarily continued work on existing proposals for Read-Copy-Update (RCU, which is also used extensively in the Linux kernel), hazard pointers, and concurrent data structures such as concurrent queues and counters.

In a joint session with the Evolution Working Group, we also discussed how to let programs request that certain functions be executable in different ways, each of which needs differently generated binary code. This is relevant in several contexts. The running example was SIMD code (e.g., enabling a function to be compiled so that one variant uses scalar instructions whereas another variant uses SIMD instructions). But similar flavors of the same problem exist with support for GPGPU code (one wants to write source code once and still be able to execute it on both the CPU and a GPU, which use different ISAs) or Software Transactional Memory (transactional execution implemented in software requires compiling the code differently). One requirement in many of these contexts is that if an execution is currently in the special mode (e.g., on the GPU, or in a transaction), then calls to other functions should stay in that mode. This can be enforced at compile time by making it visible in the types of functions whether they support the special execution mode or not (e.g., because they have been annotated accordingly by the programmer, so that all phases of compilation and all parts of the program are aware of this). However, parts of the committee pushed back strongly against adding complexity to the type system (e.g., because this can affect existing code that uses templates). The people attending this session preferred not to extend the type system for the SIMD case, but how best to handle the GPGPU case is still an open question. There may be solutions for the GPGPU case that do not enforce or check the availability of GPU code at compile time but are good enough in practice (e.g., support for reverse offload, that is, continuing an on-GPU execution back on the CPU without affecting correctness). But how these potential solutions affect all relevant use cases, and whether they are practical for all C++ implementations, still needs further investigation.
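The compile-time enforcement described above can be approximated in today's C++ to get a feel for the trade-off. The sketch below is only an emulation under my own assumptions: standard C++ has no such annotation, so tag types stand in for the execution mode, and every name here (cpu_mode, special_mode, square, cpu_only_lookup, sum_of_squares, lookup_available) is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Tag types standing in for an execution mode. special_mode represents
// "on the GPU" or "inside a transaction".
struct cpu_mode {};
struct special_mode {};

// square is "annotated" for both modes: one overload per mode, so the
// compiler could generate different code for each.
inline int square(cpu_mode, int x) { return x * x; }
inline int square(special_mode, int x) { return x * x; }

// cpu_only_lookup exists only for normal execution.
inline int cpu_only_lookup(cpu_mode, int x) { return x + 1; }

// Written once; every callee receives the caller's mode, so execution
// cannot silently leave the mode it was entered in.
template <class Mode>
int sum_of_squares(Mode mode, int a, int b) {
    return square(mode, a) + square(mode, b);
}

// Detection idiom: is cpu_only_lookup callable in the given mode?
template <class Mode, class = void>
struct lookup_available : std::false_type {};

template <class Mode>
struct lookup_available<
    Mode, std::void_t<decltype(cpu_only_lookup(std::declval<Mode>(), 0))>>
    : std::true_type {};

// The mode is part of each function's type, so this is checked at
// compile time rather than discovered at run time.
static_assert(lookup_available<cpu_mode>::value);
static_assert(!lookup_available<special_mode>::value);

// Usage: both modes compute the same result through the same source code.
inline void mode_example() {
    assert(sum_of_squares(cpu_mode{}, 2, 3) == 13);
    assert(sum_of_squares(special_mode{}, 2, 3) == 13);
}
```

Even in this toy version the cost is visible: every function in the call chain has to carry the mode in its signature, and templates that previously took a plain callable now have to know about it. That is the kind of complexity in the type system that parts of the committee were reluctant to accept.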

The next ISO C++ Standards meeting will take place a month from now, and we hope to be able to approve C++17 for publication at that meeting. Stay tuned for an update on that.
