How Rust makes Rayon's data parallelism magical

April 30, 2021
Josh Stone

    Rayon is a data parallelism library for the Rust programming language. Programmers who start using Rayon often remark on how magical it seems: "I changed one line and my code now runs in parallel!" As one of Rayon’s authors, I am of course glad to see happy users, but I want to dispel some of the magic and give credit where it’s due: to Rust itself.

    How Rust supports Rayon's parallelism

    A best-case "magical" scenario often looks something like this with a sequential iterator:

    let total: i32 = foo_vector.iter_mut()
        .filter(|foo| foo.is_interesting())
        .map(|foo| foo.heavy_computation())
        .sum();

    To make this a parallel iterator with Rayon, bring its traits into scope with use rayon::prelude::* and change the first line to call par_iter_mut(), as shown in the following example, then watch it light up all of your CPUs! It’s no problem that foo_vector is a local variable on the stack, or that the computation might be mutating the values. The whole collection is automatically split among multiple threads, and the updates are processed independently without issues:

    let total: i32 = foo_vector.par_iter_mut()
        .filter(|foo| foo.is_interesting())
        .map(|foo| foo.heavy_computation())
        .sum();

    For a Rust developer, it’s not enough just to go fast: everything must still be checked for memory and thread safety. That principle is maintained with Rayon. Suppose the sequential iterator had been written more like this:

    let mut total = 0;
    foo_vector.iter_mut()
        .filter(|foo| foo.is_interesting())
        .for_each(|foo| total += foo.heavy_computation());

    This code is fine sequentially, but if you try to make it parallel with par_iter_mut(), the compiler will return an error:

    cannot assign to `total`, as it is a captured variable in a `Fn` closure
    

    There would be a data race if multiple threads tried to update total at the same time. You could solve this by using an atomic type for total, or by using Rayon’s built-in parallel sum() or your own custom fold+reduce on the iterator.
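    The fold+reduce idea can be sketched with nothing but the standard library. This is only an illustration of the pattern, not how Rayon schedules work: the chunked_sum function and its workers parameter are hypothetical names, and the fixed chunking is an assumption made to keep the example short. Each thread folds its own chunk into a private subtotal, so no shared variable is ever mutated, and the subtotals are reduced at the end.

```rust
use std::thread;

/// Hypothetical helper: sums `f(x)` over `items` using up to `workers` threads.
/// Each thread folds its chunk into a private subtotal; the subtotals are
/// then reduced on the calling thread.
fn chunked_sum<T, F>(items: &[T], workers: usize, f: F) -> i64
where
    T: Sync,                 // `&[T]` chunks are shared with worker threads
    F: Fn(&T) -> i64 + Sync, // `&F` is shared with worker threads
{
    if items.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    // Ceiling division so every item lands in exactly one chunk.
    let chunk_len = (items.len() + workers - 1) / workers;
    let f = &f;
    thread::scope(|scope| {
        // Fold: one private subtotal per chunk.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).sum::<i64>()))
            .collect();
        // Reduce: combine the subtotals.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<i64>()
    })
}
```

    For example, calling chunked_sum(&data, 4, |x| *x) on a Vec<i64> holding 1 through 100 yields 5050, the same answer a sequential sum would give.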

    But Rayon is a plain library, with no compiler integration whatsoever, so how can it know there was a data race? That magic is in the Rust language itself, just from Rayon expressing the right generic constraints.

    Ownership and borrowing in Rust

    Rust has strong semantics for the ownership of values of any type T. The compiler’s static borrow checker keeps track of where a value has been borrowed as a reference, either as &mut T or &T. When a value is owned without any borrows, the owner can do whatever it wants with the value. Borrowing the value as &mut T gives exclusive access: only the borrower can do anything with the value, and even the borrower must leave the underlying T in a valid state. Borrowing the value as &T gives immutable shared access, where both the borrower and the owner can only read the value. The borrow checker enforces all of this in statically determined regions of code (lifetimes), where only unborrowed owners of T or exclusive borrowers holding &mut T are allowed to make any modifications. The only way to modify a value through &T is with types built on UnsafeCell, such as atomics or Mutex, which add runtime synchronization.
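    A small sketch may make the three kinds of access concrete. The function names here (total, double_all, bump, borrow_demo) are made up for illustration and do not come from any library.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Shared borrow: many `&[u32]` may exist at once, but they can only read.
fn total(values: &[u32]) -> u32 {
    values.iter().sum()
}

/// Exclusive borrow: while this `&mut` is live, nothing else can touch `values`.
fn double_all(values: &mut [u32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

/// Mutation through a shared reference works here only because `AtomicU32`
/// is built on `UnsafeCell` and synchronizes the update itself.
fn bump(counter: &AtomicU32) -> u32 {
    counter.fetch_add(1, Ordering::Relaxed) + 1
}

fn borrow_demo() -> (u32, u32, u32) {
    let mut values = vec![1, 2, 3];
    let before = total(&values); // shared borrow, ends with the call
    double_all(&mut values);     // exclusive borrow, ends with the call
    let after = total(&values);

    let counter = AtomicU32::new(0);
    let (a, b) = (&counter, &counter); // two shared borrows at the same time
    bump(a);
    let count = bump(b);
    (before, after, count)
}
```

    Calling borrow_demo() returns (6, 12, 2). Trying to call double_all while a shared borrow of values is still in use would be rejected at compile time.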

    I recommend reading Rust: A unique perspective for more on this topic.

    Thread safety traits

    There are two auto-traits controlling all of Rust’s thread safety: Send and Sync. Being automatic means that the compiler infers these traits from a type’s composition: a struct implements Send or Sync only if all of its fields do. If some field does not, such as a raw pointer, the trait does not apply to the struct unless the author adds it with an unsafe impl declaration, asserting that thread safety will be maintained in some other way (at the risk of undefined behavior if that assertion is wrong).

    Send means that a value of type T can be safely transferred to another thread. For owned values, this simply means the value can be moved entirely, and the original thread has no more access. But we can send references too. For a unique &mut T, the reference can be sent if T: Send is satisfied, passing the unique borrow to the other thread. For a shared &T, the reference can be sent only based on the other trait, T: Sync, which indicates that T can be safely shared with another thread.
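    One way to see these rules in action is to ask the compiler directly. The sketch below uses two empty generic functions as compile-time probes. The types Plain and RawHandle are hypothetical, and the unsafe impl is sound only under the assumption stated in its comment.

```rust
use std::thread;

// Compile-time probes: a call compiles only if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// All fields are `Send + Sync`, so the struct is too, automatically.
#[allow(dead_code)]
struct Plain {
    id: u32,
    name: String,
}

/// A raw pointer field suppresses both auto-traits for the whole struct.
#[allow(dead_code)]
struct RawHandle {
    ptr: *mut u32,
}

// The author opts back in and takes responsibility for soundness.
// Assumption for this sketch: the pointer is never dereferenced without
// external synchronization.
unsafe impl Send for RawHandle {}

fn check_traits() {
    assert_send::<Plain>();
    assert_sync::<Plain>();
    assert_send::<RawHandle>();
    // assert_sync::<RawHandle>();        // would not compile: no `Sync` impl
    // assert_send::<std::rc::Rc<u32>>(); // would not compile: `Rc` is not `Send`
    assert_send::<&mut u32>(); // `&mut T: Send` because `u32: Send`
    assert_send::<&u32>();     // `&T: Send` because `u32: Sync`
}

/// Sharing `&Vec<i32>` with two threads is allowed because `Vec<i32>: Sync`.
fn share_across_threads() -> usize {
    let data = vec![1, 2, 3];
    thread::scope(|scope| {
        let a = scope.spawn(|| data.len());
        let b = scope.spawn(|| data.iter().count());
        a.join().unwrap() + b.join().unwrap()
    })
}
```

    Uncommenting either of the two rejected probes produces a compile error naming the missing trait, which is the same mechanism that protects code using Rayon.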

    The Rustonomicon has a detailed chapter on Send and Sync.

    Function traits

    There are three traits related to the ways that functions can be called: FnOnce, FnMut, and Fn. Plain functions automatically implement all three traits, but closures implement the traits depending on how they use their captured state. If a closure would move or consume any part of its state, it implements only FnOnce, which is called with self by value, because it wouldn’t have state remaining to move or consume a second time. If a closure modifies its state, it implements FnMut, which is called with &mut self, and it can also be called as FnOnce. If a closure just reads its state, or has no state at all like a plain function, it implements Fn, which is called with &self, as well as FnMut and FnOnce.
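    The three cases can be shown side by side. In this sketch the helper functions call_once, call_twice_mut, and call_twice are hypothetical. Each one demands a different trait, and each closure is written so that it fits the trait named in its comment.

```rust
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn call_twice_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f()
}

fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn closure_kinds() -> (String, u32, usize) {
    let greeting = String::from("hello");
    // Moves `greeting` out of its captured state: FnOnce only.
    let consume = move || greeting;

    let mut counter = 0u32;
    // Mutates its captured state: FnMut and FnOnce, but not Fn.
    let increment = || {
        counter += 1;
        counter
    };

    let words = vec!["a", "b", "c"];
    // Only reads its captured state: Fn, FnMut, and FnOnce.
    let read = || words.len();

    (call_once(consume), call_twice_mut(increment), call_twice(read))
}
```

    Calling closure_kinds() returns ("hello", 2, 6). Passing increment to call_twice instead would fail to compile, because a closure that mutates its captures does not implement Fn.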

    The blog post Closures: Magic Functions goes into detail about this implementation.

    Generic constraints in Rayon

    With these powerful tools in the Rust language, Rayon only has to specify its constraints. The parallel iterators and their items have to implement Send, simply because they will be sent between threads. Iterator methods such as filter, map, and for_each have a few more constraints on their callback function/closure F:

    • It must implement Send so you can send it to the thread pool.
    • It must implement Sync so you can share &F references to that callback across multiple threads.
    • It must implement Fn so it can be called from any of those threads with a shared reference.

    Thus Rayon requires F: Send + Sync + Fn.
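    To show how far these bounds alone can carry a library, here is a toy stand-in written with only the standard library. It is not Rayon's implementation, which uses a work-stealing thread pool. The for_each_parallel function is a hypothetical name, and it simply splits the slice in two. Only the bounds on F mirror the ones listed above.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

/// Toy parallel `for_each` over a mutable slice, using two scoped threads.
/// `F: Send` is kept to mirror the bounds above, although this sketch only
/// shares `&F` and never moves `F` itself to another thread.
fn for_each_parallel<T, F>(items: &mut [T], op: F)
where
    T: Send,                    // `&mut T` items cross thread boundaries
    F: Fn(&mut T) + Send + Sync, // callable through a shared `&F`
{
    let mid = items.len() / 2;
    let (left, right) = items.split_at_mut(mid);
    let op = &op; // one shared reference, copied into both threads
    thread::scope(|scope| {
        scope.spawn(move || left.iter_mut().for_each(op));
        scope.spawn(move || right.iter_mut().for_each(op));
    });
}

/// Doubles every value in place and returns the sum of the doubled values.
fn sum_doubled(values: &mut [i32]) -> i32 {
    let total = AtomicI32::new(0);
    for_each_parallel(values, |v| {
        *v *= 2;
        // Updating an atomic needs only `&total`, so the closure stays `Fn`.
        total.fetch_add(*v, Ordering::Relaxed);
    });
    // With `let mut total = 0` and `total += *v`, the closure would be
    // `FnMut` and the call above would be rejected by the compiler.
    total.into_inner()
}
```

    With a vector holding 1, 2, 3, 4, sum_doubled returns 20 and leaves the vector as 2, 4, 6, 8. The function body never inspects the closure. The where clause is the whole contract, and the compiler does the checking at every call site.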

    Let's look again at the example that the compiler would reject:

    let mut total = 0;
    foo_vector.par_iter_mut()
        .filter(|foo| foo.is_interesting())
        .for_each(|foo| total += foo.heavy_computation());

    We'll take for granted that the type Foo in this vector implements Send, so it's also perfectly fine to send &mut Foo references among threads as the parallel iterator items. The for_each closure would capture a mutable reference to the accumulated total variable. Assuming that the number has a simple type such as i32, it would be acceptable to Send the closure with &mut i32 to another thread. It would even be fine with Sync to share references to that closure between threads. However, the mutation would make it FnMut, requiring &mut self to actually call it. The error from the compiler should now make sense:

    cannot assign to `total`, as it is a captured variable in a `Fn` closure
    

    Because Rayon requires Fn here, that gets locked in by the compiler, and you're not allowed to do anything that would make it FnMut. If you change the total to a type such as AtomicI32, which does allow updates with a shared reference, the code will compile and work just fine:

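    use std::sync::atomic::{AtomicI32, Ordering};
    let total = AtomicI32::new(0); // no `mut`: atomics update through &self
    foo_vector.par_iter_mut()
        .filter(|foo| foo.is_interesting())
        // for_each expects a closure returning (), so discard fetch_add's result
        .for_each(|foo| { total.fetch_add(foo.heavy_computation(), Ordering::Relaxed); });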

    This is the impact of Rust’s fearless concurrency (or parallelism): not that you will never write bugs in threaded code, but that the compiler will catch data races like this one before they can hurt you. Rayon can make it really easy to dip your toes into parallelism, feeling almost magical, but in truth Rayon doesn’t know anything about your code: it just specifies simple constraints and lets the Rust compiler do the hard work of proving it.

    For more articles about Rust, please visit Red Hat's topic page.

    Last updated: February 5, 2024
