Improve multicore scaling in Open vSwitch DPDK

November 19, 2021
Kevin Traynor

    A new feature in version 2.16 of Open vSwitch (OVS) helps developers scale the OVS-DPDK userspace datapath to use multiple cores. The Data Plane Development Kit (DPDK) is a popular set of networking libraries and drivers that provide fast packet processing and I/O.

    After reading this article, you will understand the new group assignment type for spreading the datapath workload across multiple cores, how this type differs from the default cycles assignment type, and how to use the new type in conjunction with the auto load balance feature in the poll mode driver (PMD) to improve OVS-DPDK scaling.

    How PMD threads manage packets in the userspace datapath

    In the OVS-DPDK userspace datapath, receive queues (RxQs) store packets from an interface that need to be received, processed, and usually transmitted to another interface. This work is done in OVS-DPDK by PMD threads that run on a set of dedicated cores and that continually poll the RxQs for packets. In OVS-DPDK, these datapath cores are commonly referred to just as PMDs.
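
    The set of cores that PMD threads run on is chosen with the other_config:pmd-cpu-mask option. As a minimal sketch (the mask value is illustrative), the following reserves cores 8 and 10, matching the PMDs used in the examples later in this article, since 0x500 has bits 8 and 10 set:

    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x500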

    When there's more than one PMD, the workload should ideally be spread equally across them all. This prevents packet loss in cases where some PMDs may be overloaded while others have no work to do. In order to spread the workload across the PMDs, the interface RxQs that provide the packets need to be carefully assigned to the PMDs.

    The user can manually assign individual RxQs to PMDs with the other_config:pmd-rxq-affinity option. By default, OVS-DPDK also automatically assigns them. In this article, we focus on OVS-DPDK's process for automatically assigning RxQs to PMDs.
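
    Manual pinning takes a list of <rxq-id>:<core-id> pairs set on the interface. As a hedged sketch (the port name and core ID are illustrative), this pins queue 0 of dpdk0 to core 8:

    $ ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:8"

    Note that a PMD with a pinned RxQ is typically marked isolated and is then skipped by automatic assignment, so manual pinning and automatic assignment are best not mixed casually.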

    OVS-DPDK automatic assignment

    RxQs can be automatically assigned to PMDs when there is a reconfiguration, such as the addition or removal of either RxQs or PMDs. Automatic assignment also occurs if triggered by the PMD auto load balance feature or the ovs-appctl dpif-netdev/pmd-rxq-rebalance command.

    The default cycles assignment type assigns the RxQs requiring the most processing cycles to different PMDs. However, the assignment also places the same or similar number of RxQs on each PMD.

    The cycles assignment type is a trade-off between optimizing for the current workload and having the RxQs spread out across PMDs to mitigate against workload changes. The default type is designed this way because, when it was introduced in OVS 2.9, there was no PMD auto load balance feature to deal with workload changes.

    The role of PMD auto load balance

    PMD auto load balance is an OVS-DPDK feature that dynamically detects an imbalance in how the workload is spread across the PMDs. If it estimates that the workload can and should be spread more evenly, it triggers an RxQ-to-PMD reassignment. The reassignment, and the ability to rebalance the workload evenly among PMDs, depends on the RxQ-to-PMD assignment type.

    PMD auto load balance is discussed in more detail in another article.
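
    The feature is disabled by default. As a minimal sketch (the interval value is illustrative), it can be enabled, optionally with a minimum number of minutes between two consecutive reassignments:

    $ ovs-vsctl set Open_vSwitch . other_config:pmd-auto-lb="true"
    $ ovs-vsctl set Open_vSwitch . other_config:pmd-auto-lb-rebal-interval="5"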

    The group RxQ-to-PMD assignment type

    In OVS 2.16, the cycles assignment type is still the default, but a more optimized group assignment type was added.

    The main difference between these assignment types is that the group assignment type removes the trade-off of keeping similar numbers of RxQs on each PMD. Instead, it spreads the workload purely based on finding the best current balance of the workload across the PMDs. This improved optimization is feasible now because the PMD auto load balance feature is available to deal with possible workload changes.

    The group assignment type also scales better, because it recomputes the estimated workload on each PMD before every RxQ assignment.

    The increased optimization can mean a more equally distributed workload and hence more equally distributed available capacity across the PMDs. This improvement, along with PMD auto load balance, can mitigate against changes in workload caused by changes in traffic profiles.

    An RxQ-to-PMD assignment comparison

    We can see some of the differing characteristics of cycles and group with an example. If we run OVS 2.16 with a couple of RxQs and PMDs, we can check the log messages to confirm that the default cycles assignment type is used for assigning RxQs to PMDs:

    |dpif_netdev|INFO|Performing pmd to rx queue assignment using cycles algorithm.

    Then we can take a look at the current RxQ PMD assignments and RxQ workload usage:

    $ ovs-appctl dpif-netdev/pmd-rxq-show
    pmd thread numa_id 0 core_id 8:
      isolated : false
      port: vhost1           queue-id:  0 (enabled)   pmd usage: 20 %
      port: dpdk0            queue-id:  0 (enabled)   pmd usage: 70 %
      overhead:  0 %
    pmd thread numa_id 0 core_id 10:
      isolated : false
      port: vhost0           queue-id:  0 (enabled)   pmd usage: 20 %
      port: dpdk1            queue-id:  0 (enabled)   pmd usage: 30 %
      overhead:  0 %

    The workload is visualized in Figure 1.

    Figure 1. cycles RxQ-to-PMD assignment.

    The display shows that the cycles assignment type has done a good job keeping the two RxQs that require the most cycles (dpdk0 70% and dpdk1 30%) on different PMDs. Otherwise, one PMD would be at 100% and Rx packets might be dropped as a result.

    The display also shows that the assignment insists on both PMDs having an equal number of RxQs, two each. This means that PMD 8 is 90% loaded while PMD 10 is 50% loaded.

    That is not a problem with the current traffic profile, because both PMDs have enough processing cycles to handle the load. However, it does mean that PMD 8 has available capacity of only 10% to account for any traffic profile changes that require more processing. If, for example, the dpdk0 traffic profile changed and the required workload increased by more than 10%, PMD 8 would be overloaded and packets would be dropped.

    Now we can look at how the group assignment type optimizes for this kind of scenario. First we enable the group assignment type:

    $ ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group

    The logs confirm that it is selected and immediately put to use:

    |dpif_netdev|INFO|Rxq to PMD assignment mode changed to: `group`.

    As mentioned earlier, the group assignment type eliminates the requirement of keeping the same number of RxQs per PMD, and bases its assignments on estimates of the least loaded PMD before every RxQ assignment. We can see how this policy affects the assignments:

    $ ovs-appctl dpif-netdev/pmd-rxq-show
    pmd thread numa_id 0 core_id 8:
      isolated : false
      port: dpdk0            queue-id:  0 (enabled)   pmd usage: 70 %
      overhead:  0 %
    pmd thread numa_id 0 core_id 10:
      isolated : false
      port: vhost0           queue-id:  0 (enabled)   pmd usage: 20 %
      port: dpdk1            queue-id:  0 (enabled)   pmd usage: 30 %
      port: vhost1           queue-id:  0 (enabled)   pmd usage: 20 %
      overhead:  0 %

    The workload is visualized in Figure 2.

    Figure 2. group RxQ-to-PMD assignment type.

    Now PMD 8 and PMD 10 both have total loads of 70%, so the workload is better balanced between the PMDs.

    In this case, if the dpdk0 traffic profile changes and the required workload increases by 10%, it could be handled by PMD 8 without any packet drops because there is 30% available capacity.

    An interesting case is where RxQs are new or have no measured workload. If they were all put on the least loaded PMD, that PMD's estimated workload would not change; it would keep being selected as the least loaded PMD and be assigned all the new RxQs. This might not be ideal if those RxQs later became active, so instead the group assignment type spreads RxQs with no measured history across the PMDs.

    This example shows a change from the cycles to the group assignment type during operation. Although that can be done, the assignment type is typically set when initializing OVS-DPDK, as shown in the sketch after this list. Reassignments can then be triggered by any of the following mechanisms:

    • PMD auto load balance (providing that user-defined thresholds are met)
    • A change in configuration (adding or removing RxQs or PMDs)
    • The ovs-appctl dpif-netdev/pmd-rxq-rebalance command
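
    As a minimal sketch of the typical flow, the assignment type can be set before the daemon starts (hence --no-wait), and a reassignment can be forced manually at any time afterward:

    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-rxq-assign=group
    $ ovs-appctl dpif-netdev/pmd-rxq-rebalance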

    Other RxQ considerations

    All of the OVS-DPDK assignment types are constrained by the granularity of the workload on each RxQ. In the example in the previous section, it was possible to spread the workload evenly. In a case where dpdk0 was 95% loaded instead, PMD 8 would have a 95% load, while PMD 10 would have a 70% load.

    If you expect an interface to have a high traffic rate and hence a high required load, it is worth considering the addition of more RxQs in order to help split the traffic for that interface. More RxQs mean a greater granularity to help OVS-DPDK spread the workload more evenly across the PMDs.
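
    For a DPDK physical port, the number of RxQs can be increased with the n_rxq option (the port name and queue count here are illustrative); the NIC then distributes incoming traffic across the queues, typically via RSS:

    $ ovs-vsctl set Interface dpdk0 options:n_rxq=4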

    Conclusion

    This article looked at the new group assignment type from OVS 2.16 for RxQ-to-PMD assignments.

    Although the existing cycles assignment type might be good enough in many cases, the new group assignment type allows OVS-DPDK to more evenly distribute the workload across the available PMDs.

    This dynamic assignment has the benefit of allowing more optimal use of PMDs and providing a more equally distributed available capacity across PMDs, which in turn can make them more resilient against workload changes. For larger changes in workload, the PMD auto load balance feature can trigger reassignments.

    OVS 2.16 keeps the same defaults as OVS 2.15, so users for whom OVS 2.15 multicore scaling is good enough will get the same behavior by default after an upgrade. However, the new option is available if required.

    Further information about OVS-DPDK PMDs can be found in the documentation.

    Last updated: October 6, 2022

