
The traditional CPU-centric system architecture is being replaced by designs in which systems are aggregated from independently intelligent devices. These devices have their own compute capabilities and can natively run network functions with an accelerated data plane. The new model allows us to offload to accelerators not only individual subroutines but whole software subsystems, such as networking or storage, with cloud-like security isolation and architectural compartmentalization.

One of the most prominent examples of this new architecture is the data processing unit (DPU). DPUs offer a complete compute system with an independent software stack, network identity, and provisioning capabilities. The DPU can host its own applications using either embedded or orchestrated deployment models.

The unique capabilities of the DPU allow key infrastructure functions and their associated software stacks to be completely removed from the host node’s CPU cores and relocated onto the DPU. For instance, the DPU could host the management plane of the network functions and part of the control plane, while the data plane could be accelerated by dedicated Arm cores, ASICs, GPUs, or FPGA IPs. Because DPUs can run independent software stacks locally, multiple network functions could run simultaneously on the same device, with service chaining and shared accelerators providing generic in-line processing.

OVN/OVS offloading on NVIDIA BlueField-2 DPUs

Red Hat has collaborated with NVIDIA to extend the operational simplicity and hybrid cloud architecture of Red Hat OpenShift to NVIDIA BlueField-2 DPUs. Red Hat OpenShift 4.10 provides BlueField-2 OVN/OVS offload as a developer preview.

Installing OpenShift Container Platform on a DPU makes it possible to offload packet processing from the host CPUs to the DPU. Offloading resource-intensive tasks like packet processing from the server’s CPU to the DPU can free up cycles on the OpenShift worker nodes to run more user applications. OpenShift brings portability, scalability, and orchestration to DPU workloads, giving you the ability to use standard Kubernetes APIs along with consistent system management interfaces for both the host systems and DPUs.  

In short, utilizing OpenShift with DPUs lets you get the benefits of DPUs without sacrificing the hybrid cloud experience or adding unnecessary complexity to managing IT infrastructure.

To manage DPUs, Red Hat OpenShift replaces the native BlueField operating system (OS) on each DPU and is deployed using a two-cluster design that consists of:

  • Tenant cluster running on the host servers (x86)
  • Infrastructure cluster running on DPUs (Arm)

The architecture is illustrated in Figure 1.

Figure 1: The two-cluster design.

In this architecture, DPUs are provisioned as worker nodes of the Arm-based OpenShift infrastructure cluster. This is the blue cluster in Figure 1. The tenant OpenShift cluster, composed of the x86 servers, is where user applications typically run. This is the green cluster. In this deployment, each physical server runs both a tenant node on the x86 cores and an infrastructure node on the DPU Arm cores.

This architecture allows you to minimize the attack surface by decoupling the workload from the management cluster.

This architecture also streamlines operations by decoupling the application workload from the underlying infrastructure. That allows IT Ops to deploy and maintain the platform software and accelerated infrastructure while DevOps deploys and maintains application workloads independently from the infrastructure layer.

Red Hat OpenShift 4.10 provides capabilities for offloading Open Virtual Network (OVN) and Open vSwitch (OVS) services, which typically run on servers, from the host CPU to the DPU. We are offering this functionality as a developer preview and enabling the following components to support OVN/OVS hardware offload to NVIDIA BlueField-2 DPUs:

  • DPU Network Operator: This component is used with the infrastructure cluster to facilitate OVN deployment.
  • DPU mode for OVN Kubernetes: This component is assigned by the cluster network operator for the tenant cluster.
  • SR-IOV network operator: This component discovers compatible network devices, such as the ConnectX-6 Dx NIC embedded inside the BlueField-2 DPU, and provisions them for SR-IOV access by pods on that server.
  • ConnectX NIC Fast Data Path on Arm
  • Kernel flow offloading (TC Flower)
  • Experimental use of OpenShift Assisted Installer and BlueField-2 BMC

The combination of these components allows us to move ovn-kube-node services from the x86 host to the BlueField-2 DPU.
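
For example, once the SR-IOV network operator has been installed, you can check what it discovered by listing the SriovNetworkNodeState objects. This is only a sketch: it assumes the operator is deployed in its default openshift-sriov-network-operator namespace, and the node name is a placeholder.

# List the per-node device inventory built by the SR-IOV network operator
oc get sriovnetworknodestates -n openshift-sriov-network-operator

# Inspect the interfaces discovered on one DPU host (node name is a placeholder)
oc get sriovnetworknodestate <dpu-host-node> -n openshift-sriov-network-operator -o yaml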

The network flows are offloaded in this manner (see Figure 2):

  • The ovn-k8s components are moved from the x86 host to the DPU (ovn-kube, vswitchd, ovsdb).
  • The Open vSwitch data path is offloaded from the BlueField Arm CPU to the ConnectX-6 Dx ASIC.
Figure 2: Illustrating the network flows.
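
To confirm that the offload path is active on a DPU, one option is to check the Open vSwitch configuration directly on the BlueField-2 node. This is a sketch, not part of the deployment procedure; it assumes you have a shell on the DPU node (for example, via oc debug node/<name>) and that the ovs-vsctl utility is reachable from it.

# Hardware offload should be enabled in the OVS database on the DPU
ovs-vsctl get Open_vSwitch . other_config:hw-offload

# Show the bridges and ports managed by ovn-k8s on the DPU
ovs-vsctl show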

The following Open vSwitch datapath flows managed by ovn-k8s are offloaded to a BlueField-2 DPU that is running OpenShift 4.10:

  • Pod to pod (east-west)
  • Pod to clusterIP service backed by a regular pod on a different node (east-west)
  • Pod to external (north-south)
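
A quick way to verify that these flows really land in hardware is to dump the datapath flows on the DPU and keep only the offloaded entries. This is again a sketch, assuming a shell on the BlueField-2 node; the representor interface name is a placeholder.

# Show only datapath flows that have been offloaded to the ConnectX-6 Dx
ovs-appctl dpctl/dump-flows type=offloaded

# Cross-check the TC Flower rules and their packet counters on a representor port
tc -s filter show dev <representor> ingress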

Let’s take a more detailed look at the testbed setup.

Install and configure an accelerated infrastructure with OpenShift and NVIDIA BlueField-2 DPUs

In our sample testbed shown in Figure 3, we are using two x86 hosts with BlueField-2 DPU PCIe cards installed. Each DPU has eight Arm cores, two 25GbE network ports, and a 1GbE management port. We’ve wired the 25GbE ports to a switch and also connected the BMC port of each DPU to a separate network so we can manage the device with IPMI.

Figure 3: The sample testbed.
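
Because each DPU BMC sits on its own management network, standard IPMI tooling can be used to control the card out of band. The commands below are a sketch; the BMC address, user, and password are placeholders for our lab values.

# Check the power state of the BlueField-2 through its BMC
ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power status

# Request a PXE boot on the next reset, then power-cycle the card
ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis bootdev pxe
ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power cycle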

Next, we installed Red Hat OpenShift Container Platform 4.10 on the DPUs to offload packet processing from the host x86 CPUs to the DPUs. Offloading resource-intensive computational tasks such as packet processing from the server’s CPU to the DPU frees up cycles on the OpenShift Container Platform worker nodes to run more applications or to run the same number of applications more quickly.

OpenShift and DPU deployment architecture

In our setup, OpenShift replaces the native BlueField OS. We used the two-cluster architecture in which DPU cards are provisioned as worker nodes in the Arm-based infrastructure cluster. The tenant cluster, composed of x86 servers, was used to run user applications.

We followed these steps to deploy tenant and infrastructure clusters:

  1. Install the OpenShift Assisted Installer on the installer node using Podman.
  2. Install the infrastructure cluster.
  3. Install the tenant cluster.
  4. Install the DPU Network Operator on the infrastructure cluster.
  5. Configure the SR-IOV Operator for DPUs (see the policy sketch after this list).
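
For step 5, the SR-IOV configuration is expressed as an SriovNetworkNodePolicy on the tenant cluster. The manifest below is only a sketch under stated assumptions: the policy name, resource name, node selector, and VF count are placeholders, and the vendor ID simply matches NVIDIA/Mellanox devices; refer to the documentation for the exact values used with BlueField-2.

cat <<'EOF' | oc apply -f -
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: bf2-netdevice                          # placeholder policy name
  namespace: openshift-sriov-network-operator
spec:
  resourceName: bf2_vfs                        # placeholder resource name exposed to pods
  nodeSelector:
    node-role.kubernetes.io/dpu-host: ""       # target the x86 hosts that carry a DPU
  nicSelector:
    vendor: "15b3"                             # NVIDIA/Mellanox PCI vendor ID
  numVfs: 8                                    # placeholder VF count
  deviceType: netdevice
  eSwitchMode: switchdev                       # switchdev mode enables TC Flower offload
EOF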

For details about how to install OpenShift 4.10 on BlueField-2 with OVN/OVS hardware offload, refer to the documentation.

When the cluster is deployed on the BlueField-2, the OVN configuration is automated with the DPU Network Operator; this operator can be installed from the OpenShift console or with the oc command.
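
With the oc command, installing the operator comes down to creating a Subscription through Operator Lifecycle Manager. A minimal sketch, assuming the package is published as dpu-network-operator and that the namespace, channel, and catalog source names below match the catalog entry on your cluster:

cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-dpu-network-operator        # assumed namespace
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: dpu-network-operator
  namespace: openshift-dpu-network-operator
spec:
  targetNamespaces:
  - openshift-dpu-network-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: dpu-network-operator
  namespace: openshift-dpu-network-operator
spec:
  channel: stable                              # assumed channel name
  name: dpu-network-operator                   # assumed package name
  source: redhat-operators                     # assumed catalog source
  sourceNamespace: openshift-marketplace
EOF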

As shown in Figure 4, the DPU Network Operator is available in the catalog for Arm-based OpenShift clusters and is not visible in x86 OpenShift clusters.

Figure 4: The DPU Network Operator shown in the OperatorHub catalog.

Validate installation using the OpenShift console

When you have completed the last step, you have two OpenShift clusters running. We will run some console checks to verify the configuration and then benchmark the OVN/OVS offloading.

Tenant cluster: We can list the nodes of the tenant cluster:

[egallen@bastion-tenant ~]$ oc get nodes
NAME                     STATUS   ROLES             AGE   VERSION
tenant-master-0          Ready    master            47d   v1.23.3+759c22b
tenant-master-1          Ready    master            47d   v1.23.3+759c22b
tenant-master-2          Ready    master            47d   v1.23.3+759c22b
tenant-worker-0          Ready    worker            47d   v1.23.3+759c22b
tenant-worker-1          Ready    worker            47d   v1.23.3+759c22b
x86-worker-advnetlab13   Ready    dpu-host,worker   35d   v1.23.3+759c22b
x86-worker-advnetlab14   Ready    dpu-host,worker   41d   v1.23.3+759c22b

Infrastructure cluster: We can list the nodes of the infrastructure cluster, including the BlueField-2 nodes that belong to the dpu-worker machine config pool:

[egallen@bastion-infrastructure ~]$ oc get nodes
NAME       STATUS   ROLES               AGE   VERSION
bf-13      Ready    dpu-worker,worker   71d   v1.22.1+6859754
bf-14      Ready    dpu-worker,worker   74d   v1.22.1+6859754
master-0   Ready    master              75d   v1.22.1+6859754
master-1   Ready    master              75d   v1.22.1+6859754
master-2   Ready    master              75d   v1.22.1+6859754
worker-0   Ready    worker              70d   v1.22.1+6859754
worker-1   Ready    worker              70d   v1.22.1+6859754

Tenant cluster: We can list the pods running in the openshift-ovn-kubernetes namespace in the tenant cluster:

[egallen@bastion-tenant ~]$ oc get pods -n openshift-ovn-kubernetes 
NAME                          READY   STATUS    RESTARTS       AGE
ovnkube-master-99x8j          4/6     Running   7              25d
ovnkube-master-qdfvv          6/6     Running   14 (15d ago)   25d
ovnkube-master-w28mh          6/6     Running   7 (15d ago)    25d
ovnkube-node-5xlxr            5/5     Running   5              34d
ovnkube-node-6nkm5            5/5     Running   5              34d
ovnkube-node-dpu-host-45crl   3/3     Running   50             34d
ovnkube-node-dpu-host-r8tlj   3/3     Running   30             28d
ovnkube-node-f2x2q            5/5     Running   0              34d
ovnkube-node-j6w6t            5/5     Running   5              34d
ovnkube-node-qtc6f            5/5     Running   0              34d

[egallen@bastion-tenant ~]$ oc get pods -n openshift-ovn-kubernetes -o wide
NAME                          READY   STATUS    RESTARTS       AGE   IP                NODE                     NOMINATED NODE   READINESS GATES
ovnkube-master-99x8j          4/6     Running   7              25d   192.168.111.122   tenant-master-2          <none>           <none>
ovnkube-master-qdfvv          6/6     Running   14 (15d ago)   25d   192.168.111.121   tenant-master-1          <none>           <none>
ovnkube-master-w28mh          6/6     Running   7 (15d ago)    25d   192.168.111.120   tenant-master-0          <none>           <none>
ovnkube-node-5xlxr            5/5     Running   5              34d   192.168.111.121   tenant-master-1          <none>           <none>
ovnkube-node-6nkm5            5/5     Running   5              34d   192.168.111.122   tenant-master-2          <none>           <none>
ovnkube-node-dpu-host-45crl   3/3     Running   50             34d   192.168.111.113   x86-worker-advnetlab13   <none>           <none>
ovnkube-node-dpu-host-r8tlj   3/3     Running   30             28d   192.168.111.114   x86-worker-advnetlab14   <none>           <none>
ovnkube-node-f2x2q            5/5     Running   0              34d   192.168.111.123   tenant-worker-0          <none>           <none>
ovnkube-node-j6w6t            5/5     Running   5              34d   192.168.111.120   tenant-master-0          <none>           <none>
ovnkube-node-qtc6f            5/5     Running   0              34d   192.168.111.124   tenant-worker-1          <none>           <none>

Infrastructure cluster: The DaemonSet node selector and the nodes share the same dpu-worker label, so the ovnkube-node host-network pods are scheduled on the BlueField-2 nodes:

[egallen@bastion-infrastructure ~]$ oc get ds -n default --show-labels
NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                     AGE   LABELS
ovnkube-node   2         2         2       2            2           beta.kubernetes.io/os=linux,node-role.kubernetes.io/dpu-worker=   65d   <none>

[egallen@bastion-infrastructure ~]$ oc get nodes --show-labels | grep dpu-worker
bf-13      Ready    dpu-worker,worker   71d   v1.22.1+6859754   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=bf-13,kubernetes.io/os=linux,network.operator.openshift.io/dpu=,node-role.kubernetes.io/dpu-worker=,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos
bf-14      Ready    dpu-worker,worker   74d   v1.22.1+6859754   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=bf-14,kubernetes.io/os=linux,network.operator.openshift.io/dpu=,node-role.kubernetes.io/dpu-worker=,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos

Infrastructure cluster: We see the ovnkube-node DaemonSet pods running on the two BlueField-2 nodes of the infrastructure cluster:

[egallen@bastion-infrastructure ~]$ oc get pods -n default 
NAME                 READY   STATUS    RESTARTS         AGE
ovnkube-node-hshxs   2/2     Running   13 (6d13h ago)   25d
ovnkube-node-mng24   2/2     Running   17 (42h ago)     25d


[egallen@bastion-infrastructure ~]$ oc get pods -n default -o wide
NAME                 READY   STATUS    RESTARTS         AGE   IP               NODE    NOMINATED NODE   READINESS GATES
ovnkube-node-hshxs   2/2     Running   13 (6d13h ago)   25d   192.168.111.28   bf-14   <none>           <none>
ovnkube-node-mng24   2/2     Running   17 (42h ago)     25d   192.168.111.29   bf-13   <none>           <none>

Infrastructure cluster: We get the IP address of the BlueField-2 node bf-14 (192.168.111.28) so we can SSH into it:

[egallen@bastion-infrastructure ~]$ oc get nodes -o wide
NAME       STATUS   ROLES               AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                       CONTAINER-RUNTIME
bf-13      Ready    dpu-worker,worker   71d   v1.22.1+6859754   192.168.111.29   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.40.1.el8_4.test.aarch64   cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
bf-14      Ready    dpu-worker,worker   74d   v1.22.1+6859754   192.168.111.28   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.40.1.el8_4.test.aarch64   cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
master-0   Ready    master              75d   v1.22.1+6859754   192.168.111.20   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.30.1.el8_4.aarch64        cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
master-1   Ready    master              75d   v1.22.1+6859754   192.168.111.21   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.30.1.el8_4.aarch64        cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
master-2   Ready    master              75d   v1.22.1+6859754   192.168.111.22   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.30.1.el8_4.aarch64        cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
worker-0   Ready    worker              70d   v1.22.1+6859754   192.168.111.23   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.30.1.el8_4.aarch64        cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8
worker-1   Ready    worker              70d   v1.22.1+6859754   192.168.111.24   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201071203-0 (Ootpa)   4.18.0-305.30.1.el8_4.aarch64        cri-o://1.23.0-98.rhaos4.10.git9b7f5ae.el8

Infrastructure cluster: We can SSH into one BlueField-2 node of the cluster to check the configuration:

[egallen@bastion-infrastructure ~]$ ssh core@192.168.111.28 
Red Hat Enterprise Linux CoreOS 410.84.202201071203-0
  Part of OpenShift 4.10, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.10/architecture/architecture-rhcos.html

---
Last login: Wed Mar 30 07:50:50 2022 from 192.168.111.1
[core@bf-14 ~]$ 

On one BlueField: We are running OpenShift 4.10 on the Arm cores:

[core@bf-14 ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux CoreOS release 4.10

[core@bf-14 ~]$ uname -m
aarch64

On one BlueField: The BlueField-2 has eight 64-bit Armv8 Cortex-A72 cores:

[core@bf-13 ~]$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           ARM
Model:               0
Model name:          Cortex-A72
Stepping:            r1p0
BogoMIPS:            400.00
L1d cache:           32K
L1i cache:           48K
L2 cache:            1024K
L3 cache:            6144K
NUMA node0 CPU(s):   0-7
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

On one BlueField: dmidecode is a tool for dumping a computer's DMI (also called SMBIOS) table contents. Running it on a Jetson gives you the error “No SMBIOS nor DMI entry point found, sorry.”, but the command works on the BlueField-2, where SMBIOS data is available through sysfs:

[core@bf-14 ~]$ sudo dmidecode | grep -A12 "BIOS Information"
BIOS Information
	Vendor: https://www.mellanox.com
	Version: BlueField:3.7.0-20-g98daf29
	Release Date: Jun 26 2021
	ROM Size: 4 MB
	Characteristics:
		PCI is supported
		BIOS is upgradeable
		Selectable boot is supported
		Serial services are supported (int 14h)
		ACPI is supported
		UEFI is supported
	BIOS Revision: 3.0

On one BlueField: We can get the SoC type with the lshw command:

[core@bf-14 ~]$ lshw -C system
bf-14                       
    description: Computer
    product: BlueField SoC (Unspecified System SKU)
    vendor: https://www.mellanox.com
    version: 1.0.0
    serial: Unspecified System Serial Number
    width: 64 bits
    capabilities: smbios-3.1.1 dmi-3.1.1 smp
    configuration: boot=normal family=BlueField sku=Unspecified System SKU uuid=888eecb3-cb1e-40c0-a925-562a7c62ed92
  *-pnp00:00
       product: PnP device PNP0c02
       physical id: 1
       capabilities: pnp
       configuration: driver=system

On one BlueField: We have 16GB of RAM on this BlueField-2:

[core@bf-14 ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       2.3Gi       6.9Gi       193Mi       6.7Gi        10Gi
Swap:            0B          0B          0B

Infrastructure cluster: We can get the OpenShift console URL (Figure 5) with an oc command:

[egallen@bastion-infrastructure ~]$ oc whoami --show-console
https://console-openshift-console.apps.bf2cluster.dev.metalkube.org

[egallen@bastion-infrastructure ~]$ host console-openshift-console.apps.bf2cluster.dev.metalkube.org
console-openshift-console.apps.bf2cluster.dev.metalkube.org has address 192.168.111.4
Figure 5: Getting information about nodes in the cluster.

Tenant cluster: In testpod-1, we run the iperf3 server:

[root@testpod-1 /]# taskset -c 6 iperf3 -s                                      
-----------------------------------------------------------                     
Server listening on 5201                                                        
-----------------------------------------------------------                     
Accepted connection from 10.130.4.132, port 59326                               
[  5] local 10.129.5.163 port 5201 connected to 10.130.4.132 port 59328         
[ ID] Interval           Transfer     Bitrate                                   
[  5]   0.00-1.00   sec  2.35 GBytes  20.2 Gbits/sec                            
[  5]   1.00-2.00   sec  2.46 GBytes  21.2 Gbits/sec                            
[  5]   2.00-3.00   sec  2.42 GBytes  20.8 Gbits/sec                            
[  5]   3.00-4.00   sec  2.24 GBytes  19.2 Gbits/sec                            
[  5]   4.00-5.00   sec  2.39 GBytes  20.5 Gbits/sec                            
[  5]   5.00-5.00   sec   384 KBytes  21.2 Gbits/sec                            
- - - - - - - - - - - - - - - - - - - - - - - - -                               
[ ID] Interval           Transfer     Bitrate                                   
[  5]   0.00-5.00   sec  11.9 GBytes  20.4 Gbits/sec                  receiver
-----------------------------------------------------------                     
Server listening on 5201                                                        
-----------------------------------------------------------
                                                                                

Tenant cluster: From testpod-2, we get 20 to 21.2 Gbps of throughput with TC Flower OVS hardware offloading, compared to about 3 Gbps without offloading:

[root@testpod-2 /]# taskset -c 6 iperf3 -c 10.129.5.163 -t 5
Connecting to host 10.129.5.163, port 5201
[  5] local 10.130.4.132 port 59328 connected to 10.129.5.163 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.35 GBytes  20.2 Gbits/sec   17   1.14 MBytes
[  5]   1.00-2.00   sec  2.46 GBytes  21.2 Gbits/sec    0   1.41 MBytes
[  5]   2.00-3.00   sec  2.42 GBytes  20.8 Gbits/sec  637   1.01 MBytes
[  5]   3.00-4.00   sec  2.24 GBytes  19.2 Gbits/sec    0   1.80 MBytes
[  5]   4.00-5.00   sec  2.39 GBytes  20.6 Gbits/sec    2   1.80 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec  11.9 GBytes  20.4 Gbits/sec  656             sender
[  5]   0.00-5.00   sec  11.9 GBytes  20.4 Gbits/sec                  receiver

iperf Done.
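
For reference, the two test pods are ordinary pods on the default cluster network, and the east-west offload described above applies to their traffic. A hypothetical sketch of such a pod follows, with placeholder names and image; depending on the configuration, the pod may also need to request a VF resource from the pool defined by the SR-IOV policy.

cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: testpod-1                              # placeholder name
spec:
  nodeSelector:
    kubernetes.io/hostname: x86-worker-advnetlab13   # pin to a DPU-equipped worker
  containers:
  - name: iperf
    image: registry.example.com/tools/iperf3:latest  # placeholder image
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "4"                               # placeholder resource request
        memory: 1Gi
EOF

Once both pods are running, oc rsh testpod-1 opens the shell used for the iperf3 runs shown above.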

Conclusion

This example demonstrates the benefits of composable compute architectures that include hardware-level security isolation and the ability to offload entire subsystems, such as networking, to DPU hardware. This enables clean architectural compartmentalization along the same boundaries as the corresponding software services, and it frees up x86 cores on the worker nodes so they can run more applications.

Unlike SmartNICs, which have been niche products with proprietary software stacks, DPUs running Linux, Kubernetes, and open source software stacks offer network function portability. And with Red Hat OpenShift, we are offering long-term support for this stack.

As DPUs gain more compute power, additional capabilities, and increased popularity in datacenters around the world, Red Hat and NVIDIA plan to continue work on enabling the offloading of additional software functions to DPUs.

Last updated: May 5, 2022