Open vSwitch: QinQ Performance

June 27, 2017
Eric Garver
Related topics:
Linux
Related products:
Red Hat Enterprise Linux

    In a previous post, we introduced QinQ support for Open vSwitch. This post investigates how QinQ performs relative to the alternatives (VXLAN and GENEVE) in both throughput and CPU utilization. The results should help explain why we might choose QinQ over VXLAN or GENEVE.

    We're going to look at the following tunnel types and configurations:

    1. VXLAN-SW
      • VXLAN in software only. No hardware offload.
    2. VXLAN-HW
      • VXLAN with hardware offload. This includes UDP tunnel segmentation offload and receive-side flow steering.
    3. GENEVE-SW
      • GENEVE in software only. No hardware offload.
    4. QinQ-SW
      • QinQ in software only. No hardware offload.
    5. QinQ-HW
      • QinQ with hardware offloads. This includes S-VLAN parsing. Unfortunately, the NIC used for testing does not support S-VLAN insert.

    Test Description

    All tests were run on the same hardware and software using the OVS kernel data path. The setup consists of two directly connected nodes. VLAN-tagged traffic is passed from a network namespace, through an OVS bridge, and finally to the physical network. This simulates containers or virtual machines communicating with each other on separate host nodes.

    Tests were divided into two categories: single stream and multiple streams. A single stream test uses a single instance of netperf and only a single tunnel. A multiple stream test uses multiple instances of netperf or iperf3 and multiple tunnels. For VXLAN and GENEVE, this means each host uses multiple outer IP addresses for the encapsulation. For QinQ, multiple outer VLAN IDs were used. Traffic from each namespace is mapped to a specific tunnel.

    Figure: Diagram of test setup (simplified)

    Test Configuration

    Specifically, the following hardware and software were used:

    • kernel: 4.12.0-rc2 (a3995460491d)
    • NIC: mlx4, ConnectX-3 Pro, MT27520 (firmware 2.40.5000)
    • CPU: Xeon E5-2690

    The OVS configuration was minimal, with only six ports. Port 1 is the tunnel interface toward the physical network. Ports 2-6 are Linux veth interfaces associated with the namespaces we use for traffic generation.

    The following are snippets of the flow rules for both VXLAN and GENEVE. Through the magic of OVS, they're identical. For northbound traffic (toward the physical network), the rules set the tunnel destination. For southbound traffic (toward the namespace), they match the tunnel on which the traffic was received and direct it toward the appropriate namespace.

    in_port=1,tun_dst=10.222.0.1 action=output:2
    in_port=2 action=set_field:10.222.0.6->tun_dst,output:1
    ...
    ...
    in_port=1,tun_dst=10.222.0.5 action=output:6
    in_port=6 action=set_field:10.222.0.10->tun_dst,output:1
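
    The rules above follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the elided rules for ports 3-5 continue the numbering visible in the first and last pairs (local tunnel addresses 10.222.0.1 to 10.222.0.5, remote addresses 10.222.0.6 to 10.222.0.10). The function name and its parameters are hypothetical and not part of OVS.

```python
def tunnel_flow_rules(first_port=2, last_port=6, uplink=1, subnet="10.222.0."):
    """Build the per-namespace tunnel flow rules as plain strings.

    Assumption: namespace port N uses the (N - first_port + 1)-th local
    address, and the peer's address is offset by the number of tunnels.
    """
    tunnels = last_port - first_port + 1
    rules = []
    for port in range(first_port, last_port + 1):
        index = port - first_port + 1
        local_ip = f"{subnet}{index}"             # tunnel address on this host
        remote_ip = f"{subnet}{index + tunnels}"  # tunnel address on the peer
        # Southbound: match the tunnel the packet arrived on.
        rules.append(f"in_port={uplink},tun_dst={local_ip} action=output:{port}")
        # Northbound: select the tunnel by setting its destination address.
        rules.append(
            f"in_port={port} action=set_field:{remote_ip}->tun_dst,output:{uplink}"
        )
    return rules


if __name__ == "__main__":
    for rule in tunnel_flow_rules():
        print(rule)
```

    Each printed line could then be handed to whatever tool installs flows on the bridge; that step is left out here because the bridge name is not given in this post.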

    The following are snippets of the flow rules for QinQ. For northbound traffic, an outer VLAN tag is added with TPID 0x88a8. For southbound traffic, the rules match the outer VLAN ID, remove the outer VLAN tag, and then direct the traffic toward the appropriate namespace.

    in_port=1,dl_vlan=1 action=pop_vlan,output:2
    in_port=2 action=push_vlan:0x88a8,mod_vlan_vid:1,output:1
    ...
    ...
    in_port=1,dl_vlan=5 action=pop_vlan,output:6
    in_port=6 action=push_vlan:0x88a8,mod_vlan_vid:5,output:1
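
    To make the effect of these actions concrete, here is a small Python sketch of what pushing and popping the outer tag does to the bytes of a frame. It is only an illustration of the 802.1ad tag layout, not how OVS implements the actions. The MAC addresses, the inner VLAN ID 100, and the helper names are hypothetical.

```python
import struct

TPID_8021AD = 0x88A8  # outer (S-VLAN) tag type, as in push_vlan:0x88a8
TPID_8021Q = 0x8100   # inner (C-VLAN) tag type carried by the namespace traffic


def vlan_tag(tpid, vid, pcp=0, dei=0):
    """Pack one 4-byte VLAN tag: a 16-bit TPID followed by a 16-bit TCI."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", tpid, tci)


def push_outer_tag(frame, outer_vid):
    """Insert an S-VLAN tag right after the destination and source MACs."""
    return frame[:12] + vlan_tag(TPID_8021AD, outer_vid) + frame[12:]


def pop_outer_tag(frame):
    """Remove the outermost tag; return (outer VLAN ID, remaining frame)."""
    _tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF, frame[:12] + frame[16:]


if __name__ == "__main__":
    # Hypothetical VLAN-tagged frame as it might leave a namespace.
    dst = bytes.fromhex("020000000002")
    src = bytes.fromhex("020000000001")
    inner = dst + src + vlan_tag(TPID_8021Q, 100) + struct.pack("!H", 0x0800) + b"payload"

    tagged = push_outer_tag(inner, outer_vid=1)  # northbound, port 2
    vid, restored = pop_outer_tag(tagged)        # southbound on the peer
    print(len(tagged) - len(inner), vid, restored == inner)
```

    The outer tag adds only four bytes in front of the existing inner tag, which is why the encapsulation cost is so small compared with an IP-based tunnel.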

    Test Results

    Our first results are from the single instance netperf test. This is a single netperf instance over a single tunnel. The benefits of QinQ and of VXLAN with hardware offload are apparent. QinQ achieves almost double the throughput of VXLAN-SW and GENEVE-SW.

    The second set of results again uses netperf, but this time with 5 instances of netperf and 5 tunnels. This utilizes more CPU cores and NIC receive queues. We see the same performance gap between tunnel types as with the single instance test, but it's even more pronounced, with VXLAN-HW and QinQ essentially saturating the 40 Gbps link. One thing to note is that the standard deviation is significant for VXLAN-SW and GENEVE-SW, whereas it's virtually non-existent for the others.

    This third test uses 5 instances of iperf3 with each instance having 128 streams. In general, the throughput here is better than with netperf because we're using a lot more sockets and therefore have a lot more buffering.

    Our final test measures CPU utilization. This is almost identical to the previous iperf3 test, but to make the comparison fair we target a specific data rate. In this case, it's 32 Mbps * 128 streams * 5 instances, which comes to roughly 20 Gbps (20,480 Mbps). The CPU utilization is collected on both nodes using mpstat from the sysstat package. All but 5 CPU cores (one per tunnel) are disabled for this test to give more digestible utilization percentages.
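
    The target rate is simple arithmetic, spelled out below as a short Python sketch. It assumes decimal units (1 Gbps = 1000 Mbps) and uses the 40 Gbps link speed mentioned earlier. The variable names are illustrative only.

```python
PER_STREAM_MBPS = 32        # target rate of each stream
STREAMS_PER_INSTANCE = 128  # streams in each iperf3 instance
INSTANCES = 5               # one iperf3 instance per tunnel
LINK_GBPS = 40              # speed of the link between the two nodes

total_mbps = PER_STREAM_MBPS * STREAMS_PER_INSTANCE * INSTANCES
total_gbps = total_mbps / 1000           # decimal units assumed
per_tunnel_gbps = total_gbps / INSTANCES
link_share = total_gbps / LINK_GBPS      # fraction of the link in use

print(f"aggregate target: {total_mbps} Mbps ({total_gbps:.2f} Gbps)")
print(f"per tunnel:       {per_tunnel_gbps:.3f} Gbps")
print(f"share of link:    {link_share:.1%}")
```

    Holding the aggregate rate at about half the link speed keeps every tunnel type below its throughput ceiling from the earlier tests, so the CPU numbers compare like for like.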

    Test Analysis

    Now, let's pick out some notable things from the graphs above.

    1. QinQ has great performance.
      Given what we learned in the overview post, this should not be all that surprising. QinQ's overhead is minimal, which means fewer cycles have to be spent on the encapsulation. Another thing to note is that packets for the IP-based tunnels have to go through the routing code of the kernel after they are encapsulated, whereas QinQ frames can be put directly on the egress interface's output queue.
    2. QinQ-SW CPU utilization is less than half of other software only tunnels.
      In the CPU utilization test, we see that utilization for QinQ-SW is less than half that of VXLAN-SW and GENEVE-SW. This saving translates into the host having more cycles to spend on other tasks.
    3. VXLAN offload is a huge help.
      This is very clear in the graphs. Hardware offload really helps VXLAN in both throughput and CPU utilization.
    4. Not much difference between QinQ-SW and QinQ-HW.
      The kernel is efficient at inserting and parsing VLAN tags. A quick analysis with perf shows inserting the VLAN into the packet data only accounts for 0.02% of the transmit overhead. This indicates that hardware S-VLAN insert and parse may help, but only by a small amount.
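
    The "minimal overhead" point in item 1 can be illustrated by counting the header bytes each encapsulation adds to a frame. The sketch below assumes an IPv4 outer header with no IP options and a GENEVE header with no options; the 1518-byte inner frame is a hypothetical example, not a figure from the tests. Byte count is only part of the picture, since the routing lookup mentioned above also costs CPU cycles.

```python
# Header sizes in bytes (IPv4 outer header, no IP or GENEVE options).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
GENEVE_BASE_HEADER = 8
S_VLAN_TAG = 4

ENCAP_OVERHEAD = {
    "VXLAN": OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER,
    "GENEVE": OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + GENEVE_BASE_HEADER,
    "QinQ": S_VLAN_TAG,
}


def wire_efficiency(encap, inner_frame_bytes):
    """Fraction of the encapsulated frame that is the original inner frame."""
    overhead = ENCAP_OVERHEAD[encap]
    return inner_frame_bytes / (inner_frame_bytes + overhead)


if __name__ == "__main__":
    inner = 1518  # hypothetical inner frame size in bytes
    for name, overhead in ENCAP_OVERHEAD.items():
        print(f"{name:7s} +{overhead:2d} bytes  {wire_efficiency(name, inner):.2%}")
```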

    Conclusion

    We found that OVS QinQ performs very well in both throughput and CPU utilization. As such, it may be a solid choice for your deployment. This is especially true if hardware offload is not available, as is the case for virtual machines and older hardware. It means you can move from one cloud provider or NIC to another and expect similar performance, without depending on hardware features provided by a subset of vendors.

    To get started with OVS QinQ, you can check out the previous post for a quick overview.



    Last updated: September 18, 2018
