
With the demand for higher bandwidth came the need to scale packet processing across more CPU resources. In Open vSwitch (OVS) using DPDK for faster I/O, this translated into using more receive and transmit queues so that more PMD (poll mode driver) threads can process packets. This adds complexity to a system that is not easy to understand in the first place. Support and operations people still want to know how much traffic is received and how it is distributed across the CPU resources. To help, this article describes the new statistics added for the user space datapath in OVS 2.17 and later.
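
For reference, this scaling is typically configured by dedicating more CPU cores to PMD threads and requesting more receive queues on a port. The commands below are a minimal sketch: the CPU mask (0x3c), the port name (dpdk0), and the queue count are placeholders to adapt to your setup.

# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x3c
# ovs-vsctl set Interface dpdk0 options:n_rxq=2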

Per queue statistics for DPDK ports

A first evolution, in OVS 2.17, consisted of exposing receive and transmit queue statistics for DPDK physical ports in ovsdb.

For example:

# ovs-vsctl get interface dpdk0 statistics | sed -e 's#[{}]##g' -e 's#, #\n#g' | grep packets= | grep -v '=0$'
rx_packets=5553474
rx_q0_packets=3705290
rx_q1_packets=1848184
tx_broadcast_packets=220
tx_multicast_packets=488
tx_packets=39406658924
tx_q1_packets=3700644
tx_q2_packets=97
tx_q3_packets=19696490438
tx_q4_packets=19706467745
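
To correlate those per queue counters with the CPU resources polling them, you can also look at the receive queue to PMD thread assignment with the existing ovs-appctl command below (its output format varies between OVS versions):

# ovs-appctl dpif-netdev/pmd-rxq-show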

Those per queue statistics require support from the DPDK driver backing the port.

The vast majority of physical (and even some virtual) NIC DPDK drivers do support those statistics. But if no per queue statistics appear in ovsdb, you can check for support by looking for the RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS (1 << 6) flag in the port dev_flags bitmask through the DPDK telemetry tool (shipped with the dpdk-tools RPM).

For example, check port 0:

# if [ $(($(echo /ethdev/info,0 | dpdk-telemetry.py -f /var/run/openvswitch/dpdk/rte | jq -r '.["/ethdev/info"]["dev_flags"]') & 64)) != 0 ]; then
  echo per queue stats are supported;
else
  echo per queue stats may not be implemented for this driver;
fi
per queue stats are supported

Per queue statistics for vhost-user ports

Getting the same level of information for vhost-user ports required some rework in the DPDK vhost-user library, because the library did not account for such information.

This was enhanced by the community in the DPDK v22.07 release with this change, and support was merged into OVS 3.1 with this change:

# ovs-vsctl get interface vhost0 statistics | sed -e 's#[{}]##g' -e 's#, #\n#g' | grep packets= | grep -v '=0$'
rx_65_to_127_packets=2987595
rx_packets=2987595
rx_q0_good_packets=2987595
rx_q0_size_65_127_packets=2987595
tx_65_to_127_packets=14075727
tx_packets=14075727
tx_q0_good_packets=14075727
tx_q0_size_65_127_packets=14075727
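
Because those counters require both a recent enough DPDK (v22.07 or later) and OVS (3.1 or later), it can be worth confirming which versions your installation runs. The following is a sketch assuming both columns are present in the Open_vSwitch table, which is the case in recent releases:

# ovs-vsctl get Open_vSwitch . ovs_version dpdk_version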

More vhost-user statistics

As a bonus of the work exposing per queue statistics, the vhost-user library started exposing other internal counters.

The virtio driver (e.g., the Linux kernel driver by default) plugged on a vhost-user port may require guest notifications to signal packet delivery. Triggering those notifications adds to the processing cost of such packets, which is why keeping track of the number of notifications is of interest.

Previously, OVS exposed a coverage counter for those notifications; until OVS 3.0, you could use the following:

# ovs-appctl coverage/show | grep vhost_notification
vhost_notification         0.0/sec     0.000/sec        2.0283/sec   total: 7302

However, this coverage counter only hinted that notifications were happening on some vhost-user ports, without identifying the virtual machine involved.

Starting with OVS 3.1, the coverage counter has been removed in favor of per queue and per port statistics (DPDK change / OVS change):

# ovs-vsctl get interface vhost0 statistics | sed -e 's#[{}]##g' -e 's#, #\n#g' | grep guest_notifications
rx_q0_guest_notifications=12
rx_q1_guest_notifications=1
tx_q0_guest_notifications=3
tx_q1_guest_notifications=2

This nice addition makes it possible to point directly at the virtual machine that is slowing down packet processing. Other vhost-user statistics have been added, such as counters exposing the vhost-user IOTLB cache internals. More may be added in the future as members of the community express new requirements.
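
For example, if your DPDK version exposes them, those IOTLB counters can be filtered with the same recipe as above (the exact counter names, such as iotlb_hits and iotlb_misses, depend on the DPDK version):

# ovs-vsctl get interface vhost0 statistics | sed -e 's#[{}]##g' -e 's#, #\n#g' | grep iotlb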

A final note about statistics

As OVS stores per interface statistics in its ovsdb, choices were made to select generic (i.e., not driver-specific) statistics, and that helps in a majority of use cases.

However, if you do not find the driver-specific statistics you are looking for, it is still possible, for debugging, to use the DPDK telemetry tool and retrieve all unfiltered port statistics as follows:

# echo /ethdev/xstats,0 | dpdk-telemetry.py -f /var/run/openvswitch/dpdk/rte
{
  "/ethdev/xstats": {
    "rx_good_packets": 5553474,
    "tx_good_packets": 39406658860,
    "rx_good_bytes": 710844672,
    "tx_good_bytes": 4886425719104,
    "rx_missed_errors": 78892319,
    "rx_errors": 0,
    "tx_errors": 0,
    "rx_mbuf_allocation_errors": 0,
    "rx_q0_packets": 3705290,
    "rx_q0_bytes": 474277120,
    "rx_q0_errors": 0,
    "rx_q1_packets": 1848184,
    "rx_q1_bytes": 236567552,
    "rx_q1_errors": 0,
    "tx_q0_packets": 0,
    "tx_q0_bytes": 0,
    "tx_q1_packets": 3700615,
    "tx_q1_bytes": 458874621,
    "tx_q2_packets": 71,
    "tx_q2_bytes": 8120,
    "tx_q3_packets": 19696490435,
    "tx_q3_bytes": 2442364825811,
    "tx_q4_packets": 19706467739,
    "tx_q4_bytes": 2443602010552,
    "rx_wqe_errors": 0,
    "rx_unicast_packets": 84445793,
    "rx_unicast_bytes": 10471278332,
    "tx_unicast_packets": 39406658216,
    "tx_unicast_bytes": 4886425618784,
    "rx_multicast_packets": 0,
    "rx_multicast_bytes": 0,
    "tx_multicast_packets": 444,
    "tx_multicast_bytes": 35320,
    "rx_broadcast_packets": 0,
    "rx_broadcast_bytes": 0,
    "tx_broadcast_packets": 200,
    "tx_broadcast_bytes": 65000,
    "tx_phy_packets": 39406658860,
    "rx_phy_packets": 84445793,
    "rx_phy_crc_errors": 0,
    "tx_phy_bytes": 5044052354544,
    "rx_phy_bytes": 10809061504,
    "rx_phy_in_range_len_errors": 0,
    "rx_phy_symbol_errors": 0,
    "rx_phy_discard_packets": 0,
    "tx_phy_discard_packets": 0,
    "tx_phy_errors": 0,
    "rx_out_of_buffer": 78892319,
    "tx_pp_missed_interrupt_errors": 0,
    "tx_pp_rearm_queue_errors": 0,
    "tx_pp_clock_queue_errors": 0,
    "tx_pp_timestamp_past_errors": 0,
    "tx_pp_timestamp_future_errors": 0,
    "tx_pp_jitter": 0,
    "tx_pp_wander": 0,
    "tx_pp_sync_lost": 0
  }
}
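
If you are unsure which port identifier to query, the /ethdev/list telemetry endpoint returns the list of port IDs known to the application, so you can pick the right one before requesting its extended statistics:

# echo /ethdev/list | dpdk-telemetry.py -f /var/run/openvswitch/dpdk/rte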