Troubleshooting FDB table wrapping in Open vSwitch

When most people deploy an Open vSwitch configuration for virtual networking using the NORMAL rule, that is, using L2 learning, they do not think about configuring the size of the Forwarding DataBase (FDB).

When hardware-based switches are used, the FDB size is generally rather large, and that large FDB size is a key selling point. However, for Open vSwitch, the default FDB size is rather small: in version 2.9 and earlier it is only 2K entries, and starting with version 2.10 it was increased to 8K entries. Note that in Open vSwitch, each bridge has its own FDB table, whose size is individually configurable.

This blog explains the effects of configuring too small an FDB table, how to identify which bridge is suffering from too small an FDB table, and how to configure the FDB table size appropriately.

Effects of too small an FDB table

When the FDB table is full and a new entry needs to be added, an older entry is removed to make room for the new one.¹ This is called FDB wrapping. If a packet is then received from a MAC address whose entry was removed, yet another entry is removed to make room, and the packet's source MAC address is re-added.

When more MAC addresses exist in the network than the configured FDB table can hold, and all of those MAC addresses are seen frequently, entries are constantly evicted and re-learned, causing a lot of ping/ponging in the table.

The more ping/ponging there is, the more CPU resources are needed to maintain the table. In addition, if traffic is received from evicted MAC addresses, the traffic is flooded out of all ports.

¹ The algorithm for removing older entries in Open vSwitch is as follows: on the specific bridge, the port with the most FDB entries is found and its oldest entry is removed.
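The eviction behavior described above can be illustrated with a small Python simulation. This is a toy sketch, not the actual ovs-vswitchd code; the tiny table size and the MAC addresses are made up to force wrapping:

```python
# Toy simulation of FDB eviction: when the table is full, pick the port
# with the most entries and drop that port's oldest entry.
from collections import OrderedDict

TABLE_SIZE = 4  # deliberately tiny to force wrapping

fdb = OrderedDict()  # mac -> port; insertion order approximates entry age
evictions = 0

def learn(mac, port):
    global evictions
    if mac in fdb:
        fdb.move_to_end(mac)  # refresh the entry's age
        fdb[mac] = port
        return
    if len(fdb) >= TABLE_SIZE:
        # Find the port holding the most entries...
        counts = {}
        for p in fdb.values():
            counts[p] = counts.get(p, 0) + 1
        busiest = max(counts, key=counts.get)
        # ...and evict its oldest entry.
        oldest = next(m for m, p in fdb.items() if p == busiest)
        del fdb[oldest]
        evictions += 1
    fdb[mac] = port

# Six distinct MACs through a 4-entry table: every MAC past the fourth
# forces an eviction.
for i in range(6):
    learn("00:00:00:00:00:%02x" % i, "port1")

print(len(fdb), evictions)  # -> 4 2
```

If the same six MACs keep sending traffic, each re-appearance of an evicted MAC evicts yet another entry, which is exactly the ping/pong effect described below.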


Open vSwitch–specific manifestations of too small an FDB table

In addition to the FDB table updates, Open vSwitch also has to clean up the flow table when an FDB entry is removed. This is done by the Open vSwitch revalidator threads. Because this flow table cleanup consumes quite a few CPU cycles, the first indication of an FDB table wrapping issue might be high revalidator thread utilization. The following example shows high revalidator thread utilization of around 83% (deduced by adding up the percentages in the %CPU column) on an idle system:

$ pidstat -t -p `pidof ovs-vswitchd` 1 | grep -E "UID|revalidator"
07:37:56 AM   UID      TGID       TID    %usr %system  %guest    %CPU   CPU  Command
07:37:57 AM   995         -    188565    5.00    5.00    0.00   10.00     2  |__revalidator110
07:37:57 AM   995         -    188566    6.00    4.00    0.00   10.00     2  |__revalidator111
07:37:57 AM   995         -    188567    6.00    5.00    0.00   11.00     2  |__revalidator112
07:37:57 AM   995         -    188568    5.00    5.00    0.00   10.00     2  |__revalidator113
07:37:57 AM   995         -    188569    5.00    5.00    0.00   10.00     2  |__revalidator116
07:37:57 AM   995         -    188570    5.00    6.00    0.00   11.00     2  |__revalidator117
07:37:57 AM   995         -    188571    5.00    5.00    0.00   10.00     2  |__revalidator114
07:37:57 AM   995         -    188572    5.00    6.00    0.00   11.00     2  |__revalidator115

Troubleshooting an FDB wrapping issue

Let’s figure out whether the high revalidator thread CPU usage is related to the FDB requesting a cleanup. This can be done by inspecting the coverage counters. The following shows all nonzero coverage counters related to causes for the revalidator to run:

$ ovs-appctl coverage/show   | grep -E "rev_|Event coverage"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=e4a796fd:
rev_reconfigure            0.0/sec     0.067/sec        0.0144/sec   total: 299
rev_flow_table             0.0/sec     0.000/sec        0.0003/sec   total: 2
rev_mac_learning          20.4/sec    18.167/sec       12.4039/sec   total: 44660

In the above output, you can see that rev_mac_learning has triggered the revalidation process about 20 times per second. This is quite high. In theory, it could still happen due to the normal FDB aging process, although in that specific case the last minute/hour values should be lower.

However, normal aging can be isolated by using the same coverage counters:

$ ovs-appctl coverage/show   | grep -E "mac_learning_|Event"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=086fdd98:
mac_learning_learned     1836.2/sec  1157.800/sec     1169.0800/sec   total: 7752613
mac_learning_expired       0.0/sec     0.000/sec        1.1378/sec   total: 4353

As you can see, there are mac_learning_learned and mac_learning_expired counters. In the above output, a lot of new MAC addresses have been learned: around 1,836 per second. For an FDB table of size 2K, this is extremely high and indicates that we are replacing FDB entries.

If you are running Open vSwitch v2.10 or newer, it has additional coverage counters:

$ ovs-appctl coverage/show   | grep -E "mac_learning_|Event"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=0ddb1578:
mac_learning_learned       0.0/sec     0.000/sec       10.6514/sec   total: 38345
mac_learning_expired       0.0/sec     0.000/sec        2.2756/sec   total: 8192
mac_learning_evicted       0.0/sec     0.000/sec        8.3758/sec   total: 30153
mac_learning_moved         0.0/sec     0.000/sec        0.0000/sec   total: 1

Explanation of the above:

  • mac_learning_learned: Shows the total number of learned MAC entries
  • mac_learning_expired: Shows the total number of expired MAC entries
  • mac_learning_evicted: Shows the total number of evicted MAC entries, that is, entries moved out due to the table being full
  • mac_learning_moved: Shows the total number of “port moved” MAC entries, that is, entries where the MAC address moved to a different port

Now, how can you determine which bridge has an FDB wrapping issue? For v2.9 and earlier, it’s a manual process of dumping the FDB table a couple of times, using the command ovs-appctl fdb/show, and comparing the entries.
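That manual comparison can be scripted. The following Python sketch parses two fdb/show dumps and reports which MAC entries disappeared and which appeared between the snapshots; the sample strings here are illustrative stand-ins for real ovs-appctl fdb/show output:

```python
# Compare two "ovs-appctl fdb/show <bridge>" dumps to spot FDB churn.
def macs(dump):
    """Extract the MAC column from fdb/show output, skipping the header line."""
    return {line.split()[2] for line in dump.splitlines()[1:] if line.strip()}

# Illustrative sample dumps; replace with real fdb/show output.
dump1 = """ port  VLAN  MAC                Age
    1     0  00:00:00:00:00:01    3
    1     0  00:00:00:00:00:02    1
"""
dump2 = """ port  VLAN  MAC                Age
    1     0  00:00:00:00:00:02    2
    1     0  00:00:00:00:00:03    0
"""

gone = macs(dump1) - macs(dump2)  # entries that disappeared
new = macs(dump2) - macs(dump1)   # entries that appeared
print(sorted(gone), sorted(new))
```

On a bridge with a wrapping FDB, repeating this comparison every few seconds shows a large, constantly changing set of disappeared and re-learned entries.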

For v2.10 and higher a new command was introduced, ovs-appctl fdb/stats-show, which shows all the above statistics on a per-bridge basis:

$ ovs-appctl fdb/stats-show ovs0
Statistics for bridge "ovs0":
  Current/maximum MAC entries in the table: 8192/8192
  Total number of learned MAC entries     : 52779
  Total number of expired MAC entries     : 8192
  Total number of evicted MAC entries     : 36395
  Total number of port moved MAC entries  : 1
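As a sanity check, the counters in this example balance out: every learned entry has either expired, been evicted, or is still in the table (a port move updates an existing entry in place, so it is not part of this sum):

```python
# Counter consistency check using the numbers from the fdb/stats-show
# output above: learned = expired + evicted + currently in the table.
learned, expired, evicted, current = 52779, 8192, 36395, 8192
assert expired + evicted + current == learned
print("counters balance:", expired + evicted + current, "==", learned)
```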

NOTE: The statistics can be cleared with the command ovs-appctl fdb/stats-clear, for example, to get a per-second rate:

$ ovs-appctl fdb/stats-clear ovs0; sleep 1; ovs-appctl fdb/stats-show ovs0
statistics successfully cleared
Statistics for bridge "ovs0":
  Current/maximum MAC entries in the table: 8192/8192
  Total number of learned MAC entries     : 1902
  Total number of expired MAC entries     : 0
  Total number of evicted MAC entries     : 1902
  Total number of port moved MAC entries  : 0

Fixing the FDB table size

With Open vSwitch, you can easily adjust the size of the FDB table, and it’s configurable per bridge. The command to do this is as follows:

ovs-vsctl set bridge <bridge> other-config:mac-table-size=<size>

When you change the configuration, take note of the following:

  • The number of FDB entries can be from 10 to 1,000,000.
  • The configuration is active immediately.
  • The current entries are not flushed from the table.
  • If a smaller number is configured than the number of entries currently in the table, the oldest entries are aged out. You can see this in the expired MAC entries statistics.

Why not change the default to 1 million and stop worrying about this? Resource consumption: each entry in the table consumes memory. Although Open vSwitch allocates memory only when an entry is actually in use, a default that is too high could become a problem, for example, during a MAC flooding attack.

So what would be the correct size to configure? This is hard to tell and depends on your use case. As a rule of thumb, you should configure your table a bit larger than the average number of active MAC addresses on your bridge.
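One rough way to apply this rule of thumb is to count the entries currently in the table (from ovs-appctl fdb/show) and add some headroom. The following Python sketch is purely illustrative; the sample dump and the 1.5x headroom factor are assumptions, not Open vSwitch recommendations:

```python
# Suggest a mac-table-size from an "ovs-appctl fdb/show" dump.
def suggest_size(dump, headroom=1.5):
    """Count data lines after the header and scale by a headroom factor.
    The configurable minimum for mac-table-size is 10."""
    entries = sum(1 for line in dump.splitlines()[1:] if line.strip())
    return entries, max(10, int(entries * headroom))

# Illustrative sample; replace with real fdb/show output.
sample = """ port  VLAN  MAC                Age
    1     0  00:00:00:00:00:01    3
    2     0  00:00:00:00:00:02    1
    2     0  00:00:00:00:00:03    7
"""
active, suggested = suggest_size(sample)
print(active, suggested)  # -> 3 10
```

Note that a snapshot taken while the table is already wrapping undercounts the active MACs (the table is pinned at its maximum), so base the count on the churn measurements from the previous section as well.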

Simple script to see FDB wrapping effects

If you would like to experiment with the counters, the following reproducer script from Jiri Benc lets you reproduce the effects of FDB wrapping.

Create an Open vSwitch bridge:

$ ovs-vsctl add-br ovs0
$ ip link set ovs0 up

Create the reproducer script:

$ cat > ~/reproducer.py <<'EOF'
#!/usr/bin/python
import sys
from scapy.all import *

# Generate the requested number of random (MAC, IP) pairs; the leading
# "00" octet keeps the MAC addresses unicast.
data = [(str("00" + str(RandMAC())[2:]), str(RandIP())) for i in range(int(sys.argv[1]))]

# Send the packets in a loop on the ovs0 bridge interface.
s = conf.L2socket(iface="ovs0")
while True:
    for mac, ip in data:
        p = Ether(src=mac, dst=mac)/IP(src=ip, dst=ip)
        s.send(p)
EOF

$ chmod +x ~/reproducer.py

NOTE: The reproducer Python script requires Scapy to be installed.

Start the reproducer:

$ ~/reproducer.py 10000

Now you can use the counter commands in the previous troubleshooting section to see the FDB table wrapping information and then set the size of the FDB appropriately.

Additional Open vSwitch and Open Virtual Network resources

Many of Red Hat’s products, such as Red Hat OpenStack Platform and Red Hat Virtualization, now use Open Virtual Network (OVN), a sub-project of Open vSwitch, and Red Hat OpenShift Container Platform will be using OVN soon.
