When most people deploy an Open vSwitch configuration for virtual networking using the NORMAL rule, that is, using L2 learning, they do not think about configuring the size of the Forwarding DataBase (FDB).
With hardware-based switches, the FDB is generally rather large, and its size is a key selling point. For Open vSwitch, however, the default FDB size is rather small: in version 2.9 and earlier it is only 2K entries, and starting with version 2.10 it was increased to 8K entries. Note that for Open vSwitch, each bridge has its own FDB table, and the size of each table is individually configurable.
This blog explains the effects of configuring too small an FDB table, how to identify which bridge is suffering from too small an FDB table, and how to configure the FDB table size appropriately.
Effects of too small an FDB table
When the FDB table is full and a new entry needs to be added, an older entry is removed to make room for the new one [1]. This is called FDB wrapping. If a packet is then received from the MAC address whose entry was removed, another entry is removed to make room, and the source MAC address of the packet will be re-added.
When more MAC addresses exist in the network than can be held in the configured FDB table size and all the MAC addresses are seen frequently, a lot of ping/ponging in the table can happen.
The more ping/ponging there is, the more CPU resources are needed to maintain the table. In addition, traffic destined for an evicted MAC address is flooded out of all ports until that address is learned again.
[1] The algorithm for removing older entries in Open vSwitch is as follows. On the specific bridge, the port with the most FDB entries is found and its oldest entry is removed.
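The eviction rule described in the note above can be illustrated with a small simulation. This is only a toy model written for this article, not Open vSwitch code: the class name `ToyFdb` and its methods are hypothetical, and aging timers are ignored entirely.

```python
from collections import OrderedDict, defaultdict


class ToyFdb:
    """Toy MAC table: when full, evict the oldest entry of the busiest port."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}                        # mac -> port
        self.by_port = defaultdict(OrderedDict)  # port -> macs, oldest first
        self.evicted = 0

    def learn(self, mac, port):
        if mac in self.entries:
            # Known address: drop the old position so it becomes the newest.
            del self.by_port[self.entries[mac]][mac]
        elif len(self.entries) >= self.max_entries:
            self._evict()
        self.entries[mac] = port
        self.by_port[port][mac] = True

    def _evict(self):
        # Find the port holding the most entries, then remove its oldest one.
        busiest = max(self.by_port, key=lambda p: len(self.by_port[p]))
        victim, _ = self.by_port[busiest].popitem(last=False)
        del self.entries[victim]
        self.evicted += 1


if __name__ == "__main__":
    # Five active MAC addresses cycling through a four-entry table.
    fdb = ToyFdb(max_entries=4)
    for _ in range(3):
        for i in range(5):
            fdb.learn("mac%d" % i, port=1)
    print(fdb.evicted)  # prints 11
```

With only one address more than the table can hold, every learn after the first four evicts an entry that is needed again moments later, which is the ping/ponging described above.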
Open vSwitch–specific manifestations of too small an FDB table
In addition to the FDB table updates, Open vSwitch also has to clean up the flow table when an FDB entry is removed. This is done by the Open vSwitch revalidator thread. Because this flow table cleanup takes quite a bit of CPU cycles, the first indication you might have of an FDB table wrapping issue is a high revalidator thread utilization. The following example shows a high revalidator thread utilization of around 83% (deduced by adding the percentages shown in the %CPU column) in an idle system:
$ pidstat -t -p `pidof ovs-vswitchd` 1 | grep -E "UID|revalidator"
07:37:56 AM   UID      TGID       TID    %usr %system  %guest    %CPU   CPU  Command
07:37:57 AM   995         -    188565    5.00    5.00    0.00   10.00     2  |__revalidator110
07:37:57 AM   995         -    188566    6.00    4.00    0.00   10.00     2  |__revalidator111
07:37:57 AM   995         -    188567    6.00    5.00    0.00   11.00     2  |__revalidator112
07:37:57 AM   995         -    188568    5.00    5.00    0.00   10.00     2  |__revalidator113
07:37:57 AM   995         -    188569    5.00    5.00    0.00   10.00     2  |__revalidator116
07:37:57 AM   995         -    188570    5.00    6.00    0.00   11.00     2  |__revalidator117
07:37:57 AM   995         -    188571    5.00    5.00    0.00   10.00     2  |__revalidator114
07:37:57 AM   995         -    188572    5.00    6.00    0.00   11.00     2  |__revalidator115
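Adding up the %CPU column by hand gets tedious when there are many revalidator threads. The following is a minimal sketch of doing it in Python; the helper name `revalidator_cpu` is made up for this example, and it assumes the `pidstat -t` column layout shown above, with %CPU as the third field from the end of each thread row.

```python
def revalidator_cpu(pidstat_output):
    """Sum the %CPU column over all revalidator thread rows."""
    total = 0.0
    for line in pidstat_output.splitlines():
        fields = line.split()
        # Thread rows end with a name such as "|__revalidator110".
        if not fields or not fields[-1].startswith("|__revalidator"):
            continue
        # Count from the end so the timestamp format does not matter:
        # ... %CPU  CPU  Command
        total += float(fields[-3])
    return total


if __name__ == "__main__":
    sample = (
        "07:37:57 AM 995 - 188565 5.00 5.00 0.00 10.00 2 |__revalidator110\n"
        "07:37:57 AM 995 - 188566 6.00 4.00 0.00 10.00 2 |__revalidator111\n"
    )
    print(revalidator_cpu(sample))  # prints 20.0
```

Feeding it all eight rows from the output above gives the 83% mentioned earlier.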
Troubleshooting an FDB wrapping issue
Let’s figure out if the high revalidator thread CPU usage is related to the FDB requesting a cleanup. This can be done by inspecting the coverage counters. The following shows all coverage counters (that have a value higher than zero) related to causes for the revalidator running:
$ ovs-appctl coverage/show | grep -E "rev_|Event coverage"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=e4a796fd:
rev_reconfigure            0.0/sec     0.067/sec        0.0144/sec   total: 299
rev_flow_table             0.0/sec     0.000/sec        0.0003/sec   total: 2
rev_mac_learning          20.4/sec    18.167/sec       12.4039/sec   total: 44660
In the above output, you can see that rev_mac_learning has triggered the revalidation process about 20 times per second. This is quite high. In theory, it could still happen due to the normal FDB aging process, although in that specific case the last minute/hour values should be lower.
However, normal aging can be isolated by using the same coverage counters:
$ ovs-appctl coverage/show | grep -E "mac_learning_|Event"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=086fdd98:
mac_learning_learned     1836.2/sec  1157.800/sec     1169.0800/sec   total: 7752613
mac_learning_expired        0.0/sec     0.000/sec        1.1378/sec   total: 4353
As you can see, there are mac_learning_learned and mac_learning_expired counters. In the above output, a lot of new MAC addresses have been learned: around 1,836 per second. For an FDB table of 2K entries, this is extremely high and indicates that FDB entries are being replaced.
If you are running Open vSwitch v2.10 or newer, it has additional coverage counters:
$ ovs-appctl coverage/show | grep -E "mac_learning_|Event"
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=0ddb1578:
mac_learning_learned        0.0/sec     0.000/sec       10.6514/sec   total: 38345
mac_learning_expired        0.0/sec     0.000/sec        2.2756/sec   total: 8192
mac_learning_evicted        0.0/sec     0.000/sec        8.3758/sec   total: 30153
mac_learning_moved          0.0/sec     0.000/sec        0.0000/sec   total: 1
Explanation of the above:
- mac_learning_learned: Shows the total number of learned MAC entries
- mac_learning_expired: Shows the total number of expired MAC entries
- mac_learning_evicted: Shows the total number of evicted MAC entries, that is, entries moved out due to the table being full
- mac_learning_moved: Shows the total number of "port moved" MAC entries, that is, entries where the MAC address moved to a different port
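If you want to track these counters over time, the coverage/show output is easy to parse. The sketch below assumes the line layout shown in the output above; the function names and the "eviction share" ratio are illustrative choices made for this example, not something Open vSwitch reports itself.

```python
import re

# name   <5s>/sec   <1min>/sec   <1h>/sec   total: <n>
COUNTER_RE = re.compile(
    r"^(?P<name>\w+)\s+(?P<s5>[\d.]+)/sec\s+(?P<m1>[\d.]+)/sec\s+"
    r"(?P<h1>[\d.]+)/sec\s+total:\s+(?P<total>\d+)$"
)


def parse_coverage(text):
    """Return {counter: {"5s", "1min", "1h", "total"}} from coverage/show output."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            counters[match.group("name")] = {
                "5s": float(match.group("s5")),
                "1min": float(match.group("m1")),
                "1h": float(match.group("h1")),
                "total": int(match.group("total")),
            }
    return counters


def eviction_share(counters):
    """Fraction of learned entries that were later evicted, or None if unknown."""
    learned = counters.get("mac_learning_learned", {}).get("total", 0)
    evicted = counters.get("mac_learning_evicted", {}).get("total")
    if evicted is None or learned == 0:
        return None  # v2.9 and earlier have no evicted counter
    return evicted / learned
```

For the v2.10 output above, this ratio is 30153/38345, so roughly four out of five learned entries ended up being pushed out because the table was full.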
Now, how can you determine which bridge has an FDB wrapping issue? For v2.9 and earlier, it’s a manual process of dumping the FDB table a couple of times, using the command ovs-appctl fdb/show, and comparing the entries.
For v2.10 and higher a new command was introduced, ovs-appctl fdb/stats-show, which shows all the above statistics on a per-bridge basis:
$ ovs-appctl fdb/stats-show ovs0
Statistics for bridge "ovs0":
  Current/maximum MAC entries in the table: 8192/8192
  Total number of learned MAC entries     : 52779
  Total number of expired MAC entries     : 8192
  Total number of evicted MAC entries     : 36395
  Total number of port moved MAC entries  : 1
NOTE: The statistics can be cleared with the command ovs-appctl fdb/stats-clear, for example, to get a per-second rate:
$ ovs-appctl fdb/stats-clear ovs0; sleep 1; ovs-appctl fdb/stats-show ovs0
statistics successfully cleared
Statistics for bridge "ovs0":
  Current/maximum MAC entries in the table: 8192/8192
  Total number of learned MAC entries     : 1902
  Total number of expired MAC entries     : 0
  Total number of evicted MAC entries     : 1902
  Total number of port moved MAC entries  : 0
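To check several bridges in one go, the fdb/stats-show output can be parsed with a few regular expressions. This is a rough sketch that assumes the text layout shown above; `parse_fdb_stats` and `is_wrapping` are hypothetical helper names, and the "full table plus at least one eviction" test is a simple heuristic chosen for this example.

```python
import re


def parse_fdb_stats(text):
    """Extract the counters from `ovs-appctl fdb/stats-show <bridge>` output."""
    stats = {}
    match = re.search(
        r"Current/maximum MAC entries in the table:\s*(\d+)/(\d+)", text)
    if match:
        stats["current"] = int(match.group(1))
        stats["maximum"] = int(match.group(2))
    for key in ("learned", "expired", "evicted", "port moved"):
        match = re.search(
            r"Total number of %s MAC entries\s*:\s*(\d+)" % key, text)
        if match:
            stats[key] = int(match.group(1))
    return stats


def is_wrapping(stats):
    """Heuristic: the table is full and entries are being evicted."""
    full = stats.get("current", 0) >= stats.get("maximum", float("inf"))
    return full and stats.get("evicted", 0) > 0
```

When the text comes from the clear, sleep 1, show sequence above, the evicted value is already a per-second rate, so 1902 means about 1,902 evictions per second on that bridge.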
Fixing the FDB table size
With Open vSwitch, you can easily adjust the size of the FDB table, and it’s configurable per bridge. The command to do this is as follows:
ovs-vsctl set bridge <bridge> other-config:mac-table-size=<size>
When you change the configuration, take note of the following:
- The number of FDB entries can be from 10 to 1,000,000.
- The configuration is active immediately.
- The current entries are not flushed from the table.
- If a smaller number is configured than the number of entries currently in the table, the oldest entries are aged out. You can see this in the expired MAC entries statistics.
Why not change the default to 1 million and stop worrying about this? The reason is resource consumption: each entry in the table allocates memory. Although Open vSwitch allocates memory only when an entry is in use, setting the default too high could become a problem, for example, when someone launches a MAC flooding attack.
So what would be the correct size to configure? This is hard to tell and depends on your use case. As a rule of thumb, you should configure your table a bit larger than the average number of active MAC addresses on your bridge.
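As an illustration of this rule of thumb, the sketch below turns an observed number of active MAC addresses into a table size and builds the matching ovs-vsctl command. The 25% headroom is an arbitrary assumption made for this example, not an Open vSwitch recommendation, and both function names are hypothetical; the 10 to 1,000,000 limits are the ones listed above.

```python
import math

MIN_SIZE = 10          # lower limit for mac-table-size
MAX_SIZE = 1_000_000   # upper limit for mac-table-size


def suggest_mac_table_size(active_macs, headroom=0.25):
    """Active MAC count plus headroom, clamped to the allowed range."""
    wanted = math.ceil(active_macs * (1 + headroom))
    return max(MIN_SIZE, min(MAX_SIZE, wanted))


def set_size_command(bridge, size):
    """Argument list for the ovs-vsctl call that applies the size."""
    return ["ovs-vsctl", "set", "bridge", bridge,
            "other-config:mac-table-size=%d" % size]


if __name__ == "__main__":
    size = suggest_mac_table_size(10000)
    print(" ".join(set_size_command("ovs0", size)))
    # prints: ovs-vsctl set bridge ovs0 other-config:mac-table-size=12500
```

The command is returned as a list so it can be passed to subprocess.run() without shell quoting issues; measure the active MAC count on your own bridge before picking a value.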
Simple script to see FDB wrapping effects
If you would like to experiment with the counters, you can use the following script from Jiri Benc to reproduce the effects of FDB wrapping.
Create an Open vSwitch bridge:
$ ovs-vsctl add-br ovs0
$ ip link set ovs0 up
Create the reproducer script:
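$ cat > ~/reproducer.py <<EOF
#!/usr/bin/python
import sys
from scapy.all import *

# Build N random (MAC, IP) pairs, where N is the first command-line argument.
# Forcing the first octet to 00 keeps every generated MAC address unicast.
data = [(str("00" + str(RandMAC())[2:]), str(RandIP()))
        for i in range(int(sys.argv[1]))]

# Send one packet per address on the bridge interface, forever.
s = conf.L2socket(iface="ovs0")
while True:
    for mac, ip in data:
        p = Ether(src=mac, dst=mac)/IP(src=ip, dst=ip)
        s.send(p)
EOF
$ chmod +x ~/reproducer.py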
NOTE: The reproducer Python script requires Scapy to be installed.
Start the reproducer:
$ ~/reproducer.py 10000
Now you can use the counter commands in the previous troubleshooting section to see the FDB table wrapping information and then set the size of the FDB appropriately.
Additional Open vSwitch and Open Virtual Network resources
Many of Red Hat’s products, such as Red Hat OpenStack Platform and Red Hat Virtualization, are now using Open Virtual Network (OVN), a sub-project of Open vSwitch. Red Hat OpenShift Container Platform will be using OVN soon. Some other virtual networking articles on the Red Hat Developer blog: