Open vSwitch keeps growing and is used in large-scale deployments. Typically, that means the vswitch has a few ports that are always available, such as physical Ethernet ports, plus many other ports that provide network connectivity to virtual machines or containers. Those other ports are software devices, and very often they cannot be reused after a reboot or a system crash.
This blog post explains how to make sure the vSwitch comes up clean after a system crash or bad shutdown. The idea is that once the vSwitch is up, no other component (usually a remote controller) needs to iterate over a large number of stale ports and clean them up.
How it works
A new port property called "transient" has been accepted upstream and is expected to be available in OVS 2.9. For backward compatibility, it defaults to "false" when not configured, so there is no behavior change from previous OVS versions.
However, the management system usually knows whether a port is a software device. If it is, the "transient" property can be set to "true". The property does nothing more than tell the Open vSwitch service initialization to remove that port from the database before the service is brought up.
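As a minimal sketch, a management system could mark an already existing port as transient and check the result with two small helpers. The "other_config:transient" key is the one used by the script later in this post; the helper names and the OVS_VSCTL override variable are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# OVS_VSCTL is a hypothetical override (not an OVS feature) so the helpers
# can be dry-run against a stub; it defaults to the real ovs-vsctl.

# Mark an existing port as transient.
mark_transient() {
    "${OVS_VSCTL:-ovs-vsctl}" set Port "$1" other_config:transient=true
}

# Succeed only if the port is currently marked transient.
# The value may be printed with or without quotes, so strip them first.
is_transient() {
    [ "$("${OVS_VSCTL:-ovs-vsctl}" --if-exists get Port "$1" \
            other_config:transient | tr -d '"')" = "true" ]
}
```

For example, "mark_transient vethHOST1 && is_transient vethHOST1" should succeed on a system where the port vethHOST1 exists.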
On Fedora and RHEL distributions, a new systemd service is responsible for the cleanup. It's called "ovs-delete-transient-ports.service" and it runs once per boot, so it doesn't affect service restarts. This service executes the "ovs-ctl" script with the "delete-transient-ports" command to do the actual work.
In more detail, systemd orchestrates the services. The first service brought up is "ovsdb-server". Then the "ovs-delete-transient-ports" service runs, which queries the database for ports with "transient=true" and deletes them. Finally, the "ovs-vswitchd" service is started, completing the vswitch initialization.
The script listed below simulates 1000 containers connected to Open vSwitch. I will then force an immediate reboot through sysrq, without syncing or unmounting anything, to simulate a system crash. After a reboot without the property set, we can see that all ports remain in the database but not in the system.
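The cleanup step described above can be sketched roughly as follows. This is an approximation of the behavior, not a copy of the ovs-ctl code; the function body and the OVS_VSCTL override variable are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch of the cleanup step. OVS_VSCTL is a hypothetical override for
# dry runs; it defaults to the real ovs-vsctl.
delete_transient_ports() {
    # --no-wait: ovs-vswitchd is not running yet at this point of the boot,
    # so ovs-vsctl must not wait for it to apply the change.
    for port in $("${OVS_VSCTL:-ovs-vsctl}" --no-wait --bare --columns=name \
                      find Port other_config:transient=true); do
        "${OVS_VSCTL:-ovs-vsctl}" --no-wait --if-exists del-port "$port"
    done
}
```

Because only ovsdb-server is running at that stage, the deletion is purely a database operation, and ovs-vswitchd never sees the stale ports when it starts.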
Content of create_containers.sh:
#!/bin/bash
# create a bridge
ovs-vsctl add-br br0
for ns in $(seq 1 $1)
do
    # add a new veth pair
    ip link add vethHOST${ns} type veth \
        peer name vethNS${ns}
    # add a new network namespace
    ip netns add ${ns}
    # add host port to the vswitch bridge
    ovs-vsctl add-port br0 vethHOST${ns}
    # moving veth peer to the namespace
    ip link set vethNS${ns} netns ${ns}
done
# ./create_containers.sh 1000
# # Checking the total number of software devices:
# ip link | grep vethHOST | wc -l
1000
# # Checking the total number of ports in br0:
# ovs-vsctl show | grep 'Port "vethHOST' | wc -l
1000
It's time to trigger the system crash to simulate a failure:
# echo b > /proc/sysrq-trigger
This is a good time to grab a cup of your preferred beverage while the system comes back up.
The system is back, so let's verify whether the software devices survived and whether the ports are still in the OVS bridge.
# # Repeating the previous command checking the
# # number of software devices:
# ip link | grep vethHOST | wc -l
0
# # Repeating the previous command checking the
# # number of ports in br0:
# ovs-vsctl show | grep 'Port "vethHOST' | wc -l
1000
The ports remained in the database, though the devices don't exist. Now, if the management system tries to add a fresh software device that has the same name as one of those 1000 stale ports, it will receive an error:
# ovs-vsctl add-port br0 vethHOST114
ovs-vsctl: cannot create a port named vethHOST114
because a port named vethHOST114 already exists on
bridge br0
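Without the property, something has to find and remove those stale ports by hand. A minimal sketch of that cleanup, assuming a port is stale when no network device with the same name exists, could look like this. The function name and the OVS_VSCTL and SYSFS_NET override variables are hypothetical, and the check would misclassify ports that never have a device of the same name, such as tunnel ports.

```shell
#!/bin/bash
# Print the ports of a bridge that have no matching network device.
# OVS_VSCTL and SYSFS_NET are hypothetical overrides for dry runs.
list_stale_ports() {
    bridge=$1
    for port in $("${OVS_VSCTL:-ovs-vsctl}" list-ports "$bridge"); do
        # Each network device visible to the host has an entry here.
        [ -e "${SYSFS_NET:-/sys/class/net}/$port" ] || echo "$port"
    done
}
```

Each name printed by "list_stale_ports br0" could then be passed to "ovs-vsctl del-port br0". With 1000 ports, this is exactly the kind of iteration the transient property is meant to avoid.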
Solution
Let's change the script to set "transient=true" for all those software devices and repeat the test.
#!/bin/bash
# create a bridge
ovs-vsctl add-br br0
for ns in $(seq 1 $1)
do
    # add a new veth pair
    ip link add vethHOST${ns} type veth \
        peer name vethNS${ns}
    # add a new network namespace
    ip netns add ${ns}
    # add host port to the vswitch bridge
    ovs-vsctl add-port br0 vethHOST${ns} \
        -- set Port vethHOST${ns} other_config:transient=true
    # moving veth peer to the namespace
    ip link set vethNS${ns} netns ${ns}
done
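Before crashing the system again, it can be worth confirming that the property was actually stored. A small sketch that counts the ports marked transient follows; the function name and the OVS_VSCTL override variable are hypothetical.

```shell
#!/bin/bash
# Count the ports currently marked transient in the database.
# OVS_VSCTL is a hypothetical override for dry runs.
count_transient_ports() {
    # The listing may contain blank separator lines between records,
    # so count only the non-empty lines.
    "${OVS_VSCTL:-ovs-vsctl}" --bare --columns=name \
        find Port other_config:transient=true |
        awk 'NF { n++ } END { print n + 0 }'
}
```

After running the modified script with 1000 as its argument, the count should match the number of vethHOST ports on br0.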
This is the result, after the system crash:
# # Repeating the previous command checking the
# # number of software devices:
# ip link | grep vethHOST | wc -l
0
# # Repeating the previous command checking the
# # number of ports in br0:
# ovs-vsctl show | grep 'Port "vethHOST' | wc -l
0
There are no software devices and no ports on the br0 bridge.
Conclusion
Open vSwitch has a database to store configuration in a persistent way, but sometimes that's not desired, especially for software devices like vnets, vhost-users or veth pairs, which will not survive a crash.
Upstream has merged the transient port boolean property to help clean up those types of ports before the service is available.
Thanks!
Last updated: November 30, 2017