In this article, I discuss external connectivity in Open Virtual Network (OVN), a subproject of Open vSwitch (OVS), using a distributed gateway router.
OVN provides external connectivity in two ways:
- A logical router with a distributed gateway port, which is referred to as a distributed gateway router in this article
- A logical gateway router
In this article, you will see how to create a distributed gateway router and an example of how it works.
For the CMS (cloud management system), creating a distributed gateway router has some advantages over using a logical gateway router:
- It is easier to create a distributed gateway router because the CMS doesn't need to create a transit logical switch, which is needed for a logical gateway router.
- A distributed gateway router supports distributed north/south traffic, whereas the logical gateway router is centralized on a single gateway chassis.
- A distributed gateway router supports high availability.
Note: The CMS can be OpenStack, Red Hat OpenShift, Red Hat Virtualization, or any other system that manages a cloud.
Setup details
Let's first talk about the deployment details. I will use an example setup with five nodes: three controller nodes and two compute nodes. The tenant VMs are created on the compute nodes. The controller nodes run the OVN database servers in active/passive mode.
Note: The ovn-nbctl and ovn-sbctl commands should be run on the node where the OVN database servers are running. Alternatively, you can pass the --db option with the IP address and port of the database server.
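For example, assuming the active northbound database listens on its default TCP port 6641 on controller-0 (the address below is taken from this setup; your database may listen on a different address):

# Query the northbound database remotely instead of running locally on the DB node
ovn-nbctl --db=tcp:172.17.2.28:6641 show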
Chassis in OVN
In OVN terminology, each node is referred to as a chassis. A chassis is simply a node where the ovn-controller service is running. For a chassis to act as a gateway chassis, it must be capable of providing external (north/south) connectivity to the tenant traffic. It also requires the following configuration:
- Configure ovn-bridge-mappings, which provides a list of key-value pairs that map a physical network name to a local OVS bridge that provides connectivity to that network:
ovs-vsctl set open . external-ids:ovn-bridge-mappings=provider:br-provider
- Create the provider OVS bridge and add the interface that provides external connectivity to it:
ovs-vsctl --may-exist add-br br-provider
ovs-vsctl --may-exist add-port br-provider INTERFACE_NAME
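As an optional sanity check, you can read the mapping back on the chassis:

# Should print "provider:br-provider" if the mapping above was applied
ovs-vsctl get open . external-ids:ovn-bridge-mappings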
In the above setup, all the controller nodes act as gateway chassis.
Below is the output of ovn-sbctl show for my setup:
Chassis "controller-0"
    hostname: "controller-0.localdomain"
    Encap geneve
        ip: "172.17.2.28"
        options: {csum="true"}
Chassis "controller-1"
    hostname: "controller-1.localdomain"
    Encap geneve
        ip: "172.17.2.26"
        options: {csum="true"}
Chassis "controller-2"
    hostname: "controller-2.localdomain"
    Encap geneve
        ip: "172.17.2.18"
        options: {csum="true"}
Chassis "compute-0"
    hostname: "compute-0.localdomain"
    Encap geneve
        ip: "172.17.2.15"
        options: {csum="true"}
Chassis "compute-1"
    hostname: "compute-1.localdomain"
    Encap geneve
        ip: "172.17.2.17"
        options: {csum="true"}
Let's first create a couple of logical switches and logical ports and attach them to a logical router:
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 "00:00:01:00:00:03 10.0.0.3"

ovn-nbctl ls-add sw1
ovn-nbctl lsp-add sw1 sw1-port1
ovn-nbctl lsp-set-addresses sw1-port1 "00:00:02:00:00:03 20.0.0.3"

ovn-nbctl lr-add lr0

# Connect sw0 to lr0
ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
ovn-nbctl lsp-add sw0 sw0-lr0
ovn-nbctl lsp-set-type sw0-lr0 router
ovn-nbctl lsp-set-addresses sw0-lr0 router
ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0

# Connect sw1 to lr0
ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24
ovn-nbctl lsp-add sw1 sw1-lr0
ovn-nbctl lsp-set-type sw1-lr0 router
ovn-nbctl lsp-set-addresses sw1-lr0 router
ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1
Below is the output of ovn-nbctl show:
ovn-nbctl show
switch 05cf23bc-2c87-4d6d-a76b-f432e562ed71 (sw0)
    port sw0-port1
        addresses: ["00:00:01:00:00:03 10.0.0.3"]
    port sw0-lr0
        type: router
        router-port: lr0-sw0
switch 0dfee7ef-13b3-4cd0-87a1-7935149f551e (sw1)
    port sw1-port1
        addresses: ["00:00:02:00:00:03 20.0.0.3"]
    port sw1-lr0
        type: router
        router-port: lr0-sw1
router c189f271-86d6-4f7f-891c-672cb3aa543e (lr0)
    port lr0-sw0
        mac: "00:00:00:00:ff:01"
        networks: ["10.0.0.1/24"]
    port lr0-sw1
        mac: "00:00:00:00:ff:02"
        networks: ["20.0.0.1/24"]
The port sw0-port1 can communicate with sw1-port1 since both switches are connected to the logical router lr0. East-west traffic is fully distributed in OVN.
Now let's create a provider logical switch:
ovn-nbctl ls-add public

# Create a localnet port
ovn-nbctl lsp-add public ln-public
ovn-nbctl lsp-set-type ln-public localnet
ovn-nbctl lsp-set-addresses ln-public unknown
ovn-nbctl lsp-set-options ln-public network_name=provider
Notice the option network_name=provider. The network_name should match a physical network name defined in ovn-bridge-mappings. When a localnet port is defined in a logical switch, the ovn-controller running on each gateway chassis creates an OVS patch port between the integration bridge and the provider bridge so that the logical tenant traffic can leave to and enter from the physical network.
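A minimal way to confirm this on a gateway chassis is to list the ports on both bridges; the patch port names are generated by ovn-controller and will vary:

# Patch ports created by ovn-controller for the localnet port
ovs-vsctl list-ports br-int | grep patch
ovs-vsctl list-ports br-provider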
At this point, the tenant traffic from the logical switches sw0 and sw1 still cannot enter the public logical switch, since there is no association between it and the logical router lr0.
Creating a distributed router port
Let's first connect lr0 to public:
ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.200/24
ovn-nbctl lsp-add public public-lr0
ovn-nbctl lsp-set-type public-lr0 router
ovn-nbctl lsp-set-addresses public-lr0 router
ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
We still need to schedule the distributed gateway port lr0-public on a gateway chassis. What does scheduling mean here? It means the chassis selected to host the gateway router port provides the centralized external connectivity. The north/south tenant traffic is redirected to this chassis, which acts as a gateway. This chassis applies all the NAT rules before sending the traffic out via the patch port to the provider bridge. It also means that when someone pings 172.168.0.200 or sends an ARP request for 172.168.0.200, the gateway chassis hosting the port sends the ping and ARP replies.
Scheduling the gateway router port
This can be done in two ways:
- Non-high-availability (non-HA) mode: The gateway router port is configured to be scheduled on a single gateway chassis. If the gateway chassis hosting this port goes down for some reason, the external connectivity is completely broken until the CMS (cloud management system) detects this and reschedules it to another gateway chassis.
- HA mode: The gateway router port is configured to be scheduled on a set of gateway chassis. The gateway chassis with the highest priority claims the gateway router port. If this gateway chassis goes down for some reason, the gateway chassis with the next highest priority claims the gateway router port.
Scheduling in non-HA mode
Select a gateway chassis where you want to schedule the gateway router port. Let's schedule it on controller-0. There are two ways to do this; run one of the following commands:
ovn-nbctl set logical_router_port lr0-public options:redirect-chassis=controller-0

ovn-nbctl list logical_router_port lr0-public
_uuid               : 0ced9cdb-fbc9-47f1-b2e2-97a49988d622
enabled             : []
external_ids        : {}
gateway_chassis     : []
ipv6_ra_configs     : {}
mac                 : "00:00:20:20:12:13"
name                : "lr0-public"
networks            : ["172.168.0.200/24"]
options             : {redirect-chassis="controller-0"}
peer                : []

or

ovn-nbctl lrp-set-gateway-chassis lr0-public controller-0 20
In the ovn-sbctl show output below, you can see that controller-0 is hosting the gateway router port lr0-public.
ovn-sbctl show
Chassis "d86bd6f2-1216-4a73-bcaf-3200b8ed8126"
    hostname: "controller-0.localdomain"
    Encap geneve
        ip: "172.17.2.28"
        options: {csum="true"}
    Port_Binding "cr-lr0-public"
Chassis "20dc7bfb-a329-4cf9-a8ac-3485f7d5be46"
    hostname: "controller-1.localdomain"
...
...
Scheduling in HA mode
In this case, we select a set of gateway chassis and set a priority for each chassis. The chassis with the highest priority hosts the gateway router port.
In our example, let's use all three gateway chassis: controller-0 with priority 20, controller-1 with 15, and controller-2 with 10.
Run the following commands:
ovn-nbctl lrp-set-gateway-chassis lr0-public controller-0 20
ovn-nbctl lrp-set-gateway-chassis lr0-public controller-1 15
ovn-nbctl lrp-set-gateway-chassis lr0-public controller-2 10
You can verify the configuration by running the following commands:
ovn-nbctl list gateway_chassis
_uuid               : 745d7f84-0516-4a0f-9b3d-772e5cb58a48
chassis_name        : "controller-1"
external_ids        : {}
name                : "lr0-public-controller-1"
options             : {}
priority            : 15

_uuid               : 6f2921d4-2555-4f81-9428-640cbf62151e
chassis_name        : "controller-0"
external_ids        : {}
name                : "lr0-public-controller-0"
options             : {}
priority            : 20

_uuid               : 97595b29-139d-4a43-9973-8995ffe17c64
chassis_name        : "controller-2"
external_ids        : {}
name                : "lr0-public-controller-2"
options             : {}
priority            : 10
ovn-nbctl list logical_router_port lr0-public
_uuid               : 0ced9cdb-fbc9-47f1-b2e2-97a49988d622
enabled             : []
external_ids        : {}
gateway_chassis     : [6f2921d4-2555-4f81-9428-640cbf62151e, 745d7f84-0516-4a0f-9b3d-772e5cb58a48, 97595b29-139d-4a43-9973-8995ffe17c64]
ipv6_ra_configs     : {}
mac                 : "00:00:20:20:12:13"
name                : "lr0-public"
networks            : ["172.168.0.200/24"]
options             : {}
peer                : []

ovn-sbctl show
Chassis "d86bd6f2-1216-4a73-bcaf-3200b8ed8126"
    hostname: "controller-0.localdomain"
    Encap geneve
        ip: "172.17.2.28"
        options: {csum="true"}
    Port_Binding "cr-lr0-public"
Chassis "20dc7bfb-a329-4cf9-a8ac-3485f7d5be46"
    hostname: "controller-1.localdomain"
...
...
You can always delete a gateway chassis' association to the distributed router port by running the following command:
ovn-nbctl lrp-del-gateway-chassis lr0-public controller-1
To support HA, OVN uses the Bidirectional Forwarding Detection (BFD) protocol, which it configures on the tunnel ports. When a gateway chassis hosting a distributed gateway port goes down, all the chassis detect that (thanks to BFD), and the gateway chassis with the next highest priority claims the port. For more details, see the OVN man pages: man ovn-nb, man ovn-northd, and man ovn-controller.
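If you are curious about the BFD sessions OVN sets up on the tunnel interfaces, an optional way to check them on any chassis is to ask the local Open vSwitch daemon:

# Show BFD session state for the tunnel interfaces on this chassis
ovs-appctl bfd/show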
Chassis redirect port
In the output of ovn-sbctl show, you can see Port_Binding "cr-lr0-public". What is cr-lr0-public? For every gateway router port scheduled, ovn-northd internally creates a logical port of type chassisredirect. This port represents an instance of the distributed gateway port that is scheduled on the selected chassis.
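You can inspect this port directly in the southbound database; the record's type column should be chassisredirect, and the chassis column shows which chassis is currently hosting it:

# Look up the chassisredirect port binding for lr0-public
ovn-sbctl find port_binding logical_port=cr-lr0-public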
What happens when a VM sends external traffic?
Now let's briefly see, from the OVN logical datapath pipeline perspective, what happens when a VM associated with the logical port (let's say sw0-port1) sends a packet to the destination 172.168.0.110. Let's assume the VM is running on compute-0 and the chassis redirect port is scheduled on controller-0. 172.168.0.110 could be associated with a physical server or a VM that is reachable via the provider network.
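Before walking through the pipelines, you can simulate this packet with ovn-trace against the southbound database. The sketch below reuses the addresses from this setup; 172.168.0.110 is just the example external destination:

# Trace a packet from sw0-port1 towards the external destination
ovn-trace sw0 'inport == "sw0-port1" && eth.src == 00:00:01:00:00:03 && eth.dst == 00:00:00:00:ff:01 && ip4.src == 10.0.0.3 && ip4.dst == 172.168.0.110 && ip.ttl == 64'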
On the compute chassis, the following occurs:
- When the VM sends the traffic, the logical switch pipeline of sw0 is run.
- From the logical switch pipeline, the packet enters the ingress router pipeline via the lr0-sw0 port, as it needs to be routed.
- The ingress router pipeline is run, the routing decision is made, and the outport is set to lr0-public.
Logical flows:
table=0 (lr_in_admission ), priority=50 , match=(eth.dst == 00:00:00:00:ff:01 && inport == "lr0-sw0"), action=(next;)
...
table=7 (lr_in_ip_routing ), priority=49 , match=(ip4.dst == 172.168.0.0/24), action=(ip.ttl--; reg0 = ip4.dst; reg1 = 172.168.0.200; eth.src = 00:00:20:20:12:13; outport = "lr0-public"; flags.loopback = 1; next;)
...
table=9 (lr_in_gw_redirect ), priority=50 , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
- Since cr-lr0-public is scheduled on controller-0, the packet is sent to controller-0 via the tunnel port:
table=32, priority=100,reg15=0x4,metadata=0x3 actions=load:0x3->NXM_NX_TUN_ID[0..23],set_field:0x4->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:ovn-cont-0
On the controller-0 chassis, the following occurs:
- controller-0 receives the traffic on the tunnel port and sends it to the egress pipeline of the logical router lr0:
table=0, priority=100,in_port="ovn-comp-0" actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,33)
- The NAT rules are applied. That is, the source IP address 10.0.0.3 is NATed to 172.168.0.200:
table=1 (lr_out_snat ), priority=25 , match=(ip && ip4.src == 10.0.0.0/24 && outport == "lr0-public" && is_chassis_resident("cr-lr0-public")), action=(ct_snat(172.168.0.200);)
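This SNAT logical flow exists because a NAT rule is configured on lr0. The setup steps above don't show that step; with the addresses used in this example, it would look something like this:

# SNAT tenant traffic from 10.0.0.0/24 to the gateway port address
ovn-nbctl lr-nat-add lr0 snat 172.168.0.200 10.0.0.0/24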
- The packet is sent to the logical switch public via the lr0-public port:
table=3 (lr_out_delivery ), priority=100 , match=(outport == "lr0-public"), action=(output;)
- And the packet is sent out via the localnet port to the provider bridge and reaches the destination.
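If you want to confirm that the traffic leaving the gateway chassis carries the NATed source address, an optional packet capture on the provider interface (the INTERFACE_NAME added to br-provider earlier) works, assuming ICMP traffic such as a ping:

# Capture the NATed traffic on the physical provider interface
tcpdump -nei INTERFACE_NAME icmp and host 172.168.0.110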
Now let's see what happens for the reply traffic.
On the controller-0 chassis:
- The packet is received by the physical interface present in the provider bridge and enters the ingress pipeline of the logical switch public via the localnet port:
table=0,priority=100,in_port="patch-br-int-to",dl_vlan=0 actions=strip_vlan,load:0x1->NXM_NX_REG13[],load:0x7->NXM_NX_REG11[],load:0x8->NXM_NX_REG12[],load:0x4->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
- From public, the packet enters the pipeline of lr0 via the public-lr0 logical port.
- In the ingress router pipeline, the unSNAT rules are applied. That is, the destination IP address is unNATed from 172.168.0.200 to 10.0.0.3:
table=0 (lr_in_admission ), priority=50 , match=(eth.dst == 00:00:20:20:12:13 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")), action=(next;)
...
table=3 (lr_in_unsnat ), priority=100 , match=(ip && ip4.dst == 172.168.0.200 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")), action=(ct_snat;)
- Since 10.0.0.3 belongs to the logical switch sw0, the packet enters the ingress pipeline of sw0 via lr0-sw0:
table=7 (lr_in_ip_routing ), priority=49 , match=(ip4.dst == 10.0.0.0/24), action=(ip.ttl--; reg0 = ip4.dst; reg1 = 10.0.0.1; eth.src = 00:00:00:00:ff:01; outport = "lr0-sw0"; flags.loopback = 1; next;)
- The ingress pipeline of sw0 is run and the packet is sent to compute-0 via the tunnel port, because OVN knows that sw0-port1 resides on compute-0.
On the compute-0 chassis, the following occurs:
- compute-0 receives the traffic on the tunnel port and sends it to the egress pipeline of the logical switch sw0.
- In the egress pipeline, the packet is delivered to sw0-port1.
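If you want to reproduce the tables shown in this walkthrough on your own setup: the logical flows come from the southbound database and the OpenFlow flows from the local integration bridge. Something like the following, run on the database node and on the relevant chassis, should get you close (the -O flag may need adjusting to the OpenFlow version in use):

# Logical flows for the router and switch datapaths
ovn-sbctl lflow-list lr0
ovn-sbctl lflow-list sw0

# OpenFlow flows installed by ovn-controller on the integration bridge
ovs-ofctl -O OpenFlow13 dump-flows br-int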
Conclusion
This article provides an overview of the distributed gateway router in OVN: how it is created and what happens when a VM sends external traffic. Hopefully, it will help you understand external connectivity support in OVN and troubleshoot any issues related to it.