This article demonstrates the deployment of Ethernet Virtual Private Network (EVPN) in Red Hat OpenStack Services on OpenShift version 18 (18.0.10 FR3). It offers a thorough understanding of the EVPN implementation within OpenStack Services on OpenShift 18, highlighting current limitations and potential future enhancements.
Overview of EVPN implementation
In Open Virtual Network (OVN) environments, ovn-bgp-agent facilitates the exposure of virtual machines (VMs) on provider networks via EVPN. The OVN Border Gateway Protocol (BGP) agent is a Python-based daemon that runs on each node (e.g., OpenStack controller and/or compute nodes). It connects to the OVN Northbound database (OVN NB DB) to detect the specific events it needs to react to, then leverages Free Range Routing (FRR) to expose the routes towards the VMs via EVPN and uses kernel networking capabilities to redirect the traffic into the OVN overlay once it reaches the node.
EVPN, a BGP-based control plane for Virtual Extensible LAN (VXLAN), provides substantial benefits compared to conventional VXLAN deployments. To expose VMs via EVPN, ovn-bgp-agent must be configured with the exposing method set to Virtual Routing and Forwarding (VRF). This configuration triggers FRR to advertise and withdraw directly connected and kernel routes via BGP. Upon startup, the agent identifies and configures for exposure all provider networks that meet the following criteria (a quick way to check both criteria on a node is sketched after this list):
- The provider network matches the bridge mappings defined in the running Open vSwitch (OVS) instance (e.g., ovn-bridge-mappings="physnet1:br-ex").
- The provider network has been configured by an administrator with at least a VXLAN Network Identifier (VNI), and the VPN type is set to l3.
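As an illustrative manual check (this is not part of the automated workflow, and <switch> is a placeholder for the provider network's logical switch name or UUID), both criteria can be inspected from a node with access to the OVN NB DB:
# Criterion 1: bridge mappings of the local OVS instance (e.g., "physnet1:br-ex")
$ ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings
# Criterion 2: the logical switch should carry the VNI and l3 VPN type in its external_ids
$ ovn-nbctl list Logical_Switch <switch>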
For every provider network, a VRF is created through which the VMs' IPs are exposed via EVPN. For every VRF, the following configuration is done (see the sketch after this list for roughly equivalent manual commands):
- A VRF device using the same VNI ID as the name suffix is created.
- A VXLAN device is created, using the VNI number as the VXLAN ID, as well as for the name suffix.
- A bridge device is created, to which the VXLAN device is connected and which is associated with the created VRF, also using the VNI number as the name suffix.
- A local FRR instance is reconfigured to ensure the new VRF is exposed.
- EVPN is connected to the OVN overlay so that traffic can be redirected from the node to the OVN virtual networking. This is done by connecting the VRF to the OVS provider bridge.
- OVS flows are added into the OVS provider bridge to redirect the traffic back from OVN to the proper VRF, based on the subnet CIDR and the router gateway port MAC address.
- Finally, the routes to expose are added to the VRF. Since full kernel routing is used in this VRF, the MAC addresses that belong to these routes are also exposed, so that ARP proxies in OVN are not relied upon.
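The agent automates all of the wiring above. Purely as a hedged sketch of what the kernel-side devices look like, assuming an example VNI of 1001 and the naming pattern described above (device names, routing table ID, and <VTEP_IP> are illustrative), the equivalent manual commands would be roughly:
# VRF device named after the VNI, bound to a matching routing table
$ ip link add vrf-1001 type vrf table 1001
# Bridge device associated with the VRF
$ ip link add br-1001 type bridge
$ ip link set br-1001 master vrf-1001
# VXLAN device using the VNI as the VXLAN ID, attached to the bridge
$ ip link add vxlan-1001 type vxlan id 1001 local <VTEP_IP> dstport 4789 nolearning
$ ip link set vxlan-1001 master br-1001
$ ip link set vrf-1001 up; ip link set br-1001 up; ip link set vxlan-1001 up
The FRR reconfiguration, the connection to the OVS provider bridge, and the OVS flows are not shown here; they are handled by the agent as described above.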
External DataPlane nodes deployment
External Data Plane Management (EDPM) node set templates must have the following parameters set in order to enable the EVPN functionality provided by ovn-bgp-agent on compute nodes with DVR enabled:
edpm_ovn_bgp_agent_exposing_method: vrf
edpm_ovn_bgp_agent_evpn_local_ip: <IP>
edpm_frr_bgp_uplinks: <nics>
You can find example deployment templates here. One way to confirm the exposing method on a deployed compute node is sketched below.
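As a hedged verification sketch, the rendered agent configuration on a deployed compute node can be checked for the VRF exposing method. The container name ovn_bgp_agent and the path /etc/ovn-bgp-agent/bgp-agent.conf are assumptions based on upstream ovn-bgp-agent defaults; adjust them to the actual deployment:
# Assumed container name and configuration path
$ podman exec ovn_bgp_agent grep exposing_method /etc/ovn-bgp-agent/bgp-agent.conf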
Services must be listed in the following order (customize the services based on the scenario, maintaining the order):
services:
- bootstrap
- download-cache
- configure-network
- validate-network
- install-os
- configure-os
- frr
- ssh-known-hosts
- run-os
- reboot-os
- install-certs
- ovn
- neutron-ovn-igmp
- neutron-metadata
- ovn-bgp-agent
- libvirt
- nova-custom
Functional gaps
EVPN via kernel routing currently presents a few functional gaps. However, there are manual workarounds to get the functionality working.
1. Route exchange is not happening via EVPN due to FRR misconfiguration.
Once the VRF ID is configured on the OVN provider network logical switch, ovn-bgp-agent creates the FRR configuration according to the number of provider networks and corresponding VRFs. Currently, the FRR configuration is missing an address-family configuration, which blocks the EVPN type 5 route exchange. It can be configured manually in the FRR container on the compute node, for example:
$ podman exec -it frr bash
$ vtysh
compute-0# config
compute-0(config)# router bgp 64999
compute-0(config-router)# address-family l2vpn evpn
compute-0(config-router-af)# neighbor <IP> activate
compute-0(config-router-af)# advertise-all-vni
compute-0(config-router-af)# advertise ipv4 unicast
compute-0(config-router-af)# exit-address-family
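After the address family is configured, the EVPN peering and advertised routes can be checked from the same vtysh session using standard FRR show commands (the exact output depends on the deployment):
compute-0# show bgp l2vpn evpn summary
compute-0# show bgp l2vpn evpn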
2. Neutron integration to support VNI ID and VPN type
APIs in Neutron must be implemented in order to support configuration of the VPN type and VNI ID. Currently, these can be set manually using the following commands; the VNI ID can be any number.
$ ovn-nbctl set logical-switch <switch_uuid> external_ids:"neutron_bgpvpn\:type"="l3"
$ ovn-nbctl set logical-switch <switch_uuid> external_ids:"neutron_bgpvpn\:vni"="1001"
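To locate the logical switch UUID for a given Neutron provider network, the standard neutron-<network_id> naming convention of the Neutron OVN driver can be used; <provider_network> and <network_id> below are placeholders:
# Get the Neutron network ID of the provider network
$ openstack network show <provider_network> -f value -c id
# The corresponding OVN logical switch is named neutron-<network_id>
$ ovn-nbctl ls-list | grep <network_id>
# Confirm the external_ids after setting them
$ ovn-nbctl list Logical_Switch neutron-<network_id>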
3. EDPM Ansible configuration issue
edpm_ovn_bgp_agent_evpn_local_ip is not treated as an optional parameter when edpm_ovn_bgp_agent_evpn_nic is defined. The expectation is that the EVPN VTEP can be configured using either edpm_ovn_bgp_agent_evpn_local_ip or edpm_ovn_bgp_agent_evpn_nic, but currently specifying only edpm_ovn_bgp_agent_evpn_nic throws an error. The workaround is to define edpm_ovn_bgp_agent_evpn_local_ip: <IP>.
Future improvements
The following are potential future improvements that could enhance the experience of EVPN deployment.
1. Data path optimization
Currently, the EVPN implementation relies on kernel-based routing, which introduces encapsulation/decapsulation overhead. To significantly enhance performance, future efforts should focus on leveraging a fast data path. This could involve exploring options such as:
- Data Plane Development Kit (DPDK) integration: Utilizing DPDK can bypass the kernel network stack, allowing for high-speed packet processing directly in user space.
- Hardware offloading: Investigating the possibility of offloading VXLAN encapsulation/decapsulation and routing functions to network interface cards (NICs) that support these capabilities.
2. Provider network configuration
A current limitation of the EVPN implementation is that only one flat provider network can be exposed per VNI. To improve flexibility and scalability, we recommend prioritizing support for VLAN provider networks and using them with EVPN. This approach allows multiple isolated networks to be exposed through the same VNI, improving multi-tenancy and network segmentation capabilities.
Summary
By implementing these optimizations and addressing the current limitations, the performance and utility of EVPN with ovn-bgp-agent
in Red Hat OpenStack Services on OpenShift can be substantially improved.