VXLAN EVPN seems to be no longer supported when using newer FRR #7589

@sk4zuzu

Description

This is a typical VXLAN/EVPN config that used to work with one-deploy:

    vn:
      evpn0:
        managed: true
        template:
          VN_MAD: vxlan
          VXLAN_MODE: evpn
          IP_LINK_CONF: nolearning=
          PHYDEV: eth1
          AUTOMATIC_VLAN_ID: "YES"
          GUEST_MTU: 1450
          AR:
            TYPE: IP4
            IP: 172.17.2.200
            SIZE: 48
          NETWORK_ADDRESS: 172.17.2.0
          NETWORK_MASK: 255.255.255.0
          GATEWAY: 172.17.2.1
          DNS: 1.1.1.1
router:
  hosts:
    u1a1: { ansible_host: 10.2.80.10 }
node:
  hosts:
    u1b1: { ansible_host: 10.2.80.20 }
    u1b2: { ansible_host: 10.2.80.21 }

Unfortunately, this no longer works with newer FRR versions such as 10.6.0. Older versions, such as 10.2.1 in SLES 15, work fine.

u1b1# show bgp l2vpn evpn vni 2
VNI not found

When you force the VNI explicitly like this:

  address-family l2vpn evpn
    neighbor fabric activate
    advertise-all-vni
    vni 2
    exit-vni
  exit-address-family

You can see:

u1b1# show bgp l2vpn evpn vni 2
VNI: 2
  Type: L2
  Tenant-Vrf: default
  RD: 10.2.80.20:2
  Originator IP: 10.2.80.20
  Mcast group: 0.0.0.0
  MAC-VRF Site-of-Origin:
  Advertise-gw-macip : Disabled
  Advertise-svi-macip : Disabled
  SVI interface : unknown
  Import Route Target:
    65000:2
  Export Route Target:
    65000:2
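To repeat this check on several nodes, a small filter over the vtysh output can flag a VNI that is either missing or has no resolved SVI interface. This is only a sketch: the function name `vni_unresolved` is hypothetical, and the match strings are taken from the two outputs shown in this issue, so other FRR versions may word them differently.

```shell
# Reads the output of "show bgp l2vpn evpn vni <N>" on stdin.
# Exits 0 when the VNI is missing or its SVI interface is reported as unknown.
# The patterns below are copied from the outputs in this issue (assumption:
# other FRR releases print the same wording).
vni_unresolved() {
  grep -Eq 'VNI not found|SVI interface *: *unknown'
}
```

Possible usage on a node: `vtysh -c 'show bgp l2vpn evpn vni 2' | vni_unresolved && echo "VNI 2 looks unresolved"`.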

Initial investigation suggests that zebra is unable to associate the VNI with the bridge (note "SVI interface : unknown" in the output above), and the only way to make EVPN work is:

ip link set eth1.2 type vxlan local 10.2.80.20

where 10.2.80.20 in this particular environment is the main IP of the HV node.
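To find which VXLAN devices on a node would need this workaround, the detailed link listing can be scanned for devices that have no "local" address. The following is a minimal sketch, assuming the usual text layout of `ip -d link show type vxlan` (a numbered header line per device, followed by an indented line starting with "vxlan"); the function name `vxlan_missing_local` is hypothetical.

```shell
# Reads the output of "ip -d link show type vxlan" on stdin and prints the
# name of every VXLAN device whose detail line has no "local" address.
# Assumption: each device starts with "<index>: <name>:" and its VXLAN
# attributes are on an indented line beginning with "vxlan".
vxlan_missing_local() {
  awk '
    # Header line, e.g. "5: eth1.2: <BROADCAST,...>": remember the device name.
    /^[0-9]+: / { dev = $2; sub(/[:@].*$/, "", dev) }
    # VXLAN detail line: report the device if no local address is listed.
    /^[[:space:]]+vxlan / { if ($0 !~ / local /) print dev }
  '
}
```

Possible usage: `ip -d link show type vxlan | vxlan_missing_local`. Each device it prints is a candidate for the `ip link set ... type vxlan local ...` command above.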

It seems to me that the VXLAN/EVPN solution needs to be re-evaluated.

To Reproduce

For example, use one-deploy with the config above on Ubuntu 24.04, or deploy manually as described in the OpenNebula documentation. Then deploy 2 VMs and try to ping between them. On SLES 15 a similar config works without problems.

Expected behavior

The VNI should be detected correctly and BGP should propagate routing data.

Details

  • Affected Component: Networking
  • Hypervisor: KVM
  • Version: 7.2 (but it's more related to FRR version)

Additional context

It seems downgrading to 10.5 helps, so it can be considered a temporary workaround.
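A quick way to tell whether a host might need the downgrade is to compare its FRR version against the first release reported as broken here. This is a rough sketch under a loud assumption: the 10.6 cutoff is inferred only from the versions mentioned in this issue (10.6.0 fails, 10.5 and 10.2.1 work), not from a confirmed FRR change. The function name `frr_maybe_affected` is hypothetical, and `sort -V` requires GNU coreutils.

```shell
# Exits 0 when the given FRR version sorts at or after 10.6.
# Assumption: 10.6 is the first affected release, based only on the
# versions reported in this issue.
frr_maybe_affected() {
  # Version-sort the cutoff and the candidate; the smaller one comes first.
  first=$(printf '%s\n' "10.6" "$1" | sort -V | head -n 1)
  [ "$first" = "10.6" ]
}
```

Possible usage: `frr_maybe_affected 10.6.0 && echo "consider downgrading FRR"`. The installed version should appear in the output of `vtysh -c 'show version'`.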

Progress Status

  • Code committed
  • Testing - QA
  • Documentation (Release notes - resolved issues, compatibility, known issues)
