Use BFD to Speed Up BGP Convergence
Out-of-the-box EBGP is notoriously slow to converge – it can take up to three minutes to detect a failed EBGP neighbor. It’s possible to tweak BGP timers to detect a failed neighbor in a few seconds. Still, it’s much better to combine BGP with Bidirectional Forwarding Detection (BFD) – a lightweight protocol that can detect a link or node failure in milliseconds.
In this lab, you’ll use both mechanisms:
- You’ll tweak the BGP timers to detect a link failure within 10 seconds.
- You’ll enable BFD on an EBGP neighbor to reduce the failure detection time to approximately a second.
The routers in your lab use the following BGP AS numbers. Your routers advertise an IPv4 prefix each; X1 does not advertise BGP prefixes.
Start the Lab
Assuming you already set up your lab infrastructure:
- Change directory to
- Execute netlab up (device requirements, other options)
- Log into your devices (R1, R2) with netlab connect node and verify their configuration.
If you’re using netlab, you’ll get a fully-configured lab, including BGP prefix origination on R1 and R2, and EBGP sessions between R1/R2 and X1. If you’re using another lab platform, you’ll have to do a fair amount of prep work.
Checking the Convergence Time
- Log into R2 and enable debugging/logging of BGP updates (example: Cisco IOS) or monitoring of IP routing table (example: Arista EOS event monitor). You’ll have to inspect the BGP table on R2 every few seconds if your platform does not have similar functionality.
- Log into R1 and remove the IP address from the R1-X1 link.
It would be easier to shut down the R1-X1 link, but that trick doesn’t work with devices like Arista EOS that tear down the BGP session before shutting down the link.
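If your platform has no event monitor or BGP debugging, the "inspect the BGP table every few seconds" approach can be scripted. The following is a minimal sketch, not part of the lab: `time_until_prefix_gone` and its parameters are hypothetical names, and `fetch_table` stands for whatever callable you write to return the text of the BGP table on R2.

```python
import time
from typing import Callable, Optional


def time_until_prefix_gone(
    fetch_table: Callable[[], str],
    prefix: str,
    poll_interval: float = 1.0,
    timeout: float = 240.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll a BGP/routing table until `prefix` no longer appears in it.

    Returns the elapsed time in seconds, or None if the prefix is still
    present after `timeout` seconds. `fetch_table` is an assumption: any
    callable that returns the current table as text.
    """
    start = clock()
    while clock() - start < timeout:
        # Naive substring match; good enough for a single lab prefix
        if prefix not in fetch_table():
            return clock() - start
        sleep(poll_interval)
    return None
```

Start the poller right before you remove the IP address on R1, passing the prefix R1 advertises (192.168.42.0/24 in the example below). The result is only accurate to about one poll interval, so a one-second interval is fine for the three-minute and nine-second measurements but too coarse to measure BFD-triggered convergence.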
I used Arista EOS event monitor on R2, added an IP address on the R1 Ethernet1 interface, and removed it as soon as the BGP session was established. It took BGP almost exactly three minutes to detect the failed EBGP session between X1 and R1:
r2#sh event-monitor route match-ip 192.168.42.0/24
Reduce the BGP Timers
You can reduce BGP session timers to improve BGP convergence:
- On the R1-X1 EBGP session, set the keepalive timer to three seconds and hold timer (timeout) to nine seconds.
- Clear the EBGP session if needed – the BGP timers are negotiated during the BGP session establishment phase.
- Verify that you reduced the BGP timers with a command similar to show ip bgp neighbor detail.
- Repeat the BGP convergence measurements – X1 should revoke the BGP prefix advertised by R1 within nine seconds.
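The timer negotiation mentioned above explains why the session has to be re-established before the new values take effect. Here is a rough sketch of the rule from RFC 4271: each peer proposes a hold time in its OPEN message, and the session uses the smaller value. The function name is hypothetical, and capping the keepalive interval at one third of the hold time is an assumption; the RFC suggests it, but implementations differ.

```python
from typing import Tuple


def negotiate_timers(local_hold: int, peer_hold: int, local_keepalive: int) -> Tuple[int, int]:
    """Return (hold_time, keepalive) in seconds for an established session.

    RFC 4271: the hold time is the smaller of the two proposed values and
    must be either zero or at least three seconds.
    Assumption: the keepalive interval is capped at hold_time / 3.
    """
    hold = min(local_hold, peer_hold)
    if hold in (1, 2):
        raise ValueError("hold time must be 0 or at least 3 seconds")
    if hold == 0:
        return 0, 0  # zero hold time: keepalives are not sent
    keepalive = max(1, min(local_keepalive, hold // 3))
    return hold, keepalive


# One side proposes 9 seconds, the other 180 seconds (an assumed default
# matching the three-minute detection time measured earlier)
print(negotiate_timers(local_hold=9, peer_hold=180, local_keepalive=3))
```

Because the lower hold time wins, reducing the timers on one end of the session is enough to make both peers use the nine-second hold time.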
While some BGP implementations allow you to use very small BGP timers (for example, a one-second keepalive timer), you should not rely on that approach if you care about BGP convergence speed. It’s much better to combine BGP with BFD:
- Configure BFD on the EBGP neighbor session on R1
- Clear the BGP session if needed
- Verify that you have a working BFD session between R1 and X1. Most implementations display the BFD status of a BGP neighbor somewhere within the show ip bgp neighbor details (or similar) command. Some implementations have BFD-specific show commands like show bfd neighbors.
- Repeat the BGP convergence measurements – X1 should revoke the BGP prefix advertised by R1 almost immediately.
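To see where "approximately a second" comes from, it helps to work out the BFD detection time. The sketch below follows the asynchronous-mode formula from RFC 5880: the local system declares the session down after the remote detect multiplier times the larger of its own required minimum receive interval and the remote desired minimum transmit interval. The function name is hypothetical, and the 300 ms / multiplier 3 values are assumed defaults; check what your devices actually negotiate.

```python
def bfd_detection_time_ms(local_min_rx_ms: int, remote_min_tx_ms: int, remote_detect_mult: int) -> int:
    """BFD asynchronous-mode detection time in milliseconds (RFC 5880).

    The remote peer cannot send faster than we are willing to receive,
    so the effective interval is the larger of the two values.
    """
    return remote_detect_mult * max(local_min_rx_ms, remote_min_tx_ms)


# Assumed values: 300 ms intervals on both sides, detect multiplier 3
print(bfd_detection_time_ms(300, 300, 3))  # 900 ms
```

Lowering the interval on one side only does not help if the other side keeps a larger value, because the slower of the two intervals determines the detection time.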
This lab uses a subset of the 4-router lab topology. The following information might help you if you plan to build custom lab infrastructure:
- Customer router: use any device supported by the netlab BGP configuration module.
- netlab has to configure BFD and BGP timers on the external routers. The device you want to use as an external router has to be supported by the BFD configuration module and the bgp.session plugin.
- You must use Cumulus Linux on the external router if you’re using netlab release 1.6.3 or older.
- Git repository contains external router initial device configurations for Cumulus Linux.
Links in the lab topology, listed from each end:
- r1 -> x1
- r2 -> x1
- x1 -> r1
- x1 -> r2