Border Gateway Protocol (BGP) is a fascinating protocol because so much can be done with it. However, BGP has always had one weakness: convergence, the time the protocol takes to react to a change in the network. BGP was designed for scale, not speed, so slow convergence is something we’ve had to tolerate from its inception. Another truth about BGP is that it is not a big fan of multipath routing. The default maximum number of paths is one for both eBGP and iBGP. That value is configurable for both, but BGP will still select only one best path per destination, even though it can then install multiple entries in the routing table for that destination. All of that was true until Prefix-Independent Convergence (PIC) was introduced.
BGP Prefix-Independent Convergence
BGP Prefix-Independent Convergence is described in an IETF Internet-Draft that first came out in September 2012 and was updated a couple of times in 2013, but it is currently in an expired state. Cisco supports PIC on all its routing platforms (IOS, IOS-XE, IOS-XR and NX-OS). The BGP PIC Edge and Core feature for IP and Multiprotocol Label Switching (MPLS) improves convergence after a network failure, and it applies to both core and edge failures on IP and MPLS networks.
Normally, BGP can take several seconds to a few minutes to converge after a network change. At a high level, BGP goes through the following process:
- BGP learns of failures through Interior Gateway Protocol (IGP) events, Bidirectional Forwarding Detection (BFD) events or interface events.
- BGP withdraws the routes from the Routing Information Base (RIB), and the RIB withdraws the routes from the Forwarding Information Base (FIB) and distributed FIB (dFIB). This process clears the data path for the affected prefixes.
- BGP sends withdraw messages to its neighbors.
- BGP calculates the next best path to the affected prefixes.
- BGP inserts the next best path for affected prefixes into the RIB and the RIB installs them in the FIB and dFIB.
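The steps above can be sketched as a small Python model to show why the work grows with the number of affected prefixes. This is only an illustration under stated assumptions: the `Path` class, the `best_path` rule (shortest AS path, first listed wins ties) and the `converge_without_pic` function are hypothetical names invented here, not how any router implements BGP.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Path:
    next_hop: str
    as_path: Tuple[int, ...]


def best_path(paths: List[Path]) -> Optional[Path]:
    # Simplification (assumption): shortest AS path wins and ties go to the
    # first path listed. Real BGP best-path selection has many more steps.
    return min(paths, key=lambda p: len(p.as_path)) if paths else None


def converge_without_pic(bgp_table: Dict[str, List[Path]],
                         fib: Dict[str, str],
                         failed_next_hop: str) -> int:
    """Walk every affected prefix and return the number of FIB updates made."""
    fib_updates = 0
    for prefix, paths in bgp_table.items():
        if fib.get(prefix) != failed_next_hop:
            continue  # this prefix does not use the failed next hop
        # Withdraw the route: the data path for this prefix is now cleared.
        del fib[prefix]
        fib_updates += 1
        # Drop the failed path, then calculate the next best path.
        paths[:] = [p for p in paths if p.next_hop != failed_next_hop]
        replacement = best_path(paths)
        if replacement is not None:
            # Install the new best path for this prefix.
            fib[prefix] = replacement.next_hop
            fib_updates += 1
    return fib_updates
```

In this model every affected prefix costs one withdrawal, one best-path calculation and one installation, so a failure that touches hundreds of thousands of prefixes takes correspondingly longer to repair.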
BGP PIC affects prefixes for IPv4 and VPNv4 address families. For those prefixes, BGP calculates a second-best path along with the primary best path. (The second-best path is called the backup/alternate path.) BGP installs both the best and the backup/alternate paths for the prefixes into the BGP RIB. The backup/alternate path provides a fast reroute mechanism to counter a single network failure.
With BGP PIC, Cisco Express Forwarding (CEF) stores an alternate path per prefix. When the primary path goes down, CEF looks up the backup/alternate next hop for every prefix affected by the failure and promotes it immediately. Data plane convergence is achieved in under a second, with the exact time depending on whether BGP PIC is implemented in software or hardware.
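One way to picture why this repair does not depend on the number of prefixes is the sketch below. It assumes that many FIB entries point at one shared object holding the primary and repair next hops; the article does not describe CEF's internal data structures, so the `PathList` class and `fail_next_hop` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PathList:
    """Primary next hop plus a precomputed repair next hop (assumed model)."""
    primary: str
    repair: Optional[str] = None
    primary_up: bool = True

    def active_next_hop(self) -> Optional[str]:
        # Forward on the primary while it is up, otherwise on the repair path.
        return self.primary if self.primary_up else self.repair


def fail_next_hop(path_lists: Iterable[PathList], failed: str) -> int:
    """Mark the failed primary as down and return how many objects changed."""
    flips = 0
    for path_list in path_lists:
        if path_list.primary == failed and path_list.primary_up:
            path_list.primary_up = False
            flips += 1
    return flips


# Usage: 200 prefixes share one path list, so one flip repairs all of them.
shared = PathList(primary="192.168.12.2", repair="192.168.13.3")
fib = {f"10.0.{i}.0/24": shared for i in range(200)}
fail_next_hop([shared], "192.168.12.2")
```

Because the backup was chosen before the failure, nothing has to be recalculated per prefix at failure time; the control plane can catch up afterward while traffic already follows the repair path.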
Let’s look at an example of BGP PIC. Before BGP PIC is configured, the BGP table looks like this:
R1# sh ip bgp
BGP table version is 16, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
     Network          Next Hop            Metric LocPrf Weight Path
 *>  1.1.1.1/32       0.0.0.0                  0         32768 i
 *>  2.2.2.2/32       192.168.12.2             0             0 2 i
 *>  3.3.3.3/32       192.168.13.3             0             0 3 i
 *   4.4.4.4/32       192.168.13.3                           0 3 4 i
 *>                   192.168.12.2                           0 2 4 i
 *   5.5.5.5/32       192.168.13.3                           0 3 4 5 i
 *>                   192.168.12.2                           0 2 4 5 i
After BGP PIC is configured:
R1# show run | s router bgp
router bgp 1
 bgp log-neighbor-changes
 neighbor 192.168.12.2 remote-as 2
 neighbor 192.168.13.3 remote-as 3
 !
 address-family ipv4
  bgp additional-paths install
  network 1.1.1.1 mask 255.255.255.255
  neighbor 192.168.12.2 activate
  neighbor 192.168.13.3 activate
 exit-address-family
R1#show ip bgp
BGP table version is 21, local router ID is 192.168.13.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
     Network          Next Hop            Metric LocPrf Weight Path
 *>  1.1.1.1/32       0.0.0.0                  0         32768 i
 *>  2.2.2.2/32       192.168.12.2             0             0 2 i
 *>  3.3.3.3/32       192.168.13.3             0             0 3 i
 *b  4.4.4.4/32       192.168.13.3                           0 3 4 i
 *>                   192.168.12.2                           0 2 4 i
 *b  5.5.5.5/32       192.168.13.3                           0 3 4 5 i
 *>                   192.168.12.2                           0 2 4 5 i
R1#show ip bgp 5.5.5.5
BGP routing table entry for 5.5.5.5/32, version 21
Paths: (2 available, best #2, table default)
  Additional-path-install
  Advertised to update-groups:
     1
  Refresh Epoch 1
  3 4 5
    192.168.13.3 from 192.168.13.3 (3.3.3.3)
      Origin IGP, localpref 100, valid, external, backup/repair , recursive-via-connected
      rx pathid: 0, tx pathid: 0
  Refresh Epoch 1
  2 4 5
    192.168.12.2 from 192.168.12.2 (2.2.2.2)
      Origin IGP, localpref 100, valid, external, best , recursive-via-connected
      rx pathid: 0, tx pathid: 0x0
R1#show ip cef 5.5.5.5 detail
5.5.5.5/32, epoch 0, flags rib only nolabel, rib defined all labels
  recursive via 192.168.12.2
    attached to FastEthernet0/0
  recursive via 192.168.13.3, repair
    attached to FastEthernet0/1
In this example, you can see that the BGP table now has a backup route preselected (status code b) for each prefix that has an alternate path available. The FIB knows of the backup as well: CEF labels it “repair.”
This example is a stand-alone configuration: the BGP neighbors know nothing of PIC. By simply adding one more neighbor statement for sending or receiving additional paths (or both), additional paths can be exchanged with neighbors as well. This minimal configuration adds only a small number of CPU cycles to BGP and provides fast convergence. Sounds like a winner to me!