Peter Milchov 24th February 2021 15:35:34 4

Have you ever wondered what happens to your tier-0 dynamic routing in the case of a BGP failure? Should you use BFD or Graceful Restart to protect the control plane? Maybe both together? Let's examine these protocols and see how they fit with tier-0 gateways in NSX-T.


Graceful Restart for BGP

Graceful Restart (GR) is a mechanism designed to allow a control protocol to restart without interrupting network connectivity. It relies on a separate forwarding plane that does not necessarily share fate with the control plane. The GR implementation ensures the forwarding plane can continue to function while the routing protocol restarts.

RFC4724 says:

The Graceful Restart Capability is a new BGP capability that can be used by a BGP speaker to indicate its ability to preserve its forwarding state during BGP restart. 
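To make the capability less abstract, here is a minimal sketch of how it is laid out on the wire, based on my reading of RFC 4724: capability code 64, four restart flag bits (the top one is the Restart State bit), a 12-bit restart time in seconds, then one AFI/SAFI/flags tuple per address family, where the top flag bit says the forwarding state was preserved. The function names are hypothetical and the 120-second restart time in the usage note is just an example value.

```python
import struct

GR_CAPABILITY_CODE = 64  # Graceful Restart capability code (RFC 4724)


def encode_gr_capability(restart_time, restarting=False, families=()):
    """Encode a Graceful Restart capability (code, length, value).

    families is an iterable of (afi, safi, forwarding_preserved) tuples.
    """
    if not 0 <= restart_time <= 0x0FFF:
        raise ValueError("restart time must fit in 12 bits")
    flags = 0x8 if restarting else 0x0  # R bit is the top bit of the 4 flag bits
    value = struct.pack("!H", (flags << 12) | restart_time)
    for afi, safi, preserved in families:
        af_flags = 0x80 if preserved else 0x00  # F bit: forwarding state preserved
        value += struct.pack("!HBB", afi, safi, af_flags)
    return struct.pack("!BB", GR_CAPABILITY_CODE, len(value)) + value


def decode_gr_capability(data):
    """Decode the bytes produced by encode_gr_capability."""
    code, length = struct.unpack("!BB", data[:2])
    if code != GR_CAPABILITY_CODE:
        raise ValueError("not a Graceful Restart capability")
    value = data[2:2 + length]
    (head,) = struct.unpack("!H", value[:2])
    families = []
    for offset in range(2, len(value), 4):
        afi, safi, af_flags = struct.unpack("!HBB", value[offset:offset + 4])
        families.append((afi, safi, bool(af_flags & 0x80)))
    return {
        "restarting": bool(head & 0x8000),
        "restart_time": head & 0x0FFF,
        "families": families,
    }
```

For example, `encode_gr_capability(120, families=[(1, 1, True)])` describes a speaker that advertises a 120-second restart time and claims to preserve forwarding state for IPv4 unicast (AFI 1, SAFI 1).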

GR is useful where the BGP peer has dual supervisors: the GR capability preserves the existing routes and the forwarding state during a supervisor failover.

A BGP control plane restart can happen because of a supervisor switchover, planned maintenance or a crash of the active routing engine. As soon as a GR-enabled router restarts, or its control plane fails, its peer marks the routes learned from it as stale. When forwarding, the peer does not differentiate between stale and other routing information; it keeps forwarding to both. Routes are marked as stale so that they can be deleted if the graceful restart timer expires, or replaced by fresh routing updates once the control plane session is re-established.
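The stale-route behaviour described above can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not how any real BGP implementation stores its tables: the class and method names are hypothetical, lookups are exact-match only, and a single `purge_stale` call stands in for both "the restart timer expired" and "the peer finished resending its routes".

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    next_hop: str
    stale: bool = False


@dataclass
class HelperRib:
    """Routes learned from one GR-capable peer, as seen by its neighbour."""
    routes: dict = field(default_factory=dict)

    def learn(self, prefix, next_hop):
        # A fresh update always replaces any stale entry for the prefix.
        self.routes[prefix] = Route(next_hop)

    def peer_restarted(self):
        # The session dropped: keep the routes, but mark them all as stale.
        for route in self.routes.values():
            route.stale = True

    def lookup(self, prefix):
        # Forwarding ignores the stale flag entirely.
        route = self.routes.get(prefix)
        return route.next_hop if route else None

    def purge_stale(self):
        # Called when the restart timer expires or the peer has finished
        # resending its routes: whatever is still stale is removed.
        self.routes = {p: r for p, r in self.routes.items() if not r.stale}
```

A typical sequence is `learn` for each prefix, `peer_restarted` when the session drops, `learn` again for every prefix the peer re-advertises, and finally `purge_stale`. Between the second and the last step, `lookup` keeps returning the old next hops.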

However, in my recent experience I have not seen a single customer use a dual-supervisor switch to peer with their tier-0 gateways. I am not saying nobody uses the big fat boys anymore, but nowadays leaf/spine networks have taken over the datacentres. They provide a scaled-up control plane and smaller fault domains, unlike dual-supervisor switches, which have the potential to bring down half the network in case of an outage.



Bidirectional Forwarding Detection (BFD)

According to RFC 5880:

BFD is a simple Hello protocol that, in many respects, is similar to the detection components of well-known routing protocols.  A pair of systems transmit BFD packets periodically over each path between the two systems, and if a system stops receiving BFD packets for long enough, some component in that particular bidirectional path to the neighboring system is assumed to have failed.

BFD is especially helpful where an alternative path is available in the network. It detects a failure in the end-to-end forwarding path faster, so forwarding can switch to the next best path. The BFD implementation in NSX-T allows sub-second convergence when the tier-0 runs on a Bare Metal Edge Node (300 ms interval x 3) and convergence in about a second and a half when it runs on a VM Edge Node.
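The arithmetic behind those numbers is simple enough to sketch. In asynchronous mode, RFC 5880 defines the detection time as the remote system's detect multiplier times the agreed transmit interval, which is the greater of the local Required Min RX interval and the remote Desired Min TX interval. The function name below is hypothetical, and the 500 ms interval for the VM Edge Node is an assumption inferred from "a second and a half" with a multiplier of 3.

```python
def bfd_detection_time_ms(local_required_min_rx_ms,
                          remote_desired_min_tx_ms,
                          remote_detect_mult):
    """Detection time in asynchronous mode, following RFC 5880 section 6.8.4."""
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Bare Metal Edge Node: 300 ms x 3 (from the text above)
bare_metal_ms = bfd_detection_time_ms(300, 300, 3)

# VM Edge Node: 500 ms x 3 (interval assumed, see the note above)
vm_edge_ms = bfd_detection_time_ms(500, 500, 3)
```

Note that the slower side wins: if the ToR only accepts packets every 1000 ms, a 300 ms timer on the Edge Node does not buy you sub-second detection.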


BFD for BGP with Graceful Restart

The NSX-T Reference Design Guide states:

It is recommended to enable GR If the Edge node is connected to a dual supervisor system that supports forwarding traffic when the control plane is restarting. This will ensure that forwarding table data is preserved and forwarding will continue through the restarting supervisor or control plane. Enabling BFD with such a system would depend on the device-specific BFD implementation. If the BFD session goes down during supervisor failover, then BFD should not be enabled with this system. If the BFD implementation is distributed such that the BFD session would not go down in case of supervisor or control plane failure, then enable BFD as well as GR.
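The quoted recommendation boils down to a two-way decision, which can be written as a small function. This is only my paraphrase of the passage: the function name is hypothetical, and it covers just the case the passage covers, namely a peer with dual supervisors that keeps forwarding while its control plane restarts.

```python
def guide_recommendation(bfd_survives_supervisor_failover):
    """Return (enable_gr, enable_bfd) for a dual-supervisor peer that keeps
    forwarding while its control plane restarts.

    Other kinds of peer are not covered by the quoted passage.
    """
    enable_gr = True  # GR is recommended for this kind of peer
    # BFD is only worth enabling if the session stays up during the failover.
    enable_bfd = bool(bfd_survives_supervisor_failover)
    return enable_gr, enable_bfd
```

In other words, GR is always on for such a peer, and BFD is added only when the peer's BFD implementation is distributed.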

That is fine, but there is a catch in that recommendation: you cannot configure the Control Plane Independent (C) bit on a tier-0 gateway. It simply lacks that functionality, even though the NSX-T routing engine runs FRR under the hood, which supports that capability.

On the left is my pfSense with the FRR package installed; on the right is my tier-0 gateway:

BFD cBit
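If you want to check the C bit in your own packet capture, the following sketch pulls it out of the mandatory 24-byte section of a BFD control packet. It follows my reading of the RFC 5880 header layout: the second byte holds the 2-bit state followed by the P, F, C, A, D and M flags. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_bfd_header(packet):
    """Parse the first four bytes of a BFD control packet (RFC 5880)."""
    if len(packet) < 24:
        raise ValueError("a BFD control packet is at least 24 bytes long")
    vers_diag, state_flags, detect_mult, length = struct.unpack("!BBBB", packet[:4])
    return {
        "version": vers_diag >> 5,
        "diag": vers_diag & 0x1F,
        "state": state_flags >> 6,  # 0 AdminDown, 1 Down, 2 Init, 3 Up
        "poll": bool(state_flags & 0x20),
        "final": bool(state_flags & 0x10),
        "control_plane_independent": bool(state_flags & 0x08),  # the C bit
        "detect_mult": detect_mult,
        "length": length,
    }
```

A peer that sets `control_plane_independent` is telling you its BFD session does not share fate with its control plane, which is the condition the design guide asks for before combining BFD with GR.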

Therefore, it is safe to conclude there is no value in configuring BFD and Graceful Restart together on a tier-0 gateway. Even if you do configure both, losing all the BGP sessions is a failure condition for the tier-0 SR and triggers a failover to another Edge Node, so preserving the forwarding state on the failed node makes no sense.


Graceful Restart Helper

Graceful Restart Helper mode is the ability to assist a neighbouring router that is attempting a graceful restart. VMware recommends enabling GR Helper; however, I fail to see the benefit, since we do not want to mix GR and BFD on the ToR switches for the reasons explained above. No GR on the ToR means no need for GR Helper on the tier-0 gateway.


Any comments are welcome.

Thanks for reading!













  • Antonionug 29th December 2021 03:33:02 Reply

    BFD can also be enabled per BGP neighbor for faster failover. BFD timers depend on the Edge node type.

  • Antonionug 30th December 2021 05:49:40 Reply

    The configuration for frr-02 looks like this:

        router bgp 65001
         bgp router-id
         neighbor remote-as 65000
         neighbor bfd
         neighbor remote-as 65000
         neighbor bfd
        !

    FRR routing tables

    We begin by having a look at the FRR routing tables.
