BGP graceful restart for software update only?

Vincent Bernat bernat at luffy.cx
Fri Oct 18 09:17:35 CEST 2019


 ❦ 17 October 2019 16:47 +01, Neil Jerram <neil at tigera.io>:

> I'm not sure I'm persuaded by your argument, though, that LLGR is desirable
> because BFD could generate a false negative.  Wouldn't it be better to
> eliminate those false negatives by allowing BFD to run more slowly and/or
> with a larger multiplier?

If you are using BFD, a slower BFD means a slower reaction when a link
goes down. With 1s*3, you may have a 3-second impact when someone
unplugs a cable (on non-P2P BGP sessions). This may or may not be
acceptable. In my context, I was selling the design against P2P LACP
links where detection was configured at around 100ms on the Linux side
(Linux polls the NIC driver at a configurable interval to know whether a
link is down). So, 3s vs 100ms would be hard to sell. 450ms (150ms * 3)
vs 100ms was easier. It's a bit apples and oranges, but the context was
an L2 design vs an L3 design over distinct L2 fabrics.
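
To make the arithmetic explicit, here is a minimal sketch in Python. It
assumes the simplified view that BFD detection time is the interval
multiplied by the multiplier; the function name and the constant are
made up for illustration and are not part of BIRD or any BFD
implementation.

```python
def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Simplified detection time: interval * multiplier (assumption)."""
    return interval_ms * multiplier


# Link-down detection on the Linux side, as quoted in the text above.
LINK_POLL_MS = 100

for interval_ms, multiplier in [(1000, 3), (150, 3)]:
    detect_ms = bfd_detection_time_ms(interval_ms, multiplier)
    ratio = detect_ms / LINK_POLL_MS
    print(f"BFD {interval_ms}ms * {multiplier} = {detect_ms}ms "
          f"({ratio:.1f}x the {LINK_POLL_MS}ms link polling)")
```

The two cases are the ones discussed above: 3000ms against 100ms, and
450ms against 100ms.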

But BFD has nothing to do with your request. You can use LLGR without
BFD.

> IIUC, the benefit of GR, for a planned software update, is that it can
> completely avoid any route flapping in the connected BGP peers - in terms
> both of the data path and the BGP control plane.  With LLGR in that
> scenario, I believe there will be BGP control plane traffic, and other
> BIRDs updating their local kernel with least-preferred routes.  I suppose
> it is still true that there is no data path flapping, but there *has* been
> a lot of control plane churn, which traditional GR would have avoided.  Is
> that understanding all correct?

Yes, I think you are right. Depending on what is more important for you,
I still think LLGR is an improvement over GR. While you may have some
churn at the control plane level with LLGR due to the deprioritization
of some routes, you do not keep non-working routes during an outage (as
long as the routes are present on another peer).
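
The difference can be illustrated with a toy model in Python. This is
only a sketch under simplifying assumptions: route selection is reduced
to local preference, and "stale" stands for a route retained while its
session is down. The Route class and best_route() are hypothetical names
and do not reflect how BIRD implements selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    peer: str
    local_pref: int
    stale: bool = False  # retained while the peer's session is down


def best_route(routes, llgr: bool) -> Route:
    """Pick the best route for one prefix (toy model, assumption)."""
    def key(route):
        # LLGR: a stale route is least preferred, so any fresh route wins.
        # GR: staleness is ignored, the stale route keeps its preference.
        fresh = (not route.stale) if llgr else True
        return (fresh, route.local_pref)
    return max(routes, key=key)


routes = [Route("peer-a", 200, stale=True), Route("peer-b", 100)]

print(best_route(routes, llgr=False).peer)  # stale route still selected
print(best_route(routes, llgr=True).peer)   # fresh route from other peer
```

With GR the stale route from peer-a stays selected even if peer-a is
really down; with LLGR the route from peer-b takes over, and the stale
route is used only when nothing else is available.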

I don't think it is possible to have GR only for upgrades. It would
require reversing the logic: sending a BGP Cease message saying "upgrade
in progress" to let the other end know it should keep the routes. Maybe
something could be done with a community similar to the one used for
graceful session shutdown, but with an associated timer.
-- 
Writing is turning one's worst moments into money.
		-- J.P. Donleavy

