[PATCH] babel: Send out low-interval hello on shutdown

Toke Høiland-Jørgensen toke at toke.dk
Sat Apr 23 22:31:26 CEST 2022


Ondrej Zajicek <santiago at crfreenet.org> writes:

> On Wed, Apr 20, 2022 at 01:43:21AM +0200, Toke Høiland-Jørgensen wrote:
>> When shutting down a Babel instance we send a wildcard retraction to make
>> sure all peers can quickly switch to other route origins. Add another small
>> optimisation borrowed from babeld: sending a Hello message (along with the
>> retraction) with a very low interval.
>> 
>> This will cause neighbours to modify their expiry timers for the node's
>> state to quickly time it out, thus conserving resources in the network.
>
> Hi
>
> Thanks, merged. Just changed BABEL_TIME_UNITS to BABEL_MIN_INTERVAL.

Awesome! I noticed we had that define as well and did the same rename in
my local tree :)
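
To put a rough number on the effect, here is a small sketch of the
receiver-side timers as described in RFC 8966, Appendix A.1: the hello
timer is set to 1.5 times the advertised interval when a Hello arrives,
and on each expiry a 0 bit is shifted into the 16-bit history and the
timer is reset to the advertised interval. This is a simplification and
not BIRD's actual code; the function name is made up, and the 4 s and
10 ms intervals are assumed values for illustration only.

```python
HISTORY_BITS = 16
HISTORY_MASK = (1 << HISTORY_BITS) - 1


def time_to_flush_ms(history: int, interval_ms: int) -> int:
    """Time until an unrefreshed hello history becomes empty.

    Assumes no further Hellos arrive after the one that advertised
    interval_ms, and that the newest Hello sits in the lowest bit.
    """
    elapsed = 0
    # First timeout after a received Hello: 1.5 x advertised interval.
    timeout = interval_ms * 3 // 2
    while history:
        elapsed += timeout
        # Timer expired: record a missed Hello by shifting in a 0 bit.
        history = (history << 1) & HISTORY_MASK
        # Later timeouts use the advertised interval without the margin.
        timeout = interval_ms
    return elapsed


# Full history, last Hello advertised an assumed 4 s interval.
print(time_to_flush_ms(0xFFFF, 4000))  # 66000
# Same history, but the final Hello advertised an assumed 10 ms interval.
print(time_to_flush_ms(0xFFFF, 10))    # 165
```

Under these assumptions the neighbour's history drains in well under a
second instead of about a minute.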

> BTW, when we added CI tests for Babel authentication, we noticed that it
> has rather slow convergence after reconfiguration. The reason is that
> when authentication changed to become non-matching, it took many missed
> (misauthenticated) hellos to sufficiently clean up hello_map for neighbor
> to go down.

Just to make sure I'm understanding the scenario correctly: A node is
reconfigured to turn on authentication, but not all peers use it; so now
some peers are essentially cut off (basically as if they had just
dropped off the network). However, their hello history remains, so their
routes stay active until they time out. Right?

> Perhaps there could be some decision in iface reconfiguration that the
> change is significant to affect reachability of neighbors and in such
> case deprecate some/most items in hello_map.

Hmm, yeah, we could do something like that I suppose. I'm wondering if
it should really be stronger, though? Enabling auth on an
already-running instance is an increase in the "security level" of the
interface, so should we really be keeping unauthenticated data around at
all? I.e., maybe we should simply flush all neighbour entries on an
interface when enabling auth on that interface?
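
As a rough illustration of the "flush everything" option, something
along these lines is what I have in mind. This is only a sketch with
made-up types and names (Iface, Neighbour, reconfigure_auth), not
BIRD's real data structures or reconfiguration hooks.

```python
from dataclasses import dataclass, field


@dataclass
class Neighbour:
    addr: str
    hello_map: int = 0  # hello history learned from this neighbour


@dataclass
class Iface:
    name: str
    auth_enabled: bool = False
    neighbours: dict = field(default_factory=dict)


def reconfigure_auth(iface: Iface, new_auth_enabled: bool) -> list:
    """Apply a new auth setting; return the addresses that were flushed."""
    flushed = []
    if new_auth_enabled and not iface.auth_enabled:
        # Auth was just turned on: nothing learned so far was
        # authenticated, so drop every neighbour on this interface.
        flushed = sorted(iface.neighbours)
        iface.neighbours.clear()
    iface.auth_enabled = new_auth_enabled
    return flushed


ifa = Iface("eth0", neighbours={"fe80::1": Neighbour("fe80::1"),
                                "fe80::2": Neighbour("fe80::2")})
print(reconfigure_auth(ifa, True))  # ['fe80::1', 'fe80::2']
print(reconfigure_auth(ifa, True))  # []
```

Neighbours that do authenticate would then have to be re-learned from
scratch, which is the disruptive part of this approach.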

Or another, slightly less disruptive, option is to flush a neighbour
if we receive a packet from it that fails auth, and that neighbour
doesn't have the 'auth_passed' flag set? For existing neighbours that
succeed auth, that flag should be set immediately on the next packet we
receive from that neighbour, whereas this would quickly clear out
neighbours that fail it. We could maybe speed things up further by
immediately issuing auth challenges to all neighbours when the config
changes?
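
The per-packet variant could look roughly like the sketch below. Again
this is hypothetical: only the 'auth_passed' flag name comes from the
discussion above, while handle_packet, the Neighbour type and the
return strings are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Neighbour:
    addr: str
    auth_passed: bool = False


def handle_packet(neighbours: dict, addr: str, auth_ok: bool) -> str:
    """Process one received packet; return what happened to it."""
    n = neighbours.get(addr)
    if auth_ok:
        if n is None:
            n = neighbours[addr] = Neighbour(addr)
        # A valid packet marks the neighbour as authenticated.
        n.auth_passed = True
        return "accepted"
    if n is not None and not n.auth_passed:
        # Known neighbour that has never passed auth: flush it now
        # instead of waiting for its hello history to drain.
        del neighbours[addr]
        return "flushed"
    # Unknown sender, or a neighbour that already authenticated:
    # just drop the packet and keep any existing state.
    return "dropped"


nbrs = {"fe80::1": Neighbour("fe80::1"), "fe80::2": Neighbour("fe80::2")}
print(handle_packet(nbrs, "fe80::1", True))   # accepted
print(handle_packet(nbrs, "fe80::2", False))  # flushed
print(handle_packet(nbrs, "fe80::1", False))  # dropped
```

Keeping a neighbour that already has the flag set presumably also means
a single forged packet can't be used to knock out an authenticated
peer.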

-Toke


