Issues in Babel

Ondrej Zajicek santiago at crfreenet.org
Thu Oct 19 13:46:25 CEST 2017


On Thu, Oct 19, 2017 at 01:23:45PM +0200, Toke Høiland-Jørgensen wrote:
> Ondrej Zajicek <santiago at crfreenet.org> writes:
>
> > 1) Hello expiration was not properly implemented, as i understand
> > RFC, it should reset the timer based on last packet hello interval
> > and time-out repeatedly until hello map is zero. Fixed in [2].
> 
> Yeah, I think you are right. However, your fix doesn't apply
> BABEL_HELLO_EXPIRY_FACTOR to the value stored in last_hello_int, which
> is probably a bug, right?

I think it is correct, see section A.1:

  If the Hello history is empty (it contains 0 bits only), the neighbour
  entry is flushed; otherwise, it resets the neighbour's Hello timer to
  the value advertised in the last Hello received from this neighbour (no
  extra margin is necessary in this case).

The BABEL_HELLO_EXPIRY_FACTOR is the extra margin for jitter.


> > 3) Seems like the fix of hello expiration caused that when neighbor
> > disappears, it is no longer removed timely and routes have to wait
> > for route timeout, which is about a minute for 4 s hello interval.
> >
> > It seems that by RFC, we should recompute route metrics whanever rxcost
> > reported by IHU is changed, so IHU expiration should be enough to make
> > all related routes unreachable. Apparently, this is not done in BIRD.
> 
> Hmm, yeah, babel_expire_ihu() should probably do something similar to
> babel_flush_neighbor(), i.e., loop through all selected routes from that
> neighbor and re-run the route selection on them?

I think route selection re-run should be done when IHU changes rxcost
(either explicitly, or by expiration). It is explained in section 3.4.2.:

  After updating a neighbour's txcost, the receiving node recomputes
  the neighbour's cost (Section 3.4.3) and runs the route selection
  procedure (Section 3.6).

And it seems that BIRD should re-run the route selection even if
effective cost is changed by Hello receipt or expiration:

  After receiving a Hello, or determining that it has missed one, the
  node recomputes the association's cost (Section 3.4.3) and runs the
  route selection procedure (Section 3.6).



> > 4) Seems like we do not trigger updates when a route update with just
> > increased seqno due to forwarded unicast request is received. That also
> > slows convergence. IMHO we should do in babel_handle_update() something
> > like:
> >
> > if (route_is_selected() && matching_entry_in_seqno_request_cache())
> >   babel_trigger_update()
> 
> Not sure we should use the request cache for this? We might not have
> seen the request, but a downstream still needs the updated seqno; so
> maybe just do something like:

IMHO the fast propagation is necessary just for the originator of the
seqno request and there exist at least one path of routers that
propagates such request. Others learn updated seqno during regular
updates. It is mentioned in section 3.7.2:

  A node SHOULD NOT send triggered updates for other reasons, such as
  when there is a minor fluctuation in a route's metric, when the
  selected next hop changes, or to propagate a new sequence number
  (except to satisfy a request, as specified in Section 3.8).


> > Do you think that my interpretation of these issues and the fixes are
> > correct?
> 
> Think so, yeah; nice job. And thanks for taking the time to go over this :)

You are welcome.

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <http://trubka.network.cz/pipermail/bird-users/attachments/20171019/a6432fc8/attachment.asc>


More information about the Bird-users mailing list