[PATCH RFC 1/1] Add the Babel routing protocol (RFC 6129) to Bird.

Ondrej Zajicek santiago at crfreenet.org
Tue Aug 25 13:19:38 CEST 2015


On Tue, Aug 18, 2015 at 11:06:03PM +0100, Toke Høiland-Jørgensen wrote:
> The implementation is currently IPv6-only (since Bird does not support
> mixed IPv4/6 protocols in the same instance), but is otherwise complete
> as far as MUSTs in the RFC is concerned, with one exception: The RFC
> specifies that jitter must be applied to packet transmission times. This
> implementation currently does not buffer TLVs before sending them, but
> it does introduce some randomness to sending of global updates by timer
> and event scheduling randomness in the Bird core.

Hi, i am finally sending the first batch of comments. This time it is
mostly general comments.

> Some cleanup and possible reorganisation is needed in the code. Consider
> this patch an RFC for the general structure of the protocol addition.
> 
> The implementation started out as a copy of the RIP protocol
> implementation, so some patterns in how the communication with the core
> is setup is taken from there (but has diverted quite a bit as the
> different parts of the protocol implementation fell into place).

Using the old RIP protocol implementation as a basis is a problem - that
code has several design problems, some of these seems to be shared by the
Babel code. Not to mention its idiosyncratic coding conventions are far
from ones used in the rest of BIRD. We are currently finishing rewrite of
BIRD RIP code to eliminate these problems from RIP. I will send you the
new RIP code for inspiration.


Route propagation between protocol and core:

Old RIP uses core tables to keep some of its route state and therefore
has nontrivial interaction with core. This should be avoided - ideally
core tables should be handled as a black box with rte_update() in one
direction and rt_notify() (and some other auxiliary hooks) in the other.

Another important question is how to handle and compare external routes
relative to internal routes. For vector-distance (route-propagating)
protocols it seems natural to handle external routes received from core
like if they are from another neighbor. But this approach is problematic
because internal tables and core tables will fight for the right to
choose the best route.

The proper approach is that the protocol keeps information about incoming
routes, choose best of incoming routes, propagate that to core, then core
chooses the the global best route, and propagate that back to protocols.
The protocol learn that route (which may be either its or external), keep
it as 'outgoing' and propagate it to neighbors.

Also note that unreachable routing entries should not be propagated to core.


Interface and neighbor events:

You should create babel_interface structure during babel_if_notify()
hook, not as a result from acquiring lock. The used approach is
potentially racy (you could end with multiple locks if you get multiple
up/down events before lock events). Note that the lock could be allocated
from the iface resource pool. See ospf_iface_new()/ospf_iface_add() or
radv_iface_new()/radv_iface_add().

I am bit worried about liveness of neighbors and ordering of hooks.
Say an iface disappears. First you probably get babel_neigh_notify(),
which does nothing, then you get babel_if_notify(), which will do
kill_iface() and babel_flush_neighbor(), which unregisters data from
the neighbor (bn->neigh->data = NULL), but such neighbor is likely
to be already disposed.



Some minor notes:

Please add comments to each 'list' field in structures to keep which
structures are members of that lists (see e.g. radv.h).

Having timers for each route or entry is not necessary a good idea,
perhaps this could be handled by keeping just timestamps in routes and
having common 'timeout' hook, which would walk the fib table and process
nodes - see rip_timer() or ospf_disp()/ospf_update_lsadb().

Also having resource pools for entries seems excessive. babel_route
structures could be just allocated from a common slab.

Not sure what is a point of struct neighbor_route. If you just want to
keep babel_route in two linked list, just add two 'node' fields in it and
use SKIP_BACK macro. See e.g. struct proto.

Avoid unnecessary floating points, just use fixed point math instead.

It is a good idea to have a ptr from babel_iface to babel_patt instead of
duplicating all the values. Although you would have to change it during
reconfiguration, that is the simplest part of reconf. And if you do not
have reconfiguration implemented, then the whole protocol is restarted
anyways.

Do you have any kind of multihop unicast communication in Babel, or
everything is single-hop to neighbors?

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://trubka.network.cz/pipermail/bird-users/attachments/20150825/b83be8e5/attachment.asc>


More information about the Bird-users mailing list