[PATCH] Add the Babel routing protocol to Bird
Toke Høiland-Jørgensen
toke at toke.dk
Thu Oct 8 17:12:36 CEST 2015
Ondrej Zajicek <santiago at crfreenet.org> writes:
> I am sorry i didn't got around to review it sooner, but doing a proper
> review for a whole new (and unfamiliar) protocol takes nontrivial
> amount of time.
>
> Generally, the code looks good and i am determined to finally merge
> it, but it definitely needs more refinement before that.
Hi Ondrej
Awesome! Many thanks for your detailed comments, and thank you for going
easy on my (lack of) C skills :)
I'll go through and do another revision and resubmit. Some answers /
additional questions inline below.
> First, there is an issue with route selection. Proper route selection in BIRD
> should work this way:
>
> 1) Babel chooses the best route from received routes, say selected_in
> 2) If selected_in changes, it is propagated to the core by rte_update()
> 3) Core compares routes from multiple protocols and one is exported to the Babel
> 4) This route is either Babel own route or external route
> 5) In babel_rt_notify(), Babel receives that route, finds appropriate internal
> route (or create a new one) and stores it in say selected_out.
> 6) Even if the received selected_out is an external route, Babel
> do not stop propagating selected_in to the core
> 7) selected_out is propagated to the network in periodic updates.
> If selected_out changes, it is an event for triggered updates.
Ah, thank you! I struggled a great deal with figuring out how this was
supposed to work. I'll fix that, and also try to write this up in a form
that is suitable for the documentation and include that in the next
version of the patch.
Speaking of documentation: What kind of documentation do you expect in
the patch? Obviously I'll include user's documentation on the
configuration options, but some of the other protocols also has API
documentation of (some of) their function hooks, and it's not
immediately obvious to me what the purpose of those are? Should I add
that for Babel as well?
> Second, can we really expect that TLVs in received packets are
> aligned? Although all TLVs looks like suited to 32-bit alignment, i
> did not found mentioned it in Babel RFC and availability of Pad1, PadN
> TLVs and per-byte compressible addresses suggest that they generally
> are not aligned.
>
> In that case, we have to use get_uXX()/put_uXX() functions instead of
> directly accessing babel_tlv_* structure fields. Or we could copy each
> TLV to a local buffer before calling ntoh+handle callbacka, as each
> TLV internally is aligned, and enforce alignment for outgoing packets.
> I would generally prefer the first approach, but the second approach
> is simpler way to fix it.
No, I don't think they are necessarily 32-bit aligned. I struggled with
alignment issues for the 64-bit values (and got it to work by using the
__attribute__((packed)) attribute in the structure definitions). I'll
try to figure out how to do this properly in general.
> Third, although i am OK with accepting IPv6-only implementation, it
> should be ready for future expansion. TLVs with AE 1 could be skipped,
> but there should not be hidden hardwired IPv6 assumptions in the code.
>
> Mainly, TLV structures should use addr[0] at the end instead of
> addr[2] and addr[4]. Address space requirements should be handled
> explictly based on AE.
Noted.
> Fourth, it seems like triggered updates are not really implemented as
> they should. Route change triggers propagation, but the whole routing
> table is always propagated instead of just the changed entries. This
> could be easily fixed by having timestamp on each babel_entry updated
> everytime when a significant change of selected_out route happened.
> For regular updates all routes are propagated, for triggered updates
> only the entries with timestamps >= time when the trigger update was
> requested. See want_triggered and tx_changed fields in the new RIP
> code.
Hmm, I'm not sure just dumping everything in a triggered update is
strictly a violation of the RFC. But I suppose it does have obvious
impacts on efficiency. Will try to be smarter about that.
> Fifth, ... (snip)
>
> Sixth, ... (snip)
>
> Seventh ... (snip)
>
> Finally, the coding style ... (snip)
Will fix these as well.
> Rest are comments specific to lines in the patch:
Most of these seems obvious how to fix; one or two questions below:
>> diff --git a/nest/route.h b/nest/route.h
>> ...
>> @@ -538,6 +546,7 @@ extern struct protocol *attr_class_to_protocol[EAP_MAX];
>> #define DEF_PREF_RIP 120 /* RIP */
>> #define DEF_PREF_BGP 100 /* BGP */
>> #define DEF_PREF_PIPE 70 /* Routes piped from other tables */
>> +#define DEF_PREF_BABEL 42 /* Babel */
>
> Default could be higher, e.g. 130
OK. What do these actually mean? Is it simply that a higher priority
wins if two protocols have the same route?
>> +enum babel_iface_type_t {
>> + BABEL_IFACE_TYPE_WIRED,
>> + BABEL_IFACE_TYPE_WIRELESS,
>> + BABEL_IFACE_TYPE_MAX
>> +};
>
> I would prefer to have explicit values here. Perhaps also reserve 0
> for undefined?
Well, undefined == wired in this case. I.e. use the simple metric
computation mechanism unless explicitly told to treat the interface as
wireless.
I don't suppose Bird has a way of telling automatically what type of
interface we are dealing with?
>> +static void
>> +babel_expire_routes(struct babel_proto *p)
>> +{
>> + struct babel_entry *e;
>> + struct babel_route *r, *rx;
>> + node *n;
>> + FIB_WALK(&p->rtable, n) {
>> + e = (struct babel_entry *)n;
>> + WALK_LIST_DELSAFE(r, rx, e->routes) {
>> + if(r->refresh_time && r->refresh_time <= now) {
>> + refresh_route(r);
>> + r->refresh_time = 0;
>> + }
>> + if(r->expires && r->expires <= now)
>> + expire_route(r);
>> + }
>> + expire_sources(e);
>> + } FIB_WALK_END;
>> + WALK_LIST_FIRST(n, p->garbage) {
>> + e = SKIP_BACK(struct babel_entry, garbage_node, n);
>> + babel_flush_entry(e);
>> + }
>> +}
>
> Instead of using p->garbage and garbage_node, you could simply use FIB_ITERATE
> instead of FIB_WALK and remove babel_entry inside the cycle:
I tried and failed to get this to work; the whole thing would just
crash, which is why I came up with the garbage list approach. Probably
did it wrong, though (wasn't obvious to me how to use the FIB_* macros),
so will give it another shot; thanks for the pointer :)
>> +static u16
>> +babel_compute_rxcost(struct babel_neighbor *bn)
>> +{
>> + struct babel_iface *bif = bn->bif;
>> + struct babel_proto *p = bif->proto;
>> + u8 n, missed;
>> + u16 map=bn->hello_map;
>> +
>> + if(!map) return BABEL_INFINITY;
>> + n = __builtin_popcount(map); // number of bits set
>
> This should be defined as our inline function in lib/bitopts.h, then
> used here.
So just alias __builtin_popcount to something in bitopts.h? Not entirely
sure what compilers you are targeting; that macro seems to be
GCC-specific, but is also supported in LLVM since 1.5:
http://llvm.org/releases/1.5/docs/ReleaseNotes.html
>> +int
>> +babel_handle_ihu(struct babel_tlv_header *hdr, struct babel_parse_state *state)
>> +{
>> ...
>> + struct babel_neighbor *bn = babel_get_neighbor(bif, state->saddr);
>> + bn->txcost = tlv->rxcost;
>> + bn->ihu_expiry = now + 1.5*(tlv->interval/100);
>
> Please avoid floating point ops.
So, substitute 3/2 for 1.5?
>> +int
>> +babel_handle_update(struct babel_tlv_header *hdr, struct babel_parse_state *state)
>> +{
>> ...
>> + if(tlv->flags & BABEL_FLAG_ROUTER_ID) {
>> + u64 *buf = (u64*)&prefix;
>> + memcpy(&state->router_id, buf+1, sizeof(u64));
>
> I don't think is correct. BIRD stores ip6_addr as four u32 integers in host
> order, also router_id is u64 in host order, therefore such memcpy will swap
> meaning of lower and upper half of router ID on little endian machines.
>
> This is more clean and comprehensible:
>
> state->router_id = (((u64) _I2(prefix)) << 32) | _I3(prefix);
>
> I wonder what this flag is supposed to mean when AE is 1, RFC does not mention
> this.
Noted. You are right, that is not specified in the RFC. However, I think
setting this flag with an AE of 1 should be considered an error, and
will treat it as such.
>> + }
>> + if(!state->router_id)
>> + log(L_WARN "%s: Received update on %s with no preceding router id", p->p.name, bif->ifname);
>
> BTW, Babel RFC does not explictly specify that zero Router ID is invalid. But
> that is likely an ommision in the RFC.
Ah yes, that is true. Will try to get that clarified.
>> +static void
>> +babel_iface_linkdown(struct babel_iface *bif)
>> +{
>> + struct babel_neighbor *bn;
>> + struct babel_route *r;
>> + node *n;
>> + WALK_LIST(bn, bif->neigh_list) {
>> + WALK_LIST(n, bn->routes) {
>> + r = SKIP_BACK(struct babel_route, neigh_route, n);
>> + r->metric = BABEL_INFINITY;
>> + r->expires = now + r->expiry_interval;
>> + babel_select_route(r->e);
>> + }
>> + }
>> +}
>
> Is there any reason why route handing is different in this case and in
> kill_iface() case?
My assumption was that linkdown means that the interface is likely to
come back up, so I wanted to keep the routes and be able to refresh them
when it does (and keep the hello seqno etc so the other hosts on the
link don't refresh their metrics completely if the interface goes down
for a short time). This came about while playing around with a laptop
that was connected by wifi and ethernet to the same network (and running
babel on both), then repeatedly unplugging and re-plugging the ethernet
cable.
>> +static void
>> +babel_copy_config(struct proto_config *dest, struct proto_config *src)
>
> You could just remove this function.
Yeah, well, was planning to actually make it do something useful instead
(i.e. support proper reconfiguring).
>> +void
>> +babel_hton_seqno_request(struct babel_tlv_header *hdr)
>> +{
>> + struct babel_tlv_seqno_request *tlv = (struct babel_tlv_seqno_request *)hdr;
>> + tlv->seqno = htons(tlv->seqno);
>> + tlv->router_id = htobe64(tlv->router_id);
>> +}
>
> There are some issues with portability of names and including
> appropriate header for htobe64() / be64toh(). Perhaps this should be
> handled somewhere in lib/.
Not sure what you mean here? Are you referring to htobe64() being
glibc-specific? An how do you suggest to handle it? :)
>> +struct babel_tlv_header *
>> +babel_add_tlv_size(struct babel_iface *bif, u16 type, int len)
>> +{
>> + struct babel_header *hdr = bif->current_buf;
>> + struct babel_tlv_header *tlv;
>> + int pktlen = sizeof(struct babel_header)+hdr->length;
>> + if(pktlen+len > bif->max_pkt_len) {
>> + babel_send_queue(bif);
>> + pktlen = sizeof(struct babel_header)+hdr->length;
>> + }
>
> This seems broken - TLV is added to current_buf, which may be either
> tlv_buf or just sock->tbuf based on multicast vs unicast. But if there
> si not enough space, then babel_send_queue() is called, which assumes
> that data are in tlv_buf, and copies it to sock->tbuf.
Ah, was trying to be careful with this; obviously didn't work. Will give
it another go.
>> +int
>> +babel_open_socket(struct babel_iface *bif)
>> +{
>> + struct babel_proto *p = bif->proto;
>> + struct babel_config *cf = (struct babel_config *) p->p.cf;
>> + bif->sock = sk_new( bif->pool );
>> + bif->sock->type = SK_UDP;
>> + bif->sock->sport = cf->port;
>> + bif->sock->rx_hook = babel_rx;
>> + bif->sock->data = bif;
>
>> + bif->sock->rbsize = 10240;
>> + bif->sock->iface = bif->iface;
>> + bif->sock->tbuf = mb_alloc( bif->pool, bif->iface->mtu);
>
> Why not just use:
>
> bif->sock->rbsize = MIN(512, bif->iface->mtu);
You mean MAX?
> bif->sock->tbsize = bif->sock->rbsize;
>
> This ensures rbuf, tbuf allocation. Also note that buffer sizes should be
> updated in case of MTU is changed, handle IF_CHANGE_MTU in
> babel_if_notify().
Noted.
I'll get back to you with an updated patch, hopefully before too long :)
-Toke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 472 bytes
Desc: not available
URL: <http://trubka.network.cz/pipermail/bird-users/attachments/20151008/a6c03282/attachment.asc>
More information about the Bird-users
mailing list