[PATCH master] IPv6 ECMP support fixes for linux

Ondrej Zajicek santiago at crfreenet.org
Tue Feb 16 18:48:40 CET 2016


On Wed, Feb 10, 2016 at 04:38:03PM +0100, Mikhail Sennikovskii wrote:
> Hello Fellows,
> 
> as a follow-up for the recent discussion about BIRD support of IPv6 ECMP on Linux
> ( http://trubka.network.cz/pipermail/bird-users/2016-January/010163.html ),
> I've created a patch, which addresses this problem.

Hi

Thanks for the patch, i have several comments:

1) In the patch, the incoming direction (collection from kernel) is
handled in unix/krt.c, while outgoing direction (krt_replace_rte()) is
handled in linux/netlink.c . It would make more sense to handle both
directions in the same place. As BIRD uses internally representation
with one route and mpnh structure, i would prefer to move collection
also to linux/netlink.c, so generic code in unix/krt.c would not
care about OS-specific interface (and does not need any changes).
Linux code would just collect next hops and then call krt_got_route()
for the whole multipath route.

2) Function krt_replace_rte() should be smarter than just flushing the
whole old multipath rte. It should compare old and new next hops (we
could suppose they are sorted) and either add missing or remove
surplussing next hops by calling nl_send_route().

3) When collecting received routes, you should avoid creating list
of routes for the same net, and then merging such list by rt_merge_list(),
you could simply collect next hops (struct mpnh *) for the current route.

4) Also rt_merge_list() uses mpnh_merge_rta() which uses rte_update_pool.
That is probably not safe outside of nest update code. The linux netlink
code should have its own linpool (static and shared, because netlink socket
is also shared), so you could allocate mpnh structures from it. The linpool
should be flushed when the route is collected and propagated to krt_got_route().
So you don't need to do rta_lookup() for temporary next hops.

Note that such linpool could be also used for IPv4 multipath next hops in
nl_parse_multipath() instead of nh_buffer.


> Note that the patch still does not support the learn mode for IPv6 ECMP.
> The support for it can be added later.

Well, the learn mode is not really a problem. With the modifications
mentioned above, learning would work automatically (it is handled by
unix/krt.c, regardless of how route is generated).

What is problematic is handling asynchronous notifications (because you
don't know inside netlink code what are other next hops and whether you
will get another one related to the same net).


-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."


More information about the Bird-users mailing list