Exporting multipath link routes from Linux kernel
santiago at crfreenet.org
Mon Jan 21 21:20:28 CET 2019
On Mon, Jan 21, 2019 at 06:38:21PM +0100, Eugene Crosser wrote:
> Hello all,
> we do virtual hosting, and we provide routeable /32 addresses to the
> guests. Kernel routes on the KVM host are link routes that look like this:
> 126.96.36.199 dev pub020304050612 proto static
> They are picked up by bird and exported to the core router. When we
> launch multiple guests with the same address on multiple hosts, this
> results in an ECMP route on the core router, providing load balancing.
> This all works fine until we launch more than one guest with the same
> address on _one_ host. We create a kernel multipath route that looks like
> 188.8.131.52 proto static metric 10
> nexthop dev pub020304050612 weight 1
> nexthop dev pub020304050616 weight 1
> Note that there is no "via" address in the hop configurations! This
> actually works, i.e. connections originating from the host are balanced
> between those guests. But bird refuses to pick up such a route because of
> the code here:
> 1. What was the justification for disallowing gateway-less multipath
> routes? Would it make sense to allow them (in the mainstream code)?
The code differentiated between gateway and gateway-less routes based on
rta->dest (RTD_ROUTER for gateway, RTD_DEVICE for gateway-less). We
extended that to have RTD_MULTIPATH, but there was no separate dest for
each nexthop, so we restricted it to have all nexthops with gateways.
Also, ECMP routes generated by protocols (e.g. OSPF) are always with
nexthops, so it was generally not a big limitation.
In BIRD 2.0, we unified this and replaced RTD_ROUTER / RTD_DEVICE /
RTD_MULTIPATH with RTD_UNICAST, which can handle ECMP routes with mixed
gateway and gateway-less nexthops.
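
To illustrate the difference, here is a toy Python model (not BIRD code).
Only the RTD_* names come from the description above. The Nexthop class
and the legacy_dest() / unified_dest() helpers are hypothetical and only
mimic the idea of one dest type per route versus an optional gateway per
nexthop.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass
class Nexthop:
    """Hypothetical nexthop record: the gateway is optional per hop."""
    iface: str
    gw: Optional[IPv4Address] = None
    weight: int = 1


def legacy_dest(nexthops: List[Nexthop]) -> Optional[str]:
    """One dest type for the whole route, as in the older scheme.

    Returns None when the route cannot be represented.
    """
    if not nexthops:
        return None
    if len(nexthops) == 1:
        return "RTD_ROUTER" if nexthops[0].gw is not None else "RTD_DEVICE"
    # Multipath was only accepted when every nexthop had a gateway.
    if all(nh.gw is not None for nh in nexthops):
        return "RTD_MULTIPATH"
    return None


def unified_dest(nexthops: List[Nexthop]) -> Optional[str]:
    """Unified scheme: any non-empty nexthop list is a unicast route."""
    return "RTD_UNICAST" if nexthops else None


# The route from the original question: two device-only nexthops.
route = [Nexthop("pub020304050612"), Nexthop("pub020304050616")]
print(legacy_dest(route), unified_dest(route))
```

In this model the two-guest route has no representation under the
per-route scheme, while the unified one accepts it because the gateway
is a property of each nexthop.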
> 2. Would it be sufficient to simply drop the check for the presence of
> the gateway address in the message, and return `first` even if gateway
> address was not present?
Not sure what you mean by `first`. You cannot read RTA_GATEWAY field if
there is none and you cannot call neigh_find2() for 0.0.0.0 address. You
could set rv->gw to IPA_NONE, that would perhaps work in most cases, but
it is untested.
Or just switch to BIRD 2.0.
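
On the RTA_GATEWAY point, a rough standalone Python sketch (again not
BIRD code, and IPv4 only) of walking an RTA_MULTIPATH payload where the
gateway attribute may be missing. The rtnexthop / rtattr layouts and
RTA_GATEWAY = 5 are taken from linux/rtnetlink.h. parse_multipath(),
build_hop() and align4() are hypothetical helper names, and returning
None for a missing gateway is only an analogue of the IPA_NONE idea.

```python
import socket
import struct

RTA_GATEWAY = 5                      # attribute type from linux/rtnetlink.h
RTNH_HDR = struct.Struct("=HBBi")    # rtnh_len, rtnh_flags, rtnh_hops, rtnh_ifindex
RTA_HDR = struct.Struct("=HH")       # rta_len, rta_type


def align4(n):
    """Round up to the 4-byte alignment used for rtnexthop and rtattr."""
    return (n + 3) & ~3


def parse_multipath(payload):
    """Return a list of (ifindex, weight, gateway-or-None) tuples."""
    hops = []
    off = 0
    while off + RTNH_HDR.size <= len(payload):
        rtnh_len, _flags, rtnh_hops, ifindex = RTNH_HDR.unpack_from(payload, off)
        if rtnh_len < RTNH_HDR.size or off + rtnh_len > len(payload):
            raise ValueError("truncated rtnexthop")
        gw = None                    # stays None when RTA_GATEWAY is absent
        a = off + RTNH_HDR.size
        end = off + rtnh_len
        while a + RTA_HDR.size <= end:
            rta_len, rta_type = RTA_HDR.unpack_from(payload, a)
            if rta_len < RTA_HDR.size or a + rta_len > end:
                raise ValueError("truncated rtattr")
            if rta_type == RTA_GATEWAY and rta_len == RTA_HDR.size + 4:
                gw = socket.inet_ntoa(payload[a + RTA_HDR.size:a + rta_len])
            a += align4(rta_len)
        # The kernel stores weight - 1 in rtnh_hops.
        hops.append((ifindex, rtnh_hops + 1, gw))
        off += align4(rtnh_len)
    return hops


def build_hop(ifindex, weight, gw=None):
    """Build one rtnexthop record for testing; the gateway is optional."""
    attrs = b""
    if gw is not None:
        attrs = RTA_HDR.pack(RTA_HDR.size + 4, RTA_GATEWAY) + socket.inet_aton(gw)
    return RTNH_HDR.pack(RTNH_HDR.size + len(attrs), 0, weight - 1, ifindex) + attrs


# One device-only hop and one hop with a gateway (documentation address).
payload = build_hop(7, 1) + build_hop(8, 1, "192.0.2.1")
print(parse_multipath(payload))
```

The point is only that the parser never touches a gateway field it has
not found. Whether the rest of the code copes with a nexthop without a
gateway is a separate question.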
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."