Removing device-bound routes when interface is going down

Mon Aug 18 19:54:54 CEST 2014

In a message of 18 August 2014 19:32:10, Alexander Demenshin wrote:
> Hi (again),
> 
> I've encountered a problem which is (most likely) caused by a bug
> in the Linux kernel, but significantly affects route handling in bird
> (everything else, like quagga, is most likely also affected).
> 
> When a route is added manually/externally to the kernel and bound to a
> specific interface, like this:
> 
> $ ip route add 10.1.1.0/24 dev eth0
> 
> ...it is correctly identified by bird (via an async netlink message), but
> when the interface goes down, this route is not deleted, as there is no
> corresponding netlink message (tested on the most recent kernel, 3.16.1,
> and also on the older 3.11; this happens only on an if-down event, manual
> deletion works fine).

This is a long-standing "feature" of the Linux kernel. The problem has come
up on the netdev mailing list at least twice.

One of those threads is here:

http://www.spinics.net/lists/netdev/msg254133.html

And here is the response from DM on this situation, explaining why it
should not be solved within the kernel:

http://www.spinics.net/lists/netdev/msg254186.html

In summary: if you have ~500k routes in BIRD and the interface used by the
nexthop of those routes changes its operstate, the kernel would have to
send a notification for each of them (~500k).

That causes a near-DoS level of netlink traffic from the kernel to
userspace. In most cases more than half of these messages are not
delivered to, or handled by, userspace.

> 
> Since bird does not see the route removal, the route is still kept in the
> routing table and thus is still announced via OSPF (in my particular case),
> even though the kernel knows nothing about it.
> 
> Obviously, it is possible to set up periodic scanning of the kernel table
> with a small interval, but even then the removal will not be instant, and
> it also costs extra CPU: even if the table in question is small by itself,
> if there is another huge table, the scan will take significant time.
> In my case it takes 0.3s to dump the main table (~10 entries), all due to
> the huge BGP table (~500k entries), so scanning every 10 seconds will
> consume 3% CPU (mostly for nothing, as this kind of change is relatively
> rare).

You are right, this really does happen, and currently there is nothing
that can be done about it.
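
As a rough check of the 3% figure quoted above, here is a small Python
sketch. It assumes each scan is fully CPU-bound for its whole duration
(0.3s, taken from the quoted message) and runs at a fixed interval; the
function name is only for illustration and is not part of BIRD.

```python
def scan_cpu_fraction(scan_seconds: float, interval_seconds: float) -> float:
    """Fraction of one CPU spent scanning, assuming each scan is CPU-bound."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return scan_seconds / interval_seconds


if __name__ == "__main__":
    # 0.3s per scan is the figure from the quoted message.
    for interval in (10.0, 1.0):
        share = scan_cpu_fraction(0.3, interval)
        print(f"scan every {interval:g}s -> {share:.1%} of one CPU")
```

Under the same assumption, shortening the interval to 1 second to get
faster removal would cost about 30% of one CPU, which is why a short scan
interval is not an attractive fix here.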

> 
> Fixing this problem in the mainline kernel (if this is indeed unintentional
> behavior) will clearly take ages, but it could easily be solved in bird
> itself: if an interface is going down, all routes pointing to this device
> must be removed (not only direct and static routes, where it already works
> perfectly).
> 
> I feel that this workaround is quite easy to implement, but I am not sure
> where exactly to look, so I'd be really grateful for pointers in the right
> direction :)
> 
> Thank you!
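
To make the proposed workaround concrete, here is a toy Python model of the
idea: when a link-down event arrives for a device, drop every route bound to
that device, whatever its source. This is only a sketch of the logic, not
BIRD code; Route, RouteTable and on_link_down are hypothetical names, and
the prefixes are example values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    prefix: str  # destination, e.g. "10.1.1.0/24"
    dev: str     # outgoing interface the route is bound to
    source: str  # where the route was learned, e.g. "kernel" or "static"


class RouteTable:
    """Toy routing table keyed by prefix (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._routes: Dict[str, Route] = {}

    def add(self, route: Route) -> None:
        self._routes[route.prefix] = route

    def on_link_down(self, dev: str) -> List[Route]:
        """Remove and return all routes bound to dev, regardless of source."""
        removed = [r for r in self._routes.values() if r.dev == dev]
        for r in removed:
            del self._routes[r.prefix]
        return removed

    def prefixes(self) -> List[str]:
        return sorted(self._routes)


if __name__ == "__main__":
    table = RouteTable()
    table.add(Route("10.1.1.0/24", "eth0", "kernel"))   # added via "ip route add"
    table.add(Route("192.0.2.0/24", "eth0", "static"))
    table.add(Route("10.2.0.0/16", "eth1", "kernel"))

    gone = table.on_link_down("eth0")
    print("removed:", sorted(r.prefix for r in gone))
    print("kept:", table.prefixes())
```

The point of the model is that the purge is driven by the link event alone,
so it does not depend on the kernel sending a per-route netlink message.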

-- 
SP5474-RIPE
Sergey Popovich