netlink filtering to avoid costly FNHE table dumps on Linux
Ondrej Zajicek
santiago at crfreenet.org
Sat Jan 8 05:56:43 CET 2022
On Sat, Jan 08, 2022 at 12:03:52AM +0100, Tomas Hlavacek wrote:
> Hi!
>
> The large table that BIRD pulled from the kernel was a FNHE table
> where Linux collects PMTU records for *all* destination IPs that are
> routed to the tunnel (which does not seem to be right and I will
> discuss it in LKML shortly). These records have (default) 600s
> expiration time and in my scenario I happen to receive some
> backscatter traffic that in most cases gets ICMP or TCP reset
> responses that could ultimately create millions of these records in a
> few minutes.
>
> The reason why this problem occurred only in Linux ~5.2+ lies in the
> patch https://patchwork.ozlabs.org/project/netdev/patch/8d3b68cd37fb5fddc470904cdd6793fcf480c6c1.1561131177.git.sbrivio@redhat.com/
> that changed the semantics of netlink dump requests. Now the kernel
> dumps the FIB Next Hop Exceptions table (previously known as route
> cache) alongside the RT unless the requester sets the sockopt
> NETLINK_GET_STRICT_CHK and clears the flag RTM_F_CLONED in the dump
> request. BIRD does not apply the filters so the kernel dumps
> everything. But iproute2 and other programs that use netlink utilize
> the filters, so no similar performance issue occurs unless I
> explicitly dump the FNHE table (ip route show cache).
Hi
Thanks, that seems like a plausible explanation. Being spammed by PMTU
cache entries when requesting route table dumps is a creative
interpretation of a stable API commitment :-(
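For reference, a minimal C sketch of the dump request described in the
quoted text, assuming Linux 4.20+ (where NETLINK_GET_STRICT_CHK exists).
The names route_dump_req, build_route_dump_req and
open_strict_rtnl_socket are made up for illustration; this is not BIRD
code and not the experimental fix mentioned in this thread.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Fallbacks for older userspace headers (values from the kernel UAPI). */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/* Hypothetical request layout: netlink header followed by struct rtmsg. */
struct route_dump_req {
  struct nlmsghdr nh;
  struct rtmsg rtm;
};

/* Fill an RTM_GETROUTE dump request for one address family. */
static void
build_route_dump_req(struct route_dump_req *req, unsigned char family,
                     unsigned int seq)
{
  assert(req != NULL);
  memset(req, 0, sizeof(*req));
  req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
  req->nh.nlmsg_type = RTM_GETROUTE;
  req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
  req->nh.nlmsg_seq = seq;
  req->rtm.rtm_family = family;
  /* RTM_F_CLONED stays clear: ask for routes, not cached exceptions. */
  req->rtm.rtm_flags = 0;
}

/* Open an rtnetlink socket with strict dump checking enabled.
 * Returns -1 on failure, e.g. on kernels without the option. */
static int
open_strict_rtnl_socket(void)
{
  int one = 1;
  int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

  if (fd < 0)
    return -1;

  if (setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                 &one, sizeof(one)) < 0)
  {
    close(fd);
    return -1;
  }

  return fd;
}
```

A caller would open the socket, build the request and pass it to
send(fd, &req, req.nh.nlmsg_len, 0). On kernels that lack the option
the setsockopt() call fails, so a fallback to the unfiltered dump would
still be needed there.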
> I believe that many different types of Linux tunnels create the PMTU
> records for all packets transmitted over the tunnel as well. And it
> has worked like that for a long time - the code that creates the route
> cache records (now the FNHE table) was introduced
> in Linux 3.10 (https://elixir.bootlin.com/linux/v3.10/source/net/ipv4/ip_tunnel.c#L591).
If I understand it correctly, these PMTU records can also be a result of
regular TCP communication from/to the router even if there are no tunnels?
> Regardless of what may or may not happen on the kernel side I think
> that implementing the netlink filter in BIRD to avoid the described
> situation makes sense. I am almost certain that my experimental fix
> breaks other things (most likely OSPF) but I would be glad to help
> make it right.
How could OSPF be affected by filters on a netlink socket?
--
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."