[PATCH] krt: Dump routing tables separately on linux to avoid congestion

Toke Høiland-Jørgensen toke at toke.dk
Tue Apr 19 16:06:57 CEST 2022


Ondrej Zajicek <santiago at crfreenet.org> writes:

> On Sat, Apr 16, 2022 at 07:28:59PM +0200, Toke Høiland-Jørgensen wrote:
>> Daniel Gröber <dxld at darkboxed.org> writes:
>> 
>> > When dumping the routing table bird currently doesn't set the rtm_table
>> > netlink field to select any particular one but rather wants to get all at
>> > once.
>> >
>> > This can be problematic when multiple routing daemons are running on a
>> > system as the kernel's route modification performance goes down
>> > drastically (by a factor of 20-200ish) when the table is being modified while
>> > it's being dumped.
>> >
>> > To avoid this situation we make bird do dumps on a per-kernel-table
>> > basis. This then allows the administrator to have multiple routing daemons
>> > use different kernel tables which sidesteps the problem.
>> >
>> > See also this discussion on the babel-users mailing list:
>> >   https://alioth-lists.debian.net/pipermail/babel-users/2022-April/003902.html
>> 
>> Note that this only works when strict netlink filter checking is
>> enabled (i.e., if the setsockopt for NETLINK_GET_STRICT_CHK succeeds).
>> Bird currently doesn't check this at runtime at all, so just applying
>> this patch as-is will not work correctly on older kernels (<4.20).
>
> Hi
>
> I like the idea of scanning tables independently, but I suspected there
> would be an issue on older kernels without NETLINK_GET_STRICT_CHK.
> I will check how hard it would be to switch CONFIG_ALL_TABLES_AT_ONCE
> at runtime.

The success/failure of the NETLINK_GET_STRICT_CHK setsockopt should be a
predictor of whether the filtering will work; this is from the commit
message that added the flag:

   For new userspace on old kernel, the setsockopt will fail and even if
   new userspace sets data in the headers and appended attributes the
   kernel will silently ignore it. Moving forward when the setsockopt
   succeeds, the new userspace on old kernel means the dump request can
   pass an attribute the kernel does not understand. The dump will then
   fail as the older kernel does not understand it.
   
   New userspace on new kernel setting the socket option gets the benefit
   of the improved data dump.
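
As a minimal sketch of what such a runtime check could look like (this
is not BIRD code; the helper name nl_enable_strict_chk is made up for
illustration, and the fallback #defines are only there for builds
against older headers):

```c
#include <sys/socket.h>
#include <linux/netlink.h>

/* Fallbacks for older userspace headers; values as in the kernel UAPI. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/*
 * Hypothetical helper: try to enable strict checking on a netlink socket.
 * Returns 1 if the setsockopt succeeded (the kernel should then honour
 * filters in dump requests), 0 if it failed (e.g. kernel older than 4.20,
 * or an invalid descriptor).
 */
int nl_enable_strict_chk(int fd)
{
  int one = 1;
  return setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                    &one, sizeof(one)) == 0;
}
```

The caller would store the result once after opening the socket and use
it to choose between per-table dumps and a single dump of all tables.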

If switching from compile-time to runtime turns out to be too annoying,
just switching unconditionally to the per-table dump would technically
work (as in, it would not drop any routes); it would just be inefficient,
as each "per-table" dump would return results from all tables...
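
To illustrate why no routes get lost in that case: as long as the
receiver compares each returned route's table ID against the table it
asked for, entries from other tables are simply skipped. A rough sketch,
assuming that is how the parser behaves (route_table_id and
route_in_table are hypothetical names, not BIRD functions):

```c
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: table ID of a route message. RTA_TABLE, if
 * present, carries the full 32-bit ID; otherwise fall back to the
 * 8-bit rtm_table field in the header.
 */
uint32_t route_table_id(const struct nlmsghdr *nlh)
{
  struct rtmsg *rtm = NLMSG_DATA(nlh);
  int len = RTM_PAYLOAD(nlh);
  uint32_t table = rtm->rtm_table;

  for (struct rtattr *rta = RTM_RTA(rtm); RTA_OK(rta, len);
       rta = RTA_NEXT(rta, len))
    if (rta->rta_type == RTA_TABLE &&
        RTA_PAYLOAD(rta) >= (int) sizeof(uint32_t))
      memcpy(&table, RTA_DATA(rta), sizeof(table));

  return table;
}

/* Keep a dumped route only if it belongs to the table we asked for. */
int route_in_table(const struct nlmsghdr *nlh, uint32_t wanted)
{
  return route_table_id(nlh) == wanted;
}
```

With kernel-side filtering this check never rejects anything; without
it, every per-table scan walks the full dump and discards the rest,
which is where the inefficiency comes from.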

-Toke


