[PATCH] RFC: Add netlink KRT dump filters on Linux
Tomas Hlavacek
tmshlvck at gmail.com
Tue Jan 18 10:35:41 CET 2022
Hi Ondrej,
On Fri, Jan 14, 2022 at 11:17 PM Ondrej Zajicek <santiago at crfreenet.org> wrote:
>
> On Mon, Jan 10, 2022 at 11:47:57PM +0100, Tomas Hlavacek wrote:
> > Add a netlink KRT dump filter on Linux to avoid receiving PMTU cache
> > records from the FNHE table along with the KRT dump.
> >
> > The Linux kernel added the FNHE table dump to the netlink API in this patch:
> > https://patchwork.ozlabs.org/project/netdev/patch/8d3b68cd37fb5fddc470904cdd6793fcf480c6c1.1561131177.git.sbrivio@redhat.com/
> >
> > The filter mitigates the risk of receiving an unknown and potentially large
> > number of FNHE records that would block BIRD I/O in each sync. There is a
> > known issue caused by GRE tunnels on Linux, which seem to create one FNHE
> > record for each destination IP address that is routed through the tunnel,
> > even when the PMTU equals the GRE interface MTU (tested with kernels
> > 5.5 through 5.16-rc7).
>
> Thanks, merged with some modifications:
>
> https://gitlab.nic.cz/labs/bird/-/commit/e818f16448e918ed07633480291283f3449dd9e4
>
> Instead of switching NETLINK_GET_STRICT_CHK on and off, I just used strict
> checking for all dumps (including link and address).
Great! That is definitely a better way! Cool!
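For readers following along, here is a minimal sketch of what a strict-checked
route dump request looks like. It is not the code from the BIRD commit; the
names route_dump_req, build_route_dump and enable_strict_check are hypothetical.
It assumes, based on my reading of the kernel patch linked above, that with
strict checking enabled the kernel dumps FNHE exceptions only when RTM_F_CLONED
is set in rtm_flags, so leaving the flag clear asks for regular routes only.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Fallbacks for older headers; values as in the Linux UAPI headers. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/* Hypothetical request layout: netlink header followed by a full rtmsg,
 * which strict checking expects for RTM_GETROUTE dumps. */
struct route_dump_req {
  struct nlmsghdr nh;
  struct rtmsg rtm;
};

/* Fill in a dump request for one address family. */
void build_route_dump(struct route_dump_req *req, unsigned char family,
                      unsigned int seq)
{
  assert(req != NULL);
  memset(req, 0, sizeof(*req));
  req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
  req->nh.nlmsg_type = RTM_GETROUTE;
  req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
  req->nh.nlmsg_seq = seq;
  req->rtm.rtm_family = family;
  /* rtm_flags stays 0: RTM_F_CLONED is not set, so (by the assumption
   * stated above) cached exceptions are not requested. */
}

/* Turn on strict checking once per socket; returns 0 on success. */
int enable_strict_check(int fd)
{
  int one = 1;
  return setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                    &one, sizeof(one));
}
```

The intended use would be to open a socket with
socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), call enable_strict_check() on it
once, and then send the request built by build_route_dump() for each dump.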
>
> Also, I removed the SO_SNDBUF/SO_RCVBUF change. That seems unrelated and
> has some issues:
>
> 1) Why these values? 32k for SO_SNDBUF is smaller than the default value
> (208k), so it in fact makes the buffer smaller (which probably does not
> matter). Meanwhile, 1M for SO_RCVBUF is bigger than the max value, so it is
> capped at 416k.
I took the values from iproute2 and then tried to fine-tune the speed of
large FNHE dumps by tweaking these parameters. It is not relevant
anymore. So yes, it's OK to drop it.
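To see what the kernel actually does with a requested size, one can read the
option back after setting it. The sketch below uses a hypothetical helper,
set_and_read_bufsize. Linux doubles the requested value to account for
bookkeeping overhead and caps the request at net.core.wmem_max or
net.core.rmem_max, so on a system where those limits are 208k, a 1M SO_RCVBUF
request should read back as 416k and a 32k SO_SNDBUF request as 64k.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: request a buffer size and return the size the
 * kernel reports afterwards, or -1 if either call fails. */
int set_and_read_bufsize(int fd, int optname, int requested)
{
  int effective = 0;
  socklen_t len = sizeof(effective);

  assert(optname == SO_SNDBUF || optname == SO_RCVBUF);

  if (setsockopt(fd, SOL_SOCKET, optname, &requested, sizeof(requested)) < 0)
    return -1;
  if (getsockopt(fd, SOL_SOCKET, optname, &effective, &len) < 0)
    return -1;

  return effective;
}
```

Calling set_and_read_bufsize(fd, SO_RCVBUF, 1024 * 1024) on a netlink socket
and printing the result is a quick way to check the effective limit on a
given machine.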
>
> 2) It applies only to nl_scan and nl_req, and not to the async fd, where it
> would make the most sense.
>
> 3) We may want a big rx buffer for the async fd; in that case we may
> consider using SO_RCVBUFFORCE.
>
> I am not sure which netlink socket operations are really synchronous or
> flow-controlled, so that a big buffer is not needed for them.
I didn't realize that. Sure, you are right.
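If a bigger receive buffer for the async socket is ever wanted, a sketch of
the usual pattern is below. The helper name set_rx_buffer is hypothetical and
this is not something BIRD is confirmed to do. SO_RCVBUFFORCE lets a process
with CAP_NET_ADMIN exceed net.core.rmem_max; without that capability the call
fails and the code falls back to plain SO_RCVBUF, which is capped.

```c
#include <sys/socket.h>

/* Fallback for builds where the header does not expose the option;
 * 33 is the asm-generic value and is an assumption on other ABIs. */
#ifndef SO_RCVBUFFORCE
#define SO_RCVBUFFORCE 33
#endif

/* Hypothetical helper. Returns 1 if the forced (uncapped) size was set,
 * 0 if only the capped SO_RCVBUF request succeeded, -1 on failure. */
int set_rx_buffer(int fd, int bytes)
{
  /* Needs CAP_NET_ADMIN; ignores the rmem_max limit when permitted. */
  if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &bytes, sizeof(bytes)) == 0)
    return 1;

  /* Unprivileged fallback: the kernel silently caps the request. */
  if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == 0)
    return 0;

  return -1;
}
```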
Best regards,
Tomas