Mitigations/tunables for reducing netlink loss?
Maria Matejka
maria.matejka at nic.cz
Tue Sep 21 09:53:06 CEST 2021
Hello!
> Sep 20 11:50:48 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
[...]
> Sep 20 11:51:19 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
This is somewhat inevitable, as the netlink manpage states:
    However, reliable transmissions from kernel to user are
    impossible in any case. The kernel can't send a netlink message
    if the socket buffer is full: the message will be dropped and the
    kernel and the user-space process will no longer have the same
    view of kernel state. It is up to the application to detect when
    this happens (via the ENOBUFS error returned by recvmsg(2)) and
    resynchronize.
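To make the detection step concrete, here is a minimal Python sketch of what the manpage describes. It is not BIRD's code; read_netlink and FakeOverrunSocket are hypothetical names, and the fake socket only exists so that the example runs without a real netlink socket.

```python
import errno


def read_netlink(sock, bufsize=65536):
    """Return (data, overrun); overrun=True means the kernel dropped messages."""
    try:
        return sock.recv(bufsize), False
    except OSError as exc:
        if exc.errno == errno.ENOBUFS:
            # The receive buffer overflowed, so our view of kernel state is
            # stale. The caller should schedule a full table rescan.
            return b"", True
        raise  # any other error is not an overrun; let the caller see it


class FakeOverrunSocket:
    """Hypothetical stand-in for a netlink socket whose buffer overflowed."""

    def recv(self, bufsize):
        raise OSError(errno.ENOBUFS, "No buffer space available")


data, overrun = read_netlink(FakeOverrunSocket())
print(overrun)
```

The overrun flag carries no information about which messages were lost, which is why the only safe reaction is a full resync.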
This unreliability is also a good reason to have periodic table scans,
just to be sure that the kernel is in sync with BIRD.
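As a rough illustration of what such a scan achieves, the following Python sketch compares the routes a daemon wants installed with a full dump of the kernel table. It is an assumption-laden simplification, not BIRD's scan logic: reconcile is a hypothetical helper, routes are plain prefix-to-next-hop mappings, and every kernel route is assumed to belong to the daemon.

```python
def reconcile(desired, kernel):
    """Return (to_install, to_remove) that bring the kernel table in sync.

    Both arguments map a prefix string to a next-hop string.
    """
    # Install anything that is missing or has a different next hop.
    to_install = {p: nh for p, nh in desired.items() if kernel.get(p) != nh}
    # Remove anything the kernel has that the daemon no longer wants.
    to_remove = sorted(p for p in kernel if p not in desired)
    return to_install, to_remove


desired = {"10.0.0.0/24": "192.0.2.1", "10.0.1.0/24": "192.0.2.2"}
kernel = {"10.0.0.0/24": "192.0.2.9", "10.0.2.0/24": "192.0.2.3"}
to_install, to_remove = reconcile(desired, kernel)
print(to_install, to_remove)
```

Because the comparison uses the full dump, it corrects the state no matter which individual notifications were dropped in between.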
> I'm seeing netlink drops when upstream internet churn is say more than
> 200 updates/sec or so, not huge, but quite frequent and can continue
> for minutes/hours.
Yes, this is a well-known situation. We can't do much about it in
single-threaded BIRD: the ENOBUFS error signals that the kernel had no
more room in the socket buffer to store route updates for us. (See the
more thorough explanation below.)
> Some items I've investigated so far:
>
> Increasing net.core.rmem_max and net.core.wmem_max sysctls doesn't
> seem to help much, strace of bird doesn't indicate any EAGAIN or
> blocking when writing to the netlink sockets.
In this older thread, somebody suggests increasing
net.core.rmem_default before starting BIRD:
https://bird.network.cz/pipermail/bird-users/2017-September/011541.html
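For background on why the default matters: according to socket(7), a new socket starts with the size from net.core.rmem_default, while net.core.rmem_max only caps what a process may request via SO_RCVBUF. The Python sketch below shows how to inspect the effective size. It uses a UDP socket as a stand-in so that it runs without netlink, receive_buffer_size is a hypothetical helper, and it assumes nothing about whether BIRD requests a larger buffer itself.

```python
import socket


def receive_buffer_size(requested=None):
    """Return the effective SO_RCVBUF of a fresh socket, in bytes.

    If `requested` is given, ask for that size first; the kernel may
    clamp the request (on Linux, to net.core.rmem_max).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


print("default:", receive_buffer_size())
print("after requesting 1 MiB:", receive_buffer_size(1 << 20))
```

On Linux the value reported after a request is typically double what was asked for, because the kernel reserves space for bookkeeping overhead.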
> strace shows some room for optimization in the prot kernel (these
> would obviously be code changes). For example, when a route changes
> next-hop/interface, 2 netlink messages are sent, delete followed by
> add, instead of a single change/replace (this would complicate bird,
> but reduce netlink message in half for updates).
This would be feasible in a world of one single kernel with no bugs, yet
quite a few kernel bugs have needed workarounds, and we have no useful
mechanism to detect whether a given kernel version suffers from a
particular bug. (There are still people running new BIRD on old
kernels.)
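Just to put numbers on the proposal, here is a small Python sketch counting the netlink requests for each strategy. The message types and flag values are the ones from linux/rtnetlink.h and linux/netlink.h, and the flag combinations mirror what "ip route add" and "ip route replace" use; they are not necessarily what BIRD sends. The helper names are hypothetical, and the sketch says nothing about the kernel bugs mentioned above.

```python
RTM_NEWROUTE = 24      # linux/rtnetlink.h
RTM_DELROUTE = 25
NLM_F_REQUEST = 0x001  # linux/netlink.h
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400


def messages_for_change(use_replace):
    """Return the (type, flags) headers needed to change one route's next hop."""
    if use_replace:
        # One request: create the route or overwrite the existing one.
        return [(RTM_NEWROUTE, NLM_F_REQUEST | NLM_F_CREATE | NLM_F_REPLACE)]
    # Two requests: remove the old route, then add the new one.
    return [
        (RTM_DELROUTE, NLM_F_REQUEST),
        (RTM_NEWROUTE, NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL),
    ]


def total_messages(changes, use_replace):
    """Number of requests sent for `changes` next-hop changes."""
    return changes * len(messages_for_change(use_replace))


print(total_messages(200, use_replace=False), total_messages(200, use_replace=True))
```

At the reported 200 updates/sec, that is 400 requests per second with delete-then-add versus 200 with replace, and each request also produces a notification on the receive side.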
> There is plenty of cpu cycles available, bird is <1%, etc...
>
> Any pointers on tuning or config changes that may help here are
> appreciated.
Well, to be honest, I think this may be fixed by having a separate
netlink thread (which is a work-almost-in-progress), yet without that,
it is almost impossible to avoid. The reason is how it works now:
1) BGP receives a packet (quite a big one or several of them)
2) BGP parses the input data and for each single route:
   2A) import filter is run
   2B) best route in table is recalculated
   2C) all exports are run; in case of kernel, the netlink message is sent
   2D) kernel generates a netlink message in response, confirming the
       route update
   (repeat this for all the data)
3) BGP is done and another socket is read. For simplicity, let's assume
   it is the netlink receive socket.
4) Netlink parses the incoming messages, getting ENOBUFS and realizing
   that there are some more updates that didn't fit into the receive
   buffer, issuing that warning.
5) After a while, netlink scan is issued, successfully checking that all
   routes are there.
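The steps above can be turned into a toy model. The Python sketch below is only an illustration under made-up assumptions: the receive buffer is counted in message slots rather than bytes, every export yields exactly one confirmation, and simulate with its parameters is a hypothetical helper, not anything from BIRD.

```python
def simulate(routes, buffer_slots, drain_every=None):
    """Count kernel confirmations lost to a full receive buffer.

    routes       -- number of routes exported in one BGP batch
    buffer_slots -- how many confirmations the receive buffer can hold
    drain_every  -- read the socket after this many exports
                    (None = read only after the whole batch, as in step 3)
    """
    queued = 0
    dropped = 0
    for sent in range(1, routes + 1):
        # Steps 2C/2D: one export triggers one confirmation from the kernel.
        if queued < buffer_slots:
            queued += 1
        else:
            dropped += 1  # buffer full: the kernel drops the message
        if drain_every is not None and sent % drain_every == 0:
            queued = 0  # something reads the socket and empties the buffer
    return dropped


print(simulate(1000, 256))
print(simulate(1000, 256, drain_every=100))
```

With no reads during the batch, everything beyond the buffer capacity is lost; as soon as the socket is read often enough while exports are running, nothing is dropped, regardless of how large the batch is.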
The actual reason for BIRD showing these warnings even for tables where
only BIRD writes is simply that it is impossible to read the netlink
socket while exporting routes from another protocol.
This will be fixed in future BIRD versions supporting multithreaded
execution where the netlink thread should have enough time to read the
netlink socket and the exports for netlink (and all other protocols)
will properly queue and wait to be processed until the protocol decides
to actually export.
Maria