Upgrade to 1.6.3

Jan Matejka jan.matejka at nic.cz
Fri Dec 30 13:38:06 CET 2016


Hi!

> 2016-12-30 12:02:43 <WARN> Kernel dropped some netlink messages, will
> resync on next scan.
> 2016-12-30 12:02:43 <WARN> Kernel dropped some netlink messages, will
> resync on next scan.
> 2016-12-30 12:02:52 <WARN> Kernel dropped some netlink messages, will
> resync on next scan.
> 2016-12-30 12:03:03 <WARN> Kernel dropped some netlink messages, will
> resync on next scan.
> 2016-12-30 12:03:10 <WARN> Kernel dropped some netlink messages, will
> resync on next scan.
> 2016-12-30 12:03:16 <WARN> Kernel dropped some netlink messages, will
> resync on next scan
> 
> I can see the warning log line was added on Dec 20 2016 and ended up
> in the 1.6.3 release. Hence this was probably happening even before
> the upgrade, but it was not visible.

Yes, it was happening and it sometimes caused strange behavior. It
should be logged as it is not detectable in any other reasonable way.
However, you may decide that it is OK for your machine and ignore
(filter out) this log message if you want.
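
If you do decide to filter it out, a minimal sketch could look like the
following. It assumes the log is processed line by line by an external
script; the function name filter_log is made up for illustration and is
not part of Bird.

```python
# Substring taken from the warning quoted above.
DROPPED_MSG = "Kernel dropped some netlink messages"


def filter_log(lines):
    """Yield only the log lines that do not contain the netlink-drop warning."""
    for line in lines:
        if DROPPED_MSG not in line:
            yield line
```

For example, iterating over filter_log(sys.stdin) and writing each line
to sys.stdout turns this into a simple pipe filter.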

> So the question is simple. Does it mean there is something terribly
> wrong with my machine, or is the high rate of this warning expected?

There may be several reasons for this:

1) Something changes in the kernel really fast: routes, interfaces or
something like that. Some notifications get dropped, so the affected
routes appear in the kernel protocol not immediately, but only during
the next regular scan.

2) Something changes in the kernel really fast and fills the kernel
netlink buffer with messages that aren't meant for Bird (e.g. those for
conntrackd). Sadly, it seems that the kernel netlink buffer is shared by
all these messages. I still don't know whether this should be considered
a kernel bug, or whether it is even true. I haven't looked into it that
deeply.

3) Something is terribly wrong with your machine, the universe or
anything else. (Hopefully not anybody's case.)

All of these cases could cause strange, undebuggable behaviour in
several ways. By issuing this WARN message, we want to tell the admin
that something strange is happening, though it probably isn't terribly
wrong.
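
To judge how often this really happens on a given machine, one could
count the warnings per minute. This is only a sketch: it assumes each
log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp as in the excerpt
quoted above, and drops_per_minute is a made-up helper name.

```python
from collections import Counter

# Substring taken from the warning quoted above.
DROPPED_MSG = "Kernel dropped some netlink messages"


def drops_per_minute(lines):
    """Count netlink-drop warnings per minute, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        if DROPPED_MSG in line:
            # The first 16 characters of the assumed timestamp format
            # are the date plus hours and minutes.
            counts[line[:16]] += 1
    return counts
```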

There was a strange bug (a hang) in Bird triggered by ENOBUFS on the
netlink socket with no message ready to be read from that socket. I
never believed it could happen, but it definitely does.
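
The situation can be sketched roughly as follows. This is not Bird's
actual code (Bird is written in C), just an illustration of the idea:
when a receive on the netlink socket fails with ENOBUFS, treat it as
"messages were dropped, rescan later" and do not assume that a message
is waiting to be read. The helper read_netlink and its return
convention are assumptions made up for this example.

```python
import errno


def read_netlink(sock, bufsize=65536):
    """Try one receive; return (data, need_resync).

    `sock` is any object with a recv(bufsize) method, for example a
    netlink socket on Linux. Hypothetical helper, not Bird's API.
    """
    try:
        return sock.recv(bufsize), False
    except OSError as e:
        if e.errno == errno.ENOBUFS:
            # The kernel dropped some messages. There may be nothing to
            # read right now, so return no data and request a resync.
            return b"", True
        raise
```

The caller would then schedule a full scan whenever need_resync is true,
instead of waiting for data that may never arrive.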

MQ

