Will bird block on syslog() call?

Ondrej Zajicek santiago at crfreenet.org
Fri Feb 24 16:12:27 CET 2017


On Fri, Feb 24, 2017 at 01:13:55PM +0100, Pavlos Parissis wrote:
> Hi,
> 
> We have observed some instability in the BFD protocol, where the upstream router
> and/or the server (Linux RedHat 7.3) declares the BFD session dead and, as a
> consequence, the upstream router stops forwarding traffic to the server (we
> utilize ECMP).
> 
> Our current hypothesis is that Bird logs messages (only BGP KEEPALIVE messages
> when there isn't any route change) via the glibc syslog() function, which
> connects to a UNIX socket (/dev/log), and the sender (the Bird daemon) may block
> when the receiver (rsyslogd) doesn't respond fast enough or the buffer is full.
> 
> On RedHat 7 servers there is a chain of daemons, which receive log messages via
> UNIX socket.
> 
> systemd-journald.service listens on the /dev/log UNIX socket and forwards messages
> to the /run/systemd/journal/syslog UNIX socket, on which rsyslogd listens.
> 
> As far as I can see in the code and in the output of ps -eLl, the Bird daemon is
> a single-threaded process (please correct me if I am wrong), so it could be that
> a call to syslog() blocks for X seconds, where X is higher than the failure
> detection time.

Hi

BIRD is single-threaded with the exception of BFD, which runs in a
separate thread. The interaction between the BFD thread and the rest of
BIRD is designed so that the BFD thread should not wait on the main
thread, so a main thread blocked in syslog() should not cause problems
for the BFD thread. There are some exceptions, such as when the BFD
thread itself wants to log (there is a shared mutex around the logging
subsystem), but that is usually not a problem, as BFD does not log
anything during regular operation (unless packet logging is enabled).
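
To illustrate the exception with the shared logging mutex, here is a
minimal Python sketch. It is not BIRD code: log_lock, slow_main_log,
periodic_sender and worst_gap are hypothetical names, and the sleep
merely stands in for a syslog() call that blocks. It shows that a
periodic thread keeps its pace while another thread holds the logging
lock, unless the periodic thread tries to log as well.

```python
import threading
import time

# Stand-in for the mutex around the logging subsystem (hypothetical name).
log_lock = threading.Lock()


def slow_main_log(started, hold_s):
    """Simulate a main thread stuck in a blocking log call for hold_s seconds."""
    with log_lock:
        started.set()        # signal that the lock is now held
        time.sleep(hold_s)   # stands in for a blocked syslog()


def periodic_sender(ticks, interval_s, do_log, gaps):
    """Simulate a periodic thread and record the gap between its ticks."""
    last = time.monotonic()
    for _ in range(ticks):
        time.sleep(interval_s)
        if do_log:
            with log_lock:   # waits here while the other thread holds the lock
                pass
        now = time.monotonic()
        gaps.append(now - last)
        last = now


def worst_gap(do_log, hold_s=0.5, interval_s=0.02, ticks=5):
    """Return the largest gap between ticks while the main thread is blocked."""
    gaps = []
    started = threading.Event()
    main = threading.Thread(target=slow_main_log, args=(started, hold_s))
    sender = threading.Thread(
        target=periodic_sender, args=(ticks, interval_s, do_log, gaps)
    )
    main.start()
    started.wait()           # make sure the lock is held before the sender runs
    sender.start()
    main.join()
    sender.join()
    return max(gaps)
```

Calling worst_gap(False) should give a gap close to the tick interval,
while worst_gap(True) should give a gap close to the time the lock was
held, which is the case where packet logging would matter.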

I would suggest decreasing the min rx/tx interval to 100 ms (to see if
that helps). You could also try the 'watchdog warning' / 'debug latency'
options (with appropriate values, like 500 ms) to track latency in the
main thread and see whether the BFD problems are related to possible
latency problems in the main thread.
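
The general idea behind such latency tracking can be sketched in Python
as follows. This is only an illustration of the technique, not how BIRD
implements these options: run_with_latency_check and its parameters are
hypothetical names. Each handler is timed, and a record is kept whenever
it runs longer than the chosen limit.

```python
import time


def run_with_latency_check(handler, limit_s, warnings, name=None):
    """Run handler() and record a warning if it took longer than limit_s.

    warnings is a plain list that receives (name, elapsed_seconds) tuples;
    a real daemon would write a log message instead.
    """
    start = time.monotonic()
    result = handler()
    elapsed = time.monotonic() - start
    if elapsed > limit_s:
        warnings.append((name or getattr(handler, "__name__", "handler"), elapsed))
    return result


def slow_handler():
    """Hypothetical handler that stalls, e.g. on a blocking log call."""
    time.sleep(0.05)
    return "done"


def fast_handler():
    """Hypothetical handler that returns immediately."""
    return "done"
```

With a limit of 0.01 s, running slow_handler through
run_with_latency_check adds one entry to the warnings list, while
fast_handler with a limit of 0.5 s adds none.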

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."