BIRD drops specific IPv6 session for no reason
Stavros Konstantaras
stavros.konstantaras at ams-ix.net
Tue Mar 3 09:16:16 CET 2020
Hi Alexander,
In general we try to keep the RS as light as possible, which means we do not run unnecessary applications or packet captures on it.
We have not observed any busy-HDD issues, but that is a valid point nevertheless.
As per Ondrej’s feedback, this looks like a kernel issue (either a bug or a scalability limit), so I will schedule a maintenance window to update the kernel
on the servers and see whether that fixes the problem.
Best regards,
Stavros Konstantaras | Sr. Network Engineer | AMS-IX
M +31 (0) 620 89 51 04 | T +31 20 305 8999
ams-ix.net
> On 29 Feb 2020, at 01:08, Alexander Zubkov <green at qrator.net> wrote:
>
> Hi,
>
> Can it be some IO issue? We had similar problems with bird's IO loop
> taking too long, so that hold timers had expired by the time it
> finished. It was probably caused by writing a log file on a busy HDD.
> We hit the same problem with syslog too, because that write blocks
> bird as well.
> Nevertheless, the OS should still have been replying with something in
> the TCP session in your case: acknowledging the segments or signalling
> that the window is full. As far as I know bird does not have its own TCP
> stack, so the OS is to blame for that part. It may be stuck for some
> reason (or bug) or, as other people suggested, it may be sending packets
> somewhere else or not know where to send them.
>
> On Fri, Feb 28, 2020 at 4:46 PM Ondrej Zajicek <santiago at crfreenet.org> wrote:
>>
>> On Fri, Feb 28, 2020 at 03:33:06PM +0100, Stavros Konstantaras wrote:
>>> Hi Alarig,
>>>
>>> Thank you for sharing your experiences. I don’t have the MSS currently, but if that were the cause, wouldn’t we have experienced the drops more frequently?
>>> Currently it happens roughly once per month (about 0.8 times per month), and contrary to your case, which was 100% network related, in our case we don’t even see the
>>> reply packet being generated and leaving the box.
>>>
>>> What also puzzles me, based on the capture, is that I don’t see the TCP ACK messages being sent to the customer. If BIRD opens a TCP socket
>>> (not a simple RAW socket), I assume that the TCP connection is handled by the OS and BIRD pushes data segments (BGP keepalive messages) when ready.
>>>
>>> But as per the output, I don’t see the TCP ACK messages at all. Is BIRD handling the TCP communication as well?
>>
>> Hi
>>
>> That is a good point. BIRD uses a regular TCP socket, so if you do not see
>> TCP ACKs, then it is likely an underlying (kernel) issue. There were some
>> reports of IPv6 issues in recent kernels [*].
>>
>> Also, the log message:
>>
>> Feb 20 21:46:11 rs1-mng bird6: 2001:7F8:1::A500:19:7727:1: Received: Hold timer expired
>>
>> shows that the notification message was received by BIRD. The packet
>> dump shows that keepalives were not sent by the BIRD side. You could enable
>> 'debug all' for the given peer to see if BIRD tries to send keepalives. You
>> could also monitor the state of the socket using the 'ss' tool.
>>
>> [*] https://bird.network.cz/pipermail/bird-users/2020-February/014270.html
>>
>> --
>> Elen sila lumenn' omentielvo
>>
>> Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
>> OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
>> "To err is human -- to blame it on a computer is even more so."
>>
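
A minimal sketch of the socket monitoring with 'ss' that Ondrej suggests above, in case it is useful to anyone chasing a similar problem. It is not anything from AMS-IX or BIRD itself. The assumptions are: BGP runs on the standard TCP port 179, 'ss' from iproute2 is installed, and only the basic State/Recv-Q/Send-Q/Local/Peer columns are needed. The names SocketState, parse_ss, queued and bgp_sockets_ipv6 are made up for this example. For an established socket, Recv-Q is data the application has not yet read and Send-Q is data the peer has not yet acknowledged, so a queue that stays non-zero across several polls may hint at which side is stuck.

```python
import subprocess
from typing import List, NamedTuple


class SocketState(NamedTuple):
    state: str
    recv_q: int  # bytes received by the kernel, not yet read by the application
    send_q: int  # bytes sent, not yet acknowledged by the peer
    local: str
    peer: str


def parse_ss(output: str) -> List[SocketState]:
    """Parse the State/Recv-Q/Send-Q/Local/Peer columns of `ss -t -n` output."""
    sockets = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header and blank lines: real rows have numeric queue sizes.
        if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        sockets.append(
            SocketState(fields[0], int(fields[1]), int(fields[2]), fields[3], fields[4])
        )
    return sockets


def queued(sockets: List[SocketState]) -> List[SocketState]:
    """Return sockets that currently have data sitting in either queue."""
    return [s for s in sockets if s.recv_q > 0 or s.send_q > 0]


def bgp_sockets_ipv6() -> List[SocketState]:
    """Run ss for IPv6 TCP sockets on port 179 (assumed BGP port) and parse them."""
    cmd = ["ss", "-t", "-n", "-6", "( sport = :179 or dport = :179 )"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_ss(result.stdout)
```

Calling queued(bgp_sockets_ipv6()) every few seconds and logging the result with a timestamp would give something to correlate with the next "Hold timer expired" entry. A single non-zero reading is normal; only a queue that does not drain is interesting.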