Long io loop / missing netlink messages
Nico Schottelius
nico.schottelius at ungleich.ch
Sun Dec 22 16:52:20 CET 2019
Hello Maria,
Maria Matějka <maria.matejka at nic.cz> writes:
> Hello!
>
> How much time does it take to list the kernel table?
>
> time ip r > /dev/null
[16:47] router1.place6:~# time ip r >/dev/null
real 0m 6.06s
user 0m 4.90s
sys 0m 1.14s
[16:47] router1.place6:~# time ip r >/dev/null
real 0m 5.92s
user 0m 4.73s
sys 0m 1.17s
[16:47] router1.place6:~#
Question: is a 6 second lookup expected with full IPv4 table?
> How many routes do you have in bird table?
>
> show route count
[16:47] router1.place6:~# birdc show route count
BIRD 2.0.7 ready.
778706 of 778706 routes for 778704 networks in table master4
89166 of 89166 routes for 78105 networks in table master6
Total: 867872 of 867872 routes for 856809 networks in 2 tables
> And what export filter do you have for the kernel protocol in bird?
In short:
every route <= /24 (IPv4) and <= /48 (IPv6) of the global routing table.
In long:
define net_to_router_v4 = [ 0.0.0.0/0+ ];
define net_to_router_v6 = [ ::/0+ ];
protocol kernel kernel_v6 {
metric 64;
learn;
scan time 20;
ipv6 {
import filter ungleich_networks;
export filter to_ungleich_routers;
};
}
function is_for_router_kernel()
{
if ((net.type = NET_IP6 && net ~ net_to_router_v6) ||
(net.type = NET_IP4 && net ~ net_to_router_v4)) then {
return true;
}
if ((net.type = NET_IP6 && net.len <= 29) ||
(net.type = NET_IP4 && net.len <= 10)) then {
return true;
}
return false;
}
/* skipping functions that are very small routes */
filter to_ungleich_routers {
if(is_inside_ungleich()) then accept;
if(is_from_netstream()) then accept;
if(is_from_sunrise()) then accept;
if(is_default_route()) then accept;
if(is_for_router_kernel()) then accept;
reject;
}
Best,
Nico
> On December 21, 2019 1:07:25 PM GMT+01:00, Nico Schottelius <nico.schottelius at ungleich.ch> wrote:
>>
>>Good morning,
>>
>>on a fresh new router running the full routing table,
>>with Alpine, Linux 4.19.80-0-vanilla, bird-2.0.7
>>
>>I see a lot of these messages:
>>
>>Dec 21 12:37:15 router1 daemon.warn bird: Kernel dropped some netlink
>>messages, will resync on next scan.
>>Dec 21 12:37:34 router1 daemon.warn bird: I/O loop cycle took 5000 ms
>>for 1 events
>>Dec 21 12:38:14 router1 daemon.warn bird: Kernel dropped some netlink
>>messages, will resync on next scan.
>>Dec 21 12:38:35 router1 daemon.warn bird: I/O loop cycle took 5328 ms
>>for 1 events
>>Dec 21 12:38:54 router1 daemon.warn bird: I/O loop cycle took 5013 ms
>>for 1 events
>>Dec 21 12:39:07 router1 daemon.warn bird: Kernel dropped some netlink
>>messages, will resync on next scan.
>>Dec 21 12:39:14 router1 daemon.warn bird: I/O loop cycle took 5053 ms
>>for 1 events
>>Dec 21 12:39:34 router1 daemon.warn bird: Kernel dropped some netlink
>>messages, will resync on next scan.
>>Dec 21 12:40:14 router1 daemon.warn bird: I/O loop cycle took 5041 ms
>>for 1 events
>>
>>[12:49] router1.place6:~# ip -6 r | wc -l; ip r | wc -l
>>78212
>>779342
>>
>>With "debug latency;" I get the following additional messages:
>>
>>Dec 21 12:54:31 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 4449 ms
>>Dec 21 12:54:52 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 5608 ms
>>
>>The system is overall idle with bird spiking to 50-100% cpu usage every
>>couple of seconds. I first thougt they are only logged after stating
>>bird (where it might make sense), but the events continue to be logged
>>around every 30s:
>>
>>Dec 21 13:04:11 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 4596 ms
>>Dec 21 13:04:32 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 5096 ms
>>Dec 21 13:04:52 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 5102 ms
>>Dec 21 13:05:11 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 4676 ms
>>Dec 21 13:05:31 router1 daemon.warn bird: Event 0x000055a21afb8144
>>0x0000000000000000 took 4645 ms
>>
>>Which might loosely correlate to the scan time "20" that is setup for
>>device and kernel protocols.
>>
>>How do I best debug this issue?
>>
>>Best,
>>
>>Nico
>>
>>
>>
>>--
>>Modern, affordable, Swiss Virtual Machines. Visit
>>www.datacenterlight.ch
--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
More information about the Bird-users
mailing list