100% CPU load with device scanning enabled

Saso Tavcar fast at ais42.net
Mon May 6 20:40:45 CEST 2019


Hi,

this is an OVS issue, already discussed:

https://mail.openvswitch.org/pipermail/ovs-discuss/2016-November/043007.html
...
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-November/043063.html

Official OVS quote:
> We'd accept patches to improve OVS's routing table code.  It's not
> designed to scale to 1,800,000 routes.  We'd also take code to suppress
> the routing table code in cases where it isn't actually needed, since
> it's not always needed.  But we can't take a patch to just delete it;
> I'm sure you understand.
I tried to apply this patch at the time, but it was already unusable with newer versions:

https://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20161123/5379b333/attachment.bin

Our workaround was to scale the VM to 3 vCPUs, since our average system load for BGP is 1.5.

You can see what is happening:

[root@bgp1 ~]# top
...
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                           
  654 root      10 -10 1284492   1.0g  20276 R  98.0  27.0   2513:01 ovs-vswitchd                                                                                                                      
   16 root      20   0       0      0      0 S   2.0   0.0  24:45.60 ksoftirqd/1  

[root@bgp1 ~]# ip route show
...
1.0.0.0/24 via 89.212.47.185 dev t2-v24-ha proto bird 
1.0.4.0/24 via 89.212.47.185 dev t2-v24-ha proto bird 
1.0.4.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
1.0.5.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
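
To get a feel for how many of the kernel routes come from BIRD, the `ip route show` output can be tallied by its `proto` field. This is only a rough sketch: the function name `count_routes_by_proto` and the sample text are made up for illustration, and it assumes the plain one-route-per-line output shown above.

```python
from collections import Counter


def count_routes_by_proto(ip_route_output: str) -> Counter:
    """Count routes per 'proto' value in `ip route show` text (hypothetical helper)."""
    counts = Counter()
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "..." elision marker used in the mail.
        if not fields or fields[0] == "...":
            continue
        proto = "unspecified"
        if "proto" in fields:
            idx = fields.index("proto")
            if idx + 1 < len(fields):
                proto = fields[idx + 1]
        counts[proto] += 1
    return counts


# Sample lines copied from the output above.
SAMPLE_ROUTES = """\
1.0.0.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
1.0.4.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
1.0.4.0/22 via 89.212.47.185 dev t2-v24-ha proto bird
1.0.5.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
"""

print(count_routes_by_proto(SAMPLE_ROUTES))
```

On a real box the text would come from saving `ip route show` to a file and reading it in.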


Routes are constantly being added and deleted:

[root@bgp1 ~]# ip monitor
...
Deleted 2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
Deleted 2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
Deleted 2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
Deleted 68.69.37.0/24 via 89.212.47.185 dev t2-v24-ha proto bird 
68.69.37.0/24 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 103.115.180.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
103.115.180.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 103.115.180.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
103.115.180.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 2.16.70.0/23 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 88.221.28.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 23.50.188.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 92.122.68.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 88.221.100.0/22 via 89.212.47.185 dev t2-v24-ha proto bird 
Deleted 92.123.208.0/22 via 89.212.47.185 dev t2-v24-ha proto bird
..... 
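
To quantify this churn, captured `ip monitor` output can be summarised per prefix. The snippet below is a minimal sketch, not a finished tool: `route_churn` is a hypothetical name, and it assumes that only route events with a gateway (lines containing `via`) are of interest and that deletions are prefixed with `Deleted`, as in the output above.

```python
from collections import Counter


def route_churn(monitor_output: str) -> dict:
    """Return {prefix: (times_added, times_deleted)} from `ip monitor` text."""
    added, deleted = Counter(), Counter()
    for line in monitor_output.splitlines():
        fields = line.split()
        # Assumption: only route events that carry a gateway ("via") matter here.
        if "via" not in fields:
            continue
        if fields[0] == "Deleted":
            deleted[fields[1]] += 1
        else:
            added[fields[0]] += 1
    return {p: (added[p], deleted[p]) for p in set(added) | set(deleted)}


# Sample lines copied from the output above.
SAMPLE_MONITOR = """\
Deleted 2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
Deleted 2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
2620:11d:6000::/42 via 2a01:260:1021::1 dev t2-v26-ha proto bird metric 1024 pref medium
Deleted 68.69.37.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
68.69.37.0/24 via 89.212.47.185 dev t2-v24-ha proto bird
Deleted 2.16.70.0/23 via 89.212.47.185 dev t2-v24-ha proto bird
"""

for prefix, (adds, dels) in sorted(route_churn(SAMPLE_MONITOR).items()):
    print(f"{prefix}: added {adds}, deleted {dels}")
```

A prefix whose add and delete counts keep growing together is being flapped rather than changed for good.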



Regards,
saso

> On 6 May 2019, at 19:30, Kees Meijs <kees at nefos.nl> wrote:
> 
> Hi list,
> 
> We're in the process of replacing Quagga with BIRD but have stumbled upon a
> little problem.
> 
> When device scanning is on (obviously default) our testing machine
> completely fills up a CPU core. The culprit isn't BIRD itself but an
> Open vSwitch daemon.
> 
> After disabling the device protocol and restarting BIRD, everything goes
> back to its quiet state.
> 
> BIRD (1.6.3-2) and Open vSwitch (2.6.2~pre+git20161223-3) both were
> installed as Debian stable packages.
> 
> The configuration is as simple as:
> 
>> # This is a minimal configuration file, which allows the bird daemon to start
>> # but will not cause anything else to happen.
>> #
>> # Please refer to the documentation in the bird-doc package or BIRD User's
>> # Guide on http://bird.network.cz/ for more information on configuring BIRD and
>> # adding routing protocols.
>> 
>> # Change this into your BIRD router ID. It's a world-wide unique identification
>> # of your router, usually one of router's IPv4 addresses.
>> router id 1.2.3.4;
>> 
>> # The Device protocol is not a real routing protocol. It doesn't generate any
>> # routes and it only serves as a module for getting information about network
>> # interfaces from the kernel.
>> protocol device {
>> }
>> 
>> # The Kernel protocol is not a real routing protocol. Instead of communicating
>> # with other routers in the network, it performs synchronization of BIRD's
>> # routing tables with the OS kernel.
>> protocol kernel {
>>     metric 64;    # Use explicit kernel route metric to avoid collisions
>>             # with non-BIRD routes in the kernel routing table
>>     import none;
>>     export all;    # Actually insert routes into the kernel routing table
>> }
>> 
>> protocol bgp test {
>>     description "BGP test";
>>     local as REDACTED;
>>     neighbor 1.2.3.4 as REDACTED;
>>     direct;
>>     next hop self;
>>     deterministic med on;
>>     export none;
>>     import all;
>> }
> 
> Meanwhile log messages such as below arise:
> 
>> bird: Kernel dropped some netlink messages, will resync on next scan.
> 
> For a test I deleted all existing Open vSwitch bridges and the load
> dropped again. After adding an empty new bridge, the load spikes again
> in an instant.
> 
> This is unexpected behaviour. Maybe it's an implementation problem in
> Open vSwitch or maybe in BIRD. Anyway, it shouldn't happen, I guess.
> 
> Any clues?
> 
> Thanks in advance!
> 
> Regards,
> Kees
> 
> 



