bird under heavy cpu load

Mon Mar 12 20:22:10 CET 2012

On Mon, Mar 12, 2012 at 02:27:23PM +0400, Alexander V. Chernikov wrote:
> On 12.03.2012 13:25, Oleg wrote:
> >Hi, all.
> >
> >I have some experience with bird under heavy cpu load. I had a
> >situation where bird was frequently updating the kernel table because
> >the bgp session kept going down and up (because of cpu and/or net load).
> >I waited about 5-10 minutes for birdc to start when I wanted to
> >reconfigure bird with the "configure soft" command from birdc (ssh and
> >other interactive programs worked well). After I changed bird.conf
> >so that I get 0.0.0.0/0 from upstream instead of a full view,
> >birdc worked fine and bird's cpu load dropped (before that, bird used
> >about 30-89% cpu).
> >What can I do in such a situation in the future to keep birdc working normally?
> Adding/removing kernel routes is a complex task; it results in the main
> process waiting for the system call to complete.
> 
> It is very costly, and one (developer) possibility is to add
> an additional thread which does the kernel interaction.

  I think that a separate thread can solve this nuisance.

> We work around this by installing IGP routes only (which does not
> help for a border router)

  Unfortunately, this is a border router.

> You can try to reduce MAX_RX_STEPS to 1 in sysdep/unix/io.c; this
> will help a bit with new route installation (2 minutes instead of 5),

  Thanks! I'll try it.

> but at the moment the answer seems to be "don't export all routes to
> the kernel". You can filter all that is >=/20, for example, to provide
> reasonable load balancing with far fewer routes in the OS kernel.

  In a simple configuration with one upstream this works fine, but what about
more complex configurations? Can I lose some networks? Or can I get worse ping
because, for example, a /24 prefix is better than the /20 that contains it?
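
  To make the question concrete, here is a minimal sketch of longest-prefix
matching over a full table and over a filtered one. It is only an illustration:
the prefixes, the next-hop names and the helpers lookup() and export_filter()
are made up for this example and are not part of bird. It assumes that
"filter all that is >=/20" means keeping only routes with a prefix length of
at most 20.

```python
import ipaddress


def lookup(table, dst):
    """Longest-prefix match: return (prefix, next_hop) or None if nothing covers dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)


def export_filter(table, max_len=20):
    """Hypothetical kernel export filter: keep routes with prefix length <= max_len."""
    return {net: hop for net, hop in table.items() if net.prefixlen <= max_len}


# Made-up "full view" with two upstreams (documentation address ranges).
full = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream-A",
    ipaddress.ip_network("198.51.96.0/20"): "upstream-A",
    ipaddress.ip_network("198.51.100.0/24"): "upstream-B",
    ipaddress.ip_network("203.0.113.0/24"): "upstream-B",
}
filtered = export_filter(full)

# With the full table the /24 is preferred over the /20 that contains it.
print(lookup(full, "198.51.100.7"))
# After filtering, the same destination follows the covering /20 instead.
print(lookup(filtered, "198.51.100.7"))
# A /24 with no covering route falls back to the default route.
print(lookup(filtered, "203.0.113.9"))
# Without a default route in the kernel, that destination is not reachable.
no_default = {net: hop for net, hop in filtered.items() if net.prefixlen > 0}
print(lookup(no_default, "203.0.113.9"))
```

  In this toy table nothing is lost as long as a default or covering route
stays in the kernel, but traffic for the filtered /24 leaves through a
different upstream than the more specific route would have chosen.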

> >And additional two questions:
> >Does bird immediately push route info into the kernel table when it receives
> >route updates from a bgp peer, or does it wait for the next kernel table scan?
> The kernel protocol (just like any other protocol) receives route updates
> immediately after they are announced to the routing table by the BGP
> protocol instance. In the same callback the kernel update (for the
> single route) is prepared and called.

  So, as I understand it, simply increasing the scan time value doesn't help
in such a situation.
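
  A toy model of the behaviour described above, to show why the scan interval
does not change the number of kernel updates. This is not bird's code: the
classes RoutingTable and KernelProtocol, the method on_route_update() and the
route counts are all assumptions made for the illustration. Each announced or
withdrawn route triggers one synchronous kernel update, and the periodic scan
runs on its own timer.

```python
class KernelProtocol:
    """Toy stand-in for a kernel-sync protocol (hypothetical, not bird's code)."""

    def __init__(self):
        self.kernel_table = {}
        self.kernel_updates = 0  # counts simulated route add/delete system calls
        self.scans = 0

    def on_route_update(self, prefix, next_hop):
        # Called synchronously for every single route change.
        self.kernel_updates += 1
        if next_hop is None:
            self.kernel_table.pop(prefix, None)  # withdrawal
        else:
            self.kernel_table[prefix] = next_hop  # add or replace

    def scan(self):
        # Periodic scan driven by a timer; independent of on_route_update().
        self.scans += 1


class RoutingTable:
    """Toy routing table that notifies subscribed protocols immediately."""

    def __init__(self):
        self.subscribers = []

    def announce(self, prefix, next_hop):
        for proto in self.subscribers:
            proto.on_route_update(prefix, next_hop)


table = RoutingTable()
kernel = KernelProtocol()
table.subscribers.append(kernel)

prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]

# Initial session establishment: every route is pushed right away.
for p in prefixes:
    table.announce(p, "upstream")

# Three session flaps: all routes withdrawn, then all announced again.
for _ in range(3):
    for p in prefixes:
        table.announce(p, None)
    for p in prefixes:
        table.announce(p, "upstream")

print(kernel.kernel_updates, kernel.scans, len(kernel.kernel_table))
```

  In this model the flapping session alone produces every kernel update, and
no scan has to run for that to happen, so a longer scan time would leave the
load unchanged.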