bird under heavy cpu load

Alexander V. Chernikov melifaro at yandex-team.ru
Tue Mar 27 17:13:11 CEST 2012


On 26.03.2012 03:25, Ondrej Zajicek wrote:
> On Mon, Mar 12, 2012 at 11:22:10PM +0400, Oleg wrote:
>> On Mon, Mar 12, 2012 at 02:27:23PM +0400, Alexander V. Chernikov wrote:
>>> On 12.03.2012 13:25, Oleg wrote:
>>>> Hi, all.
>>>>
>>>> I have some experience with bird under heavy CPU load. I had a
>>>> situation where bird did frequent updates of the kernel table,
>>>> because of frequent BGP session down/up (due to CPU and/or net load).
>
> Hello
>
> Answering collectively for the whole thread:
>
> I did some preliminary testing, and on my test machine exporting a full
> BGP feed (cca 400k routes) to a kernel table took 1-2 sec on Linux and
> 5-6 sec on BSD, with a similar time for flushing the kernel table. Therefore,
> if we devote half a CPU to kernel sync, we have about 200 kr/s (kiloroutes
> per second) for Linux and 40 kr/s for BSD, which still seems more than
> enough for an edge router. Are there any estimates (using protocol statistics)
> for the number of updates to the kernel proto in this case? How many protocols,
> tables and pipes do you have in your case?
>
> The key to responsiveness (and the ability to send keepalives on time)
> during heavy CPU load is granularity. The main problem in BIRD is
> that the whole route propagation is done synchronously - when a route is
> received, it is propagated through all pipes and all routing tables to
> all final receivers in one step, which is problematic if you have
> several hundred BGP sessions (but probably not too problematic with
I've been playing with the BGP/core code in preparation for the peer-groups
implementation.

Setup: one peer sending a full view (peer 1) and one peer acting as a
full-view receiver. Both are disabled by default. We start bird and enable
peer 1; after the full view is received, we enable the second peer.

Some bgp bucket statistics:
max_feed:  256   iterations: 1551   buckets: 362184   routes: 397056   effect:  8%
max_feed:  512   iterations:  775   buckets: 351902   routes: 396800   effect: 11%
max_feed: 1024   iterations:  387   buckets: 335773   routes: 396288   effect: 15%
max_feed: 2048   iterations:  193   buckets: 300434   routes: 395264   effect: 23%
max_feed: 4096   iterations:   96   buckets: 255752   routes: 393216   effect: 34%
max_feed: 8192   iterations:   48   buckets: 216780   routes: 393216   effect: 44%

'Effect' means (routes - buckets) * 100 / routes, i.e. what percentage of
prefixes could be stored in already existing buckets; e.g. for the first
row, (397056 - 362184) * 100 / 397056 is roughly 8.8, shown truncated as 8%.

Maybe we should consider making the max_feed value auto-tuned,
e.g. 8 or 16k when the total number of protocols is small.
If we assume max_feed * proto_count to be constant (which keeps granularity
at the same level) and say that the default feed (256) corresponds to 256
protocols, we can automatically recalculate max_feed on every configure,
protocol enable/disable, or similar state change.

(and something should be done with the kernel protocol before doing this)
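
A minimal sketch of how such a recalculation could look, purely as an
illustration - the 256 * 256 budget and the 16k cap follow from the numbers
above; everything else (the function, its name, the clamping) is my
assumption and not existing BIRD code:

/* Hypothetical auto-tuning: keep max_feed * proto_count roughly
 * constant.  With the default feed of 256 for 256 protocols the budget
 * is 256 * 256 = 65536; clamp so that a tiny configuration gets at most
 * 16384 and a huge one never drops below the current default. */

#define FEED_BUDGET  (256u * 256u)
#define FEED_MIN     256u
#define FEED_MAX     16384u

static unsigned int
auto_max_feed(unsigned int proto_count)
{
  unsigned int feed;

  if (!proto_count)
    return FEED_MAX;

  feed = FEED_BUDGET / proto_count;

  if (feed < FEED_MIN)
    feed = FEED_MIN;
  if (feed > FEED_MAX)
    feed = FEED_MAX;

  return feed;
}

Called from configure and from protocol enable/disable, something like this
would give small configurations the 8-16k feeds mentioned above without
changing the behaviour of large ones.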

(I've also changed the bucket allocation style to slabs, but there is no
change noticeable via getrusage(), at least with max_feed=256. However,
maybe we should do it anyway, at least for consistency. I'll come back with
results later, after testing in a more complex setup.)


> regard to kernel sync). If this could be split up and done
> asynchronously in some smart way, it could solve several problems. One
> idea is that routes (or nets) in a table would be connected in a list in
> the order in which they arrived in the table, each protocol would have a
> position in that list for routes not yet exported to that protocol, and
> an event in the event queue would handle that export. Another advantage
> of this approach is that we could temporarily stop or rate-limit
> propagation of routes to that protocol.
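
To picture the quoted idea, here is a rough sketch of the data structures
and the per-event export step it implies - all names, fields and the batch
limit are mine, not existing BIRD structures:

/* Sketch of the arrival-ordered export journal described above.  The
 * table appends every accepted route to a list; each exporting protocol
 * keeps a cursor into that list plus a batch limit, and an event drains
 * a bounded number of entries per run before yielding. */

struct journal_entry {
  struct journal_entry *next;   /* next route in arrival order */
  /* net / rte reference would live here */
};

struct export_state {
  struct journal_entry *pos;    /* first entry not yet exported */
  unsigned int batch;           /* entries exported per event run */
};

/* One event invocation: export at most 'batch' entries, then tell the
 * caller whether the event needs to be rescheduled. */
static int
export_run(struct export_state *s)
{
  unsigned int n = s->batch;

  while (s->pos && n--) {
    /* the actual per-protocol export of s->pos would go here */
    s->pos = s->pos->next;
  }

  return s->pos != NULL;        /* 1 = more work, reschedule the event */
}

Temporarily stopping or rate-limiting a protocol then simply means not
rescheduling (or delaying) its export event.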
>
> But I noticed recently that there are some steps with even bigger
> latency than plain route propagation. For example, disabling the kernel
> protocol causes all routes in the kernel table to be flushed in one step,
> which (as mentioned above) will block BIRD for cca 5 s on BSD with 400k
> routes. Fortunately, disabling the kernel protocol is not a common operation.
>
> Another problem is protocol flushing - if a protocol goes down, all its
> routes are de-propagated in one step (function rt_prune()). This is
> probably the most important cause of latencies and could be easily
> fixed.
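
As a toy illustration of such a fix (the list layout, names and the budget
are my assumptions; the real rt_prune() works on BIRD's own table
structures):

/* Toy model: the table is a singly-linked list of routes, each tagged
 * with the protocol that announced it.  One call unlinks at most 'limit'
 * routes of the flushed protocol and reports whether work remains, so
 * the real thing could run as a repeatedly scheduled event instead of
 * one long rt_prune() pass. */

struct toy_rte {
  struct toy_rte *next;
  int proto_id;                 /* id of the protocol that owns the route */
};

static int                      /* 1 = more routes left to prune */
prune_step(struct toy_rte **head, int dead_proto, unsigned int limit)
{
  while (*head) {
    if ((*head)->proto_id == dead_proto) {
      if (!limit--)
        return 1;               /* budget used up, resume later */
      struct toy_rte *dead = *head;
      *head = dead->next;
      /* the real code would free the route here */
    } else {
      head = &(*head)->next;
    }
  }
  return 0;                     /* nothing left for this protocol */
}

A real implementation would also remember where it stopped instead of
rescanning the list from the head on every call.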
>
> Another possible problem is the kernel scan, which is also done
> in one step, but at least on Linux it could probably be split into
> smaller steps, and it does not take too much time if the kernel table is
> in the expected state.
...
The CLI interface can easily be another abuser:
bird> show route count
2723321 of 2723321 routes for 407158 networks

If I do 'show route' on such a table, it can block bird for perhaps 10 seconds.

>
> I would probably implement some latency measurement and do some more
> testing to get a better idea and probably fix the protocol flushing
> problem.
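
For the latency measurement part, a trivial probe is enough to find the
worst offenders - this is plain POSIX, not BIRD code; the function name,
the threshold and the callback shape are just assumptions for illustration:

#include <stdio.h>
#include <time.h>

/* Hypothetical latency probe: time a potentially long operation and
 * complain if it exceeded a threshold (here 100 ms). */
static void
timed_section(const char *what, void (*fn)(void))
{
  struct timespec t0, t1;
  long ms;

  clock_gettime(CLOCK_MONOTONIC, &t0);
  fn();
  clock_gettime(CLOCK_MONOTONIC, &t1);

  ms = (t1.tv_sec - t0.tv_sec) * 1000 + (t1.tv_nsec - t0.tv_nsec) / 1000000;
  if (ms > 100)
    fprintf(stderr, "%s took %ld ms\n", what, ms);
}

Wrapping the kernel flush, the protocol flush and the full table feeds with
something like this would show directly which steps eat into the keepalive
budget.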
>
> BTW, one possible hack to spare CPU time under heavy load with
> regard to kernel syncing is to disable synchronous kernel updates in
> krt_notify() and rely completely on syncing during the periodic scan.
>
> BTW, regardless of all of this, for BFD we would definitely need a
> separate process/thread, but with almost no shared data.
>



