bird under heavy cpu load

Alexander V. Chernikov melifaro at yandex-team.ru
Tue Mar 27 19:29:10 CEST 2012


On 27.03.2012 20:44, Ondrej Zajicek wrote:
> On Tue, Mar 27, 2012 at 07:13:11PM +0400, Alexander V. Chernikov wrote:
>> On 26.03.2012 03:25, Ondrej Zajicek wrote:
>>> On Mon, Mar 12, 2012 at 11:22:10PM +0400, Oleg wrote:
>>>> On Mon, Mar 12, 2012 at 02:27:23PM +0400, Alexander V. Chernikov wrote:
>>>>> On 12.03.2012 13:25, Oleg wrote:
>>>>>> Hi, all.
>>>>>>
>>>>>> I have some experience with bird under heavy cpu load. I had a
>>>>>> situation when bird do frequent updates of kernel table, because
>>>>>> of bgp-session frequent down/up (because of cpu and/or net load).
>>>
>>> Hello
>>>
>>> Answering collectively for the whole thread:
>>>
>>> I did some preliminary testing, and on my test machine exporting a full
>>> BGP feed (ca. 400k routes) to a kernel table took 1-2 sec on Linux and
>>> 5-6 sec on BSD, with a similar time for flushing the kernel table.
>>> Therefore, if we devote half a CPU to kernel sync, we have about
>>> 200 kr/s (kiloroutes per second) on Linux and 40 kr/s on BSD, which
>>> still seems more than enough for an edge router. Are there any
>>> estimates (using protocol statistics) for the number of updates to the
>>> kernel proto in this case? How many protocols, tables and pipes do you
>>> have in your case?
>>>
>>> The key to responsiveness (and the ability to send keepalives on time)
>>> during heavy CPU load is granularity. The main problem in BIRD is
>>> that the whole route propagation is done synchronously - when a route
>>> is received, it is propagated through all pipes and all routing tables
>>> to all final receivers in one step, which is problematic if you have
>>> several hundred BGP sessions (but probably not too problematic with
>> I've been playing with the BGP/core code in preparation for the
>> peer-groups implementation.
>>
>> Setup: 1 peer sending a full view (peer 1), 1 peer as a full-view
>> receiver (peer 2); both are disabled by default. We start bird and
>> enable peer 1. After the full view is received, we enable the second
>> peer.
>>
>> Some bgp bucket statistics:
>> max_feed: 256 iterations: 1551 buckets: 362184 routes: 397056 effect: 8%
>> max_feed: 512 iterations: 775 buckets: 351902 routes: 396800 effect: 11%
>> max_feed: 1024 iterations: 387 buckets: 335773 routes: 396288 effect: 15%
>> max_feed: 2048 iterations: 193 buckets: 300434 routes: 395264 effect: 23%
>> max_feed: 4096 iterations: 96 buckets: 255752 routes: 393216 effect: 34%
>> max_feed: 8192 iterations: 48 buckets: 216780 routes: 393216 effect: 44%
>>
>> 'Effect' means (routes - buckets) * 100 / routes, i.e. what share of
>> prefixes is stored in already existing buckets.
>>
>> Maybe we can consider making the max_feed value auto-tuned?
>> E.g. to be 8 or 16k for a small total number of protocols.
>> If we assume max_feed * proto_count to be constant (which keeps
>> granularity at the same level), and say that we use the default feed
>> (256) for 256 protocols, we can automatically recalculate max_feed on
>> every { configure, protocol enable/disable state change, whatever }.
>
> Is there any point in trying to achieve efficient route packing into
> buckets? Most rx processing is done per route, so buckets just save
> some TCP data transfers.
It decreases the number of packets being sent to a large number of
readers (actually, peer-group members). Yes, it is not the piece where
the main CPU-intensive work is done, but it can still save, say, 1-5%
at no cost on our side. Why not be polite if we can?
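
(To be explicit about what 'effect' means for the statistics above: with
max_feed 2048, effect = (395264 - 300434) * 100 / 395264 = 23, i.e.
roughly a quarter of the prefixes were packed into already existing
buckets instead of creating new ones.)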

Our typical firewall; see the number of messages received:
Neighbor  V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
XXX-BIRD  4 XXXXX 11394287   93753        0    0    0 03:53:52   403577
X-QUAGGA  4 XXXXX 11185307   92512        0    0    0 01w5d18h   403583
X-QUAGGA  4 XXXXX 26569910   93805        0    0    0 05w6d06h   403411
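
To make the auto-tuning idea quoted above a bit more concrete, here is a
rough sketch of the recalculation I have in mind (all names, constants
and hooks below are made up for illustration, this is not existing BIRD
code):

/* Keep max_feed * proto_count roughly constant: the default feed (256)
 * for 256 protocols gives a fixed "budget" of 256 * 256 routes handled
 * per scheduling round.  Everything here is hypothetical. */

#define FEED_BUDGET (256u * 256u)
#define FEED_MIN    256u
#define FEED_MAX    16384u

static unsigned int
recalc_max_feed(unsigned int proto_count)
{
  unsigned int feed;

  if (!proto_count)
    return FEED_MAX;

  feed = FEED_BUDGET / proto_count;

  if (feed < FEED_MIN)
    feed = FEED_MIN;
  if (feed > FEED_MAX)
    feed = FEED_MAX;

  return feed;
}

It would be called from configure and from the protocol enable/disable
path, with the result clamped so that a handful of protocols gets a
large feed (up to 16k) while overall granularity stays roughly the same.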


>
> BTW, these results depend on many things, like how big the kernel's TCP
> buffers are and how fast the other side is able to acknowledge received
> data. I guess that at first BIRD flushes buckets from BGP to TCP as
> fast as they are generated, with minimal packing (depending mostly on
> granularity or max_feed); later the TCP buffers become full, sending
> updates is postponed and the BGP bucket cache starts to fill (you can
> see that in 'show memory').
>
> If you want to get efficient packing, probably the most elegant
> solution would be to add some delay (like 2 s) before
> activating/scheduling the sending of BGP update packets. Or some
> smarter approach: if the BGP bucket cache contains at least x buckets,
> schedule updates immediately, otherwise schedule them after 2 s.
Thanks for the idea. I should have thought about that approach.
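
Something along these lines, I suppose (purely illustrative sketch; the
threshold, structure and callback names are made up, not actual BIRD
internals):

/* Sketch only: flush BGP buckets to TCP immediately once enough of
 * them pile up, otherwise delay the flush by ~2 s so that more routes
 * can be packed into already existing buckets. */

#define BUCKET_FLUSH_THRESHOLD  64   /* the "at least x buckets" above */
#define BUCKET_FLUSH_DELAY      2    /* seconds */

struct bucket_cache {                /* hypothetical */
  unsigned int cached_buckets;       /* buckets waiting to be sent */
  int flush_scheduled;               /* delayed flush already pending? */
};

static void
schedule_updates(struct bucket_cache *c,
                 void (*kick_tx_now)(void),
                 void (*kick_tx_after)(unsigned int seconds))
{
  if (c->cached_buckets >= BUCKET_FLUSH_THRESHOLD)
    kick_tx_now();
  else if (!c->flush_scheduled)
  {
    c->flush_scheduled = 1;
    kick_tx_after(BUCKET_FLUSH_DELAY);
  }
}

The interesting part is only the decision itself; in real code the
"kick now" / "kick later" hooks would map onto whatever TX scheduling
and timer machinery already exists.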
>
>>> Another possible problem is the kernel scan, which is also done
>>> in one step, but at least on Linux it could probably be split into
>>> smaller steps, and it does not take too much time if the kernel table
>>> is in the expected state.
>> ...
>> The CLI interface can easily be another abuser:
>> bird>  show route count
>> 2723321 of 2723321 routes for 407158 networks
>>
>> If I do 'show route' for such a table, it can block bird for 10(?) seconds.
>
> Really? 'show route' processing is split per 64 routes, so I suppose
> that only the CLI session is blocked.
Not exactly.

The 'show route' result shows up immediately, but when I press 'q' to
quit the results, I see the 'bird' process eating CPU for ~5 seconds.
>



