multithread support
Maria Matejka
maria.matejka at nic.cz
Tue Mar 2 18:54:54 CET 2021
Hello!
On 3/2/21 4:34 PM, Douglas Fischer wrote:
> This is very good news!
>
> I know you said "This is a ball park guess", but I confess that I was a
> little scared by the proportion of extra CPU usage (30/48 -> +60%).
This depends a lot on what kind of load we are talking about. Generally,
if you are a big route server, then 98% of CPU time is probably eaten by
complex filters. I would estimate that this may end up anywhere between
+10% and -10% due to other structural changes. The parallelization
overhead would be minimal.
However, if you are a big route reflector, then you're constantly just
recomputing the best route, accessing the same table. Then we may get to
the +60% estimate. Long story short, the more work you do with one
route, the less overhead you get.
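For reference, the +60% figure follows directly from the ball-park numbers quoted further down in this thread (30 minutes of single-core time versus 3 minutes on 16 cores). A small sketch of that arithmetic; the variable names are only illustrative:

```python
# Ball-park figures quoted from the earlier message in this thread.
single_core_minutes = 30   # single-threaded convergence time
cores = 16
wall_clock_minutes = 3     # estimated multithreaded convergence time

# Total CPU time is summed over all cores.
cpu_minutes = cores * wall_clock_minutes
overhead = cpu_minutes / single_core_minutes - 1
speedup = single_core_minutes / wall_clock_minutes

print(f"total CPU time:     {cpu_minutes} min")
print(f"CPU overhead:       {overhead:+.0%}")
print(f"wall-clock speedup: {speedup:.0f}x")
```

So the same guess that costs 60% more CPU time in total would also converge about ten times faster on the wall clock.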
Remember that BIRD is currently extremely well optimized for
single-threaded execution and some parts still heavily depend on being
executed that way. We chose first to allow parallel execution of those
parts that can be parallelized well, at the cost of adding some overhead
to other parts.
The most critical part of this is route export (from tables to
protocols), which is currently done synchronously after route import. We
decided to decouple it in the multithreaded code, which involves having
a route export queue. Hence more memory stores and loads, more cache
misses, etc.
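To illustrate the decoupling described above, here is a minimal sketch of an import path that only enqueues the export, with a separate thread draining the queue. This is not BIRD code: `table`, `import_route`, `exporter` and the sentinel are hypothetical names, and a real implementation would look quite different.

```python
import queue
import threading

table = {}                    # hypothetical routing table: prefix -> next hop
table_lock = threading.Lock()
export_queue = queue.Queue()  # decouples route import from route export
exported = []                 # what the exporting side has received
DONE = object()               # sentinel telling the exporter to stop

def import_route(prefix, next_hop):
    """Import side: update the table, then only enqueue the export."""
    with table_lock:
        table[prefix] = next_hop
    export_queue.put((prefix, next_hop))  # extra store; importer does not wait

def exporter():
    """Export side: drain the queue in its own thread."""
    while True:
        item = export_queue.get()         # extra load, possibly on another core
        if item is DONE:
            break
        exported.append(item)

worker = threading.Thread(target=exporter)
worker.start()
for i in range(3):
    import_route(f"10.0.{i}.0/24", "192.0.2.1")
export_queue.put(DONE)
worker.join()
print(exported)
```

In the synchronous variant the importer would hand the route to the exporting side directly; here every route passes through the queue once more, which is where the additional memory traffic comes from.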
Well … maybe the +60% is too much, reconsidering that guess. Let's hope
it's overestimated. I'd be more concerned about memory usage. There
are some estimates of peak memory usage in worst cases which can be
even +100% (for a short time). If we run into these problems in the real
world, we'd definitely have to implement algorithms to limit these peaks,
as swapping to disk is not desirable here at all. Anyway, this is not
today's problem; we first need to get to code which at least builds and
runs without spitting out one core file after another.
> I also know that you said that the code is still "currently not
> releasable", but I'm curious to know a little more about how this
> multi-threading was handled.
Basically, one thread per receiving socket, one thread per exporting
channel, with some exceptions. One lock per protocol instance, one lock
per table. You can lock only one table and one protocol instance at
a time; the protocol goes first.
We'll publish more documentation; it's still WIP. For now, I'm just
answering a question to say "yes, we're going multithreaded and we're
actively working on it".
> Just to illustrate:
>
> Single-Core CPU on BGP is known to be a problem for many engines and
> vendors.
>
> One of the vendors developed a "creative" way to distribute this load
> across multiple cores.
> As I understood it, they implemented a kind of CPU affinity per BGP peer,
> so that each peer has its own BGP process, and that process is
> "semi-tied" to a core.
> And they created a mechanism to redistribute these affinities from
> time-to-time based on the amount of BGP messages per second exchanged on
> each peer.
If this turns out to be a problem, we'll consider this. For now, it just
seems that the most critical part is the route itself which is being
propagated through BIRD -- which should stay in one thread as long as
possible, and each thread should keep its CPU (on a well-behaved system)
unless moved for a good reason.
Maria
>
> On Tue, Mar 2, 2021 at 10:13, Maria Matejka <maria.matejka at nic.cz
> <mailto:maria.matejka at nic.cz>> wrote:
>
> Hi!
>
> On 3/1/21 1:26 PM, Marcelo Balbinot wrote:
> >
> > Hi, I already asked this question at some point,
> > but I am curious about the evolution ..
> > About multi thread support (multi-core cpu use).
> > Is this still a possibility?
>
> Yes, it is. Be prepared that this will also raise memory usage (current
> estimates are +10% memory or more) and overall CPU usage (compared to
> single-thread execution) due to the needed synchronization and buffers.
>
> This means that if you now consume 20G of memory and 30 minutes of
> single-core time to converge the main table on a rather big node, you're
> going to consume, let's say, >22G of memory and 3 minutes of 16-core CPU
> (summing to 48 minutes of CPU time). This is a ballpark guess; do not
> take me too seriously. It may be better, it may be worse.
>
> Anyway, there is some code (currently not releasable) that will get to a
> preview release soon. We'll highly appreciate testing from any user
> around. Stay tuned!
>
> Maria
>
>
>
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação