[RFC] filter profiling

Lennert Buytenhek buytenh at wantstofly.org
Tue Aug 29 11:05:26 CEST 2017


On Tue, Aug 29, 2017 at 10:07:03AM +0200, Vincent Bernat wrote:

> >> > Feedback appreciated!  (Better ideas also appreciated. :D)
> >> 
> >> Using USDT probes? You can attach arbitrary strings to them. I know perf
> >> supports them (with a recent kernel) but I don't know how
> >> exactly. However, with systemtap, it's dead easy to see them:
> >> 
> >> stap -e 'probe bird.* { print($$vars) }'
> >> 
> >> For implementation, see:
> >>  https://github.com/vincentbernat/lldpd/commit/bdfe419389075af3cdfeadc78008a157afc2d5d7
> >>  https://github.com/vincentbernat/lldpd/commit/140e34f057462d5ff7818ab4af368f055eaad4e3
> >
> > As far as I can see, these are tracepoints, but they wouldn't let me do
> > profiling?  What I need is profiling, as I want to know what's consuming
> > the most CPU, so I want to be able to fire an event N times per second
> > to tell me what bird is doing right at that specific moment, and listing
> > or counting tracepoint invocations won't necessarily tell me what's using
> > up the most CPU.
> 
> I didn't look at your patch, but from my understanding, when you see a
> CPU spike, you send USR2 and it starts "tracing". Use of tracepoints
> would do the same thing without having to send a signal.

The patch outputs a single location (config file + line number) sample
every time you send a SIGUSR2, so you have to send SIGUSR2 signals in
a loop to collect enough samples (and then aggregate those samples
somehow).

Sample output from running 'while true; do killall -USR2 bird; done'
while bird is receiving 1M /32 routes from a remote test peer looks like
this (each line corresponds to one SIGUSR2 signal):

	# ./bird -c test.conf -s test.sock -f 
	test.conf:55 200.6.170.145/32
	test.conf:30 200.7.98.189/32
	test.conf:37 200.14.187.67/32
	test.conf:47 200.7.183.137/32
	test.conf:28 200.15.56.225/32
	test.conf:44 200.5.53.104/32
	test.conf:33 200.1.38.168/32
	test.conf:25 200.12.74.254/32
	test.conf:26 200.1.111.250/32
	test.conf:42 200.9.63.39/32
	test.conf:37 200.15.180.68/32
	test.conf:44 200.11.188.128/32
	test.conf:36 200.9.193.157/32
	test.conf:32 200.13.155.79/32
	test.conf:32 200.6.171.136/32
	test.conf:38 200.3.169.76/32
	test.conf:31 200.6.189.140/32
	test.conf:36 200.14.178.7/32
	test.conf:49 200.13.55.165/32
	test.conf:33 200.6.83.248/32
	test.conf:51 200.4.150.202/32
	test.conf:39 200.3.215.221/32
	test.conf:26 200.11.220.20/32
	test.conf:31 200.10.166.139/32
	test.conf:32 200.4.250.210/32
	test.conf:54 200.6.3.46/32
	test.conf:30 200.9.48.108/32
	test.conf:38 200.11.118.191/32
	test.conf:39 200.10.199.82/32
	test.conf:47 200.0.168.132/32
	[...]


> You could also have a tracepoint for "begin", a tracepoint for "end",
> you record the current filter by intercepting "begin", remove it on
> "end" and you can profile on CPU to know which is the current
> filter if BIRD is currently running on a CPU.

Right, my patch does something like that, but it performs the "begin"
and "end" actions unconditionally, because each of them is just two
global variable assignments, whose cost is likely negligible, so there's
no real need to gate them behind tracepoints -- and you'd do the heavy
lifting from a timer anyway.
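
To illustrate, here is a minimal sketch of that begin/end bookkeeping
(names like prof_cur_file are hypothetical, not what the actual patch
uses):

	/* Hypothetical sketch of the "begin"/"end" bookkeeping; the real
	 * patch may differ in detail.  The filter interpreter records
	 * where it currently is, and the SIGUSR2 handler prints that
	 * location as one sample. */

	#include <signal.h>
	#include <stdio.h>

	/* Current filter location; NULL/0 while no filter is running. */
	static const char * volatile prof_cur_file;
	static volatile int prof_cur_line;

	/* "begin": two assignments on filter entry. */
	static inline void prof_filter_begin(const char *file, int line)
	{
		prof_cur_file = file;
		prof_cur_line = line;
	}

	/* "end": reset on filter exit. */
	static inline void prof_filter_end(void)
	{
		prof_cur_file = NULL;
		prof_cur_line = 0;
	}

	/* SIGUSR2 handler: emit one sample.  (A real handler would avoid
	 * stdio and defer the formatting to a safe context.) */
	static void prof_sigusr2(int sig)
	{
		const char *file = prof_cur_file;

		if (file)
			fprintf(stderr, "%s:%d\n", file, prof_cur_line);
	}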

Ideally, I'd like to be able to use a timer-based profiling mode
(e.g. perf record -F) with stack trace collection, but instead of
collecting bird stack traces, which would just show lots of nested
interpret() calls, I'd want it to collect f_inst call stacks.

(Right now, my patch only prints the top 'stack frame' when receiving
SIGUSR2, but there's no reason you wouldn't be able to instrument
interpret() to thread the stack 'frames', so that you'd be able to get
a full config language stack trace when a profiling event is triggered.)
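
Roughly, threading the frames could look like this (a sketch only; the
struct and function names are made up, and the real interpret() plumbing
would differ):

	#include <stdio.h>

	/* Hypothetical frame threading: each recursive interpret() call
	 * pushes a frame carrying its config-file location plus a link to
	 * the caller's frame, so a profiling event can walk the chain and
	 * print a full config-language backtrace. */
	struct prof_frame {
		const char *file;		/* config file of this f_inst */
		int line;			/* line number of this f_inst */
		struct prof_frame *parent;	/* frame of the calling f_inst */
	};

	/* Top of the current config-language call stack. */
	static struct prof_frame * volatile prof_top;

	/* Inside interpret(), around evaluation of one instruction: */
	static void interpret_one(const char *file, int line)
	{
		struct prof_frame frame = { file, line, prof_top };

		prof_top = &frame;		/* push */
		/* ... evaluate the instruction, possibly recursing ... */
		prof_top = frame.parent;	/* pop */
	}

	/* On a profiling event, walk the chain from the innermost frame: */
	static void prof_dump_stack(void)
	{
		struct prof_frame *f;

		for (f = prof_top; f; f = f->parent)
			fprintf(stderr, "  %s:%d\n", f->file, f->line);
	}

(Walking the chain from a signal handler while frames are being popped
would of course need some care in a real implementation.)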

That and JITting filter statements...


Thanks!
Lennert

