BIRD crashing when --enable-debug is set

andreas at rammhold.de andreas at rammhold.de
Sun Nov 22 18:41:18 CET 2020


On 18:21 22.11.20, Toke Høiland-Jørgensen wrote:
> andreas at rammhold.de writes:
> > I've been playing with the BIRD source code and implementing a new
> > extension to Babel but that is not what this mail is about.
> 
> Still curious - which extension? :)

The V4 over/via V6 draft. I would link to it but it seems the IETF
website is currently unreachable (or I am being cloudflared…).

> > Breakpoint 1, add_tail (n=0x234ac30, l=0x241f620) at ./lib/lists.c:82
> > #0  add_tail (n=0x234ac30, l=0x241f620) at ./lib/lists.c:82
> > #1  sl_alloc (s=0x21bd7d0) at lib/slab.c:264
> 
> This is that call to add_tail in slab.c, with a few lines of context:
> 
> full_partial:
>   rem_node(&h->n);
>   add_tail(&s->full_heads, &h->n);
>   goto redo;
> 
> Since add_tail is being called immediately after rem_node(), and they I
> both inlines, my guess would be that this comes from the compiler
> optimising out the clearing of n->prev and n->next in rem_node() since
> they are immediately being reset.

Interesting. The compiler removes the first assignment becuase it is
rewritten (across function boundaries/after inlining) but doesn't
realize that it is being read?

> > Are these all false positives or should there be more cleanup code on
> > these list entries?
> 
> Speaking for the backtraces you saw in the Babel code, these all come
> from variants of the following pattern:
> 
> struct babel_msg_node *msgn = sl_alloc(p->msg_slab);
> /* initialise msgn->msg */
> add_tail(&queue, NODE msgn);
> 
> where babel_msg_node is just:
> 
> struct babel_msg_node {
>   node n;
>   union babel_msg msg;
> };
> 
> Since add_tail sets n->next and n->prev right after those asserts, no
> harm is actually done and the memory ends up fully initialised, so I'm
> not sure if there's any reason to do anything about it: It's really
> only a problem because of the debug asserts.

That was my hope as well. Maybe there are some attributes / qualifiers
that could be added to the variables (when the debug flag is set) to
prohibit this optimisation? Having these checks is probably useful (e.g.
for CI, tests & development) but only if it doesn't break the entire
software.


Andi


More information about the Bird-users mailing list