bird 3.2.0 crashes: "Attempted to start channel's already started export"

Arkadiusz Miśkiewicz arekm at maven.pl
Wed Mar 18 19:39:54 CET 2026


Upgraded to 3.2.0 (from 1.x) on an ancient i686 machine. Running bird with
-R, it crashes with this backtrace:

#0  0xb7ef9569 in __kernel_vsyscall ()
#1  0xb7cc7ab7 in ?? () from /lib/libc.so.6
#2  0xb7c71171 in raise () from /lib/libc.so.6
#3  0xb7c59216 in abort () from /lib/libc.so.6
#4  0x08148c71 in bug (msg=msg@entry=0x815e7b8 "%s.%s: Attempted to start channel's already started export") at sysdep/unix/log.c:412
#5  0x080af197 in channel_start_export (c=c@entry=0x9ab5e60) at nest/proto.c:780
#6  0x080b2d0b in graceful_recovery_done (_=0x81d30d0 <_graceful_recovery_context+16>) at nest/proto.c:2082
#7  0x08090182 in ev_run_list_limited (l=<optimized out>, limit=4294967293, limit@entry=4294967295) at lib/event.c:338
#8  0x0813eb77 in io_loop () at sysdep/unix/io.c:2646
#9  0x0804b32e in main (argc=2, argv=0xbfea0d24) at sysdep/unix/main.c:1111



I fed the problem to Claude; its "findings" are below. (bird doesn't
crash with this "fix", but I can't tell whether it is the right fix or
just papers over the real bug.)

Kernel: skip export start in krt_init_scan() when gr_wait is set

When graceful restart recovery is active (-R), krt_start() sets gr_wait=1
on the kernel channel to defer export until recovery completes (via
graceful_recovery_done()). However, krt_init_scan() unconditionally calls
channel_start_export() when transitioning from KPS_INIT, without checking
gr_wait. When graceful_recovery_done() later runs, it finds gr_wait=1 and
tries to start the export again, triggering:

   bug("%s.%s: Attempted to start channel's already started export")

Reported on bird-users by Christoph (Jan 2026) for BIRD 3.1.5.
Still present in 3.2.0 despite bda2178e ("Kernel: pause exports also
on restart until scan is done") which set rt_notify=NULL in krt_start()
but didn't address the krt_init_scan() path.

Skip channel_start_export() in krt_init_scan() when gr_wait is set,
letting graceful_recovery_done() handle it instead.

--- a/sysdep/unix/krt.c
+++ b/sysdep/unix/krt.c
@@ -449,7 +449,9 @@ krt_init_scan(struct krt_proto *p)
      case KPS_INIT:
        /* Allow exports now */
        p->p.rt_notify = krt_rt_notify;
-      channel_start_export(p->p.main_channel);
+      /* When gr_wait is set, graceful_recovery_done() will start the export */
+      if (!p->p.main_channel->gr_wait)
+       channel_start_export(p->p.main_channel);
        rt_refresh_begin(&p->p.main_channel->in_req);
        p->sync_state = KPS_FIRST_SCAN;
        return 1;


   - The upstream fix bda2178e / 47ed4aa0 is already in 3.2.0: it sets
     rt_notify = NULL in krt_start() to defer exports
   - But it's incomplete: krt_init_scan() still unconditionally calls
     channel_start_export() on the first scan timer, without checking
     gr_wait
   - This is a new bug in the 3.2.0 release, not present in older code
     (since 91c98efe introduced the KPS_INIT state machine)
   - The fix on the 372-release-v3.2.1 branch has the same code, so the
     bug will still be in 3.2.1 unless they pick up a further fix
   - Our patch is the correct minimal fix for this



-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


