BIRD 2.0.4 segfaulting on ARM
lorenz at irmhil.de
Fri Apr 26 16:49:21 CEST 2019
Hello again,
I narrowed the bug down to the kernel protocol. The config for debugging
is as follows:
--- snip ---
log syslog all;
router id 10.33.0.0;
protocol device {
    scan time 15;
}
filter myrange {
    if net ~ fd22:9c28:6cf6::/48 then accept;
    reject;
};
protocol kernel {
    # scan time 10;
    ipv6 {
        export all;
    };
    # learn;
}
protocol ospf v3 MyOSPF {
    ipv6 {
        import filter myrange;
        export filter myrange;
    };
    area 0 {
        networks {
            fd22:9c28:6cf6::/48;
        };
        interface "eth0.1000" {
            hello 2;
            dead 5;
            cost 10;
        };
    };
}
--- snap ---
If I uncomment either "scan time" or "learn" in the "protocol kernel"
section, I get the segfault.
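For reference, a sketch of the "protocol kernel" section in its crashing
form. Both lines are shown uncommented here for brevity (an assumption made
for illustration); as described above, either one alone is enough to
trigger the segfault:
--- snip ---
protocol kernel {
    scan time 10;   # uncommenting this alone triggers the segfault
    ipv6 {
        export all;
    };
    learn;          # uncommenting this alone triggers it as well
}
--- snap ---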
Thanks for the code so far,
Lorenz
On 26.04.19 at 13:39, Maria Jan Matejka wrote:
> On 4/26/19 1:08 PM, lorenz at irmhil.de wrote:
>> Hello,
>>
>> after a "make clean", "./configure" and "make" I got this compile-time warning:
>>
>> --- snip ---
>>
>> sysdep/unix/io.c: In function ‘times_init’:
>> sysdep/unix/io.c:135:45: warning: comparison is always false due to limited range of data type [-Wtype-limits]
>> if ((ts.tv_sec < 0) || (((s64) ts.tv_sec) > ((s64) 1 << 40)))
>> ^
>> --- snap ---
>>
>>
>> But unfortunately the segmentation fault is still there. Is there anything I can do?
> Thank you for the investigation; unfortunately, I have no clue what may be happening.
> I'll try to set up a local QEMU host to reproduce this, and I'll get back
> to you off-list if I don't find any problem that may be related to it.
>
> Sadly, this looks very much like some strange use-after-free (possibly caused
> by some architecture-specific misbehaviour), which I probably can't debug
> from the core dump alone.
>
> Thank you
> Maria
>
>>
>> --- snip ---
>>
>> Core was generated by `bird -c bird.conf'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
>> 389 if (e->flags & EALF_BISECT)
>> (gdb) #0 ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
>> a = <optimized out>
>> l = <optimized out>
>> r = <optimized out>
>> m = <optimized out>
>> a = <optimized out>
>> l = <optimized out>
>> r = <optimized out>
>> m = <optimized out>
>> #1 ea_find (e=e@entry=0xf1ede200, id=id@entry=1554) at nest/rt-attr.c:426
>> a = <optimized out>
>> #2 0x005267ba in nl_send_route (p=p@entry=0x575748, e=e@entry=0x587170, op=op@entry=1536, dest=<optimized out>, nh=<optimized out>, nh@entry=0x59c664) at sysdep/linux/netlink.c:1269
>> ea = <optimized out>
>> net = 0x588174
>> a = 0x59c630
>> eattrs = <optimized out>
>> bufsize = 284
>> priority = <optimized out>
>> r = 0xbee4f180
>> rsize = 312
>> metrics = {16, 0, 0, 3202675588, 12, 5681780, 5796832, 5800932, 3202675764, 5783224, 5414941, 2147483648, 5797092, 5800932, 3202675764, 0}
>> ews = {eattrs = 0x58e174, ea = 0x52aafb <krt_got_route+574>, visited = {5585652, 5797092, 5796884, 5585808}}
>> #3 0x005271b8 in nl_add_rte (e=0x587170, p=0x575748) at sysdep/linux/netlink.c:1351
>> a = 0x59c630
>> err = 0
>> a = <optimized out>
>> err = <optimized out>
>> nh = <optimized out>
>> #4 krt_replace_rte (p=p@entry=0x575748, n=n@entry=0x588174, new=new@entry=0x587170, old=old@entry=0x5874b0) at sysdep/linux/netlink.c:1387
>> err = 0
>> #5 0x0052a4d2 in krt_prune (p=0x575748) at sysdep/unix/krt.c:751
>> verdict = 2
>> new = <optimized out>
>> old = 0x5874b0
>> rt_free = 0x0
>> fn_ = 0x58817c
>> ff_ = 0x574694
>> count_ = <optimized out>
>> n = <optimized out>
>> t = 0x574330
>> t = <optimized out>
>> fn_ = <optimized out>
>> ff_ = <optimized out>
>> count_ = <optimized out>
>> n = <optimized out>
>> verdict = <optimized out>
>> new = <optimized out>
>> old = <optimized out>
>> rt_free = <optimized out>
>> #6 krt_scan (t=<optimized out>) at sysdep/unix/krt.c:838
>> p = 0x575748
>> q = 0x5758a8
>> #7 0x004f2b86 in timers_fire (loop=loop@entry=0x56b7f0 <main_timeloop>) at lib/timer.c:235
>> base_time = 74049570330
>> t = <optimized out>
>> #8 0x0052976e in io_loop () at sysdep/unix/io.c:2193
>> poll_tout = <optimized out>
>> timeout = <optimized out>
>> nfds = <optimized out>
>> events = 1
>> pout = <optimized out>
>> t = <optimized out>
>> s = <optimized out>
>> n = <optimized out>
>> fdmax = 256
>> pfd = 0x585df0
>> #9 0x004dabc6 in main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:884
>> use_uid = <optimized out>
>> use_gid = <optimized out>
>> conf = <optimized out>
>> (gdb) quit
>>
>> --- snap ---
>>
>>
>>
>> On 26.04.19 at 08:09, lorenz at irmhil.de wrote:
>>> Hello again!
>>>
>>> I'm new to gdb - thank you for your quick advice.
>>>
>>> I ran bird again, about 10 seconds later it segfaulted again and dumped core.
>>>
>>> Looks like some strange metrics?
>>>
>>> I tried running bird on another ARM v7 box (Odroid XU4, nearly the same hardware as the Odroid HC-2) on the same network with a similar config. That bird doesn't crash. Perhaps something went wrong while compiling or installing bird; I'll try recompiling and reinstalling it.
>>>
>>> Thanks for any support!
>>>
>>> Lorenz
>>>
>>>
>>> The backtrace is:
>>>
>>> --- snip ---
>>> ...