[PATCH] Bus error on ARMv7 when using OSPF

Ondrej Zajicek santiago at crfreenet.org
Thu Jun 24 14:08:39 CEST 2021


On Fri, Jun 18, 2021 at 05:06:27PM +0100, Matthew Reeve wrote:
> Hi, yes sure, here it is. Please let me know if this does not give you what
> you need.
> 
> Thanks!


Thanks, that looks like an issue with slists. We had a similar issue with
the list code in the past and reworked it to be more conservative. I will
check that.
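
For readers unfamiliar with this class of fault, here is a minimal generic
sketch in C. It is not BIRD's slist code and not the eventual fix; the
struct and function names are hypothetical. It only illustrates that the
compiler assumes a struct pointer is aligned to the struct's alignment, so
data at an address that does not satisfy this should be copied out with
memcpy() rather than accessed through a cast pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record used for illustration only (not top_hash_entry). */
struct record {
  uint32_t id;
  uint64_t metric;
};

/* Returns nonzero if ptr meets the alignment the compiler assumes
 * for struct record. */
int record_is_aligned(const void *ptr)
{
  return ((uintptr_t) ptr % _Alignof(struct record)) == 0;
}

/* Portable load from a possibly misaligned address: copy the bytes
 * into a properly aligned local object instead of dereferencing
 * a cast pointer. */
struct record record_load(const void *ptr)
{
  struct record r;

  assert(ptr != NULL);
  memcpy(&r, ptr, sizeof(r));
  return r;
}
```

On ARMv7 the compiler may emit load/store instructions that require the
assumed alignment, so dereferencing a misaligned struct pointer can raise an
alignment exception, while the memcpy() form stays valid on any address.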


> root@OpenWrt:/tmp# gdb debug/bird bird.1623776146.6869.7.core
> GNU gdb (GDB) 10.1
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "arm-openwrt-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>     <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from debug/bird...
> [New LWP 6869]
> Core was generated by `./bird'.
> Program terminated with signal SIGBUS, Bus error.
> #0  ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
> 1646    proto/ospf/rt.c: No such file or directory.
> (gdb) bt
> #0  ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
> #1  ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1698
> #2  ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1688
> #3  ospf_disp (timer=<optimized out>) at proto/ospf/ospf.c:468
> #4  0x00061574 in timers_fire (loop=0xc4878 <main_timeloop>) at lib/timer.c:235
> #5  0x00012ca8 in io_loop () at sysdep/unix/io.c:2195
> #6  main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:939
> (gdb)
> 
> On 18/06/2021 16:16, Ondrej Zajicek wrote:
> > On Mon, Jun 14, 2021 at 04:25:04PM +0100, Matthew Reeve wrote:
> > > Hi,
> > > 
> > > > When using BIRD 2.0.8 on OpenWrt 21.02 (and other versions) on a Netgear
> > > > R7800 router, BIRD crashes immediately on startup whenever the OSPF
> > > > protocol is used, either v2 or v3, with:
> > > 
> > > Fri Jun 11 14:41:11 2021 daemon.info bird: Started
> > > > Fri Jun 11 14:41:11 2021 kern.err kernel: [ 3500.853248] Alignment trap: not handling instruction f44c0a1f at [<00035848>]
> > > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.853283] 8<--- cut here ---
> > > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.859363] Unhandled fault: alignment exception (0x801) at 0x007e0624
> > > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.862443] pgd = 0bbef4fd
> > > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.868821] [007e0624] *pgd=5d6ca835, *pte=5c40b75f, *ppte=5c40bc7f
> > > 
> > > 
> > > > This router uses an ARMv7 processor and the issue seems to be related to
> > > > memory alignment. I've debugged it and traced it to an access to the
> > > > top_hash_entry struct. I've found that adding the PACKED macro to the
> > > > struct definition fixes the problem, as per this patch:
> > Hi
> > 
> > > Thanks, could you try to get a backtrace from the core dump using gdb to
> > > see where the invalid access is?
> > 
> > 

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."


