ospf: bird stuck in R state

Yury Shevchuk bird at hole.botik.ru
Mon Dec 22 20:40:08 CET 2008


On Sun, Dec 21, 2008 at 12:46:22AM +0100, Ondrej Zajicek wrote:
> What are your other experiences with running ospf with bird?

Thank you for asking :) OSPF mostly works (18 routers running BIRD
OSPF in a single area) but bites from time to time.  Some experiences
are below.  Despite the problems, I feel BIRD is worth the effort
put into debugging.  It does the job while being really small, like a
real bird -- only the essential feathers, no fat body.

-- sizif


The experiences:

1) Cryptographic authentication is broken.  I have fixed one bug (*)
but put the problem on hold after running into a sequence number
problem: BIRD sometimes reorders outgoing OSPF packets, and when this
happens the peer reports:

<TRACE> ospf1: OSPF_auth: lower sequence number
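
To illustrate why reordering on the sender side is fatal here, below is
a minimal sketch of the receive-side check, modelled on the logic
visible in the patch (*).  The struct and function names
(neighbor_sketch, accept_csn, count_dropped) are made up for
illustration and are not BIRD identifiers; the sequence numbers are
assumed to be in host byte order already.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-neighbor state; only the one field
 * that the check needs.  Not the real BIRD structure. */
struct neighbor_sketch {
  uint32_t csn;   /* highest cryptographic sequence number accepted so far */
};

/* Returns 1 if the packet passes the sequence check, 0 if it is dropped.
 * Mirrors the shape of the check in the patch: reject a lower number,
 * otherwise remember the new one. */
static int accept_csn(struct neighbor_sketch *n, uint32_t csn)
{
  if (csn < n->csn)
    return 0;               /* "lower sequence number" case */
  n->csn = csn;
  return 1;
}

/* Feed sequence numbers in arrival order and count how many are dropped. */
static int count_dropped(const uint32_t *csns, int len)
{
  struct neighbor_sketch n = { 0 };
  int dropped = 0;

  for (int i = 0; i < len; i++)
    if (!accept_csn(&n, csns[i]))
      dropped++;
  return dropped;
}
```

With arrival order 100, 101, 102 nothing is dropped; with 100, 102, 101
the last packet is rejected, even though it is otherwise valid.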


2) There's a bug in the LSDB synchronisation process which prevents
adjacency establishment in the presence of flapping OSPF ASEs.  The
process advances to LOADING and then drops back to EXSTART, with the
slave never reaching the FULL state.  I observed the symptom while
trying to handle 1000+ ASEs exported to OSPF from BGP.  The bug
manifests itself like this:

03-12-2008 21:04:33 <WARN> Received bad LS req from: 192.168.73.48 looking: RT: 192.168.73.120, ID: 209.85.128.1, Type: 5

I have worked around the bug by reducing the number of routes exported
from BGP to OSPF.

This spot within ospf_lsupd_send_list seems to be relevant to the
problem:

    if ((en = ospf_hash_find(po->gr, oa->areaid, llsh->lsh.id, llsh->lsh.rt,
                             llsh->lsh.type)) == NULL)
      continue;                 /* Probably flushed LSA */
    /* FIXME This is a bug! I cannot flush LSA that is in lsrt */



3) (not an OSPF bug, though) A netlink problem is described here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=428865
The patch attached to the page seems to work well.



(*) a patch to proto/ospf/packet.c
--- /tmp/packet.c       2008-12-22 22:00:57.574539077 +0300
+++ packet.c    2008-11-17 23:44:21.000000000 +0300
@@ -180,12 +180,13 @@
 
       if(n)
       {
-        if(ntohs(pkt->u.md5.csn) < n->csn)
+        if(ntohl(pkt->u.md5.csn) < n->csn)
         {
-          OSPF_TRACE(D_PACKETS, "OSPF_auth: lower sequence number");
+          OSPF_TRACE(D_PACKETS, "OSPF_auth: lower sequence number 0x%08x < 0x%08x",
+                    ntohl(pkt->u.md5.csn), n->csn);
           return 0;
         }
-        n->csn = ntohs(pkt->u.md5.csn);
+        n->csn = ntohl(pkt->u.md5.csn);
       }
 
       MD5Init(&ctxt);
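
For readers wondering what the ntohs -> ntohl change buys: the
cryptographic sequence number is a 32-bit field, and ntohs only
converts a 16-bit value.  The sketch below shows the difference in
isolation.  The helper names are hypothetical and this is not BIRD
code; the explicit cast just spells out the truncation that happens
implicitly when a 32-bit value is handed to ntohs.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only.
 * wire_csn is the 32-bit field exactly as it sits in the packet,
 * i.e. in network byte order. */

/* Old behaviour: only 16 bits of the raw field go through the
 * conversion, so the result can never exceed 0xFFFF. */
static uint32_t decode_csn_ntohs(uint32_t wire_csn)
{
  return ntohs((uint16_t) wire_csn);
}

/* Patched behaviour: the full 32-bit field is converted. */
static uint32_t decode_csn_ntohl(uint32_t wire_csn)
{
  return ntohl(wire_csn);
}
```

For a sender-side value of 0x12345678, decode_csn_ntohl returns
0x12345678 again, while decode_csn_ntohs returns something that fits in
16 bits (which 16 bits depends on host endianness), so comparing it
against a stored 32-bit number is not meaningful.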


