bug report: OSPF adjacency creation

Mohammad Amin Shoaie mashoaie at gmail.com
Mon Oct 24 12:06:15 CEST 2011


Dear bird developers and users,

I'm having a problem with OSPF adjacency creation. Under a specific
condition, it adds a delay of 30 minutes (LSRefreshTime) to the
adjacency creation process.
I guess bird's developers are already aware of this problem, because
it is marked in "line:675, file:lsupd.c, bird:1.3.4", where the
developer has left the comment "/* FIXME pg145 (6) */". Therefore, aside
from the rest of this email, in which I explain how the problem occurs,
I was wondering what the reason is that the developers did not fix it.


How Does The Problem Occur?

Consider the simple point-to-point network depicted in Figure 1. In
this network two nodes with router IDs 0.0.0.1 and 0.0.0.2 are
connected via interfaces "Ia" and "Ib". If we suppose that the
neighbor adjacency is already formed, then each node has two LSA
instances in its link-state database. The link-state database
contents of the two nodes are the same (probably differing only in the
age field). The LSA headers can be summarized by their "LS-type",
"LS-ID", "Advertising Router", "LS sequence number" and "LS checksum"
fields as: LSA1-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x,
checksum:ch1) LSA2-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y,
checksum:ch2).



                      +-----+Ia     Ib+-----+
              LSA1    | RT1 |---------| RT2 |    LSA1
              LSA2    +-----+         +-----+    LSA2

LSA1-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1)
LSA2-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y, checksum:ch2)

Fig.1



The data included in LSA1 indicate that router 0.0.0.1 is connected to
router 0.0.0.2 via interface "Ib" and similarly the
data included in LSA2 indicate router 0.0.0.2 is connected to router
0.0.0.1 via the interface "Ia".

Now if the link between these two nodes is removed in both directions,
as shown in Figure 2, and the RouterDeadInterval timer has expired,
each router has to delete the former adjacency. This means that each
router generates a new LSA, named LSA1* and LSA2* respectively. With
the method implemented by the developer, the headers of these two new
LSAs (except for the "age" and "checksum" fields) are the same as those
of the LSAs they replace. That is,
LSA1*-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1*)
and LSA2*-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y,
checksum:ch2*). However, the body of each new LSA differs from its
predecessor, since the previous adjacency has been removed on each side.



                      +-----+Ia     Ib+-----+
             LSA1*    | RT1 |         | RT2 |    LSA1
             LSA2     +-----+         +-----+    LSA2*

LSA1*-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1*)
LSA2-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y, checksum:ch2)

LSA1-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1)
LSA2*-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y, checksum:ch2*)

Fig.2



Attention: As I am going to explain in detail, the key condition under
which the problem occurs is that, when LSA1* and LSA2* are generated,
we have "ch1>ch1* and ch2*>ch2". The reason behind this condition can
be analysed separately. The point is that it DOES occur, randomly, in
the repeated tests I have made.

Now suppose the link between the two nodes is restored, as in Figure 3,
and let us follow the adjacency creation process. At the beginning
each node sends a summary of its link-state database in database
description packets. As a result, router 0.0.0.1 and router 0.0.0.2
send {LSA1*,LSA2} and {LSA1,LSA2*} respectively to the other side. When
a database description packet is received, each LSA header in it is
tested, by the condition in "line:232, function:ospf_dbdes_reqladd,
file:dbdes.c, bird:1.3.4", to see whether it is new or not. If it is
new, the LSA is added to the link-state request list; otherwise it is
discarded. An LSA counts as new either when there is no instance of it
in the link-state database or when the received LSA is more recent than
the instance held in the link-state database. Here "more recent" is
defined as in RFC2328 section 13.1. (For two LSAs with the same type,
id and advertising router, the comparison uses first the sequence
number, second the checksum and third the age.)
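
To make the comparison concrete, here is a minimal sketch of the RFC2328
section 13.1 ordering in Python. It is not bird's C code; the names
LsaHeader and compare_lsa are made up for illustration, and sequence
numbers are assumed to be already decoded as signed integers.

```python
from dataclasses import dataclass

MAX_AGE = 3600       # MaxAge in seconds (RFC 2328, Appendix B)
MAX_AGE_DIFF = 900   # MaxAgeDiff in seconds (RFC 2328, Appendix B)


@dataclass(frozen=True)
class LsaHeader:
    """Hypothetical container for the header fields used in this email."""
    ls_type: int
    ls_id: str
    adv_router: str
    seq: int          # assumed to be a signed integer already
    checksum: int
    age: int = 0


def compare_lsa(a: LsaHeader, b: LsaHeader) -> int:
    """Return 1 if a is more recent, -1 if b is, 0 if they are the same instance.

    Both headers are assumed to share type, id and advertising router.
    """
    if a.seq != b.seq:                      # first: sequence number
        return 1 if a.seq > b.seq else -1
    if a.checksum != b.checksum:            # second: checksum
        return 1 if a.checksum > b.checksum else -1
    if (a.age == MAX_AGE) != (b.age == MAX_AGE):
        return 1 if a.age == MAX_AGE else -1
    if abs(a.age - b.age) > MAX_AGE_DIFF:   # third: age
        return 1 if a.age < b.age else -1
    return 0
```

With equal sequence numbers, the header with the larger checksum wins,
which is exactly the case that matters in the rest of this scenario.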

On receiving the database description packets, each router checks
whether each listed LSA should be added to its link-state request list
or not. LSA1* and LSA2 have the same sequence numbers as LSA1 and LSA2*
respectively. Therefore, under the supposed condition (ch1>ch1* and
ch2*>ch2), LSA1 and LSA2* are considered more recent than LSA1* and
LSA2 respectively. As a result, router 0.0.0.1 adds LSA1 and LSA2* to
its link-state request list. By the same reasoning, router 0.0.0.2
discards LSA1* and LSA2, and right after finishing the database
exchange process it establishes the new adjacency by generating a new
LSA, named LSA2**, which has the same sequence number as LSA2* but a
different age, content and checksum.
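
The asymmetric outcome of the database exchange can be reproduced with
a small Python sketch. This is only a model of the decision described
above, not the code of ospf_dbdes_reqladd; the function names and the
numeric values for x, y and the checksums are hypothetical, and the age
tie-break is left out because it plays no role here.

```python
def is_newer(received, stored):
    """True if `received` is more recent than `stored` (seq, then checksum)."""
    return (received["seq"], received["cksum"]) > (stored["seq"], stored["cksum"])


def build_request_list(db, dd_headers):
    """Return the LSA keys a router would put on its link-state request list."""
    requests = []
    for key, hdr in dd_headers.items():
        stored = db.get(key)
        if stored is None or is_newer(hdr, stored):
            requests.append(key)
    return requests


# Hypothetical values chosen so that ch1 > ch1* and ch2* > ch2.
X, Y = 5, 7
CH1, CH1_STAR = 0x9000, 0x1000
CH2, CH2_STAR = 0x2000, 0xA000

# Databases after the link went down, keyed by advertising router.
rt1_db = {"0.0.0.1": {"seq": X, "cksum": CH1_STAR},   # LSA1*
          "0.0.0.2": {"seq": Y, "cksum": CH2}}        # LSA2
rt2_db = {"0.0.0.1": {"seq": X, "cksum": CH1},        # LSA1
          "0.0.0.2": {"seq": Y, "cksum": CH2_STAR}}   # LSA2*

# Each router sees the other's headers in the database description packets.
rt1_requests = build_request_list(rt1_db, rt2_db)
rt2_requests = build_request_list(rt2_db, rt1_db)

print(rt1_requests)  # ['0.0.0.1', '0.0.0.2']
print(rt2_requests)  # []
```

Router 0.0.0.1 ends up requesting both LSAs while router 0.0.0.2
requests nothing, so only one side has pending requests.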



                      +-----+Ia     Ib+-----+
             LSA1*    | RT1 |---------| RT2 |    LSA1
             LSA2     +-----+         +-----+    LSA2**

LSA1*-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1*)
LSA2-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y, checksum:ch2)

LSA1-Header=(type:1, id:0.0.0.1, rt:0.0.0.1, sqNo:x, checksum:ch1)
LSA2**-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y, checksum:ch2**)

Fig.3



Back to the scenario: router 0.0.0.1 asks for LSA1 and LSA2*, as they
are in its link-state request list. In response to this request,
router 0.0.0.2 prepares a link-state update packet consisting of LSA2**
and LSA1 (note that the immediate adjacency completion on router
0.0.0.2 has replaced LSA2* with LSA2**). When router 0.0.0.1 receives
LSA1, according to RFC2328 section 13.4 it recognizes it as a
self-originated LSA, and thus it responds by increasing the sequence
number and flooding the LSA back to router 0.0.0.2. However, the case
of LSA2** is different. It may not be obvious at first glance, but
LSA2** is exactly the same as LSA2: it has the same "LS-type", "LS-ID",
"Advertising Router", "LS sequence number" and "LS checksum" (they both
indicate that router 0.0.0.2 has an adjacency with router 0.0.0.1 via
the same interface). Thus, if the difference between the age fields is
less than MaxAgeDiff, they must be treated as the same LSA instance.

Upon receiving LSA2**, the code in "line:678,
function:ospf_lsupd_receive, file:lsupd.c, bird:1.3.4" simply answers
this LSA with a link-state acknowledgment packet, without checking
whether it is present in the link-state request list or not.
Consequently, router 0.0.0.1 repeatedly asks for an update of LSA2* and
router 0.0.0.2 repeatedly answers with LSA2**, which is the same as the
LSA2 held in router 0.0.0.1's database. This loop continues until the
LSRefreshTime timer for LSA2** expires and router 0.0.0.2 originates
LSA2*** with a new sequence number:
LSA2***-Header=(type:1, id:0.0.0.2, rt:0.0.0.2, sqNo:y+1,
checksum:ch2***). At this point router 0.0.0.1 accepts LSA2*** as the
more recent LSA and, in a happy ending, establishes the new adjacency.
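
The loop can be modelled in a few lines of Python. This is a sketch of
the behaviour as I describe it above, not bird's actual receive code;
receive_lsa and all numeric values are hypothetical.

```python
def receive_lsa(db, request_list, key, received):
    """Model of the receive path described above (not bird's actual code)."""
    stored = db[key]
    new = (received["seq"], received["cksum"])
    old = (stored["seq"], stored["cksum"])
    if new > old:
        db[key] = received
        request_list.discard(key)   # the pending request is satisfied
        return "installed"
    if new == old:
        return "acked"              # same instance: acked, request list untouched
    return "ignored"


Y, CH2, CH2_3STAR = 7, 0x2000, 0x3000            # hypothetical values
rt1_db = {"0.0.0.2": {"seq": Y, "cksum": CH2}}   # router 0.0.0.1 still holds LSA2
rt1_requests = {"0.0.0.2"}                       # and is waiting for LSA2*

# Router 0.0.0.2 keeps answering with LSA2**, whose header equals LSA2's.
lsa2_2star = {"seq": Y, "cksum": CH2}
results = [receive_lsa(rt1_db, rt1_requests, "0.0.0.2", lsa2_2star)
           for _ in range(3)]
stuck = bool(rt1_requests)   # the request is still pending after every round

# After LSRefreshTime, LSA2*** arrives with sequence number y+1.
lsa2_3star = {"seq": Y + 1, "cksum": CH2_3STAR}
final = receive_lsa(rt1_db, rt1_requests, "0.0.0.2", lsa2_3star)
```

Every round before the refresh returns "acked" and leaves the request
list non-empty; only the LSA with the incremented sequence number
clears it.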

Now here is my question: rule number 6 on page 145 of RFC2328 can be
considered a solution for this situation. Under that rule, when a node
receives an LSA that is not more recent than its database copy while
that LSA is still on its link-state request list, it generates
BadLSReq. As a result the database exchange process restarts, and in
the new round the previous problem no longer occurs.
So why is this rule not applied?
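
For reference, this is roughly how I read the order of steps (5) to (8)
of RFC2328 section 13, written as a Python sketch. It is not a proposed
patch for lsupd.c; receive_lsa_rfc and the values below are
hypothetical, and the age comparison is omitted for brevity.

```python
def receive_lsa_rfc(db, request_list, key, received):
    """Sketch of RFC 2328 section 13, steps (5) to (8), ignoring LS age."""
    stored = db.get(key)
    new = (received["seq"], received["cksum"])
    old = None if stored is None else (stored["seq"], stored["cksum"])
    if old is None or new > old:       # step (5): received copy is more recent
        db[key] = received
        request_list.discard(key)
        return "installed"
    if key in request_list:            # step (6): requested, but not more recent
        return "BadLSReq"              # restart the database exchange
    if new == old:                     # step (7): same instance
        return "acked"
    return "db copy is newer"          # step (8)


Y, CH2 = 7, 0x2000                               # hypothetical values
rt1_db = {"0.0.0.2": {"seq": Y, "cksum": CH2}}   # LSA2 in the database
rt1_requests = {"0.0.0.2"}                       # LSA2* still requested
outcome = receive_lsa_rfc(rt1_db, rt1_requests, "0.0.0.2",
                          {"seq": Y, "cksum": CH2})   # LSA2** arrives
print(outcome)  # BadLSReq
```

Because the request-list check comes before the same-instance check,
LSA2** triggers BadLSReq at once instead of being acknowledged again
and again.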

I'd be glad to have your comments and suggestions on this issue.

Regards
Amin Shoaie


