BFD and Juniper SRX inter-op issue

Alexander Shevchenko pepelac at gmail.com
Thu Sep 2 09:40:56 CEST 2021


Hi.
We are observing the same issue between BIRD (both 1.6 and 2.0) and a Juniper
qfx5120.
It seems that JUNOS has some issue with calculating the detection time
(https://datatracker.ietf.org/doc/html/rfc5880#section-6.8.4) and keeps using
the old bfd.RemoteDiscr, which RFC 5880 describes as follows:

      The remote discriminator for this BFD session.  This is the
      discriminator chosen by the remote system, and is totally opaque
      to the local system.  This MUST be initialized to zero.  If a
      period of a Detection Time passes without the receipt of a valid,
      authenticated BFD packet from the remote system, this variable
      MUST be set to zero.
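
A minimal sketch of what that paragraph asks of an implementation, in Python
for illustration only. The class and method names are hypothetical and are not
taken from BIRD or JUNOS; time is passed in explicitly to keep the behaviour
easy to follow.

```python
from dataclasses import dataclass


@dataclass
class BfdSession:
    """Hypothetical holder for a few RFC 5880 session variables."""
    detection_time: float      # seconds, see RFC 5880 section 6.8.4
    remote_discr: int = 0      # bfd.RemoteDiscr, MUST be initialized to zero
    last_rx: float = 0.0       # time the last valid packet was received

    def on_valid_packet(self, my_discr: int, now: float) -> None:
        # Learn the peer's discriminator from the My Discriminator field
        # of every valid packet, so a restarted peer is picked up again.
        self.remote_discr = my_discr
        self.last_rx = now

    def on_tick(self, now: float) -> None:
        # A full Detection Time without a valid packet: forget the peer's
        # discriminator, as the quoted text requires.
        if now - self.last_rx >= self.detection_time:
            self.remote_discr = 0


session = BfdSession(detection_time=10.0)
session.on_valid_packet(my_discr=1932316248, now=0.0)
session.on_tick(now=5.0)
print(session.remote_discr)    # still the learned value
session.on_tick(now=10.0)
print(session.remote_discr)    # reset to zero
```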

From the BIRD point of view, the BFD session between 10.10.255.1 <->
10.10.255.0 is in the Down state.
bird.log with BFD debug enabled:
2021-06-09 13:58:53 <TRACE> bfd1: Shutting down
2021-06-09 13:58:53 <TRACE> bfd1: State changed to flush
2021-06-09 13:58:53 <TRACE> bfd1: State changed to down
2021-06-09 13:58:53 <TRACE> bfd1: Initializing
2021-06-09 13:58:53 <TRACE> bfd1: Starting
2021-06-09 13:58:53 <TRACE> bfd1: Session to 10.10.255.0 added
2021-06-09 13:58:53 <TRACE> bfd1: Session to 10.10.255.2 added
2021-06-09 13:58:53 <TRACE> bfd1: Connected to table master
2021-06-09 13:58:53 <TRACE> bfd1: State changed to feed
2021-06-09 13:58:53 <TRACE> bfd1: Sending CTL to 10.10.255.2 [Down]
2021-06-09 13:58:53 <TRACE> bfd1: State changed to up
2021-06-09 13:58:53 <TRACE> bfd1: Sending CTL to 10.10.255.0 [Down]
2021-06-09 13:58:53 <RMT> bfd1: Bad packet from 10.10.255.0 - unknown session id (1932316248)
2021-06-09 13:58:53 <RMT> bfd1: Bad packet from 10.10.255.0 - unknown session id (1932316248)
2021-06-09 13:58:53 <TRACE> bfd1: Sending CTL to 10.10.255.0 [Down]
2021-06-09 13:58:53 <TRACE> bfd1: Sending CTL to 10.10.255.2 [Down]
2021-06-09 13:58:53 <RMT> bfd1: Bad packet from 10.10.255.0 - unknown session id (1932316248)
2021-06-09 13:58:55 <RMT> bfd1: Bad packet from 10.10.255.0 - unknown session id (1932316248)
2021-06-09 13:58:55 <TRACE> bfd1: Sending CTL to 10.10.255.0 [Down]
2021-06-09 13:58:55 <TRACE> bfd1: Sending CTL to 10.10.255.2 [Down]


bird> show bfd sessions
bfd1:
IP address                Interface  State      Since       Interval  Timeout
10.10.255.0               eno1       Down       13:58:52      2.000    0.000
10.10.255.2               ens2f0     Up         15:29:07      2.000    6.000
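
The "unknown session id" lines are consistent with how RFC 5880 section 6.8.6
selects a session: a nonzero Your Discriminator must match a local
discriminator, otherwise the packet is discarded. A rough illustration follows.
The names are hypothetical (this is not BIRD's actual code), and the local
discriminator 42 is a made-up value standing in for whatever BIRD presumably
chose after its restart.

```python
from typing import Dict, Optional


def select_session(by_discr: Dict[int, str], by_addr: Dict[str, str],
                   your_discr: int, src_addr: str) -> Optional[str]:
    """Return the matching session name, or None if the packet is dropped."""
    if your_discr != 0:
        # A nonzero Your Discriminator MUST be used for the lookup.
        return by_discr.get(your_discr)
    # Zero: fall back to other fields, here simply the source address.
    # (The RFC also requires the State field to be Down or AdminDown in
    # this case; that check is omitted to keep the sketch short.)
    return by_addr.get(src_addr)


by_discr = {42: "session to 10.10.255.0"}
by_addr = {"10.10.255.0": "session to 10.10.255.0"}

# The peer still echoes the discriminator from before the restart.
print(select_session(by_discr, by_addr, 1932316248, "10.10.255.0"))  # None
# A packet with Your Discriminator zero would still find the session.
print(select_session(by_discr, by_addr, 0, "10.10.255.0"))
```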


From the qfx5120, the same session is in the Up state:
adm@leaf> show bfd session address 10.10.255.1 extensive
                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
10.10.255.1              Up        et-0/0/5.0     10.000    1.000        3
 Client BGP, TX interval 1.000, RX interval 1.000
Session up time 04:55:53
Local diagnostic None, remote diagnostic None
Remote state Up, version 1
Session type: Single hop BFD
Min async interval 1.000, min slow interval 1.000
Adaptive async TX interval 1.000, RX interval 1.000
Local min TX interval 1.000, minimum RX interval 1.000, multiplier 3
Remote min TX interval 0.100, min RX interval 0.100, multiplier 10
Local discriminator 30, remote discriminator 1932316248
Echo mode disabled/inactive, no-absorb, no-refresh, update-adj
  Session ID: 0x0

1 sessions, 1 clients
Cumulative transmit rate 1.0 pps, cumulative receive rate 1.0 pps
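
As a cross-check, the Detect Time of 10.000 shown above is what RFC 5880
section 6.8.4 gives for asynchronous mode with these parameters: the Detect
Mult received from the remote system, multiplied by the greater of the local
Required Min RX Interval and the remote Desired Min TX Interval. A small
sketch with the values from this output (the function and parameter names are
mine, for illustration):

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    """Asynchronous-mode Detection Time in microseconds (RFC 5880, 6.8.4)."""
    # The remote system transmits no faster than the local side accepts.
    agreed_remote_tx_us = max(local_required_min_rx_us,
                              remote_desired_min_tx_us)
    return remote_detect_mult * agreed_remote_tx_us


# From the qfx5120 output: remote multiplier 10, local minimum RX interval
# 1.000 s, remote min TX interval 0.100 s.
print(detection_time_us(10, 1_000_000, 100_000) / 1_000_000)  # 10.0
```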

Jun  9 15:56:14 Received Upstream SetSession (2) len 158:
Jun  9 15:56:14    Version (0) len 1: 255
Jun  9 15:56:14    ClientName (2) len 4: BGP
Jun  9 15:56:14    IfName (4) len 11: et-0/0/5.0
Jun  9 15:56:14    IfIndex (3) len 4: 568
Jun  9 15:56:14    DestAddr (6) len 8: 10.10.255.1
Jun  9 15:56:14    Handle (13) len 4: 0xf000000
Jun  9 15:56:14    TxInterval (7) len 4: 1000000
Jun  9 15:56:14    RxInterval (8) len 4: 1000000
Jun  9 15:56:14    HighTransmitInterval (10) len 4: 0
Jun  9 15:56:14    Multiplier (9) len 4: 3
Jun  9 15:56:14    HoldownInterval (24) len 4: 0
Jun  9 15:56:14    CntrlFlags (15) len 4: 0x0
Jun  9 15:56:14    RouteTblIdx (20) len 4: 10
Jun  9 15:56:14    SrcAddr (5) len 8: 10.10.255.0
Jun  9 15:56:14    AdaptationType (22) len 1: 0
Jun  9 15:56:14    HighDetectionTime (23) len 4: 0
Jun  9 15:56:14    RefreshDeadtime (25) len 4: 0
Jun  9 15:56:14    Unknown (30) len 1: (hex) 00

Jun  9 15:56:14 (bfdd_set_session_params:400) Session 10.10.255.1 (IFL 568): cur tx ivl 1000000, adapt tx ivl 1000000, pre tx ivl cur rx ivl 1000000, adapt rx ivl 1000000, pre tx ivl 1000000
Jun  9 15:56:14 (bfdd_set_session_params:453) Session 10.10.255.1 (IFL 568): cur tx ivl 1000000, adapt tx ivl 1000000, pre tx ivl cur rx ivl 1000000, adapt rx ivl 1000000, pre tx ivl 1000000
Jun  9 15:56:14 (bfdd_update_tx_intervals:510) Session 10.10.255.1 (IFL 568): cur tx ivl 1000000, new_invl 0(3,0,1000000,1000000)
Jun  9 15:56:14 (bfdd_update_tx_intervals:537) Session 10.10.255.1 (IFL 568): cur tx ivl 1000000, new_invl 1000000
Jun  9 15:56:14 (bfdd_update_tx_intervals:555) Session 10.10.255.1 (IFL 568): cur tx ivl 1000000, new_invl 1000000

Jun  9 15:56:14 Sent     Downstream SetAdj (11) len 139:
Jun  9 15:56:14    IfIndex (3) len 4: 568
Jun  9 15:56:14    SrcAddr (5) len 8: 10.10.255.1
Jun  9 15:56:14    HoldTime (14) len 8: 10 sec 0 nsec
Jun  9 15:56:14    NoAbsorb (15) len 1: True
Jun  9 15:56:14    NoRefresh (16) len 1: True
Jun  9 15:56:14    ForceRefresh (17) len 1: False
Jun  9 15:56:14    DoNotAge (18) len 1: True
Jun  9 15:56:14    Distribute (27) len 1: True
Jun  9 15:56:14    LooseAuth (122) len 1: (hex) 00
Jun  9 15:56:14    Discriminator (63) len 4: 0x1e
Jun  9 15:56:14    DestAddr (8) len 8: 10.10.255.0
Jun  9 15:56:14    RtblIdx (24) len 4: 10
Jun  9 15:56:14    MinRecvTTL (68) len 1: 255
Jun  9 15:56:14    RecvOnMhopPort (101) len 1: 0
Jun  9 15:56:14    Unknown (153) len 1: (hex) 00
Jun  9 15:56:14    Unknown (154) len 4: (hex) 00 00 00 00
Jun  9 15:56:14    Unknown (165) len 4: (hex) 00 00 00 03
Jun  9 15:56:14    Unknown (211) len 1: (hex) 04
Jun  9 15:56:14    Unknown (167) len 1: (hex) 01
Jun  9 15:56:14 (bfdd_build_packet:2261) : Session 10.10.255.1 (IFL 568): cur tx ivl 1000000
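
For comparison with the traces above, the state transitions on packet
reception from RFC 5880 section 6.8.6 can be sketched as follows. This is an
illustration in Python with a hypothetical function name, not JUNOS or BIRD
code, and it assumes the packet has already passed validation and been matched
to a session.

```python
def next_state(local: str, received: str) -> str:
    """Session state after a valid packet is received (RFC 5880, 6.8.6)."""
    if local == "AdminDown":
        return local                      # the packet is discarded
    if received == "AdminDown":
        return "Down"
    if local == "Down":
        if received == "Down":
            return "Init"
        if received == "Init":
            return "Up"
        return "Down"
    if local == "Init":
        return "Up" if received in ("Init", "Up") else "Init"
    # local == "Up"
    return "Down" if received == "Down" else "Up"


# An Up session that accepts a [Down] control packet should drop to Down.
print(next_state("Up", "Down"))   # Down
```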

WBR,
Alexander Shevchenko

On Wed, Sep 1, 2021 at 5:03 PM Justin Cattle <j at ocado.com> wrote:

> Hi,
>
>
> Unfortunately not.  I hope we will raise a bug with Juniper, but it could
> take a while to get any resolution.
>
> It would also be interesting to know if there is something more Bird
> could/should be doing in this case - I hope for some developer feedback on
> the issue :)
>
>
> Cheers,
> Just
>
>
> On Mon, 23 Aug 2021 at 12:21, Oliver <bird-o at sernet.de> wrote:
>
>> Hi Just,
>>
>> did you make any progress on this? We have the same problem with Deutsche
>> Telekom as our upstream provider. They also use Juniper routers.
>>
>> Best regards,
>>
>> Oliver
>>
>> On Tue, 10 Aug 2021, Justin Cattle wrote:
>>
>> > Forgot to mention, in the bird logs I see lots of messages such as this:
>> >
>> > <RMT> bfd1: Bad packet from 1.1.11.2 - unknown session id (0123456789)
>> >
>> >
>> > Cheers,
>> > Just
>> >
>> >
>> > On Tue, 10 Aug 2021 at 13:20, Justin Cattle <j at ocado.com> wrote:
>> >
>> > > Hi,
>> > >
>> > >
>> > > I have encountered what seems to be a bug of sorts in the Juniper
>> > > implementation of BFD in at least their SRX340.
>> > >
>> > > We have no issues with the QFX series, where BFD seems to work as
>> > > expected with bird.
>> > >
>> > > I'm wondering if there is anything we can do to handle this issue on
>> > > the bird side, or if anyone has any insight that may shed some light
>> > > on the behaviour we are seeing.
>> > >
>> > > Here is the issue summary:
>> > >
>> > >    - BFD timers are set quite conservatively
>> > >       - interval 4000 ms
>> > >       - multiplier 6
>> > >
>> > >
>> > >    - A BFD session between a bird endpoint and a Juniper endpoint is
>> > >    up and running at the start - all fine
>> > >    - If you stop bird on the server, after the Detection time
>> > >    [ currently 24 secs ], the BFD messages from the Juniper show
>> > >    status as Down with the Diagnostic message Control Detection Time
>> > >    Expired.  You can then start bird on the server again, and the two
>> > >    sides will agree session info and BFD status goes Up.  - This is
>> > >    expected.
>> > >    - However, if you stop bird, but start it again before the
>> > >    Detection time [ currently 24 secs ], like for a service restart,
>> > >    the BFD messages from the Juniper never show as Down, and the two
>> > >    sides never agree on a BFD session and BFD remains Down on the
>> > >    server but Up on the Juniper. - Should a new session be established
>> > >    at this point ?
>> > >    - Once the Juniper gets stuck in the BFD status Up state, then you
>> > >    can stop bird for a long time [ over an hour at least ], and the
>> > >    Juniper never seems to notice [ the BFD packets still show state
>> > >    Up ]. - This seems to be a bug on the Juniper end - why should it
>> > >    never go Down in this state ?
>> > >    - If the BFD session info is reset on the Juniper side, then the
>> > >    two sides will agree session info and BFD status goes Up.
>> > >
>> > >
>> > > Does anyone have any thoughts ?
>> > >
>> > > Is there a packet bird can send, gratuitous or not, that can make the
>> > > juniper end realise it MUST reinitialize ?
>> > > Any config that can be tweaked to help ?
>> > >
>> > >
>> > > Cheers,
>> > > Just
>> > >
>> >
>> > --
>> >
>> >
>> > Notice:
>> > This email is confidential and may contain copyright material of
>> > members of the Ocado Group. Opinions and views expressed in this
>> message
>> > may not necessarily reflect the opinions and views of the members of
>> the
>> > Ocado Group.
>> >
>> > If you are not the intended recipient, please notify us
>> > immediately and delete all copies of this message. Please note that it
>> is
>> > your responsibility to scan this message for viruses.
>> >
>> > References to the
>> > "Ocado Group" are to Ocado Group plc (registered in England and Wales
>> with
>> > number 7098618) and its subsidiary undertakings (as that expression is
>> > defined in the Companies Act 2006) from time to time. The registered
>> office
>> > of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way,
>> > Hatfield, Hertfordshire, AL10 9UL.
>>
>> --
>> SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
>> phone: 0551-370000-0, mailto:kontakt at sernet.de
>> Gesch.F.: Dr. Johannes Loxen und Reinhild Jung
>> AG Göttingen: HR-B 2816 - http://www.sernet.de
>> Datenschutz: https://www.sernet.de/datenschutz
>>
>


More information about the Bird-users mailing list