More BFD issues

Nico Schottelius nico.schottelius at ungleich.ch
Sat Aug 10 12:33:39 CEST 2024


Good morning bird'ers,

we are seeing a strange error with BFD: on two sessions we continuously
get the following error messages:

server142:

--------------------------------------------------------------------------------
2024-08-10 09:50:25.533 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:25.738 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:25.840 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
2024-08-10 09:50:26.287 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:26.512 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:26.639 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
--------------------------------------------------------------------------------

The two devices with the incorrect TTL are openwrt devices. All routers
are running BIRD version 2.15.1.
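
For reference, a minimal sketch (Python, not part of any BIRD tooling)
that tallies the "wrong TTL" messages from the excerpt above per source
address. RFC 5881 requires single-hop BFD control packets to be sent
with a TTL / hop limit of 255 and to be discarded on receipt if the
value differs (when BFD authentication is not in use). The helper
apparent_hops() is a hypothetical name, and its result is only an
interpretation under the assumption that the sender really used 255.

```python
import re
from collections import Counter

# Log lines copied verbatim from the server142 excerpt above.
LOG = """\
2024-08-10 09:50:25.533 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:25.738 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:25.840 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
2024-08-10 09:50:26.287 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:26.512 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:26.639 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
"""

BAD_TTL = re.compile(r"Bad packet from (\S+) - wrong TTL \((\d+)\)")


def tally_bad_ttl(log):
    """Count 'wrong TTL' messages per (source address, received TTL)."""
    counts = Counter()
    for line in log.splitlines():
        match = BAD_TTL.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def apparent_hops(received_ttl, sent_ttl=255):
    """Hops implied by the received TTL, ASSUMING the sender used sent_ttl."""
    return sent_ttl - received_ttl


counts = tally_bad_ttl(LOG)
for (addr, ttl), n in sorted(counts.items()):
    print(f"{addr}: {n} packets, TTL {ttl}, apparent hops {apparent_hops(ttl)}")
```

The address ending in eafc is the one configured as the ibgp_vigir28
neighbor in the s141 configuration further down.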

Now things are getting even more interesting, but let me first show the
rough topology:

--------------------------------------------------------------------------------

s141             ------|------s123
(alpine linux)         |     (alpine linux)
s142              -- ibgp ---s122
(alpine linux)         |     (alpine linux)
                       |
vigir28          ----------- vigir29
(openwrt)                    (openwrt)

All connections are direct layer 2 links; the vigirs only connect to
servers, not to each other.
--------------------------------------------------------------------------------

Now come the interesting facts:

- s141 has bfd up with s122, s123, s142, vigir28
- s142 has bfd up with s122, s123, s141, no vigir
- s122 has bfd up with s123, s141, s142, no vigir
- s123 has bfd up with s122, s141, s142, vigir29
- vigir28 has bfd up with s141
- vigir29 has bfd up with s123

Every device can ping every other device, so I am confused as to what
is going on.
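
To make the gaps explicit, here is a small sketch (Python) that compares
the BFD sessions listed above against the mesh I would expect. The
expected mesh is an assumption derived from the topology: a full mesh
between the four servers, plus each vigir to every server. The names
SERVERS, VIGIRS, BFD_UP and the helper functions are made up for this
illustration.

```python
from itertools import combinations

SERVERS = {"s122", "s123", "s141", "s142"}
VIGIRS = {"vigir28", "vigir29"}

# BFD sessions reported as up, per device, copied from the list above.
BFD_UP = {
    "s141": {"s122", "s123", "s142", "vigir28"},
    "s142": {"s122", "s123", "s141"},
    "s122": {"s123", "s141", "s142"},
    "s123": {"s122", "s141", "s142", "vigir29"},
    "vigir28": {"s141"},
    "vigir29": {"s123"},
}


def expected_pairs():
    """ASSUMED mesh: servers fully meshed, each vigir paired with each server."""
    pairs = {frozenset(p) for p in combinations(sorted(SERVERS), 2)}
    pairs |= {frozenset((v, s)) for v in VIGIRS for s in SERVERS}
    return pairs


def up_pairs():
    """Pairs for which at least one side reports the session as up."""
    return {frozenset((a, b)) for a, peers in BFD_UP.items() for b in peers}


def one_sided():
    """Sessions reported up by one end but not by the other."""
    return sorted(
        (a, b)
        for a, peers in BFD_UP.items()
        for b in peers
        if a not in BFD_UP.get(b, set())
    )


missing = sorted(tuple(sorted(p)) for p in expected_pairs() - up_pairs())
for a, b in missing:
    print(f"no BFD session up between {a} and {b}")
print("one-sided reports:", one_sided())
```

With the data above, every session that is up is reported by both ends,
and the missing pairs all involve a vigir.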

Additionally, and probably correctly given the BFD state, the BGP
sessions fail to establish and/or are down:

--------------------------------------------------------------------------------
s122:

ibgp_s123 BGP        ---        up     2024-07-10    Established   
ibgp_s141 BGP        ---        up     2024-08-04    Established   
ibgp_s142 BGP        ---        up     2024-08-08    Established   
ibgp_vigir28 BGP        ---        start  10:17:53.412  Idle          BGP Error: Hold timer expired
ibgp_vigir29 BGP        ---        start  10:23:57.781  Idle          BGP Error: Hold timer expired

s123: (bfd & bgp fluctuate for vigir28)
ibgp_s122 BGP        ---        up     2024-07-10    Established   
ibgp_s141 BGP        ---        up     2024-08-04    Established   
ibgp_s142 BGP        ---        up     2024-08-08    Established   
ibgp_vigir28 BGP        ---        up     10:21:16.449  Established   
ibgp_vigir29 BGP        ---        up     2024-08-08    Established   

s141:
ibgp_s122 BGP        ---        up     2024-08-04    Established   
ibgp_s123 BGP        ---        up     2024-08-04    Established   
ibgp_s142 BGP        ---        up     2024-08-08    Established   
ibgp_vigir28 BGP        ---        start  10:20:42.819  OpenSent      Socket: Connection closed
ibgp_vigir29 BGP        ---        start  10:25:10.338  OpenSent      BGP Error: Hold timer expired

s142:
ibgp_s122 BGP        ---        up     2024-08-08    Established   
ibgp_s123 BGP        ---        up     2024-08-08    Established   
ibgp_s141 BGP        ---        up     2024-08-08    Established   
ibgp_vigir28 BGP        ---        start  10:27:20.079  OpenSent      BGP Error: Hold timer expired
ibgp_vigir29 BGP        ---        start  10:26:21.088  OpenSent      BGP Error: Hold timer expired

vigir28:
bgp1       BGP        ---        start  10:26:00.453  OpenConfirm   Received: Hold timer expired
bgp2       BGP        ---        up     10:21:30.592  Established   
bgp3       BGP        ---        start  10:25:09.416  OpenConfirm   BGP Error: Hold timer expired
bgp4       BGP        ---        start  10:25:07.000  OpenConfirm   Socket: Host is unreachable

vigir29:
bgp1       BGP        ---        start  10:24:21.241  OpenConfirm   Socket: Host is unreachable
bgp2       BGP        ---        up     2024-08-08    Established   
bgp3       BGP        ---        start  10:25:15.541  OpenConfirm   Socket: Host is unreachable
bgp4       BGP        ---        start  10:28:48.584  Idle          Received: Hold timer expired
--------------------------------------------------------------------------------


Some configuration samples:

--------------------------------------------------------------------------------
vigir28:

log syslog all;
router id 0.0.1.28;

protocol device { }
protocol bfd { }

# Just announce, no kernel interaction
protocol static static6 {
        ipv6;
        route 2a0a:e5c0:10:10::/96 unreachable;
}
# for getting iBGP routes
protocol babel {
        interface "br-lan", "wan" { type wired; authentication mac; password "..."; };
        ipv6 { export where (source = RTS_DEVICE) || (source = RTS_BABEL); };
}
protocol kernel kernel_v6 {
        ipv6 { export where source ~ [ RTS_BABEL ]; };
}
protocol bgp {
        local as 213081;
        neighbor 2a0a:e5c0:10:1::122 as 213081;
        direct;
        bfd on;

        ipv6 {
                import none;
                export where source ~ [ RTS_STATIC ];
        };
}

(repeat bgp session for each ibgp peer)
--------------------------------------------------------------------------------

And s141:

--------------------------------------------------------------------------------
log stderr all;

protocol device { }

# Using BFD virtually everywhere, enable it globally
protocol bfd { }
protocol babel {
        interface "eth*" {
                type wired;
                authentication mac;
                password "...";
        };

        # This matches the default of babeld: redistribute all addresses
        # configured on local interfaces, plus re-distribute all routes received
        # from other babel peers.

        ipv4 {
                export where (source = RTS_DEVICE) || (source = RTS_BABEL);
        };
        ipv6 {
                export where (source = RTS_DEVICE) || (source = RTS_BABEL);
        };
}
protocol bgp ibgp_vigir28 {
    local as myas;
    neighbor 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc as myas;
    direct;
    bfd on;

    ipv6 {
      import all;
      export filter static_and_bgp;

      gateway recursive;
    };

    ipv4 {
      import all;
      export filter static_and_bgp;

      gateway recursive;
      extended next hop on;
    };
}

(repeat bgp session for each ibgp peer)
--------------------------------------------------------------------------------

s122 and s123 are virtually identical, as are s141 and s142; their
configurations are generated.

Any help in this direction would be appreciated. My next try will
probably be to disable bfd on all sessions to see if the bgp sessions
then stay up.

Best regards,

Nico

-- 
Sustainable and modern Infrastructures by ungleich.ch

