Invalid NEXT_HOP attribute for OSPF known route

Nico Schottelius nico.schottelius at ungleich.ch
Thu Dec 16 14:37:00 CET 2021


Ondrej Zajicek <santiago at crfreenet.org> writes:
> Yes, this is kind of a confusing error message (as I noted in response
> to Simon Ruderich).

If there is one thing I'd suggest improving first: print the specific
route and its next_hop in the message. That alone would help a lot when
debugging the issue.

>> In detail:
>>
>> router1, router2 are peered to apu-router1,apu-router2 via OSPF + BGP.
>>
>> apu-router1,apu-router2 are peered to a set of kubernetes hosts.
>>
>> The goal is to have router1 + router2 import the routes sent by the
>> kubernetes hosts:
>>
>>            router1      router2---------|
>>               |  \          |           |
>>               |   \         |           |
>>               |    \        |           |
>>      apu-router1    apu-router2         |
>>         .     |           .             |
>>               |--------------------------
>>         .                 .
>>      [ kubernetes cluster via apu-routers ]
>>
>>
>> The problem: router1+router2 reject the routes with:
>>
>>     Dec 14 20:33:51 router1 daemon.err bird: apu_router1_place5_ungleich_ch_v6: Invalid NEXT_HOP attribute
>>
>> The setup:
>>
>>     router1, router2, apu-router1, apu-router2 = ASN209898
>>     kubernetes hosts = ASN65533
>>     kubernetes peers with apu-routers only.
>
> So the BGP link between kubernetes and APU-ROUTER is EBGP, while between
> APU-ROUTER and ROUTER is IBGP?

Correct.

> I expect it is in multihop /
> gateway-recursive mode, as it is default for IBGP?

It's in direct mode for all links, as the APU-ROUTERs have physical links
inside the kubernetes cluster as well as a physical connection to the ROUTERs.
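
For contrast, the multihop / gateway-recursive variant asked about above
would look roughly like the sketch below. It assumes BIRD 2 syntax, where
multihop is a protocol option and gateway is a channel option. This is not
what is running here and it is untested in this setup; the protocol and
filter names are the ones from the router configuration quoted further down.

```
protocol bgp apu_router1_place5_ungleich_ch_v6 {
        local     as 209898;
        neighbor 2a0a:e5c0:1:8::46 as 209898;
        # no "direct;" here; multihop is the IBGP default and is only
        # spelled out for clarity
        multihop;

        ipv6 {
            import filter ungleich_networks;
            export filter ungleich_networks_no_igp;

            # compute the gateway by an IGP table lookup for bgp_next_hop
            gateway recursive;
        };
        default bgp_local_pref pref_normal;
}
```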

>> The routes:
>>     Kubernetes announces parts of 2a0a:e5c0:0:12::/64 and
>>     2a0a:e5c0:0:13::/64, for instance the route
>>     2a0a:e5c0:0:12:b01a:5ae3:1bd4:1e00/122.
>>
>>     Kubernetes nodes live in 2a0a:e5c0::/64.
>>
>>     apu-routers have a leg in 2a0a:e5c0::/64, via eth1.2. They reach the
>>     cluster directly. They have the routes.
>>
>>     routers1+2 receive the route for 2a0a:e5c0::/64 via ospf:
>>
>>     bird> show route 2a0a:e5c0::/64
>>     Table master6:
>>     2a0a:e5c0::/64       unicast [ospf6 17:08:18.515] * I (150/20) [0.0.0.47]
>>                          via fe80::20d:b9ff:fe57:2f91 on bond0.8
>> The apu-routers:
>>     - They import the route [0]
>
> Could you show 'show route all' on apu-router for failed routes to see
> their BGP_NEXT_HOP attribute? And also ideally tcpdump output to see
> BGP_NEXT_HOP as sent from apu-router to router?

bird> show route all 2a0a:e5c0:0:12:b01a:5ae3:1bd4:1e00/122
Table master6:
2a0a:e5c0:0:12:b01a:5ae3:1bd4:1e00/122 unicast [k8s_p5_1_6 2021-12-14 from 2a0a:e5c0::225:90ff:fe1e:3e74] * (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100
                     unicast [k8s_p5_1_5 2021-12-14 from 2a0a:e5c0::225:90ff:fe1a:d680] (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100
                     unicast [k8s_p5_1_2 2021-12-14] (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100
                     unicast [k8s_p5_1_1 2021-12-14 from 2a0a:e5c0::225:90ff:fe1a:d682] (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100
                     unicast [k8s_p5_1_3 2021-12-14 from 2a0a:e5c0::225:90ff:fe1e:3e64] (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100
                     unicast [k8s_p5_1_4 2021-12-14 from 2a0a:e5c0::225:90ff:fe1e:62d6] (100) [AS65533i]
        via 2a0a:e5c0::225:90ff:fe1e:3e62 on eth1.2
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 65533
        BGP.next_hop: 2a0a:e5c0::225:90ff:fe1e:3e62
        BGP.local_pref: 100

Pcap is attached as well.

>>     - They export the route to the routers [1]
>
> Could you also show 'show protocol all' for the session from
> apu-router to the kubernetes hosts?

bird> show protocols
...
k8s_p5_1_1 BGP        ---        up     2021-12-14    Established
k8s_p5_1_2 BGP        ---        up     2021-12-14    Established
k8s_p5_1_3 BGP        ---        up     2021-12-14    Established
k8s_p5_1_4 BGP        ---        up     2021-12-14    Established
k8s_p5_1_5 BGP        ---        up     2021-12-14    Established
k8s_p5_1_6 BGP        ---        up     2021-12-14    Established
...
bird> show protocols all k8s_p5_1_6
Name       Proto      Table      State  Since         Info
k8s_p5_1_6 BGP        ---        up     2021-12-14    Established
  BGP state:          Established
    Neighbor address: 2a0a:e5c0::225:90ff:fe1e:3e74
    Neighbor AS:      65533
    Local AS:         209898
    Neighbor ID:      77.29.68.245
    Local capabilities
      Multiprotocol
        AF announced: ipv6
      Route refresh
      Graceful restart
      4-octet AS numbers
      Enhanced refresh
      Long-lived graceful restart
    Neighbor capabilities
      Multiprotocol
        AF announced: ipv6
      Route refresh
      Graceful restart
        Restart time: 120
        AF supported: ipv6
        AF preserved:
      4-octet AS numbers
      ADD-PATH
        RX: ipv6
        TX: ipv6
      Enhanced refresh
      Long-lived graceful restart
    Session:          external AS4
    Source address:   2a0a:e5c0::46
    Hold timer:       158.633/240
    Keepalive timer:  53.250/80
  Channel ipv6
    State:          UP
    Table:          master6
    Preference:     100
    Input filter:   ungleich_networks_no_igp
    Output filter:  REJECT
    Routes:         4 imported, 0 exported, 4 preferred
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              4          0          0          0          4
      Import withdraws:            0          0        ---          0          0
      Export updates:             43          6         37        ---          0
      Export withdraws:            0        ---        ---        ---          0
    BGP Next hop:   2a0a:e5c0::46 fe80::20d:b9ff:fe57:2f91


>> The routers:
>>     - print 4x the Invalid NEXT_HOP attribute, once per exported
>>     kubernetes network
>>     - They ignore the 4 routes [2]
>>
>> Question: why does bird on the routers not accept the routes? Or is
>> there a different problem I am not seeing? Aside from that, shouldn't
>> bird on the apu-routers set itself as nexthop for the kubernetes routes?
>
> No, because it is sent to routerx over an IBGP link, where BGP_NEXT_HOP is
> kept unmodified by default (unless the 'next hop self' option is used).

Correct, sorry for my confusion. Actually, setting 'next hop self' does
fix the problem to some degree, with the usual disadvantage of rewriting
the attribute.
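
A minimal sketch of that workaround, assuming BIRD 2 channel syntax and
assuming the option is set on the apu-router's IBGP session towards router1.
All names are taken from the configuration sections further down; only the
"next hop self" line is new.

```
protocol bgp router1_place5_ungleich_ch_v6 {
        local     as 209898;
        neighbor 2a0a:e5c0:1:8::3 as 209898;
        direct;

        ipv6 {
            import none;
            export filter ungleich_networks_no_igp;

            # advertise our own address as bgp_next_hop instead of
            # passing on the one learned from the kubernetes hosts
            next hop self;
        };
        default bgp_local_pref pref_normal;
}
```

With this, router1 should see the apu-router's own address as the next hop
rather than an address inside 2a0a:e5c0::/64, at the cost of losing the
original next hop.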

I am curious what we are running into here. For reference, here are the
BGP configuration sections:

From the routers:

--------------------------------------------------------------------------------

protocol bgp apu_router1_place5_ungleich_ch_v6 {
        local     as 209898;
        neighbor 2a0a:e5c0:1:8::46 as 209898;
        direct;

        ipv6 {

            # What we accept from this protocol -> others send us
            import filter ungleich_networks;

            # What we export into this protocol -> what we send
            export filter ungleich_networks_no_igp;
        };
        # Highest preference on internal traffic
        default bgp_local_pref pref_normal;
}

--------------------------------------------------------------------------------

From the apu-routers:

protocol bgp router1_place5_ungleich_ch_v6 {
        local     as 209898;
        neighbor 2a0a:e5c0:1:8::3 as 209898;
        direct;

        ipv6 {

            # What we accept from this protocol -> others send us
            import none;

            # What we export into this protocol -> what we send
            export filter ungleich_networks_no_igp;
        };
        # Highest preference on internal traffic
        default bgp_local_pref pref_normal;
}

protocol bgp k8s_p5_1_v6 {
        local     as 209898;
        neighbor range 2a0a:e5c0:0::/64 as 65533;
        dynamic name "k8s_p5_1_";

        ipv6 {

            # What we accept from this protocol -> others send us
            import filter ungleich_networks_no_igp;

            # What we export into this protocol -> what we send
            export none;
        };
        # Highest preference on internal traffic
        default bgp_local_pref pref_normal;
}

--------------------------------------------------------------------------------

I actually just realised that the "k8s_p5_1_v6" protocol did not have
the direct statement; however, adding it does not change the situation
(probably because bird detects it as a direct link anyway).
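
A sketch of that change, assuming the statement is placed next to the other
protocol-level options; everything else is as in the section above:

```
protocol bgp k8s_p5_1_v6 {
        local     as 209898;
        neighbor range 2a0a:e5c0:0::/64 as 65533;
        dynamic name "k8s_p5_1_";
        direct;         # the added line

        ipv6 {
            import filter ungleich_networks_no_igp;
            export none;
        };
        default bgp_local_pref pref_normal;
}
```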

Best regards from the snowy mountains,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch

