(unfortunately) more BIRD-2.0.0 woes ...

Clemens Schrimpe clemens.schrimpe at gmail.com
Fri Dec 15 17:56:09 CET 2017


Hello again -

a couple of days ago I converted one of my test route-server setups from „dual BIRD-1.6.3“ to „single BIRD-2.0.0“ by simply „merging“ bird6.conf and bird.conf into a single (unified) new bird.conf, while also following the guidelines laid out in https://gitlab.labs.nic.cz/labs/bird/wikis/transition-notes-to-bird-2 .

The results of this simple approach range from weird to stunning:

I now have a bird process which is consuming 100% of one CPU. It is still talking to me through birdc, but is very sluggish, i.e. bringing up peers or tearing them down takes a long time - you even see states like „flushing“ in „show proto all XXX“ for quite some time, which you normally don’t even notice, because they are usually very transient :-)

I haven’t dug deeper into the problem yet, but there appears to be a loop involving the affected peer’s Export Updates:

    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:         670962          0          0          0     670962
      Import withdraws:          216          0        ---          0        216
      Export updates:      317034802  159525749  157509053        ---          0
      Export withdraws:          468        ---        ---        ---          0

(those numbers are increasing fast)
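
To put a number on „fast“, the counters can be sampled twice a known interval apart and compared. A minimal Python sketch, assuming the column layout shown above; the function names are illustrative and not part of BIRD or birdc, and the parser handles one stats block at a time:

```python
COLUMNS = ("received", "rejected", "filtered", "ignored", "accepted")

def parse_route_stats(text):
    """Parse one 'Route change stats' block into {row: {column: int or None}}."""
    stats = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        values = rest.split()
        # Keep only rows with exactly five counters; the header row is
        # skipped because its fields are words, not numbers or '---'.
        if not sep or len(values) != len(COLUMNS):
            continue
        if not all(v == "---" or v.isdigit() for v in values):
            continue
        stats[label.strip()] = {
            col: (None if v == "---" else int(v))
            for col, v in zip(COLUMNS, values)
        }
    return stats

def updates_per_second(before, after, seconds,
                       row="Export updates", column="received"):
    """Growth rate of one counter between two parsed samples."""
    return (after[row][column] - before[row][column]) / seconds

sample = """\
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:         670962          0          0          0     670962
      Import withdraws:          216          0        ---          0        216
      Export updates:      317034802  159525749  157509053        ---          0
      Export withdraws:          468        ---        ---        ---          0
"""

export = parse_route_stats(sample)["Export updates"]
# Every received export update was either rejected or filtered:
print(export["rejected"] + export["filtered"] == export["received"])  # prints True
```

In the numbers above, rejected plus filtered adds up exactly to received, with nothing accepted, which fits the impression of a loop that never exports anything.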

Some of my peers were/are already MP-BGP capable and so some IPv6 peers now came up with „Hey, you speak IPv4 too? Cool! Here are some 700.000 routes for you, my friend!“, which is totally explicable, yet it surprised me a teeny bit anyway :-)

It should be made clear in the documentation that BGP protocol definitions without explicit address-family → „channel“ definitions may eventually (see below) end up as MP-BGP sessions with the above effect!?

Most surprising: My multiple-inheritance approach, using multiple, chained template definitions for my BGP „protocols“/sessions/peers broke in the weirdest way: I now have multiple IPv6 and IPv4 channels on some of my protocols with different parameterizations:

bird> show protocols all BOGONS6
Name       Proto      Table      State  Since         Info
BOGONS6    BGP        ---        up     17:04:11.929  Established   
  BGP state:          Established
    Neighbor address: 2620:0:6b0::26e5:4207
    Neighbor AS:      65332
    Neighbor ID:      38.229.66.20
    Local capabilities
      Multiprotocol
        AF announced: ipv4 ipv4 ipv6 ipv6
      Route refresh
      Graceful restart
        Restart time: 120
        AF supported: ipv4 ipv6
        AF preserved:
      4-octet AS numbers
      ADD-PATH
        RX: ipv4 ipv6
        TX: ipv4 ipv6
      Enhanced refresh
    Neighbor capabilities
      Multiprotocol
        AF announced: ipv6
      Route refresh
      4-octet AS numbers
    Session:          external multihop AS4
    Source address:   2001:XXXXXXXXXXXX:2c9b
    Hold timer:       120.187/240
    Keepalive timer:  2.682/80
  Channel ipv6
    State:          UP
    Table:          master6
    Preference:     100
    Input filter:   ACCEPT
    Output filter:  REJECT
    Routes:         0 imported, 0 filtered, 0 exported
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              0          0          0          0          0
      Import withdraws:            0          0        ---          0          0
      Export updates:          49787          0      49787        ---          0
      Export withdraws:          261        ---        ---        ---          0
    BGP Next hop:   2001:XXXXXXXXXXXX:2c9b
    IGP IPv6 table: master6
  Channel ipv4
    State:          DOWN
    Table:          master4
    Preference:     100
    Input filter:   ACCEPT
    Output filter:  REJECT
    IGP IPv4 table: master4
  Channel ipv6
    State:          UP
    Table:          master6
    Preference:     100
    Input filter:   no_defaults
    Output filter:  REJECT
    Routes:         0 imported, 0 exported
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              0          0          0          0          0
      Import withdraws:            0          0        ---          0          0
      Export updates:          49787          0      49787        ---          0
      Export withdraws:          261        ---        ---        ---          0
    BGP Next hop:   2001:XXXXXXXXXXX:2c9b
    IGP IPv6 table: master6
  Channel ipv4
    State:          DOWN
    Table:          master4
    Preference:     100
    Input filter:   no_defaults
    Output filter:  REJECT
    IGP IPv4 table: master4

This appears to be the result of:

template bgp ALL_BGP {
        local as MY_AS;
        ipv6 {
                add paths on;
                graceful restart on;
                import keep filtered on;
        };
        ipv4 {
                add paths on;
                graceful restart on;
                import keep filtered on;
        };
}

template bgp UPLINK from ALL_BGP {
        allow local as 4;
        ipv6 {
                export none;
                import filter no_defaults;
        };
        ipv4 {
                export none;
                import filter no_defaults;
        };
}

protocol bgp BOGONS6 from UPLINK {
        neighbor 2620:0:6B0::26E5:4207 as 65332;
        source address 2001:XXXXXXXXXX:fe00:2c9b;
        multihop 255;
}

(which is, as explained above, a result of simply merging former bird6.conf and bird.conf files)

Again, this is somehow „explicable“, yet completely surprising and „Something must be done about that!!!!11!!11 BURN THE WITCH!!!“ ;-), either within the documentation or within the code.
(it seems the ipv6/ipv4 channel definitions within the templates each create their own instances instead of inheriting/enhancing (from) the ones upstream???)
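
A quick way to check which protocols are affected is to count the „Channel“ headers per protocol in the „show protocols all“ output. A minimal Python sketch, assuming the output layout shown above (protocol summary line at column 0, indented „Channel <name>“ lines); the function name and the header-line heuristic are my own assumptions, not anything BIRD provides:

```python
import re
from collections import Counter

CHANNEL_RE = re.compile(r"^\s+Channel\s+(\S+)\s*$")

def duplicate_channels(birdc_output):
    """Return {protocol: {channel: count}} for channels listed more than once."""
    counts = {}
    current = None
    for line in birdc_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            fields = line.split()
            # Heuristic: a protocol summary line is unindented, has at least
            # four fields, and is neither the prompt nor the table header.
            if len(fields) >= 4 and fields[0] not in ("bird>", "Name"):
                current = fields[0]
                counts.setdefault(current, Counter())
            continue
        match = CHANNEL_RE.match(line)
        if match and current is not None:
            counts[current][match.group(1)] += 1
    return {
        proto: {ch: n for ch, n in per_proto.items() if n > 1}
        for proto, per_proto in counts.items()
        if any(n > 1 for n in per_proto.values())
    }

sample = """\
bird> show protocols all BOGONS6
Name       Proto      Table      State  Since         Info
BOGONS6    BGP        ---        up     17:04:11.929  Established
  BGP state:          Established
  Channel ipv6
    State:          UP
  Channel ipv4
    State:          DOWN
  Channel ipv6
    State:          UP
  Channel ipv4
    State:          DOWN
"""

print(duplicate_channels(sample))  # prints {'BOGONS6': {'ipv6': 2, 'ipv4': 2}}
```

Feeding it the output of „birdc show protocols all“ should list every protocol that ended up with more than one channel of the same address family.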

@Ondrej et al., is this even a „supported mode of operation“ → multiple IPv6 and IPv4 channels within one protocol?

So this still works somehow, albeit with the 100% CPU consumption.

@bird-devs: How would you like to address the above issues? Yes, I can probably clean up the config file, which would probably even make the 100% issue go away. But imho it shouldn’t be possible in the first place to create configs like these without BIRD at least kicking and screaming loudly about it.

NB: This is a test setup on a test machine, so I’ll happily create accounts for you if you send me an ssh pubkey through email, if you’d like to have a „live look“ at the beast!? (it practically lives in your garden anyway → 14ms RTT to nic.cz :-)

Kind regards from Berlin,

	Clemens
