Bird 2.0.7 not accepting BGP connections in a VRF

William bird at is.unlawful.id.au
Fri Aug 27 01:21:14 CEST 2021


Hi Alexander,
Thanks for the response. I suspect it's not just Bird, given the sshd
behaviour too, but thought someone here may have run into something
similar and be able to suggest something.

iptables/nftables is not in use (all chains ACCEPT in all tables), nor
is ebtables.  For completeness I have unloaded the kernel modules, but
that made no change.

IPv4 and v6 forwarding is enabled.
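
For anyone repeating these checks, here is a minimal sketch of reading the
rp_filter and forwarding settings back in one pass. The kernel uses the
higher of the "all" and per-interface rp_filter values, so both matter.
The function name unexpected_settings is made up for illustration, and
the sketch assumes "key = value" lines as printed by `sysctl -a`.

```shell
#!/bin/sh
# Sketch only: list settings that differ from what this thread expects
# (rp_filter = 0, forwarding = 1). Reads "key = value" lines on stdin.
# The function name is hypothetical, not part of any tool.
unexpected_settings() {
    awk -F' = ' '
        /\.rp_filter = /  && ($2 + 0) != 0 { print $1 " = " $2 }
        /\.forwarding = / && ($2 + 0) != 1 { print $1 " = " $2 }
    '
}

# On a live host (sysctl comes from procps):
#   sysctl -a 2>/dev/null | unexpected_settings
```

No output means every rp_filter entry is 0 and every forwarding entry is 1.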

I also tested IPv6 and get the same behaviour.

I'll keep investigating and see what I can come up with.  I don't think
I've missed anything, considering I can ping inside the VRFs on both
sides, but that is handled in kernel-space and never hands traffic off
to user-space.
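
One way to see where the kernel stops handling the incoming SYNs might be
to compare its counters before and after a connection attempt from the
peer. The following is a rough sketch, not a known fix: the function name
changed_counters and the /tmp file names are made up, and which counter
(if any) moves is not something this thread establishes.

```shell
#!/bin/sh
# Sketch only: print counters that grew between two snapshots.
# $1 and $2 are files of "name value ..." lines, for example saved
# `nstat -az` output. Both files must be non-empty.
changed_counters() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && ($2 + 0) > (before[$1] + 0) {
             print $1, $2 - before[$1]
         }' "$1" "$2"
}

# On a live host (nstat comes from iproute2):
#   nstat -az > /tmp/before
#   ... let the peer retry its connection to port 179 ...
#   nstat -az > /tmp/after
#   changed_counters /tmp/before /tmp/after
```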

Regards,
William

On 26/08/2021 23:27, Alexander Zubkov wrote:
> Hi,
> 
> This does not look like bird-related. As you have rp_filter disabled
> already (net.ipv4.conf.all.rp_filter too?) then you can also check
> things like iptables, maybe forwarding?
> 
> On Thu, Aug 26, 2021 at 4:57 AM William <bird at is.unlawful.id.au> wrote:
>> 
>> Hi All,
>> I have a Debian 11 host (5.10.0-8-amd64 kernel) with Bird 2.0.7.  There
>> is a VRF set up called CORE.  Originally the VRF and route table IDs were
>> high (>100000000) but everything seemed fine apart from the fact that
>> even though Bird was listening for BGP connections in the VRF, nothing
>> was actually being received by the daemon.  I have since dropped the VRF
>> ID and routing table down to 32, but no change.
>> 
>> bond0.64 is in the VRF, with 198.18.64.254/30 as its IP.  The neighbour
>> device has 198.18.64.253, and they can ping each other happily.  No
>> firewall is configured.
>> 
>> Bird listening on the VRF interface (with SSH in MGMT VRF which works
>> fine):
>> # ss -tuplne --cgroup --tipcinfo
>> Netid State  Recv-Q Send-Q Local Address:Port     Peer Address:Port Process
>> tcp   LISTEN 0      8      198.18.64.254%CORE:179 0.0.0.0:*
>>     users:(("bird",pid=3807,fd=6)) uid:106 ino:62370 sk:102a
>>     cgroup:/system.slice/bird.service <-> cgroup:/system.slice/bird.service
>> tcp   LISTEN 0      128    0.0.0.0%MGMT:22        0.0.0.0:*
>>     users:(("sshd",pid=824,fd=3)) ino:12780 sk:1025
>>     cgroup:/system.slice/ssh.service/vrf/MGMT <->
>>     cgroup:/system.slice/ssh.service/vrf/MGMT
>> tcp   LISTEN 0      128    [::]%MGMT:22           [::]:*
>>     users:(("sshd",pid=824,fd=4)) ino:12790 sk:1026
>>     cgroup:/system.slice/ssh.service/vrf/MGMT v6only:1 <->
>>     cgroup:/system.slice/ssh.service/vrf/MGMT
>> #
>> 
>> Normally Bird is not started inside the VRF, but even when it is, I
>> get no change.  I have tried changing the interface that the BGP
>> protocol is listening on, no change there.
>> 
>> Here is the template that the BGP session is initiated from:
>> 
>> template bgp CORE_RTR_BGP4_TMPL {
>>    # Will use IP of interface neighbour is connected to as IP is not
>>    # set.
>>    # Override with "source address <ip>" in protocol definition or
>>    # 'interface "string"' for IPv6.  See Bird docs, section 6.3.4
>>    local 198.18.64.254 as CORE_my_asn;
>> #  vrf "CORE";
>>    ipv4 {
>>      import filter CORE_bgp_import4;
>>      export filter CORE_bgp_export4;
>>      table CORE_Routes4;
>>      next hop self;
>>    };
>>    vrf CORE_vrf;
>> #  interface "bond0.64";
>> #  interface "CORE";
>> #  direct;
>>    strict bind on;
>> }
>> 
>> and here is the protocol def itself:
>> protocol bgp CORE_LabRTR2_BGP4 from CORE_RTR_BGP4_TMPL {
>>    description "IPv4 BGP to LabRTR2";
>>    neighbor 198.18.64.253 as CORE_rcore_asn;
>> #  disabled;
>>    passive;
>> }
>> 
>> I have tried all variants of interface & launching inside and outside
>> the VRF with no change.
>> If I connect locally inside the VRF, I can reach the daemon:
>> 
>> # ip vrf exec CORE telnet -b 198.18.64.254 198.18.64.254 179
>> Trying 198.18.64.254...
>> Connected to 198.18.64.254.
>> Escape character is '^]'.
>> Connection closed by foreign host.
>> #
>> 
>> The same command run from the other node (binding IP changed, of
>> course) gets no connection.
>> 
>> A packet dump shows the TCP SYNs arriving, but nothing is sent in reply:
>> # tcpdump -enni bond0.64
>> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>> listening on bond0.64, link-type EN10MB (Ethernet), snapshot length 262144 bytes
>> 12:41:46.465300 02:c0:7e:18:79:d3 > 02:c0:7e:4a:57:e6, ethertype IPv4
>>     (0x0800), length 74: 198.18.64.253.34525 > 198.18.64.254.179:
>>     Flags [S], seq 2959174086, win 64240,
>>     options [mss 1460,sackOK,TS val 658554227 ecr 0,nop,wscale 7], length 0
>> 12:41:47.495032 02:c0:7e:18:79:d3 > 02:c0:7e:4a:57:e6, ethertype IPv4
>>     (0x0800), length 74: 198.18.64.253.34525 > 198.18.64.254.179:
>>     Flags [S], seq 2959174086, win 64240,
>>     options [mss 1460,sackOK,TS val 658555257 ecr 0,nop,wscale 7], length 0
>> 12:41:49.511037 02:c0:7e:18:79:d3 > 02:c0:7e:4a:57:e6, ethertype IPv4
>>     (0x0800), length 74: 198.18.64.253.34525 > 198.18.64.254.179:
>>     Flags [S], seq 2959174086, win 64240,
>>     options [mss 1460,sackOK,TS val 658557273 ecr 0,nop,wscale 7], length 0
>> 12:41:53.639028 02:c0:7e:18:79:d3 > 02:c0:7e:4a:57:e6, ethertype IPv4
>>     (0x0800), length 74: 198.18.64.253.34525 > 198.18.64.254.179:
>>     Flags [S], seq 2959174086, win 64240,
>>     options [mss 1460,sackOK,TS val 658561401 ecr 0,nop,wscale 7], length 0
>> ^C
>> 4 packets captured
>> 4 packets received by filter
>> 0 packets dropped by kernel
>> #
>> 
>> I tried re-arranging the VRF multi-layer routing rules, then adding
>> more specific rules, with no change:
>> 
>> # ip rule
>> 64:     from all oif bond0.64 lookup CORE
>> 64:     from all iif bond0.64 lookup CORE
>> 1000:   from all lookup [l3mdev-table]
>> 32765:  from all lookup local
>> 32766:  from all lookup main
>> 32767:  from all lookup default
>> 
>> VRF routing table:
>> # ip r s t CORE
>> broadcast 198.18.64.252 dev bond0.64 proto kernel scope link src 198.18.64.254
>> 198.18.64.252/30 dev bond0.64 proto kernel scope link src 198.18.64.254
>> 198.18.64.252/30 dev bond0.64 proto bird scope link metric 32
>> local 198.18.64.254 dev bond0.64 proto kernel scope host src 198.18.64.254
>> broadcast 198.18.64.255 dev bond0.64 proto kernel scope link src 198.18.64.254
>> broadcast 198.19.64.0 dev CORE_IP proto kernel scope link src 198.19.64.1
>> 198.19.64.0/24 dev CORE_IP proto kernel scope link src 198.19.64.1
>> 198.19.64.0/24 dev CORE_IP proto bird scope link metric 32
>> local 198.19.64.1 dev CORE_IP proto kernel scope host src 198.19.64.1
>> broadcast 198.19.64.255 dev CORE_IP proto kernel scope link src 198.19.64.1
>> #
>> 
>> CORE_IP is a dummy interface in the VRF to hold an extra subnet to
>> advertise to the other node.
>> 
>> I've seriously run out of ideas.  I started an instance of sshd in the
>> VRF too and I'm getting the same thing, but can't figure out why.  It
>> might be related, might not be.  I don't think I've missed anything.
>> 
>> As a last resort I even used strace and attached to the Bird process.
>> Nothing showed as being received from the remote node, but local
>> connections did show up.
>> 
>> I have also set rp_filter=0 on related interfaces.
>> 
>> Anyone got any other thoughts/experiences/hacks?  I am stumped as to why
>> sshd works in the MGMT VRF but not in the CORE VRF, though that might be
>> something else (or it might not be).
>> 
>> Thanks.
>> 
>> Regards,
>> William

