"gw" attribute assignment in filter invalidates routes learned via BGP, static, and possibly others?

Sergey Popovich popovich_sergei at mail.ru
Tue Aug 13 12:57:48 CEST 2013


Hello!

Another issue I spotted recently:
  assigning a value to "gw" in a protocol filter (an import
  filter in the config below) invalidates the route and
  prevents it from being installed in KRT.

Simple config to test this issue:
----------------------------------------
### SYSTEM

# ip link add dev lo255 type dummy
# ip link set up dev lo255
# ip -4 addr add 192.0.2.254/24 dev lo255
# ip -4 addr add 172.16.2.254/24 dev lo255
# ip link set up dev eth0
# ip -4 addr add 192.168.1.2/24 dev eth0

### BIRD

router id 10.10.10.10;

protocol device devices {
	scan time 120;
}

# Main routing table (master), mapped to ipt_main kernel table.
# ipt_main == 254 (see /etc/iproute2/rt_tables)
protocol kernel kernel254 {
	persist no;

	scan time 120;
	learn no;
	device routes no;
	kernel table ipt_main;
	import none;
	export all;
}

protocol direct direct254 {
	interface "lo255", "eth0";
}

protocol static static254_test {
        # If "via" is an address from one of the "lo255" subnets (172.16.2.5,
        # for example), everything is correct and the route is installed by
        # "kernel254".
	route 10.0.0.0/24 via 192.168.1.1;

	import filter {
                # This causes route invalidation, "gw" points to 192.0.2.5, 
                # but interface is still "eth0" (see birdc output below).
		gw = 192.0.2.5;
		accept;
	};
	export none;
};

protocol static static254 {
	# rs1, rs2
	route 192.168.254.1/32 via 192.168.1.1;
	route 192.168.254.2/32 via 192.168.1.1;
}

# AS65001, peering with Route Server(s), RS
template bgp tl_bgp254_as65001 {
	start delay time 10;
	connect retry time 60;
	startup hold time 30;
	keepalive time 10;
	hold time 30;

	capabilities yes;
	advertise ipv4 yes;
	enable route refresh yes;
	enable as4 yes;

	gateway recursive;

	multihop 4;
	local 192.168.1.2 as 65010;
	import filter {
			# This causes route invalidation, "gw" points to 192.0.2.5, but
                        # interface is still "eth0" (see birdc output below).
			gw = 192.0.2.5;
			accept;
	};
	export none;
}

protocol bgp bgp254_as65001_rs1 from tl_bgp254_as65001 {
	neighbor 192.168.254.1 as 65001;
}

protocol bgp bgp254_as65001_rs2 from tl_bgp254_as65001 {
	neighbor 192.168.254.2 as 65001;
}

Here is output from birdc and ip-route(8):
----------------------------------------------------

### common

## 192.0.2.0/24, lo255
$ birdc 'show route for 192.0.2.5 all'
BIRD 1.3.11 ready.
192.0.2.0/24       dev lo255 [direct254 12:45] * (240)
        Type: device unicast univ

$ ip -4 route show match 192.0.2.5/32
192.0.2.0/24 dev lo255  proto kernel  scope link  src 192.0.2.254

## 172.16.2.0/24, lo255
$ birdc 'show route for 172.16.2.5 all'
BIRD 1.3.11 ready.
172.16.2.0/24      dev lo255 [direct254 13:00] * (240)
        Type: device unicast univ

$ ip -4 route show match 172.16.2.5/32
172.16.2.0/24 dev lo255  proto kernel  scope link  src 172.16.2.254

## 192.168.1.0/24, eth0

$ birdc 'show route for 192.168.1.0/24 all'
BIRD 1.3.11 ready.
192.168.1.0/24     dev eth0 [direct254 12:45] * (240)
        Type: device unicast univ

$ ip -4 route show match 192.168.1.2/32
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.254

### static

## 10.0.0.0/24
$ birdc 'show route for 10.0.0.0/24 all'
BIRD 1.3.11 ready.
10.0.0.0/24        via 192.0.2.5 on eth0 [static254_test 13:19] ! (200)
        Type: static unicast univ

$ ip -4 route show exact 10.0.0.0/24

### BGP

$ birdc 'show route for 10.0.0.0/8 all'
BIRD 1.3.11 ready.
10.0.0.0/8         via 192.0.2.5 on eth0 [bgp254_as65001_rs1 13:00 from 192.168.254.1] ! (100/0) [AS1011i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65001 1011
        BGP.next_hop: 192.168.1.1
        BGP.local_pref: 100
        BGP.community: (1001,65010) (1001,1001)
                   via 192.0.2.5 on eth0 [bgp254_as65001_rs2 13:00 from 192.168.254.2] (100/0) [AS1011i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65001 1011
        BGP.next_hop: 192.168.1.1
        BGP.local_pref: 100
        BGP.community: (1001,65010) (1001,1001)

$ ip -4 route show exact 10.0.0.0/8

---------------------------------------------------------------------------------------
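This is probably because assigning "gw" does not update iface, nexthops
(multihop config) and other fields of the "rta" struct, so the route keeps
its old interface ("eth0") together with the new gateway.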

The patch in the attachment tries to address this issue by calling
rta_set_recursive_next_hop() in filter/filter.c, so that assignment to the
"gw" attribute is handled properly. The bgp and static protocols are
special-cased to use the "igp table" configuration parameter if present
(tested and found working with the static protocol; probably the same with
bgp).
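
For illustration, a configuration of the kind this special case is meant for
might look like the sketch below. The table name igp254 and the protocol name
direct_igp254 are hypothetical, and whether the "gw" lookup actually uses this
table depends on the patch being applied:
----------------------------------------
# Hypothetical separate table used only for next hop lookups.
table igp254;

protocol direct direct_igp254 {
	table igp254;
	interface "lo255", "eth0";
}

protocol static static254_test {
	# With the patch, "gw" assigned in the filter should be resolved
	# through this table instead of the protocol's own table.
	igp table igp254;
	route 10.0.0.0/24 via 192.168.1.1;

	import filter {
		gw = 192.0.2.5;
		accept;
	};
}
----------------------------------------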

The patch was tested in my configuration and works very well for both the
static and bgp protocols.

---------------------------------------------------------------------------------------
A brief explanation of why the "gw" attribute might be very important (at
least in my case).

There is a common technique to stop DDoS in a large ISP network:
  blackholing.

However, implementations of this might vary from vendor to vendor.
In BIRD the simplest way to implement it is to set the "dest" attribute to
something like RTD_BLACKHOLE; all other route attributes (gw, iface, ...) get
deleted, and the route is installed in KRT as a blackhole.

Everything is OK with this setup, but tracing the path to a blackholed
destination might give some surprises: you get nothing(???) from your
gateway, as it drops packets destined to the blackhole route without any
notification (the trace gets no ICMP Time To Live Exceeded in reply). And
what if you trace from some core router?

And here another blackholing technique comes in: use a loopback interface to
drop packets.

In this case we set up a loopback interface on the system (in Linux this is
the dummy interface type) and configure some network on it (say,
192.0.2.0/24). Instead of setting the blackholed route's type to
RTD_BLACKHOLE, we change its "gw" to any address within the subnet assigned
to the loopback interface.
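
A minimal sketch of the same idea as a filter, assuming the 192.0.2.0/24
network on lo255 from the test config and a hypothetical community
(65010,666) that marks prefixes to be dropped (the filter name is also made
up). This is exactly the kind of "gw" assignment that triggers the issue
described above on unpatched 1.3.11:
----------------------------------------
filter blackhole_by_gw {
	# Hypothetical tag: send marked routes to the dummy interface subnet.
	if (65010,666) ~ bgp_community then
		gw = 192.0.2.5;
	accept;
}
----------------------------------------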

This serves the same purpose as a route with blackhole type, but behaves
differently on path traces: the router sends ICMP Time To Live Exceeded
messages from its incoming interface address, indicating the last hop before
the blackholed traffic is dropped. This has no impact on DDoS traffic, except
that it is transmitted to the loopback interface instead of being dropped
immediately after matching the route, which wastes a few CPU cycles.

-- 
SP5474-RIPE
Sergey Popovich
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bird-1.3.11-gw-overwrite.patch
Type: text/x-patch
Size: 4441 bytes
Desc: not available
URL: <http://trubka.network.cz/pipermail/bird-users/attachments/20130813/bbd14906/attachment.bin>

