[PATCH] General protocol route limits
Alexander V. Chernikov
melifaro at yandex-team.ru
Mon Nov 21 14:44:17 CET 2011
On 21.11.2011 16:07, Ondrej Zajicek wrote:
> On Mon, Nov 14, 2011 at 04:40:25PM +0400, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> At the moment bird has route limits implemented for BGP only (route
>> limit XXX).
>>
> ...
>>
>> This patch introduces general limiting functionality for any protocol.
>>
>> Import/export limits can be configured with the following actions:
>> * warn (prints warning message)
>> * block (blocks new import/exports from/to the protocol)
>> * shutdown (restart the protocol, import only)
>> * disable (shutdown and disable protocol)
>>
>> If any protocol limit is hit and block action is taken, protocol can be
>> returned to 'normal' state by using reload [in|out] protocol (or
>> restaring it).
>
> There are several problems, esp. with 'block' action in export limit. Primary
> cause is that BIRD does not remember which route was sent to a peer,
> instead it reruns filter to know that. This get broken when 'block' is
> active, because it is not possible to know whether given route was
> blocked or not. For example, if export is blocked, then a new route arrives,
> it is blocked, but if later the route is updated, it is treated as an update
> (and not as a new route) and propagated, therefore limit is exceeded
> ('old' value in do_rte_announce is not NULL).
Understood. Okay, we can possibly do the following:
implement per-table counter which will be incremented with every
rte_announce. current counter value will be saved inside rte.
Even u32 should not overflow (over 8k times of full-view gets announced)
however using u64 is much better.
This should not introduce significant performance penalty (but this will
harden possible future multithreading a bit).
When protocol state changes to blocked we can save current counter value.
After that, if we're re-announcing such rte we can check if old rte
counter value is less than block counter value.
>
> Not sure if there is a way how to fix it. One possible way is when
> protocol is blocked then convert any later updates to withdraws (instead
> just block new routes). Don't know how exactly the block/withdraw
> behaves in Junipers and Ciscos. Note also that we have to count routes
> blocked by this as exported in exp_routes, otherwise it would be
> inconsistent and would drift away (when genuine withdraw is announced,
> it is unknown whether the withdrawed route is 'older' route (propagated
> before the block) or 'later' route (propagated after the block).
>
> Another possibility is just do not allow block in export filter.
>
> Another related problem is that exp_routes statistics became unreliable
> when 'configure soft' is used, export filter is changed and protocol
> is not reloaded. This is probably a reason why BGP limits were originally
> implemented for import only.
IMHO export limit main usage pattern is IGP. BGP export limit can be
forbidden (however the original problem with 'configure soft' remains).
>
> Some minor comments later:
>
>> From f9c8c639593aa98723397a640f8d899b85c39fb7 Mon Sep 17 00:00:00 2001
>> From: Alexander V. Chernikov<melifaro at ipfw.ru>
>> Date: Mon, 14 Nov 2011 15:33:27 +0000
>> Subject: [PATCH 1/1] * Implement general protocol limiting
>>
>> ---
>> doc/bird.sgml | 16 +++++-
>> nest/config.Y | 32 +++++++++
>> nest/proto.c | 175 +++++++++++++++++++++++++++++++++++++++++++++++++--
>> nest/protocol.h | 42 ++++++++++++-
>> nest/rt-table.c | 49 ++++++++++++++-
>> proto/bgp/bgp.c | 28 +++++---
>> proto/bgp/bgp.h | 1 -
>> proto/bgp/config.Y | 9 +++-
>> proto/bgp/packets.c | 3 -
>> 9 files changed, 329 insertions(+), 26 deletions(-)
>>
>> diff --git a/doc/bird.sgml b/doc/bird.sgml
>> index 7f53f02..4060f05 100644
>> --- a/doc/bird.sgml
>> +++ b/doc/bird.sgml
>> @@ -415,6 +415,19 @@ to zero to disable it. An empty<cf><m/switch/</cf> is equivalent to<cf/on/
>> <tag>export<m/filter/</tag> This is similar to the<cf>import</cf> keyword, except that it
>> works in the direction from the routing table to the protocol. Default:<cf/none/.
>>
>> + <tag>import limit<m/number/ exceed warn | block | shutdown | disable</tag>
>
> It would be much better to rename 'shutdown' to 'restart'.
>
>> + | IMPORT LIMIT expr {
>> + if ((!strcmp(this_proto->protocol->name, "Device")) ||
>> + (!strcmp(this_proto->protocol->name, "Static")))
>> + cf_error("%s protocol does not support import limits", this_proto->protocol->name);
>
> Why not? Altough it probably does not make much sense, i see no need to
> create an exception. And even if so, strcmp() is a crazy way to check
> protocol class.
>
>> + if (!this_proto->in_limit)
>> + this_proto->in_limit = cfg_allocz(sizeof(struct proto_limit));
>> + this_limit = this_proto->in_limit;
>> + this_limit->direction = PL_IMPORT;
>> + this_limit->limit = $3;
>> + }
>> + EXCEED limit_action
>> + | EXPORT LIMIT expr {
>> + if (!this_proto->out_limit)
>> + this_proto->out_limit = cfg_allocz(sizeof(struct proto_limit));
>> + this_limit = this_proto->out_limit;
>> + this_limit->direction = PL_EXPORT;
>> + this_limit->limit = $3;
>> + }
>> + EXCEED limit_action
>
> This seems unnecessary complicated, what about just (without global
> variable this_limit and with return value of limit_action):
>
> | IMPORT LIMIT expr EXCEED limit_action {
> ...
> xxx->limit = $3
> xxx->action = $5
> // and check for valid action w.r.t. direction
> }
>
>> SYM {
>> if ($1->class != SYM_TABLE) cf_error("Table name expected");
>> diff --git a/nest/proto.c b/nest/proto.c
>> index d55c348..f022dc2 100644
>> --- a/nest/proto.c
>> +++ b/nest/proto.c
>> @@ -33,6 +33,11 @@ static list initial_proto_list;
>> static list flush_proto_list;
>> static struct proto *initial_device_proto;
>>
>> +/* protocol limiting variables */
>> +static struct rate_limit rl_rt_limit;
>> +static event *proto_shut_event;
>> +static list abuse_proto_list;
>
>> - if (p->reconfiguring&& p->core_state == FS_HUNGRY&& p->proto_state == PS_DOWN)
>> + if ((p->flags& PFLAG_RECONFIGURING)&& p->core_state == FS_HUNGRY&& p->proto_state == PS_DOWN)
>
> It would be nice to use some inline function to check for reconfiguring flag.
>
>> + /* Remove from pre-shutdown list if exists */
>> + WALK_LIST_DELSAFE(ap, ap_next, abuse_proto_list)
>> + {
>> + if ((ap = SKIP_BACK(struct proto, shutdown_node, ap)) != p)
>> + continue;
>> + rem_node(&ap->shutdown_node);
>> + break;
>> + }
>
> This seems a bit crazy to me - what about just use some flag to mark a proto
> in abuse_proto_list and then call rem_node base on that?
>
> Personally, i would remove the whole abuse_proto_list. Disabling or restarting
> protocols because of limit breaching seems to be too unfrequent event
> that just walking active_proto_list and acting based on flags seems OK and
> keeps the code simpler.
>
>> +#define PL_HANDLED 1 /* Limit action is handled by the protocol hook */
>> +#define PL_DEFAULT 2 /* Core has to execute default action */
>
> Perhaps just keep the callback return boolean, like the reconfigure hook?
>
>> +#define PL_ACTION_WARN 1 /* Issue log warning */
>> +#define PL_ACTION_BLOCK 2 /* Block new routes */
>> +#define PL_ACTION_CLEAR 3 /* Clear exported routes */
>
> Some unimplemented feature?
>
>
>> @@ -188,20 +188,26 @@ do_rte_announce(struct announce_hook *a, int type UNUSED, net *net, rte *new, rt
>> {
>> struct proto *p = a->proto;
>> struct filter *filter = p->out_filter;
>> + struct proto_limit *l = p->out_limit;
>> struct proto_stats *stats =&p->stats;
>> rte *new0 = new;
>> rte *old0 = old;
>> - int ok;
>> + int ok, wrong_table = 0;
>>
>> #ifdef CONFIG_PIPE
>> /* The secondary direction of the pipe */
>> if (proto_is_pipe(p)&& (p->table != a->table))
>> {
>> filter = p->in_filter;
>> + l = p->in_limit;
>> stats = pipe_get_peer_stats(p);
>> }
>> #endif
>>
>> + /* Check if we're called on non-default protocol table */
>> + if ((!proto_is_pipe(p))&& (p->table != a->table))
>> + wrong_table = 1;
>> +
>
> Some unimplemented thing?
>
>> if (new)
>> {
>> stats->exp_updates_received++;
>> @@ -272,6 +278,22 @@ do_rte_announce(struct announce_hook *a, int type UNUSED, net *net, rte *new, rt
>> if (!new&& !old)
>> return;
>>
>> + /* Check if we can export new route / exceed export limit */
>> + if (new&& l&& !old)
>> + {
>> + if (p->flags& PFLAG_ELIMIT_BLOCK)
>> + return;
>> +
>> + if ((l = p->out_limit)&& (p->stats.exp_routes + 1> l->limit)&& (proto_notify_limit(p, l, a->table) == 1))
>> + {
>
> 'l' and 'stats' is already set above to be consistent with pipe, that should be used here?
>
>> + /* free allocated data and return */
>> + if (new != new0)
>> + rte_free(new);
>> +
>> + return;
>> + }
>> + }
>> +
>> if (new)
>> stats->exp_updates_accepted++;
>> else
>> @@ -426,6 +448,7 @@ static void
>> rte_recalculate(rtable *table, net *net, struct proto *p, struct proto *src, rte *new, ea_list *tmpa)
>> {
>> struct proto_stats *stats =&p->stats;
>> + struct proto_limit *l;
>> rte *old_best = net->routes;
>> rte *old = NULL;
>> rte **k, *r, *s;
>> @@ -486,6 +509,16 @@ rte_recalculate(rtable *table, net *net, struct proto *p, struct proto *src, rte
>> return;
>> }
>>
>> + /* Check limit for imported routes */
>> + if (new&& !old)
>> + {
>> + if (p->flags& PFLAG_ILIMIT_BLOCK)
>> + return;
>> +
>> + if ((l = p->in_limit)&& (p->stats.imp_routes + 1> l->limit)&& (proto_notify_limit(p, l, table) == 1))
>> + return;
>> + }
>
> Ignore this check in case of pipe?
>
>> +
>> if (new)
>> stats->imp_updates_accepted++;
>> else
>> @@ -1233,7 +1266,11 @@ rt_feed_baby(struct proto *p)
>> {
>> struct announce_hook *h;
>> struct fib_iterator *fit;
>> - int max_feed = 256;
>> + struct proto_limit *l = p->out_limit;
>> + int max_feed = 256, need_check, do_check;
>
> Isn't code here in rt_feed_baby() unnecessary?
> do_feed_baby() is called and do_rte_announce() is called,
> where limits are already handled. Perhaps just some check
> if block is reached to skip rest of feeding?
>
> Also pipes seems not to be handled properly here.
>
>> + /* Do we need to filter route updates? */
>> + need_check = (!(p->flags& PFLAG_ELIMIT_BLOCK)&& l) ? 1 : 0;
>>
>> if (!p->feed_ahook) /* Need to initialize first */
>> {
>> @@ -1248,6 +1285,8 @@ rt_feed_baby(struct proto *p)
>>
>> again:
>> h = p->feed_ahook;
>> + /* Do limit check for base protocol table only */
>> + do_check = ((p->table == h->table)&& need_check)? 1 : 0;
>> FIB_ITERATE_START(&h->table->fib, fit, fn)
>> {
>> net *n = (net *) fn;
>> @@ -1263,6 +1302,12 @@ again:
>> {
>> if (p->core_state != FS_FEEDING)
>> return 1; /* In the meantime, the protocol fell down. */
>> + if (do_check&& (p->stats.exp_routes + 1> l->limit))
>> + if (proto_notify_limit(p, l, h->table))
>> + {
>> + /* limit action forbids new exports, end feed */
>> + break;
>> + }
>> do_feed_baby(p, RA_OPTIMAL, h, n, e);
>> max_feed--;
>> }
>> diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c
>> index 675342d..483a2d0 100644
>> --- a/proto/bgp/bgp.c
>> +++ b/proto/bgp/bgp.c
>> @@ -542,19 +542,27 @@ bgp_active(struct bgp_proto *p)
>> bgp_start_timer(conn->connect_retry_timer, delay);
>> }
>>
>> -int
>> -bgp_apply_limits(struct bgp_proto *p)
>> +static int
>> +bgp_limit_notify(struct proto *P, struct proto_limit *l, struct rtable *table)
>> {
>> - if (p->cf->route_limit&& (p->p.stats.imp_routes> p->cf->route_limit))
>> + struct bgp_proto *p = (struct bgp_proto *) P;
>> + if (l->direction != PL_IMPORT)
>> + return PL_DEFAULT;
>
> Perhaps there should be proper error code even for import limit?
>
>> @@ -1141,9 +1150,6 @@ bgp_show_proto_info(struct proto *P)
>> p->rs_client ? " route-server" : "",
>> p->as4_session ? " AS4" : "");
>> cli_msg(-1006, " Source address: %I", p->source_addr);
>> - if (p->cf->route_limit)
>> - cli_msg(-1006, " Route limit: %d/%d",
>> - p->p.stats.imp_routes, p->cf->route_limit);
>
> Perhaps this should be kept compatible until next major release?
>
>
> BTW, which of your addresses should i use when sending/replying e-mail
> to you? Or keep both ones?
>
More information about the Bird-users
mailing list