[RFC] Babel: add v4viav6 support

Toke Høiland-Jørgensen toke at toke.dk
Tue Dec 15 13:05:22 CET 2020


Andreas Rammhold <andreas at rammhold.de> writes:

> This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as
> IPv4 via IPv6 extension to the Babel routing protocol that allows
> announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it
> possible for IPv4 traffic to flow through interfaces that have not been
> assigned an IPv4 address.
>
> I've tried my best to be compliant with the current RFC version. One
> point that is not entirely clear is how to implement in section 2.2.
> regarding fallback when the device doesn't support v4-via-v6 routes.
> (See notes below)
>
> The implementation is compatible with the current Babeld PR
> (https://github.com/jech/babeld/pull/56). I've verified this with a few
> VMs in the following setup:
>
> Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
>
> Each routing daemon was running on their own VM and had L2 connectivity
> to only its direct neighbors. At the nodes at the edges v4 networks have
> been announced and full end-to-end communication was possible over the
> mixed AF network. The v6 only link between Babel and Bird (at the
> "center" of the above setup) did transport the v4 packets via the v6
> link-local next hop addresses just as expected.
>
> Thanks to Toke Høiland-Jørgensen for early review on this work.
>
> -----< notes >------
>
> (My current notes on the current implementation regarding
> compliance with the current draft)
>
> Bird doesn't implement compression for outgoing (send) IPv4 prefixes and
> thus this also hasn't been implemented for this AE. Best guess is that 4
> bytes are cheap enough to not bother with the added complexity? In the
> RX path compression is supported for IPv4 prefixes (for the IPv4 AE
> as well as for IPv4-via-IPv6 AE).
>
> Especially the part last paragraph of 2.2. remains to be discussed/solved:
>
>  * If a node cannot install v4-over-v6 routes, eg., due to hardware or
>    software limitations, then routes to an IPv4 prefix with an IPv6
>    next-hop MUST NOT be selected, as described in Section 3.5.3 of
>    [RFC6126bis].
>
>  * What if the kernel doesn't accept the RTA_VIA value we gave it?
>    Does BIRD generally handle this already?
>    One example is hitting: "ipv4: use IS_ENABLED instead of ifdef"
>    (id:20201115224509.2020651-1-flokli at flokli.de @ linux-netdev) Where
>    the kernel does support it but due to a bug fails to add those routes
>    when CONFIG_IPV6=m.
>    Kernel version comparison is probably a bad idea. Which kind of route
>    can we attempt to install to test this functionality without breaking
>    any setup?

Can't we just have Bird do the equivalent of:

ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink
ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink

on startup, and set a system flag depending on whether the operation
fails or not?


Other than this, I think the patch generally looks good (with a few nits
below); but we should probably wait until the draft progresses to the
point where it actually has an AE assigned in the text before we merge
this :)

> Section 5: I am using the proposed value 4 to test interoperability with
> the Babled PR (https://github.com/jech/babeld/pull/56/commits).
> ---
>  doc/bird.sgml         |   7 +++
>  proto/babel/babel.c   |  26 +++++++---
>  proto/babel/babel.h   |   3 ++
>  proto/babel/config.Y  |   4 +-
>  proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------
>  5 files changed, 117 insertions(+), 39 deletions(-)
>
> diff --git a/doc/bird.sgml b/doc/bird.sgml
> index 5408cb2a..ee895a47 100644
> --- a/doc/bird.sgml
> +++ b/doc/bird.sgml
> @@ -1770,6 +1770,7 @@ protocol babel [<name>] {
>  		check link <switch>;
>  		next hop ipv4 <address>;
>  		next hop ipv6 <address>;
> +		ipv4 via ipv6 <switch>;
>  	};
>  }
>  </code>
> @@ -1860,6 +1861,12 @@ protocol babel [<name>] {
>        interface. If not set, the same link-local address that is used as the
>        source for Babel packets will be used. In normal operation, it should not
>        be necessary to set this option.
> +
> +      <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag>
> +      If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4
> +      addresses are absent from the interface as described in draft-ietf-babel-v4viav6.
> +      Default: yes.
> +
>  </descrip>
>  
>  <sect1>Attributes
> diff --git a/proto/babel/babel.c b/proto/babel/babel.c
> index 4b6b9d7f..64a88601 100644
> --- a/proto/babel/babel.c
> +++ b/proto/babel/babel.c
> @@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable)
>      msg.update.router_id = e->router_id;
>      net_copy(&msg.update.net, e->n.addr);
>  
> -    msg.update.next_hop = ((e->n.addr->type == NET_IP4) ?
> -			   ifa->next_hop_ip4 : ifa->next_hop_ip6);
> +    if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) {
> +	    msg.update.next_hop = ifa->next_hop_ip4;
> +    } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) {
> +	    msg.update.next_hop = ifa->next_hop_ip6;
> +    /* If there is no IPv4 address on the interface (and we are supporting v4viav6)
> +     * send the v6 next hop instead */
> +    } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) {
> +	    msg.update.next_hop = ifa->next_hop_ip6;
> +    }

You're doing a bit of redundant checking here, since there's a check for
ipa_zero() just below. How about:


 if (e->n.addr->type == NET_IP4) {
   /* Always prefer v4 nexthop if set */
   if(!ipa_zero(ifa->next_hop_ip4))
     msg.update.next_hop = ifa->next_hop_ip4;

   /* Only send v4-via-v6 if enabled */
   else if (ifa->cf->ip4_via_ip6)
     msg.update.next_hop = ifa->next_hop_ip6;
 } else {
   msg.update.next_hop = ifa->next_hop_ip6;
 }

>  
>      /* Do not send route if next hop is unknown, e.g. no configured IPv4 address */
>      if (ipa_zero(msg.update.next_hop))
> @@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa)
>      return;
>    }
>  
> +  /* Reject IPv4 via IPv6 routes if disabled */
> +  if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) {
> +    DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n");
> +    return;
> +  }
> +



>    /* Regular update */
>    e = babel_get_entry(p, &msg->net);
>    r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */
> @@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa)
>    ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4;
>  
>    if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel)
> -    log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname);
> +    log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected

Just check ifa->cf->ip4_via_ip6 here as well, and make sure it gets set
to false if the feature is not supported on the system?

>    if (ifa->up)
>      babel_iface_kick_timer(ifa);
> @@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff)
>    }
>  
>    cli_msg(-1023, "%s:", p->p.name);
> -  cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s",
> -	  "Interface", "State", "RX cost", "Nbrs", "Timer",
> +  cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s",
> +	  "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer",
>  	  "Next hop (v4)", "Next hop (v6)");
>  
>    WALK_LIST(ifa, p->interfaces)
> @@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff)
>  	nbrs++;
>  
>      btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time();
> -    cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I",
> +    cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I",
>  	    ifa->iface->name, (ifa->up ? "Up" : "Down"),
> +	    (ifa->cf->ip4_via_ip6 ? "Yes" : "No"),
>  	    ifa->cf->rxcost, nbrs, MAX(timer, 0),
>  	    ifa->next_hop_ip4, ifa->next_hop_ip6);
>    }
> diff --git a/proto/babel/babel.h b/proto/babel/babel.h
> index e075024c..146d28f5 100644
> --- a/proto/babel/babel.h
> +++ b/proto/babel/babel.h
> @@ -104,6 +104,7 @@ enum babel_ae_type {
>    BABEL_AE_IP4			= 1,
>    BABEL_AE_IP6			= 2,
>    BABEL_AE_IP6_LL		= 3,
> +  BABEL_AE_IPV4_VIA_IPV6	= 4,
>    BABEL_AE_MAX
>  };
>  
> @@ -137,6 +138,8 @@ struct babel_iface_config {
>  
>    ip_addr next_hop_ip4;
>    ip_addr next_hop_ip6;
> +
> +  u8 ip4_via_ip6;			/* Enabe IPv4 via IPv6 */

Typo in comment

>  };
>  
>  struct babel_proto {
> diff --git a/proto/babel/config.Y b/proto/babel/config.Y
> index 2f3b637b..f17884eb 100644
> --- a/proto/babel/config.Y
> +++ b/proto/babel/config.Y
> @@ -25,7 +25,7 @@ CF_DECLS
>  CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT,
>  	TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK,
>  	NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS,
> -	ENTRIES, RANDOMIZE, ROUTER, ID)
> +	ENTRIES, RANDOMIZE, ROUTER, ID, VIA)
>  
>  CF_GRAMMAR
>  
> @@ -65,6 +65,7 @@ babel_iface_start:
>    BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL;
>    BABEL_IFACE->tx_priority = sk_priority_control;
>    BABEL_IFACE->check_link = 1;
> +  BABEL_IFACE->ip4_via_ip6 = 1;
>  };
>  
>  
> @@ -109,6 +110,7 @@ babel_iface_item:
>   | CHECK LINK bool { BABEL_IFACE->check_link = $3; }
>   | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); }
>   | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); }
> + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; }
>   ;
>  
>  babel_iface_opts:
> diff --git a/proto/babel/packets.c b/proto/babel/packets.c
> index 415ac3f9..d4c66a53 100644
> --- a/proto/babel/packets.c
> +++ b/proto/babel/packets.c
> @@ -130,9 +130,11 @@ struct babel_parse_state {
>    u64 router_id;		/* Router ID used in subsequent updates */
>    u8 def_ip6_prefix[16];	/* Implicit IPv6 prefix in network order */
>    u8 def_ip4_prefix[4];		/* Implicit IPv4 prefix in network order */
> +  u8 def_ip4viaip6_prefix[4];	/* Implicit IPv4 prefix in network order */
>    u8 router_id_seen;		/* router_id field is valid */
>    u8 def_ip6_prefix_seen;	/* def_ip6_prefix is valid */
>    u8 def_ip4_prefix_seen;	/* def_ip4_prefix is valid */
> +  u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/
>    u8 current_tlv_endpos;	/* End of self-terminating TLVs (offset from start) */
>    u8 sadr_enabled;
>  };
> @@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct
>  static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
>  static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
>  static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
> +static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen,
> +		                        u8 (*def_prefix)[4], ip_addr * next_hop, int len);

No need to forward-declare this function; you're only using it after it
was defined.

>  static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
>  static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
> @@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
>    msg->addr = IPA_NONE;
>    msg->sender = state->saddr;
>  
> -  if (msg->ae >= BABEL_AE_MAX)
> -    return PARSE_IGNORE;
> -
>    /*
>     * We only actually read link-local IPs. In every other case, the addr field
>     * will be 0 but validation will succeed. The handler takes care of these
> @@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
>      if (TLV_OPT_LENGTH(tlv) < 4)
>        return PARSE_ERROR;
>      state->current_tlv_endpos += 4;
> -    break;
> +    return PARSE_SUCCESS;
>  
>    case BABEL_AE_IP6:
>      if (TLV_OPT_LENGTH(tlv) < 16)
>        return PARSE_ERROR;
>      state->current_tlv_endpos += 16;
> -    break;
> +    return PARSE_SUCCESS;
>  
>    case BABEL_AE_IP6_LL:
>      if (TLV_OPT_LENGTH(tlv) < 8)
> @@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
>  
>      msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr));
>      state->current_tlv_endpos += 8;
> -    break;
> +    return PARSE_SUCCESS;
>    }
>  
> -  return PARSE_SUCCESS;
> +  return PARSE_IGNORE;
>  }
>  
>  static uint
> @@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED,
>    return PARSE_IGNORE;
>  }
>  
> +
> +/* This is called directrly from babel_read_next_hop to handle the ip4 address
> +   compression state */

Typo in comment

> +static int
> +babel_read_update_ip4_prefix(struct babel_tlv_update *tlv,

The babel_read_* functions generally read a particular type of TLV, so
maybe change the name to make it obvious this is a helper function.
babel_get_ip4_prefix() ?

> +                             struct babel_msg_update *msg,
> +                             u8 *def_prefix_seen,
> +                             u8 (*def_prefix)[4],
> +                             ip_addr *next_hop,
> +                             int len)

I don't think the "pointer with array size" argument type is generally
used in the Bird code; is it even supported by all compilers?

Also, I don't think it's necessary to pass a pointer to next_hop; it's a
fairly small value, so copying it is fine, and if it's not a pointer
it's obvious that it's not modified.

> +{
> +    if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> +      return PARSE_ERROR;
> +
> +    /* Cannot omit data if there is no saved prefix */
> +    if (tlv->omitted && !*def_prefix_seen)
> +      return PARSE_ERROR;
> +
> +    /* Update must have next hop, unless it is retraction */
> +    if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY)
> +      return PARSE_IGNORE;
> +
> +    /* Merge saved prefix and received prefix parts */
> +    u8 buf[4] = {};
> +    memcpy(buf, *def_prefix, tlv->omitted);
> +    memcpy(buf + tlv->omitted, tlv->addr, len);
> +
> +    ip4_addr prefix4 = get_ip4(buf);
> +    net_fill_ip4(&msg->net, prefix4, tlv->plen);
> +
> +    if (tlv->flags & BABEL_UF_DEF_PREFIX)
> +    {
> +      put_ip4(*def_prefix, prefix4);
> +      *def_prefix_seen = 1;
> +    }
> +
> +    msg->next_hop = *next_hop;
> +
> +    return PARSE_SUCCESS;
> +}
> +
>  /* This is called directly from babel_write_update() and returns -1 if a next
>     hop should be written but there is not enough space. */
>  static int
> @@ -579,6 +621,7 @@ static int
>  babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
>                    struct babel_parse_state *state)
>  {
> +  int rc;
>    struct babel_tlv_update *tlv = (void *) hdr;
>    struct babel_msg_update *msg = &m->update;
>  
> @@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
>  
>    /* Length of received prefix data without omitted part */
>    int len = BYTES(tlv->plen) - (int) tlv->omitted;
> -  u8 buf[16] = {};
>  
>    if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv)))
>      return PARSE_ERROR;
> @@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
>      break;
>  
>    case BABEL_AE_IP4:
> -    if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> -      return PARSE_ERROR;
> -
> -    /* Cannot omit data if there is no saved prefix */
> -    if (tlv->omitted && !state->def_ip4_prefix_seen)
> -      return PARSE_ERROR;
> -
> -    /* Update must have next hop, unless it is retraction */
> -    if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY))
> -      return PARSE_IGNORE;
> -
> -    /* Merge saved prefix and received prefix parts */
> -    memcpy(buf, state->def_ip4_prefix, tlv->omitted);
> -    memcpy(buf + tlv->omitted, tlv->addr, len);
> -
> -    ip4_addr prefix4 = get_ip4(buf);
> -    net_fill_ip4(&msg->net, prefix4, tlv->plen);
> +    rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen,
> +		                      &state->def_ip4_prefix, &state->next_hop_ip4,
> +				      len);
> +    if (rc != PARSE_SUCCESS)
> +      return rc;
>  
> -    if (tlv->flags & BABEL_UF_DEF_PREFIX)
> -    {
> -      put_ip4(state->def_ip4_prefix, prefix4);
> -      state->def_ip4_prefix_seen = 1;
> -    }
> +    break;
>  
> -    msg->next_hop = state->next_hop_ip4;
> +  case BABEL_AE_IPV4_VIA_IPV6:
> +    rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen,
> +    				      &state->def_ip4viaip6_prefix, &state->next_hop_ip6,
> +				      len);
> +    if (rc != PARSE_SUCCESS)
> +      return rc;
>  
>      break;
>  
> @@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
>        return PARSE_ERROR;
>  
>      /* Merge saved prefix and received prefix parts */
> +    u8 buf[16] = {};
>      memcpy(buf, state->def_ip6_prefix, tlv->omitted);
>      memcpy(buf + tlv->omitted, tlv->addr, len);
>  
> @@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m,
>    }
>    else if (msg->net.type == NET_IP4)
>    {
> -    tlv->ae = BABEL_AE_IP4;
> +    if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop))
> +      tlv->ae = BABEL_AE_IPV4_VIA_IPV6;
> +    else
> +      tlv->ae = BABEL_AE_IP4;
> +
>      tlv->plen = net4_pxlen(&msg->net);
>      put_ip4_px(tlv->addr, &msg->net);
>    }
> @@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m,
>      msg->full = 1;
>      return PARSE_SUCCESS;
>  
> +  /*
> +   * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be
> +   * treated in the same manner: the receiver processes the request as
> +   * described in Section 3.8 of [RFC6126bis].  If an Update is sent, then
> +   * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
> +   * irrespective of which AE was used in the request.
> +   */

You write 4 in the beginning and TBD later here. Should obviously be
changed to the right number when it's assigned.

>    case BABEL_AE_IP4:
> +  case BABEL_AE_IPV4_VIA_IPV6:
>      if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
>        return PARSE_ERROR;
>  
> @@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m,
>    case BABEL_AE_WILDCARD:
>      return PARSE_ERROR;
>  
> +  /*
> +   * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be
> +   * treated in the same manner: the receiver processes the request as
> +   * described in Section 3.8 of [RFC6126bis].  If an Update is sent, then
> +   * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
> +   * irrespective of which AE was used in the request.
> +   */

Here it's TBD everywhere, but see above.

>    case BABEL_AE_IP4:
> +  case BABEL_AE_IPV4_VIA_IPV6:
>      if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
>        return PARSE_ERROR;
>  
> -- 
> 2.29.2



More information about the Bird-users mailing list