IPv6 BGP & kernel 4.19

Sun Mar 29 16:04:46 CEST 2020

Hi,

Here `net.ipv6.route.gc_thresh = -1` seems to be sufficient.
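
For anyone who wants to script the check, here is a minimal Python sketch, assuming the usual mapping of dotted sysctl names onto files under /proc/sys. The helper names (`sysctl_path`, `read_sysctl`, `write_sysctl`) are hypothetical and made up for illustration, and writing a value needs root.

```python
from pathlib import Path

# Assumption: sysctls are exposed as files under /proc/sys, with the dots
# in the name mapped to directory separators.
SYSCTL_ROOT = Path("/proc/sys")


def sysctl_path(name, root=SYSCTL_ROOT):
    """Map a dotted sysctl name to its file path below `root`."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root=SYSCTL_ROOT):
    """Return the first integer stored in the sysctl file."""
    return int(sysctl_path(name, root).read_text().split()[0])


def write_sysctl(name, value, root=SYSCTL_ROOT):
    """Write an integer value to the sysctl file (needs root on a live system)."""
    sysctl_path(name, root).write_text(f"{value}\n")
```

On a router this would be used as `read_sysctl("net.ipv6.route.max_size")` to see the current limit and `write_sysctl("net.ipv6.route.gc_thresh", -1)` to apply the value above. The setting does not survive a reboot unless it is also put in the sysctl configuration.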

Thanks for the idea!

Alarig

On Mon 16 Mar 22:10:28 2020, Clément Guivy wrote:
> Thanks.
> 
> I found a solution which seems to be working so far, with regular Debian
> 4.19 kernel, on my 2 edge routers.
> 
> I set both net.ipv6.route.gc_thresh and net.ipv6.route.max_size to 131072.
> The reasoning behind that is to have this limit above the number of routes
> in the full view, so that gc is not triggered too often.
> Once the full view is loaded I can now perform an 'ip route get' lookup on
> each and every prefix without getting a "Network is unreachable" error
> (thanks for the tip Basil) or facing a noticeable service disruption, and
> IPv6 BGP sessions have also been stable so far (i.e. for a few days).
> 
> If anyone reproduces this solution (or finds another one) I would be glad to
> know.
> 
> 
> On 16/03/2020 12:41, Baptiste Jonglez wrote:
> > FYI, babeld seems to be affected by this same bug: https://github.com/jech/babeld/issues/50
> > 
> > The net.ipv6.route.max_size workaround is also mentioned there.
> > 
> > Baptiste
> > 
> > On 26-02-20, Basil Fillan wrote:
> > > Hi,
> > > 
> > > We've also experienced this after upgrading a few routers to Debian Buster.
> > > With a kernel bisect we found that a bug was introduced in the following
> > > commit:
> > > 
> > > 3b6761d18bc11f2af2a6fc494e9026d39593f22c
> > > 
> > > This bug was still present in master as of a few weeks ago.
> > > 
> > > It appears entries are added to the IPv6 route cache which aren't visible
> > > from "ip -6 route show cache", but are causing the route cache garbage
> > > collection system to trigger extremely often (every packet?) once it exceeds
> > > the value of net.ipv6.route.max_size. Our original symptom was extreme
> > > forwarding jitter caused within the ip6_dst_gc function (identified by some
> > > spelunking with systemtap & perf) worsening as the size of the cache
> > > increased. This was due to our max_size sysctl inadvertently being set to 1
> > > million. Reducing this value to the default 4096 broke IPv6 forwarding
> > > entirely on our test system under affected kernels. Our documentation had
> > > this sysctl marked as the maximum number of IPv6 routes, so it looks like
> > > the use changed at some point.
> > > 
> > > We've rolled our routers back to kernel 4.9 (with the sysctl set to 4096)
> > > for now, which fixed our immediate issue.
> > > 
> > > You can reproduce this by adding more than 4096 (default value of the
> > > sysctl) routes to the kernel and running "ip route get" for each of them.
> > > Once the route cache is filled, the error "RTNETLINK answers: Network is
> > > unreachable" will be received for each subsequent "ip route get"
> > > incantation, and v6 connectivity will be interrupted.
> > > 
> > > Thanks,
> > > 
> > > Basil
> > > 
> > > 
> > > On 26/02/2020 20:38, Clément Guivy wrote:
> > > > Hi, did anyone find a solution or workaround regarding this issue?
> > > > Considering a router use case.
> > > > I have looked at rt6_stats, total route count is around 78k (full view),
> > > > and around 4100 entries in the cache at the moment on my first router
> > > > (forwarding a few Mb/s) and around 2500 entries on my second router
> > > > (forwarding less than 1 Mb/s).
> > > > I have reread the entire thread. At first, Alarig's research seemed to
> > > > lead to a neighbor management problem; my understanding is that the route
> > > > cache is something else entirely - or is it related somehow?
> > > > 
> > > > 
> > > > On 03/12/2019 19:29, Alarig Le Lay wrote:
> > > > > We agree then, and I act as a router on all those machines.
> > > > > 
> > > > > On 3 December 2019 19:27:11 GMT+01:00, Vincent Bernat
> > > > > <vincent at bernat.ch> wrote:
> > > > > 
> > > > >      This is the result of PMTUd. But when you are a router, you don't
> > > > >      need to do that, so it's mostly a problem for end hosts.
> > > > > 
> > > > >      On December 3, 2019 7:05:49 PM GMT+01:00, Alarig Le Lay
> > > > >      <alarig at swordarmor.fr> wrote:
> > > > > 
> > > > >          On 03/12/2019 14:16, Vincent Bernat wrote:
> > > > > 
> > > > >              The information needs to be stored somewhere.
> > > > > 
> > > > > 
> > > > >          Why does it have to be stored? It's not really my problem if
> > > > >          someone else has a non-standard MTU and can't do TCP-MSS
> > > > >          or PMTUd.
>