[PATCH 2/3] Fix issue where cli socket buffer may get full

Kenth Eriksson Kenth.Eriksson at infinera.com
Fri Jun 14 19:57:55 CEST 2019


On Tue, 2019-06-11 at 18:14 +0200, Ondrej Zajicek wrote:
> On Tue, Jun 11, 2019 at 03:54:21PM +0000, Kenth Eriksson wrote:
> > On Tue, 2019-06-11 at 17:03 +0200, Ondrej Zajicek wrote:
> > > CAUTION: This email originated from outside of the organization.
> > > Do
> > > not click links or open attachments unless you recognize the
> > > sender
> > > and know the content is safe.
> > > 
> > > 
> > > On Tue, Jun 11, 2019 at 12:09:02PM +0200, Kenth Eriksson wrote:
> > > > The cli module must reset the socket io buffer rpos when all
> > > > characters
> > > > in the socket buffer has been processed.
> > > > 
> > > > The method cli_get_command is always invoked twice for every
> > > > CLI
> > > > command, thus rxpos may also be reset at the second invocation
> > > > if
> > > > no
> > > > newline was found and no new data is input to the socket
> > > > buffer.
> > > 
> > > Hi
> > > 
> > > Now i see where the bug is hidden:
> > > 
> > > There are two RX buffers in CLI, one part of socket (s->rbuf) and
> > > one
> > > directly in CLI (c->rx_buf). In cli_get_command(), the command is
> > > copied
> > > from the first to the second (where it starts from the beginning
> > > of
> > > the
> > > buffer), but position in the second buffer is reset only during
> > > unsuccessful cli_get_command() of the next command. Therefore, if
> > > the
> > > command ends on the same position as the end of the first buffer,
> > > then
> > > sk_read calls read with 0 length.
> > > 
> > > This your patch resets position of the first buffer when all data
> > > in
> > > it were
> > > processed. That is correct and helps with the bug (as it is less
> > > likely
> > > to hit the end of the first buffer). But the socket is still a
> > > bit
> > > problematic - it works only in request-reply mode (which is OK
> > > w.r.t.
> > > birdc client). If a client sends more bytes (e.g. next commands)
> > > before
> > > all the reply is sent back, it is possible to hang the CLI
> > > without
> > > processing all commands, or trigger the condition when read is
> > > called
> > > with 0 length.
> > > 
> > > E.g. sender sends enough bytes to fill the socket buffer, CLI
> > > event
> > > is called, cli_get_command() reads first command, but there are
> > > more
> > > data in the socket buffer, so it does not reset the position,
> > > therefore the next rx_hook for CLI is called with 0 length.
> > > 
> > Correct, you still need patch 1/3 to not close socket if you read
> > zero
> > bytes due to socket buffer is full. 
> > 
> > Also patch 1/3 is not only relevant for CLI. E.g. what happens if
> > you
> > get a big burst of protocol packets (OSPF, BGP etc.)? That would
> > cause
> > the socket to get closed due to buffer is full?
> 
> No, because all other protocols process data from rx-buffer
> immediately
> in rx_hook. It is trivial for datagram-based protocols like OSPF or
> RIP,
> where one rx_hook call is one datagram. It is a bit more complicated
> for
> TCP-based BGP, where BGP message framing is not congruent with
> read(),
> but in that case we process the rx-buffer in rx_hook until we have
> one
> full message (and we have big enough buffer for one full message) and
> the we immediately process it from rx_hook.
> 
> Therefore we have invariant that after rx_hook, there is non-full
> buffer.
> This invariant is not satisfied by CLI rx_hook. This is crudely fixed
> by
> 3/3, but i think that if we want to throw away some data, it should
> be
> done on message boundary.
> 
> Also, better way to handle 1/3 is to add condition on line io.c line
> 2211, so that if we have full buffer, we do not try to check POLLIN,
> so that we do not even call sk_read() in such case instead of try it
> to handle from inside sk_read().
> 
I believe we *might* have seen a side effect of trying to do what you
suggest here. The sk_read also makes accept on the socket if it is not
in a connected state, e.g. SK_UNIX_PASSIVE and SK_TCP_PASSIVE. Might be
that I implemented your suggestion incorrectly. So could you please
provide a patch for your suggestion? Or maybe comment on the one I sent
on the list? 



More information about the Bird-users mailing list