Increasing the server wsize/rsize
Neil Brown
neilb at suse.de
Thu Aug 3 05:12:55 EDT 2006
On Thursday August 3, gnb at melbourne.sgi.com wrote:
> On Thu, 2006-08-03 at 17:23, Neil Brown wrote:
> > On Wednesday August 2, bfields at fieldses.org wrote:
> > > On Fri, Jul 28, 2006 at 07:34:43PM -0400, Dean Hildebrand wrote:
> > > > Thanks for the updates Greg. One issue I've found is that if we want to
> > > > go larger than 1MB, we still need to fix svc_tcp_recvfrom to use the
> > > > heap instead of the stack.
> > >
> > > Even with only 1MB, we're still putting a 1k array on the stack in that
> > > one function, which seems like a problem.
> > >
> >
> > But do we really need to scale the 'vec' array at all?
> > Why not just call svc_recvfrom multiple times?
>
> svc_recvfrom() calls kernel_recvmsg() which traverses LSM security
> hooks, futzes with the socket lock, and potentially sends an ACK
> to the peer. All of these are things you want to minimise doing.
True, but we also want to minimise the size of the 'vec' array (or
allocate it as part of struct svc_rqst).
So where is an appropriate trade-off? My '16' that I picked out of
the air means one call per 64K. Is that too often?
We could probably go to 32, but much more and we would really need to
move vec into svc_rqst.
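To make the arithmetic concrete, here is a rough userspace sketch of the trade-off. It assumes 4K pages and one page per 'vec' entry; the names SKETCH_PAGE_SIZE, bytes_per_call and calls_needed are made up for illustration and are not sunrpc code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4K pages, one page per vec entry. */
#define SKETCH_PAGE_SIZE 4096u

/* Bytes that a single receive call can cover with nvec entries. */
static size_t bytes_per_call(size_t nvec)
{
    return nvec * SKETCH_PAGE_SIZE;
}

/* Receive calls needed to pull in a message of msg_len bytes. */
static size_t calls_needed(size_t msg_len, size_t nvec)
{
    size_t chunk;

    assert(nvec > 0);
    chunk = bytes_per_call(nvec);
    return (msg_len + chunk - 1) / chunk;   /* round up */
}
```

Under those assumptions a 16-entry array covers 64K per call, so a 1MB message takes 16 calls, and a 32-entry array halves that to 8.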
>
> Also, svc_recvfrom() calls sock->ops->getname which is low cost
> but there's no point calling it multiple times when you don't
> need to. But you could easily pull that out into the caller.
Yep.
>
> > We have exclusive access to read from the socket at this point so I
> > cannot see room for a race.
>
> svc_recvfrom() copies data out of the socket's receive queue
> atomically wrt bottom half updates of the receive queue (newly
> arrived packets go into a special backlog queue while we're
> doing recvmsg() and that queue is emptied at the end of
> recvmsg()). So calling recvmsg() multiple times could open
> a window where the receive queue changes due to a packet
> arriving on a different CPU. A side effect is that sunrpc
> could get a data_ready callback partway through reading a
> call from the socket, and presumably havoc would ensue.
>
We don't start calling svc_recvfrom until we know there are enough
bytes available. So these newly arriving packets have nothing much to
do with the current rpc message.
And a stray call to data_ready just calls svc_sock_enqueue which will
quickly exit because SK_BUSY is set - no room for havoc there.
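The kind of guard I mean can be sketched in userspace roughly as follows. This is only an illustration using C11 atomics; struct sketch_sock, sketch_enqueue and sketch_release are hypothetical stand-ins, not the real svc_sock_enqueue or its locking.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket state; not the real svc_sock. */
struct sketch_sock {
    atomic_flag busy;   /* plays the role of the SK_BUSY bit */
    int enqueued;       /* counts enqueues that actually happened */
};

/* Enqueue the socket unless a thread is already working on it. */
static bool sketch_enqueue(struct sketch_sock *s)
{
    if (atomic_flag_test_and_set(&s->busy))
        return false;   /* already busy: the stray callback does nothing */
    s->enqueued++;
    return true;
}

/* Called once the current message has been fully received. */
static void sketch_release(struct sketch_sock *s)
{
    atomic_flag_clear(&s->busy);
}
```

While the flag is held, any number of extra sketch_enqueue calls return immediately without touching the socket; only after sketch_release can it be queued again.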
So you've raised some issues worth considering, but I'm not sure what
conclusion they really point to...
NeilBrown