Increasing the server wsize/rsize
Greg Banks
gnb at melbourne.sgi.com
Thu Aug 3 04:04:11 EDT 2006
On Thu, 2006-08-03 at 17:23, Neil Brown wrote:
> On Wednesday August 2, bfields at fieldses.org wrote:
> > On Fri, Jul 28, 2006 at 07:34:43PM -0400, Dean Hildebrand wrote:
> > > Thanks for the updates Greg. One issue I've found is that if we want to
> > > go larger than 1MB, we still need to fix svc_tcp_recvfrom to use the
> > > heap instead of the stack.
> >
> > Even with only 1MB, we're still putting a 1k array on the stack in that
> > one function, which seems like a problem.
> >
>
> But do we really need to scale the 'vec' array at all?
> Why not just call svc_recvfrom multiple times?

svc_recvfrom() calls kernel_recvmsg() which traverses LSM security
hooks, futzes with the socket lock, and potentially sends an ACK
to the peer. All of these are things you want to minimise doing.
Also, svc_recvfrom() calls sock->ops->getname which is low cost
but there's no point calling it multiple times when you don't
need to. But you could easily pull that out into the caller.
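
To make the "one call, many buffers" point concrete, here is a rough
userspace analogy, not the sunrpc code itself. It assumes a plain
recvmsg() on a socketpair in place of kernel_recvmsg(), and struct iovec
in place of the kernel's kvec. CHUNK, PAYLOAD, recv_scattered() and
demo_single_call() are names made up for this sketch. The vector array
lives on the heap, so growing the maximum transfer size grows a heap
allocation rather than the stack frame, and the per-call costs are paid
once however many buffers are filled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define CHUNK   1024   /* hypothetical stand-in for a page-sized buffer */
#define PAYLOAD 3000   /* hypothetical request size used by the demo */

/*
 * Fill buf (len bytes) from fd using ONE recvmsg() call.  The iovec
 * array is heap-allocated, so its size does not consume stack.  The
 * caller must keep the number of chunks within the system's IOV_MAX.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t recv_scattered(int fd, char *buf, size_t len)
{
    size_t nvec = (len + CHUNK - 1) / CHUNK;
    struct iovec *vec;
    struct msghdr msg;
    ssize_t n;
    size_t i;

    assert(buf != NULL);
    if (nvec == 0)
        return 0;

    vec = calloc(nvec, sizeof(*vec));
    if (vec == NULL)
        return -1;

    /* Point each vector entry at its own chunk of the buffer. */
    for (i = 0; i < nvec; i++) {
        size_t off = i * CHUNK;

        vec[i].iov_base = buf + off;
        vec[i].iov_len = (len - off < CHUNK) ? len - off : CHUNK;
    }

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = vec;
    msg.msg_iovlen = nvec;

    /* One trip through the socket layer for all chunks. */
    n = recvmsg(fd, &msg, MSG_DONTWAIT);
    free(vec);
    return n;
}

/* Send PAYLOAD bytes over a socketpair and read them back in one call. */
ssize_t demo_single_call(void)
{
    int sv[2];
    char out[PAYLOAD], in[PAYLOAD];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    memset(out, 'x', sizeof(out));
    if (write(sv[0], out, sizeof(out)) != (ssize_t)sizeof(out)) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = recv_scattered(sv[1], in, sizeof(in));
    close(sv[0]);
    close(sv[1]);
    if (n > 0 && memcmp(in, out, (size_t)n) != 0)
        return -1;
    return n;
}
```

The alternative being discussed would loop over the chunks and issue one
receive per chunk, which repeats the per-call overhead each time.
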

> We have exclusive access to read from the socket at this point so I
> cannot see room for a race.

svc_recvfrom() copies data out of the socket's receive queue
atomically wrt bottom half updates of the receive queue (newly
arrived packets go into a special backlog queue while we're
doing recvmsg() and that queue is emptied at the end of
recvmsg()). So calling recvmsg() multiple times could open
a window where the receive queue changes due to a packet
arriving on a different CPU. A side effect is that sunrpc
could get a data_ready callback partway through reading a
call from the socket, and presumably havoc would ensue.
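
A toy model may make the window easier to see. This is only a sketch
under simplifying assumptions: packets are ints, everything is
single-threaded, and struct toy_sock, toy_deliver(), toy_recv() and the
two scenario functions are invented for illustration and are not kernel
interfaces. While a receive call "owns" the socket, arrivals are parked
in a backlog and merged on release; between calls, arrivals go straight
into the receive queue and notify the reader.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QMAX 16  /* toy capacity; no overflow checks, keep inputs small */

struct toy_sock {
    int    rcvq[QMAX];     /* receive queue the reader copies from */
    size_t rcv_len;
    int    backlog[QMAX];  /* arrivals parked while a receive call runs */
    size_t back_len;
    int    owned;          /* nonzero while a receive call is in progress */
    int    data_ready;     /* number of reader notifications so far */
    int    ready_at_copy;  /* data_ready as seen when the last copy finished */
};

/* Packet arrival, i.e. the bottom-half side in this model. */
void toy_deliver(struct toy_sock *sk, int pkt)
{
    if (sk->owned) {
        sk->backlog[sk->back_len++] = pkt;  /* defer until release */
    } else {
        sk->rcvq[sk->rcv_len++] = pkt;
        sk->data_ready++;                   /* notify the reader now */
    }
}

/*
 * One receive call: own the socket, copy up to max packets, release.
 * If mid_call is non-NULL, that packet "arrives" while the call runs.
 */
size_t toy_recv(struct toy_sock *sk, int *out, size_t max, const int *mid_call)
{
    size_t n, i;

    sk->owned = 1;
    if (mid_call)
        toy_deliver(sk, *mid_call);         /* lands in the backlog */

    n = sk->rcv_len < max ? sk->rcv_len : max;
    memcpy(out, sk->rcvq, n * sizeof(int));
    memmove(sk->rcvq, sk->rcvq + n, (sk->rcv_len - n) * sizeof(int));
    sk->rcv_len -= n;
    sk->ready_at_copy = sk->data_ready;

    /* Release: merge the backlog, notifying once if anything was parked. */
    for (i = 0; i < sk->back_len; i++)
        sk->rcvq[sk->rcv_len++] = sk->backlog[i];
    if (sk->back_len)
        sk->data_ready++;
    sk->back_len = 0;
    sk->owned = 0;
    return n;
}

/* Read 4 queued packets in one call; a late packet arrives mid-call. */
int notifications_single_call(void)
{
    struct toy_sock sk = {0};
    int out[4], late = 99, i;

    for (i = 0; i < 4; i++)
        toy_deliver(&sk, i);
    sk.data_ready = 0;                      /* count only from here on */
    toy_recv(&sk, out, 4, &late);
    return sk.ready_at_copy;
}

/* Read the same 4 packets in two calls; the late packet arrives between. */
int notifications_split_calls(void)
{
    struct toy_sock sk = {0};
    int out[4], i;

    for (i = 0; i < 4; i++)
        toy_deliver(&sk, i);
    sk.data_ready = 0;
    toy_recv(&sk, out, 2, NULL);
    toy_deliver(&sk, 99);                   /* nobody owns the socket here */
    toy_recv(&sk, out + 2, 2, NULL);
    return sk.ready_at_copy;
}
```

In the single-call scenario the reader is not notified until its copy is
complete. In the split scenario the queue grows and a notification fires
while the reader is still partway through the same logical request.
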

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.