Increasing the server wsize/rsize

Greg Banks gnb at melbourne.sgi.com
Thu Aug 3 04:04:11 EDT 2006


On Thu, 2006-08-03 at 17:23, Neil Brown wrote:
> On Wednesday August 2, bfields at fieldses.org wrote:
> > On Fri, Jul 28, 2006 at 07:34:43PM -0400, Dean Hildebrand wrote:
> > > Thanks for the updates Greg.  One issue I've found is that if we want to 
> > > go larger than 1MB, we still need to fix svc_tcp_recvfrom to use the 
> > > heap instead of the stack.
> > 
> > Even with only 1MB, we're still putting a 1k array on the stack in that
> > one function, which seems like a problem.
> > 
> 
> But do we really need to scale the 'vec' array at all?
> Why not just call svc_recvfrom multiple times.

svc_recvfrom() calls kernel_recvmsg() which traverses LSM security
hooks, futzes with the socket lock, and potentially sends an ACK
to the peer.  All of these are things you want to minimise doing.

Also, svc_recvfrom() calls sock->ops->getname which is low cost
but there's no point calling it multiple times when you don't
need to.  But you could easily pull that out into the caller.
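As a rough userspace analogy for pulling the name lookup out into the caller (a sketch only, not the sunrpc code; struct conn, conn_init() and conn_recv() are hypothetical names), the peer address is fetched once when the connection is set up, and the per-read helper does nothing but receive:

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical per-connection state: the peer address is cached here. */
struct conn {
    int fd;
    struct sockaddr_storage peer;
    socklen_t peerlen;
};

/* Look up the peer address once, in the caller, when the connection is set up. */
static int conn_init(struct conn *c, int fd)
{
    c->fd = fd;
    c->peerlen = sizeof(c->peer);
    return getpeername(fd, (struct sockaddr *)&c->peer, &c->peerlen);
}

/* The receive helper no longer repeats the address lookup on every call. */
static ssize_t conn_recv(struct conn *c, void *buf, size_t len)
{
    return recv(c->fd, buf, len, 0);
}
```

Each extra call to conn_recv() still pays the full cost of a receive call, which is the more important overhead described above.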

> We have exclusive access to read from the socket at this point so I
> cannot see room for a race.

svc_recvfrom() copies data out of the socket's receive queue
atomically wrt bottom half updates of the receive queue (newly
arrived packets go into a special backlog queue while we're
doing recvmsg() and that queue is emptied at the end of
recvmsg()).  So calling recvmsg() multiple times could open
a window where the receive queue changes due to a packet
arriving on a different CPU.  A side effect is that sunrpc
could get a data_ready callback partway through reading a
call from the socket, and presumably havoc would ensue.
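To illustrate the single-call approach, here is a minimal userspace sketch, assuming an ordinary stream socket and plain recvmsg() with an iovec array (the kernel path uses kernel_recvmsg() with a kvec array instead). The helper name recv_into_pages() and the MAX_PAGES limit are made up for the example. One call scatters the queued data across several buffers, so the socket is locked and the receive queue walked only once:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MAX_PAGES 16   /* illustrative limit, not a kernel constant */

/*
 * Hypothetical helper: fill up to npages buffers of pagesz bytes each
 * with a single recvmsg() call instead of one receive call per buffer.
 * Returns the number of bytes received, or -1 on error.
 */
static ssize_t recv_into_pages(int fd, void **pages, size_t npages, size_t pagesz)
{
    struct iovec iov[MAX_PAGES];
    struct msghdr msg;
    size_t i;

    assert(npages <= MAX_PAGES);

    /* Describe every destination buffer up front. */
    for (i = 0; i < npages; i++) {
        iov[i].iov_base = pages[i];
        iov[i].iov_len = pagesz;
    }

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = npages;

    /* One call: the data is copied out of the receive queue in one pass. */
    return recvmsg(fd, &msg, 0);
}
```

The cost of this approach is the one raised earlier in the thread: the vector array has to be big enough for the largest request, so for large rsize/wsize it would need to come from the heap rather than the stack.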


Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



