[pnfs] Fwd: poor read performance
Benny Halevy
bhalevy at panasas.com
Wed Oct 3 17:59:38 EDT 2007
dean hildebrand wrote:
> I forgot to copy the list... stuff to talk about next week.
> Dean
>
> ---------- Forwarded message ----------
> From: dean hildebrand <seattleplus at gmail.com>
> Date: Oct 3, 2007 11:48 AM
> Subject: Re: [pnfs] poor read performance
> To: "J. Bruce Fields" <bfields at fieldses.org>
>
>
> Thanks Bruce, it was a problem with calculation of the replen. Space
> is being added for the sequence op twice, once in the encode_xxx
> function and once at the top of nfs4xdr.c in NFS4_dec_read_sz. I will
> send out a quick fix patch.
>
> I call it a quick fix patch because the real fix involves a lot more
> changes. Currently our pNFS kernel cannot support NFSv4.0 due to the
> sessions code.
That's unacceptable. Both client and server must work properly with either a 4.0
or 4.1 counterpart.
> If you try to disable NFSv4.1 in .config it won't even
> compile.
Yeah, we need to fix that too. However, it should not be a requirement
for 4.0 backward compatibility. I even suggest we add a mount option
to skip trying a 4.1 mount if one wants to force mounting a 4.1 server over 4.0.
> With regards to xdr sequence header allocation, currently a
> global variable (nr_sequence_quads) (yikes!) is being used to
> calculate the size of the sequence header. Obviously this won't
> support simultaneous NFSv4.0 and NFSv4.1 mounts.
>
> Off the top of my head, it seems we need to add NFS4_enc_xxx_sz and
> NFS4_dec_xxx_sz variables to support the different header sizes for
> NFSv4.1 and NFSv4.0. This also means adding some more encode/decode
> functions....
>
> Dean
>
>
>
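For illustration, the double count described above and the per-version
sizing Dean proposes can be sketched in user-space C. This is a minimal
sketch, not the pNFS tree's code: the word counts and the names
read_replen() and head_shift() are hypothetical, and the 11-word sequence
reply is an assumption chosen because it matches the 44-byte shift
reported further down in this thread.

```c
#include <assert.h>

/* Illustrative sizes in 4-byte XDR words. These are assumptions for the
 * sketch, not values taken from the pNFS kernel tree. */
enum {
    XDR_WORD       = 4,   /* bytes per XDR word */
    HDR_WORDS      = 3,   /* compound reply header */
    PUTFH_WORDS    = 2,   /* putfh reply */
    READ_WORDS     = 4,   /* read reply up to the start of the data */
    SEQUENCE_WORDS = 11   /* sequence reply (44 bytes), assumed */
};

/* Bytes of reply that precede the read data, chosen per mount by minor
 * version instead of through a global such as nr_sequence_quads. */
unsigned int read_replen(unsigned int minorversion)
{
    unsigned int words = HDR_WORDS + PUTFH_WORDS + READ_WORDS;

    assert(minorversion <= 1);
    if (minorversion == 1)
        words += SEQUENCE_WORDS;   /* counted exactly once */
    return words * XDR_WORD;
}

/* Same quantity as "shift" in the xdr_read_pages() excerpt, expressed
 * with lengths: estimated head length minus the real header length. */
long head_shift(unsigned int estimated, unsigned int actual)
{
    return (long)estimated - (long)actual;
}
```

With these assumed sizes, read_replen(0) is 36 bytes and read_replen(1)
is 80. Counting the sequence op a second time gives 124, so
head_shift(124, 80) is 44, and the shift drops to 0 once the estimate is
right.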
> On 10/2/07, J. Bruce Fields <bfields at fieldses.org> wrote:
>> On Tue, Oct 02, 2007 at 10:57:13PM -0400, J. Bruce Fields wrote:
>>> On Tue, Oct 02, 2007 at 07:39:22PM -0700, dean hildebrand wrote:
>>>> I'm new to the sessions code, but it seems that in xdr_read_pages
>>>> there is a check which will call xdr_shrink_bufhead (which will call
>>>> memmove)
>>>> ...
>>>> /* Realign pages to current pointer position */
>>>> iov = buf->head;
>>>> shift = iov->iov_len + (char *)iov->iov_base - (char *)xdr->p;
>>>> if (shift > 0)
>>>>     xdr_shrink_bufhead(buf, shift);
>>>> ...
>>>>
>>>> Placing some debug statements in xdr_read_pages, it seems that
>>>> xdr_shrink_bufhead is only called when the sessions code is used,
>>>> which might be the CPU hog. With sessions in our system the shift is
>>>> 44 on every read request.
>>>>
>>>> Any ideas? How could sessions affect the read buffer? Has anyone
>>>> seen a read performance problem? Something to do with slots? (always
>>>> a good guess)
>>> I think it's in the function that encodes the xdr read that you'll find
>>> an estimate of the offset into the reply where the read data is
>> More specifically: take a look at the call to xdr_inline_pages() from
>> nfs4_xdr_enc_read(), and the comment right before it. Note some of the
>> other read-like functions (readdir, readlink, getacl, fs_locations) do
>> the same thing.
>>
>> --b.
>>
>>> expected to be found (assuming it turns out to be a successful read).
>>> That estimate is what allows the client to arrange to have the read data
>>> received directly into the page cache instead of having to first receive
>>> it into a separate xdr buffer and then copy it all.
>>>
>>> That calculation probably hasn't been updated to take into account
>>> whatever extra new session op it is that's at the start of each read
>>> operation, probably adding a word or two to the start of each read
>>> reply....
>>>
>>> --b.
>>> _______________________________________________
>>> pNFS mailing list
>>> pNFS at linux-nfs.org
>>> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
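Bruce's point, that a correct offset estimate lets the read data land
directly where it belongs while a wrong one forces a copy, can be modeled
with a toy buffer. The following is a user-space sketch under stated
assumptions, not kernel code: struct toy_buf, toy_receive() and
toy_realign() are hypothetical names, and the single data array stands in
for the page list that xdr_inline_pages() sets up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an xdr_buf: a head area followed by one data area
 * that plays the role of the page cache pages. */
struct toy_buf {
    unsigned char head[128];
    size_t head_len;          /* bytes received into the head */
    unsigned char data[256];
    size_t data_len;          /* bytes received into the data area */
};

/* Scatter a flat reply into head and data, splitting at the header
 * length the client estimated when it encoded the request. */
void toy_receive(struct toy_buf *b, const unsigned char *reply,
                 size_t len, size_t est_hdr)
{
    assert(est_hdr <= len && est_hdr <= sizeof(b->head));
    assert(len - est_hdr <= sizeof(b->data));
    memcpy(b->head, reply, est_hdr);
    b->head_len = est_hdr;
    memcpy(b->data, reply + est_hdr, len - est_hdr);
    b->data_len = len - est_hdr;
}

/* If the head holds payload past the real header, move it to the front
 * of the data area. Returns how many bytes memmove() had to shift. */
size_t toy_realign(struct toy_buf *b, size_t real_hdr)
{
    size_t shift, moved;

    if (b->head_len <= real_hdr)
        return 0;                      /* estimate was right: no copy */
    shift = b->head_len - real_hdr;
    assert(b->data_len + shift <= sizeof(b->data));
    moved = b->data_len;
    memmove(b->data + shift, b->data, b->data_len);  /* the costly part */
    memcpy(b->data, b->head + real_hdr, shift);
    b->data_len += shift;
    b->head_len = real_hdr;
    return moved;
}
```

With a 36-byte header and 100 bytes of read data, an estimate of 80
leaves 44 payload bytes in the head, and toy_realign() has to shift the
56 bytes already in the data area. An estimate of 36 needs no copy at
all.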