[pnfs] Layoutcommit and stable flush

Garth Goodson Garth.Goodson at netapp.com
Fri Aug 4 13:59:39 EDT 2006



Dean Hildebrand wrote:
> Hi All,
> 
> I found an issue with layoutcommit when writing larger files. (I notice 
> it with files > 2 GB, since I have 2 GB of RAM.)
> 
> Layoutcommit is currently called in nfs_sync_inode_wait.  Currently we 
> are ignoring the 'how' parameter, and always issuing a layoutcommit 
> whenever this function is invoked.  This can lead to an excessive number 
> of layoutcommit calls, since nfs_sync_inode_wait is called for a variety 
> of reasons to synchronously update the inode.
> 
> The issue I'm seeing is that the VM calls this function under memory 
> pressure to flush and commit writes to disk (so it can release the 
> memory for those write requests).  When this is done, how == 
> FLUSH_STABLE.  For a 4 GB file, I see this function being invoked 5-15 
> times, creating 5-15 layoutcommits just to write one file.  This can 
> cause a big slowdown.  From talking with Trond, our initial inclination 
> is to NOT issue a layoutcommit in this situation.  What does everyone 
> think?
> 
> The larger issue, of course, is to determine for which values of 'how', 
> or in which situations, a layoutcommit should be issued.  This might be a 
> protocol issue that I should mention on the nfsv4 email list (let me 
> know if you think so).
> 
> A cursory examination reveals that nfs_sync_inode_wait is called in the 
> following situations with the following value of 'how':
> 1. Lock/Unlock (how == 0)
> 2. Getattr to update the mtime/ctime (how == FLUSH_NOCOMMIT)
> 3. Setattr (how == 0)
> 4. Memory pressure (how == FLUSH_STABLE)
> 5. Close (how == 0)
> 6. Rename (how == 0)
> 7. Delegation return (how == 0)
> 8. Fsync (how == 0)
> 9. A few other special cases
> 
> *Note: how == 0 means to flush all data, sync the data, and update the 
> metadata.
> 
> My initial view is that we always want to send a layoutcommit for all 
> values of 'how' other than FLUSH_STABLE.
> 
> One interesting case is with getattr, as layoutcommit is required to 
> update the mtime/ctime, but the data is not committed to stable 
> storage.  Is it ok to issue a layoutcommit without first calling commit?
> 

I think we want commits to occur before issuing layoutcommit.  Why 
update the metadata while the data is not guaranteed stable?
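
As a rough sketch of that ordering combined with Dean's proposal (an
illustration only, not the actual client code): the flag values, the
sketch_* names, and the step trace below are all made up for the example.
It assumes stable writes need no separate COMMIT, and it skips the
layoutcommit whenever the data has not been committed by this call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only; not taken from kernel headers. */
enum {
    SKETCH_FLUSH_STABLE   = 1 << 0,  /* memory-pressure writeback */
    SKETCH_FLUSH_NOCOMMIT = 1 << 1   /* flush without COMMIT (getattr path) */
};

/* Steps a sync performs, recorded in the order they happen. */
enum sketch_step { STEP_FLUSH, STEP_COMMIT, STEP_LAYOUTCOMMIT };

struct sketch_trace {
    enum sketch_step steps[3];
    size_t count;
};

/* Hypothetical policy: should a layoutcommit follow this sync? */
static bool sketch_want_layoutcommit(int how)
{
    if (how & SKETCH_FLUSH_STABLE)
        return false;   /* skip under memory pressure */
    if (how & SKETCH_FLUSH_NOCOMMIT)
        return false;   /* data not committed, so leave metadata alone */
    return true;        /* how == 0: lock, setattr, close, rename, fsync */
}

/* Flush, then commit (if requested), and only then layoutcommit. */
static void sketch_sync_inode(int how, struct sketch_trace *trace)
{
    assert(trace != NULL);
    trace->count = 0;
    trace->steps[trace->count++] = STEP_FLUSH;
    if (!(how & (SKETCH_FLUSH_STABLE | SKETCH_FLUSH_NOCOMMIT)))
        trace->steps[trace->count++] = STEP_COMMIT;
    if (sketch_want_layoutcommit(how))
        trace->steps[trace->count++] = STEP_LAYOUTCOMMIT;
}
```

With how == 0 the trace is flush, commit, layoutcommit; with either flag
set it stops after the flush, so the 5-15 memory-pressure calls would
produce no layoutcommits.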

I'm not sure what the problem is with getattr.  I think it is fine for 
getattr to return client-cached attrs that are not yet stable.  However, 
if things like mtime and ctime are returned, we must be sure that once the 
layoutcommit does occur, they are updated to at least those times (I 
think we can do this by passing them into layoutcommit).

-Garth

