[pnfs] [PATCH 10/18] pnfs client write path no rpc nfs page cache cleanup
William A. (Andy) Adamson
andros at citi.umich.edu
Tue Jan 8 15:45:45 EST 2008
On Jan 8, 2008 3:14 PM, Benny Halevy <bhalevy at panasas.com> wrote:
> William A. (Andy) Adamson wrote:
> >
> >
> > On 1/8/08, *Benny Halevy* <bhalevy at panasas.com
> > <mailto:bhalevy at panasas.com>> wrote:
> >
> > On Jan. 07, 2008, 22:47 +0200, andros at umich.edu
> > <mailto:andros at umich.edu> wrote:
> > > From: Andy Adamson <andros at umich.edu <mailto:andros at umich.edu>>
> > >
> > > Cleanup the nfs page cache after a non rpc pNFS write
> > >
> > > The 2.6.18.3 <http://2.6.18.3> client called
> > pnfs_writeback_done_norpc() from pnfs_writeback_done
> > > to clean up after a pNFS write that used the page cache, but not
> > NFS RPC.
> > > pnfs_writeback_done_norpc() contained code copied from
> > > the rpc_call_done routines nfs_writeback_done_full, and
> > > nfs_writeback_done_partial. It ran all the code that did not
> > have to do
> > > with an rpc call, namely the NFS_PROTO(inode)->write_done and
> > rpc_restart_call
> > > called in nfs_writeback_done_partial. The PNFS_USE_FULL_CB was
> > used to switch
> > > between the nfs_writeback_done_full and nfs_writeback_done_partial
> > > pieces of pnfs_writeback_done_norpc.
> > >
> > > Following the implementation for read, this patch provides a
> > cleaner way of
> > > providing the same functionlality.
> > > The pNFS layout driver that uses the nfs page cache, but does
> > not use
> > > RPC sets the new flag PNFS_NO_RPC in nfs_write_data->pnfsflags.
> > > This flag is checked in pnfs4_write_done, nfs_writeback_done, and
> > > nfs_writeback_done_partial().
> > >
> > > This allows the non rpc pNFS I/O path to use the same cleanup
> > routines
> > > that NFS uses.
> > >
> > > The PNFS_USE_FULL_CB is no longer needed.
> > >
> > > NOTE: The 2.6.18.3 <http://2.6.18.3> code skipped the
> > "need_commit" path in
> > > nfs_writeback_done_partial yet calls it in
> nfs_writeback_done_full.
> > > This patch does the same. This should be reviewed.
> >
> > NAK on both the change to nfs_writeback_done_partial and
> > nfs_writeback_done
> > Managing the page cache has nothing to do with how the layout driver
> > performed the I/O on the DS. The upper layer cares only if and
> > how much
> > data was written and if it's committed to stable storage.
> >
> >
> > I can see your point with nfs_writeback_done_partial.
> >
> > Please explain why a non-RPC layout driver would want to resend and
> > RPC, which is what could happen without the change to
> > nfs_writeback_done.....
> When the layout driver has a partial layout covering only part of the
> written range
> the layout driver will effectively perform a short write and I want the
> upper layer to
> iterate forward, causing pnfs_update_layout for the remainder.
>
> The design principle behind this is that no matter how the layout driver
> performed
> the I/O it must still provide the same stable storage semantics to the
> upper layer
> so it is indistinguishable from a regular NFS I/O to the MDS.
ok. so we do something like the following, where pnfs_restart_call() is a
yet to be written layout type specific io operation:
int nfs_writeback_done()
........
/* Is this a short write? */
if (task->tk_status >= 0 && resp->count < argp->count) {
static unsigned long complain;
nfs_inc_stats(data->inode, NFSIOS_SHORTWRITE);
/* Has the server at least made some progress? */
if (resp->count != 0) {
/* Was this an NFSv2 write or an NFSv3 stable write?
*/
if (resp->verf->committed != NFS_UNSTABLE) {
/* Resend from where the server left off */
argp->offset += resp->count;
argp->pgbase += resp->count;
argp->count -= resp->count;
} else {
/* Resend as a stable write in order to
avoid
* headaches in the case of a server crash.
*/
argp->stable = NFS_FILE_SYNC;
}
#ifdef CONFIG_PNFS
if (data->pnfsflags & PNFS_NO_RPC)
pnfs_restart_call();
else
#endif /* CONFIG_PNFS */
rpc_restart_call(task);
return -EAGAIN;
}
if (time_before(complain, jiffies)) {
printk(KERN_WARNING
"NFS: Server wrote zero bytes, expected
%u.\n",
argp->count);
complain = jiffies + 300 * HZ;
}
/* Can't do anything about it except throw an error. */
task->tk_status = -EIO;
}
return 0;
}
>
> >
> > -->Andy
> <-- Benny :)
> >
> > Benny
> >
> > >
> > > Signed-off by: Andy Adamson< andros at umich.edu
> > <mailto:andros at umich.edu>>
> > > ---
> > > fs/nfs/nfs4proc.c | 3 +++
> > > fs/nfs/pnfs.c | 11 +++++++++--
> > > fs/nfs/pnfs.h | 1 -
> > > fs/nfs/write.c | 13 ++++++++-----
> > > 4 files changed, 20 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index bd32a1e..0fdd970 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -3022,6 +3022,9 @@ static int pnfs4_write_done(struct
> > rpc_task *task, struct nfs_write_data *data)
> > > struct nfs_client *client = mds_svr->nfs_client;
> > > int status = task->tk_status;
> > >
> > > + if (data->pnfsflags & PNFS_NO_RPC)
> > > + return 0;
> > > +
> > > /* Is this a DS session */
> > > if (data->ds_nfs_client) {
> > > dprintk("%s DS read\n",__func__);
> > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > > index 5c3fc6b..28b3e7f 100644
> > > --- a/fs/nfs/pnfs.c
> > > +++ b/fs/nfs/pnfs.c
> > > @@ -905,7 +905,12 @@ pnfs_set_ds_iosize(struct nfs_server *server)
> > > }
> > > }
> > >
> > > -/* Invoked by all non-NFSv4 I/O layout drivers to mark pages
> > for commit
> > > +/* Post-write completion function. Invoked by non RPC layout
> > drivers
> > > + * to clean up write pages.
> > > + *
> > > + * NOTE: callers must set data->pnfsflags PNFS_NO_RPC
> > > + * so that the NFS cleanup routines perform only the page cache
> > > + * cleanup.
> > > */
> > > static void
> > > pnfs_writeback_done(struct nfs_write_data *data, ssize_t status)
> > > @@ -920,7 +925,9 @@ pnfs_writeback_done(struct nfs_write_data
> > *data, ssize_t status)
> > > /* Status is the number of bytes written or an error code */
> > > data->task.tk_status = status;
> > > data->res.count = status;
> > > - pnfs_writeback_done_norpc(&data->task, data);
> > > +
> > > + /* call the NFS cleanup routines. */
> > > + data->call_ops->rpc_call_done(&data->task, data);
> > > data->call_ops->rpc_release(data);
> > > }
> > >
> > > diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
> > > index 070c23b..2baf9ae 100644
> > > --- a/fs/nfs/pnfs.h
> > > +++ b/fs/nfs/pnfs.h
> > > @@ -50,7 +50,6 @@ void pnfs_set_ds_iosize(struct nfs_server
> > *server);
> > > int pnfs_commit(struct inode *inode, struct list_head *head,
> > int sync, struct nfs_write_data *data);
> > > int pnfs_try_to_commit(struct inode *, struct nfs_write_data *,
> > struct list_head *, int);
> > > void pnfs_readpage_result_norpc(struct rpc_task *task, void
> > *calldata);
> > > -void pnfs_writeback_done_norpc(struct rpc_task *, void *);
> > > void pnfs_commit_done_norpc(struct rpc_task *, void *);
> > > void pnfs_pageio_init_read(struct nfs_pageio_descriptor *,
> > struct inode *, struct nfs_open_context *, struct list_head *,
> > size_t *);
> > > void pnfs_pageio_init_write(struct nfs_pageio_descriptor *,
> > struct inode *);
> > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > index e086ae4..eefc50c 100644
> > > --- a/fs/nfs/write.c
> > > +++ b/fs/nfs/write.c
> > > @@ -1009,11 +1009,6 @@ pnfs_writeback_done_update(struct
> > nfs_write_data *data)
> > > pnfs_need_layoutcommit(nfsi, argp->context);
> > > }
> > >
> > > -void pnfs_writeback_done_norpc(struct rpc_task *task, void
> > *calldata)
> > > -{
> > > - printk("%s XXX Need to Implement\n", __func__);
> > > -}
> > > -
> > > void pnfs_commit_done_norpc(struct rpc_task *task, void
> *calldata)
> > > {
> > > printk("%s XXX Need to Implement\n", __func__);
> > > @@ -1045,7 +1040,11 @@ static void
> > nfs_writeback_done_partial(struct rpc_task *task, void *calldata)
> > > goto out;
> > > }
> > >
> > > +#ifdef CONFIG_PNFS
> > > + if (!(data->pnfsflags & PNFS_NO_RPC) &&
> > nfs_write_need_commit(data)) {
> > > +#else
> > > if (nfs_write_need_commit(data)) {
> > > +#endif /* CONFIG_PNFS */
> > > struct inode *inode = page->mapping->host;
> > >
> > > spin_lock(&inode->i_lock);
> > > @@ -1214,7 +1213,11 @@ int nfs_writeback_done(struct rpc_task
> > *task, struct nfs_write_data *data)
> > > nfs_inc_stats(data->inode, NFSIOS_SHORTWRITE);
> > >
> > > /* Has the server at least made some progress? */
> > > +#ifdef CONFIG_PNFS
> > > + if (resp->count != 0 && !(data->pnfsflags &
> > PNFS_NO_RPC)) {
> > > +#else
> > > if (resp->count != 0) {
> > > +#endif /* CONFIG_PNFS */
> > > /* Was this an NFSv2 write or an NFSv3
> > stable write? */
> > > if (resp->verf->committed != NFS_UNSTABLE) {
> > > /* Resend from where the server
> > left off */
> >
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org <mailto:pNFS at linux-nfs.org>
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://linux-nfs.org/pipermail/pnfs/attachments/20080108/1533d845/attachment-0001.htm
More information about the pNFS
mailing list