[pnfs] [PATCH 4/5] pNFS: Merge filelayout_flush_one into filelayout_write_pagelist.

Dean Hildebrand seattleplus at gmail.com
Wed May 7 20:10:15 EDT 2008



William A. (Andy) Adamson wrote:
>
>
> On Wed, May 7, 2008 at 6:35 PM, Dean Hildebrand <seattleplus at gmail.com>
> wrote:
>
>
>
>     William A. (Andy) Adamson wrote:
>
>
>
>         On Wed, May 7, 2008 at 5:47 PM, Dean Hildebrand
>         <seattleplus at gmail.com> wrote:
>
>
>
>            William A. (Andy) Adamson wrote:
>
>                Hi Dean
>
>                Could you please split this patch up? Perhaps into the
>         removal
>                of the flush function + replacement function, the
>         changes to
>                filelayout_write_pagelist, the changes to
>         filelayout_commit,
>                and the removal of free_request_data?
>
>            But how do I keep the bisectability?  Touching any one of
>         those 3
>            elements affects the other.  I don't like separating the
>         add and
>            remove of things, I prefer the replace-style patches.
>          Either way,
>            this patch essentially has those elements in separate chunks.
>             (there is no replacement of flush_one, just deletion of
>            flush_one/free_request_data and change of
>         write_pagelist/commit)
>
>
>         ok. then 1 patch has the deletion of flush one, one patch has
>         the deletion of free_request_data, one patch has the change to
>         write_pagelist and the other the change to filelayout_commit.
>         that way, if we don't agree with the change to say
>         write_pagelist we can change just the one patch. or, if we end
>         up not wanting to remove free_request_data, we just remove one
>         patch.
>
>
>     If you want, I could have 1 patch to move to the new system of
>     writes, including the removal of filelayout_flush_one and
>     filelayout_free_request_data, then have 1 patch which removes the
>     common part of flush_one and 1 patch to remove the common parts of
>     free_request_data.  How about that?
>
>     Dean
>
>
>         -->Andy
>
>
>            If you want, I could split it up and send it out just for
>         interest
>            sake.
>            Dean
>
>
>                -->Andy
>
>
>                On Tue, May 6, 2008 at 7:45 PM, Dean Hildebrand
>                <seattleplus at gmail.com> wrote:
>
>                   details: The flush_one layout driver interface
>         function is a
>                   specialized
>                   function only for the file layout.  Its purpose was to
>                calculate
>                   the correct data server for each write request so this
>                calculation
>                   can be avoided when the data is actually sent on the
>         wire.
>
> I may be wrong, but when the non O_DIRECT write path is used, the 
> filelayout_pgtest is supposed to guarantee that
> the pagelist supplied to filelayout_write_pagelist all goes to one 
> data server. this is what made the filelayout_flush routine obsolete. 
> have you verified this?
My understanding of pg_test is that it tells the common code to keep 
gathering pages together until pg_test tells it that 2 pages cannot be 
gathered together.  At this point it stops gathering the pages together 
and sends the pages that were gathered together to write_pagelist.  So 
all the pages sent to write_pagelist are going to the same ds.
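
A minimal sketch of the gathering behaviour described above, for illustration
only. It is not the kernel code: sketch_req, sketch_pg_test and sketch_gather
are hypothetical names, 4 KiB pages are assumed, and the contiguity check that
the common code performs is folded into the predicate for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a page request; only the page index matters. */
struct sketch_req {
	uint64_t index; /* page index within the file */
};

/* Number of the stripe unit that a page falls into. */
uint64_t sketch_stripe_no(uint64_t page_index, uint64_t stripe_unit)
{
	/* Same assumption as the patch: stripe unit >= page size. */
	assert(stripe_unit >= ((uint64_t)1 << SKETCH_PAGE_SHIFT));
	return (page_index << SKETCH_PAGE_SHIFT) / stripe_unit;
}

/* pg_test-style predicate: may req be gathered together with prev? */
int sketch_pg_test(const struct sketch_req *prev,
		   const struct sketch_req *req, uint64_t stripe_unit)
{
	return req->index == prev->index + 1 &&
	       sketch_stripe_no(prev->index, stripe_unit) ==
	       sketch_stripe_no(req->index, stripe_unit);
}

/*
 * Number of requests, starting at reqs[start], that would be handed to
 * write_pagelist in one call: gathering stops at the first pair of
 * pages for which the predicate fails.
 */
size_t sketch_gather(const struct sketch_req *reqs, size_t n,
		     size_t start, uint64_t stripe_unit)
{
	size_t len = 1;

	assert(start < n);
	while (start + len < n &&
	       sketch_pg_test(&reqs[start + len - 1], &reqs[start + len],
			      stripe_unit))
		len++;
	return len;
}
```

With a 16 KiB stripe unit, pages 0-3 are gathered into one call and pages 4-5
into the next, so each call covers exactly one stripe unit.
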


> your code should be able to find the data server for the first page in
> the filelayout_write_pagelist pagelist and use it for all other pages.
> if this is not true, then the filelayout_pgtest is incorrect.
Isn't this what my code is doing?  From my tests, the offset and count 
sent to write_pagelist are broken down per stripe unit and so the 
following command gets the correct ds.
status = nfs4_pnfs_dserver_get(data->lseg,....)

Do you think my code is letting data get to the wrong ds's?
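
To make the question concrete, here is a rough sketch of the property being
relied on. It assumes plain round-robin striping and is not what
nfs4_pnfs_dserver_get actually does internally; sketch_ds_index and
sketch_range_in_one_stripe are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index of the data server that holds a given file offset, assuming
 * stripe units are laid out round-robin across num_ds servers.
 */
uint32_t sketch_ds_index(uint64_t offset, uint64_t stripe_unit,
			 uint32_t num_ds)
{
	assert(stripe_unit > 0 && num_ds > 0);
	return (uint32_t)((offset / stripe_unit) % num_ds);
}

/*
 * Returns 1 if the byte range [offset, offset + count) lies inside a
 * single stripe unit, 0 otherwise. If this holds, looking up the data
 * server for the start offset is enough for the whole range.
 */
int sketch_range_in_one_stripe(uint64_t offset, uint64_t count,
			       uint64_t stripe_unit)
{
	assert(stripe_unit > 0 && count > 0);
	return offset / stripe_unit == (offset + count - 1) / stripe_unit;
}
```

If every (offset, count) pair reaching write_pagelist passes the second
check, a single lookup per call cannot send data to the wrong ds.
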
>
> as you know, the O_DIRECT path does not use the pg_test logic, and so 
> needs to find the data server for each page in the 
> filelayout_write_pagelist pagelist.
Yes, that calculation will be in the o_direct patches, which I'm having 
trouble making progress on since the latest session changes broke 
o_direct.  I first need to figure out how to get sessions and o_direct 
working; any ideas are welcome :)
Dean
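
For reference, the per-range calculation that the O_DIRECT path would need
could look roughly like the sketch below. This is only an illustration under
the assumption of round-robin striping, not the planned o_direct patch;
sketch_chunk and sketch_split_range are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a request that fits inside a single stripe unit. */
struct sketch_chunk {
	uint64_t offset; /* file offset where the piece starts */
	uint64_t count;  /* length of the piece in bytes */
	uint32_t ds_idx; /* data server index, round-robin assumed */
};

/*
 * Split [offset, offset + count) at stripe unit boundaries. Writes at
 * most max_out chunks into out and returns how many were written.
 */
size_t sketch_split_range(uint64_t offset, uint64_t count,
			  uint64_t stripe_unit, uint32_t num_ds,
			  struct sketch_chunk *out, size_t max_out)
{
	size_t n = 0;

	assert(stripe_unit > 0 && num_ds > 0);
	while (count > 0 && n < max_out) {
		/* Bytes left before the next stripe unit boundary. */
		uint64_t room = stripe_unit - (offset % stripe_unit);
		uint64_t len = count < room ? count : room;

		out[n].offset = offset;
		out[n].count = len;
		out[n].ds_idx = (uint32_t)((offset / stripe_unit) % num_ds);
		n++;
		offset += len;
		count -= len;
	}
	return n;
}
```

Each resulting chunk can then be sent to a single data server, which is the
guarantee pg_test provides on the buffered path.
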
>
>  
>
>          It
>                   was decided that having such a specialized function
>         (plus its
>                   associated private state) was not worth
>                   the effort.  In addition, O_DIRECT needs to redo this
>                calculation
>                   anyways.
>
>                   Signed-off-by: Dean Hildebrand <dhildeb at us.ibm.com>
>
>
>                   ---
>                    fs/nfs/nfs4filelayout.c    |  356
>                   +++++++++++++-------------------------------
>                    fs/nfs/nfs4filelayout.h    |    8 +-
>                    fs/nfs/nfs4filelayoutdev.c |   26 +++-
>                    fs/nfs/pagelist.c          |    3 -
>                    fs/nfs/pnfs.c              |   45 ------
>                    fs/nfs/pnfs.h              |    2 -
>                    fs/nfs/write.c             |    4 -
>                    include/linux/nfs4_pnfs.h  |    3 -
>                    include/linux/nfs_page.h   |    4 -
>                    9 files changed, 125 insertions(+), 326 deletions(-)
>
>                   diff --git a/fs/nfs/nfs4filelayout.c
>         b/fs/nfs/nfs4filelayout.c
>                   index 4f4748b..5d9b551 100644
>                   --- a/fs/nfs/nfs4filelayout.c
>                   +++ b/fs/nfs/nfs4filelayout.c
>                   @@ -240,14 +240,7 @@ struct rpc_call_ops
>                filelayout_write_call_ops = {
>                    * allows the original read/write data structs to be
>         passed
>                in the
>                    * last argument.
>                    *
>                   -
>                   - * This is called after the pNFS client has already
>                created, so I
>                   pass it
>                   - * in via the last argument (void*).  I think this
>         is the only
>                   way as there
>                   - * are just too many NFS specific arguments in the
>         read/write
>                   data structs
>                   - * to pass to the layout drivers.
>                   - *
>                   - * TODO:
>                   - * 1. This is a lot of arguments, create special
>                non-nfs-specific
>                   structure?
>                   + * TODO: join with write_pagelist?
>                    */
>                    static int filelayout_read_pagelist(
>                          struct pnfs_layout_type *layoutid,
>                   @@ -259,12 +252,11 @@ static int
>         filelayout_read_pagelist(
>                          struct nfs_read_data *data)
>                    {
>                          struct inode *inode = PNFS_INODE(layoutid);
>                   -       struct nfs4_filelayout *nfslay = NULL;
>                          struct nfs4_filelayout_segment *flseg;
>                          struct nfs4_pnfs_dserver dserver;
>                   +       struct nfs4_pnfs_ds *ds;
>                          int status;
>
>                   -       nfslay = PNFS_LD_DATA(layoutid);
>                          flseg = LSEG_LD_DATA(data->lseg);
>
>                          /* Retrieve the correct rpc_client for the
>         byte range */
>                   @@ -280,7 +272,9 @@ static int filelayout_read_pagelist(
>                                  data->args.fh = NFS_FH(inode);
>                                  status = 0;
>                          } else {
>                   -               struct nfs4_pnfs_ds *ds =
>                dserver.dev->ds_list[0];
>                   +               ds = dserver.dev->ds_list[0];
>                   +
>                   +               dprintk("%s USE DS:ip %x\n", __func__,
>                   htonl(ds->ds_ip_addr));
>
>                                  /* just try the first data server for the
>                index..*/
>                                  data->pnfs_client =
>         ds->ds_clp->cl_rpcclient;
>                   @@ -316,179 +310,7 @@ print_ds(struct nfs4_pnfs_ds *ds)
>                          dprintk("        ds->ds_count %d\n",
>                   atomic_read(&ds->ds_count));
>                    }
>
>                   -static struct nfs4_pnfs_dserver *
>                   -filelayout_create_dserver(void)
>                   -{
>                   -       struct nfs4_pnfs_dserver *local;
>                   -
>                   -       local = kzalloc(sizeof(*local), GFP_KERNEL);
>                   -       if (!local)
>                   -               return NULL;
>                   -       kref_init(&local->ref);
>                   -       dprintk("<-- %s dserver %p\n", __func__, local);
>                   -       return local;
>                   -}
>                   -
>                   -static void filelayout_free_dserver(struct kref *kref)
>                   -{
>                   -       struct nfs4_pnfs_dserver *dserver;
>                   -       dserver = container_of(kref, struct
>                nfs4_pnfs_dserver, ref);
>                   -
>                   -       dprintk("--> %s dserver %p\n", __func__,
>         dserver);
>                   -       kfree(dserver);
>                   -}
>                   -
>                   -static void filelayout_release_dserver(struct
>                nfs4_pnfs_dserver
>                   *dserver)
>                   -{
>                   -       kref_put(&dserver->ref,
>         filelayout_free_dserver);
>                   -}
>                   -
>                   -static void filelayout_get_dserver(struct
>                nfs4_pnfs_dserver *dserver)
>                   -{
>                   -       kref_get(&dserver->ref);
>                   -}
>                   -
>                   -/*
>                   - * Called by nfs_release_request()
>                   - */
>                   -void
>                   -filelayout_free_request_data(struct nfs_page *req)
>                   -{
>                   -       struct nfs4_pnfs_dserver *dserver;
>                   -
>                   -       dserver = (struct nfs4_pnfs_dserver
>         *)req->wb_private;
>                   -       BUG_ON(!dserver);
>                   -       filelayout_release_dserver(dserver);
>                   -}
>                   -
>                   -static struct nfs4_pnfs_ds *
>                   -filelayout_create_init_ds(struct
>         pnfs_layout_segment *lseg,
>                   -                       loff_t file_offset, size_t
>         wb_bytes,
>                   -                       struct nfs4_pnfs_dserver **dsp)
>                   -{
>                   -       struct nfs4_pnfs_dserver *dserver;
>                   -       struct nfs4_pnfs_ds *ds;
>                   -       int status = -ENOMEM;
>                   -
>                   -       *dsp = dserver = filelayout_create_dserver();
>                   -       if (!dserver) {
>                   -               dprintk("%s failed to get dserver.
>         status
>                %d\n",
>                   -                                       __func__,
>         status);
>                   -               goto out_err;
>                   -       }
>                   -
>                   -       /* get the data server that serves this page */
>                   -       status = nfs4_pnfs_dserver_get(lseg,
>         file_offset,
>                   wb_bytes, dserver);
>                   -
>                   -       if (status != 0) {
>                   -               dprintk("%s failed to get dataserver.
>                status %d\n",
>                   -                                              
>         __func__,
>                status);
>                   -               filelayout_release_dserver(dserver);
>                   -               status =  -EIO;
>                   -               goto out_err;
>                   -       }
>                   -       /* just try the first multipath data server */
>                   -       ds = dserver->dev->ds_list[0];
>                   -
>                   -       return ds;
>                   -out_err:
>                   -       return ERR_PTR(status);
>                   -}
>                   -
>                   -/*
>                   -* feed nfs_flush_one with per data server pages.
>                   -*
>                   -* Assume stripesz >= PAGE_SIZE.
>                   -* TODO: If stripesz < PAGE_SIZE, use i/o through MDS
>                   -*
>                   -*/
>                   -int filelayout_flush_one(struct pnfs_layout_segment
>         *lseg,
>                   -                        struct list_head *head,
>         unsigned
>                int npages,
>                   -                        size_t count, int how)
>                   -{
>                   -       struct nfs4_pnfs_dserver *dserver = NULL;
>                   -       struct nfs4_pnfs_ds *ds = NULL;  /* current
>         stripe data
>                   server */
>                   -       struct nfs_page *req;
>                   -       loff_t file_offset = 0, stripe_offset, temp;
>                   -       size_t stripesz, dstotal = 0;
>                   -       struct list_head dslist;
>                   -       int status = -ENOMEM, use_ds = 0, ndspages = 0;
>                   -
>                   -       dprintk("--> %s npages %d, count %Zd, lseg
>         %p\n",
>                __func__,
>                   -                               npages, count, lseg);
>                   -
>                   -       INIT_LIST_HEAD(&dslist);
>                   -       stripesz =
>         filelayout_get_stripesize(lseg->layout);
>                   -       dprintk("%s stripesize %Zd\n", __func__,
>         stripesz);
>                   -
>                   -       /* split up the list according to DS */
>                   -       while (!list_empty(head)) {
>                   -next_ds:
>                   -               req = nfs_list_entry(head->next);
>                   -
>                   -               file_offset = req->wb_index <<
>                PAGE_CACHE_SHIFT;
>                   -
>                   -               if (!use_ds) {
>                   -                       ds =
>         filelayout_create_init_ds(lseg,
>                   file_offset,
>                   -                                                  
>                      req->wb_bytes, &dserver);
>                   -                       if (IS_ERR(ds)) {
>                   -                               status = PTR_ERR(ds);
>                   -                               goto out;
>                   -                       }
>                   -                       /* reset for new data server */
>                   -                       dstotal = 0;
>                   -                       ndspages = 0;
>                   -                       use_ds = 1;
>                   -               } else
>                   -                       filelayout_get_dserver(dserver);
>                   -
>                   -               req->wb_ops = &filelayout_io_operations;
>                   -               req->wb_private = dserver;
>                   -
>                   -               /* move request to dslist */
>                   -               nfs_list_remove_request(req);
>                   -               nfs_list_add_request(req, &dslist);
>                   -               ndspages++;
>                   -               npages--;
>                   -
>                   -               count -= req->wb_bytes;
>                   -               dstotal += req->wb_bytes;
>                   -
>                   -               /* Are we done with this DS? */
>                   -               temp = file_offset + req->wb_bytes;
>                   -               stripe_offset = do_div(temp, stripesz);
>                   -
>                   -               if (count == 0 || stripe_offset == 0) {
>                   -                       use_ds = 0;
>                   -                       goto send;
>                   -               }
>                   -       }
>                   -send:
>                   -       /* XXX should recover to send through MDS */
>                   -       dprintk("%s Send: ndspages %d dstotal %Zd
>                list_empty %d \n",
>                   -                               __func__, ndspages,
>         dstotal,
>                   list_empty(head));
>                   -       status = nfs_flush_one(PNFS_INODE(lseg->layout),
>                &dslist,
>                   ndspages,
>                   -                              dstotal, how);
>                   -       if (status < 0)
>                   -               goto out;
>                   -
>                   -       /* Is there more data to process? */
>                   -       if (!list_empty(head)) {
>                   -               /* count and npages better not be
>         zero! */
>                   -               dprintk("%s next_ds count %Zd npages
>         %d\n",
>                   -                               __func__, count,
>         npages);
>                   -               goto next_ds;
>                   -       }
>                   -
>                   -out:
>                   -       dprintk("<-- %s npages %d (should be zero!)\n",
>                __func__,
>                   npages);
>                   -       return status;
>                   -}
>                   -
>                   -/* Perform async writes.
>                   - *
>                   - * TODO: See filelayout_read_pagelist.
>                   - */
>                   +/* Perform async writes. */
>                    static int filelayout_write_pagelist(
>                          struct pnfs_layout_type *layoutid,
>                          struct page **pages,
>                   @@ -499,48 +321,52 @@ static int
>         filelayout_write_pagelist(
>                          int sync,
>                          struct nfs_write_data *data)
>                    {
>                   +       struct inode *inode = PNFS_INODE(layoutid);
>                          struct nfs4_filelayout_segment *flseg =
>                   LSEG_LD_DATA(data->lseg);
>                   -       struct nfs4_pnfs_dserver *dserver = NULL;
>                   +       struct nfs4_pnfs_dserver dserver;
>                          struct nfs4_pnfs_ds *ds;
>                   -       struct nfs_page *req = NULL;
>                   -       struct list_head *h;
>                   +       int status;
>
>                   -       dprintk("--> %s nr_pages %d offset:count
>                %Lu:%Zu\n", __func__,
>                   -                                              
>         nr_pages,
>                offset,
>                   count);
>                   +       dprintk("--> %s ino %lu nr_pages %d pgbase
>         %u req
>                %Zu@%Lu
>                   sync %d\n",
>                   +               __func__, inode->i_ino, nr_pages,
>         pgbase,
>                count,
>                   offset, sync);
>
>                          /* Retrieve the correct rpc_client for the
>         byte range */
>                   -       list_for_each(h, &data->pages) {
>                   -               req = list_entry(h, struct nfs_page,
>         wb_list);
>                   -               break;
>                   -       }
>                   -       BUG_ON(!req);
>                   -
>                   -       dserver = (struct nfs4_pnfs_dserver
>         *)req->wb_private;
>                   -       BUG_ON(!dserver);
>                   +       status = nfs4_pnfs_dserver_get(data->lseg,
>                   +                                      offset,
>                   +                                      count,
>                   +                                      &dserver);
>
>                   -       /* use the first multipath data server */
>                   -       ds = dserver->dev->ds_list[0];
>                   -       dprintk("%s USE DS:\n", __func__);
>                   -       print_ds(ds);
>                   +       if (status) {
>                   +               printk(KERN_ERR "%s: dserver get failed
>                status %d
>                   use MDS\n",
>                   +                      __func__, status);
>                   +               data->pnfs_client = NFS_CLIENT(inode);
>                   +               data->ds_nfs_client = NULL;
>                   +               data->args.fh = NFS_FH(inode);
>                   +               status = 0;
>                   +       } else {
>                   +               /* use the first multipath data
>         server */
>                   +               ds = dserver.dev->ds_list[0];
>
>                   -       data->pnfs_client = ds->ds_clp->cl_rpcclient;
>                   -       data->ds_nfs_client = ds->ds_clp;
>                   -       data->args.fh = dserver->fh;
>                   +               dprintk("%s ino %lu %Zu@%Lu
>         DS:%x:%hu\n",
>                   +                       __func__, inode->i_ino,
>         count, offset,
>                   +                       htonl(ds->ds_ip_addr),
>                ntohs(ds->ds_port));
>
>                   -       dprintk("%s using DS %x:%hu\n", __func__,
>                   -               htonl(ds->ds_ip_addr),
>         ntohs(ds->ds_port));
>                   +               data->pnfs_client =
>         ds->ds_clp->cl_rpcclient;
>                   +               data->ds_nfs_client = ds->ds_clp;
>                   +               data->args.fh = dserver.fh;
>
>                   -       /* Get the file offset on the dserver. Set the
>                write offset to
>                   -        * this offset and save the original offset.
>                   -        */
>                   -       data->args.offset =
>                filelayout_get_dserver_offset(offset,
>                   flseg);
>                   -       data->orig_offset = offset;
>                   +               /* Get the file offset on the
>         dserver. Set the
>                   write offset to
>                   +                * this offset and save the original
>         offset.
>                   +                */
>                   +               data->args.offset =
>                   filelayout_get_dserver_offset(offset, flseg);
>                   +               data->orig_offset = offset;
>                   +       }
>
>                          /* Perform an asynchronous write The offset
>         will be
>                reset
>                   in the
>                           * call_ops->rpc_call_done() routine
>                           */
>                          nfs_initiate_write(data, data->pnfs_client,
>                   -                       &filelayout_write_call_ops,
>         sync);
>                   +                        
>          &filelayout_write_call_ops, sync);
>
>                          return 0;
>                    }
>                   @@ -639,24 +465,6 @@ filelayout_free_lseg(struct
>                   pnfs_layout_segment *lseg)
>                    }
>
>                    /*
>                   - * Do two nfs_pnfs_dserver pointers point to the same
>                structure?
>                   - * Just compare the first multipath servers.
>                   - */
>                   -static int
>                   -filelayout_same_ds(struct nfs4_pnfs_dserver *one,
>         struct
>                   nfs4_pnfs_dserver *two)
>                   -{
>                   -       struct nfs4_pnfs_dev *d_one = one->dev, *d_two =
>                two->dev;
>                   -       struct nfs4_pnfs_ds *ds_one, *ds_two;
>                   -
>                   -       ds_one = d_one->ds_list[0];
>                   -       ds_two = d_two->ds_list[0];
>                   -       return (d_one->stripe_index ==
>         d_two->stripe_index &&
>                   -               d_one->num_ds == d_two->num_ds &&
>                   -               ds_one->ds_ip_addr ==
>         ds_two->ds_ip_addr &&
>                   -               ds_one->ds_port == ds_two->ds_port);
>                   -}
>                   -
>                   -/*
>                    * Allocate a new nfs_write_data struct and initialize
>                    */
>                    static struct nfs_write_data *
>                   @@ -693,66 +501,102 @@ filelayout_commit(struct
>                pnfs_layout_type
>                   *layoutid, int sync,
>                    {
>                          struct nfs4_filelayout_segment *nfslay;
>                          struct nfs_write_data   *dsdata = NULL;
>                   -       struct nfs4_pnfs_dserver *dserver = NULL;
>                   +       struct nfs4_pnfs_dserver dserver;
>                          struct nfs4_pnfs_ds *ds;
>                   -       struct nfs_page *req;
>                   +       struct nfs_page *req, *reqt;
>                          struct list_head *pos, *tmp, *head =
>         &data->pages;
>                   +       loff_t file_offset, comp_offset;
>                   +       size_t stripesz, cbytes;
>                   +       int status;
>                   +       struct nfs4_pnfs_dev_item *di;
>                   +       u32 idx1, idx2;
>
>                          nfslay = LSEG_LD_DATA(data->lseg);
>
>                   -       dprintk("%s data %p pnfs_client %p nfslay %p\n",
>                   -                       __func__, data,
>         data->pnfs_client,
>                nfslay);
>                   +       dprintk("%s data %p pnfs_client %p nfslay %p
>         sync
>                %d\n",
>                   +               __func__, data, data->pnfs_client,
>         nfslay,
>                sync);
>
>                          if (nfslay->commit_through_mds) {
>                                  dprintk("%s data %p commit through
>         mds\n",
>                   __func__, data);
>                                  return 1;
>                          }
>
>                   +       stripesz = filelayout_get_stripesize(layoutid);
>                   +       dprintk("%s stripesize %Zd\n", __func__,
>         stripesz);
>                   +
>                   +       di =
>         nfs4_pnfs_device_item_get(FILE_MT(data->inode),
>                   NFS_FH(data->inode),
>                   +                                      &nfslay->dev_id);
>                   +       if (di == NULL) {
>                   +               status = -EIO;
>                   +               goto out_bad;
>                   +       }
>                   +
>                          /* COMMIT to each Data Server */
>                          while (!list_empty(head)) {
>                   -
>                   -               /* dserver and dsdata must be NULL */
>                                  req = nfs_list_entry(head->next);
>
>                   -               dserver = (struct nfs4_pnfs_dserver
>                *)req->wb_private;
>                   -               /* XXX BUG_ON(!dserver) ?*/
>                   -               if (!dserver)
>                   +               file_offset = req->wb_index <<
>                PAGE_CACHE_SHIFT;
>                   +
>                   +               /* Get dserver for the current page */
>                   +               status =
>         nfs4_pnfs_dserver_get(data->lseg,
>                   +                                            
>          file_offset,
>                   +                                            
>          req->wb_bytes,
>                   +                                            
>          &dserver);
>                   +
>                   +               /* Get its index */
>                   +               idx1 =
>                filelayout_dserver_get_index(file_offset,
>                   di, nfslay);
>                   +
>                   +               if (status) {
>                   +                       status = -EIO;
>                                          goto out_bad;
>                   +               }
>
>                                  dsdata =
>         filelayout_clone_write_data(data);
>                   -               if (!dsdata)
>                   +               if (!dsdata) {
>                   +                       status = -ENOMEM;
>                                          goto out_bad;
>                   +               }
>
>                                  /* Just try the first multipath data
>         server */
>                   -               ds = dserver->dev->ds_list[0];
>                   +               ds = dserver.dev->ds_list[0];
>                                  dsdata->pnfs_client =
>         ds->ds_clp->cl_rpcclient;
>                                  dsdata->ds_nfs_client = ds->ds_clp;
>                   -               dsdata->args.fh = dserver->fh;
>                   -
>                   -               /* Gather all pages going to the
>         data server */
>                   +               dsdata->args.fh = dserver.fh;
>                   +               cbytes = req->wb_bytes;
>                   +
>                   +               /* Gather all pages going to the
>         current data
>                   server by
>                   +                * comparing their indices.
>                   +                * XXX: This recalculates the indices
>                unecessarily.
>                   +                *      One idea would be to calc
>         the index for
>                   every page
>                   +                *      and then compare if they are the
>                same. */
>                                  list_for_each_safe(pos, tmp, head) {
>                   -                       req = nfs_list_entry(pos);
>                   -                       if (filelayout_same_ds(dserver,
>                   req->wb_private)) {
>                   -                              
>         nfs_list_remove_request(req);
>                   -                              
>         nfs_list_add_request(req,
>                   &dsdata->pages);
>                   +                       reqt = nfs_list_entry(pos);
>                   +                       comp_offset = reqt->wb_index <<
>                   PAGE_CACHE_SHIFT;
>                   +                       idx2 =
>                   filelayout_dserver_get_index(comp_offset, di, nfslay);
>                   +                       if (idx1 == idx2) {
>                   +                              
>         nfs_list_remove_request(reqt);
>                   +                              
>         nfs_list_add_request(reqt,
>                   &dsdata->pages);
>                   +                               cbytes +=
>         reqt->wb_bytes;
>                                          }
>                                  }
>
>                   +               dprintk("%s: Initiating commit: %Zu@%llu
>                USE DS:\n",
>                   +                       __func__, cbytes, file_offset);
>                   +               print_ds(ds);
>                   +
>                                  /* Send COMMIT to data server */
>                                  nfs_initiate_commit(dsdata,
>                dsdata->pnfs_client, sync);
>                   -
>                   -               /* reset for next data server */
>                   -               dsdata = NULL;
>                   -               dserver = NULL;
>                          }
>                          /* Release original commit data since it is
>         not used */
>                          nfs_commit_free(data);
>                          return 0;
>
>                    out_bad:
>                   +       printk(KERN_ERR "%s: dserver get failed
>         status %d\n",
>                   __func__, status);
>                   +
>                          /* XXX should we send COMMIT to MDS e.g. not free
>                data and
>                   return 1 ? */
>                          nfs_commit_free(data);
>                   -       return -ENOMEM;
>                   +       return status;
>                    }
>
>                    /* Return the stripesize for the specified file.
>                   @@ -803,6 +647,10 @@ boundary:
>                          r_stripe = req->wb_index << PAGE_CACHE_SHIFT;
>                          do_div(r_stripe, pgio->pg_boundary);
>
>                   +#if 0
>                   +       dprintk("%s p %u r %u bnd %d bsize
>         %Zu\n",__func__,
>                   p_stripe, r_stripe, pgio->pg_boundary, pgio->pg_bsize);
>                   +#endif
>                   +
>                          return (p_stripe == r_stripe);
>                    }
>
>                   @@ -831,8 +679,6 @@ struct layoutdriver_io_operations
>                   filelayout_io_operations = {
>                          .commit                  = filelayout_commit,
>                          .read_pagelist           =
>         filelayout_read_pagelist,
>                          .write_pagelist          =
>         filelayout_write_pagelist,
>                   -       .flush_one               = filelayout_flush_one,
>                   -       .free_request_data       =
>                filelayout_free_request_data,
>                          .alloc_layout            =
>         filelayout_alloc_layout,
>                          .free_layout             =
>         filelayout_free_layout,
>                          .alloc_lseg              = filelayout_alloc_lseg,
>                   diff --git a/fs/nfs/nfs4filelayout.h
>         b/fs/nfs/nfs4filelayout.h
>                   index d0aeb78..4ec98ab 100644
>                   --- a/fs/nfs/nfs4filelayout.h
>                   +++ b/fs/nfs/nfs4filelayout.h
>                   @@ -74,7 +74,6 @@ struct nfs4_pnfs_devlist {
>                    struct nfs4_pnfs_dserver {
>                          struct nfs_fh        *fh;
>                          struct nfs4_pnfs_dev *dev;
>                   -       struct kref           ref;
>                    };
>
>                    struct nfs4_filelayout_segment {
>                   @@ -105,7 +104,6 @@ extern struct pnfs_client_operations
>                   *pnfs_callback_ops;
>                    char *deviceid_fmt(const struct pnfs_deviceid *dev_id);
>                    int  nfs4_pnfs_devlist_init(struct nfs4_pnfs_dev_hlist
>                *hlist);
>                    void nfs4_pnfs_devlist_destroy(struct
>         nfs4_pnfs_dev_hlist
>                *hlist);
>                   -
>                    int nfs4_pnfs_dserver_get(struct
>         pnfs_layout_segment *lseg,
>                                            loff_t offset,
>                                            size_t count,
>                   @@ -114,6 +112,12 @@ int
>         decode_and_add_devicelist(struct
>                   filelayout_mount_type *mt, struct pnfs_devi
>                    int process_deviceid_list(struct
>         filelayout_mount_type *mt,
>                                            struct nfs_fh *fh,
>                                            struct pnfs_devicelist
>         *devlist);
>                   +struct nfs4_pnfs_dev_item *
>         nfs4_pnfs_device_item_get(struct
>                   filelayout_mount_type *mt,
>                   +                                                  
>           struct
>                   nfs_fh *fh,
>                   +                                                  
>           struct
>                   pnfs_deviceid *dev_id);
>                   +u32 filelayout_dserver_get_index(loff_t offset,
>                   +                                struct
>         nfs4_pnfs_dev_item *di,
>                   +                                struct
>         nfs4_filelayout_segment
>                   *layout);
>
>                    #define READ32(x)         (x) = ntohl(*p++)
>                    #define READ64(x)         do {                 \
>                   diff --git a/fs/nfs/nfs4filelayoutdev.c
>                b/fs/nfs/nfs4filelayoutdev.c
>                   index dce1bd1..ead8a53 100644
>                   --- a/fs/nfs/nfs4filelayoutdev.c
>                   +++ b/fs/nfs/nfs4filelayoutdev.c
>                   @@ -695,6 +695,22 @@ nfs4_pnfs_device_item_get(struct
>                   filelayout_mount_type *mt,
>                          return dev;
>                    }
>
>                   +/* Want res = ((offset / layout->stripe_unit) %
>                di->stripe_count)
>                   + * Then: ((res + fsi) % di->stripe_count)
>                   + */
>                   +u32
>                   +filelayout_dserver_get_index(loff_t offset,
>                   +                            struct
>         nfs4_pnfs_dev_item *di,
>                   +                            struct
>         nfs4_filelayout_segment
>                *layout)
>                   +{
>                   +       u64 tmp, tmp2;
>                   +
>                   +       tmp = offset;
>                   +       do_div(tmp, layout->stripe_unit);
>                   +       tmp2 = do_div(tmp, di->stripe_count) +
>                   layout->first_stripe_index;
>                   +       return do_div(tmp2, di->stripe_count);
>                   +}
>                   +
>                    /* Retrieve the rpc client for a specified byte range
>                    * in 'inode' by filling in the contents of 'dserver'.
>                    */
>                   @@ -718,15 +734,9 @@ nfs4_pnfs_dserver_get(struct
>                   pnfs_layout_segment *lseg,
>                          if (di == NULL)
>                                  return 1;
>
>                   -       /* Want res = ((offset / layout->stripe_unit) %
>                   di->stripe_count)
>                   -        * Then: ((res + fsi) % di->stripe_count)
>                   -        */
>                   -
>                   -       tmp = offset;
>                   -       do_div(tmp, layout->stripe_unit);
>                   -       tmp2 = do_div(tmp, di->stripe_count) +
>                   layout->first_stripe_index;
>                   -       stripe_idx = do_div(tmp2, di->stripe_count);
>                   +       stripe_idx =
>         filelayout_dserver_get_index(offset,
>                di, layout);
>
>                   +       /* For debugging, ensure entire requested
>         range is
>                in this
>                   dserver */
>                          tmp = offset + count - 1;
>                          do_div(tmp, layout->stripe_unit);
>                          tmp2 = do_div(tmp, di->stripe_count) +
>                   layout->first_stripe_index;
>                   diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
>                   index b99fa0a..2b4cb44 100644
>                   --- a/fs/nfs/pagelist.c
>                   +++ b/fs/nfs/pagelist.c
>                   @@ -170,9 +170,6 @@ static void nfs_free_request(struct
>                kref *kref)
>                          /* Release struct file or cached credential */
>                          nfs_clear_request(req);
>                          put_nfs_open_context(req->wb_context);
>                   -#ifdef CONFIG_PNFS
>                   -       pnfs_free_request_data(req);
>                   -#endif /* CONFIG_PNFS */
>                          nfs_page_free(req);
>                    }
>
>                   diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>                   index 27e37be..9f504df 100644
>                   --- a/fs/nfs/pnfs.c
>                   +++ b/fs/nfs/pnfs.c
>                   @@ -1338,40 +1338,6 @@ pnfs_writeback_done(struct
>                nfs_write_data
>                   *data)
>                    }
>
>                    /*
>                   - * return 0 for success, 1 for legacy nfs fallback,
>                negative for
>                   error
>                   - */
>                   -int
>                   -pnfs_flush_one(struct inode *inode, struct
>         list_head *head,
>                   -                unsigned int npages, size_t count,
>         int how)
>                   -{
>                   -       struct nfs_server *nfss = NFS_SERVER(inode);
>                   -       struct layoutdriver_io_operations *io_ops;
>                   -       struct nfs_page *req;
>                   -       struct pnfs_layout_segment *lseg;
>                   -       int status;
>                   -
>                   -       if (!pnfs_enabled_sb(nfss) ||
>                   !nfss->pnfs_curr_ld->ld_io_ops->flush_one)
>                   -               goto fallback;
>                   -
>                   -       req = nfs_list_entry(head->next);
>                   -       status = pnfs_update_layout(inode,
>                   -                                   req->wb_context,
>                   -                                   count,
>                   -                                   req->wb_offset,
>                   -                                   IOMODE_RW,
>                   -                                   &lseg);
>                   -       if (status)
>                   -               goto fallback;
>                   -       io_ops = nfss->pnfs_curr_ld->ld_io_ops;
>                   -       status = io_ops->flush_one(lseg, head, npages,
>                count, how);
>                   -       put_lseg(lseg);
>                   -
>                   -       return status;
>                   -fallback:
>                   -       return nfs_flush_one(inode, head, npages,
>         count, how);
>                   -}
>                   -
>                   -/*
>                    * Obtain a layout for the write range, and call
>                do_sync_write.
>                    *
>                    * Unlike the read path which can wait until page
>         coalescing
>                   @@ -1965,17 +1931,6 @@ void
>                _pnfs_modify_new_write_request(struct
>                   nfs_page *req,
>                          }
>                    }
>
>                   -void pnfs_free_request_data(struct nfs_page *req)
>                   -{
>                   -       struct layoutdriver_io_operations *lo;
>                   -
>                   -       if (!req->wb_ops || !req->wb_private)
>                   -               return;
>                   -       lo = (struct layoutdriver_io_operations
>         *)req->wb_ops;
>                   -       if (lo->free_request_data)
>                   -               lo->free_request_data(req);
>                   -}
>                   -
>                    void pnfs_free_fsdata(struct pnfs_fsdata *fsdata)
>                    {
>                          kfree(fsdata);
>                   diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>                   index acbce4b..95ffbd3 100644
>                   --- a/fs/nfs/pnfs.h
>                   +++ b/fs/nfs/pnfs.h
>                   @@ -53,8 +53,6 @@ int _pnfs_try_to_commit(struct
>                nfs_write_data *);
>                    void pnfs_pageio_init_read(struct
>         nfs_pageio_descriptor *,
>                struct
>                   inode *, struct nfs_open_context *, struct list_head *,
>                size_t *);
>                    void pnfs_pageio_init_write(struct
>         nfs_pageio_descriptor *,
>                   struct inode *);
>                    void pnfs_update_layout_commit(struct inode *, struct
>                list_head
>                   *, pgoff_t, unsigned int);
>                   -int pnfs_flush_one(struct inode *, struct list_head *,
>                unsigned
>                   int, size_t, int);
>                   -void pnfs_free_request_data(struct nfs_page *req);
>                    void pnfs_free_fsdata(struct pnfs_fsdata *fsdata);
>                    ssize_t pnfs_file_write(struct file *, const char
>         __user *,
>                   size_t, loff_t *);
>                    void pnfs_get_layout_done(struct pnfs_layout_type *,
>                   diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>                   index 0da5280..9410fad 100644
>                   --- a/fs/nfs/write.c
>                   +++ b/fs/nfs/write.c
>                   @@ -1008,11 +1008,7 @@ static void
>         nfs_pageio_init_write(struct
>                   nfs_pageio_descriptor *pgio,
>                          if (wsize < PAGE_CACHE_SIZE)
>                                  nfs_pageio_init(pgio, inode,
>         nfs_flush_multi,
>                   wsize, ioflags);
>                          else
>                   -#ifdef CONFIG_PNFS
>                   -               nfs_pageio_init(pgio, inode,
>         pnfs_flush_one,
>                   wsize, ioflags);
>                   -#else
>                                  nfs_pageio_init(pgio, inode,
>         nfs_flush_one,
>                wsize,
>                   ioflags);
>                   -#endif /* CONFIG_PNFS */
>                    }
>
>                    /*
>                   diff --git a/include/linux/nfs4_pnfs.h
>                b/include/linux/nfs4_pnfs.h
>                   index c83aae6..290ffd6 100644
>                   --- a/include/linux/nfs4_pnfs.h
>                   +++ b/include/linux/nfs4_pnfs.h
>                   @@ -137,7 +137,6 @@ struct layoutdriver_io_operations {
>                                                 struct page **pages,
>         unsigned int
>                   pgbase,
>                                                 unsigned nr_pages,
>         loff_t offset,
>                   size_t count,
>                                                 int sync, struct
>         nfs_write_data
>                   *nfs_data);
>                   -       int (*flush_one) (struct pnfs_layout_segment
>         *, struct
>                   list_head *head, unsigned int npages, size_t count,
>         int how);
>                          int (*write_begin) (struct
>         pnfs_layout_segment *lseg,
>                   struct page *page,
>                                              loff_t pos, unsigned count,
>                                              struct pnfs_fsdata *fsdata);
>                   @@ -147,8 +146,6 @@ struct layoutdriver_io_operations {
>                          void (*new_request)(struct
>         pnfs_layout_segment *lseg,
>                                              struct nfs_page *req,
>         loff_t pos,
>                   unsigned count,
>                                              struct pnfs_fsdata *fsdata);
>                   -       void (*free_request_data) (struct nfs_page *);
>                   -
>
>                          /* Consistency ops */
>                          /* 2 problems:
>                   diff --git a/include/linux/nfs_page.h
>                b/include/linux/nfs_page.h
>                   index 10329fb..1d6e830 100644
>                   --- a/include/linux/nfs_page.h
>                   +++ b/include/linux/nfs_page.h
>                   @@ -45,10 +45,6 @@ struct nfs_page {
>                          struct kref             wb_kref;        /*
>         reference
>                count */
>                          unsigned long           wb_flags;
>                          struct nfs_writeverf    wb_verf;        /* Commit
>                cookie */
>                   -#ifdef CONFIG_PNFS
>                   -       void                    *wb_ops;        /*
>         pNFS io
>                   operations */
>                   -       void                    *wb_private;
>                   -#endif
>                    };
>
>                    struct nfs_pageio_descriptor {
>                   --
>                   1.5.3.3
>
>
>                   _______________________________________________
>                   pNFS mailing list
>                   pNFS at linux-nfs.org
>
>
>                   http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
>
>
>
>
>
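
For reference, the stripe index arithmetic that the quoted patch factors out into filelayout_dserver_get_index() can be sketched in userspace as below. This is only an illustration under stated assumptions: stripe_index() and its flat integer parameters are hypothetical stand-ins for the kernel function and the stripe_unit, stripe_count and first_stripe_index fields, and plain / and % replace the kernel's do_div().

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the quoted helper.
 * Computes ((offset / stripe_unit) % stripe_count), then rotates the
 * result by first_stripe_index, again modulo stripe_count.
 * Assumes stripe_unit and stripe_count are non-zero.
 */
static uint32_t stripe_index(uint64_t offset, uint32_t stripe_unit,
                             uint32_t stripe_count,
                             uint32_t first_stripe_index)
{
	uint64_t stripe_no = offset / stripe_unit;   /* which stripe unit */
	uint64_t res = stripe_no % stripe_count;     /* position in the stripe */

	return (uint32_t)((res + first_stripe_index) % stripe_count);
}
```

With a 64 KiB stripe unit, four data servers and a first stripe index of 1, offset 0 maps to index 1 and offset 196608 (the fourth stripe unit) wraps around to index 0. Two requests whose offsets give the same index would be gathered for the same data server in the commit loop quoted above.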

