[pnfs] 19 questions

Benny Halevy bhalevy at panasas.com
Thu Feb 7 03:12:14 EST 2008


On Feb. 06, 2008, 23:35 +0200, Dean Hildebrand <seattleplus at gmail.com> wrote:
> 
> William A. (Andy) Adamson wrote:
>>
>> On Feb 6, 2008 2:27 PM, Dean Hildebrand <seattleplus at gmail.com> wrote:
>>
>>
>>
>>     William A. (Andy) Adamson wrote:
>>     >
>>     >
>>     > On Feb 6, 2008 1:26 PM, Dean Hildebrand <seattleplus at gmail.com> wrote:
>>     >
>>     >
>>     >     William A. (Andy) Adamson wrote:
>>     >     >
>>     >     >
>>     >     > On 2/4/08, *Dean Hildebrand* <seattleplus at gmail.com> wrote:
>>     >     >
>>     >     >     Some questions about draft 19 (not 19 questions):
>>     >     >
>>     >     >     1) On the server I need the superblock so I can call the
>>     >     >     getdeviceinfo export operation.
>>     >     >
>>     >     >
>>     >     > why? the deviceid space is global.
>>     >     Well, the devices and the device ids still come from the file system,
>>     >     and the only way I know how to talk to the file system is through the
>>     >     export ops, which requires a superblock.
>>     >
>>     >     To make them globally unique, through coincidence or the fact that we
>>     >     are all very like-minded people, we had already coded up the deviceid
>>     >     on the server as Benny suggested:
>>     >
>>     >     struct pnfs_deviceid {
>>     >            u64     ex_fsid;
>>     >            __be64  devid;
>>     >     };
>>     >
>>     >     One limitation here is that include/linux/nfsd/nfsfh.h defines the
>>     >     sizes for various types of fsids:
>>     >     static inline int key_len(int type)
>>     >     {
>>     >            switch(type) {
>>     >            case FSID_DEV:          return 8;
>>     >            case FSID_NUM:          return 4;
>>     >            case FSID_MAJOR_MINOR:  return 12;
>>     >            case FSID_ENCODE_DEV:   return 8;
>>     >            case FSID_UUID4_INUM:   return 8;
>>     >            case FSID_UUID8:        return 8;
>>     >            case FSID_UUID16:       return 16;
>>     >            case FSID_UUID16_INUM:  return 24;
>>     >            default: return 0;
>>     >            }
>>     >     }
>>     >
>>     >     So if we are limiting the fsid to be only 8 bytes, then there are
>>     >     several types of fsids we cannot support.  I'm wondering if this is a
>>     >     major problem for all OSs or just a hassle for all OSs.  There are
>>     >     workarounds I guess (replace fsid with a made up id in /etc/exports)
>>     >     but they seem like a hassle.  What do people think?
>>     >
>>     >
>>     > svc_export->ex_dentry->d_inode->i_sb gives you a super block for an
>>     > export.
>>     >
>>     > the server can remember which exports (only need one) are pNFS via any
>>     > pNFS call (such as the fs_layout_type GETATTR) with a new
>>     > svc_export->ex_flag, e.g. EX_PNFS_FILE, EX_PNFS_BLOCK, etc.
>>     >
>>     > else, the server can check for the existence of export_op->layout_type
>>     > (e.g. call nfsd_layout_verify) on each exported super_block until it
>>     > finds one that matches the GETDEVICELIST layout type.
>>     Hmmm, the problem I see is that there could be multiple exported file
>>     systems supporting the same layout type. 
>>
>>
>> this is a good question. if the MDS exports two file layout GPFS 
>> filesystems, one could expect a device ID to be "global" across both 
>> file systems.
>> but, if the MDS exports one file layout GPFS file system and one spNFS
>> file system - no way!
>>
>> what does a "global device id" mean? global to layout type/MDS? no. 
>> global to layout type/vendor?
> In 3.3.14, I also found a definition of deviceid which I wish we could 
> keep but seems to not have any support:
> " The device ID is qualified by the layout type and are unique per file 
> system (FSID). "
> I think it should say,
> " The device ID is qualified by the layout type and may be different for 
> each clientid"
> 
> Not sure why a server would want to give different deviceids per 
> clientid, but that seems to be the agreement from the conf call.
>> i say we add a currentfh to GETDEVICE[INFO] and be done with it.
>  From what I remember, the reason they didn't want the currentfh with 
> getdeviceinfo is notifications.  If deviceids are shared across 2 
> fsid's, they don't want to have to send 2 getdeviceinfo calls to set 
> notifications for the same device.  I think that was the issue.
> 
> My current thought, if we keep the current spec, is that nfsd will pass 
> a unique identifier (fsid) as an argument to the getdevicelist and 
> getlayout export ops, so that when the file system builds the deviceid 
> it can make it unique.

Right.  This uniquifier does not have to be the real fsid, it can be any
handle (e.g. index into a table) identifying the exported file system.
And, like you and Trond quoted, this handle need not persist across server
reboot.
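
To make the table-index idea concrete, here is a minimal userspace sketch.
It is not the nfsd code: export_table, export_handle_get(),
export_from_deviceid() and struct deviceid_sketch are hypothetical names
made up for illustration, and the opaque "sb" pointer only stands in for
whatever nfsd would use to identify an exported file system.  The table is
rebuilt whenever the server starts, so the handles are not expected to
survive a reboot.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EXPORTS 64                  /* hypothetical table size */

/* Stand-in for one exported file system (assumption, not nfsd's type). */
struct export_entry {
	const void *sb;                 /* opaque per-export pointer */
	int in_use;
};

/* Zero-initialized at start-up; handles are only valid until reboot. */
static struct export_entry export_table[MAX_EXPORTS];

/* Same 16-byte shape as the pnfs_deviceid quoted in this thread, but the
 * first half carries a table handle instead of a real fsid. */
struct deviceid_sketch {
	uint64_t export_handle;         /* index into export_table */
	uint64_t devid;                 /* device id chosen by the file system */
};

/* Return the handle for sb, registering it on first use; -1 if full. */
static int export_handle_get(const void *sb)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_EXPORTS; i++) {
		if (export_table[i].in_use && export_table[i].sb == sb)
			return i;
		if (!export_table[i].in_use && free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0) {
		export_table[free_slot].sb = sb;
		export_table[free_slot].in_use = 1;
	}
	return free_slot;
}

/* Map a deviceid back to its export; NULL if the handle is unknown. */
static const void *export_from_deviceid(const struct deviceid_sketch *id)
{
	if (id->export_handle >= MAX_EXPORTS ||
	    !export_table[id->export_handle].in_use)
		return NULL;
	return export_table[id->export_handle].sb;
}
```

With something like this, getdeviceinfo would not need the current_fh: the
server reads export_handle out of the deviceid, finds the export, and calls
that file system's export op with the devid half.  The handle always fits
in 8 bytes no matter how long the real fsid is.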

Benny

> 
> Also,
>  From 3.3.14: "A client must not assume that device IDs are valid across 
> metadata server reboots."
> 
> Which I think means that after an MDS reboot, the client may find that 
> devices don't work, but the server should do everything it can to ensure 
> they are the same and that they do work.  Another interpretation is that 
> the client must return all its layouts and retrieve them all over again.
> Dean
> 
>> -->Andy
>>
>>      So matching the exported
>>     layout type to the getdeviceinfo layout type isn't sufficient.  You
>>     would have to call export_op->getdeviceinfo on each exported file
>>     system that matches the layout type until you find one that
>>     recognizes the deviceid.  ugh.
>>
>>     So I think that we still need to somehow encapsulate which export we
>>     are referring to in the deviceid.  But if fsid's can be up to 24
>>     bytes long, we have a problem.
>>
>>     Dean
>>
>>     >     >     The problem is that getdeviceinfo no longer includes
>>     >     >     the current_fh.  Is there a way on the server to get the
>>     >     >     superblock without the filehandle? (maybe using something in
>>     >     >     the session?)  If not, then we will have to encode something
>>     >     >     in the device id that allows us to map to the superblock (is
>>     >     >     the fsid enough?)  Any ideas?
>>     >     >
>>     >     >     2) Can getdeviceinfo return a device for the wrong layout
>>     >     >     type?  It seems that getdeviceinfo returns the layout type,
>>     >     >     which seems a little silly to me.  If it should return the
>>     >     >     layout type, and it doesn't match the requesting layout type,
>>     >     >     what should the client do?
>>     >     >
>>     >     >     3) In draft-19, should bitmap4 not have <>?
>>     >     >     bitmap4<> typedef uint32_t bitmap4<>;
>>     >     >           ^^^^
>>     >     >     For reference, in rfc3530 it does not have the <>. (other
>>     >     >     ops have the <> added as well)
>>     >     >
>>     >     >     Thanks,
>>     >     >     Dean
>>     >     >     _______________________________________________
>>     >     >     pNFS mailing list
>>     >     >     pNFS at linux-nfs.org
>>     >     >     http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>>     >     >
>>     >     >
>>     >
>>     >
>>
>>


