[pnfs] Linux pNFS road map

Marc Eshel eshel at almaden.ibm.com
Wed Dec 12 10:35:24 EST 2007


pnfs-bounces at linux-nfs.org wrote on 12/11/2007 11:23:39 PM:

> It does not use a cluster FS.
> 
> The file system on the MDS and the DSs can be any physical fs.  E.g.,
> for our testing we've been using either ext2 or ext3 (or mixed) on the
> MDS and DSs.  It does not currently perform i/o through the MDS, though
> it would probably work to do as Bruce suggested and have the MDS act as
> a client of itself.
 
I am not sure what that means. For the MDS to do i/o it needs to 
communicate with the corresponding DS for that i/o, and if the client 
failed to do the i/o through the DS, I don't see that the MDS would 
have a better chance.
Marc. 

> I believe the "picture" would look like the GPFS picture, except the DSs
> don't share a clustered fs--they each have their own, separate file
> systems for stripe storage.
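
To make "separate file systems for stripe storage" concrete, here is a minimal sketch of how a file offset could map to a DS and an offset within that DS's local stripe file. The thread does not say how spNFS stripes data, so fixed-size stripe units, round-robin placement, and dense packing are all assumptions, and `locate`, `stripe_unit` and `num_ds` are hypothetical names.

```python
def locate(offset: int, stripe_unit: int, num_ds: int) -> tuple[int, int]:
    """Map a file offset to (DS index, offset inside that DS's stripe file).

    Assumes round-robin striping with densely packed stripe units on each DS.
    This is an illustrative model, not the spNFS layout algorithm.
    """
    stripe_no = offset // stripe_unit          # which stripe unit of the file
    ds_index = stripe_no % num_ds              # round-robin choice of DS
    # Each DS stores every num_ds-th unit back to back in its local file.
    ds_offset = (stripe_no // num_ds) * stripe_unit + offset % stripe_unit
    return ds_index, ds_offset


if __name__ == "__main__":
    for off in (0, 65536, 3 * 65536 + 10):
        print(off, locate(off, stripe_unit=65536, num_ds=3))
```

Under these assumptions each DS only needs an ordinary local file (ext2, ext3, ...) per striped file, which matches the point that no cluster fs is shared between the DSs.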
> 
> spNFS uses:
> 
> Existing server export ops defined for other pNFS implementations.
> Hooks in some NFS ops that do not currently have "export ops"
> [todo: define a set of pnfs_ops that include function pointers that
> encompass both of the above]
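
The todo item above, a single table of function pointers covering both the existing export ops and the hooked NFS ops, might look roughly like the following. This is a userspace Python model of the idea, not kernel code: `PnfsOps`, `call_op` and the field names are hypothetical, and the sketch does not try to record which ops are export ops and which are hooks today.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Handler = Callable[[dict], dict]


@dataclass
class PnfsOps:
    """Hypothetical unified ops table; a None entry means 'not implemented'."""
    layoutget: Optional[Handler] = None
    getdeviceinfo: Optional[Handler] = None
    open: Optional[Handler] = None
    remove: Optional[Handler] = None


def call_op(ops: PnfsOps, name: str, args: dict) -> dict:
    """Dispatch one operation through the table, or report it as unsupported."""
    fn = getattr(ops, name, None)
    if fn is None:
        return {"status": "ENOTSUPP", "op": name}
    return fn(args)


def spnfs_layoutget(args: dict) -> dict:
    # Placeholder body: a real handler would do the spnfs-specific work here.
    return {"status": "OK", "op": "layoutget", "inode": args["inode"]}


# A backend fills in only the operations it supports.
spnfs_ops = PnfsOps(layoutget=spnfs_layoutget)

if __name__ == "__main__":
    print(call_op(spnfs_ops, "layoutget", {"inode": 7}))
    print(call_op(spnfs_ops, "remove", {"inode": 7}))
```

The point of one table is that the generic NFS server code calls through it uniformly, so a backend such as spNFS supplies handlers without the server caring whether the call site was originally an export op or an ad hoc hook.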
> 
> The export ops and hooks call into "spnfs_ops.c", to perform the
> spnfs-specific code for the given operation.  This may involve creating
> a message for an upcall to the userspace code.
> 
> If an upcall is required, spnfs_ops.c calls spnfs_upcall(msg,...) in
> spnfs_com.c to send a msg via pipefs to the userspace daemon, spnfsd.
> 
> Spnfsd consists of spnfsd.c (equivalent to the kernel spnfs_com.c,
> performing the pipefs communication piece) and spnfsd_ops.c (equivalent
> to kernel spnfs_ops.c).  The msg is received by spnfsd.c and the
> appropriate operation in spnfsd_ops.c is called to create the response
> message.
> 
> Sooo...
> 
> If the client sends a LAYOUTGET to the MDS, on the MDS there is an export
> op for LAYOUTGET that will call spnfs_layoutget in spnfs_ops.c.
> Spnfs_ops.c determines that it needs information from the userspace
> daemon to return the requested layout.  A message requesting layout
> information is constructed and sent via pipefs (through spnfs_upcall in
> spnfs_com.c) to the userspace daemon.  The userspace daemon creates a
> layout reply message containing the information requested by the kernel
> and returns the message via pipefs.  The kernel builds a LAYOUTGET
> response and returns it, otw, to the client.
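
The LAYOUTGET round trip described above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the spNFS code: two `os.pipe()` pairs stand in for the pipefs channel, messages are length-prefixed JSON, the daemon is run inline instead of as a separate process, and the message fields (`stripe_unit`, `devices`) are invented for illustration. Only the function and file names mentioned in the thread (`spnfs_layoutget`, `spnfs_upcall`, spnfsd) come from the original.

```python
import json
import os
import struct


def send_msg(fd: int, msg: dict) -> None:
    """Write one length-prefixed JSON message to a pipe."""
    payload = json.dumps(msg).encode()
    os.write(fd, struct.pack("!I", len(payload)) + payload)


def read_exact(fd: int, n: int) -> bytes:
    """Read exactly n bytes, looping over short reads."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("pipe closed mid-message")
        buf += chunk
    return buf


def recv_msg(fd: int) -> dict:
    (length,) = struct.unpack("!I", read_exact(fd, 4))
    return json.loads(read_exact(fd, length))


# --- userspace daemon side (models spnfsd.c and spnfsd_ops.c) ---
def spnfsd_layoutget(req: dict) -> dict:
    # Hypothetical reply contents; the real daemon decides these.
    return {"op": "layoutget", "status": 0, "inode": req["inode"],
            "stripe_unit": 65536, "devices": ["ds0", "ds1"]}


SPNFSD_OPS = {"layoutget": spnfsd_layoutget}


def spnfsd_serve_one(upcall_rd: int, reply_wr: int) -> None:
    """Receive one upcall, run the matching op, send the reply back."""
    req = recv_msg(upcall_rd)
    send_msg(reply_wr, SPNFSD_OPS[req["op"]](req))


# --- kernel side (models spnfs_com.c and spnfs_ops.c) ---
def spnfs_upcall(msg: dict, pipes: dict) -> dict:
    send_msg(pipes["upcall_wr"], msg)
    # Stand-in for the daemon waking up on the pipe in its own process.
    spnfsd_serve_one(pipes["upcall_rd"], pipes["reply_wr"])
    return recv_msg(pipes["reply_rd"])


def spnfs_layoutget(inode: int, pipes: dict) -> dict:
    """Build the LAYOUTGET response from the daemon's layout information."""
    reply = spnfs_upcall({"op": "layoutget", "inode": inode}, pipes)
    if reply["status"] != 0:
        return {"nfs_status": "NFS4ERR_LAYOUTUNAVAILABLE"}
    return {"nfs_status": "NFS4_OK",
            "layout": {"stripe_unit": reply["stripe_unit"],
                       "devices": reply["devices"]}}


def make_pipes() -> dict:
    upcall_rd, upcall_wr = os.pipe()
    reply_rd, reply_wr = os.pipe()
    return {"upcall_rd": upcall_rd, "upcall_wr": upcall_wr,
            "reply_rd": reply_rd, "reply_wr": reply_wr}


if __name__ == "__main__":
    pipes = make_pipes()
    print(spnfs_layoutget(42, pipes))
    for fd in pipes.values():
        os.close(fd)
```

The same shape (build a request, upcall, turn the reply into an NFS response) would apply to the other operations listed below; only the handler pair changes.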
> 
> Lather, rinse, and repeat for the various operations we need to
> implement (spnfs_open, spnfs_remove, spnfs_getdeviceinfo, etc.)
> 
>   -Dan
> 
> 
> > -----Original Message-----
> > From: Dean Hildebrand [mailto:seattleplus at gmail.com] 
> > Sent: Tuesday, December 11, 2007 9:04 PM
> > To: Muntz, Daniel
> > Cc: J. Bruce Fields; pnfs at linux-nfs.org; peter honeyman
> > Subject: Re: [pnfs] Linux pNFS road map
> > 
> > Hi Dan,
> > 
> > I'm confused: does spNFS require a cluster file system (CFS) only to
> > perform I/O through the MDS, or does it require a CFS for everything?
> > 
> > Do you have a diagram or anything that describes the architecture of 
> > spNFS? I don't need a writeup, but something that describes the 
> > user/kernel space split, which modules are where, etc. would be really 
> > useful.  Even something as basic as the following would be useful:
> > http://www.citi.umich.edu/projects/asci/pnfs/docs/pnfs.gif
> > 
> > Thanks,
> > Dean
> > 
> > Muntz, Daniel wrote:
> > > > As currently implemented, the MDS would have to access the file system
> > > > via NFS.  Without a cluster back-end, this may be the only way to do it
> > > > for the foreseeable future.  Otherwise, you'd need some other non-NFS
> > > > file system that understood how to speak pNFS, and I'm not sure there'd
> > > > be much value added to spNFS by such a feature.  Something to think
> > > > about...
> > >
> > > -----Original Message-----
> > > From: J. Bruce Fields [mailto:bfields at fieldses.org] 
> > > Sent: Tuesday, December 11, 2007 1:08 PM
> > > To: Muntz, Daniel
> > > Cc: pnfs at linux-nfs.org; peter honeyman
> > > Subject: Re: [pnfs] Linux pNFS road map
> > >
> > > On Tue, Dec 11, 2007 at 01:00:58PM -0800, Muntz, Daniel wrote:
> > > 
> > > >> Yes, there's a bunch of stuff happening in userspace in the current 
> > > >> implementation.  Some (most, all?) will move into the kernel.  The 
> > > >> existing userspace design was done primarily to get something out 
> > > >> quickly, which we could also modify/debug easily.  There's plenty of 
> > > >> stuff on the todo list, which we'll put out with the code RSN.
> > >> 
> > >
> > > Sure.
> > >
> > > > Another random question: do you expect the "filesystem" you're exporting
> > > > to ever be usable by applications on the server?  (Without doing a
> > > > loopback NFS mount, that is).
> > >
> > > --b.
> > > _______________________________________________
> > > pNFS mailing list
> > > pNFS at linux-nfs.org
> > > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> > > 
> > 
