[pnfs] Linux pNFS road map
Muntz, Daniel
Dan.Muntz at netapp.com
Wed Dec 12 02:23:39 EST 2007
It does not use a cluster FS.
The file system on the MDS and the DSs can be any physical fs. E.g.,
for our testing we've been using either ext2 or ext3 (or mixed) on the
MDS and DSs. It does not currently perform i/o through the MDS, though
it would probably work to do as Bruce suggested and have the MDS act as
a client of itself.
I believe the "picture" would look like the GPFS picture, except the DSs
don't share a clustered fs--they each have their own, separate file
systems for stripe storage.
spNFS uses:
- Existing server export ops defined for other pNFS implementations.
- Hooks in some NFS ops that do not currently have "export ops".
[todo: define a set of pnfs_ops that include function pointers that
encompass both of the above]
The export ops and hooks call into spnfs_ops.c to run the
spnfs-specific code for the given operation. This may involve creating
a message for an upcall to the userspace code.
If an upcall is required, spnfs_ops.c calls spnfs_upcall(msg,...) in
spnfs_com.c to send a msg via pipefs to the userspace daemon, spnfsd.
Spnfsd consists of spnfsd.c (equivalent to the kernel spnfs_com.c,
performing the pipefs communication piece) and spnfsd_ops.c (equivalent
to kernel spnfs_ops.c). The msg is received by spnfsd.c and the
appropriate operation in spnfsd_ops.c is called to create the response
message.
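
To make the spnfsd.c / spnfsd_ops.c split concrete, here is a rough sketch of the dispatch step in Python rather than the daemon's C. The message layout (opcode, status, inode number), the opcode values, and the handler names are all made up for illustration; the real spnfs message format is not described in this thread.

```python
import errno
import struct

# Hypothetical wire format: opcode, status, inode number.
# Assumption only; this is not the actual spnfs message layout.
HDR = struct.Struct("<IiQ")

# Illustrative opcodes, not the real values.
OP_LAYOUTGET, OP_OPEN, OP_REMOVE = 1, 2, 3


# --- the "spnfsd_ops.c" piece: one handler per operation ---
def op_layoutget(ino):
    return 0  # pretend the layout lookup succeeded


def op_open(ino):
    return 0


def op_remove(ino):
    return 0


HANDLERS = {
    OP_LAYOUTGET: op_layoutget,
    OP_OPEN: op_open,
    OP_REMOVE: op_remove,
}


# --- the "spnfsd.c" piece: receive a message, dispatch, build the reply ---
def handle_message(raw):
    op, _status, ino = HDR.unpack(raw)
    handler = HANDLERS.get(op)
    # Unknown operations get -ENOSYS in the status field.
    status = handler(ino) if handler else -errno.ENOSYS
    return HDR.pack(op, status, ino)
```

In the real daemon the raw bytes would be read from and written back to the pipefs file; here handle_message just takes and returns a bytes object so the dispatch logic can be looked at on its own.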
Sooo...
If the client sends a LAYOUTGET to the MDS, on the MDS there is an export
op for LAYOUTGET that will call spnfs_layoutget in spnfs_ops.c.
Spnfs_ops.c determines that it needs information from the userspace
daemon to return the requested layout. A message requesting layout
information is constructed and sent via pipefs (through spnfs_upcall in
spnfs_com.c) to the userspace daemon. The userspace daemon creates a
layout reply message containing the information requested by the kernel
and returns the message via pipefs. The kernel builds a LAYOUTGET
response and returns it over the wire to the client.
Lather, rinse, and repeat for the various operations we need to
implement (spnfs_open, spnfs_remove, spnfs_getdeviceinfo, etc.)
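
The LAYOUTGET round trip described above can be mimicked in a few lines of Python. This is only a toy model: ordinary pipes stand in for pipefs, a thread stands in for spnfsd, and the JSON messages and layout fields (stripe_size, devices) are assumptions, not the real message contents. Only the names spnfs_layoutget and spnfs_upcall come from the code described in this mail.

```python
import json
import os
import threading


def daemon(req_fd, rsp_fd):
    """Stand-in for spnfsd: read one request, write one reply."""
    with os.fdopen(req_fd, "r") as req, os.fdopen(rsp_fd, "w") as rsp:
        msg = json.loads(req.readline())
        if msg["op"] == "layoutget":
            # Hypothetical layout information; field names are assumptions.
            reply = {"op": "layoutget", "status": 0, "ino": msg["ino"],
                     "stripe_size": 65536, "devices": ["ds1", "ds2"]}
        else:
            reply = {"op": msg["op"], "status": -1}
        rsp.write(json.dumps(reply) + "\n")


def spnfs_upcall(msg, req_w, rsp_r):
    """Send one message to the daemon and wait for its reply."""
    req_w.write(json.dumps(msg) + "\n")
    req_w.flush()
    return json.loads(rsp_r.readline())


def spnfs_layoutget(ino):
    """'Kernel' side: ask the daemon for layout info for one inode."""
    k2u_r, k2u_w = os.pipe()  # kernel -> userspace
    u2k_r, u2k_w = os.pipe()  # userspace -> kernel
    t = threading.Thread(target=daemon, args=(k2u_r, u2k_w))
    t.start()
    with os.fdopen(k2u_w, "w") as req_w, os.fdopen(u2k_r, "r") as rsp_r:
        reply = spnfs_upcall({"op": "layoutget", "ino": ino}, req_w, rsp_r)
    t.join()
    return reply
```

Calling spnfs_layoutget(42) returns the daemon's reply dict, which is the point where the kernel would build the LAYOUTGET response for the client.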
-Dan
> -----Original Message-----
> From: Dean Hildebrand [mailto:seattleplus at gmail.com]
> Sent: Tuesday, December 11, 2007 9:04 PM
> To: Muntz, Daniel
> Cc: J. Bruce Fields; pnfs at linux-nfs.org; peter honeyman
> Subject: Re: [pnfs] Linux pNFS road map
>
> Hi Dan,
>
> I'm confused: does spNFS require a cluster file system only to perform
> I/O through the MDS, or does it require a CFS for everything?
>
> Do you have a diagram or anything that describes the architecture of
> spNFS? I don't need a writeup, but something that describes the
> user/kernel space split, which modules are where, etc. would be really
> useful. Even something as basic as the following would be useful:
> http://www.citi.umich.edu/projects/asci/pnfs/docs/pnfs.gif
>
> Thanks,
> Dean
>
> Muntz, Daniel wrote:
> > As currently implemented, the MDS would have to access the file system
> > via NFS. Without a cluster back-end, this may be the only way to do it
> > for the foreseeable future. Otherwise, you'd need some other non-NFS
> > file system that understood how to speak pNFS, and I'm not sure there'd
> > be much value added to spNFS by such a feature. Something to think
> > about...
> >
> > -----Original Message-----
> > From: J. Bruce Fields [mailto:bfields at fieldses.org]
> > Sent: Tuesday, December 11, 2007 1:08 PM
> > To: Muntz, Daniel
> > Cc: pnfs at linux-nfs.org; peter honeyman
> > Subject: Re: [pnfs] Linux pNFS road map
> >
> > On Tue, Dec 11, 2007 at 01:00:58PM -0800, Muntz, Daniel wrote:
> >
> >> Yes, there's a bunch of stuff happening in userspace in the current
> >> implementation. Some (most, all?) will move into the kernel. The
> >> existing userspace design was done primarily to get something out
> >> quickly, which we could also modify/debug easily. There's plenty of
> >> stuff on the todo list, which we'll put out with the code RSN.
> >>
> >
> > Sure.
> >
> > Another random question: do you expect the "filesystem" you're exporting
> > to ever be usable by applications on the server? (Without doing a
> > loopback NFS mount, that is).
> >
> > --b.
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
>