PNFS Server Filesystem API Design
From Linux NFS
Overview
The pNFS server extends the filesystem VFS API so that nfsd can call into a filesystem that is exported over pNFS. The filesystem is called when the server receives the corresponding pNFS protocol operations, such as LAYOUTGET and GETDEVICEINFO. In turn, the filesystem may use a set of layout-type library calls, implemented in the exportfs kernel module, to encode and decode layout-type specific data, so that filesystems that share the same layout type can share this common code.
Callbacks, such as CB_LAYOUTRECALL and CB_NOTIFY_DEVICEID, can be generated by the filesystem using the pnfsd cb operations vector provided by the nfsd module.
pNFS Export Operations
The pNFS export operations are defined in the following header file:
include/linux/nfsd/nfsd4_pnfs.h
struct pnfs_export_operations;
The filesystem must implement the following methods for basic functionality:
layout_type, get_device_info, layout_get
mandatory | |
layout_type | Returns the supported pnfs_layouttype4 |
get_device_info | Encode device info onto the xdr stream |
layout_get | Retrieve and encode a layout for inode onto the xdr stream |
optional | |
set_device_notify | Implement device notification negotiation |
get_device_iter | Retrieve all available devices via an iterator |
layout_commit | Commit changes to layout |
layout_return | Returns the layout. Note that this method may be called internally by pnfsd upon, e.g. |
can_merge_layouts | Policy call. Can layout segments be merged for this layout type? |
files layout only | |
get_verifier | Get the write verifier for DS (called on MDS only) |
get_state | Call fs on DS only |
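The split between mandatory and optional methods can be illustrated with a small user-space model. This is only a sketch: every toy_* name, type and signature below is hypothetical and much simpler than the real struct pnfs_export_operations in nfsd4_pnfs.h. It shows a filesystem filling in the three mandatory methods, leaving the optional ones NULL, and a caller checking for an optional method before dispatching to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real kernel types or signatures. */
struct toy_inode { int ino; };
struct toy_xdr   { char buf[64]; size_t len; };

#define TOY_LAYOUT_NFSV4_1_FILES 1  /* LAYOUT4_NFSV4_1_FILES is 1 in RFC 5661 */

struct toy_pnfs_export_ops {
    /* mandatory */
    int (*layout_type)(void);
    int (*get_device_info)(struct toy_xdr *xdr, int devid);
    int (*layout_get)(struct toy_inode *inode, struct toy_xdr *xdr);
    /* optional: may be left NULL */
    int (*layout_commit)(struct toy_inode *inode);
    int (*layout_return)(struct toy_inode *inode);
    int (*can_merge_layouts)(void);
};

/* Append one byte to the toy "xdr stream"; -1 if the buffer is full. */
static int toy_encode_byte(struct toy_xdr *xdr, char b)
{
    if (xdr->len >= sizeof(xdr->buf))
        return -1;
    xdr->buf[xdr->len++] = b;
    return 0;
}

static int toy_layout_type(void)
{
    return TOY_LAYOUT_NFSV4_1_FILES;
}

static int toy_get_device_info(struct toy_xdr *xdr, int devid)
{
    return toy_encode_byte(xdr, (char)devid);
}

static int toy_layout_get(struct toy_inode *inode, struct toy_xdr *xdr)
{
    return toy_encode_byte(xdr, (char)inode->ino);
}

/* A filesystem that implements only the mandatory methods. */
static const struct toy_pnfs_export_ops toy_ops = {
    .layout_type     = toy_layout_type,
    .get_device_info = toy_get_device_info,
    .layout_get      = toy_layout_get,
};

/* Returns 1 if all mandatory methods are present, 0 otherwise. */
static int toy_ops_usable(const struct toy_pnfs_export_ops *ops)
{
    return ops && ops->layout_type && ops->get_device_info && ops->layout_get;
}

/* Dispatch an optional method. Treating a missing method as success
 * is an assumption of this sketch, not documented pnfsd behaviour. */
static int toy_call_layout_commit(const struct toy_pnfs_export_ops *ops,
                                  struct toy_inode *inode)
{
    if (!ops->layout_commit)
        return 0;
    return ops->layout_commit(inode);
}
```

With this model, toy_ops_usable(&toy_ops) returns 1, while a zero-initialised vector is rejected. How the real pnfsd treats a missing optional method is not stated in the table above, so the fallback in toy_call_layout_commit() should be read as a placeholder.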
pNFSd Callback Operations
The pNFSd callback operations are defined in the following header files:
include/linux/nfsd/nfsd4_pnfs.h
include/linux/exportfs.h
struct pnfsd_cb_operations;
pnfsd_get_cb_op()
To use the pnfsd callback operations, the filesystem module must call pnfsd_get_cb_op() to get a reference on the global vector the nfsd module provides. This creates a reverse dependency of the filesystem on the nfsd module, so that nfsd cannot go down while the filesystem is up and leave the filesystem calling a callback function that no longer exists. The filesystem calls pnfsd_put_cb_op() to release the reference.
cb_layout_recall | Recall layout(s). nfsd4_pnfs_cb_layout is used to specify the recall scope for per-file, fsid, or all layouts |
cb_device_notify | Notify the client that device IDs have changed or been deleted |
files layout only | |
cb_get_state | Callback from fs on MDS only |
cb_change_state | Callback from fs on DS only |
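The get/put protocol around the callback vector can be modelled in user space as a simple reference count. The sketch below is an illustration under stated assumptions: all toy_* names are hypothetical, the callback signatures are invented for the example, and the real pnfsd_get_cb_op() and pnfsd_put_cb_op() may be implemented differently (for instance through module reference counting). It only shows the property the text describes: while a filesystem holds a reference, the provider of the vector cannot go away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified callback vector; not the real pnfsd_cb_operations. */
struct toy_cb_ops {
    int (*cb_layout_recall)(int ino);
    int (*cb_device_notify)(int devid);
};

static int toy_recalls;  /* number of recalls "sent" to the client */

static int toy_cb_layout_recall(int ino)
{
    (void)ino;
    toy_recalls++;
    return 0;
}

static int toy_cb_device_notify(int devid)
{
    (void)devid;
    return 0;
}

/* The global vector owned by the toy "nfsd". */
static const struct toy_cb_ops toy_nfsd_cb_ops = {
    .cb_layout_recall = toy_cb_layout_recall,
    .cb_device_notify = toy_cb_device_notify,
};

static int toy_nfsd_loaded = 1;  /* 1 while the toy "nfsd" is up */
static int toy_nfsd_refs;        /* references held by filesystems */

/* Take a reference on the vector; NULL if the provider is gone. */
static const struct toy_cb_ops *toy_get_cb_op(void)
{
    if (!toy_nfsd_loaded)
        return NULL;
    toy_nfsd_refs++;
    return &toy_nfsd_cb_ops;
}

/* Release a reference taken with toy_get_cb_op(). */
static void toy_put_cb_op(void)
{
    assert(toy_nfsd_refs > 0);
    toy_nfsd_refs--;
}

/* Unloading is refused (-1) while any reference is outstanding. */
static int toy_nfsd_unload(void)
{
    if (toy_nfsd_refs > 0)
        return -1;
    toy_nfsd_loaded = 0;
    return 0;
}
```

In this model a filesystem calls toy_get_cb_op() once, issues callbacks through the returned pointer, and calls toy_put_cb_op() when it no longer needs them; toy_nfsd_unload() fails in between. The model is single-threaded, so it ignores the atomicity a real implementation would need.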
Locking Design
The nfsd state lock is never held while calling into the file system to avoid potential deadlocks that may be caused by the following scenario:
- State lock is held by nfsd
- The fs is being called (e.g. close())
- The fs generates a callback to the client (e.g. cb_layout_recall)
- The client sends a layout_return synchronously with the callback before replying to it.
- For serving layout_return, nfsd needs to acquire the state lock, which is still held from the first step, so neither request can make progress
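The scenario above can be replayed in a single-threaded user-space model. This is a sketch under assumptions: the toy_* names are hypothetical, the lock is a plain flag rather than the real nfsd state lock, and a second acquisition reports failure (-1) instead of blocking so that the would-be deadlock is observable. The client's synchronous layout_return is modelled as a direct call made from inside the filesystem's close path.

```c
#include <assert.h>

/* Toy non-recursive lock: a second acquire fails instead of blocking. */
struct toy_lock { int held; };

static int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -1;  /* a real lock would block here: deadlock */
    l->held = 1;
    return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = 0;
}

static struct toy_lock toy_state_lock;  /* stands in for the nfsd state lock */
static int toy_layout_return_result;    /* outcome of the last layout_return */

/* nfsd side: serving layout_return needs the state lock. */
static int toy_nfsd_layout_return(void)
{
    if (toy_lock_acquire(&toy_state_lock) != 0)
        return -1;
    /* ... update layout state ... */
    toy_lock_release(&toy_state_lock);
    return 0;
}

/* fs side: close() recalls the layout, and the client answers with a
 * layout_return before replying to the callback. */
static void toy_fs_close(void)
{
    toy_layout_return_result = toy_nfsd_layout_return();
}

/* Problematic ordering: the fs is called with the state lock held. */
static int toy_close_holding_lock(void)
{
    toy_lock_acquire(&toy_state_lock);
    toy_fs_close();
    toy_lock_release(&toy_state_lock);
    return toy_layout_return_result;
}

/* Ordering described in the text: drop the lock before calling the fs. */
static int toy_close_dropping_lock(void)
{
    toy_lock_acquire(&toy_state_lock);
    /* ... look up the state needed for the call ... */
    toy_lock_release(&toy_state_lock);
    toy_fs_close();
    return toy_layout_return_result;
}
```

In this model toy_close_holding_lock() returns -1 because the nested layout_return cannot take the lock, while toy_close_dropping_lock() returns 0. The model leaves out what a real implementation must also handle, namely revalidating any state that may have changed while the lock was dropped.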