NFS client patches for Linux 2.6.28-rc6

The following set of patches fixes known issues with the 2.6.28-rc6 NFS client code, and significantly enhances the support for NFSv4.

linux-2.6.28-001-fix_socket_close.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 12:56:05 -0500

SUNRPC: Ensure the server closes sockets in a timely fashion

We want to ensure that connected sockets close down the connection when we set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the stuff that is keeping a reference to the socket.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-002-fix_svc_delete_xprt.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 12:56:06 -0500

SUNRPC: We only need to call svc_delete_xprt() once...

Use XPT_DEAD to ensure that we only call xpo_detach & friends once.
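
The "only once" guarantee can be illustrated with an atomic test-and-set on a flag bit. The following is a minimal user-space sketch using C11 atomics, not the kernel code itself: the names demo_xprt, DEMO_XPT_DEAD and demo_delete_xprt are hypothetical stand-ins, and the counter merely represents the call to xpo_detach & friends.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value; the kernel's real XPT_DEAD is not shown here. */
enum { DEMO_XPT_DEAD = 1 << 0 };

struct demo_xprt {
    atomic_uint flags;   /* transport state bits */
    int detach_calls;    /* counts how often teardown actually ran */
};

/* Atomically set the DEAD bit; return true if it was already set. */
static bool demo_test_and_set_dead(struct demo_xprt *x)
{
    unsigned int old = atomic_fetch_or(&x->flags, (unsigned int)DEMO_XPT_DEAD);
    return (old & DEMO_XPT_DEAD) != 0;
}

/* Teardown that is safe to call repeatedly: only the first caller proceeds. */
static void demo_delete_xprt(struct demo_xprt *x)
{
    assert(x != NULL);
    if (demo_test_and_set_dead(x))
        return;          /* another caller already tore the transport down */
    x->detach_calls++;   /* stands in for xpo_detach & friends */
}
```

Calling demo_delete_xprt() twice on the same object leaves detach_calls at 1, because the second call sees the bit already set and returns early.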

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-003-fix_svc_xprt_enqueue.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 12:56:07 -0500

SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Aside from being racy (there is nothing preventing someone setting XPT_DEAD after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is wrong to assume that transports which have called svc_delete_xprt() might not need to be re-enqueued.

See the list of deferred requests, which is currently never going to be cleared if the revisit call happens after svc_delete_xprt(). In this case, the deferred request will currently keep a reference to the transport forever.

The fix should be to allow dead transports to be enqueued in order to clear the deferred requests, then change the order of processing in svc_recv() so that we pick up deferred requests before we do the XPT_CLOSE processing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-004-nfs_fix_readahead.dif:

From: Wu Fengguang <fengguang.wu@intel.com>

Date: Mon, 1 Dec 2008 13:20:38 -0500

nfs: remove redundant tests on reading new pages

aops->readpages() and its NFS helper readpage_async_filler() will only be called to do readahead I/O for newly allocated pages. So it's not necessary to test for the always 0 dirty/uptodate page flags.

The removal of nfs_wb_page() call also fixes a readahead bug: the NFS readahead has been synchronous since 2.6.23, because that call will clear PG_readahead, which is the reminder for asynchronous readahead.

More background: the PG_readahead page flag is shared with PG_reclaim, one for the read path and the other for the write path. clear_page_dirty_for_io() unconditionally clears PG_readahead to prevent possible readahead residuals, assuming that it is always called in the write path. However, NFS is the one and only exception in that it _always_ calls clear_page_dirty_for_io() in the read path, i.e. for readpages()/readpage().
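
The flag aliasing described above can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions: the DEMO_ names and bit values are hypothetical, and demo_clear_dirty_for_io() models nothing more than the behaviour the text attributes to clear_page_dirty_for_io().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; only the aliasing matters for this sketch. */
enum {
    DEMO_PG_DIRTY     = 1 << 0,
    DEMO_PG_RECLAIM   = 1 << 1,           /* write-path meaning */
    DEMO_PG_READAHEAD = DEMO_PG_RECLAIM   /* read-path meaning, same bit */
};

/*
 * Clear the dirty bit for writeout and unconditionally clear the shared
 * reclaim/readahead bit. Returns whether the page was dirty.
 */
static bool demo_clear_dirty_for_io(unsigned int *flags)
{
    assert(flags != NULL);
    bool was_dirty = (*flags & DEMO_PG_DIRTY) != 0;
    *flags &= ~(unsigned int)(DEMO_PG_DIRTY | DEMO_PG_RECLAIM);
    return was_dirty;
}
```

A clean page that carries only the readahead marker loses it when this helper runs in the read path, which is how the asynchronous readahead hint got dropped.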

Cc: Trond Myklebust <Trond.Myklebust@netapp.com>

Signed-off-by: Wu Fengguang <wfg@linux.intel.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-005-remove_last_bkl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:07 -0500

SUNRPC: Remove the last remnant of the BKL...

Somehow, this escaped the previous purge. There should be no need to keep any extra locks in the XDR callbacks.

The NFS client XDR code only writes into private objects, whereas all reads of shared objects are confined to fields that do not change, such as filehandles...

Ditto for lockd, the NFSv2/v3 client mount code, and rpcbind.

The nfsd XDR code may require the BKL, but since it does a synchronous RPC call from a thread that already holds the lock, that issue is moot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-006-xdr_should_be_export_symbol_gpl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:08 -0500

SUNRPC: Convert the xdr helpers and rpc_pipefs to EXPORT_SYMBOL_GPL

We've never considered the sunrpc code as part of any ABI to be used by out-of-tree modules.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-007-auth_gss_should_be_export_symbol_gpl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:09 -0500

SUNRPC: rpcsec_gss modules should not be used by out-of-tree code

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-008-nfs_common_should_be_export_symbol_gpl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:11 -0500

SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only

Again, this has never been intended as a public ABI for out-of-tree modules.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-008-sunrpc_server_should_be_export_symbol_gpl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:10 -0500

SUNRPC: The sunrpc server code should not be used by out-of-tree modules

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-009-lockd_should_be_export_symbol_gpl.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:11 -0500

LOCKD: Make lockd_up() and lockd_down() exported GPL-only

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-010-convert_reclaimer_to_use_kthread.dif:

From: Jeff Layton <jlayton@redhat.com>

Date: Mon, 1 Dec 2008 13:21:12 -0500

lockd: convert reclaimer thread to kthread interface

My understanding is that there is a push to turn the kernel_thread interface into a non-exported symbol and move all kernel threads to use the kthread API. This patch changes lockd to use kthread_run to spawn the reclaimer thread.

I've made the assumption here that the extra module references taken when we spawn this thread are unnecessary and removed them. I've also added a KERN_ERR printk that pops if the thread can't be spawned to warn the admin that the locks won't be reclaimed.

In the future, it would be nice to be able to notify userspace that locks have been lost (probably by implementing SIGLOST), and to add some good policies about how long we should reattempt to reclaim the locks.

Finally, I removed a comment about memory leaks that I believe is obsolete and added a new one to clarify the result of sending a SIGKILL to the reclaimer thread. As best I can tell, doing so doesn't actually cause a memory leak.

I consider this patch 2.6.29 material.

Signed-off-by: Jeff Layton <jlayton@redhat.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-011-rename_nfs_path_variable.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:13 -0500

NFS: rename nfs_path variable

Clean up: I'm about to move the declaration of nfs_mount into fs/nfs/internal.h and include it in fs/nfs/nfsroot.c. There's a conflicting definition of nfs_path in fs/nfs/internal.h and fs/nfs/nfsroot.c, so rename the private one.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-012-move_declaration_of_nfs_mount_to_fs_nfs_internal_h.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:14 -0500

NFS: Move declaration of nfs_mount() to fs/nfs/internal.h

Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-013-introduce_nfs_mount_info_struct_for_calling_nfs_mount.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:16 -0500

NFS: introduce nfs_mount_info struct for calling nfs_mount()

Clean up: convert nfs_mount() to take a single data structure argument to make it simpler to add more arguments.
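
The benefit of a single structure argument can be shown with a small user-space sketch. The struct and function below are hypothetical: demo_mount_info and its fields are invented for illustration and are not the actual layout of nfs_mount_info.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument block; the real structure's fields are not shown here. */
struct demo_mount_info {
    const char *hostname;   /* server name */
    const char *dirpath;    /* exported path to mount */
    unsigned int version;   /* protocol version requested */
    int noresvport;         /* added later without touching the signature */
};

/* One pointer argument; new options become new fields, not new parameters. */
static int demo_mount(const struct demo_mount_info *info)
{
    assert(info != NULL);
    if (info->hostname == NULL || info->dirpath == NULL)
        return -1;          /* reject incomplete requests */
    return 0;
}
```

Callers fill in only the fields they care about (designated initializers zero the rest), so adding a field such as noresvport does not require changing every call site.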

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-014-expand_flags_passed_to_nfs_create_rpc_client.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:17 -0500

NFS: expand flags passed to nfs_create_rpc_client()

The nfs_create_rpc_client() function sets up an RPC client for an NFS mount point. Add an option that allows it to set up an RPC transport from an unprivileged port.

Instead of having nfs_create_rpc_client()'s callers retain local knowledge about how to set up an RPC client, create a couple of flag arguments to control the use of RPC_CLNT_CREATE flags.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-015-move_nfs_server_flag_initialization.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:18 -0500

NFS: move nfs_server flag initialization

Make it possible for the NFSv4 mount set up logic to pass mount option flags down the stack to nfs_create_rpc_client().

This is immediately useful if we want NFS mount options to modulate settings of the underlying RPC transport, but it may be useful at some later point if other parts of the NFSv4 mount initialization logic want to know what the mount options are.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-016-add_no_resvport_mount_option.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:19 -0500

NFS: add "[no]resvport" mount option

The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged.

On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well-known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently.

An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work.

In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023.

This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one.
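
The effect of the top-down search can be sketched as follows. This is a simplified user-space model that assumes only what the text says (start at the maximum and work downwards); the function names and the busy-port list are hypothetical and do not reflect the RPC client's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Ports below 1024 are privileged. */
static bool demo_is_privileged(unsigned int port)
{
    return port < 1024;
}

/* Linear lookup in a hypothetical list of ports already in use. */
static bool demo_is_busy(const unsigned short *busy, size_t n, unsigned int port)
{
    for (size_t i = 0; i < n; i++)
        if (busy[i] == port)
            return true;
    return false;
}

/* Walk from max down to min and return the first free port, or -1. */
static int demo_pick_port(unsigned int min, unsigned int max,
                          const unsigned short *busy, size_t n)
{
    assert(min >= 1 && min <= max && max <= 65535);
    for (unsigned int port = max; port >= min; port--)
        if (!demo_is_busy(busy, n, port))
            return (int)port;
    return -1;
}
```

With the range 650 to 1023 the first free port found is privileged. If the maximum is raised to, say, 2000, the same search returns 2000 first, so every service that asks for a source port gets an unprivileged one until the upper part of the range is exhausted.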

Changing this setting affects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which would not.

Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented.

To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option. This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point.

This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5).

The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port.

This mount option is supported only for text-based NFS mounts.
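
The option semantics (privileged port by default, "noresvport" to opt out for one mount point) can be sketched with a tiny option-string scanner. This is a user-space illustration only; demo_wants_resvport() is a hypothetical helper and is not how the NFS client's text-based mount parser is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Scan a comma-separated option string and report whether the mount
 * should use a privileged source port. The last matching option wins.
 */
static bool demo_wants_resvport(const char *opts)
{
    bool resvport = true;               /* default: privileged port, as before */

    assert(opts != NULL);
    const char *p = opts;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current token */
        if (len == 8 && strncmp(p, "resvport", 8) == 0)
            resvport = true;
        else if (len == 10 && strncmp(p, "noresvport", 10) == 0)
            resvport = false;
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return resvport;
}
```

For example, an option string of "rw,noresvport,hard" yields false (unprivileged source port allowed), while "rw,hard" keeps the default of true.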

[ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security.

Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

[Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.]

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-017-no_resvport_mount_option_changes_mountd_client_too.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:20 -0500

NFS: "[no]resvport" mount option changes mountd client too

If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's mountd client should use an unprivileged port in this case as well.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-018-allow_lockd_requests_from_an_unprivileged_port.dif:

From: Chuck Lever <chuck.lever@oracle.com>

Date: Mon, 1 Dec 2008 13:21:20 -0500

NLM: allow lockd requests from an unprivileged port

If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-019-cleanup_rpc_exit.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:21 -0500

SUNRPC: Ensure that rpc_exit() always wakes up a sleeping task

Make rpc_exit() non-inline, and ensure that it always wakes up a task that has been queued.

Kill off the now unused rpc_wake_up_task().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-020-cleanup_rpc_set_active.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:22 -0500

SUNRPC: Move remaining RPC client related task initialisation into clnt.c

Now that rpc_run_task() is the sole entry point for RPC calls, we can move the remaining rpc_client-related initialisation of struct rpc_task from sched.c into clnt.c.

Also move rpc_killall_tasks() into the same file, since that too relates to the rpc_clnt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-021-cleanup_rpc_bind_cred.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:22 -0500

SUNRPC: Clean up of rpc_bindcred()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-022-constify_rpc_clnt_char_pointers.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:23 -0500

SUNRPC: constify rpc_clnt fields cl_server and cl_protname

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-023-constify_rpc_program_name.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:24 -0500

SUNRPC: constify rpc_program->name

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-024-constify_rpc_prog_info.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:24 -0500

SUNRPC: constify the rpc_program

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-025-move_bind_cred.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:25 -0500

SUNRPC: Defer the call to rpcauth_bindcred()

When we introduce filesystem migration, we're going to have to be able to block new RPC calls while we're replumbing the rpc_clnt. When we release them, the credential cache may have been changed too, so we shouldn't bind the creds until right when the RPC call is starting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-026-cleanup_rpc_task_set_client.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:26 -0500

SUNRPC: Add client to clnt->cl_tasks only when it starts executing

This will allow us to know when there are no more tasks running.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-113-respond_promptly_to_socket_errors_2.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:27 -0500

SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting

If the server aborts an established connection, then we should retry connecting immediately. Since xprt->reestablish_timeout is not reset unless we go through a TCP_FIN_WAIT1 state, we may end up waiting unnecessarily. The fix is to reset xprt->reestablish_timeout in TCP_ESTABLISHED, and then rely on the fact that we set it to non-zero values in all other cases when the server closes the connection.

Also fix a race between xs_connect() and xs_tcp_state_change(). The value of xprt->reestablish_timeout should be updated before we actually attempt the connection.
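
The intended timeout handling can be modelled roughly as below. This is a user-space sketch under explicit assumptions: the three states, the constants DEMO_INIT_TO and DEMO_MAX_TO (in seconds), and the doubling backoff are all hypothetical simplifications, not the values or state machine used by the transport code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of the socket state changes of interest. */
enum demo_state { DEMO_ESTABLISHED, DEMO_SERVER_CLOSED, DEMO_ABORTED };

/* Hypothetical backoff bounds, in seconds. */
enum { DEMO_INIT_TO = 3, DEMO_MAX_TO = 300 };

struct demo_conn {
    unsigned int reestablish_timeout;   /* delay before the next connect */
};

static void demo_state_change(struct demo_conn *c, enum demo_state st)
{
    assert(c != NULL);
    if (st == DEMO_ESTABLISHED)
        c->reestablish_timeout = 0;     /* reset once connected */
    else if (st == DEMO_SERVER_CLOSED &&
             c->reestablish_timeout < DEMO_INIT_TO)
        c->reestablish_timeout = DEMO_INIT_TO;  /* back off after a close */
    /* DEMO_ABORTED leaves the timeout alone, so it stays at zero */
}

/* Return the delay for this attempt; update the timeout before connecting. */
static unsigned int demo_connect_delay(struct demo_conn *c)
{
    assert(c != NULL);
    unsigned int delay = c->reestablish_timeout;
    unsigned int next = delay ? delay * 2 : DEMO_INIT_TO;
    c->reestablish_timeout = next > DEMO_MAX_TO ? DEMO_MAX_TO : next;
    return delay;
}
```

In this model an abort of an established connection leads to a delay of zero, while an orderly close by the server leads to a non-zero delay. Updating the stored timeout inside demo_connect_delay(), before the attempt is made, mirrors the ordering that the race fix calls for.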

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-117-speed_up_reads.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:27 -0500

NFS: Accelerate reads on read/write files

Try to schedule writeout of dirty pages asynchronously at the beginning of a read instead of calling nfs_wb_page() on each page as we try to read it. The result should normally be faster because we can write out the entire chunk of data at once by means of writepages().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

linux-2.6.28-118-udp_connect.dif:

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Date: Mon, 1 Dec 2008 13:21:28 -0500

SUNRPC: Add connected sockets for UDP

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Name                      Last modified     Size

linux-2.6.28-001-fix..>   2008-12-01 18:21  2.0K
linux-2.6.28-002-fix..>   2008-12-01 18:21  1.4K
linux-2.6.28-003-fix..>   2008-12-01 18:21  6.1K
linux-2.6.28-004-nfs..>   2008-12-01 18:21  1.6K
linux-2.6.28-005-rem..>   2008-12-01 18:21  4.1K
linux-2.6.28-006-xdr..>   2008-12-01 18:21  8.3K
linux-2.6.28-007-aut..>   2008-12-01 18:21  4.5K
linux-2.6.28-008-nfs..>   2008-12-01 18:21  776
linux-2.6.28-008-sun..>   2008-12-01 18:21  11K
linux-2.6.28-009-loc..>   2008-12-01 18:21  1.0K
linux-2.6.28-010-con..>   2008-12-01 18:21  3.0K
linux-2.6.28-011-ren..>   2008-12-01 18:21  2.3K
linux-2.6.28-012-mov..>   2008-12-01 18:21  1.7K
linux-2.6.28-013-int..>   2008-12-01 18:21  6.3K
linux-2.6.28-014-exp..>   2008-12-01 18:21  2.2K
linux-2.6.28-015-mov..>   2008-12-01 18:21  2.4K
linux-2.6.28-016-add..>   2008-12-01 18:21  7.3K
linux-2.6.28-017-no_..>   2008-12-01 18:21  1.9K
linux-2.6.28-018-all..>   2008-12-01 18:21  4.8K
linux-2.6.28-019-cle..>   2008-12-01 18:21  3.0K
linux-2.6.28-020-cle..>   2008-12-01 18:21  7.9K
linux-2.6.28-021-cle..>   2008-12-01 18:21  5.1K
linux-2.6.28-022-con..>   2008-12-01 18:21  3.1K
linux-2.6.28-023-con..>   2008-12-01 18:21  1.3K
linux-2.6.28-024-con..>   2008-12-01 18:21  15K
linux-2.6.28-025-mov..>   2008-12-01 18:21  2.4K
linux-2.6.28-026-cle..>   2008-12-01 18:21  1.7K
linux-2.6.28-113-res..>   2008-12-01 18:21  2.5K
linux-2.6.28-117-spe..>   2008-12-01 18:21  2.4K
linux-2.6.28-118-udp..>   2008-12-01 18:21  7.0K
series                    2008-12-01 18:21  1.6K
