[pnfs] [PATCH 19/37] pnfs: client layout cache: allow waiting for alloc_layout
Muntz, Daniel
Dan.Muntz at netapp.com
Thu Jan 3 15:10:51 EST 2008
Sometimes it's just better to take/release locks in "unbalanced" places.
I like to put big, bold comments in these places.
-Dan
-----Original Message-----
From: Benny Halevy [mailto:bhalevy at panasas.com]
Sent: Thursday, January 03, 2008 12:04 PM
To: Dean Hildebrand
Cc: pnfs at linux-nfs.org
Subject: Re: [pnfs] [PATCH 19/37] pnfs: client layout cache: allow
waiting for alloc_layout
On Jan. 03, 2008, 21:29 +0200, Dean Hildebrand <seattleplus at gmail.com>
wrote:
> I think we need to talk about all these waits and locks on the conf
> call. This code has almost no comments, so understanding their
purpose
> is really difficult.
>
> There seems to be multiple nested locks and waits, with the locks and
unlocks happening in different functions. This could easily lead to
deadlocks, maybe not by you, but future maintainers will have a heck of
a time ensuring they don't accidentally change things so a deadlock
occurs.
>
>
> My personal view is that each function should grab and release its
> locks at the same level in the call hierarchy. For example,
> pnfs_update_layout grabs locks in lower level function calls
> (get_alloc_layout), and then goes on to release and reacquire these
> locks at the high level. This is bad.
The lower level functions either search or allocate the layout structure
and return it locked and referenced. it's possible to only increment the
reference count and take the lock right before it's needed in the
calling function.
That might look simpler but will be less efficient since reference
counting will take one atomic (== bus locking) operation and the lock
will take another one.
The reason the lock is released around the call to get_layout is that
get_layout sends an rpc and you certainly don't want to keep the
spinlock around it.
> Not only that, but it locks
> nfsi->lo_lock in the get_lock_current_layout function but also in
> nfsi->random
> other parts of the code (end of get_alloc_layout). At the minimum, a
not really random. The lock is taken by exactly two low-level helper
functions:
get_lock_current_layout and get_alloc_layout, when they return non-NULL.
It is released in exactly one low-level helper function:
put_unlock_current_layout.
The only special case the lock is release and taken is in
pnfs_update_layout, around the call to get_layout.
> single helper function should lock and unlock lo_lock.
>
> goto's have their purpose, but C has lots of constructs that allow
> code simplification at the expense of readability. Programming style
> in 1974 (the year of that paper) applies little to programming in
> 2007. For example, recursion is useful and cool, but most programmers
> these days will not understand what it is doing. And even for those
> who do understand recursion, it will take longer to review the code.
> Let's just stick with regular loops and not sacrifice readability.
OK. readability is good :)
>
> more below.
>
>> Well, the allocation function might block so a spinlock is
>> inappropriate, a mutex could have been used instead.
> Do you mean the layout driver will block? Why?
kmalloc can block when memory is low when the system evicts dirty
pages or swaps out clean pages.
>> However, on a different thread,
>> Trond expressed his desire to get rid of locks where possible and use
>> waits instead.
> waits are ok, but pnfs_inject_layout, get_lock_current_layout,
> put_lock_current_layout, get_alloc_layout, pnfs_update_layout,
> pnfs_getboundary, seem to mix waits with locks in scattered places.
I
> think these are the functions that need to be reorg'd.
Hmm, the locks and waits you mentioned have different purposes.
spinlocks are used for efficient, short term locking where the critical
sections taking the locks are guaranteed to never block.
Waits are used for longer term synchronization where the code holding
the lock is expected to block or just take longer time to execute.
>
>> The idea here is to solve the potential race between
>> multiple threads wanting to allocate nfsi->current_layout while
>> allowing signals to interrupt and wake up any waiting thread.
>>
> I like the idea, we just need to get the code to do this in an
> appropriate way. Would it be possible to re-org the code so that
> get_lock_current_layout, put_lock_current_layout are always called in
> the same function? In addition, get_lock_current_layout,
> put_lock_current_layout would be the only 2 functions locking the
> nfsi->lo_lock?
The only way I see to simplify that is what I described above, i.e.
take the spinlock locally in the higher level functions, sacrificing
performance. We can that if more folks are confused about the current
implementation, though I think that the model is simple enough to follow
once you grasp the general idea.
Benny
> Dean
>
>
>>
>>> res = wait_on_bit_lock(&nfsi->pnfs_layout_state,
NFS_INO_LAYOUT_ALLOC,
>>> pnfs_wait_schedule, TASK_INTERRUPTIBLE);
>>>
>>> if (res) {
>>> lo = ERR_PTR(res);
>>> goto err;
>>> }
>>>
>>> if (nfsi->current_layout) {
>>> clear_bit_unlock(NFS_INO_LAYOUT_ALLOC,
>>> &nfsi->pnfs_layout_state);
>>> wake_up_bit(&nfsi->pnfs_layout_state,
NFS_INO_LAYOUT_ALLOC);
>>>
>>> goto out;
>>> }
>>>
>>> lo = nfsi->current_layout = alloc_init_layout(ino, io_ops);
>>> clear_bit_unlock(NFS_INO_LAYOUT_ALLOC,
&nfsi->pnfs_layout_state);
>>> wake_up_bit(&nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC);
>>>
>> This patch represents an intermediate version of this function before
the
>> introduction of {get_lock,put_unlock}_current_layout
>> (see
http://git.linux-nfs.org/?p=bhalevy/linux-pnfs.git;a=commitdiff;h=522058
9afa99bdc8f5301e26db8d5c5870c285c8)
>> I hope the backward goto makes more sense over there when an extra
step (atomic locking
>> and referencing of nfsi->current_layout) is needed.
>> This can be changed to an infinite loop with explicit continue and
break points
>> like the following code
>> (if it looks simpler, I'm not sure... see
http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf)
>>
>> static struct pnfs_layout_type *
>> get_alloc_layout(struct inode *ino,
>> struct layoutdriver_io_operations *io_ops)
>> {
>> struct nfs_inode *nfsi = NFS_I(ino);
>> struct pnfs_layout_type *lo;
>> int res;
>>
>> dprintk("%s Begin\n", __FUNCTION__);
>>
>>
>> while ((lo = get_lock_current_layout(nfsi)) == NULL) {
>> /* Compete against other threads on who's doing the
allocation,
>> * wait until bit is cleared if we lost this race.
>> */
>> res = wait_on_bit_lock(&nfsi->pnfs_layout_state,
NFS_INO_LAYOUT_ALLOC,
>> pnfs_wait_schedule,
TASK_INTERRUPTIBLE);
>> if (res) {
>> lo = ERR_PTR(res);
>> goto err;
>> }
>>
>> /* Was current_layout already allocated while we slept?
>> * If so, try to get it again using
get_lock_current_layout.
>> */
>> if (nfsi->current_layout) {
>> clear_bit_unlock(NFS_INO_LAYOUT_ALLOC,
>> &nfsi->pnfs_layout_state);
>> continue;
>>
> ^^^^^^^^^^
>
>> }
>>
>> /* Try to allocate current_layout and grab the lo_lock
*/
>> lo = nfsi->current_layout = alloc_init_layout(ino,
io_ops);
>> spin_lock(&nfsi->lo_lock);
>>
>> /* release the NFS_INO_LAYOUT_ALLOC bit and wake up any
waiter */
>> clear_bit_unlock(NFS_INO_LAYOUT_ALLOC,
&nfsi->pnfs_layout_state);
>> wake_up_bit(&nfsi->pnfs_layout_state,
NFS_INO_LAYOUT_ALLOC);
>>
>> /* we're done here. just check whether alloc_init_layout
succeeded. */
>> if (lo)
>> break;
>>
>> spin_unlock(&nfsi->lo_lock);
>> lo = ERR_PTR(-ENOMEM);
>> goto err;
>> }
>>
>> dprintk("%s Return %p\n", __FUNCTION__, lo);
>> return lo;
>>
>> err:
>> dprintk("%s Return error %ld\n", __FUNCTION__, PTR_ERR(lo));
>> return lo;
>> }
>>
>>
>>> Dean
>>>
>>
> _______________________________________________
> pNFS mailing list
> pNFS at linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
_______________________________________________
pNFS mailing list
pNFS at linux-nfs.org
http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
More information about the pNFS
mailing list