[pnfs] [PATCH 4/4] nfs41: use a sorted free list for sessions slots
Benny Halevy
bhalevy at panasas.com
Mon Mar 17 05:07:32 EDT 2008
On Mar. 17, 2008, 1:27 +0200, Brent Welch <welch at panasas.com> wrote:
> Hmm - instead of lists or bitmaps, I would suggest just keeping
> track of the lowest free slot, and highest used slot. (I don't
> think you need the highest free slot, but you might.) These are
> cheap to maintain, and they give what you need to convey to the
> server, right? Each time you allocate the lowest free slot you
> scan up to the next free slot, which is O(n). Freeing a slot you
> maintain this with O(1). Maintaining the highest used slot
> requires a scan as well whenever you free the highest slot.
Right.
>
> Before you implement something more complicated, I'd like to
> see evidence that you are burning cycles in this code.
I don't see a big difference in complexity between maintaining
the per-slot state in a bit per slot structure which wastes memory
and dcache vs. keeping it in an efficient bitmap. The algorithm is
essentially the same...
Benny
>
>>>> Benny Halevy said:
> >
> > On Mar. 16, 2008, 21:50 +0200, Dean Hildebrand <seattleplus at gmail.com> wrote
> :
> > >
> > > Benny Halevy wrote:
> > >> On Mar. 15, 2008, 0:08 +0200, Dean Hildebrand <seattleplus at gmail.com> wro
> te:
> > >>
> > >>> Hi Benny,
> > >>>
> > >>> Could you add some comments to the code to explain your new algorithm
> > >>> and logic? Especially explaining the purpose of each loop and why we
> > >>> need to track the highest slot id.
> > >>>
> > >> Sure. no problem.
> > >> By the way, I thought about this more over the weekend, and managing the
> > >> free slots can be made even easier with a bitmap. Up to 64 slots can be
> > >> covered trivially and efficiently with a single unsigned long integer and
> > >> then the only remaining data we need to keep per-slot (for now) is its se
> qid.
> > >>
> > > Would that still require searching through the 64 slots (bits)? We
> > > should be able to do it in O(1) if we use a list and keep track of the
> > > pointer to the slot for each request, non?
> >
> > I'm not sure what use case you have in mind.
> > The one I'm thinking of is keeping the highest slot id in use.
> > when a slot becomes free you need to track which used now
> > has the highest slot ID. This is done to provide feedback to the server
> > so it can adjust and possibly rebalance the resources allocated for each cli
> ent.
> > If you want to do that in O(1) with a list keeping all used slots, the list
> > needs to be ordered, which requires O(n) for each insert :-(
> >
> > With the proposed bitmap, scanning can cost O(n) but the actual complexity c
> onstant
> > is very low comparing to scanning a list when you take into account memory
> > access penalties and caching.
> >
> > Note 1: Some architectures (x86 in particular :) provide for cpu instruction
> s
> > that find the first or last bit set in a word in O(1)
> >
> > Note 2: This does not have to be done in real time and can be updated period
> ically.
> >
> > Benny
> >
> > >
> > > Dean
> > >>
> > >>> Dean
> > >>>
> > >>> Benny Halevy wrote:
> > >>>
> > >>>> Also,
> > >>>> * Maintain highest_slotid correctly.
> > >>>> * sleep for a free slot only if failed to allocate one.
> > >>>>
> > >>>> Signed-off-by: Benny Halevy <bhalevy at panasas.com>
> > >>>> ---
> > >>>> fs/nfs/nfs4proc.c | 104 +++++++++++++++++++++++++++-------
> --------
> > >>>> include/linux/nfs4_session.h | 9 +--
> > >>>> 2 files changed, 70 insertions(+), 43 deletions(-)
> > >>>>
> > >>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > >>>> index 0203afe..d0e9c5a 100644
> > >>>> --- a/fs/nfs/nfs4proc.c
> > >>>> +++ b/fs/nfs/nfs4proc.c
> > >>>> @@ -210,6 +210,43 @@ static void renew_lease(const struct nfs_server *s
> erver, unsigned long timestamp
> > >>>> }
> > >>>>
> > >>>> #if defined(CONFIG_NFS_V4_1)
> > >>>> +
> > >>>> +static inline void __free_slot_sorted(struct nfs4_slot_table *tbl,
> > >>>> + struct nfs4_slot *slot)
> > >>>> +{
> > >>>> + int slot_nr = slot->slot_nr;
> > >>>> + struct nfs4_slot *sp;
> > >>>> +
> > >>>> + list_for_each_entry (sp, &tbl->free_slots, entry)
> > >>>> + if (slot_nr < sp->slot_nr) {
> > >>>> + list_add_tail(&slot->entry, &sp->entry);
> > >>>> + return;
> > >>>> + }
> > >>>> + list_add_tail(&slot->entry, &tbl->free_slots);
> > >>>> +}
> > >>>> +
> > >>>> +static inline void __nfs41_free_slot(struct nfs4_slot_table *tbl,
> > >>>> + struct nfs4_slot *slot)
> > >>>> +{
> > >>>> + int slot_nr;
> > >>>> + struct nfs4_slot *sp;
> > >>>> +
> > >>>> + spin_lock(&tbl->slot_tbl_lock);
> > >>>> + __free_slot_sorted(tbl, slot);
> > >>>> + slot_nr = slot->slot_nr;
> > >>>> + if (slot_nr == tbl->highest_slotid) {
> > >>>> + slot_nr--;
> > >>>> + list_for_each_entry_reverse(sp, &tbl->free_slots, entry
> ) {
> > >>>> + if (sp->slot_nr != slot_nr)
> > >>>> + break;
> > >>>> + slot_nr--;
> > >>>> + }
> > >>>> + tbl->highest_slotid = slot_nr;
> > >>>> + }
> > >>>> + spin_unlock(&tbl->slot_tbl_lock);
> > >>>> + rpc_wake_up_next(&tbl->slot_tbl_waitq);
> > >>>> +}
> > >>>> +
> > >>>> /* For pNFS filelayout data servers:
> > >>>> * the nfs_client is NULL - to signal no lease update.
> > >>>> * session is the data server.
> > >>>> @@ -235,7 +272,7 @@ static int nfs41_sequence_done(struct nfs_client *c
> lp,
> > >>>> slot = res->sr_slot;
> > >>>> if (slot == NULL) {
> > >>>> dprintk("%s: no slot: status %d\n", __func__, status);
> > >>>> - goto out; /* session recovery probably failed */
> > >>>> + goto ret; /* session recovery probably failed */
> > >>>> }
> > >>>>
> > >>>> switch (status) {
> > >>>> @@ -267,12 +304,7 @@ static int nfs41_sequence_done(struct nfs_client *
> clp,
> > >>>> spin_unlock(&clp->cl_lock);
> > >>>> }
> > >>>> no_update:
> > >>>> - /* Clear the 'busy' bit on the slot that was used */
> > >>>> - smp_mb__before_clear_bit();
> > >>>> - clear_bit(NFS4_SLOT_BUSY, &slot->flags);
> > >>>> - smp_mb__after_clear_bit();
> > >>>> -out:
> > >>>> - rpc_wake_up_next(&tbl->slot_tbl_waitq);
> > >>>> + __nfs41_free_slot(tbl, slot);
> > >>>> ret:
> > >>>> return status;
> > >>>> }
> > >>>> @@ -299,31 +331,28 @@ static int nfs4_sequence_done(struct nfs_server *
> server,
> > >>>> return ret;
> > >>>> }
> > >>>>
> > >>>> -static struct nfs4_slot *__nfs4_find_slot(struct nfs4_slot_table *tbl)
> > >>>> +static inline struct nfs4_slot *nfs4_find_slot(struct nfs4_slot_table
> *tbl,
> > >>>> + struct rpc_task *task,
> > >>>> + struct nfs41_sequence_ar
> gs *args)
> > >>>> {
> > >>>> - int i;
> > >>>> + int slot_nr;
> > >>>> struct nfs4_slot *slot;
> > >>>>
> > >>>> - for (i = 0; i < tbl->max_slots; ++i) {
> > >>>> - slot = &tbl->slots[i];
> > >>>> - if (!test_bit(NFS4_SLOT_BUSY, &slot->flags)) {
> > >>>> - set_bit(NFS4_SLOT_BUSY, &slot->flags);
> > >>>> - return slot;
> > >>>> - }
> > >>>> - }
> > >>>> -
> > >>>> - return NULL;
> > >>>> -}
> > >>>> -
> > >>>> -static struct nfs4_slot *nfs4_find_slot(struct nfs4_slot_table *tbl,
> > >>>> - struct rpc_task *task)
> > >>>> -{
> > >>>> - struct nfs4_slot *slot;
> > >>>> -
> > >>>> - rpc_sleep_on(&tbl->slot_tbl_waitq, task, NULL, NULL);
> > >>>> -
> > >>>> spin_lock(&tbl->slot_tbl_lock);
> > >>>> - slot = __nfs4_find_slot(tbl);
> > >>>> + if (likely(!list_empty(&tbl->free_slots))) {
> > >>>> + slot = list_first_entry(&tbl->free_slots, struct nfs4_s
> lot,
> > >>>> + entry);
> > >>>> + list_del(&slot->entry);
> > >>>> + slot_nr = slot->slot_nr;
> > >>>> + if (tbl->highest_slotid < slot_nr)
> > >>>> + tbl->highest_slotid = slot_nr;
> > >>>> + args->sa_slotid = slot_nr;
> > >>>> + args->sa_seqid = slot->seq_nr;
> > >>>> + args->sa_max_slotid = tbl->highest_slotid;
> > >>>> + } else {
> > >>>> + slot = NULL;
> > >>>> + rpc_sleep_on(&tbl->slot_tbl_waitq, task, NULL, NULL);
> > >>>> + }
> > >>>> spin_unlock(&tbl->slot_tbl_lock);
> > >>>>
> > >>>> return slot;
> > >>>> @@ -338,21 +367,21 @@ static int nfs41_setup_sequence(struct nfs4_sessi
> on *session,
> > >>>> struct nfs4_slot *slot;
> > >>>> struct nfs4_slot_table *tbl;
> > >>>>
> > >>>> + dprintk("--> %s\n", __func__);
> > >>>> tbl = &session->fore_channel.slot_table;
> > >>>> - slot = nfs4_find_slot(tbl, task);
> > >>>> + slot = nfs4_find_slot(tbl, task, args);
> > >>>>
> > >>>> - if (!slot)
> > >>>> + if (!slot) {
> > >>>> + dprintk("<-- %s: no free slots\n", __func__);
> > >>>> return -EAGAIN;
> > >>>> + }
> > >>>>
> > >>>> memcpy(args->sa_sessionid.data, session->sess_id,
> > >>>> NFS4_MAX_SESSIONID_LEN);
> > >>>> - args->sa_slotid = slot->slot_nr;
> > >>>> - args->sa_seqid = slot->seq_nr;
> > >>>> args->sa_cache_this = cache_reply;
> > >>>>
> > >>>> - spin_lock(&tbl->slot_tbl_lock);
> > >>>> - args->sa_max_slotid = tbl->max_slots;
> > >>>> - spin_unlock(&tbl->slot_tbl_lock);
> > >>>> + dprintk("<-- %s slot=%p slotid=%d seqid=%d\n", __func__,
> > >>>> + slot, args->sa_slotid, slot->seq_nr);
> > >>>>
> > >>>> res->sr_slot = slot;
> > >>>> res->sr_renewal_time = jiffies;
> > >>>> @@ -4541,11 +4570,11 @@ int nfs4_init_slot_table(struct nfs4_channel *c
> hannel)
> > >>>> dprintk("%s: slots=%p max_slots=%d\n", __func__,
> > >>>> tbl->slots, tbl->max_slots);
> > >>>>
> > >>>> - for (i = 0; i < channel->chan_attrs.max_reqs; ++i) {
> > >>>> + for (i = 0; i < channel->chan_attrs.max_reqs - 1; ++i) {
> > >>>> slot = &tbl->slots[i];
> > >>>> slot->slot_nr = i;
> > >>>> slot->seq_nr = 1;
> > >>>> - slot->flags = 0;
> > >>>> + list_add_tail(&slot->entry, &tbl->free_slots);
> > >>>> }
> > >>>>
> > >>>> out:
> > >>>> @@ -4573,6 +4602,7 @@ static int nfs4_init_channel(struct nfs4_channel
> *channel)
> > >>>> tbl = &channel->slot_table;
> > >>>>
> > >>>> spin_lock_init(&tbl->slot_tbl_lock);
> > >>>> + INIT_LIST_HEAD(&tbl->free_slots);
> > >>>> rpc_init_wait_queue(&tbl->slot_tbl_waitq, "Slot table");
> > >>>>
> > >>>> return 0;
> > >>>> diff --git a/include/linux/nfs4_session.h b/include/linux/nfs4_session.
> h
> > >>>> index 8ef8d6d..542d099 100644
> > >>>> --- a/include/linux/nfs4_session.h
> > >>>> +++ b/include/linux/nfs4_session.h
> > >>>> @@ -7,11 +7,6 @@
> > >>>> #include <linux/smp_lock.h>
> > >>>> #include <linux/sunrpc/sched.h>
> > >>>>
> > >>>> -/* The flags for the nfs4_slot struct */
> > >>>> -#define NFS4_SLOT_BUSY 0X0 /* Slot in use */
> > >>>> -#define NFS4_SLOT_RECLAIMED 0x1 /* Slot has been reclaimed by
> > >>>> - the server */
> > >>>> -
> > >>>> struct nfs4_channel_attrs {
> > >>>> u32 headerpadsz;
> > >>>> u32 max_rqst_sz;
> > >>>> @@ -25,14 +20,16 @@ struct nfs4_channel_attrs {
> > >>>> struct nfs4_slot {
> > >>>> u32 slot_nr;
> > >>>> u32 seq_nr;
> > >>>> - unsigned long flags;
> > >>>> + struct list_head entry;
> > >>>> };
> > >>>>
> > >>>> struct nfs4_slot_table {
> > >>>> struct nfs4_slot *slots;
> > >>>> spinlock_t slot_tbl_lock;
> > >>>> + struct list_head free_slots;
> > >>>> struct rpc_wait_queue slot_tbl_waitq;
> > >>>> int max_slots;
> > >>>> + int highest_slotid;
> > >>>> };
> > >>>>
> > >>>> struct nfs4_channel {
> > >>>>
> > >>>>
> > >>
> >
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
> >
>
> --
> Brent Welch
> Director, Software Architecture, Panasas Inc
> The Leader in Parallel Storage
>
> www.panasas.com
> welch at panasas.com
More information about the pNFS
mailing list