[pnfs] 2.6.17 pNFS client hanging

William A.(Andy) Adamson andros at citi.umich.edu
Wed Jul 19 16:07:03 EDT 2006


so now tigran is not experiencing the hang

tigran.mkrtchyan at desy.de said:
> P.S.: I got the updates from CVS. client do not hang any more, but do  not
> recognizes chimera as pNFS. 

dean is still hanging - here is the code from mempool_alloc.

 * this function only sleeps if the alloc_fn function sleeps or
 * returns NULL. Note that due to preallocation, this function
 * *never* fails when called from process contexts. (it might
 * fail if called from an IRQ context.)

        /* Now start performing page reclaim */
        gfp_temp = gfp_mask;
        init_wait(&wait);
        prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
        smp_mb();
        if (!pool->curr_nr)
                io_schedule();
        finish_wait(&pool->wait, &wait);

well, the call may never fail, but it hangs! i need to understand the 
TASK_UNINTERRUPTIBLE flag.
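
to make the "never fails, but can wait forever" behaviour concrete, here is a rough user-space sketch. it is a toy model, not the kernel code: every name in it (toy_pool, toy_pool_try_alloc, toy_pool_free, TOY_POOL_MIN) is made up for illustration, and it assumes the underlying allocator always fails so that only the preallocated reserve is used. where the real mempool_alloc would sleep in TASK_UNINTERRUPTIBLE (no signal can wake it, only a wake-up on pool->wait), the toy returns NULL instead.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_MIN 2  /* hypothetical reserve size, chosen for the example */

/* Toy stand-in for a mempool: only the preallocated reserve is modelled. */
struct toy_pool {
    int   curr_nr;                  /* elements currently in the reserve */
    void *elements[TOY_POOL_MIN];   /* the preallocated elements */
};

static char toy_storage[TOY_POOL_MIN][64];  /* backing memory for the reserve */

static void toy_pool_init(struct toy_pool *p)
{
    for (int i = 0; i < TOY_POOL_MIN; i++)
        p->elements[i] = toy_storage[i];
    p->curr_nr = TOY_POOL_MIN;
}

/*
 * Assumption: the normal allocator is failing, so we go straight to the
 * reserve. Returns an element, or NULL at the point where the kernel
 * would instead sleep until somebody gives an element back.
 */
static void *toy_pool_try_alloc(struct toy_pool *p)
{
    if (p->curr_nr > 0)
        return p->elements[--p->curr_nr];
    return NULL;  /* kernel: prepare_to_wait() + io_schedule() here */
}

/* Giving an element back refills the reserve; in the kernel this is
 * also the point where sleepers on the pool's wait queue get woken. */
static void toy_pool_free(struct toy_pool *p, void *element)
{
    assert(element != NULL);
    if (p->curr_nr < TOY_POOL_MIN)
        p->elements[p->curr_nr++] = element;
}
```

in this model the third allocation only succeeds after a free. if the elements handed out earlier are never returned, the caller would wait indefinitely, which is one possible way to end up with a trace like the one below. that is a guess, not a diagnosis.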

anyway, this is in the code path prior to any i/o, and prior to the trigger 
for the layoutget call.


from dean:

Here is the stuck thread:
Jul 19 11:23:25 foufoune kernel: Call Trace:
Jul 19 11:23:25 foufoune kernel:  <c0465b9b> io_schedule+0x26/0x30  <c0143b99> mempool_alloc+0xc5/0xc7
Jul 19 11:23:25 foufoune kernel:  <c01d6191> nfs_writedata_alloc+0x17/0x96  <c01d7afd> nfs_flush_multi+0x76/0x16e
Jul 19 11:23:25 foufoune kernel:  <c01d7dfc> pnfs_flush_one+0xbd/0xf7  <c01d6b04> nfs_flush_list+0x90/0xe4
Jul 19 11:23:25 foufoune kernel:  <c01d6bad> nfs_flush_inode+0x55/0x61  <c01d85dc> nfs_writepages+0xa4/0x153
Jul 19 11:23:25 foufoune kernel:  <c0146b9f> do_writepages+0x26/0x3f  <c01407eb> __filemap_fdatawrite_range+0x5b/0x67
Jul 19 11:23:25 foufoune kernel:  <c0141789> filemap_fdatawrite+0x26/0x28  <c015fd6b> do_fsync+0x4b/0x9c
Jul 19 11:23:25 foufoune kernel:  <c015fddc> __do_fsync+0x20/0x2f  <c015fe0a> sys_fsync+0xd/0xf
Jul 19 11:23:25 foufoune kernel:  <c0103927> syscall_call+0x7/0xb


