That's normal for a worker thread. It waits on the condition variable
until work is available, then wakes up and does the work.
op_ctx is safe: it's stored in thread-local storage, so each thread has
its own copy and updating it can't race with other threads.
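
To illustrate why a thread-local pointer needs no locking, here is a
minimal sketch. It is not Ganesha code: demo_ctx, demo_op_ctx,
demo_worker and demo_contexts_are_isolated are hypothetical names made
up for the example, and it assumes a compiler that supports the
__thread storage class (GCC or Clang).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-request context; not Ganesha's struct. */
struct demo_ctx {
	int request_id;
};

/* One pointer per thread; each thread starts with its own NULL copy. */
static __thread struct demo_ctx *demo_op_ctx;

/* Point this thread's demo_op_ctx at its own context and read it back. */
static void *demo_worker(void *arg)
{
	demo_op_ctx = arg;	/* affects only the calling thread */
	return demo_op_ctx;	/* always this thread's own value */
}

/*
 * Run two workers. Returns 1 if each worker read back its own context
 * and the caller's copy of demo_op_ctx was left untouched (still NULL).
 */
int demo_contexts_are_isolated(void)
{
	struct demo_ctx a = { 1 }, b = { 2 };
	pthread_t ta, tb;
	void *ra = NULL, *rb = NULL;

	if (pthread_create(&ta, NULL, demo_worker, &a) != 0)
		return 0;
	if (pthread_create(&tb, NULL, demo_worker, &b) != 0) {
		pthread_join(ta, NULL);
		return 0;
	}
	pthread_join(ta, &ra);
	pthread_join(tb, &rb);

	return ra == &a && rb == &b && demo_op_ctx == NULL;
}
```

Calling demo_contexts_are_isolated() from a program built with -pthread
should return 1, since no thread can see another thread's pointer. Note
that only the pointer is per-thread; whatever it points to still needs
its own protection if it is shared.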
Usually, on a hung Ganesha, you'll find at least one thread with a long
backtrace that is waiting on something at the bottom, or hard looping,
or something like that. Somewhere along that call path it's holding a
lock that other threads are waiting on.
One possibility is that the async handling is messed up somehow, and
sockets aren't being released back to epoll, so we stop listening on
them. You can turn on ntirpc tracing (set TIRPC = FULL_DEBUG in
Components, and RPC_Debug_Flags = 0xffffffff in NFS_CORE_PARAM). It
should then log xprts being added to and removed from epoll, so you can
check whether they are being put back.
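
As a rough sketch, the two settings would look something like this in
ganesha.conf, assuming the usual layout where COMPONENTS sits inside the
LOG block (merge them into any LOG and NFS_CORE_PARAM blocks you already
have rather than adding duplicates):

```
LOG {
	COMPONENTS {
		TIRPC = FULL_DEBUG;
	}
}

NFS_CORE_PARAM {
	RPC_Debug_Flags = 0xffffffff;
}
```

Expect a lot of log output with every flag set, so it's best to enable
this only while reproducing the hang.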
Daniel
On 12/18/20 4:47 AM, Chakra Divi wrote:
Hi Daniel,
Thanks for responding. I attached gdb to the hung Ganesha and captured backtraces for
all threads. All worker threads are waiting on the condition variable, and I could not see
any thread holding a lock. Is there anything I need to take care of while updating
op_ctx?
Backtrace of one of the worker threads:
*************************************************************************
(gdb) bt
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007f98a96a3d7b in work_pool_thread (arg=0x7f9899373020) at /root/chakra/nfs-ganesha_V3-stable/nfs-ganesha/src/libntirpc/src/work_pool.c:215
#2 0x00007f98a9e9b6ba in start_thread (arg=0x7f987b2fe700) at pthread_create.c:333
#3 0x00007f98a99c941d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
*************************************************************************
The fio command I'm using (note the odd output at the end):
*************************************************************************
root@testnode:~# fio --direct=1 --name=testfile --directory=/nfs4/testnode/test --thread --runtime=120 --time_based --refill_buffers --overwrite=0 --ioengine=libaio --rw=read --bs=128k --numjobs=4 --filesize=4g --fallocate=none --group_reporting --iodepth=512
testfile: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K, ioengine=libaio, iodepth=512
...
fio-2.2.10
Starting 4 threads
testfile: Laying out IO file(s) (1 file(s) / 4096MB)
testfile: Laying out IO file(s) (1 file(s) / 4096MB)
testfile: Laying out IO file(s) (1 file(s) / 4096MB)
testfile: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 4 (f=4): [R(4)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050441d:06h:43m:07s]
*************************************************************************
Regards,
Chakra