On 4/11/19, Frank Filz <ffilzlnx@mindspring.com> wrote:
> From: Daniel Gryniewicz [mailto:dang@redhat.com]
> Sent: Thursday, April 11, 2019 5:28 AM
>
> Okay, I misunderstood the situation. That fix won't work, because
> mdcache_lru_unref_chunk() takes the qlane lock, and it also takes the
> content lock, which must be taken before the qlane lock.
>
> The problem here is that the parent has a pointer to the chunk and a ref
> on it, and the chunk has a pointer to the parent but no ref on it. This
> means that the parent refcount can go to zero while the chunk still has a
> pointer. I think we need to force remove all the chunks when we recycle
> the parent, and null out the parent pointer in the chunk as well. We'll
> have to audit all the uses of the parent pointer and make sure the call
> path has a ref to the parent, or that it can handle a null parent pointer.
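>
> Roughly what I have in mind is the sketch below. It is only a sketch:
> mdc_force_clean_chunks() is a made-up name, the fsobj.fsdir.chunks list
> and the chunk's list-link field are from memory, and the locking that
> has to surround this is not shown.
>
> static void
> mdc_force_clean_chunks(mdcache_entry_t *parent)
> {
>     struct glist_head *node, *next;
>
>     /* Walk every chunk still attached to this directory entry. */
>     glist_for_each_safe(node, next, &parent->fsobj.fsdir.chunks) {
>         struct dir_chunk *chunk =
>             glist_entry(node, struct dir_chunk, chunks);
>
>         /* Null the back pointer first, so anyone still holding a ref
>          * on the chunk sees NULL instead of a recycled entry. */
>         chunk->parent = NULL;
>         glist_del(&chunk->chunks);
>
>         /* Drop the ref the parent held on the chunk. */
>         mdcache_lru_unref_chunk(chunk);
>     }
> }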
Hi Daniel,
By force cleaning, did you have something like this in mind?
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/451311/1/src/FSAL/Stack...
Thanks,
Pradeep
>>
>> Thoughts? Ideas?
>
> This sounds good. It looks like lru_reap_chunk_impl and
> mdcache_lru_unref_chunk need to be able to handle a NULL parent pointer.
>
> There are other functions that use chunk->parent but are also passed the
> parent. Those functions should probably assert that parent == chunk->parent
> (or entry == chunk->parent in the case of avl_dirent_set_deleted).
> mdcache_clean_dirent_chunk could be passed the parent to facilitate this
> check (both places it's called from already have the parent). Note that
> function already sets chunk->parent = NULL.
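>
> To make the shape of that concrete, a sketch (not a patch; only the new
> parameter and the sanity check matter here, and allowing a NULL
> chunk->parent is my assumption about the force-clean case):
>
> #include <assert.h>
>
> /* Pass the parent in so the back pointer can be sanity-checked; the
>  * caller is expected to hold a ref on parent. */
> static void
> mdcache_clean_dirent_chunk(mdcache_entry_t *parent, struct dir_chunk *chunk)
> {
>     /* chunk->parent may already be NULL if the chunk was force-cleaned;
>      * otherwise it must agree with the parent the caller passed in. */
>     assert(chunk->parent == NULL || chunk->parent == parent);
>
>     /* ... existing dirent cleanup ... */
>
>     chunk->parent = NULL;
> }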
>
> Frank
>
>> On 4/10/19 5:40 PM, Pradeep wrote:
>> > Hi Daniel,
>> >
>> > I already have that fix
>> > (https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/441566) you mention
>> > below. I believe this race is different. What I meant is that something
>> > like this may be needed; otherwise the chunk cleanup will end up using
>> > the wrong parent pointer:
>> >
>> > https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/450825
>> >
>> > Thanks,
>> > Pradeep
>> >
>> > On Wed, Apr 10, 2019 at 11:31 AM Daniel Gryniewicz <dang@redhat.com> wrote:
>> >
>> > This is probably 389f1f09da19c4f42046474af3869815b6bd5959, which was
>> > fixed in 2.7.2.
>> >
>> > Daniel
>> >
>> > On 4/10/19 1:55 PM, Pradeep wrote:
>> > > Hello,
>> > >
>> > > I'm hitting a hang where rmv_detached_dirent() is stuck on the
>> > > spinlock forever.
>> > >
>> > > #0  0x00007feb01a9a4e5 in pthread_spin_lock () from /lib64/libpthread.so.0
>> > > #1  0x00000000005519e7 in rmv_detached_dirent (parent=0x7feaeb8dd600, dirent=0x7fea9fbd8700)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:420
>> > > #2  0x00000000005522e8 in mdcache_avl_remove (parent=0x7feaeb8dd600, dirent=0x7fea9fbd8700)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_avl.c:256
>> > > #3  0x0000000000547752 in mdcache_clean_dirent_chunk (chunk=0x7fea9f86f970)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:454
>> > > #4  0x00000000005386a2 in lru_clean_chunk (chunk=0x7fea9f86f970)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2078
>> > > #5  0x000000000053881b in mdcache_lru_unref_chunk (chunk=0x7fea9f86f970)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2097
>> > > #6  0x000000000053698c in chunk_lru_run_lane (lane=14)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1509
>> > > #7  0x0000000000536d26 in chunk_lru_run (ctx=0x7feafe00f580)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1563
>> > > #8  0x0000000000508ad9 in fridgethr_start_routine (arg=0x7feafe00f580)
>> > >     at /usr/src/debug/nfs-ganesha-2.7.1/support/fridgethr.c:550
>> > > #9  0x00007feb01a95e25 in start_thread () from /lib64/libpthread.so.0
>> > > #10 0x00007feb0139dbad in clone () from /lib64/libc.so.6
>> > >
>> > > 420             pthread_spin_lock(&parent->fsobj.fsdir.spin);
>> > > (gdb) print parent->fsobj.fsdir.spin
>> > > $1 = -1
>> > > (gdb) print parent->obj_handle.type
>> > > $2 = REGULAR_FILE
>> > >
>> > > Looks like when the thread is in this path, the parent got reused
>> > > for a different object in the filesystem. From looking at the code,
>> > > this seems possible.
>> > >
>> > > Let's say thread1 goes through the reuse path:
>> > >
>> > >   mdcache_lru_get() ->
>> > >     mdcache_lru_clean() ->
>> > >       mdc_clean_entry() ->
>> > >         mdcache_dirent_invalidate_all() ->
>> > >           mdcache_lru_unref_chunk()
>> > >             This will call lru_clean_chunk() only if refcnt is zero.
>> > >             Let's say another thread (thread2 below, the background
>> > >             thread) has incremented it, so thread1 just returns and
>> > >             the entry now gets reused.
>> > >
>> > >
>> > > Now the background thread (thread2) comes up to clean chunks and
>> > > increments the refcnt:
>> > >
>> > >   chunk_lru_run() ->
>> > >     chunk_lru_run_lane() ->
>> > >       mdcache_lru_unref_chunk()
>> > >         Here we unlock the qlane lock and then decrement the refcnt.
>> > >         When this thread unlocks, thread1 grabs the qlane lock and
>> > >         skips the chunk because refcnt is > 0. By the time we reach
>> > >         mdcache_lru_unref_chunk(), the parent has been reused by
>> > >         thread1 for a different object. mdcache_lru_unref_chunk()
>> > >         now makes progress because refcnt became zero, but the
>> > >         parent is invalid, so it gets stuck in rmv_detached_dirent().
>> > >         (See the sketch below.)
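>> > >
>> > > Written out, the problematic ordering is roughly this (simplified;
>> > > not the actual chunk_lru_run_lane() code: the lane walk and the
>> > > atomic refcnt helpers are left out, and the function name is made up):
>> > >
>> > >   static void
>> > >   lru_lane_pass_one_chunk(struct lru_q_lane *qlane,
>> > >                           struct dir_chunk *chunk)
>> > >   {
>> > >       QLOCK(qlane);
>> > >       chunk->chunk_lru.refcnt++;      /* keep the chunk for this pass */
>> > >       /* ... demote/requeue work on the chunk ... */
>> > >       QUNLOCK(qlane);                 /* <-- the window opens here */
>> > >
>> > >       /* thread1 can run its reuse path now: it sees refcnt > 0,
>> > >        * skips the chunk, and recycles the parent entry. */
>> > >
>> > >       mdcache_lru_unref_chunk(chunk); /* refcnt drops to 0, so
>> > >                                        * lru_clean_chunk() runs and
>> > >                                        * follows a stale chunk->parent */
>> > >   }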
>> > >
>> > > I think we should hold the qlane lock in chunk_lru_run_lane() until
>> > > the refcnt is decremented, to make sure that reaping of the parent is
>> > > not possible. Since mdcache_lru_unref_chunk() already takes the lock
>> > > later, we could probably pass a flag indicating whether the qlane
>> > > lock is already held or not (similar to content_lock), roughly as in
>> > > the sketch below. Please let me know if there is a better approach
>> > > to refactor this.
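>> > >
>> > > Roughly what I mean (sketch only: LRU_UNREF_QLOCKED and
>> > > qlane_of_chunk() are placeholder names, and the real code uses the
>> > > atomic refcnt helpers rather than a bare decrement):
>> > >
>> > >   void
>> > >   mdcache_lru_unref_chunk(struct dir_chunk *chunk, uint32_t flags)
>> > >   {
>> > >       struct lru_q_lane *qlane = qlane_of_chunk(chunk);
>> > >       bool qlocked = (flags & LRU_UNREF_QLOCKED) != 0;
>> > >
>> > >       if (!qlocked)
>> > >           QLOCK(qlane);
>> > >
>> > >       /* The decrement and the cleanup both happen under the qlane
>> > >        * lock, so the parent cannot be reaped in between. */
>> > >       if (--chunk->chunk_lru.refcnt == 0)
>> > >           lru_clean_chunk(chunk);
>> > >
>> > >       if (!qlocked)
>> > >           QUNLOCK(qlane);
>> > >   }
>> > >
>> > > chunk_lru_run_lane() would then pass LRU_UNREF_QLOCKED while it still
>> > > holds the lane lock, instead of dropping the lock before the unref.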
>> > >
>> > > Thanks,
>> > > Pradeep