On 4/11/19, Frank Filz <ffilzlnx@mindspring.com> wrote:
> From: Daniel Gryniewicz [mailto:dang@redhat.com]
> Sent: Thursday, April 11, 2019 5:28 AM
>
> Okay, I misunderstood the situation.  That fix won't work, because
> mdcache_lru_unref_chunk() takes the qlane lock, and it also takes the
> content lock, which must be taken before the qlane lock.
>
> The problem here is that the parent has a pointer to the chunk and a ref
> on it, and the chunk has a pointer to the parent but no ref on it.
> This means that the parent refcount can go to zero while the chunk has a
> pointer.  I think we need to force remove all the chunks when we recycle
> the parent, and null out the parent pointer in the chunk as well.  We'll
> have to audit all the uses of the parent pointer and make sure the call
> path has a ref to the parent, or that it can handle a null parent
> pointer.
Hi Daniel,
By "force cleaning", did you have something like this in mind?
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/451311/1/src/FSAL/Stack...
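
In case the diff is hard to follow, the gist is roughly this (my sketch,
not the exact patch; the chunk-list walk and the field names follow the
usual MDCACHE conventions and may not match the tree exactly):

    /* Sketch: run while recycling the parent in mdcache_lru_clean(),
     * with the entry's content_lock held for write.  Walk all of the
     * directory's chunks, detach each back-pointer, and drop the
     * parent's ref on each chunk. */
    struct glist_head *glist, *glistn;

    glist_for_each_safe(glist, glistn, &entry->fsobj.fsdir.chunks) {
            struct dir_chunk *chunk =
                    glist_entry(glist, struct dir_chunk, chunks);

            chunk->parent = NULL;  /* late users must now check for NULL */
            mdcache_lru_unref_chunk(chunk);
    }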
Thanks,
Pradeep
>>
>> Thoughts?  Ideas?
>
> This sounds good. It looks like lru_reap_chunk_impl and
> mdcache_lru_unref_chunk need to be able to handle a NULL parent pointer.
>
> There are other functions that use chunk->parent, but have been passed
> the parent. Those functions should perhaps assert that parent ==
> chunk->parent (or entry == chunk->parent in the case of
> avl_dirent_set_deleted). mdcache_clean_dirent_chunk could be passed the
> parent to facilitate this check (both places it's called from already
> have the parent). Note that that function already sets chunk->parent =
> NULL.
>
> Frank
>
>> On 4/10/19 5:40 PM, Pradeep wrote:
>> > Hi Daniel,
>> >
>> > I already have that fix
>> > (https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/441566) you mention
>> > below. I believe this race is different. What I meant is something
>> > like this may be needed; otherwise the chunk cleanup will end up
>> > using the wrong parent pointer:
>> > https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/450825
>> >
>> > Thanks,
>> > Pradeep
>> >
>> > On Wed, Apr 10, 2019 at 11:31 AM Daniel Gryniewicz
>> > <dang@redhat.com> wrote:
>> >
>> >     This is probably 389f1f09da19c4f42046474af3869815b6bd5959 fixed in
>> >     2.7.2.
>> >
>> >     Daniel
>> >
>> >     On 4/10/19 1:55 PM, Pradeep wrote:
>> >      > Hello,
>> >      >
>> >      > I'm hitting a hang where rmv_detached_dirent() is stuck on the
>> >      > spinlock forever.
>> >      >
>> >      > #0  0x00007feb01a9a4e5 in pthread_spin_lock () from /lib64/libpthread.so.0
>> >      > #1  0x00000000005519e7 in rmv_detached_dirent (parent=0x7feaeb8dd600, dirent=0x7fea9fbd8700)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:420
>> >      > #2  0x00000000005522e8 in mdcache_avl_remove (parent=0x7feaeb8dd600, dirent=0x7fea9fbd8700)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_avl.c:256
>> >      > #3  0x0000000000547752 in mdcache_clean_dirent_chunk (chunk=0x7fea9f86f970)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:454
>> >      > #4  0x00000000005386a2 in lru_clean_chunk (chunk=0x7fea9f86f970)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2078
>> >      > #5  0x000000000053881b in mdcache_lru_unref_chunk (chunk=0x7fea9f86f970)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2097
>> >      > #6  0x000000000053698c in chunk_lru_run_lane (lane=14)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1509
>> >      > #7  0x0000000000536d26 in chunk_lru_run (ctx=0x7feafe00f580)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1563
>> >      > #8  0x0000000000508ad9 in fridgethr_start_routine (arg=0x7feafe00f580)
>> >      >     at /usr/src/debug/nfs-ganesha-2.7.1/support/fridgethr.c:550
>> >      > #9  0x00007feb01a95e25 in start_thread () from /lib64/libpthread.so.0
>> >      > #10 0x00007feb0139dbad in clone () from /lib64/libc.so.6
>> >      >
>> >      > 420             pthread_spin_lock(&parent->fsobj.fsdir.spin);
>> >      > (gdb) print parent->fsobj.fsdir.spin
>> >      > $1 = -1
>> >      > (gdb) print parent->obj_handle.type
>> >      > $2 = REGULAR_FILE
>> >      >
>> >      > Looks like when the thread is in this path, the parent got
>> >      > reused for a different object in the filesystem. From looking
>> >      > at the code, this seems possible:
>> >      >
>> >      > Let's say thread1 goes through the reuse path:
>> >      >
>> >      >   * mdcache_lru_get() ->
>> >      >       o mdcache_lru_clean() ->
>> >      >           + mdc_clean_entry() ->
>> >      >               # mdcache_dirent_invalidate_all()
>> >      >                   * mdcache_lru_unref_chunk()
>> >      >                   * This will call lru_clean_chunk() only if the
>> >      >                     refcnt is zero. Let's say another thread (the
>> >      >                     background thread, thread2 below) has
>> >      >                     incremented it. So thread1 returns without
>> >      >                     cleaning, and the entry then gets reused.
>> >      >
>> >      >
>> >      > Now the background thread (thread2) comes up to clean chunks
>> >      > and increments the refcnt:
>> >      >
>> >      >   * chunk_lru_run()
>> >      >       o chunk_lru_run_lane()
>> >      >           + mdcache_lru_unref_chunk()
>> >      >           + Here we unlock the qlane lock and only then decrement
>> >      >             the refcnt. When this thread unlocks, thread1 grabs
>> >      >             the qlane lock and skips the chunk because its refcnt
>> >      >             is still nonzero. By the time we reach
>> >      >             mdcache_lru_unref_chunk(), the parent will have been
>> >      >             reused by thread1 for a different object. Now
>> >      >             mdcache_lru_unref_chunk() makes progress because the
>> >      >             refcnt became zero, but the parent is invalid, so it
>> >      >             gets stuck in rmv_detached_dirent().
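>> >      >
>> >      > Condensed into a timeline (my reading of the paths above):
>> >      >
>> >      >     /* T2: chunk_lru_run_lane(): takes a chunk ref, drops the
>> >      >      *     qlane lock before unref'ing.
>> >      >      * T1: mdcache_lru_unref_chunk(): refcnt 2 -> 1, no clean;
>> >      >      *     the entry is recycled for a new object (a
>> >      >      *     REGULAR_FILE here).
>> >      >      * T2: mdcache_lru_unref_chunk(): refcnt 1 -> 0, cleans the
>> >      >      *     chunk through the stale chunk->parent and spins
>> >      >      *     forever in rmv_detached_dirent(). */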
>> >      >
>> >      > I think we should hold the qlane lock in chunk_lru_run_lane()
>> >      > until the refcnt is decremented, to make sure that reaping of
>> >      > the parent is not possible. Since mdcache_lru_unref_chunk()
>> >      > already takes the lock later, we could probably pass a flag to
>> >      > indicate whether the qlane lock is already held or not (similar
>> >      > to content_lock); see the sketch below. Please let me know if
>> >      > there is a better approach to refactor this.
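>> >      >
>> >      > Roughly what I have in mind (sketch only; the flag name and the
>> >      > QLOCK/QUNLOCK/CHUNK_LRU plumbing are my guesses, mirroring how
>> >      > the content_lock flag is passed around today):
>> >      >
>> >      >     /* Hypothetical LRU_UNREF_QLANE_HELD flag: the caller already
>> >      >      * holds the qlane lock, so keep it held across the refcnt
>> >      >      * decrement so the parent cannot be reaped in the window. */
>> >      >     void mdcache_lru_unref_chunk(struct dir_chunk *chunk,
>> >      >                                  uint32_t flags)
>> >      >     {
>> >      >             struct lru_q_lane *qlane =
>> >      >                     &CHUNK_LRU[chunk->chunk_lru.lane];
>> >      >             bool qlocked = flags & LRU_UNREF_QLANE_HELD;
>> >      >
>> >      >             if (!qlocked)
>> >      >                     QLOCK(qlane);
>> >      >             if (atomic_dec_int32_t(&chunk->chunk_lru.refcnt) == 0)
>> >      >                     lru_clean_chunk(chunk);
>> >      >             if (!qlocked)
>> >      >                     QUNLOCK(qlane);
>> >      >     }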
>> >      >
>> >      > Thanks,
>> >      > Pradeep