I am running nfs-ganesha with my custom FSAL on macOS. My FSAL implements a custom VFS layer, with its own caching.

We recently upgraded from version 4.0.6 to 5.5 and are now seeing a lot of instability, which we think is caused by regressions in nfs-ganesha itself. Our FSAL did not change much during the upgrade.

For example, with a debug build I see:
Assertion failed: (!entry->fh_hk.inavl), function _mdcache_lru_unref, file mdcache_lru.c, line 1971.

One thing worth noting: we really do not want MDCACHE to do any caching at all. We have this in our config:

MDCACHE {
    # Disable readdir caching (does not affect client, only server)
    Dir_Chunk = 0;
}
EXPORT {
    ...
    # Disable attribute-caching (does not affect client, only server)
    Attr_Expiration_Time = 0;
    ...
}

With TSAN:
ThreadSanitizer reports many data races. I have seen https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/tools/helgrind-suppressions ... There is no such thing as a benign data race. Some of these reports could be bugs in TSAN itself, but they could also point to real bugs in nfs-ganesha.
==================
WARNING: ThreadSanitizer: data race (pid=82454)
  Read of size 4 at 0x00010d701f60 by thread T52:
    #0 _mdcache_lru_unref <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009ec81c) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #1 mdcache_put_ref <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009d904c) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #2 compound_data_Free <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008f60d4) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #3 nfs4_Compound <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008f5d60) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #4 nfs_rpc_process_request <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008de0c0) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #5 nfs_rpc_valid_NFS <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008deb60) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #6 svc_vc_decode <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a2d010) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #7 svc_request <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a293e4) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #8 svc_vc_recv <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a2c758) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #9 svc_rqst_xprt_task_recv <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a29288) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #10 svc_rqst_epoll_loop <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a26f40) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #11 work_pool_thread <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a306e0) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Previous write of size 4 at 0x00010d701f60 by thread T56 (mutexes: read M0, write M1):
    #0 _mdcache_lru_ref <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009efcc8) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #1 mdcache_find_keyed_reason <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009e1dd8) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #2 mdcache_locate_host <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009e3040) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #3 mdcache_create_handle <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009dd818) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #4 nfs4_op_putfh <null>:164731984 (srcfsd_darwin_dev:arm64+0x10091d450) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #5 process_one_op <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008f4908) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #6 nfs4_Compound <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008f5c24) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #7 nfs_rpc_process_request <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008de0c0) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #8 nfs_rpc_valid_NFS <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008deb60) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #9 svc_vc_decode <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a2d010) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #10 svc_request <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a293e4) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #11 svc_vc_recv <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a2c758) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #12 svc_rqst_xprt_task_recv <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a29288) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #13 svc_rqst_epoll_loop <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a26f40) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #14 work_pool_thread <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a306e0) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Location is heap block of size 1736 at 0x00010d701c00 allocated by main thread:
    #0 calloc <null>:174073376 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x6094c) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 mdcache_lru_get <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009eede4) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #2 mdcache_new_entry <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009e0700) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #3 mdcache_lookup_path <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009dd5cc) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #4 init_export_root <null>:164725696 (srcfsd_darwin_dev:arm64+0x100999afc) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #5 init_export_cb <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009995a4) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #6 foreach_gsh_export <null>:164725696 (srcfsd_darwin_dev:arm64+0x100994544) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #7 exports_pkginit <null>:164725696 (srcfsd_darwin_dev:arm64+0x100999534) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #8 nfs_start <null>:164725696 (srcfsd_darwin_dev:arm64+0x1008cd00c) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #9 nfs_libmain2 <null>:164725696 (srcfsd_darwin_dev:arm64+0x1008cf168) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #10 devtools_vfs::NfsMain(devtools_vfs::NfsMainOptions&) <null>:164725696 (srcfsd_darwin_dev:arm64+0x100809370) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #11 main <null>:164725696 (srcfsd_darwin_dev:arm64+0x10015c9f8) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Mutex M0 (0x00010d827db8) created at:
    #0 pthread_rwlock_init <null>:174073376 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x31798) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 cih_pkginit <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009de328) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #2 mdcache_pkginit <null>:164725696 (srcfsd_darwin_dev:arm64+0x1009f2fc0) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #3 init_server_pkgs <null>:164725696 (srcfsd_darwin_dev:arm64+0x1008cc984) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #4 nfs_libmain2 <null>:164725696 (srcfsd_darwin_dev:arm64+0x1008cef00) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #5 devtools_vfs::NfsMain(devtools_vfs::NfsMainOptions&) <null>:164725696 (srcfsd_darwin_dev:arm64+0x100809370) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #6 main <null>:164725696 (srcfsd_darwin_dev:arm64+0x10015c9f8) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Mutex M1 (0x000104bab008) created at:
    #0 pthread_mutex_init <null>:174073376 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x31354) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 mdcache_lru_pkginit <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009ed250) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #2 mdcache_pkginit <null>:164731984 (srcfsd_darwin_dev:arm64+0x1009f2f78) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #3 init_server_pkgs <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008cc984) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #4 nfs_libmain2 <null>:164731984 (srcfsd_darwin_dev:arm64+0x1008cef00) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #5 devtools_vfs::NfsMain(devtools_vfs::NfsMainOptions&) <null>:164731984 (srcfsd_darwin_dev:arm64+0x100809370) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)
    #6 main <null>:164731984 (srcfsd_darwin_dev:arm64+0x10015c9f8) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Thread T52 (tid=1939586, running) created by thread T41 at:
    #0 pthread_create <null>:174073376 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2fd88) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 work_pool_thread <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a30630) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

  Thread T56 (tid=1939590, running) created by thread T53 at:
    #0 pthread_create <null>:174073376 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2fd88) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 work_pool_thread <null>:164731984 (srcfsd_darwin_dev:arm64+0x100a30630) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00)

SUMMARY: ThreadSanitizer: data race (srcfsd_darwin_dev:arm64+0x1009ec81c) (BuildId: ad5f84c5659332ad9098efdf1add11ec32000000200000000100000000000b00) in _mdcache_lru_unref+0x60
==================
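
For illustration only, here is a minimal sketch of the class of problem the report above describes: a 4-byte field inside a heap-allocated entry that one thread reads in an "unref" path while another thread writes it in a "ref" path, with no common lock between them. It assumes the field behaves like a reference counter, which the report does not confirm. All names (demo_entry, demo_ref, demo_unref, demo_run) are hypothetical and this is not the nfs-ganesha code. The sketch shows one way such a counter can be kept race-free: every access goes through C11 atomics, and the unref path acts on the value returned by the atomic operation instead of reading the field again.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cache entry; NOT the real mdcache entry. */
struct demo_entry {
    _Atomic int32_t refcnt; /* 4-byte counter; every access is atomic */
};

/* Take a reference. */
static void demo_ref(struct demo_entry *e)
{
    atomic_fetch_add_explicit(&e->refcnt, 1, memory_order_relaxed);
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool demo_unref(struct demo_entry *e)
{
    /* Decide based on the value returned by the atomic operation itself.
     * Re-reading e->refcnt with a plain load here would be the kind of
     * unsynchronized read that TSAN reports. */
    int32_t prev = atomic_fetch_sub_explicit(&e->refcnt, 1,
                                             memory_order_acq_rel);
    assert(prev > 0);
    return prev == 1;
}

#define DEMO_THREADS 4
#define DEMO_ITERS 100000

static void *demo_worker(void *arg)
{
    struct demo_entry *e = arg;

    for (int i = 0; i < DEMO_ITERS; i++) {
        demo_ref(e);
        (void)demo_unref(e); /* never the last ref: the sentinel is held */
    }
    return NULL;
}

/* Hammer one entry from several threads; returns the final count. */
static int32_t demo_run(void)
{
    struct demo_entry e;
    pthread_t tids[DEMO_THREADS];
    int started = 0;

    atomic_init(&e.refcnt, 1); /* sentinel reference held by the "cache" */
    for (; started < DEMO_THREADS; started++)
        if (pthread_create(&tids[started], NULL, demo_worker, &e) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&e.refcnt);
}
```

Calling demo_run() from a small main() and building with clang -fsanitize=thread is expected to produce no race report under these assumptions. Changing refcnt to a plain int32_t that is written under a lock in demo_ref but read without that lock in demo_unref would reproduce the shape of the report above.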

With ASAN:
SUMMARY: AddressSanitizer: unknown-crash (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x41204) (BuildId: f0a7ac5c49bc3abc851181b6f92b308a32000000200000000100000000000b00) in __asan_memset+0x104
Shadow bytes around the buggy address:
  0x00702233af00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x00702233af10: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa
  0x00702233af20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x00702233af30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x00702233af40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x00702233af50: 00 00 00 00 00 00 00 00 00 00[04]00 00 00 00 00
  0x00702233af60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702233af70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702233af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702233af90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702233afa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==78997==ABORTING