So, the chunking isn't perfect. There's an interaction between how many
dirents the FSAL returns and how many dirents we can return to the
client. It looks like, if we're at the end of a chunk when the client
response is full, we'll start a new chunk on the next cycle, even if
more dirents would fit in the current chunk. I think in your case, the
only way to get perfect chunking would be for the readahead from the
FSAL to be large enough to always return 500 dirents, so that each
directory would contain exactly 1 chunk. Most FSALs, however, do
readahead based on a buffer size, and fit as many dirents as they can
into the buffer, so the readahead count will vary with the names of the
files in the directory.
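
To make that interaction concrete, here's a tiny standalone toy model of
one directory being listed. It is not Ganesha's mdcache code, and the
readahead/reply size is a made-up value, but it shows how a 500-entry
directory can end up spread over several chunks even with
Dir_Chunk = 1000, because every time the client reply fills right at the
end of the dirents the FSAL handed back, the next cycle opens a new
chunk:

/* toy_chunks.c: illustration only, not Ganesha code */
#include <stdio.h>

#define DIRENTS    500   /* files per directory (from your setup)        */
#define CHUNK_CAP 1000   /* Dir_Chunk                                     */
#define BATCH      120   /* dirents per FSAL readahead == per client      */
                         /* reply; an assumed worst-case value            */

int main(void)
{
    int served = 0;      /* dirents already returned to the client        */
    int in_chunk = 0;    /* dirents stored in the current chunk           */
    int chunks = 0;      /* chunks created for this one directory         */

    while (served < DIRENTS) {
        int batch = BATCH;
        if (batch > DIRENTS - served)
            batch = DIRENTS - served;

        /* one readdir cycle: store the readahead batch into chunks */
        for (int i = 0; i < batch; i++) {
            if (in_chunk == 0)
                chunks++;              /* open a new chunk              */
            if (++in_chunk == CHUNK_CAP)
                in_chunk = 0;          /* chunk is full                 */
        }
        served += batch;

        /* The client reply filled exactly at the end of the buffered
         * dirents, so the next cycle starts a new chunk even though
         * this one still has room. */
        if (batch == BATCH && in_chunk != 0)
            in_chunk = 0;
    }

    printf("%d dirents, Dir_Chunk=%d -> %d chunks (ideal would be 1)\n",
           DIRENTS, CHUNK_CAP, chunks);
    return 0;
}

With those made-up sizes it prints 5 chunks for a directory that would
ideally need 1, which is the same kind of inflation you're seeing in the
chunk count.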
As for the subsequent runs, the issue is that the entire working set
(the total number of files and directories) is roughly 12,500,000, which
is much larger than the Entries_HWMark of 500,000. This means that the
handles for directories early in the listing will be re-used later in
the listing, clearing out the dirent cache for those directories (the
dirent cache is not a separate thing; the dirent cache for each
directory is attached to that directory's handle, and so is freed when
that handle is freed or re-used). If you always list the directories in
the same order, then on the next listing the handles for the later
directories will be re-used while you scan the early entries, so you
will get a new set of dirents and chunks on every listing.
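
Here's another toy sketch of that effect (again not Ganesha code, with
the counts scaled way down): an LRU-capped handle cache, with each
directory's dirent chunks attached to its handle, being walked in the
same order every run. Because the number of directories exceeds the cap,
every pass evicts the handles the previous pass cached, so every pass
rebuilds every directory's chunks:

/* toy_lru.c: illustration only, not Ganesha code */
#include <stdio.h>

#define NDIRS   10           /* directories in the tree (scaled down)     */
#define HWMARK   4           /* handle-cache cap (Entries_HWMark, scaled) */
#define RUNS     3           /* full listings of the tree                 */

int main(void)
{
    int lru[HWMARK];         /* directory ids whose handles are cached    */
    int nres = 0;            /* currently resident handles                */
    long total_chunks = 0;   /* cumulative count of chunk (re)builds      */

    for (int run = 1; run <= RUNS; run++) {
        long rebuilt = 0;

        for (int d = 0; d < NDIRS; d++) {
            int hit = 0;

            /* is this directory's handle (and its chunks) still cached? */
            for (int i = 0; i < nres; i++) {
                if (lru[i] == d) {
                    hit = 1;
                    /* move it to the most-recently-used slot */
                    for (int j = i; j < nres - 1; j++)
                        lru[j] = lru[j + 1];
                    lru[nres - 1] = d;
                    break;
                }
            }

            if (!hit) {
                if (nres == HWMARK) {
                    /* reclaim the LRU handle; its dirent chunks go too */
                    for (int j = 0; j < nres - 1; j++)
                        lru[j] = lru[j + 1];
                    nres--;
                }
                lru[nres++] = d;
                rebuilt++;            /* this directory gets fresh chunks */
                total_chunks++;
            }
        }

        printf("run %d: re-chunked %ld of %d directories, "
               "cumulative chunk builds %ld\n",
               run, rebuilt, NDIRS, total_chunks);
    }
    return 0;
}

Every run re-chunks all 10 directories, because a sequential scan of a
set larger than the cap is the worst case for an LRU cache, and that is
essentially what the 25,000-directory listing is doing to a 500,000-entry
handle cache that is also being churned by the 12.5 million file entries.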
Basically, this workload is a pathological case for the combination of
handle cache and dirent cache as implemented in Ganesha. Those caches
were designed around the assumption that the working set of
files/directories would be a fairly stable subset of the total set, one
that might change slowly over time but would never be the entire set. A
workload that lists the entire tree over and over effectively reloads
the entire cache every time.
That said, it's not impossible that there's a leak somewhere. Could you
run the listing maybe 20 times and see if the number only goes up? If
so, that would be a strong hint that there's a leak. If it goes up and
down, then it's likely just variability in the readahead/client buffer
interaction during the listing.
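
If it helps, here's a trivial hypothetical helper (not part of Ganesha or
its stats tooling) that reads the chunk counts you collect after each
run, one per line on stdin, and reports whether the series only ever
increased:

/* chunk_trend.c: hypothetical helper, not part of Ganesha */
#include <stdio.h>

int main(void)
{
    long prev = 0, cur;
    int have_prev = 0, monotonic = 1;

    /* read one chunk-count sample per line from stdin */
    while (scanf("%ld", &cur) == 1) {
        if (have_prev && cur < prev)
            monotonic = 0;
        prev = cur;
        have_prev = 1;
    }

    puts(monotonic ? "count only went up: possible leak"
                   : "count went up and down: likely just variability");
    return 0;
}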
Daniel
On 8/17/21 3:54 PM, None via Support wrote:
Hi,
I am running ganesha 3.5. I have a directory structure with 25000 directories,
each containing 500 files. Dir_Chunk is set to 1000 and Entries_HWMark is set to
500000 in the ganesha configuration. I run a script to list all 25000 directories.
I ran the list command three times; no other clients were interacting with the
server. Through the stats tool, I retrieved the chunk count value.
1st Run: Chunk Count: 58335
2nd Run: Chunk Count: 69792
3rd Run: Chunk Count: 73270
From my understanding, the chunk cache stores the dirent information for a
directory. So with 25000 directories of 500 files each, the chunk count should
ideally be around 25000, one for each directory, and it should not increase on
subsequent runs. I have not configured the Chunk_HWMark value, so it is set to the
default 100k. I suspect there is some chunk ref leak that is causing the chunk
count to grow.
Setup Summary
Ganesha: 3.5
Cache Configuration:
MDCACHE {
Dir_Chunk = 1000;
Entries_HWMark = 500000;
}
VFS: Custom