So, it's not impossible, based on the workload, but it may also be a bug.
For global FDs (All NFSv3 and stateless NFSv4), we obviously cannot know
when the client closes the FD, and opening/closing all the time causes a
large performance hit. So, we cache open FDs.
All handles in MDCACHE live on the LRU. This LRU is divided into 2
levels. Level 1 is more active handles, and they can have open FDs.
Various operations can demote a handle to level 2 of the LRU. As part of
this transition, the global FD on that handle is closed. Handles that
are actively in use (have a refcount taken on them) are not eligible for
this transition, as the FD may be being used.
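The demotion rule above can be sketched roughly like this (a minimal illustration in Python; the names `Handle`, `demote`, `refcount`, and `fd_open` are invented for this sketch, and the real MDCACHE LRU in Ganesha is C and considerably more involved):

```python
# Hypothetical sketch of the two-level LRU demotion described above.
class Handle:
    def __init__(self, name):
        self.name = name
        self.refcount = 0     # active users hold a refcount on the handle
        self.fd_open = True   # cached global FD

def demote(l1, l2):
    """Move idle handles from L1 to L2, closing their cached global FD."""
    still_l1 = []
    for h in l1:
        if h.refcount > 0:
            still_l1.append(h)   # actively in use: FD may be needed, skip
        else:
            h.fd_open = False    # close the global FD as part of demotion
            l2.append(h)
    return still_l1, l2

l1 = [Handle("a"), Handle("b")]
l1[0].refcount = 1               # "a" is actively in use
l1, l2 = demote(l1, [])
# "a" stays in L1 with its FD open; "b" lands in L2 with its FD closed
```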
We have a background thread that runs, and periodically does this
demotion, closing the FDs. This thread runs more often when the number
of open FDs is above FD_HwMark_Percent of the available number of FDs,
and runs constantly when the open FD count is above FD_Limit_Percent of
the available number of FDs.
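The reaper scheduling policy amounts to something like the following sketch (FD_HwMark_Percent and FD_Limit_Percent mirror the Ganesha config parameters named above, but the numeric values and sleep intervals here are invented for illustration):

```python
# Hypothetical sketch of the background FD-reaper thread's scheduling.
FD_HWMARK_PERCENT = 90   # above this, the reaper runs more often
FD_LIMIT_PERCENT = 99    # above this, the reaper runs constantly

def reaper_interval(open_fds, max_fds, normal_secs=60, fast_secs=10):
    """Return how long the reaper thread sleeps between demotion passes."""
    used_pct = 100.0 * open_fds / max_fds
    if used_pct >= FD_LIMIT_PERCENT:
        return 0             # run constantly
    if used_pct >= FD_HWMARK_PERCENT:
        return fast_secs     # run more often
    return normal_secs

# e.g. with a limit of 1,000,000 FDs:
# reaper_interval(500_000, 1_000_000)  -> 60 (normal cadence)
# reaper_interval(950_000, 1_000_000)  -> 10 (above the high-water mark)
# reaper_interval(995_000, 1_000_000)  -> 0  (above the limit percentage)
```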
So, a heavily used server could definitely have large numbers of FDs
open. However, there have also, in the past, been bugs that would
either keep the FDs from being closed, or would break the accounting (so
they were closed, but Ganesha still thought they were open). You didn't
say what version of Ganesha you're using, so I can't tell if one of
those bugs applies.
Daniel
On 9/19/19 10:19 AM, Billich Heinrich Rainer (ID SD) wrote:
Hello,
Is it usual to see 200’000-400’000 open files for a single ganesha
process? Or does this indicate that something is wrong?
We have some issues with ganesha (on spectrum scale protocol nodes)
reporting NFS3ERR_IO in the log. I noticed that the affected nodes
have a large number of open files, 200’000-400’000 open files per daemon
(and 500 threads and about 250 client connections). Other nodes have
1’000 – 10’000 open files by ganesha only and don’t show the issue.
If someone could explain how ganesha decides which files to keep open
and which to close that would help, too. As NFSv3 is stateless the
client doesn’t open/close a file, it’s the server to decide when to
close it? We do have a few NFSv4 clients, too.
Are there certain access patterns that can trigger such a large number
of open files? Maybe traversing and reading a large number of small files?
Thank you,
Heiner
I counted the open files by counting the entries in /proc/<pid of
ganesha>/fd/ . With several 100k entries, ‘ls -ls’ failed to list
all the symbolic links, hence I can’t easily relate the open files to
different exports.
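One way around the ‘ls’ problem is to readlink each /proc entry directly instead of stat-ing them all; a small Python sketch (using our own PID as a stand-in for the ganesha PID, and assuming a Linux /proc layout):

```python
# Sketch: count and group a process's open FDs via /proc without 'ls -l'.
import os
from collections import Counter

def open_fd_targets(pid):
    """Return the readlink targets of every entry in /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for fd in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            pass  # the fd may close between listdir() and readlink()
    return targets

targets = open_fd_targets(os.getpid())
print(len(targets), "open fds")
# Group by leading path component to get a rough per-export breakdown:
top = Counter(t.split("/")[1] if t.startswith("/") else t for t in targets)
print(top.most_common(5))
```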
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich(a)id.ethz.ch
========================
_______________________________________________
Support mailing list -- support(a)lists.nfs-ganesha.org
To unsubscribe send an email to support-leave(a)lists.nfs-ganesha.org