We do that. If open_fd_count > fds_hard_limit, we return EDELAY in
mdcache_open2() and fsal_reopen_obj().
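That check can be sketched as follows (illustrative names only; the real logic lives in mdcache_open2() and fsal_reopen_obj(), and this is a simplified stand-in rather than the actual Ganesha source):

```c
#include <stdbool.h>

/* Sketch of the guard: refuse a new open with ERR_FSAL_DELAY once the
 * accounted fd count exceeds the hard limit. Both parameters stand in
 * for the global counters kept by the FSAL/LRU code. */
static bool open_would_delay(long open_fd_count, long fds_hard_limit)
{
	return open_fd_count > fds_hard_limit;
}
```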
Daniel
On 06/18/2018 11:20 AM, Malahal Naineni wrote:
> The actual number of fds open is 554, at least that is what the kernel
> thinks. If you have open_fd_count as 4055, something is wrong in the
> accounting of open files. What is the max files your Ganesha daemon can
> open (cat /proc/<PID>/limits should tell you). As far as I remember,
> the accounting value "open_fd_count" is only used to close files
> aggressively. Can you track the code path where Ganesha is sending the
> DELAY error?
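For illustration, the kind of drift described above (the kernel reporting 554 open fds while open_fd_count says 4055) usually means some path incremented the counter without a matching decrement on close or error. A minimal sketch, with hypothetical names rather than the actual Ganesha counters:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the per-FSAL open fd counter. */
static atomic_long open_fd_count;

static void track_open(void)  { atomic_fetch_add(&open_fd_count, 1); }
static void track_close(void) { atomic_fetch_sub(&open_fd_count, 1); }

/* If an error path really closes the fd but forgets track_close(),
 * open_fd_count drifts upward even though the kernel's count drops,
 * and the LRU thread can never bring it back down. */
```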
>
> On Mon, Jun 18, 2018 at 8:03 PM bharat singh <bharat064015@gmail.com> wrote:
>
> This is a V3 mount only.
> There are a bunch of socket and anonymous fds opened, but that's only
> 554 in total. In its current state my setup has 4055 fds accounted as
> open, and it won't make any progress for days, even without any new I/O
> coming in. I have a coredump; please let me know what info you need
> from it to debug this.
>
> # ls -l /proc/2576/fd | wc -l
> 554
>
> On Mon, Jun 18, 2018 at 7:12 AM Malahal Naineni <malahal@gmail.com> wrote:
>
> Try to find the open files by doing "ls -l /proc/<PID>/fd".
> Are you using NFSv4 or V3? If this is all V3, then it's clearly a
> bug. NFSv4 may imply that some clients opened files but never
> closed them for some reason, or that we ignored a client's CLOSE
> request.
>
> On Mon, Jun 18, 2018 at 7:30 PM bharat singh
> <bharat064015@gmail.com <mailto:bharat064015@gmail.com>> wrote:
>
I already have this patch:
c2b448b1a079ed66446060a695e4dd06d1c3d1c2 Fix closing global file descriptors
>
>
>
On Mon, Jun 18, 2018 at 5:41 AM Daniel Gryniewicz <dang@redhat.com> wrote:
>
> Try this one:
>
5c2efa8f077fafa82023f5aec5e2c474c5ed2fdf Fix closing global file descriptors
>
> Daniel
>
>
> On 06/15/2018 03:08 PM, bharat singh wrote:
> > I have been testing Ganesha 2.5.4 code with default mdcache settings.
> > It starts showing issues after prolonged I/O runs.
> > Once it exhausts all the allowed fds, it kind of gets stuck
> > returning ERR_FSAL_DELAY for every client op.
> >
> > A snapshot of the mdcache
> >
> > open_fd_count = 4055
> > lru_state = {
> > entries_hiwat = 100000,
> > entries_used = 323,
> > chunks_hiwat = 100000,
> > chunks_used = 9,
> > fds_system_imposed = 4096,
> > fds_hard_limit = 4055,
> > fds_hiwat = 3686,
> > fds_lowat = 2048,
> > futility = 109,
> > per_lane_work = 50,
> > biggest_window = 1638,
> > prev_fd_count = 4055,
> > prev_time = 1529013538,
> > fd_state = 3
> > }
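As a side note on the snapshot above: the watermarks are consistent with being derived from fds_system_imposed by fixed percentages (what I believe are the stock defaults: 99% hard limit, 90% high water, 50% low water, 40% biggest window). The percentages are my assumption, not stated in this thread:

```c
/* Sketch: derive the snapshot's fd watermarks from the system-imposed
 * limit. The percentages are assumed defaults (hypothetical here),
 * shown only to explain 4055/3686/2048/1638 given a 4096 fd limit. */
static long fd_watermark(long fds_system_imposed, int percent)
{
	return fds_system_imposed * percent / 100;
}
```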
> >
> > [cache_lru] lru_run :INODE LRU :INFO :After work, open_fd_count:4055
> > entries used count:327 fdrate:0 threadwait=9
> > [cache_lru] lru_run :INODE LRU :INFO :lru entries: 327 open_fd_count:4055
> > [cache_lru] lru_run :INODE LRU :INFO :lru entries: 327 open_fd_count:4055
> > [cache_lru] lru_run :INODE LRU :INFO :After work, open_fd_count:4055
> > entries used count:327 fdrate:0 threadwait=90
> >
> > I have killed the NFS clients, so no new I/O is being received. But
> > even after a couple of hours I don't see lru_run making any progress,
> > so open_fd_count remains at 4055 and even a single file open won't be
> > served. So basically the server is stuck.
> >
> > I have these changes patched over 2.5.4 code
> > e2156ad3feac841487ba89969769bf765457ea6e Replace
> cache_fds parameter and
> > handling with better logic
> > 667083fe395ddbb4aa14b7bbe7e15ffca87e3b0b MDCACHE -
> Change and lower
> > futility message
> > 37732e61985d919e6ca84dfa7b4a84163080abae Move
> open_fd_count from MDCACHE
> > to FSALs (https://review.gerrithub.io/#/c/391267/)
> >
> > Any suggestions on how to resolve this?
> >
> >
> >
> >
> > _______________________________________________
> > Devel mailing list -- devel@lists.nfs-ganesha.org
> > To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org
> >
>
>
>
> --
> -Bharat
>
>
>
>
>
> --
> -Bharat
>
>
>
>
>
_______________________________________________
Devel mailing list -- devel@lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org