We are using V2.8.2, but our last pull was 7 months ago.
Frank, the fix of yours I mentioned above did resolve the Filebench hang in the fileserver workload.
But now we are seeing a new issue in the 'videoserver' workload.
We are seeing a flood of these errors in the Ganesha log:
2020-03-03T21:08:53Z : epoch 5e5bfd3c : fsvm23 : ganesha.nfsd-33[::ffff:172.30.0.121] [svc_3546] 288 :vdfs_filehandle_open :FSAL :vdfs_open failed: could not get attributes: Input/output error (5)
2020-03-03T21:08:53Z : epoch 5e5bfd3c : fsvm23 : ganesha.nfsd-33[::ffff:172.30.0.121] [svc_3546] 84 :posix2fsal_error :FSAL :Mapping 5 to ERR_FSAL_IO, rlim_cur=65536 rlim_max=65536
There are about 200k lines of these errors within a few minutes. I think all operations are
failing with this error.
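To check whether it really is every operation, a quick tally by logging function might help. This is only a minimal sketch, assuming the ":<function> :FSAL :" layout seen in the two entries above holds for the rest of the log; the helper name count_fsal_messages is made up for illustration.

```python
import re
from collections import Counter

# Matches the ":<function> :FSAL :" part of a log entry (layout assumed
# from the two sample entries quoted above).
FSAL_RE = re.compile(r":(\w+) :FSAL :")


def count_fsal_messages(lines):
    """Count FSAL log entries, keyed by the function that logged them."""
    counts = Counter()
    for line in lines:
        match = FSAL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The two entries quoted above, each joined onto one line.
sample = [
    "2020-03-03T21:08:53Z : epoch 5e5bfd3c : fsvm23 : "
    "ganesha.nfsd-33[::ffff:172.30.0.121] [svc_3546] 288 "
    ":vdfs_filehandle_open :FSAL :vdfs_open failed: "
    "could not get attributes: Input/output error (5)",
    "2020-03-03T21:08:53Z : epoch 5e5bfd3c : fsvm23 : "
    "ganesha.nfsd-33[::ffff:172.30.0.121] [svc_3546] 84 "
    ":posix2fsal_error :FSAL :Mapping 5 to ERR_FSAL_IO, "
    "rlim_cur=65536 rlim_max=65536",
]

sample_counts = count_fsal_messages(sample)
for function_name, count in sample_counts.most_common():
    print(function_name, count)
```

Running it over the real log (for example by passing an open file object instead of the sample list) would show whether anything other than vdfs_filehandle_open and posix2fsal_error is reporting failures.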
Does this mean Ganesha's LRU threads could not close the FDs fast enough, so we are
hitting the underlying filesystem's resource limits?
ulimit -n on our server is 65536, and ganesha.conf is configured with:
CACHEINODE {
    LRU_Run_Interval = 60;
    FD_Limit_Percent = 75;
    FD_HWMark_Percent = 50;
    FD_LWMark_Percent = 20;
    Entries_HWMark = 65536;
    Reaper_Work_Per_Lane = 100;
}
So we should have hit the high water mark and closed the FDs before reaching the
underlying filesystem's limit, right?
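For reference, here is a rough sketch of the thresholds I would expect from these settings. It assumes each percentage is applied directly to the 65536 soft limit (rlim_cur) with integer division; that is my reading of the options, not something I have confirmed in the code, and the helper name fd_thresholds is made up for illustration.

```python
def fd_thresholds(fd_limit, limit_pct, hwmark_pct, lwmark_pct):
    """Derive FD thresholds as integer percentages of the process FD limit.

    Assumption: each *_Percent option is a percentage of rlim_cur.
    """
    return {
        "hard_limit": fd_limit * limit_pct // 100,
        "high_water": fd_limit * hwmark_pct // 100,
        "low_water": fd_limit * lwmark_pct // 100,
    }


# Values from our ulimit -n and the CACHEINODE block above.
thresholds = fd_thresholds(65536, limit_pct=75, hwmark_pct=50, lwmark_pct=20)
for name, value in thresholds.items():
    print(name, value)
# hard_limit 49152
# high_water 32768
# low_water 13107
```

If that reading is right, the LRU should start closing FDs aggressively at about 32768 open files and refuse to cache more at 49152, both well below the 65536 limit.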