---------- Forwarded message ---------
From: Ivan Rossi <rouge2507@gmail.com>
Date: Thu, Aug 26, 2021 at 1:01 PM
Subject: [Gluster-users] Ganesha+Gluster strange issue: incomplete directory reads
To: <gluster-users@gluster.org>


Hello list,

Ganesha (but not Gluster) newbie here.
This is the first time I have had to set up Ganesha to serve a Gluster volume,
and it seems I have stumbled on a weird issue. I hope it is just due to my
inexperience with Ganesha.

I need to serve a Gluster volume to some old production VMs that cannot install
a recent Gluster client, so they need to access the Gluster data using the
standard Linux NFS client. Since the native Gluster NFS server is gone, I had to
go the Ganesha route.

The volume served is used for bioinformatics analysis, each subdirectory
containing on the order of a thousand files, a few of them large (think 50 GB
each).

Now the issue:

When the volume is mounted on the client using NFSv3, directory reads
SOMETIMES return an INCOMPLETE list of files. The problem goes away if you redo
the read in a different way, as if the first directory metadata read did not
complete successfully but its result was cached anyway.

The problem does not manifest if there are few files in the directory or if
they are all small (think < 1 GB).

Direct access to the files is OK even if they did not show up in the ls output. E.g.:

mount -t nfs ganesha:/srv/glfs/work /mnt/
ls /mnt/47194616IMS187mm10 | wc -l
# wrong result
ls: reading directory /mnt/47194616IMS187mm10: Input/output error
304

# right (NB: ls -l returns one line more than plain ls)
ls -l /mnt/47194616IMS187mm10 | wc -l
668

# after 'ls -l' now even plain ls returns the expected number of files

ls /mnt/47194616IMS187mm10 | wc -l
667

Furthermore, I see the Input/output error message only because of the pipe to
wc. If I just run plain ls in a terminal, it fails silently and returns a
partial list.
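
A minimal sketch of a check that does not fail silently, assuming that ls exits
with a non-zero status when it hits a read error such as the one above. The
function name count_entries is hypothetical and not part of any Gluster or
Ganesha tooling. It captures the listing before counting, so the exit status of
ls is not hidden behind the pipe to wc.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print the number of
# entries in a directory, and fail loudly if ls reports an error.
count_entries() {
    dir=$1
    # Capture the listing first so the exit status of ls is checked directly
    # instead of being masked by a pipeline.
    listing=$(ls -- "$dir") || {
        echo "ls failed while reading $dir" >&2
        return 1
    }
    if [ -z "$listing" ]; then
        # An empty listing means an empty directory.
        echo 0
    else
        # One line per entry; assumes file names contain no newlines.
        printf '%s\n' "$listing" | wc -l | tr -d ' '
    fi
}
```

It could be called as `count_entries /mnt/47194616IMS187mm10`. A non-zero
return then marks the count as untrustworthy, even when no error message is
visible on the terminal.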

If the client mounts the volume using NFSv4 everything looks as expected.

mount -t nfs -o vers=4.0 ganesha:/work /mnt/
ls /mnt/47194616IMS187mm10 | wc -l
667

As you can guess, though, my confidence in using Ganesha in production is
somewhat shaken ATM.
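
To see exactly which names a partial NFSv3 listing drops, the two listings can
be compared directly. The following is only a sketch: the function name
missing_entries is hypothetical, and it assumes the same export is mounted
twice, once per protocol version, on two separate mount points (for example
/mnt/v3 and /mnt/v4, which are not the paths used above).

```shell
#!/bin/sh
# Hypothetical helper: print the names present in the reference directory
# listing but absent from the suspect one.
missing_entries() {
    suspect=$1
    reference=$2
    tmp=$(mktemp -d) || return 1
    # Sort both listings in the C locale so comm sees consistently ordered
    # input. Errors from ls are deliberately discarded here, because a
    # partial listing is exactly the case of interest.
    LC_ALL=C ls -- "$suspect" 2>/dev/null | LC_ALL=C sort > "$tmp/suspect"
    LC_ALL=C ls -- "$reference" 2>/dev/null | LC_ALL=C sort > "$tmp/reference"
    # comm -13 drops lines unique to the first file and lines common to both,
    # leaving only the names that appear in the reference listing alone.
    LC_ALL=C comm -13 "$tmp/suspect" "$tmp/reference"
    rm -rf "$tmp"
}
```

With the assumed mount points it could be run as
`missing_entries /mnt/v3/47194616IMS187mm10 /mnt/v4/47194616IMS187mm10`.
Empty output means the NFSv3 listing contains every name the NFSv4 one does.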

My feeling is that it is a Ganesha problem or something lacking in the Ganesha
configuration for Gluster. My Ganesha configuration is basically just defaults.  
No failover conf either.

My Gluster setup has nothing strange, I am just serving a R3 volume and
defaults are just fine to get a fast volume given the hardware. Furthermore the
volume looks fine from the Gluster clients.

I am using Gluster 8.4 and Ganesha 3.4 on Debian 10 (buster). The packages come
from the Gluster and Ganesha repos, not the Debian ones.

Has anyone seen anything similar before?
Did I stumble on a bug?
Any advice or common wisdom to share?

Ivan Rossi





--

Kaleb