A patch that worked for us is attached. If this approach is acceptable, I will submit a
merge request.
Regards.
Krishna Harathi
On 10/15/18, 11:39 AM, "Krishna Harathi" <krishna.harathi(a)storagecraft.com> wrote:
Thanks for all the input. BTW we do use (and configure) our own FS-derived FSID
major/minor that does not depend on device major and minor numbers.
Since we would like to use SIGHUP so as not to disrupt clients after configuration
changes, we will attempt to fix this along the lines Frank suggested and submit a
patch.
Regards.
Krishna Harathi
On 10/11/18, 8:00 AM, "Frank Filz" <ffilzlnx(a)mindspring.com> wrote:
From: Jeff Layton [mailto:jlayton@redhat.com]
I'd suggest that the fundamental problem here is the use of device major/minor
in filehandles. Device major+minor (at least on Linux and probably BSD too) is
often not stable across reboots, when devices are added and removed. Getting
ESTALEs because your server rebooted is pretty sucky.
It is for good reason that we moved away from doing that in Linux' knfsd years
ago. It does still use that in legacy cases, when there is no filesystem UUID
available. Most installations these days use filesystem UUID-based filehandles,
which are stable no matter what device they show up on.
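To make the stability point concrete, here is a minimal Python sketch. It is not Ganesha or knfsd code: the helper names, the UUID, and the device numbers are all made up for illustration. It reads the device major/minor of a path and then compares a device-based key with a UUID-based key for the same filesystem seen on two different device numbers.

```python
import os

def device_id(path):
    """Return (major, minor) of the device holding `path`."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)

def handle_key_from_device(major, minor):
    # Device-based key: changes whenever the kernel hands out a different number.
    return ("dev", major, minor)

def handle_key_from_uuid(fs_uuid):
    # UUID-based key: travels with the filesystem, not with the device slot.
    return ("uuid", fs_uuid.lower())

# Hypothetical data: one filesystem observed before and after a reboot.
FS_UUID = "0b5e2c1a-7d44-4c3e-9a10-5f2e8d6b9c01"
boot1 = {"uuid": FS_UUID, "dev": (8, 17)}
boot2 = {"uuid": FS_UUID, "dev": (8, 33)}

dev_keys_match = (handle_key_from_device(*boot1["dev"])
                  == handle_key_from_device(*boot2["dev"]))
uuid_keys_match = (handle_key_from_uuid(boot1["uuid"])
                   == handle_key_from_uuid(boot2["uuid"]))

print("device numbers of cwd:", device_id("."))
print("device keys match:", dev_keys_match)
print("uuid keys match:", uuid_keys_match)
```

A handle built from the device key would stop matching after the device number moved, which is the ESTALE case described above; the UUID key is unaffected.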
Ganesha does default to using UUID if available. The problem is that the way
getmntent works on Linux, sometimes a filesystem may show up several times (I don't
seem to currently have any of those, but I have seen it in the past). We use the device ID
to distinguish duplicate entries for the same filesystem. The problem is that if a
filesystem is unmounted and a different filesystem is mounted, we have no way of knowing
that. We do rescan filesystems when adding exports (so you can dynamically mount a new
filesystem and then export it).
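The duplicate handling described above can be sketched roughly as follows. This is a Python illustration only, not Ganesha's scanning code: the mount table text, the paths, and the device numbers are invented sample data, and `scan_filesystems` is a hypothetical helper. It walks a mount table and keeps one entry per device ID, so repeated entries for the same filesystem collapse into one.

```python
import os

# Invented sample in /proc/mounts layout: source, mount point, type, options, dump, pass.
SAMPLE_MOUNTS = """\
/dev/sdb1 /exports/nfs1 ext4 rw,relatime 0 0
/dev/sdb1 /srv/alias-of-nfs1 ext4 rw,relatime 0 0
/dev/sdc1 /exports/nfs2 xfs rw,relatime 0 0
"""

# Invented device numbers for the sample mount points.
SAMPLE_DEVS = {
    "/exports/nfs1": (8, 17),
    "/srv/alias-of-nfs1": (8, 17),
    "/exports/nfs2": (8, 33),
}

def stat_dev(mount_point):
    """Look up (major, minor) for a real mount point via stat()."""
    st = os.stat(mount_point)
    return os.major(st.st_dev), os.minor(st.st_dev)

def scan_filesystems(mount_table, dev_lookup=stat_dev):
    """Return {(major, minor): info}, keeping the first entry seen per device."""
    filesystems = {}
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mount_point, fstype = fields[:3]
        dev = dev_lookup(mount_point)
        # setdefault keeps the first entry, so later duplicates are ignored.
        filesystems.setdefault(
            dev, {"path": mount_point, "type": fstype, "source": source})
    return filesystems

found = scan_filesystems(SAMPLE_MOUNTS, SAMPLE_DEVS.__getitem__)
for dev, fs in sorted(found.items()):
    print(dev, fs["path"])
```

The lookup function is passed in so the sketch runs on the sample data; with the default `stat_dev` it would query real mount points instead.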
My suggestion is to also check, somewhere, for filesystems that have gone away...
I'd counsel against trying to track mounts and unmounts automagically too. I think
you'll find it very difficult to do that in a way that isn't subject to races.
-- Jeff
On Thu, 2018-10-11 at 04:28 +0000, Sriram Patil wrote:
> I have also encountered the same issue, where the underlying file system is
> unmounted but ganesha still holds its details, and the later reuse of the device
> major and minor causes problems.
>
> I can try to look into this and figure out a way to refresh the file systems list
> for ganesha. But another problem here is: if the file system was exported by
> ganesha, should the export also be removed, or should we wait for an explicit
> RemoveExport? Anyway, since the file system is unmounted, the clients will start
> getting ESTALEs and maybe even worse errors.
For FSALs that use actual filesystems (FSAL_VFS, FSAL_XFS, FSAL_GPFS), we have a
handle for the root open of the export that will generally prevent the filesystem from
being unmounted before being unexported (a force unmount may work). We don't want to
automatically unexport because if the sysadmin really has taken a filesystem offline
without unexporting, they will probably be mounting it back again. Of course in the
meantime, we throw ESTALE and clients fall apart...
Frank
>
> Thanks,
> Sriram
> On Oct 10, 2018, 11:04 PM +0530, Frank Filz <ffilzlnx(a)mindspring.com>, wrote:
> > > - On the issue itself, I forgot to mention that we used SIGHUP to
> > > reload. As you suggested, I will work on a patch to clean up device
> > > major/minor for reuse if possible, although I understand delete is not
> > > completely supported via SIGHUP.
> > > But I do see some action to remove even with SIGHUP.
> >
> > You have to use DBUS to unexport. I couldn't come up with a way to do
> > unexport with SIGHUP. Actually, I do have an idea: add an Unexport = true;
> > option. If that's set and the export is present, unexport; otherwise ignore the
> > export. Then you can leave exports in your config but have them not be
> > present....
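The Unexport idea above is a proposal in this thread, not an existing Ganesha option. A rough Python sketch of the reload decision it describes might look like this; the `unexport` key, the `apply_reload` helper, and the action names are all hypothetical.

```python
def apply_reload(config_exports, active_exports):
    """Apply one config reload pass.

    config_exports: list of dicts with 'export_id' and an optional
                    hypothetical 'unexport' flag.
    active_exports: iterable of export ids currently exported.
    Returns (new_active_set, actions), with actions as (verb, export_id).
    """
    active = set(active_exports)
    actions = []
    for block in config_exports:
        export_id = block["export_id"]
        if block.get("unexport", False):
            if export_id in active:
                # Flag set and export present: remove it.
                active.discard(export_id)
                actions.append(("unexport", export_id))
            else:
                # Flag set and export absent: leave it out of service.
                actions.append(("ignore", export_id))
        elif export_id not in active:
            active.add(export_id)
            actions.append(("add", export_id))
        else:
            actions.append(("keep", export_id))
    return active, actions

# Export 1 is live and flagged, export 2 is new, export 3 is flagged but absent.
config = [
    {"export_id": 1, "unexport": True},
    {"export_id": 2},
    {"export_id": 3, "unexport": True},
]
new_active, actions = apply_reload(config, {1})
print(sorted(new_active))
print(actions)
```

The point of the flag is that the export block can stay in the config file while the export itself is out of service, which fits a SIGHUP-style reload that only re-reads the config.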
> >
> > If an export is removed with unexport, that should release its filesystems,
> > and then a filesystem re-scan has the potential of removing filesystems no
> > longer present (but we have to be careful there, GPFS has some issues with
> > filesystems going offline and then messing up exports).
> >
> > Frank
> >
> > > - On this list subscription - I did subscribe to the list and also registered
> > > myself.
> > > But I had trouble posting to the list; I had to log in directly and
> > > post. I have also not received my own post yet, although I do and did get
> > > other posts.
> > >
> > > Regards.
> > > Krishna Harathi
> > >
> > >
> > > On 10/9/18, 4:19 PM, "Frank Filz" <ffilzlnx(a)mindspring.com> wrote:
> > >
> > > Ganesha's management of filesystems is probably not ideal. We can
> > > add new ones, but I don't think we implemented removing unused ones.
> > >
> > > I would suggest looking at the filesystem management code to see
> > > if there's a good way to remove them.
> > >
> > > You would have to unmount the filesystem, trigger Ganesha to clean
> > > up and remove the filesystem, and then mount the new filesystem.
> > >
> > > The filesystem enumeration showed multiple entries using the same
> > > device major and minor (all really related to the same filesystem,
> > > but something about the way Linux handles volumes and such) which
> > > means Ganesha must pick a filesystem to use for a given device
> > > major and minor. If the device major and minor is reused and
> > > Ganesha hasn't cleaned out the old filesystem, it will detect the
> > > duplicate and just re-use the original filesystem (which of course is no
> > > longer mounted...).
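The reuse problem described above can be reproduced with a small model. This Python sketch is a simplification, not Ganesha's filesystem management code: `FilesystemRegistry`, its `prune` flag, and the device number (0, 45) are invented for illustration. The registry is keyed by device major/minor and never replaces an existing entry, so a reused device number resolves to the old filesystem unless stale entries are pruned during the rescan.

```python
class FilesystemRegistry:
    """Toy registry of known filesystems keyed by (major, minor)."""

    def __init__(self):
        self.by_dev = {}

    def rescan(self, mounted, prune=False):
        """mounted: {(major, minor): mount_path} for what is mounted now."""
        if prune:
            # Drop entries whose device is gone or now backs a different path.
            for dev in list(self.by_dev):
                if mounted.get(dev) != self.by_dev[dev]:
                    del self.by_dev[dev]
        for dev, path in mounted.items():
            # An existing entry counts as a duplicate and is kept as-is.
            self.by_dev.setdefault(dev, path)

    def root_fs_for(self, dev):
        return self.by_dev.get(dev)

DEV = (0, 45)  # hypothetical device number that gets reused

# Without pruning: nfs1 is unmounted, nfs2 is mounted on the same device number.
stale = FilesystemRegistry()
stale.rescan({DEV: "/exports/nfs1"})
stale.rescan({DEV: "/exports/nfs2"})
stale_result = stale.root_fs_for(DEV)

# With pruning: the vanished filesystem is removed before new ones are added.
fixed = FilesystemRegistry()
fixed.rescan({DEV: "/exports/nfs1"})
fixed.rescan({DEV: "/exports/nfs2"}, prune=True)
fixed_result = fixed.root_fs_for(DEV)

print("no prune:", stale_result)
print("prune:   ", fixed_result)
```

This model ignores whether a filesystem is still claimed by an export. A real fix would presumably have to skip filesystems that exports still hold, and the thread also notes that GPFS filesystems going offline need extra care.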
> > >
> > > Frank
> > >
> > > > -----Original Message-----
> > > > From: krishna.harathi(a)storagecraft.com
> > > > [mailto:krishna.harathi@storagecraft.com]
> > > > Sent: Tuesday, October 9, 2018 3:30 PM
> > > > To: devel(a)lists.nfs-ganesha.org
> > > > Subject: [NFS-Ganesha-Devel] Ganesha 2.5.4 - usage of device
> > > > major, minor
> > > >
> > > > We are using Ganesha 2.5.4 VFS FSAL with FUSE based filesystem.
> > > >
> > > > During our testing of deleting existing exports and creating new
> > > > ones, we found that if a device major and minor is reused, clients get
> > > > ESTALE when accessing a newly created export (nfs2 below).
> > > > This seems to cause the following log entry, and explains the ESTALE
> > > > response.
> > > > 04/10/2018 T15:16:59.769027-0700 : nfs-ganesha-26627[sigmgr]
> > > > 1595 :claim_posix_filesystems :FSAL :INFO :Root fs for export
> > > > /exports/nfs1 is
> > > > /exports/nfs2
> > > >
> > > > We use our own Exportid and unique FSID configured for each
> > > > export in the configuration file.
> > > >
> > > > I would like to know more about the intent and purpose of the
> > > > usage of device major and minor of an export in this context.
> > > > Any help in fixing this reuse issue is also appreciated.
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > Regards.
> > > > Krishna Harathi
> > > > _______________________________________________
> > > > Devel mailing list -- devel(a)lists.nfs-ganesha.org
> > > > To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
> > >
> > >
> >
>
--
Jeff Layton <jlayton(a)redhat.com>