Dan and I chatted more on IRC.
So if there is no need to run FSAL_PROXY with other FSALs, returning
ERR_FSAL_NOT_OPENED should work to eliminate the LogCrit.
However, the NFS v3 CREATE case will still cause increment of open_fd_count,
however, no messages will ensue (the count going above the hi water mark is
only logged in a function never called in the FSAL_PROXY paths). Unfortunately
this will eventually result in Futility count behavior...
A not too ugly hack is for FSAL_PROXY to decrement open_fd_count in the NFS
v3 CREATE case (which will go negative unless there are other non-PROXY
opens...) so that it gets incremented back up by open2_by_name. If other FSALs
are in use, there is some possibility of a race such that fsal_close is called to
close the last global fd from another FSAL between the decrement and
increment, thus resulting in a negative count and a LogCrit.
Another option would be for FSAL_PROXY to note that an fsal_obj_handle was
created in such a create case, and the first time fsal_close is called, return
ERR_FSAL_NO_ERROR so the extra open_fd_count is decremented.
A bit more on this final option. I suggest adding an fsal_fd to the pxy_obj_handle. Then
in the condition where NFS v3 CREATE would cause open2_by_name to increment open_fd_count,
you indicate the file is open (for read/write? It really doesn't matter). Then
pxy_close would check that condition just like all the other FSALs and return either
ERR_FSAL_NOT_OPENED or ERR_FSAL_NO_ERROR. If opened, it would clear the flags.
Down the road, when more state handling is added, you will eventually want to call
fsal_find_fd to find a usable stateid. At that point, you have a pxy_fd structure that
also has room for stateid4. The global fd you fill in a special stateid (you could even
keep track of the "open flags" and fill in the most appropriate special stateid,
which then gets automatically used). At that point of course fsal_reopen_obj will be being
called and the open_fd_count will be incremented for anonymous I/O (well, unless you
always "open" the file read/write in which case the fd will never get reopened
and you can still pick the appropriate special stateid for the actual operation).
Frank
Frank
> -----Original Message-----
> From: Daniel Gryniewicz [mailto:dang@redhat.com]
> Sent: Wednesday, May 16, 2018 7:36 AM
> To: devel(a)lists.nfs-ganesha.org
> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY : "open_fd_count is negative"
>
> Yeah, this is a problem. Mixing doing it in common code and in the
> FSALs will always cause this problem. I'm tempted to say it should
> always be done in common code, so it never breaks; but that way we
> can't have FSALs that don't actually use real FDs control their
contribution.
> We're a long way from that working, though.
>
> I don't see a good way to fix this, other than PROXY actually using
> the proper infrastructure. The way it is now, you cannot mix PROXY
> and any other FSAL and have a working system.
>
> Daniel
>
> On 05/16/2018 10:16 AM, Frank Filz wrote:
> >> How about this:
> >>
> >> pxy_close() just returns ERR_FSAL_NO_ERROR. How about, instead, it
> >> returns ERR_FSAL_NOT_OPENED?
> >
> > Yea, that would do it. I see that fsal_close converts
> > ERR_FSAL_NOT_OPENED
> to ERR_FSAL_NO_ERROR.
> >
> > Though that leaves us with this increment occurring without decrement:
> >
> > FSAL/fsal_helper.c:413 open2_by_name : (void )
> > atomic_inc_size_t(&open_fd_count);
> >
> >> Daniel
> >>
> >> On 05/16/2018 09:28 AM, Frank Filz wrote:
> >>> This is an example of why we need help from the FSAL in managing
> >> open_fd_count.
> >>>
> >>> The counter is decremented in fsal_close.
> >>>
> >>> The primary place it is incremented is in fsal_reopen_obj, which
> >>> FSAL_PROXY
> >> never calls since it has absolutely no need to maintain a global
> >> file descriptor (it uses the always available special stateids).
> >>>
> >>> We could move the decrement into the FSAL itself which would then
> >>> know
> >> exactly when it has closed a resource it was counting with open_fd_count.
> >>>
> >>> Frank
> >>>
> >>>> -----Original Message-----
> >>>> From: Daniel Gryniewicz [mailto:dang@redhat.com]
> >>>> Sent: Wednesday, May 16, 2018 5:29 AM
> >>>> To: devel(a)lists.nfs-ganesha.org
> >>>> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY : "open_fd_count
is
> negative"
> >>>>
> >>>> Hi, Patrice,
> >>>>
> >>>> This is definitely a problem. The LRU uses open_fd_count to
> >>>> determine when it needs to run, and when it's having problems.
I
> >>>> made that change specifically to catch problems where it goes
> >>>> negative (which the LRU counts as huge...)
> >>>>
> >>>> Do you have a reproducer? Preferably a relatively simple one?
> >>>> I'd love to get this fixed.
> >>>>
> >>>> Daniel
> >>>>
> >>>> On 05/16/2018 04:28 AM, patrice.lucas(a)cea.fr wrote:
> >>>>> Since 2.7-dev.8 and the Daniel's patch fc5e13ecb
"adding a
> >>>>> message when open_fd_count comes negative", FSAL_PROXY
tests
> >>>>> logs are full of these critical messages : "open_fd_count
is negative".
> >>>>>
> >>>>>
> >>>>> Should I worry about and try to fix it ? Or is this
> >>>>> open_fd_count deprecated and unused ?
> >>>>>
> >>>>>
> >>>>> Best regards,
> >>>>>
> >>>>> Patrice
> >>>>> _______________________________________________
> >>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To
unsubscribe
> >>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
> >>>> _______________________________________________
> >>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
> >>>> send an email to devel-leave(a)lists.nfs-ganesha.org
> >>>
> > _______________________________________________
> > Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
> > send an email to devel-leave(a)lists.nfs-ganesha.org
> >
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send
> an email to devel-leave(a)lists.nfs-ganesha.org
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send an email to
devel-leave(a)lists.nfs-ganesha.org