Thanks a lot for this explanation. I posted the corresponding patch on
gerrithub.
Regards,
Patrice
On 05/17/18 16:37, Frank Filz wrote:
> Thank you, Frank and Daniel, for all your answers.
>
> As a first and simple step, I think simply changing the code returned by pxy_close
> from ERR_FSAL_NO_ERROR to ERR_FSAL_NOT_OPENED is good enough choice.
> NFSv4.1 is our main targeted protocol and not mixing FSAL_PROXY with other
> FSAL is a fully allowed restriction.
>
> But before changing it and submitting the patch, I would like to be sure to
> understand. I thought that pxy_close was not used anymore and that
> pxy_close2, the new API version of close was used instead ? Why should we still
> take care of pxy_close ?
The close (as opposed to close2) method is called primarily by LRU to close global file
descriptors (which of course FSAL_PROXY doesn't have).
To explain better the purpose...
The new support_ex API was started originally to help FSAL_VFS deal with the stupid POSIX
behavior that a close of any file descriptor open on a file requires all fcntl byte range
locks to be held by the process for the file dropped. This required Ganesha to do crazy
things to keep from losing locks. In response to this issue, Jeff Layton introduced Open
File Description Locks which attach to an open file description (which is related to but
distinct from an open file descriptor). From a Ganesha perspective, this allows each file
descriptor to be a lock owner rather than the process (it also happens to allow sharing
lock owners between processes if file descriptions are inherited through fork/clone). To
track these additional file descriptors, the support_ex added the capability of attaching
a file descriptor to a state_t (leaving the meaning of "file descriptor" loose,
so that for example, for FSAL_PROXY a "file descriptor" could actually end up
being a stateid). Along the way, the support_ex also got the improved open2 that in
addition to opening a file descriptor for an NFS v4 open state_t (or a 9p fid), allows for
a more atomic open.
So with NFS v4, most file I/O is done via an open state or a lock state.
But with NFS v3, there is no way to attach to a state for I/O, and NFS v4 does have the
special stateid values for anonymous I/O, plus GETATTR doesn't attach to a stateid.
For those purposes, we still need a "global" file descriptor. For performance
reasons, it is desirable to be able to keep them open for multiple I/O requests, so I
retained some of the pre-support_ex open file logic, including the LRU management of open
files. This the LRU thread will make close (as opposed to close2) calls via fsal_close.
There are a few other callers of fsal_close, for example, open2_by_name will call
fsal_close if there is a permission error and no state_t was passed (which is basically an
NFS v3 CREATE) since the FSAL's open2 method may have opened a global fd in the
process of doing the create (which might have been UNCHECKED, i.e. not exclusive, and thus
require a permission check after we instantiate the fsal_obj_handle for an existing file,
and of course that permission check may fail). The fsal_rename and fsal_remove functions,
since they are unlinking a file, also call fsal_close to assure that an open global file
descriptor doesn't prevent actual deletion of the file.
Note that there has been some discussion that we could simply call close2 with a NULL
state_t, however, that would still have to do the same logic. I'm not sure if there
would actually be much saving there...
Frank
> Best regards,
> Patrice
>
>
> On 05/16/18 23:00, Frank Filz wrote:
>>> Dan and I chatted more on IRC.
>>>
>>> So if there is no need to run FSAL_PROXY with other FSALs, returning
>>> ERR_FSAL_NOT_OPENED should work to eliminate the LogCrit.
>>>
>>> However, the NFS v3 CREATE case will still cause increment of
>>> open_fd_count, however, no messages will ensue (the count going above
>>> the hi water mark is only logged in a function never called in the
>>> FSAL_PROXY paths). Unfortunately this will eventually result in Futility
count
> behavior...
>>> A not too ugly hack is for FSAL_PROXY to decrement open_fd_count in
>>> the NFS
>>> v3 CREATE case (which will go negative unless there are other
>>> non-PROXY
>>> opens...) so that it gets incremented back up by open2_by_name. If
>>> other FSALs are in use, there is some possibility of a race such that
>>> fsal_close is called to close the last global fd from another FSAL
>>> between the decrement and increment, thus resulting in a negative count and
> a LogCrit.
>>> Another option would be for FSAL_PROXY to note that an
>>> fsal_obj_handle was created in such a create case, and the first time
>>> fsal_close is called, return ERR_FSAL_NO_ERROR so the extra open_fd_count
> is decremented.
>> A bit more on this final option. I suggest adding an fsal_fd to the
> pxy_obj_handle. Then in the condition where NFS v3 CREATE would cause
> open2_by_name to increment open_fd_count, you indicate the file is open (for
> read/write? It really doesn't matter). Then pxy_close would check that condition
> just like all the other FSALs and return either ERR_FSAL_NOT_OPENED or
> ERR_FSAL_NO_ERROR. If opened, it would clear the flags.
>> Down the road, when more state handling is added, you will eventually want
> to call fsal_find_fd to find a usable stateid. At that point, you have a pxy_fd
> structure that also has room for stateid4. The global fd you fill in a special
> stateid (you could even keep track of the "open flags" and fill in the
most
> appropriate special stateid, which then gets automatically used). At that point of
> course fsal_reopen_obj will be being called and the open_fd_count will be
> incremented for anonymous I/O (well, unless you always "open" the file
> read/write in which case the fd will never get reopened and you can still pick the
> appropriate special stateid for the actual operation).
>> Frank
>>
>>> Frank
>>>
>>>> -----Original Message-----
>>>> From: Daniel Gryniewicz [mailto:dang@redhat.com]
>>>> Sent: Wednesday, May 16, 2018 7:36 AM
>>>> To: devel(a)lists.nfs-ganesha.org
>>>> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY : "open_fd_count is
negative"
>>>>
>>>> Yeah, this is a problem. Mixing doing it in common code and in the
>>>> FSALs will always cause this problem. I'm tempted to say it should
>>>> always be done in common code, so it never breaks; but that way we
>>>> can't have FSALs that don't actually use real FDs control their
contribution.
>>>> We're a long way from that working, though.
>>>>
>>>> I don't see a good way to fix this, other than PROXY actually using
>>>> the proper infrastructure. The way it is now, you cannot mix PROXY
>>>> and any other FSAL and have a working system.
>>>>
>>>> Daniel
>>>>
>>>> On 05/16/2018 10:16 AM, Frank Filz wrote:
>>>>>> How about this:
>>>>>>
>>>>>> pxy_close() just returns ERR_FSAL_NO_ERROR. How about, instead,
>>>>>> it returns ERR_FSAL_NOT_OPENED?
>>>>> Yea, that would do it. I see that fsal_close converts
>>>>> ERR_FSAL_NOT_OPENED
>>>> to ERR_FSAL_NO_ERROR.
>>>>> Though that leaves us with this increment occurring without
decrement:
>>>>>
>>>>> FSAL/fsal_helper.c:413 open2_by_name : (void )
>>>>> atomic_inc_size_t(&open_fd_count);
>>>>>
>>>>>> Daniel
>>>>>>
>>>>>> On 05/16/2018 09:28 AM, Frank Filz wrote:
>>>>>>> This is an example of why we need help from the FSAL in
managing
>>>>>> open_fd_count.
>>>>>>> The counter is decremented in fsal_close.
>>>>>>>
>>>>>>> The primary place it is incremented is in fsal_reopen_obj,
which
>>>>>>> FSAL_PROXY
>>>>>> never calls since it has absolutely no need to maintain a global
>>>>>> file descriptor (it uses the always available special stateids).
>>>>>>> We could move the decrement into the FSAL itself which would
then
>>>>>>> know
>>>>>> exactly when it has closed a resource it was counting with
> open_fd_count.
>>>>>>> Frank
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Daniel Gryniewicz [mailto:dang@redhat.com]
>>>>>>>> Sent: Wednesday, May 16, 2018 5:29 AM
>>>>>>>> To: devel(a)lists.nfs-ganesha.org
>>>>>>>> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY :
"open_fd_count is
>>>> negative"
>>>>>>>> Hi, Patrice,
>>>>>>>>
>>>>>>>> This is definitely a problem. The LRU uses open_fd_count
to
>>>>>>>> determine when it needs to run, and when it's having
problems.
>>>>>>>> I made that change specifically to catch problems where
it goes
>>>>>>>> negative (which the LRU counts as huge...)
>>>>>>>>
>>>>>>>> Do you have a reproducer? Preferably a relatively simple
one?
>>>>>>>> I'd love to get this fixed.
>>>>>>>>
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>> On 05/16/2018 04:28 AM, patrice.lucas(a)cea.fr wrote:
>>>>>>>>> Since 2.7-dev.8 and the Daniel's patch fc5e13ecb
"adding a
>>>>>>>>> message when open_fd_count comes negative",
FSAL_PROXY tests
>>>>>>>>> logs are full of these critical messages :
"open_fd_count is negative".
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Should I worry about and try to fix it ? Or is this
>>>>>>>>> open_fd_count deprecated and unused ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Patrice
>>>>>>>>> _______________________________________________
>>>>>>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To
>>>>>>>>> unsubscribe send an email to
devel-leave(a)lists.nfs-ganesha.org
>>>>>>>> _______________________________________________
>>>>>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To
unsubscribe
>>>>>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>>>> _______________________________________________
>>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
>>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>>>>
>>>> _______________________________________________
>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>> _______________________________________________
>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send
>>> an email to devel-leave(a)lists.nfs-ganesha.org
>> _______________________________________________
>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send
>> an email to devel-leave(a)lists.nfs-ganesha.org
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send an email to
> devel-leave(a)lists.nfs-ganesha.org
--
Patrice LUCAS
Ingenieur-Chercheur, CEA-DAM/DSSI/SISR/LA2S
tel : +33 (0)1 69 26 47 86
e-mail : patrice.lucas(a)cea.fr