Thank you, Frank and Daniel, for all your answers.
As a first and simple step, I think simply changing the code returned by
pxy_close from ERR_FSAL_NO_ERROR to ERR_FSAL_NOT_OPENED is good enough
choice. NFSv4.1 is our main targeted protocol and not mixing FSAL_PROXY
with other FSAL is a fully allowed restriction.
But before changing it and submitting the patch, I would like to be sure
to understand. I thought that pxy_close was not used anymore and that
pxy_close2, the new API version of close was used instead ? Why should
we still take care of pxy_close ?
Best regards,
Patrice
On 05/16/18 23:00, Frank Filz wrote:
> Dan and I chatted more on IRC.
>
> So if there is no need to run FSAL_PROXY with other FSALs, returning
> ERR_FSAL_NOT_OPENED should work to eliminate the LogCrit.
>
> However, the NFS v3 CREATE case will still cause increment of open_fd_count,
> however, no messages will ensue (the count going above the hi water mark is
> only logged in a function never called in the FSAL_PROXY paths). Unfortunately
> this will eventually result in Futility count behavior...
>
> A not too ugly hack is for FSAL_PROXY to decrement open_fd_count in the NFS
> v3 CREATE case (which will go negative unless there are other non-PROXY
> opens...) so that it gets incremented back up by open2_by_name. If other FSALs
> are in use, there is some possibility of a race such that fsal_close is called to
> close the last global fd from another FSAL between the decrement and
> increment, thus resulting in a negative count and a LogCrit.
>
> Another option would be for FSAL_PROXY to note that an fsal_obj_handle was
> created in such a create case, and the first time fsal_close is called, return
> ERR_FSAL_NO_ERROR so the extra open_fd_count is decremented.
A bit more on this final option. I suggest adding an fsal_fd to the pxy_obj_handle. Then
in the condition where NFS v3 CREATE would cause open2_by_name to increment open_fd_count,
you indicate the file is open (for read/write? It really doesn't matter). Then
pxy_close would check that condition just like all the other FSALs and return either
ERR_FSAL_NOT_OPENED or ERR_FSAL_NO_ERROR. If opened, it would clear the flags.
Down the road, when more state handling is added, you will eventually want to call
fsal_find_fd to find a usable stateid. At that point, you have a pxy_fd structure that
also has room for stateid4. The global fd you fill in a special stateid (you could even
keep track of the "open flags" and fill in the most appropriate special stateid,
which then gets automatically used). At that point of course fsal_reopen_obj will be being
called and the open_fd_count will be incremented for anonymous I/O (well, unless you
always "open" the file read/write in which case the fd will never get reopened
and you can still pick the appropriate special stateid for the actual operation).
Frank
> Frank
>
>> -----Original Message-----
>> From: Daniel Gryniewicz [mailto:dang@redhat.com]
>> Sent: Wednesday, May 16, 2018 7:36 AM
>> To: devel(a)lists.nfs-ganesha.org
>> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY : "open_fd_count is
negative"
>>
>> Yeah, this is a problem. Mixing doing it in common code and in the
>> FSALs will always cause this problem. I'm tempted to say it should
>> always be done in common code, so it never breaks; but that way we
>> can't have FSALs that don't actually use real FDs control their
contribution.
>> We're a long way from that working, though.
>>
>> I don't see a good way to fix this, other than PROXY actually using
>> the proper infrastructure. The way it is now, you cannot mix PROXY
>> and any other FSAL and have a working system.
>>
>> Daniel
>>
>> On 05/16/2018 10:16 AM, Frank Filz wrote:
>>>> How about this:
>>>>
>>>> pxy_close() just returns ERR_FSAL_NO_ERROR. How about, instead, it
>>>> returns ERR_FSAL_NOT_OPENED?
>>> Yea, that would do it. I see that fsal_close converts
>>> ERR_FSAL_NOT_OPENED
>> to ERR_FSAL_NO_ERROR.
>>> Though that leaves us with this increment occurring without decrement:
>>>
>>> FSAL/fsal_helper.c:413 open2_by_name : (void )
>>> atomic_inc_size_t(&open_fd_count);
>>>
>>>> Daniel
>>>>
>>>> On 05/16/2018 09:28 AM, Frank Filz wrote:
>>>>> This is an example of why we need help from the FSAL in managing
>>>> open_fd_count.
>>>>> The counter is decremented in fsal_close.
>>>>>
>>>>> The primary place it is incremented is in fsal_reopen_obj, which
>>>>> FSAL_PROXY
>>>> never calls since it has absolutely no need to maintain a global
>>>> file descriptor (it uses the always available special stateids).
>>>>> We could move the decrement into the FSAL itself which would then
>>>>> know
>>>> exactly when it has closed a resource it was counting with
open_fd_count.
>>>>> Frank
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Daniel Gryniewicz [mailto:dang@redhat.com]
>>>>>> Sent: Wednesday, May 16, 2018 5:29 AM
>>>>>> To: devel(a)lists.nfs-ganesha.org
>>>>>> Subject: [Nfs-ganesha-devel] Re: FSAL_PROXY : "open_fd_count
is
>> negative"
>>>>>> Hi, Patrice,
>>>>>>
>>>>>> This is definitely a problem. The LRU uses open_fd_count to
>>>>>> determine when it needs to run, and when it's having
problems. I
>>>>>> made that change specifically to catch problems where it goes
>>>>>> negative (which the LRU counts as huge...)
>>>>>>
>>>>>> Do you have a reproducer? Preferably a relatively simple one?
>>>>>> I'd love to get this fixed.
>>>>>>
>>>>>> Daniel
>>>>>>
>>>>>> On 05/16/2018 04:28 AM, patrice.lucas(a)cea.fr wrote:
>>>>>>> Since 2.7-dev.8 and the Daniel's patch fc5e13ecb
"adding a
>>>>>>> message when open_fd_count comes negative", FSAL_PROXY
tests
>>>>>>> logs are full of these critical messages :
"open_fd_count is negative".
>>>>>>>
>>>>>>>
>>>>>>> Should I worry about and try to fix it ? Or is this
>>>>>>> open_fd_count deprecated and unused ?
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Patrice
>>>>>>> _______________________________________________
>>>>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To
unsubscribe
>>>>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>>>>> _______________________________________________
>>>>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
>>>>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>> _______________________________________________
>>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe
>>> send an email to devel-leave(a)lists.nfs-ganesha.org
>>>
>> _______________________________________________
>> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send
>> an email to devel-leave(a)lists.nfs-ganesha.org
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send an email to
> devel-leave(a)lists.nfs-ganesha.org
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org