On 3/20/20 6:56 AM, Ashish Sangwan wrote:
On Tue, Mar 17, 2020 at 6:57 PM Daniel Gryniewicz
<dang(a)redhat.com> wrote:
>
> Replying inline, so I can handle both emails at once.
>
> On 3/15/20 1:17 PM, Ashish Sangwan wrote:
>> The other issue I saw with objects lying around even after the
>> corresponding export is deleted is when the export is created again
>> with same attributes.
>> We start getting cache hits for the old objects. The "struct
>> fsal_filesystem *fs" for these objects is now pointing to the fs which
>> we already freed in vfs_unexport_filesystems by calling
>> release_posix_file_system()
>
> I'm not sure this is a problem. The old cache entry is being found
> because VFS generates the same key the second time around as it did on
> the first time around. This means that the cache entry would be the
> same, and so would be recreated with the same data. Re-using the old
> cache entry should be fine. And, if the export is not re-created
> immediately, those entries will be recycled by the LRU.
I am able to hit "Use after free" crash with ASAN.
As mentioned above, it's because the cache entry's fs pointer is
pointing to freed memory.
Okay, that's definitely an issue.
Daniel
>
>>
>> On Sun, Mar 15, 2020 at 10:09 PM Ashish Sangwan
>> <ashishsangwan2(a)gmail.com> wrote:
>>>
>>> Hi Daniel,
>>>
>>> It looks like mdcache_lru_cleanup_try_push() can be improved for the
>>> cases when there are ops executing parallely with an unexport request.
>>> In this case mdcache_lru_cleanup_try_push() is not able to drop the
>>> sentinel ref and mark the entry for cleanup. This is because one extra
>>> ref is taken by the op which is currently executing. The total number
>>> of refs are > 2.
>>>
>>> Once the current op execution finishes (after the execution of
>>> mdcache_unexport()) the sentinel ref remains and the entry is also not
>>> marked for cleanup even though no export is referring to it
>>> (entry->first_export_id == -1).
>>> Is it possible to modify mdcache_lru_cleanup_try_push() such that even
>>> if the refcnt is > 2 but if the entry->first_export_id == -1 (the
>>> current export which is going away was the only one referring to this
>>> entry), we should drop the sentinel ref and mark the entry for
>>> cleanup? This will ensure we cleanup the entries when the current op
>>> does put_ref and these are reused faster than other mdcache entries
>>> which are actually pointing to valid exports.
>>>
>
> This would be possible, but I don't think it would help significantly.
> It narrows the race window by ~9 lines of code (admittedly including 3
> lock ops), but the window between when the ref it taken and
> first_export_id is set to -1 still exists.
>
> These entries are not leaked, the LRU will recycle them eventually,
> since only their sentinel ref is left. Or, if the export is re-created
> exactly, they can be reused in the new export.
>
> Daniel
>