Hello All,

This is regarding a problem observed in NFS-Ganesha. With the VFS FSAL, an attempt to
remove a directory fails with the error "Device or resource busy". Interestingly, the
client ends up mounting that directory on its side.

I investigated this and found that the fsid of the directory changes, which is why the
removal fails. Further investigation showed that the problem is actually introduced at
directory creation time.


Following is the relevant portion of the code from makedir():
--------------->8--------------------------
	if (attrib->valid_mask) {
		/* Now per support_ex API, if there are any other attributes
		 * set, go ahead and get them set now.
		 */
		status = (*handle)->obj_ops->setattr2(*handle, false, NULL,
						     attrib);
		if (FSAL_IS_ERROR(status)) {
			/* Release the handle we just allocated. */
			LogFullDebug(COMPONENT_FSAL,
				     "setattr2 status=%s",
				     fsal_err_txt(status));
			(*handle)->obj_ops->release(*handle);
			*handle = NULL;
		} else if (attrs_out != NULL) {
			status = (*handle)->obj_ops->getattrs(*handle,
							     attrs_out);
			if (FSAL_IS_ERROR(status) &&
			    (attrs_out->request_mask & ATTR_RDATTR_ERR) == 0) {
				/* Get attributes failed and caller expected
				 * to get the attributes.
				 */
				goto fileerr;
			}
		}
	} else {
		status.major = ERR_FSAL_NO_ERROR;
		status.minor = 0;

		if (attrs_out != NULL) {
			/* Since we haven't set any attributes other than what
			 * was set on create, just use the stat results we used
			 * to create the fsal_obj_handle.
			 */
			posix2fsal_attributes_all(&stat, attrs_out); <=========
		}
----------------->8--------------------------

When we create a directory with the VFS FSAL, we land in the else block, where the fsid is
set from the device major and minor numbers regardless of the fsid type or even the
fsid_device setting. As a result, the mdcache entry carries an fsid built from the device
major and minor numbers, and this causes the directory removal to fail.

This code was added by the following commit:
--------------->8--------------------------
commit de7f5712c898d9f5b51a8188cbee5282a9b9d533
Author: Frank S. Filz <ffilzlnx@mindspring.com>
Date:   Wed Jun 15 09:34:14 2016 -0700

    Pass attributes out on fsal_obj_handle create and readdir

    When an FSAL object is created, most FSALs must get the file attributes
    in order to properly create the fsal_obj_handle, in this case, since many
    callers (including MDCACHE) will need attributes, it makes sense to pass
    them up instead of having an additional getattrs call.

    The same can be said of readdir, so go ahead an pass the attributes up
    on the call back.

    MDCACHE has also been somewhat re-organized to perform cached object
    creates in the same way as much as makes sense.

    Change-Id: Ia89bb16356fa5117169e80b66b7d27e0a6a0e23e
    Signed-off-by: Frank S. Filz <ffilzlnx@mindspring.com>
---------------->8--------------------------

It looks like mkdir is one of the affected operations, though I have not tested the others.

One possible solution, I think, is to fill in the fsid based on the global fsid_device
setting, or perhaps to call getattrs unconditionally. I have tested calling getattrs
instead of using the stat data.

Please suggest.

Thanks,
Rahul.