Hi,
I am having a problem with ACLs in NFS Ganesha v2.6.3. Based on what I believe to be the
cause, I think the problem exists in the tip too. I am using a proprietary FSAL.
The symptom of the problem is simple. If I create a file and quickly read the ACLs,
nothing is returned:
# echo test > test.txt
# nfs4_getfacl test.txt
#
If I restart ganesha, I see the correct ACLs:
# nfs4_getfacl test.txt
A::OWNER@:rwatTnNcCy
A::GROUP@:rtncy
A::EVERYONE@:rtncy
Also, if I create a file, and wait over 60 seconds, then I can read the correct ACLs.
This is all working fine in v2.5.1.
I think I see the problem in the code. In my case, the nfs client first sends a GETATTR
request with the ACL bit set in the request mask. But then it sends another one without
the ACL bit set. During this second request, when ACLs are not requested, the entry in
mdcache is updated in mdcache_refresh_attrs.
In 2.5.1, after calling the sub-FSAL to get the attributes, mdcache_refresh_attrs does:
	if (entry->attrs.acl != NULL) {
		/* We used to have an ACL... */
		if (need_acl) {
			/* We requested update of an existing ACL, release the
			 * old one.
			 */
			nfs4_acl_release_entry(entry->attrs.acl);
		} else {
			/* The ACL wasn't requested, move it into the
			 * new attributes so we will retain it and make
			 * it such that the entry attrs DO request the
			 * ACL.
			 */
			attrs.acl = entry->attrs.acl;
			attrs.valid_mask |= ATTR_ACL;
			entry->attrs.request_mask |= ATTR_ACL;
		}
		/* ACL was released or moved to new attributes. */
		entry->attrs.acl = NULL;
	}
Here, the second path is taken. It moves the ACL to attrs, and it also sets the ACL bit
in the entry's request mask. Soon after, fsal_copy_attrs is called, which copies the ACL
back to the entry. It does this because the ACL bit in the entry's request mask is set.
If it were not set, the ACL would not be copied.
Unfortunately, this is not what happens in 2.6.3, where this functionality was moved into
mdc_update_attr_cache:
	if (entry->attrs.acl != NULL) {
		/* We used to have an ACL... */
		if (attrs->acl != NULL) {
			/* We got an ACL from the sub FSAL whether we asked for
			 * it or not, given that we had an ACL before, and we
			 * got a new one, update the ACL, so release the old
			 * one.
			 */
			nfs4_acl_release_entry(entry->attrs.acl);
		} else {
			/* A new ACL wasn't provided, so move the old one
			 * into the new attributes so it will be preserved
			 * by the fsal_copy_attrs.
			 */
			attrs->acl = entry->attrs.acl;
			attrs->valid_mask |= ATTR_ACL;
		}
		/* NOTE: Because we already had an ACL,
		 * entry->attrs.request_mask MUST have the ATTR_ACL bit set.
		 * This will assure that fsal_copy_attrs below will copy the
		 * selected ACL (old or new) into entry->attrs.
		 */
		/* ACL was released or moved to new attributes. */
		entry->attrs.acl = NULL;
	} else if (attrs->acl != NULL) {
		/* We didn't have an ACL before, but we got a new one. We may
		 * not have asked for it, but receive it anyway.
		 */
		entry->attrs.request_mask |= ATTR_ACL;
	}
We are taking the path where it says "A new ACL wasn't provided". It moves the existing
ACL into attrs, which we would like fsal_copy_attrs to copy back later, just like in
2.5.1. But note that in 2.6.3 this path never sets the ACL bit in the entry's request
mask; it only sets the bit in the valid mask of attrs. Therefore, fsal_copy_attrs will
not copy the ACL back to the entry, and the ACL will be lost forever. Even worse, the
trusted ACL flag in mde_flags is set, so now the caller thinks there are no ACLs at all!
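To make the difference concrete, here is a minimal standalone sketch of the two code
paths as I understand them. This is not Ganesha code: model_attrs, model_copy_attrs,
model_refresh and MODEL_ATTR_ACL are hypothetical stand-ins, and the copy step is only
my simplified reading of fsal_copy_attrs (the ACL is kept only when the destination's
request mask has the ACL bit). It also assumes the entry holds an ACL while its request
mask lacks the ACL bit, which is what I believe happens in my case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ATTR_ACL; the real value does not matter here. */
#define MODEL_ATTR_ACL 0x1u

/* Hypothetical, stripped-down stand-in for the attribute list. */
struct model_attrs {
	uint32_t request_mask;
	uint32_t valid_mask;
	const char *acl;	/* stands in for the ACL pointer */
};

/* Simplified model of the copy step: dest keeps its own request mask,
 * and the ACL survives only if that mask has the ACL bit set.
 */
static void model_copy_attrs(struct model_attrs *dest, struct model_attrs *src)
{
	uint32_t saved_request_mask = dest->request_mask;

	*dest = *src;
	dest->request_mask = saved_request_mask;

	if (saved_request_mask & MODEL_ATTR_ACL) {
		/* The ACL reference passes from src to dest. */
		src->acl = NULL;
		src->valid_mask &= ~MODEL_ATTR_ACL;
	} else {
		/* dest did not ask for the ACL, so it is dropped. */
		dest->acl = NULL;
		dest->valid_mask &= ~MODEL_ATTR_ACL;
	}
}

/* Model of the "no new ACL was provided" path.
 * set_request_bit != 0 mimics the 2.5.1 snippet, 0 mimics the 2.6.3 one.
 * Returns the ACL the entry holds after the refresh.
 */
static const char *model_refresh(struct model_attrs *entry, int set_request_bit)
{
	/* Attributes from the sub-FSAL; the ACL was not requested. */
	struct model_attrs fresh = { 0, 0, NULL };

	if (entry->acl != NULL) {
		fresh.acl = entry->acl;
		fresh.valid_mask |= MODEL_ATTR_ACL;
		if (set_request_bit)
			entry->request_mask |= MODEL_ATTR_ACL;
		entry->acl = NULL;
	}

	model_copy_attrs(entry, &fresh);
	return entry->acl;
}

/* Usage: the same starting entry run through both variants. */
static void model_demo(void)
{
	struct model_attrs old_style = { 0, MODEL_ATTR_ACL, "acl" };
	struct model_attrs new_style = { 0, MODEL_ATTR_ACL, "acl" };

	assert(model_refresh(&old_style, 1) != NULL);	/* ACL retained */
	assert(model_refresh(&new_style, 0) == NULL);	/* ACL dropped */
}
```

In this model the only difference between the two variants is whether the entry's
request mask gets the ACL bit before the copy, which matches what I see in the two
snippets above.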
The logic looks the same in the current 2.8 code to me.
Please let me know whether my analysis is correct and, if so, whether there is a
workaround.
Bill