Thanks for the kind explanation, Jeff.
The patch looks good, and it works in my testing. I still have a few questions about your explanation (quoted below), though. Thanks again for your help.

FSAL_CEPH depends on the MDS preserving the state of ganesha's ceph
client if a server head has to be restarted. If we just tear down the
connection in this situation (as in a normal umount), the MDS would
release all of the state held by that ganesha, and other active/active
server heads could sneak in and steal the locks that it had previously
held.

Aborting the connection ensures that ganesha will have no further
communication with the MDS until the server comes back. To the MDS, it
looks like the ganesha client just dropped off the net. The MDS will
then keep the state held by that client until it comes back or times
out. Once the client does come back, it then ensures that the other
ganesha servers are enforcing the grace period, and then it will ask the
MDS to kill off the old session.
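
To check my understanding of the description above, here is a toy model in C. It is only a sketch: every name in it (toy_session, toy_abort, toy_reclaim, TOY_SESSION_TIMEOUT and so on) is made up for illustration and is not part of libcephfs, the MDS, or ganesha, and the timeout value is arbitrary. It only tries to capture the difference between a clean unmount and an abort, and why the old session is killed only after grace is enforced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in Ceph or ganesha. */
enum toy_state { TOY_LIVE, TOY_STALE, TOY_GONE };

struct toy_session {
	enum toy_state state;
	int locks_held;	/* locks the MDS still holds for this client */
	int idle_ticks;	/* timer ticks since the client went silent */
};

#define TOY_SESSION_TIMEOUT 3	/* arbitrary value, for illustration only */

/* Clean unmount: the MDS releases all state for the client at once. */
void toy_clean_unmount(struct toy_session *s)
{
	s->locks_held = 0;
	s->state = TOY_GONE;
}

/* Abort: the client just goes silent, so the MDS keeps its state. */
void toy_abort(struct toy_session *s)
{
	assert(s->state == TOY_LIVE);
	s->state = TOY_STALE;
	s->idle_ticks = 0;
}

/* One MDS timer tick: a stale session is dropped once it times out. */
void toy_mds_tick(struct toy_session *s)
{
	if (s->state != TOY_STALE)
		return;
	if (++s->idle_ticks >= TOY_SESSION_TIMEOUT) {
		s->locks_held = 0;
		s->state = TOY_GONE;
	}
}

/*
 * Restarted server head: kill the old session only once the other
 * heads are enforcing the grace period. Returns true if it was killed.
 */
bool toy_reclaim(struct toy_session *old, bool grace_enforced)
{
	if (!grace_enforced || old->state != TOY_STALE)
		return false;
	old->locks_held = 0;
	old->state = TOY_GONE;
	return true;
}

/*
 * Can another server head hand out a conflicting lock? Not while the
 * old session still holds locks, and not while grace is being enforced.
 */
bool toy_other_head_can_lock(const struct toy_session *s, bool in_grace)
{
	return s->locks_held == 0 && !in_grace;
}
```

In this model, a clean unmount lets another head take the locks immediately, while an abort keeps them protected until either the timeout expires or the restarted head kills the old session under grace. Please correct me if the model is wrong.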

Does the MDS kill the old session when ganesha calls ceph_start_reclaim during startup? Why is it then added to the OSD blacklist?
For active/active ganesha servers, what do the other servers do when another ganesha enters the grace period, and what is the effect of keeping the previous state here?
I still don't fully understand this process, so I hope you can tell me more.

At 2020-09-15 00:54:31, "Jeff Layton" <jlayton@redhat.com> wrote:
>On Mon, 2020-09-14 at 09:26 -0400, Jeff Layton wrote:
>> On Mon, 2020-09-14 at 19:32 +0800, liuwei wrote:
>> > Hi,
>> > I found many errors in ganesha.log when stopping nfs-ganesha.service, as follows.
>> > ganesha.log:
>> > ganesha.nfsd-1550083[Admin] mdcache_lru_clean INODE:F_DBG:Trusting op_ctx export id 2
>> > ganesha.nfsd-1550083[Admin] posix2fsal_error:FSAL:CRIT:Default case mapping Transport endpoint is not connected (107) to ERR_FSAL_SERVERFAULT
>> > ganesha.nfsd-1550083[Admin] mdcache_lru_clean:INODE LRU:CRIT:Error closing file in cleanup:Undefined server error
>> > My version info: Ganesha-3.3+FSAL_CEPH (ceph version 14.2.10).
>> > 
>
>We should probably add a patch similar to this. I don't have the cycles
>to test this at the moment. liuwei, would you be able to do so?
>
>---------------------------8<----------------------------
>
>FSAL_CEPH: paper over -ENOTCONN return from close when shutting down
>
>When we're shutting down the server, we'll usually abort the connection
>first. The mdcache will then try to clean up entries and issue a ->close
>to each, and FSAL_CEPH ends up returning -ENOTCONN in that situation
>which causes a lot of log spam.
>
>Fix this by just ignoring -ENOTCONN errors in ceph_close_my_fd when
>ganesha is shutting down.
>
>Change-Id: If8231998a61e759be4d044f102ae6dbcebfdc975
>Reported-by: liuwei <liuwei_coder@163.com>
>Signed-off-by: Jeff Layton <jlayton@redhat.com>
>---
> src/FSAL/FSAL_CEPH/handle.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
>diff --git a/src/FSAL/FSAL_CEPH/handle.c b/src/FSAL/FSAL_CEPH/handle.c
>index 464db598e6c8..c5891894bd01 100644
>--- a/src/FSAL/FSAL_CEPH/handle.c
>+++ b/src/FSAL/FSAL_CEPH/handle.c
>@@ -45,6 +45,7 @@
> #include "nfs_exports.h"
> #include "sal_data.h"
> #include "statx_compat.h"
>+#include "nfs_core.h"
> #include "linux/falloc.h"
> 
> /**
>@@ -863,15 +864,22 @@ static fsal_status_t ceph_open_my_fd(struct ceph_handle *myself,
> 
> static fsal_status_t ceph_close_my_fd(struct ceph_fd *my_fd)
> {
>-	int rc = 0;
> 	fsal_status_t status = fsalstat(ERR_FSAL_NO_ERROR, 0);
> 	struct ceph_export *export =
> 		container_of(op_ctx->fsal_export, struct ceph_export, export);
> 
> 	if (my_fd->fd != NULL && my_fd->openflags != FSAL_O_CLOSED) {
>-		rc = ceph_ll_close(export->cmount, my_fd->fd);
>-		if (rc < 0)
>+		int rc = ceph_ll_close(export->cmount, my_fd->fd);
>+
>+		if (rc < 0) {
>+			/*
>+			 * We expect -ENOTCONN errors on shutdown. Ignore
>+			 * them so we don't spam the logs.
>+			 */
>+			if (rc == -ENOTCONN && admin_shutdown)
>+				rc = 0;
> 			status = ceph2fsal_error(rc);
>+		}
> 		my_fd->fd = NULL;
> 		my_fd->openflags = FSAL_O_CLOSED;
> 	}
>-- 
>2.26.2