(resending response and adding devel mailing list – maybe someone else has some ideas)
There is this fix:
660f330243c57c0b2fea11c87507b3e1991bb300 FSAL_MDCACHE: avoid assertion due to wrong check
I’m pretty sure that since V2.5 we’ve also fixed several places where op_ctx was not set up for things in the shutdown path, but I can’t find the patches.
Frank
From: Trishali Nayar [mailto:ntrishal@in.ibm.com]
Sent: Monday, January 14, 2019 6:04 AM
To: Frank Filz <ffilzlnx@mindspring.com>
Cc: 'Malahal R Naineni' <mnaineni@in.ibm.com>
Subject: RE: Crash seen in shutdown path
This was hit on our 2.5 code stream...
I did look at the community stream, including 2.7, but this particular code seemed to be the same everywhere.
Thanks and regards,
Trishali.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trishali Nayar
IBM Systems
ETZ, Pune.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: "Frank Filz" <ffilzlnx@mindspring.com>
To: "'Trishali Nayar'" <ntrishal@in.ibm.com>
Cc: "'Malahal R Naineni'" <mnaineni@in.ibm.com>
Date: 01/12/2019 04:22 AM
Subject: RE: Crash seen in shutdown path
What code base is this with? If upstream, this may be fixed.
Frank
From: Trishali Nayar [mailto:ntrishal@in.ibm.com]
Sent: Friday, January 11, 2019 7:09 AM
To: Frank Filz <ffilzlnx@mindspring.com>
Cc: Malahal R Naineni <mnaineni@in.ibm.com>
Subject: Crash seen in shutdown path
Hi Frank,
I was looking at a crash in the mdcache_lru_clean() routine, which happened because an assert fired: "op_ctx" was not set. This was in the shutdown path, via shutdown_handles().
The first_export_id for the entry has a value of -1.
I observed that the other admin_thread routines in the same shutdown path call init_root_op_context() explicitly, e.g. remove_all_exports() and unexport().
So we hit this problem only when we get into the "Extra file handles hanging around" path below.
static void shutdown_handles(struct fsal_module *fsal)
{
	/* Handle iterator */
	struct glist_head *hi = NULL;
	/* Next pointer in handle iteration */
	struct glist_head *hn = NULL;

	if (glist_empty(&fsal->handles))
		return;

	/* <<<< the path in question starts here */
	LogDebug(COMPONENT_FSAL, "Extra file handles hanging around.");

	glist_for_each_safe(hi, hn, &fsal->handles) {
		struct fsal_obj_handle *h = glist_entry(hi,
							struct fsal_obj_handle,
							handles);
		LogDebug(COMPONENT_FSAL,
			 "Releasing handle");
		h->obj_ops->release(h);
	}
}
1> I was thinking maybe we could fix this by calling init_root_op_context() when we get into the "Extra file handles" path...
2> But we would still hit the second assert, on "op_ctx->ctx_export".
So could we also put the second assert under a condition:
	if (export_id >= 0)
		assert(op_ctx->ctx_export);
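To make the two proposals concrete, here is a minimal, self-contained sketch. It does not use the real nfs-ganesha structures: every type and function with a _stub suffix, plus clean_entry() and release_leftovers(), is a hypothetical stand-in invented for illustration, and the real mdcache_lru_clean() and init_root_op_context() may well differ. It only shows the shape of the idea: install a temporary context around the release loop, and require an export only when the entry actually belongs to one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real nfs-ganesha definitions. */
struct export_stub { int32_t export_id; };
struct op_ctx_stub { struct export_stub *ctx_export; };
struct entry_stub { int32_t first_export_id; };

/* Stand-in for the per-thread op_ctx pointer; NULL means "not set". */
struct op_ctx_stub *op_ctx;

/* Stand-in for the cleanup step. Returns 0 on success, -1 if there is
 * no context at all (the point where the first assert is reported). */
int clean_entry(const struct entry_stub *entry)
{
	int32_t export_id = entry->first_export_id;

	if (op_ctx == NULL)
		return -1;

	/* Proposal 2: only insist on an export when the entry has one. */
	if (export_id >= 0)
		assert(op_ctx->ctx_export != NULL);

	return 0;
}

/* Proposal 1: set up a temporary root-style context around the loop
 * that releases leftover handles, then restore the previous one. */
int release_leftovers(const struct entry_stub *entries, size_t n)
{
	struct op_ctx_stub root_ctx = { .ctx_export = NULL };
	struct op_ctx_stub *saved = op_ctx;
	int released = 0;
	size_t i;

	op_ctx = &root_ctx;
	for (i = 0; i < n; i++) {
		if (clean_entry(&entries[i]) == 0)
			released++;
	}
	op_ctx = saved;

	return released;
}
```

With op_ctx unset, clean_entry() on an entry whose first_export_id is -1 fails; going through release_leftovers() the same entries are cleaned, because a context exists and the export check is skipped for them.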
The stack trace and the values in the entry are attached here for reference as well.
Your insights on this will be extremely useful.
Thanks and regards,
Trishali.