Hi,

I am not sure whether my previous mail was delivered, so I am resending it.

I used the latest ganesha code and ran the following test in a script from an NFSv3 client:

    ct=0
    while [ $ct -lt 4096 ]; do
        flock -x mylock echo 1 >> myfile
        let ct=$ct+1
    done

After more than 1000 iterations, the client got the error "No locks available".

ganesha.log had the following trace:
ganesha.nfsd-9328[svc_103]  nsm_connect :NLM :CRIT :connect to statd failed: RPC: Unknown protocol

/var/log/messages showed a "Too many open files" message. It looks like a connection to rpc.statd was created for each NLM LOCK request but was not closed on the corresponding NLM UNLOCK request.

After analyzing the code, it seems this happens because 'xprt->xp_refcnt' is ref'ed twice for an NLM LOCK request but un-ref'ed only once while handling the NLM UNLOCK request. As a result, svc_vc_destroy_it() is never called and the connection to rpc.statd is not closed.

More details from the code analysis are below. Could you please look into this issue? Thank you. I am not sure why we are incrementing 'xprt->xp_refcnt' twice in svc_xprt_lookup().

For an NLM LOCK request, the code path is:
--------------------------------------------------------------
nlm4_Lock() -> ...... -> nsm_connect() -> ....... -> makefd_xprt() -> svc_xprt_lookup()

    137 SVCXPRT *
    138 svc_xprt_lookup(int fd, svc_xprt_setup_t setup)
    139 {
     ......
     ......
    173                         (*setup)(&xprt); /* zalloc, xp_refcnt = 1 */   --> leads to call to svc_vc_xprt_setup()
    174                         xprt->xp_fd = fd;
    175                         xprt->xp_flags = SVC_XPRT_FLAG_INITIAL;
    176
    177                         /* Get ref for caller */
    178                         SVC_REF(xprt, SVC_REF_FLAG_NONE);

Here, at line 173, svc_vc_xprt_setup() is called, which sets 'xprt->xp_refcnt = 1'.
Then, at line 178, SVC_REF increments 'xprt->xp_refcnt' by 1. Thus, after the NLM LOCK request is handled, 'xprt->xp_refcnt' is 2.


For an NLM UNLOCK request, the code path is:
-------------------------------------------------------------------
nlm4_Unlock() -> ...... -> nsm_disconnect() -> ..... -> clnt_vc_destroy() -> svc_release_it()

    410 static inline void svc_release_it(SVCXPRT *xprt, u_int flags,
    411                                   const char *tag, const int line)
    412 {
    413         int32_t refs = atomic_dec_int32_t(&xprt->xp_refcnt);
    ......
    ......
    425         if (likely(refs > 0)) {
    426                 /* normal case */
    427                 return;
    428         }
    429
    430         /* enforce once-only semantic, trace others */
    431         xp_flags = atomic_postset_uint16_t_bits(&xprt->xp_flags,
    432                                                 SVC_XPRT_FLAG_RELEASING);
    ......
    439         /* Releasing last reference */
    440         (*(xprt)->xp_ops->xp_destroy)(xprt, flags, tag, line);

Here, at line 413, 'xprt->xp_refcnt' is decremented from 2 to 1.
Since 'refs > 0', the function returns at line 427 and never reaches the destroy call at line 440, so the connection is not closed.
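
To make the imbalance easier to see, here is a minimal stand-alone sketch of the counting described above. It is not ntirpc code: 'struct toy_xprt', toy_lookup(), toy_release() and leaked_after_cycles() are hypothetical names, and the model assumes only what the excerpts show (setup leaves the count at 1, the "ref for caller" raises it to 2, and the destroy step runs only when the count reaches 0).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SVCXPRT; only what the model needs. */
struct toy_xprt {
        int refcnt;       /* models xprt->xp_refcnt */
        int *open_conns;  /* shared counter of connections still open */
};

/* Models the lookup path: setup leaves refcnt = 1, then one more
 * reference is taken for the caller, so the result starts at 2. */
static struct toy_xprt *toy_lookup(int *open_conns)
{
        struct toy_xprt *x = calloc(1, sizeof(*x));

        if (x == NULL)
                return NULL;
        x->refcnt = 1;   /* as left by the setup callback */
        x->refcnt++;     /* the extra "ref for caller" */
        x->open_conns = open_conns;
        (*open_conns)++; /* a connection is now open */
        return x;
}

/* Models the release path: destroy only when the last ref is dropped. */
static void toy_release(struct toy_xprt *x)
{
        assert(x->refcnt > 0);
        if (--x->refcnt > 0)
                return;          /* "normal case": others hold refs */
        (*x->open_conns)--;      /* last ref: close the connection */
        free(x);
}

/* Runs `cycles` lock/unlock pairs and returns how many connections are
 * still open. releases_per_unlock must be 1 or 2 in this model. With 1,
 * the objects are deliberately leaked, mirroring the reported behavior. */
static int leaked_after_cycles(int cycles, int releases_per_unlock)
{
        int open_conns = 0;

        for (int i = 0; i < cycles; i++) {
                struct toy_xprt *x = toy_lookup(&open_conns);

                if (x == NULL)
                        break;
                for (int r = 0; r < releases_per_unlock; r++)
                        toy_release(x);
        }
        return open_conns;
}
```

In this model, leaked_after_cycles(1000, 1) leaves all 1000 connections open, while leaked_after_cycles(1000, 2) leaves none. This only illustrates the counting; whether the right fix is to drop the extra reference or to add a matching release is exactly the question I am asking.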

Thanks,
Madhu Thorat
IBM Pune.