The analysis is correct. Memory leak with DRC is a known issue. An xprt places a ref on the drc and deletes it when it is freed, so that isn't an issue. It is the dupreq entries that hang on to a drc that cause an issue.  Every dupreq entry places a ref on drc and it removes its ref when it is freed. The problem is that dupreq entries (symbol "dv" in the code) get freed in response to some other request on the socket. So you always have some dupreq entries...

We should have a periodic timer function that goes around all DRC that have no active XPRTs and free their dvs. Freeing all dvs should bring the refcount on drc to zero.

Maybe, enhance drc_free_expired() to do this as well?

Regards, Malahal.

On Fri, Mar 29, 2019 at 8:19 PM Frank Filz <ffilzlnx@mindspring.com> wrote:
Forwarding to devel@lists.nfs-ganesha.org so others can see also.

Frank

From: 方媛 [mailto:fang_yuan1004@163.com]
Sent: Friday, March 29, 2019 1:46 AM
To: malahal@us.ibm.com; ffilzlnx@mindspring.com
Cc: yanhuan@bwstor.com.cn
Subject: A question about ganesha drc can not be free

I have a question about ganasha drc ref count, can you help me. thank you.

I read the latest ganesha git codes, for the protocol V4.0 case, I find drc ref count will not decrease to 0, so can not been inserted into the tcp_drc_recycle_q, I wonder how to free expired drc?

1. At the begining of the request process, nfs_dupreq_start will call nfs_dupreq_get_drc, either a new drc be allocated or an old drc be reused, the drc.ref is set to 0 at first and then added to 1 as below
                        /* Avoid double reference of drc,
                         * setting xp_u2 under DRC_ST_LOCK */
                        req->rq_xprt->xp_u2 = (void *)drc;
                        (void)nfs_dupreq_ref_drc(drc);  /* xprt ref */

                        DRC_ST_UNLOCK();
                        drc->d_u.tcp.recycle_time = 0;

then at the end of the nfs_dupreq_get_drc, drc.ref is added to 2 as below
        /* call path ref */
        (void)nfs_dupreq_ref_drc(drc);
        PTHREAD_MUTEX_unlock(&drc->mtx);

        if (drc_check_expired)
                drc_free_expired();

2. After finish the request process, drc ref count will not decrease 1. Because I see nfs_dupreq_finish will not decrease ref count every time,  just when the drc->size > drc->maxsize, will call nfs_dupreq_put_drc to decrease ref count.
/* conditionally retire entries */
dq_again:
        if (drc_should_retire(drc)) {        /* if drc->sixe < drc->maxsize, will not call nfs_dupreq_put_drc to decrease the drc ref count */
                        ......
                        TAILQ_REMOVE(&drc->dupreq_q, ov, fifo_q);
                        --(drc->size);
                        /* release dv's ref */
                        nfs_dupreq_put_drc(drc, DRC_FLAG_LOCKED);
                        /* drc->mtx gets unlocked in the above call! */

                        rbtree_x_cached_remove(&drc->xt, t, &ov->rbt_k, ov->hk);
    }
so drc ref count will not decrease to 1 forever.

3. Then umount the connection, drc ref count will not decrease to 0,
static enum xprt_stat nfs_rpc_free_user_data(SVCXPRT *xprt)
{
        if (xprt->xp_u2) {
                nfs_dupreq_put_drc(xprt->xp_u2, DRC_FLAG_RELEASE);    /* drc ref count is more then 1 */
                xprt->xp_u2 = NULL;
        }
        return XPRT_DESTROYED;
}

 so drc can not insert to the tcp_drc_recycle_q, then how to free expired drc?

I will waiting for your reply, thanks very much.


_______________________________________________
Devel mailing list -- devel@lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org