Crash in getclnthandle()
by Madhu P Punjabi
Hi,
While using V2.8-dev.28 I saw the following crash. NOFILE had been set to 1024 for
testing, and clients (which had mounted an export over NFSv3) were acquiring many locks.
(gdb) bt
#0 0x00007f26b719a5d7 in raise () from /lib64/libc.so.6
#1 0x00007f26b719bcc8 in abort () from /lib64/libc.so.6
#2 0x00007f26b7193546 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f26b71935f2 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f26b9592782 in getclnthandle (host=0x7f26b99b8e34 "localhost",
nconf=0x7f267c002960, targaddr=0x7f26ae4c7b10)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:350
#5 0x00007f26b9593166 in __rpcb_findaddr_timed (program=100024, version=1,
nconf=0x7f267c002960, host=0x7f26b99b8e34 "localhost", clpp=0x7f26ae4c7bb0,
tp=0x0)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:683
#6 0x00007f26b95843ca in clnt_tp_ncreate_timed (hostname=0x7f26b99b8e34
"localhost", prog=100024, vers=1, nconf=0x7f267c002960, tp=0x0)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/clnt_generic.c:265
#7 0x00007f26b9584271 in clnt_ncreate_timed (hostname=0x7f26b99b8e34
"localhost", prog=100024, vers=1, netclass=0x7f26b99b8e30 "tcp", tp=0x0)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/clnt_generic.c:196
#8 0x00007f26b995b4bb in clnt_ncreate (hostname=0x7f26b99b8e34
"localhost", prog=100024, vers=1, nettype=0x7f26b99b8e30 "tcp")
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/ntirpc/rpc/clnt.h:396
#9 0x00007f26b995b6ef in nsm_connect ()
at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nsm.c:58
#10 0x00007f26b995bd4a in nsm_monitor (host=0x7f267c002250)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nsm.c:118
#11 0x00007f26b989dc1d in get_nsm_client (care=CARE_MONITOR,
xprt=0x7f269c000b60, caller_name=0x7f267c001480 "ss_bignode_cl1")
at /usr/src/debug/nfs-ganesha-2.8-dev.28/SAL/nlm_owner.c:1014
#12 0x00007f26b995a594 in nlm_process_parameters (req=0x7f267c000a00,
exclusive=true, alock=0x7f267c0011f0, plock=0x7f26ae4c8970,
ppobj=0x7f26ae4c91b8, care=CARE_MONITOR,
ppnsm_client=0x7f26ae4c89a8, ppnlm_client=0x7f26ae4c89a0,
ppowner=0x7f26ae4c8998, block_data=0x7f26ae4c8968, nsm_state=11,
state=0x7f26ae4c8990)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nlm_util.c:291
#13 0x00007f26b9955888 in nlm4_Lock (args=0x7f267c0011d8,
req=0x7f267c000a00, res=0x7f267c001260)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nlm_Lock.c:105
#14 0x00007f26b980ee63 in nfs_rpc_process_request (reqdata=0x7f267c000a00)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/MainNFSD/nfs_worker_thread.c:1484
#15 0x00007f26b980f2e5 in nfs_rpc_valid_NLM (req=0x7f267c000a00)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/MainNFSD/nfs_worker_thread.c:1633
#16 0x00007f26b95a12fb in svc_vc_decode (req=0x7f267c000a00)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_vc.c:827
#17 0x00007f26b959d797 in svc_request (xprt=0x7f269c000b60,
xdrs=0x7f267c001600)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:793
#18 0x00007f26b95a120c in svc_vc_recv (xprt=0x7f269c000b60)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_vc.c:800
#19 0x00007f26b959d718 in svc_rqst_xprt_task (wpe=0x7f269c000d80)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:774
#20 0x00007f26b959e020 in svc_rqst_epoll_loop (wpe=0x9c84c0)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:1089
#21 0x00007f26b95a6aaf in work_pool_thread (arg=0x7f26a4000ef0)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/work_pool.c:184
#22 0x00007f26b7b56df5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007f26b725b1ad in clone () from /lib64/libc.so.6
(gdb) f 4
#4 0x00007f26b9592782 in getclnthandle (host=0x7f26b99b8e34 "localhost",
nconf=0x7f267c002960, targaddr=0x7f26ae4c7b10)
at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:350
350 assert(client == NULL);
(gdb) p/x client
$1 = 0x7f267c00bc60
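The assertion at rpcb_clnt.c:350 expects the client handle to be NULL at that point, yet the gdb output shows it still holds a live pointer. As a purely illustrative, hypothetical sketch (this is not the ntirpc code and not the patch linked below), one defensive option is to release a leftover handle instead of asserting:
```c
#include <rpc/rpc.h>	/* CLIENT, CLNT_DESTROY */

/*
 * Hypothetical helper, not ntirpc code: release a client handle that an
 * earlier (failed) code path may have left behind, so the caller starts
 * from a clean state instead of tripping assert(client == NULL).
 */
static void drop_stale_client(CLIENT **clientp)
{
	if (*clientp != NULL) {
		CLNT_DESTROY(*clientp);
		*clientp = NULL;
	}
}
```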
I have posted a patch with a possible fix for this crash at the following link. If the
fix does not look appropriate, please suggest another approach for this issue. Thank you.
https://github.com/nfs-ganesha/ntirpc/pull/172
Thanks,
Madhu Thorat.
FSAL_CEPH - export multiple directories
by bbk@nocloud.ch
Dear Ganesha(-UserList),
I am trying to set up nfs-ganesha with FSAL_CEPH, but I don't know how to export multiple directories. When I configure only one export everything works as expected, but as soon as I have two of them, the Ceph client gets blacklisted.
The following documentation states that it is possible to export multiple directories, but I don't know how to configure it correctly.
* http://docs.ceph.com/docs/master/cephfs/nfs/
```
Per running ganesha daemon, FSAL_CEPH can only export one Ceph filesystem although multiple directories in a Ceph filesystem may be exported.
```
From what I gather from the following message, it should be possible, but each export will use its own Ceph client:
* https://lists.nfs-ganesha.org/archives/list/support@lists.nfs-ganesha.org...
My nfs-server setup has the following software versions installed:
* NFS-Ganesha Release = V2.7.3
* ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)
On my CephFS I have created the folder /scratch. In the ganesha configuration I use the `rados_cluster` recovery backend, and I have the following two exports defined:
```
EXPORT
{
    Export_id = 100;
    Path = "/";
    Pseudo = /cephfs;
    FSAL
    {
        Name = CEPH;
        User_Id = "ganesha";
        filesystem = "cephfs";
    }
    CLIENT
    {
        Clients = @root-rw;
        Squash = "No_Root_Squash";
        Access_type = RW;
    }
}

EXPORT
{
    Export_Id = 300;
    Path = "/scratch";
    Pseudo = /scratch;
    FSAL
    {
        Name = CEPH;
        User_Id = "ganesha";
        filesystem = "cephfs";
    }
    CLIENT
    {
        Clients = @client-rw;
        Squash = "Root_Squash";
        Access_type = RW;
    }
}
```
When I restart the nfs-ganesha service, I get the following errors:
Ganesha Log:
```
ganesha.nfsd-13554[main] posix2fsal_error :FSAL :CRIT :Mapping 108(default) to ERR_FSAL_SERVERFAULT
```
Ceph Log:
```
mds.0.43 Evicting (and blacklisting) client session 122567
log_channel(cluster) log [INF] : Evicting (and blacklisting) client session 122567
```
In the end, only one export (Ceph client) is working.
Yours,
bbk
CephFS via Kernel client + NFS Ganesha
by Shantur Rathore
Hi all,
I am trying to implement a highly available (active-active) NFS server
cluster for our infrastructure.
My initial implementation was CephFS -> FSAL_CEPH with the rados_cluster
recovery backend, but its throughput was poor, possibly because libcephfs
(the same library ceph-fuse uses) is much slower than the CephFS kernel
client: I was getting 1.8 GB/s with the kernel client compared to 60 MB/s
with ceph-fuse and NFS-Ganesha.
I am now trying CephFS -> kernel client mount -> FSAL_VFS with the
rados_cluster recovery backend, but FSAL_VFS does not seem to like the
CephFS kernel mount. Whenever I try to mount the NFS export on a client,
I get the errors listed below from ganesha.
Please let me know the best way to get good performance.
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] nfs4_Compound :NFS4 :DEBUG :Status of OP_GETFH
in position 3 = NFS4_OK, op response size is 40 total response size is
136
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] nfs4_Compound :NFS4 :DEBUG :Request 4: opcode 9
is OP_GETATTR
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] file_To_Fattr :NFS4 ACL :DEBUG :No permission
check for ACL for obj 0x56358b04bb08
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] posix2fsal_error :FSAL :CRIT :Mapping
38(default) to ERR_FSAL_SERVERFAULT
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] vfs_open_by_handle :FSAL :DEBUG :Failed with
Function not implemented openflags 0x00000000
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] find_fd :FSAL :DEBUG :Failed with Function not
implemented openflags 0x00000020
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] posix2fsal_error :FSAL :CRIT :Mapping
38(default) to ERR_FSAL_SERVERFAULT
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] posix2fsal_error :FSAL :CRIT :Mapping
38(default) to ERR_FSAL_SERVERFAULT
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] vfs_open_by_handle :FSAL :DEBUG :Failed with
Function not implemented openflags 0x00000000
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] find_fd :FSAL :DEBUG :Failed with Function not
implemented openflags 0x00000020
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] posix2fsal_error :FSAL :CRIT :Mapping
38(default) to ERR_FSAL_SERVERFAULT
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] fsal_common_is_referral :FSAL :EVENT :Failed to
get attributes for referral, request_mask: 1433550
12/04/2019 16:01:14 : epoch 5cb0b61c : ganesha2-9b76ccf7b-wdpds :
ganesha.nfsd-1[svc_3] nfs4_Compound :NFS4 :DEBUG :Status of OP_GETATTR
in position 4 = NFS4ERR_SERVERFAULT, op response size is 4 total
response size is 144
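For reference, errno 38 in the "Mapping 38(default) to ERR_FSAL_SERVERFAULT" lines is ENOSYS ("Function not implemented"), and the vfs_open_by_handle failures suggest that the open-by-handle path FSAL_VFS relies on is not working on this CephFS kernel mount. Below is a small, hypothetical probe (not Ganesha code; the default mount path is just a placeholder) that checks whether a given path supports the file-handle API at all:
```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>	/* name_to_handle_at, struct file_handle, MAX_HANDLE_SZ */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Probe whether a path supports name_to_handle_at(); if this (or the
 * matching open_by_handle_at()) fails with ENOSYS or EOPNOTSUPP, an
 * open-by-handle based server cannot work on that filesystem. */
int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/mnt/cephfs";	/* placeholder */
	struct file_handle *fh;
	int mount_id;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (fh == NULL)
		return 1;
	fh->handle_bytes = MAX_HANDLE_SZ;

	if (name_to_handle_at(AT_FDCWD, path, fh, &mount_id, 0) == -1)
		printf("name_to_handle_at(%s) failed: %s\n", path, strerror(errno));
	else
		printf("%s: handle API available (type %d, %u bytes)\n",
		       path, fh->handle_type, fh->handle_bytes);

	free(fh);
	return 0;
}
```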
Thanks,
Shantur
FSAL_CEPH - exporting subdirectories
by Wyllys Ingersoll
I have a CephFS filesystem and I want to export a subdirectory of my CephFS tree. Is this possible, or do I have to export the entire tree from the root?
If it is possible, is it then also possible to define exports for multiple subdirectories of my CephFS filesystem, such as /cephfs/exports/foo and /cephfs/exports/bar, as two separate exports?
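For illustration only, here is a minimal sketch of what two such exports might look like, assuming the usual FSAL_CEPH export syntax with Path interpreted inside the Ceph filesystem (the Export_Id, Pseudo, and User_Id values are placeholders, not a tested configuration):
```
EXPORT
{
    Export_Id = 1;
    Path = "/exports/foo";
    Pseudo = "/foo";
    Access_Type = RW;
    FSAL
    {
        Name = CEPH;
        User_Id = "ganesha";
    }
}

EXPORT
{
    Export_Id = 2;
    Path = "/exports/bar";
    Pseudo = "/bar";
    Access_Type = RW;
    FSAL
    {
        Name = CEPH;
        User_Id = "ganesha";
    }
}
```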