unavailability of NFSv3
by Talamo Ivano Giuseppe (PSI)
Hi all,
We are using nfs-ganesha via the IBM Spectrum Scale protocol setup, currently consisting of 2 servers and around 50 clients.
After a couple of months of smooth operation, we have started to experience a critical issue (already twice in three days): none of the clients can mount the filesystem.
When the issue happens this is what we see via rpcinfo on the server:
[root@server ~]# rpcinfo -s
program version(s) netid(s) service owner
100000 2,3,4 local,udp,tcp,udp6,tcp6 portmapper superuser
100024 1 tcp6,udp6,tcp,udp status 29
100003 4,3 tcp6,tcp,udp6,udp nfs superuser
100005 3,1 tcp6,tcp,udp6,udp mountd superuser
100021 4 tcp6,tcp,udp6,udp nlockmgr superuser
100011 2,1 tcp6,tcp,udp6,udp rquotad superuser
[root@host ~]# rpcinfo -T tcp localhost 100003 3
rpcinfo: RPC: Timed out
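The two checks above (is the program registered, and does it answer) can be scripted so that they run on a schedule or from a monitoring agent. The following is a minimal sketch, not part of the original report: the function names and the 10-second timeout are assumptions, and it relies only on the `rpcinfo -s` and `rpcinfo -T` invocations shown above. It assumes that `rpcinfo` exits with a non-zero status when the call fails.

```python
import subprocess

NFS_PROGRAM = 100003  # RPC program number for NFS, as listed by rpcinfo above

# Excerpt of the `rpcinfo -s` output quoted in this post.
SAMPLE = """\
[root@server ~]# rpcinfo -s
program version(s) netid(s) service owner
100000 2,3,4 local,udp,tcp,udp6,tcp6 portmapper superuser
100003 4,3 tcp6,tcp,udp6,udp nfs superuser
100005 3,1 tcp6,tcp,udp6,udp mountd superuser
"""


def parse_rpcinfo_s(text):
    """Map program number -> set of registered versions from `rpcinfo -s` output."""
    registered = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip the header and any shell prompt lines
        registered[int(fields[0])] = {int(v) for v in fields[1].split(",")}
    return registered


def probe_command(host, version, program=NFS_PROGRAM, transport="tcp"):
    """Build the same `rpcinfo -T` call that timed out in the report."""
    return ["rpcinfo", "-T", transport, host, str(program), str(version)]


def nfs_responds(host, version, timeout=10.0):
    """Return True if the NFS program answers a null call within `timeout` seconds.

    The timeout value is an assumption; rpcinfo must be installed on the host.
    """
    try:
        result = subprocess.run(
            probe_command(host, version),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    versions = parse_rpcinfo_s(SAMPLE).get(NFS_PROGRAM, set())
    print("NFSv3 registered:", 3 in versions)
    print("probe command:", " ".join(probe_command("localhost", 3)))
```

A program that is registered with the portmapper but does not answer the probe matches the state described here: the registration is still present while the service itself has stopped responding.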
The logs are set to EVENT and when the issue starts, ganesha.log gets full of lines like the following:
2018-10-04 16:40:45 : epoch 0002003d : server : gpfs.ganesha.nfsd-40452[State_Async] nlm_send_async :NLM :MAJ :Cannot create NLM async tcp connection to client ::ffff:129.129.117.65
2018-10-04 16:40:45 : epoch 0002003d : server: gpfs.ganesha.nfsd-40452[State_Async] nlm4_send_grant_msg :NLM :MAJ :GRANTED_MSG RPC call failed with return code -1. Removing the blocking lock
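To see whether these failures come from one client or from many, the log lines can be grouped by client address. The sketch below is only an illustration: the regular expression is derived from the two sample lines above and may not match every ganesha.log layout, and the function names are hypothetical.

```python
import ipaddress
import re
from collections import Counter

# Layout inferred from the two sample lines above (an assumption, not a spec).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) : epoch (?P<epoch>\S+) : "
    r"(?P<host>\S+?)\s*: (?P<proc>\S+)\[(?P<thread>[^\]]+)\] "
    r"(?P<func>\S+) :(?P<component>\w+) :(?P<level>\w+) :(?P<msg>.*)$"
)
CLIENT_RE = re.compile(r"to client (\S+)\s*$")

SAMPLE_LOG = [
    "2018-10-04 16:40:45 : epoch 0002003d : server : gpfs.ganesha.nfsd-40452[State_Async] "
    "nlm_send_async :NLM :MAJ :Cannot create NLM async tcp connection to client ::ffff:129.129.117.65",
    "2018-10-04 16:40:45 : epoch 0002003d : server: gpfs.ganesha.nfsd-40452[State_Async] "
    "nlm4_send_grant_msg :NLM :MAJ :GRANTED_MSG RPC call failed with return code -1. "
    "Removing the blocking lock",
]


def normalize_address(raw):
    """Turn an IPv4-mapped IPv6 address such as ::ffff:a.b.c.d into a.b.c.d."""
    try:
        addr = ipaddress.ip_address(raw)
    except ValueError:
        return raw  # leave anything that is not an IP address untouched
    mapped = getattr(addr, "ipv4_mapped", None)
    return str(mapped) if mapped else str(addr)


def nlm_failures_by_client(lines):
    """Count NLM messages at level MAJ that name a client, per client address."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match["component"] != "NLM" or match["level"] != "MAJ":
            continue
        client = CLIENT_RE.search(match["msg"])
        if client:
            counts[normalize_address(client.group(1))] += 1
    return counts


if __name__ == "__main__":
    for address, count in nlm_failures_by_client(SAMPLE_LOG).most_common():
        print(address, count)
```

Only the first kind of line carries a client address, so the GRANTED_MSG lines are not counted; running the function over the full ganesha.log would show how the failed connections are distributed across the roughly 50 clients.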
The nfs-ganesha version is 2.5.3, but it is the IBM build, so I am not sure what changes it contains.
I was wondering if someone on the mailing list had an idea of what direction to take to investigate this further.
Thanks,
Ivano
6 years, 2 months
Crash in libntirpc with 1.6.3 version
by Naresh Babu
Hi All,
We are using a custom FSAL with NFS-Ganesha 2.6.3 and libntirpc 1.6.3.
We are consistently running into the following crash in libntirpc and
are wondering whether this is a known issue. We would appreciate any
help in resolving it.
(gdb) bt
#0 0x00007fdc88000478 in ?? ()
#1 0x00007fdd5375f6d8 in svc_release_it (xprt=0x7fdc880430d0, flags=0,
tag=0x7fdd5376ea36 <__func__.8221> "svc_ioq_write", line=233) at
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/ntirpc/rpc/svc.h:433
#2 0x00007fdd5375fc46 in svc_ioq_write (xprt=0x7fdc880430d0,
xioq=0x7fdcf0003160, ifph=0x12c8c10) at
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/svc_ioq.c:233
#3 0x00007fdd5375fd88 in svc_ioq_write_callback (wpe=0x7fdcf00031c8) at
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/svc_ioq.c:257
#4 0x00007fdd537605e0 in work_pool_thread (arg=0x7fdc900034d0) at
/home/naresh/clfsrepo3/external/nfs/src/libntirpc/src/work_pool.c:181
#5 0x00007fdd5255be25 in start_thread (arg=0x7fdc80e8e700) at
pthread_create.c:308
#6 0x00007fdd51e6834d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) p *xprt->xp_ops
$3 = {xp_recv = 0x7fdc88000458, xp_stat = 0x7fdc88000458, xp_decode =
0x7fdc880430c0, xp_reply = 0x7fdc880430c0, xp_checksum = 0x7fdc88000478,
xp_destroy = 0x7fdc88000478, xp_control = 0x7fdc88000488, xp_free_user_data
= 0x7fdc88000488}
(gdb) info symbol 0x7fdc88000478
No symbol matches 0x7fdc88000478.
(gdb) p *xprt
$2 = {xp_ops = 0x7fdc88000468, xp_dispatch = {process_cb = 0x7fdc88000468,
rendezvous_cb = 0x7fdc88000468}, xp_parent = 0x7fdc880430c0, xp_tp =
0x7fdc880430c0 "\020", xp_netid = 0x7fdcf4000bc0 "\240\246\227S\335\177",
xp_p1 = 0x7fdc88005238, xp_p2 = 0x7fdc88043288,
xp_p3 = 0x7fdc880051b0, xp_u1 = 0x7fdc880051b0, xp_u2 = 0x7fdc88005238,
xp_local = {nb = {maxlen = 2281730480, len = 32732, buf = 0x600000000}, ss
= {ss_family = 0, __ss_align = 0,
__ss_padding = '\000' <repeats 61 times>,
"\001\000\000\001\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377",
'\000' <repeats 31 times>}}, xp_remote = {nb = {maxlen = 0, len = 0, buf =
0x0}, ss = {ss_family = 12728, __ss_align = 0,
__ss_padding = "\000\000\000\000\000\000\000\000\377\377\377\377",
'\000' <repeats 37 times>, " \000\000\000\000\000\000\000 \020", '\000'
<repeats 21 times>, "`\001\000\000\000\000\000\000\204", '\000' <repeats 15
times>, "\270\061\004\210\334\177\000"}},
xp_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size =
'\000' <repeats 39 times>, __align = 0}, xp_fd = 0, xp_ifindex = 0,
xp_si_type = 0, xp_type = 0, xp_refcnt = 0,
xp_flags = 64}
6 years, 2 months
Copy permission error
by Jones, Stephen R.
Hello all;
I’m running into an interesting problem, and since I haven’t seen anything similar on the mailing or issues list I thought I would see if there’s something I’m missing. Here’s my setup:
1. CentOS 7 server running GlusterFS 4.1.4 and NFS-Ganesha 2.6.3
2. Windows 10 Pro system with nfs-client installed. It mounts the external NFS drive (v3), and I have made uid and gid changes in the registry to match the uid for the share.
With this setup, I’m able to copy most files from the Windows system to the share without a problem. However, if the file on Windows has the ‘Read-Only’ attribute set, the copy fails with a permission denied error, and the Wireshark trace shows an NFS3ERR_ACCES error. Is there something I can set in the NFS-Ganesha config file to resolve this issue?
Thanks,
Steve
--
Stephen Jones
Group Lead / Principal Software Engineer
The MITRE Corporation
Phone: (781)271-2517 Email: srjones(a)mitre.org
6 years, 2 months