lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK
by Kinglong Mee
The latest ganesha/gluster supports SEEK as specified in:
https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#section-15.11
From the given sa_offset, find the next data_content4 of type sa_what
in the file. If the server can not find a corresponding sa_what,
then the status will still be NFS4_OK, but sr_eof would be TRUE. If
the server can find the sa_what, then the sr_offset is the start of
that content. If the sa_offset is beyond the end of the file, then
SEEK MUST return NFS4ERR_NXIO.
For a file with the following filemap:
Part 1: HOLE 0x0000000000000000 ---> 0x0000000000600000
Part 2: DATA 0x0000000000600000 ---> 0x0000000000700000
Part 3: HOLE 0x0000000000700000 ---> 0x0000000001000000
SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1, sr_offset:0x700000) from ganesha/gluster;
SEEK(0x700000, SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x700000) from ganesha/gluster.
If an application depends on the lseek result for data searching, it may enter an infinite loop:

while (1) {
        next_pos = lseek(fd, cur_pos, seek_type);
        if (seek_type == SEEK_DATA) {
                seek_type = SEEK_HOLE;
        } else {
                seek_type = SEEK_DATA;
        }
        if (next_pos == -1)
                return;
        cur_pos = next_pos;
}
The lseek syscall always returns 0x700000 from the nfs client for those two cases, so cur_pos never advances and the loop above never terminates. But if the underlying filesystem is ext4/f2fs, or the nfs server is knfsd, lseek(0x700000, SEEK_DATA) fails with ENXIO.
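For reference, here is a minimal standalone probe of the two cases (a sketch only; the mount path is hypothetical, and the file is assumed to already have the layout shown above):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        int fd = open("/mnt/nfs/testfile", O_RDONLY);   /* hypothetical mount path */
        if (fd < 0)
                return 1;

        errno = 0;
        off_t pos = lseek(fd, 0x700000, SEEK_DATA);
        /* ext4/f2fs or knfsd: pos == -1 and errno == ENXIO, because there
         * is no data after 0x700000.  ganesha/gluster via the nfs client:
         * pos == 0x700000, so a caller looping on the result never advances. */
        printf("SEEK_DATA: pos=%jd errno=%s\n", (intmax_t)pos, strerror(errno));

        pos = lseek(fd, 0x700000, SEEK_HOLE);
        /* 0x700000 lies inside the final hole, so pos == 0x700000 here is
         * the correct answer for SEEK_HOLE. */
        printf("SEEK_HOLE: pos=%jd\n", (intmax_t)pos);

        close(fd);
        return 0;
}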
I want to know:
should I fix ganesha/gluster to return ENXIO for the first case, as knfsd does?
or should I fix the nfs client to return ENXIO for the first case?
thanks,
Kinglong Mee
Re: Better interop for NFS/SMB file share mode/reservation
by J. Bruce Fields
On Tue, Mar 05, 2019 at 04:47:48PM -0500, J. Bruce Fields wrote:
> On Thu, Feb 14, 2019 at 04:06:52PM -0500, J. Bruce Fields wrote:
> > After this:
> >
> > https://marc.info/?l=linux-nfs&m=154966239918297&w=2
> >
> > delegations would no longer conflict with opens from the same tgid. So
> > if your threads all run in the same process and you're willing to manage
> > conflicts among your own clients, that should still allow you to do
> > multiple opens of the same file without giving up your lease/delegation.
> >
> > I'd be curious to know whether that works with Samba's design.
>
> Any idea whether that would work?
>
> (Easy? Impossible? Possible, but realistically the changes required to
> Samba would be painful enough that it'd be unlikely to get done?)
Volker reminds me off-list that he'd like to see Ganesha and Samba work
out an API in userspace first before committing to a user<->kernel API.
Jeff, wasn't there some work (on Ceph maybe?) on a userspace delegation
API? Is that close to what's needed?
In any case, my immediate goal is just to get knfsd fixed, which doesn't
really commit us to anything--knfsd only needs kernel internal
interfaces. But it'd be nice to have at least some idea if we're on the
right track, to save having to redo that work later.
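For concreteness, the open pattern in question might look like the sketch below, using Linux's F_SETLEASE lease API (run as the file's owner; whether the second open conflicts with the lease is exactly what the patch above would change):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
        /* Take a read lease -- the primitive the kernel's delegation
         * machinery is built on.  F_RDLCK requires a read-only fd. */
        int fd1 = open("testfile", O_RDONLY);
        if (fd1 < 0 || fcntl(fd1, F_SETLEASE, F_RDLCK) < 0) {
                perror("F_SETLEASE");
                return 1;
        }

        /* A second, conflicting open from the same process (same tgid).
         * O_NONBLOCK makes a lease-breaking open fail with EWOULDBLOCK
         * instead of blocking; with the proposed change this open should
         * succeed and leave the lease intact. */
        int fd2 = open("testfile", O_WRONLY | O_NONBLOCK);
        printf("second open: %s\n", fd2 < 0 ? strerror(errno) : "ok");
        return 0;
}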
--b.
Re: [Gluster-users] Proposing to previous ganesha HA cluster solution back to gluster code as gluster-7 feature
by Strahil
Keep in mind that corosync/pacemaker is hard for new admins/users to set up properly.
I'm still trying to remediate the effects of poor configuration at work.
Also, storhaug is nice for hyperconverged setups where the host is not only hosting bricks but also other workloads.
Corosync/pacemaker require proper fencing to be set up, and most of the stonith resources 'shoot the other node in the head'.
I would be happy to see an easy-to-deploy option (say, 'cluster.enable-ha-ganesha true') where gluster brings up the floating IPs and takes care of the NFS locks, so that no disruption is felt by the clients.
Still, this will be a lot of work to achieve.
Best Regards,
Strahil Nikolov

On Apr 30, 2019 15:19, Jim Kinney <jim.kinney(a)gmail.com> wrote:
>
> +1!
> I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed-in process to coordinate multiple nodes into an HA cluster will be very welcome.
>
> On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan <jthottan(a)redhat.com> wrote:
>>
>> Hi all,
>>
>> Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync.
>>
>> That feature was removed in glusterfs 3.10 in favour of the common HA project "Storhaug". However, Storhaug has not progressed
>>
>> much over the last two years and its development is currently halted, hence I am planning to restore the old HA ganesha solution back
>>
>> to the gluster code repository, with some improvements, targeting the next gluster release (7).
>>
>> I have opened up an issue [1] with details and posted initial set of patches [2]
>>
>> Please share your thoughts on this
>>
>> Regards,
>>
>> Jiffin
>>
>> [1] https://github.com/gluster/glusterfs/issues/663
>>
>> [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged)
>
>
> --
> Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity.
For NLM UNLOCK - connection to rpc.statd not getting closed
by Madhu Thorat
Hi,
Not sure if my previous mail was sent, hence re-sending the mail.
I used the latest ganesha code and ran the following test from an NFSv3
client in a script:
ct=0
while [ $ct -lt 4096 ]; do
        flock -x mylock echo 1 >> myfile
        let ct=$ct+1
done
After 1000+ iterations, the client got the error "No locks available".
ganesha.log had the following trace:
ganesha.nfsd-9328[svc_103] nsm_connect :NLM :CRIT :connect to statd
failed: RPC: Unknown protocol
/var/log/messages showed a "Too many open files" message. It looks like a
connection to rpc.statd is created for each NLM LOCK request but not closed
on the corresponding NLM UNLOCK request.
After analyzing the code, it seems this happens because for an NLM LOCK
request 'xprt->xp_refcnt' is ref'ed twice, but while handling the NLM
UNLOCK request 'xprt->xp_refcnt' is un-ref'ed only once; thus
svc_vc_destroy_it() doesn't get called and the connection to rpc.statd is
not closed.
More details of the code analysis are below. Can you please look into this
issue? Thank you. Also, I am not sure why we increment 'xprt->xp_refcnt'
twice in svc_xprt_lookup().
For NLM LOCK request the code path is:
--------------------------------------------------------------
nlm4_Lock() -> ...... -> nsm_connect() -> ....... -> makefd_xprt() ->
svc_xprt_lookup()
137 SVCXPRT *
138 svc_xprt_lookup(int fd, svc_xprt_setup_t setup)
139 {
......
......
173         (*setup)(&xprt);        /* zalloc, xp_refcnt = 1 */   --> leads to call to svc_vc_xprt_setup()
174         xprt->xp_fd = fd;
175         xprt->xp_flags = SVC_XPRT_FLAG_INITIAL;
176
177         /* Get ref for caller */
178         SVC_REF(xprt, SVC_REF_FLAG_NONE);
Here, at line 173 the function svc_vc_xprt_setup() is called, which sets
'xprt->xp_refcnt = 1'.
Then at line 178, SVC_REF increments 'xprt->xp_refcnt' by 1. Thus, when
handling an NLM LOCK request, 'xprt->xp_refcnt' ends up at 2.
For NLM UNLOCK request the code path is:
-------------------------------------------------------------------
nlm4_Unlock() -> ...... -> nsm_disconnect -> ..... -> clnt_vc_destroy() ->
svc_release_it()
410 static inline void svc_release_it(SVCXPRT *xprt, u_int flags,
411                                   const char *tag, const int line)
412 {
413         int32_t refs = atomic_dec_int32_t(&xprt->xp_refcnt);
......
......
425         if (likely(refs > 0)) {
426                 /* normal case */
427                 return;
428         }
429
430         /* enforce once-only semantic, trace others */
431         xp_flags = atomic_postset_uint16_t_bits(&xprt->xp_flags,
432                                                  SVC_XPRT_FLAG_RELEASING);
......
439         /* Releasing last reference */
440         (*(xprt)->xp_ops->xp_destroy)(xprt, flags, tag, line);
Here, at line 413 'xprt->xp_refcnt' gets decremented and becomes 1. But
since 'xprt->xp_refcnt != 0', the function returns at line 427 and never
proceeds to close the connection.
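To make the imbalance concrete, here is a toy model of the lifecycle described above (illustrative only, not actual ntirpc code):

#include <stdio.h>

static int xp_refcnt;

static void nlm_lock_path(void)
{
        xp_refcnt = 1;  /* svc_vc_xprt_setup(): zalloc, xp_refcnt = 1 */
        xp_refcnt++;    /* SVC_REF() taken for the caller             */
}

static void nlm_unlock_path(void)
{
        xp_refcnt--;    /* the single unref in svc_release_it() */
}

int main(void)
{
        nlm_lock_path();
        nlm_unlock_path();
        /* xp_refcnt is still 1, so xp_destroy() never runs and the fd of
         * the statd connection leaks -- repeat enough times and open()
         * starts failing with "Too many open files". */
        printf("xp_refcnt after LOCK+UNLOCK: %d\n", xp_refcnt);
        return 0;
}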
Thanks,
Madhu Thorat
IBM Pune.
open_fd_count incremented unconditionally
by srikrishanmalik@gmail.com
Hello,
For unchecked creates, ganesha may increment open_fd_count even when no new fd is created by ganesha.
This can happen for v3 when an fd for a file is already open and the client sends a create(unchecked) for the same file: the fsal closes the old fd and reopens the file, but fsal_helper.c:open_by_name increments open_fd_count without checking (it has no way to check) whether a new fd was really created.
We will hit this when two clients try to create the same file simultaneously, i.e. the lookup from both clients fails and both issue a create; the file is opened by the first create, and the other create ends up just incrementing open_fd_count.
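A toy model of that double count (illustrative only; this is not Ganesha code):

#include <stdbool.h>
#include <stdio.h>

static int open_fd_count;
static bool file_already_open;

/* Models an unchecked create: the FSAL may close the old fd and reopen
 * the file, so no net new fd exists, but the open-by-name path cannot
 * tell the difference and increments the counter unconditionally. */
static void create_unchecked(void)
{
        if (!file_already_open)
                file_already_open = true;   /* only the first create truly opens */
        open_fd_count++;                    /* incremented either way */
}

int main(void)
{
        create_unchecked();   /* client A: lookup failed, so it creates */
        create_unchecked();   /* client B: races, reuses the open fd    */
        printf("fds actually open: 1, open_fd_count: %d\n", open_fd_count);
        return 0;
}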
There is a patch (https://gerrithub.io/c/ffilz/nfs-ganesha/+/391267) which may address this; are there any plans to take it?
Thanks
Sri
Crash in stats_v4_full
by gaurav gangalwar
Hi,
We are getting this crash on a loaded system. We are using dbus-1.10.24-13,
and it looks like an issue with the dbus lib.
Is anyone aware of this issue?
(gdb) bt
#0  0x00007f4e468e159b in raise () from /lib64/libpthread.so.0
#1  0x0000000000448db4 in crash_handler (signo=11, info=0x7f4e2c1db870, ctx=0x7f4e2c1db740) at /usr/src/debug/nfs-ganesha-2.7.1/MainNFSD/nfs_init.c:246
#2  <signal handler called>
#3  0x00007f4e47f0b269 in _dbus_marshal_read_uint32 () from /lib64/libdbus-1.so.3
#4  0x00007f4e47f0bb6a in _dbus_marshal_skip_basic () from /lib64/libdbus-1.so.3
#5  0x00007f4e47ef71f2 in base_reader_next () from /lib64/libdbus-1.so.3
#6  0x00007f4e47ef70cb in _dbus_type_reader_next () from /lib64/libdbus-1.so.3
#7  0x00007f4e47ef71b8 in base_reader_next () from /lib64/libdbus-1.so.3
#8  0x00007f4e47ef7239 in struct_reader_next () from /lib64/libdbus-1.so.3
#9  0x00007f4e47ef70cb in _dbus_type_reader_next () from /lib64/libdbus-1.so.3
#10 0x00007f4e47ef7368 in array_reader_next () from /lib64/libdbus-1.so.3
#11 0x00007f4e47ef70cb in _dbus_type_reader_next () from /lib64/libdbus-1.so.3
#12 0x00007f4e47ef5348 in _dbus_header_cache_revalidate () from /lib64/libdbus-1.so.3
#13 0x00007f4e47ef5c8e in _dbus_header_get_field_raw () from /lib64/libdbus-1.so.3
#14 0x00007f4e47efa222 in _dbus_message_iter_open_signature.part.4 () from /lib64/libdbus-1.so.3
#15 0x00007f4e47efc0f4 in dbus_message_iter_append_basic () from /lib64/libdbus-1.so.3
#16 0x0000000000514be1 in server_dbus_v4_full_stats (iter=0x7f4e2c1dc130) at /usr/src/debug/nfs-ganesha-2.7.1/support/server_stats.c:2287
#17 0x000000000051b319 in stats_v4_full (args=0x0, reply=0x7f4e2441f080, error=0x7f4e2c1dc230) at /usr/src/debug/nfs-ganesha-2.7.1/support/export_mgr.c:2102
#18 0x0000000000554e10 in dbus_message_entrypoint (conn=0x7f4e42c67400, msg=0x7f4e42c31200, user_data=0x7e7cd0 <export_interfaces>) at /usr/src/debug/nfs-ganesha-2.7.1/dbus/dbus_server.c:562
#19 0x00007f4e47f00276 in _dbus_object_tree_dispatch_and_unlock () from /lib64/libdbus-1.so.3
#20 0x00007f4e47ef1b29 in dbus_connection_dispatch () from /lib64/libdbus-1.so.3
#21 0x00007f4e47ef1e42 in _dbus_connection_read_write_dispatch () from /lib64/libdbus-1.so.3
#22 0x00000000005559e8 in gsh_dbus_thread (arg=0x0) at /usr/src/debug/nfs-ganesha-2.7.1/dbus/dbus_server.c:795
#23 0x00007f4e468d9e25 in start_thread () from /lib64/libpthread.so.0
#24 0x00007f4e461e1bad in clone () from /lib64/libc.so.6
(gdb)
Thanks.
Gaurav
Crash in getclnthandle()
by Madhu P Punjabi
Hi,
When using V2.8-dev.28, I saw the following crash. NOFILE had been set to 1024
for testing (e.g. via 'ulimit -n 1024' in the shell that launches ganesha),
and clients (which mounted an export with NFSv3) were acquiring many locks.
(gdb) bt
#0  0x00007f26b719a5d7 in raise () from /lib64/libc.so.6
#1  0x00007f26b719bcc8 in abort () from /lib64/libc.so.6
#2  0x00007f26b7193546 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f26b71935f2 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f26b9592782 in getclnthandle (host=0x7f26b99b8e34 "localhost", nconf=0x7f267c002960, targaddr=0x7f26ae4c7b10) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:350
#5  0x00007f26b9593166 in __rpcb_findaddr_timed (program=100024, version=1, nconf=0x7f267c002960, host=0x7f26b99b8e34 "localhost", clpp=0x7f26ae4c7bb0, tp=0x0) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:683
#6  0x00007f26b95843ca in clnt_tp_ncreate_timed (hostname=0x7f26b99b8e34 "localhost", prog=100024, vers=1, nconf=0x7f267c002960, tp=0x0) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/clnt_generic.c:265
#7  0x00007f26b9584271 in clnt_ncreate_timed (hostname=0x7f26b99b8e34 "localhost", prog=100024, vers=1, netclass=0x7f26b99b8e30 "tcp", tp=0x0) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/clnt_generic.c:196
#8  0x00007f26b995b4bb in clnt_ncreate (hostname=0x7f26b99b8e34 "localhost", prog=100024, vers=1, nettype=0x7f26b99b8e30 "tcp") at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/ntirpc/rpc/clnt.h:396
#9  0x00007f26b995b6ef in nsm_connect () at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nsm.c:58
#10 0x00007f26b995bd4a in nsm_monitor (host=0x7f267c002250) at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nsm.c:118
#11 0x00007f26b989dc1d in get_nsm_client (care=CARE_MONITOR, xprt=0x7f269c000b60, caller_name=0x7f267c001480 "ss_bignode_cl1") at /usr/src/debug/nfs-ganesha-2.8-dev.28/SAL/nlm_owner.c:1014
#12 0x00007f26b995a594 in nlm_process_parameters (req=0x7f267c000a00, exclusive=true, alock=0x7f267c0011f0, plock=0x7f26ae4c8970, ppobj=0x7f26ae4c91b8, care=CARE_MONITOR, ppnsm_client=0x7f26ae4c89a8, ppnlm_client=0x7f26ae4c89a0, ppowner=0x7f26ae4c8998, block_data=0x7f26ae4c8968, nsm_state=11, state=0x7f26ae4c8990) at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nlm_util.c:291
#13 0x00007f26b9955888 in nlm4_Lock (args=0x7f267c0011d8, req=0x7f267c000a00, res=0x7f267c001260) at /usr/src/debug/nfs-ganesha-2.8-dev.28/Protocols/NLM/nlm_Lock.c:105
#14 0x00007f26b980ee63 in nfs_rpc_process_request (reqdata=0x7f267c000a00) at /usr/src/debug/nfs-ganesha-2.8-dev.28/MainNFSD/nfs_worker_thread.c:1484
#15 0x00007f26b980f2e5 in nfs_rpc_valid_NLM (req=0x7f267c000a00) at /usr/src/debug/nfs-ganesha-2.8-dev.28/MainNFSD/nfs_worker_thread.c:1633
#16 0x00007f26b95a12fb in svc_vc_decode (req=0x7f267c000a00) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_vc.c:827
#17 0x00007f26b959d797 in svc_request (xprt=0x7f269c000b60, xdrs=0x7f267c001600) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:793
#18 0x00007f26b95a120c in svc_vc_recv (xprt=0x7f269c000b60) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_vc.c:800
#19 0x00007f26b959d718 in svc_rqst_xprt_task (wpe=0x7f269c000d80) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:774
#20 0x00007f26b959e020 in svc_rqst_epoll_loop (wpe=0x9c84c0) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/svc_rqst.c:1089
#21 0x00007f26b95a6aaf in work_pool_thread (arg=0x7f26a4000ef0) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/work_pool.c:184
#22 0x00007f26b7b56df5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007f26b725b1ad in clone () from /lib64/libc.so.6
(gdb) f 4
#4  0x00007f26b9592782 in getclnthandle (host=0x7f26b99b8e34 "localhost", nconf=0x7f267c002960, targaddr=0x7f26ae4c7b10) at /usr/src/debug/nfs-ganesha-2.8-dev.28/libntirpc/src/rpcb_clnt.c:350
350             assert(client == NULL);
(gdb) p/x client
$1 = 0x7f267c00bc60
I have posted a patch with a possible fix for this crash at the following
link. If this fix does not look appropriate, please suggest another fix for
this issue. Thank you.
https://github.com/nfs-ganesha/ntirpc/pull/172
Thanks,
Madhu Thorat.
Announce Push of V2.8-dev.28
by Frank Filz
Branch next
Tag:V2.8-dev.28
Release Highlights
* Fix memory leak for RPCSEC_GSS
* Arg sanitization in dbus for stats.
* Fix NLM owner refcount leak in state_lock()
* Fix memory leak in nlm4_Lock()
* Add 'expire_time_parent' parameter and refill parent handle if expired
* MDCACHE - Fix race between lru functions for the chunk and the parent
* MDCACHE: Don't deref the mdcache_entry pointer once the ref is dropped.
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
9536ffc Frank S. Filz V2.8-dev.28
5f8976c Ashish Sangwan MDCACHE: Don't deref the mdcache_entry pointer once the ref is dropped.
11e0e37 Ashish Sangwan MDCACHE - Fix race between lru functions for the chunk and the parent of the chunk getting freed and reused.
b05a40d Madhu Thorat Add 'expire_time_parent' parameter and refill parent handle if expired
191d5a2 Malahal Naineni Fix memory leak in nlm4_Lock()
777236c Malahal Naineni Fix NLM owner refcount leak in state_lock()
2fe23b9 Gaurav Gangalwar Arg sanitization in dbus for stats.
ba14f01 Sachin Punadikar Fix memory leak for RPCSEC_GSS