Hi Matt,
I think I found the reason why this problem will never triggered by the
copy from local fs dir, because when copy from the fuse mount point,
ganesha process rpc requests concurrently no matter what the mount param
is.But when copy from the localfs dir, ganesha process rpc requests
sequentially. This really confused me, why this happen?
Appreciate your help.
Thanks
Here is some prove:
1.ceph-fuse mount point --> nfs mount point
[two different threads process concurrently]
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG
:0x7f66cc001090 fd 43 context 0x7f66cc00e390
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG
:0x7f66cc001090 fd 43 context 0x7f66e81864f0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=218961045
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 100000 len: 80000
UNSTABLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2
post obj_hdl 0x7f66d80317d0 r state (nil) returned -5
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] nfs3_Errno_verbose :NFS3 :CRIT :Error I/O error
in nfs3_write converted to NFS3ERR_IO but was set non-retryable
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=202183829
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 0 len: 100000
UNSTABLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=218961045 returned
XPRT_IDLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_13] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7f66cc001090 fd 43 xp_refs 5 rq_refs 0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2
post obj_hdl 0x7f66d80317d0 state (nil) returned 0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=202183829 returned
XPRT_IDLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
ganesha.nfsd-41666[svc_8] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7f66cc001090 fd 43 xp_refs 3 rq_refs 0
2.local fs dir --. nfs mount point
[always process seconde write rpc after free the first one]
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] nfs_rpc_decode_request :DISP :DEBUG
:0x7fd7f8044410 fd 43 context 0x7fd80000a3e0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=3075222631
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb74054e68f7b7758d48bc5000000 start: 0 len: 100000
FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2
post obj_hdl 0x7fd7ec0416c0 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3075222631 returned
XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_15] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] nfs_rpc_decode_request :DISP :DEBUG
:0x7fd7f8044410 fd 43 context 0x7fd7dc0437a0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=3091999847
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb74054e68f7b7758d48bc5000000 start: 100000 len: 80000
FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2
post obj_hdl 0x7fd7ec0416c0 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3091999847 returned
XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_9] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] nfs_rpc_decode_request :DISP :DEBUG
:0x7fd7f8044410 fd 43 context 0x7fd7d0025230
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=3159108711
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb7405408e314ea92675ba9000000 start: 0 len: 100000
FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2
post obj_hdl 0x7fd7c4059570 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3159108711 returned
XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_4] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] nfs_rpc_decode_request :DISP :DEBUG
:0x7fd7f8044410 fd 43 context 0x7fd810046740
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] nfs_rpc_process_request :DISP :DEBUG :Request
from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
xid=3175885927
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
Calling nfs_Write handle: File Handle V3: Len=24
4300000110bef9bc69eeb7405408e314ea92675ba9000000 start: 100000 len: 80000
FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2
post obj_hdl 0x7fd7c4059570 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3175885927 returned
XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 :
ganesha.nfsd-50922[svc_17] free_nfs_request :DISP :DEBUG :free_nfs_request:
0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
j j <jiang4357291(a)gmail.com> 于2019年12月18日周三 上午10:38写道:
Hi Matt,
I've tried both sync and async, the async returned " failed to close
.... input/out error".
But the final error point is still the "non-0 initial write position"
in librgw, which triggered by the unordered write.
Thanks
Matt Benjamin <mbenjami(a)redhat.com> 于2019年12月17日周二 下午8:29写道:
> Hi j j,
>
> Are you mounted -osync?
>
> Matt
>
> On Tue, Dec 17, 2019 at 7:26 AM j j <jiang4357291(a)gmail.com> wrote:
> >
> > Hi everyone,
> > I was using FSAL_RGW to export nfsv3, and trying to copy files from
> ceph-fuse mountpoint to the nfs mountpoint, file size is 1.5mb, but some
> files write failed with "input/output error" returning.
> > After simply debug, I found that the error is because the write is
> unordered, the file size is 1.5mb, so it need two write rpc, sometimes the
> second rpc will processed first, then the librgw will return EIO fail. But
> here's the strange thing, when I copy files from local fs dir, the problem
> never appearance.
> > And the receiving order is always right, but processing order is
> not. But why this never appearance when copy from local fs ? And Is there
> any suggestion to fix? Really appreciate your help.
> >
> > Thanks
> >
> > ganesha version: i tried two version:2.6.2 and 2.8.1
> > ceph version: 12.2.5 luminous
> > And here is log of a failed writing:
> >
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG
> :0x7f66cc001090 fd 43 context 0x7f66cc00e390
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG
> :0x7f66cc001090 fd 43 context 0x7f66e81864f0
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] get_gsh_client :HT CACHE :DEBUG :client_mgr
> cache hit slot 0
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_rpc_process_request :DISP :DEBUG :Request
> from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
> xid=218961045
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] get_gsh_export :HT CACHE :DEBUG :export_mgr
> cache hit slot 1
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
> Calling nfs_Write handle: File Handle V3: Len=24
> 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 100000 len: 80000
> UNSTABLE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] cih_get_by_key_latch :HT CACHE :DEBUG :cih cache
> hit slot 9517
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2
> post obj_hdl 0x7f66d80317d0 state (nil) returned -5
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs3_Errno_verbose :NFS3 :CRIT :Error I/O error
> in nfs3_write converted to NFS3ERR_IO but was set non-retryable
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] get_gsh_client :HT CACHE :DEBUG :client_mgr cache
> hit slot 0
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] nfs_rpc_process_request :DISP :DEBUG :Request
> from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has
> xid=202183829
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] get_gsh_export :HT CACHE :DEBUG :export_mgr cache
> hit slot 1
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING:
> Calling nfs_Write handle: File Handle V3: Len=24
> 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 0 len: 100000
> UNSTABLE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] cih_get_by_key_latch :HT CACHE :DEBUG :cih cache
> hit slot 9517
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :dupreq_tcp_cmpf
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :xids eq
> 2920027285, ck1 12828036216958414204 ck2 12828036216958414204
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_dupreq_finish :DUPREQ :DEBUG :retiring
> ov=0x7f66fc028360 xid=2920027285 on DRC=0x7f66dc003bb0
> state=DUPREQ_COMPLETE, status=DUPREQ_SUCCESS, refcnt=1
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_dupreq_free_dupreq :DUPREQ :DEBUG :freeing
> dupreq entry dv=0x7f66fc028360, dv xid=2920027285 cksum
> 12828036216958414204 state=DUPREQ_COMPLETE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :dupreq_tcp_cmpf
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :xids eq
> 2936804501, ck1 8812608633411635075 ck2 8812608633411635075
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_dupreq_finish :DUPREQ :DEBUG :retiring
> ov=0x7f66e400fd90 xid=2936804501 on DRC=0x7f66dc003bb0
> state=DUPREQ_COMPLETE, status=DUPREQ_SUCCESS, refcnt=1
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_dupreq_free_dupreq :DUPREQ :DEBUG :freeing
> dupreq entry dv=0x7f66e400fd90, dv xid=2936804501 cksum 8812608633411635075
> state=DUPREQ_COMPLETE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
> on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=218961045 returned
> XPRT_IDLE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_13] free_nfs_request :DISP :DEBUG :free_nfs_request:
> 0x7f66cc001090 fd 43 xp_refs 5 rq_refs 0
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2
> post obj_hdl 0x7f66d80317d0 state (nil) returned 0
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE
> on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=202183829 returned
> XPRT_IDLE
> > 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 :
> ganesha.nfsd-41666[svc_8] free_nfs_request :DISP :DEBUG :free_nfs_request:
> 0x7f66cc001090 fd 43 xp_refs 3 rq_refs 0
> > _______________________________________________
> > Devel mailing list -- devel(a)lists.nfs-ganesha.org
> > To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
>
http://www.redhat.com/en/technologies/storage
>
> tel. 734-821-5101
> fax. 734-769-8938
> cel. 734-216-5309
>
>