Hi Matt,
  I think I found the reason why this problem will never triggered by the copy from local fs dir, because when copy from the fuse mount point,
ganesha process rpc requests concurrently no matter what the mount param is.But when copy from the localfs dir, ganesha process rpc requests
sequentially. This really confused me, why this happen? 
  Appreciate your help.
Thanks

Here is some prove:
1.ceph-fuse mount point --> nfs mount point
[two different threads process concurrently]

17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :0x7f66cc001090 fd 43 context 0x7f66cc00e390
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :0x7f66cc001090 fd 43 context 0x7f66e81864f0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=218961045
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 100000 len: 80000 UNSTABLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2 post obj_hdl 0x7f66d80317d0 r state (nil) returned -5
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs3_Errno_verbose :NFS3 :CRIT :Error I/O error in nfs3_write converted to NFS3ERR_IO but was set non-retryable
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=202183829
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 0 len: 100000 UNSTABLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=218961045 returned XPRT_IDLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7f66cc001090 fd 43 xp_refs 5 rq_refs 0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2 post obj_hdl 0x7f66d80317d0 state (nil) returned 0
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=202183829 returned XPRT_IDLE
17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7f66cc001090 fd 43 xp_refs 3 rq_refs 0

2.local fs dir --. nfs mount point
[always process seconde write rpc after free the first one]
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] nfs_rpc_decode_request :DISP :DEBUG :0x7fd7f8044410 fd 43 context 0x7fd80000a3e0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=3075222631
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054e68f7b7758d48bc5000000 start: 0 len: 100000 FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2 post obj_hdl 0x7fd7ec0416c0 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3075222631 returned XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_15] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] nfs_rpc_decode_request :DISP :DEBUG :0x7fd7f8044410 fd 43 context 0x7fd7dc0437a0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=3091999847
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054e68f7b7758d48bc5000000 start: 100000 len: 80000 FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2 post obj_hdl 0x7fd7ec0416c0 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3091999847 returned XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_9] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0

18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] nfs_rpc_decode_request :DISP :DEBUG :0x7fd7f8044410 fd 43 context 0x7fd7d0025230
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=3159108711
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb7405408e314ea92675ba9000000 start: 0 len: 100000 FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2 post obj_hdl 0x7fd7c4059570 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3159108711 returned XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_4] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] nfs_rpc_decode_request :DISP :DEBUG :0x7fd7f8044410 fd 43 context 0x7fd810046740
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=3175885927
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb7405408e314ea92675ba9000000 start: 100000 len: 80000 FILE_SYNC
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] rgw_fsal_write2 :FSAL :DEBUG :rgw_fsal_write2 post obj_hdl 0x7fd7c4059570 state (nil) returned 0
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7fd7f8044410 fd 43 (::ffff:10.10.112.54:813) xid=3175885927 returned XPRT_IDLE
18/12/2019 11:17:29 : epoch 5df99527 : 10-10-112-54 : ganesha.nfsd-50922[svc_17] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7fd7f8044410 fd 43 xp_refs 3 rq_refs 0

j j <jiang4357291@gmail.com> 于2019年12月18日周三 上午10:38写道:
Hi Matt,
   I've tried both sync and async, the async returned " failed to close .... input/out error".
   But the final error point is still the "non-0 initial write position" in librgw, which triggered by the unordered write.
Thanks

Matt Benjamin <mbenjami@redhat.com> 于2019年12月17日周二 下午8:29写道:
Hi j j,

Are you mounted -osync?

Matt

On Tue, Dec 17, 2019 at 7:26 AM j j <jiang4357291@gmail.com> wrote:
>
> Hi everyone,
>     I was using FSAL_RGW to export nfsv3, and trying to copy files from ceph-fuse mountpoint to the nfs mountpoint, file size is 1.5mb, but some files write failed with "input/output error" returning.
>     After simply debug, I  found that the error is because the write is unordered, the file size is 1.5mb, so it need two write rpc, sometimes the second rpc will processed first, then the librgw will return EIO fail. But here's the strange thing, when I copy files from local fs dir, the problem never appearance.
>     And the receiving order is always right, but processing order is not.  But why this never appearance when copy from local fs ? And Is there any suggestion to fix? Really appreciate your help.
>
>  Thanks
>
> ganesha version: i tried two version:2.6.2 and 2.8.1
> ceph version: 12.2.5 luminous
> And here is log of a failed writing:
>
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :0x7f66cc001090 fd 43 context 0x7f66cc00e390
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :0x7f66cc001090 fd 43 context 0x7f66e81864f0
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] get_gsh_client :HT CACHE :DEBUG :client_mgr cache hit slot 0
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=218961045
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] get_gsh_export :HT CACHE :DEBUG :export_mgr cache hit slot 1
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 100000 len: 80000 UNSTABLE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] cih_get_by_key_latch :HT CACHE :DEBUG :cih cache hit slot 9517
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2 post obj_hdl 0x7f66d80317d0 state (nil) returned -5
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs3_Errno_verbose :NFS3 :CRIT :Error I/O error in nfs3_write converted to NFS3ERR_IO but was set non-retryable
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] get_gsh_client :HT CACHE :DEBUG :client_mgr cache hit slot 0
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:10.10.112.54 for Program 100003, Version 3, Function 7 has xid=202183829
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] get_gsh_export :HT CACHE :DEBUG :export_mgr cache hit slot 1
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs3_write :NFS3 :DEBUG :REQUEST PROCESSING: Calling nfs_Write handle: File Handle V3: Len=24 4300000110bef9bc69eeb74054f5b17dac1ddb65af000000 start: 0 len: 100000 UNSTABLE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] cih_get_by_key_latch :HT CACHE :DEBUG :cih cache hit slot 9517
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :dupreq_tcp_cmpf
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :xids eq 2920027285, ck1 12828036216958414204 ck2 12828036216958414204
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_dupreq_finish :DUPREQ :DEBUG :retiring ov=0x7f66fc028360 xid=2920027285 on DRC=0x7f66dc003bb0 state=DUPREQ_COMPLETE, status=DUPREQ_SUCCESS, refcnt=1
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_dupreq_free_dupreq :DUPREQ :DEBUG :freeing dupreq entry dv=0x7f66fc028360, dv xid=2920027285 cksum 12828036216958414204 state=DUPREQ_COMPLETE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :dupreq_tcp_cmpf
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] dupreq_tcp_cmpf :DUPREQ :DEBUG :xids eq 2936804501, ck1 8812608633411635075 ck2 8812608633411635075
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_dupreq_finish :DUPREQ :DEBUG :retiring ov=0x7f66e400fd90 xid=2936804501 on DRC=0x7f66dc003bb0 state=DUPREQ_COMPLETE, status=DUPREQ_SUCCESS, refcnt=1
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_dupreq_free_dupreq :DUPREQ :DEBUG :freeing dupreq entry dv=0x7f66e400fd90, dv xid=2936804501 cksum 8812608633411635075 state=DUPREQ_COMPLETE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=218961045 returned XPRT_IDLE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_13] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7f66cc001090 fd 43 xp_refs 5 rq_refs 0
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] rgw_fsal_write2 :FSAL :DEBUG : rgw_fsal_write2 post obj_hdl 0x7f66d80317d0 state (nil) returned 0
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7f66cc001090 fd 43 (::ffff:10.10.112.54:706) xid=202183829 returned XPRT_IDLE
> 17/12/2019 14:23:42 : epoch 5df873e3 : 10-10-112-54 : ganesha.nfsd-41666[svc_8] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7f66cc001090 fd 43 xp_refs 3 rq_refs 0
> _______________________________________________
> Devel mailing list -- devel@lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org



--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309