Change in ...nfs-ganesha[next]: MDCACHE - Add MDCACHE {} config block
by Daniel Gryniewicz (GerritHub)
Daniel Gryniewicz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929 )
Change subject: MDCACHE - Add MDCACHE {} config block
......................................................................
MDCACHE - Add MDCACHE {} config block
Add a config block named MDCACHE that is a copy of CACHEINODE. Both can
be configured, but MDCACHE will override CACHEINODE. This allows us to
deprecate CACHEINODE.
Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Signed-off-by: Daniel Gryniewicz <dang(a)fprintf.net>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
M src/config_samples/ceph.conf
M src/config_samples/config.txt
M src/config_samples/ganesha.conf.example
M src/doc/man/ganesha-cache-config.rst
M src/doc/man/ganesha-config.rst
6 files changed, 31 insertions(+), 7 deletions(-)
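For readers of the sample configs, the new block might be used like this (a minimal sketch; the parameter shown is for illustration only, see ganesha-cache-config(8) for the supported options):

    MDCACHE
    {
        # Takes the same parameters as the old CACHEINODE block;
        # when both blocks are present, values set here win.
        Entries_HWMark = 100000;
    }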
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/29/454929/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Gerrit-Change-Number: 454929
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Gryniewicz <dang(a)redhat.com>
Gerrit-MessageType: newchange
4 years, 1 month
lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK
by Kinglong Mee
The latest ganesha/gluster supports the NFSv4.2 SEEK operation, as described in
https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#section-15.11
From the given sa_offset, find the next data_content4 of type sa_what
in the file. If the server can not find a corresponding sa_what,
then the status will still be NFS4_OK, but sr_eof would be TRUE. If
the server can find the sa_what, then the sr_offset is the start of
that content. If the sa_offset is beyond the end of the file, then
SEEK MUST return NFS4ERR_NXIO.
For a file whose filemap is:
Part 1: HOLE 0x0000000000000000 ---> 0x0000000000600000
Part 2: DATA 0x0000000000600000 ---> 0x0000000000700000
Part 3: HOLE 0x0000000000700000 ---> 0x0000000001000000
SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1, sr_offset:0x70000) from ganesha/gluster;
SEEK(0x700000, SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x70000) from ganesha/gluster.
If an application depends on the lseek result for data searching, it may enter an infinite loop:
while (1) {
	next_pos = lseek(fd, cur_pos, seek_type);
	if (next_pos == -1)
		return;
	/* alternate between searching for data and for holes */
	if (seek_type == SEEK_DATA) {
		seek_type = SEEK_HOLE;
	} else {
		seek_type = SEEK_DATA;
	}
	cur_pos = next_pos;
}
The lseek syscall always gets 0x70000 from the nfs client for those two cases,
but if the underlying filesystem is ext4/f2fs, or the nfs server is knfsd,
lseek(0x700000, SEEK_DATA) fails with ENXIO.
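For reference, a minimal probe that walks the filemap with lseek() and prints each boundary (a sketch; the ./big1 path is illustrative, and SEEK_DATA/SEEK_HOLE need _GNU_SOURCE on Linux):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int fd = open("./big1", O_RDONLY);	/* illustrative path */
	off_t pos = 0, next;
	int whence = SEEK_DATA;

	if (fd < 0)
		return 1;
	for (;;) {
		next = lseek(fd, pos, whence);
		if (next == (off_t)-1) {
			/* ext4/f2fs and knfsd end the walk here with ENXIO */
			printf("lseek(0x%jx, %s): %s\n", (uintmax_t)pos,
			       whence == SEEK_DATA ? "SEEK_DATA" : "SEEK_HOLE",
			       strerror(errno));
			break;
		}
		printf("%s starts at 0x%jx\n",
		       whence == SEEK_DATA ? "data" : "hole", (uintmax_t)next);
		/* if the server keeps echoing back the same offset, this
		 * alternation never terminates -- the loop described above */
		whence = (whence == SEEK_DATA) ? SEEK_HOLE : SEEK_DATA;
		pos = next;
	}
	close(fd);
	return 0;
}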
I want to know:
should I fix ganesha/gluster to return ENXIO, as knfsd does, for the first case?
or should I fix the nfs client to return ENXIO for the first case?
thanks,
Kinglong Mee
4 years, 3 months
GPFS LogCrit on lock_op2
by Frank Filz
On the call, I mentioned I would look at bypassing the permission check for
the file owner for the open_func call in fsal_find_fd with open_for_locks.
It turns out there is a difference between FSAL_GPFS and FSAL_VFS.
FSAL_VFS makes the ultimate call to open_by_handle as root, and therefore
even a non-owner of the file will have no issue opening the file
read/write.
FSAL_GPFS calls GPFSFSAL_open, which calls fsal_set_credentials, so if the
permissions do not allow read/write when open_for_locks occurs, the file
will instead be opened in the same mode as the OPEN stateid.
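To make the contrast concrete (illustrative pseudocode only; signatures are invented and do not match either FSAL's actual source):

/* FSAL_VFS path: the handle is opened with root credentials, so
 * open_for_locks can always obtain a read/write fd, even for a
 * non-owner of the file. */
fd = vfs_open_by_handle(handle, O_RDWR);	/* runs as root */

/* FSAL_GPFS path: GPFSFSAL_open first calls fsal_set_credentials
 * to switch to the caller, so a caller lacking write permission
 * ends up with the fd opened in the OPEN stateid's mode. */
fsal_set_credentials(&op_ctx->creds);
fd = gpfs_open(handle, openflags);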
I think it would be good to evaluate when GPFSFSAL_open actually needs to be
called, and whether open_func should make a more direct call to
fsal_internal_handle2fd.
Frank
5 years, 3 months
Re: Cannot close Error with gluster FSAL
by QR
Hi Soumya, thanks for your help.
The ganesha process is started by root, but the tar is executed by a non-root user (I use qr here). The mount point is /home/glusterfs.
Reproduce steps:
1. Prepare the tar file and mount point with the root user:
   cd /tmp/perm
   echo 444 > big1.hdr
   chown qr:qr big1.hdr
   tar czf 444.tgz big1.hdr
   chown qr:qr 444.tgz
   mkdir /home/glusterfs/perm && chown qr:qr /home/glusterfs/perm
2. Switch to the non-root user: su qr
3. cd /home/glusterfs/perm
4. tar xzf /tmp/perm/444.tgz
The ganesha server and nfs client are on the same host.
os version     : CentOS Linux release 7.5.1804 (Core)
kernel version : 3.10.0-693.17.1.el7.x86_64
glusterfs api  : glusterfs-api-devel-4.1.5-1.el7.x86_64
ERROR LOGS - ganesha.log
01/06/2019 06:59:48 : ganesha.nfsd-7236[svc_2] glusterfs_close_my_fd :FSAL :CRIT :Error : close returns with Permission denied

ERROR LOGS - ganesha-gfapi.log
[2019-05-31 22:59:48.613207] E [MSGID: 114031] [client-rpc-fops_v2.c:272:client4_0_open_cbk] 0-gv0-client-0: remote operation failed. Path: /perm/big1.hdr (c591c1bf-9494-4222-8aa1-deaaf0b44c7f) [Permission denied]

ERROR LOGS - export-sdb1-brick.log
[2019-05-31 22:59:48.388169] W [MSGID: 113117] [posix-metadata.c:569:posix_update_utime_in_mdata] 0-gv0-posix: posix utime set mdata failed on file
[2019-05-31 22:59:48.399813] W [MSGID: 113117] [posix-metadata.c:569:posix_update_utime_in_mdata] 0-gv0-posix: posix utime set mdata failed on file [Function not implemented]
[2019-05-31 22:59:48.403448] I [MSGID: 139001] [posix-acl.c:269:posix_acl_log_permit_denied] 0-gv0-access-control: client: CTX_ID:51affcdc-77ef-497d-aece-36a568671907-GRAPH_ID:0-PID:7236-HOST:nfs-ganesha-PC_NAME:gv0-client-0-RECON_NO:-0, gfid: c591c1bf-9494-4222-8aa1-deaaf0b44c7f, req(uid:1000,gid:1000,perm:2,ngrps:1), ctx(uid:1000,gid:1000,in-groups:1,perm:444,updated-fop:SETATTR, acl:-) [Permission denied]
[2019-05-31 22:59:48.403499] E [MSGID: 115070] [server-rpc-fops_v2.c:1442:server4_open_cbk] 0-gv0-server: 124: OPEN /perm/big1.hdr (c591c1bf-9494-4222-8aa1-deaaf0b44c7f), client: CTX_ID:51affcdc-77ef-497d-aece-36a568671907-GRAPH_ID:0-PID:7236-HOST:nfs-ganesha-PC_NAME:gv0-client-0-RECON_NO:-0, error-xlator: gv0-access-control [Permission denied]
--------------------------------
----- Original Message -----
From: Soumya Koduri <skoduri(a)redhat.com>
To: zhbingyin(a)sina.com, ganesha-devel <devel(a)lists.nfs-ganesha.org>, Frank Filz <ffilz(a)redhat.com>
Subject: [NFS-Ganesha-Devel] Re: Cannot close Error with gluster FSAL
Date: 2019-06-01 01:35
On 5/31/19 1:55 PM, Soumya Koduri wrote:
>
>
> On 5/31/19 4:30 AM, QR wrote:
>> We cannot decompress this file with gluster FSAL, but vfs FSAL and
>> kernel nfs can.
>> Does anyone know about this?
>> Thanks in advance.
>>
>> [qr@nfs-ganesha perm]$ tar xzf /tmp/perm/444.tgz
>> tar: big1.hdr: Cannot close: Permission denied
>> tar: Exiting with failure status due to previous errors
>> [qr@nfs-ganesha perm]$ tar tvf /tmp/perm/444.tgz
>> -r--r--r-- qr/qr 4 2019-05-30 11:25 big1.hdr
>>
>
> It could be similar to the issue mentioned in [1]. Will check and confirm.
I couldn't reproduce this issue. What is the OS version of the server
and client machines? Also please check if there are any errors in
ganesha.log, ganesha-gfapi.log and brick logs. Most probably you are
hitting the issues discussed in [1].
The problem is that, unlike most of the other FSALs, in FSAL_GLUSTER we
switch to user credentials before performing any operations on the
backend file system [these changes were done to be able to run
nfs-ganesha as a non-root user].
The side-effect is that although the fd is first opened as part of the
NFSv4.x client OPEN call, the NFS-Ganesha server may need to re-open the
file to get additional fds to perform certain other operations like
COMMIT and LOCK/LEASE, and glusterfs doesn't grant RW access to those fds
(as expected).
Not sure if there is a clean way of fixing it. In [1], the author tried
to work around the problem by switching to the root user if the ganesha
process itself runs as root. That means this issue will still remain if
the ganesha process is started by a non-root user.
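For illustration, the workaround in [1] amounts to something like the following (a sketch with hypothetical helper names, not the actual patch):

#include <unistd.h>

/* Hypothetical sketch -- helper names are invented for illustration
 * and do not exist in FSAL_GLUSTER. */
static int reopen_fd_for_op(struct gl_handle *hdl, int flags)
{
	if (geteuid() == 0) {
		/* ganesha was started as root: skip the switch to the
		 * caller's credentials, so internal re-opens (COMMIT,
		 * LOCK/LEASE fds) cannot be refused by backend perms */
		return gl_open_as_root(hdl, flags);
	}
	/* non-root ganesha: the re-open remains subject to the
	 * caller's permissions, and the failure above remains */
	return gl_open_as_caller(hdl, flags);
}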
@Frank,
any thoughts?
Thanks,
Soumya
>
> Thanks,
> Soumya
>
> [1] https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/447012
>
>> Ganesha server info
>> ganesha version: a3c6fa39ce72682049391b7e094885a8c151b0c8(V2.8-rc1)
>> FSAL : gluster
>> nfs client info
>> nfs version : nfs4.0
>> mount options :
>> rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=XXX,local_lock=none,addr=YYY
>>
>>
>> _______________________________________________
>> Devel mailing list -- devel(a)lists.nfs-ganesha.org
>> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
>>
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
5 years, 6 months
Re: Regarding a lost RECLAIM_COMPLETE
by Jeff Layton
On Fri, 2019-05-31 at 10:08 +0000, Sriram Patil wrote:
> Hi,
>
> Recently I came across an issue where NFS client lease expired so ganesha returned NFS4ERR_EXPIRED. This resulted in the client creating a new session with EXCHANGE_ID + CREATE_SESSION.
>
> The client id was immediately confirmed in CREATE_SESSION because the recov directory was not deleted. I observed that ganesha sets “cid_allow_reclaim = true” in nfs4_op_create_session->nfs4_chk_clid->nfs4_chk_clid_impl. This flag allows the client to do reclaims even though ganesha is not in grace. The CLAIM_PREVIOUS case in “open4_validate_reclaim” is as follows,
>
> case CLAIM_PREVIOUS:
> want_grace = true;
> if (!clientid->cid_allow_reclaim ||
> ((data->minorversion > 0) &&
> clientid->cid_cb.v41.cid_reclaim_complete))
> status = NFS4ERR_NO_GRACE;
> break;
>
cid_allow_reclaim is just a flag saying that the client in question is
present in the recovery DB. The logic above looks correct to me.
> Now, there is another flag to mark the completion of reclaim from client “clientid->cid_cb.v41.cid_reclaim_complete”. This flag is set to true as part of RECLAIM_COMPLETE operation. Now, consider a case where ganesha does not receive RECLAIM_COMPLETE from the client and the CLAIM_NULL case in "open4_validate_reclaim",
>
> case CLAIM_NULL:
> if ((data->minorversion > 0)
> && !clientid->cid_cb.v41.cid_reclaim_complete)
> status = NFS4ERR_GRACE;
> break;
>
> So, the client gets stuck in a loop for OPEN with CLAIM_NULL, because it keeps returning NFS4ERR_GRACE.
>
A client that doesn't send a RECLAIM_COMPLETE before attempting to do a
non-reclaim open is broken. RFC5661, page 567:
Whenever a client establishes a new client ID and before it does the
first non-reclaim operation that obtains a lock, it MUST send a
RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no
locks to reclaim. If non-reclaim locking operations are done before
the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned.
So the above behavior is correct, IMO.
> I guess allowing clients to reclaim as long as they keep sending reclaim requests is the point of implementing sticky grace periods. But if RECLAIM_COMPLETE is lost, we should not be stuck in the grace period forever. Maybe we can change cid_allow_reclaim to record the time at which the last reclaim request was received, and then allow non-reclaim requests after (cid_allow_reclaim + grace_period), which means ganesha will wait for a RECLAIM_COMPLETE for a full grace period. We could choose the timeout to be grace_period/3 or something if that makes more sense.
>
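A rough sketch of that timestamp idea (field and parameter names are hypothetical, not actual ganesha code):

#include <stdbool.h>
#include <time.h>

struct clientid_rec {
	time_t cid_last_reclaim_time;	/* updated on every reclaim op */
	bool cid_reclaim_complete;	/* set by RECLAIM_COMPLETE */
};

/* CLAIM_NULL opens keep getting NFS4ERR_GRACE until either a
 * RECLAIM_COMPLETE arrives or a full grace period passes with no
 * reclaim requests from this client. */
static bool still_in_client_grace(const struct clientid_rec *cid,
				  time_t grace_period)
{
	if (cid->cid_reclaim_complete)
		return false;
	return time(NULL) < cid->cid_last_reclaim_time + grace_period;
}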
The point of sticky grace periods was to ensure that we don't end up
with a ToC/ToU race with the grace period. In general, we check whether
we're in the grace period at the start of an operation, but we could end
up lifting it or entering it after that check but before the operation
was complete. With the sticky grace period patches, we ensure that we
remain in whichever state we need until the operation is done.
In general, this should not extend the length of the grace period unless
you have an operation that is taking an extraordinarily long time before
putting its reference. Maybe you have an operation that is stuck and
holding a grace reference?
> But this will ensure that the server does not keep returning NFS4ERR_GRACE forever just because RECLAIM_COMPLETE was not sent.
>
> Meanwhile I am also trying to figure out why NFS client did not send RECLAIM_COMPLETE.
>
That's the real question.
--
Jeff Layton <jlayton(a)redhat.com>
5 years, 6 months
Announce Push of V2.8.0
by Frank Filz
Branch next
Tag:V2.8.0
NOTE: This merge contains an ntirpc pullup, please update your submodule.
Release Highlights
* Pullup ntirpc to 1.8.0
* MDCACHE - Restart readdir if directory is invalidated
* Add symbols needed for tests
* FSAL fix race in FSAL close method
* fsal_open2 - check for non-regular file when open by name
* FSAL_GLUSTER: Include "enable_upcall" option in sample gluster.conf
* spec file changes for RHEL8 python3
* [FSAL_VFS] Reduce the number of opens done for referral directories
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
73ba576 Frank S. Filz V2.8.0
ccf5c5f Daniel Gryniewicz Pull up to ntirpc 1.8.0
bf19b29 Daniel Gryniewicz SAL - Remove state_obj as the last operation
82b6faf Soumya Koduri SAL: Check for state type before reading lock.openstate
514be20 Soumya Koduri SAL: Validate write_deleg_client ptr before dereferencing it
18e33f9 Kaleb S. KEITHLEY ganeshactl: async is a reserved word starting in python-3.7
d57fb08 Kaleb S. KEITHLEY rpm: exclude 9p utils when 9p is disabled
0c41be3 Trishali Nayar ganesha_mgr gets a new option to show the idmapper cache
6a070ff Trishali Nayar Fix syntax error in ganesha_mgr script
ee99322 Frank S. Filz Handle close race in FSAL_MEM and FSAL_RGW
5 years, 7 months
Change in ...nfs-ganesha[next]: Free the extra ref taken by the dirent on the mdcache entry even if t...
by Ashish Sangwan (GerritHub)
Ashish Sangwan has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/456516 )
Change subject: Free the extra ref taken by the dirent on the mdcache entry even if the content lock is held in read mode.
......................................................................
Free the extra ref taken by the dirent on the mdcache entry even if the
content lock is held in read mode.
Not freeing the extra ref leads to memory growth as we are not able to prune
the mdcache entries with ref count >= 2.
Though we expect the ref to be released on an invalidate call,
invalidation of the dirents only happens when we add or delete a
directory entry. Otherwise the dirent stays in the cache, pinning the
corresponding mdcache entry.
Signed-off-by: Ashish Sangwan <ashishsangwan2(a)gmail.com>
Change-Id: I48c6e6f02705a067e3ce0fcaa2177e069ddd53f1
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
M src/include/abstract_atomic.h
2 files changed, 35 insertions(+), 9 deletions(-)
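A sketch of the idea (names are illustrative and do not match the patch): drop the dirent's extra ref with an atomic decrement, which needs no write lock, so the entry can fall back to a prunable refcount:

#include <stdint.h>

struct entry_ref {
	int32_t refcnt;
};

/* GCC/clang builtin; abstract_atomic.h wraps similar operations */
static inline int32_t atomic_dec_int32(int32_t *var)
{
	return __atomic_sub_fetch(var, 1, __ATOMIC_SEQ_CST);
}

static void put_dirent_ref(struct entry_ref *entry)
{
	/* an atomic decrement needs no write lock on the directory
	 * content, so the extra ref can be released even when the
	 * content lock is only held in read mode */
	atomic_dec_int32(&entry->refcnt);
}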
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/16/456516/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/456516
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I48c6e6f02705a067e3ce0fcaa2177e069ddd53f1
Gerrit-Change-Number: 456516
Gerrit-PatchSet: 1
Gerrit-Owner: Ashish Sangwan <ashishsangwan2(a)gmail.com>
Gerrit-MessageType: newchange
5 years, 7 months
Cannot close Error with gluster FSAL
by QR
We cannot decompress this file with the gluster FSAL, but the vfs FSAL and kernel nfs can.
Does anyone know about this?
Thanks in advance.
[qr@nfs-ganesha perm]$ tar xzf /tmp/perm/444.tgz
tar: big1.hdr: Cannot close: Permission denied
tar: Exiting with failure status due to previous errors
[qr@nfs-ganesha perm]$ tar tvf /tmp/perm/444.tgz
-r--r--r-- qr/qr 4 2019-05-30 11:25 big1.hdr
Ganesha server info
  ganesha version: a3c6fa39ce72682049391b7e094885a8c151b0c8 (V2.8-rc1)
  FSAL           : gluster
nfs client info
  nfs version    : nfs4.0
  mount options  : rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=XXX,local_lock=none,addr=YYY
5 years, 7 months
Change in ...nfs-ganesha[next]: SAL: Check for state type before reading lock.openstate
by Soumya (GerritHub)
Soumya has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/456494 )
Change subject: SAL: Check for state type before reading lock.openstate
......................................................................
SAL: Check for state type before reading lock.openstate
state_data.lock.openstate is valid only for states of type
STATE_TYPE_LOCK and STATE_TYPE_NLM_LOCK. Verify the state type before
using it.
(Not sure whether lock.openstate is initialized for NLM; couldn't
tell from reading the code.)
Change-Id: I57223cd9775c6fe2f4332922d19ed4ee996f7ae1
Signed-off-by: Soumya Koduri <skoduri(a)redhat.com>
---
M src/FSAL/commonlib.c
1 file changed, 5 insertions(+), 0 deletions(-)
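The shape of the guard might look like this (a sketch with ganesha's type names, not the actual commonlib.c change):

/* Sketch only -- not the actual patch. Read lock.openstate solely
 * for the state types that define it. */
static struct state_t *get_openstate(struct state_t *state)
{
	switch (state->state_type) {
	case STATE_TYPE_LOCK:
	case STATE_TYPE_NLM_LOCK:
		/* only these state types carry a valid openstate */
		return state->state_data.lock.openstate;
	default:
		/* the union member is not meaningful here */
		return NULL;
	}
}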
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/94/456494/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/456494
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I57223cd9775c6fe2f4332922d19ed4ee996f7ae1
Gerrit-Change-Number: 456494
Gerrit-PatchSet: 1
Gerrit-Owner: Soumya <skoduri(a)redhat.com>
Gerrit-MessageType: newchange
5 years, 7 months