This week's community call cancelled
by Frank Filz
I should have sent this out earlier; however, I think most everyone will see
this in time.
I think most of us Red Hatters, including myself, will be (or would like to
be) attending an internal meeting at the time of tomorrow's community call.
Since that typically accounts for most if not all of our attendance, I'll
just cancel our call.
If you have anything important, I should be on IRC and have some bandwidth
to respond during the meeting.
Frank
5 years, 9 months
IMPORTANT! Tuesday Community Call and Daylight Saving Time
by Frank Filz
This coming Sunday (March 10), the US enters Daylight Saving Time. This
means that for our community members who do not observe DST at all, the
meeting will now be 1 hour earlier for the next 7 months or so. For those in
the Northern Hemisphere who observe DST but transition on a different date
than the US, the meeting will be 1 hour earlier for a short time. For those
in the Southern Hemisphere who observe DST, you are of course making the
opposite transition, and most often not on the same date as the US, so you
will see a net 2 hour shift, with a short-term 1 hour shift until your own
clocks change.
Frank
5 years, 9 months
Change in ...nfs-ganesha[next]: MDCACHE - Ensure ATTR_ACL bit matches attrs.acl ptr
by Daniel Gryniewicz (GerritHub)
Daniel Gryniewicz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446980 )
Change subject: MDCACHE - Ensure ATTR_ACL bit matches attrs.acl ptr
......................................................................
MDCACHE - Ensure ATTR_ACL bit matches attrs.acl ptr
The refresh/copy/ref-handoff code depends on the ATTR_ACL bit in
request_mask always matching the attrs.acl pointer. However,
mdcache_refresh_attrs() overwrites request_mask with the given one,
losing this relationship. Fix it up to make sure the bit is always set
when the pointer is non-NULL.
Change-Id: I15c556e806184ba42ed41519618f74a8aa66c6df
Signed-off-by: Daniel Gryniewicz <dang(a)redhat.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c
1 file changed, 5 insertions(+), 1 deletion(-)
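As a rough illustration of the invariant described in the commit message, a
minimal sketch of the fix (not the actual patch; the surrounding structure
and variable names are assumed):

    /* after request_mask has been overwritten with the caller's mask,
     * restore the invariant: the ATTR_ACL bit must be set whenever
     * the attrs.acl pointer is non-NULL */
    if (entry->attrs.acl != NULL)
            entry->attrs.request_mask |= ATTR_ACL;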
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/80/446980/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446980
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I15c556e806184ba42ed41519618f74a8aa66c6df
Gerrit-Change-Number: 446980
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Gryniewicz <dang(a)redhat.com>
Gerrit-MessageType: newchange
5 years, 9 months
V2.7.2 Crash in mdcache_readdir_chunked
by Rungta, Vandana
Testing with version 2.7.2 of Ganesha with the following patch applied.
https://github.com/nfs-ganesha/nfs-ganesha/commit/23c05a5a3e37a8bd960073e...
Evidence in the log files points to the crash being caused by mdc_lookup_uncached freeing a dirent that mdcache_readdir_chunked continues to use. In mdcache_avl_insert, an existing dirent is found, so the old dirent is removed and freed and the new one inserted. mdcache_readdir_chunked still holds a pointer to the freed dirent.
I suggest a change in mdcache_readdir_chunked (circa line 3046): if the call to mdc_lookup_uncached returns success, jump back to the again label to re-fetch the dirent. (I am not sure which state needs to be restored before jumping back to again.)
        status = mdc_lookup_uncached(directory, dirent->name,
                                     &entry, NULL);

        if (FSAL_IS_ERROR(status)) {
                . . .
                return status;
        }

+       mdcache_put(entry);    /* drop the ref returned by mdc_lookup_uncached */
+       first_pass = true;     /* restart the chunk scan */
+       chunk = NULL;
+       goto again;            /* re-fetch the dirent, which may have been freed */
}
Relevant lines from the log file:
mdcache_handle.c:557 :mdcache_readdir :NFS READDIR :DEBUG :NFS READDIR: DEBUG: Calling mdcache_readdir_chunked whence=0
mdcache_helpers.c:2934 :mdcache_readdir_chunked :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: found dirent in cached chunk 0x51e375e0 dirent 0x4a2eadf0 created-on-rd-1
mdcache_helpers.c:2976 :mdcache_readdir_chunked :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Lookup by key for created-on-rd-1 failed, lookup by name now
(gets the write lock and repeats)
mdcache_helpers.c:663 :mdcache_new_entry :INODE :DEBUG :Adding a REGULAR_FILE, entry=0x17ab9150
mdcache_helpers.c:758 :mdcache_new_entry :INODE :F_DBG :New entry 0x17ab9150 added with fh_hk.key hk=d68582e4df05b9f5 fsal=0x7f4e48b0ed20 key=0xfb956a03000000000100
mdcache_handle.c:112 :mdcache_alloc_and_check_handle :INODE :F_DBG :lookup Created entry 0x17ab9150 FSAL FOO for created-on-rd-1
mdcache_helpers.c:1447 :mdcache_dirent_add :INODE :F_DBG :Add dir entry created-on-rd-1
mdcache_avl.c:327 :mdcache_avl_insert :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Insert dir entry 0x51731610 created-on-rd-1
mdcache_avl.c:385 :mdcache_avl_insert :NFS READDIR :DEBUG :NFS READDIR: DEBUG: Already existent when inserting new dirent on entry=0x329efe50 name=created-on-rd-1
mdcache_avl.c:406 :mdcache_avl_insert :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Keys for created-on-rd-1 don't match v=hk=d68582e4df05b9f5 fsal=0x7f4e48b0ed20 key=0xfb956a03000000000100 v2=hk=41227e96857f4696 fsal=0x7f4e48b0ed20 key=0xe08a5903000000000100
mdcache_avl.c:171 :unchunk_dirent :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Unchunking 0x4a2eadf0 created-on-rd-1
mdcache_avl.c:249 :mdcache_avl_remove :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Just freed dirent 0x4a2eadf0 from chunk 0x51e375e0 parent 0x329efe50
mdcache_avl.c:373 :mdcache_avl_insert :NFS READDIR :F_DBG :NFS READDIR: FULLDEBUG: Inserted dirent created-on-rd-1 with ckey hk=d68582e4df05b9f5 fsal=0x7f4e48b0ed20 key=0xfb956a03000000000100
The crash occurred during long-running tests with a Windows client doing a readdir over 5 million files while other threads do IO.
Reproduced with fsal, nfs_readdir, and inode_cache debug turned on.
I am happy to provide any additional debug info from the logs that will help.
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `bin/ganesha.nfsd -f etc/ganesha/ganesha.conf -p var/run/ganesha.pid -F'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000005146e8 in display_opaque_bytes (dspbuf=0x7f4e416a35b0, value=0x1f72a01d349d6820,
len=1219554592) at /src/src/log/display.c:364
364 /src/src/log/display.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install sgw-nfs-ganesha-2.0.32.0-1.x86_64
(gdb) bt
#0 0x00000000005146e8 in display_opaque_bytes (dspbuf=0x7f4e416a35b0, value=0x1f72a01d349d6820,
len=1219554592) at /src/src/log/display.c:364
#1 0x000000000053a8be in display_mdcache_key (dspbuf=0x7f4e416a35b0, key=0x6096508)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:858
#2 0x000000000053a9cd in mdcache_find_keyed_reason (key=0x6096508, entry=0x7f4e416a3738,
reason=MDC_REASON_SCAN) at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:890
#3 0x0000000000542603 in mdcache_readdir_chunked (directory=0x329efe50, whence=0, dir_state=0x7f4e416a3900,
cb=0x43225c <populate_dirent>, attrmask=122830, eod_met=0x7f4e416a3ffb)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2968
#4 0x0000000000530387 in mdcache_readdir (dir_hdl=0x329efe88, whence=0x7f4e416a38e0,
dir_state=0x7f4e416a3900, cb=0x43225c <populate_dirent>, attrmask=122830, eod_met=0x7f4e416a3ffb)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:559
#5 0x0000000000432b83 in fsal_readdir (directory=0x329efe88, cookie=0, nbfound=0x7f4e416a3ffc,
eod_met=0x7f4e416a3ffb, attrmask=122830, cb=0x493020 <nfs3_readdirplus_callback>, opaque=0x7f4e416a3fb0)
at /src/src/FSAL/fsal_helper.c:1164
#6 0x0000000000492e79 in nfs3_readdirplus (arg=0x458a3d48, req=0x458a3640, res=0xee64330)
at /src/src/Protocols/NFS/nfs3_readdirplus.c:310
#7 0x0000000000457d0e in nfs_rpc_process_request (reqdata=0x458a3640)
at /src/src/MainNFSD/nfs_worker_thread.c:1328
(gdb) select-frame 3
(gdb) print *dirent
$1 = {chunk_list = {next = 0x0, prev = 0xc1}, chunk = 0x41e3a6a0, node_name = {left = 0x4a2eade0, right = 0x0,
parent = 0}, node_ck = {left = 0x0, right = 0x2, parent = 0}, node_sorted = {left = 0x0, right = 0x0,
parent = 0}, ck = 0, eod = false, namehash = 0, ckey = {hk = 0, fsal = 0xc9e3c5e3b0667c56, kv = {
addr = 0x1f72a01d349d6820, len = 139974203731232}}, flags = 0, name = 0x0, name_buffer = 0x6096538 ""}
5 years, 9 months
expire_time_attr
by Marc Eshel
Why is the expire_time_attr export config parameter not respected when
attributes are being refreshed? I don't see where we check
expire_time_attr before calling the FSAL getattr. Did I miss it?
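For reference, the kind of check being asked about might look something like
this (a hedged sketch only, with assumed field names such as attr_time, not
ganesha's actual code; the real check, if present, would live in the MDCACHE
attribute-refresh path):

    /* skip the FSAL getattr if the cached attributes are still fresh;
     * assumes expire_time_attr > 0 is a TTL in seconds and 0 means
     * "do not trust cached attributes" */
    time_t now = time(NULL);

    if (op_ctx->ctx_export->expire_time_attr > 0 &&
        now - entry->attr_time < op_ctx->ctx_export->expire_time_attr)
            return fsalstat(ERR_FSAL_NO_ERROR, 0); /* use cached attrs */
    /* otherwise fall through and call the FSAL getattr */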
Thanks, Marc.
5 years, 9 months
Re: [Nfs-ganesha-devel] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced
by Jeff Layton
On Mon, 2019-03-04 at 09:11 -0500, Jeff Layton wrote:
> On Fri, 2019-03-01 at 15:49 +0000, David C wrote:
> > Hi All
> >
> > Exporting cephfs with the CEPH_FSAL
> >
> > I set the following on a dir:
> >
> > setfattr -n ceph.quota.max_bytes -v 100000000 /dir
> > setfattr -n ceph.quota.max_files -v 10 /dir
> >
> > From an NFSv4 client, the quota.max_bytes appears to be completely ignored; I can go GBs over the quota in the dir. The quota.max_files DOES work, however: if I try to create more than 10 files, I get "Error opening file 'dir/new file': Disk quota exceeded" as expected.
> >
> > From a fuse mount on the same server that is running nfs-ganesha, I've confirmed ceph.quota.max_bytes is enforcing the quota: I'm unable to copy more than 100MB into the dir.
> >
> > According to [1] and [2] this should work.
> >
> > Cluster is Luminous 12.2.10
> >
> > Package versions on nfs-ganesha server:
> >
> > nfs-ganesha-rados-grace-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> > libcephfs2-13.2.2-0.el7.x86_64
> > ceph-fuse-12.2.10-0.el7.x86_64
> >
> > My Ganesha export:
> >
> > EXPORT
> > {
> > Export_ID=100;
> > Protocols = 4;
> > Transports = TCP;
> > Path = /;
> > Pseudo = /ceph/;
> > Access_Type = RW;
> > Attr_Expiration_Time = 0;
> > #Manage_Gids = TRUE;
> > Filesystem_Id = 100.1;
> > FSAL {
> > Name = CEPH;
> > }
> > }
> >
> > My ceph.conf client section:
> >
> > [client]
> > mon host = 10.10.10.210:6789, 10.10.10.211:6789, 10.10.10.212:6789
> > client_oc_size = 8388608000
> > #fuse_default_permission=0
> > client_acl_type=posix_acl
> > client_quota = true
> > client_quota_df = true
> >
> > Related links:
> >
> > [1] http://tracker.ceph.com/issues/16526
> > [2] https://github.com/nfs-ganesha/nfs-ganesha/issues/100
> >
> > Thanks
> > David
> >
>
> It looks like you're having ganesha do the mount as "client.admin", and
> I suspect that may allow you to bypass quotas. You may want to try
> creating a cephx user with fewer privileges, have ganesha connect as that
> user, and see if it changes things.
>
Actually, this may be wrong info.
How are you testing writing past the quota? Are you using O_DIRECT I/O? If
not, it may just be that you're seeing the effect of the NFS client caching
writes.
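One way to rule that out (paths and sizes here are just examples) is to
bypass or flush the client-side cache while testing:

    # write with O_DIRECT so the NFS client cannot cache the data
    dd if=/dev/zero of=/mnt/nfs/dir/testfile bs=1M count=200 oflag=direct

    # or force the data to the server before judging quota behavior
    dd if=/dev/zero of=/mnt/nfs/dir/testfile bs=1M count=200 conv=fsync

Note that with buffered writes, a quota error may only surface once the data
is flushed to the server, e.g. at fsync or close time.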
--
Jeff Layton <jlayton(a)redhat.com>
5 years, 9 months
Correct usage of nfs3_read_cb callback?
by Bjorn Leffler
I'm implementing a new FSAL. At the end of a successful read2() function, I
call the callback as follows:
void myfsal_read2(struct fsal_obj_handle *obj_hdl,
                  bool bypass,
                  fsal_async_cb done_cb,
                  struct fsal_io_arg *read_arg,
                  void *caller_arg)
{
        // ... read data ...
        fsal_status_t status = fsalstat(ERR_FSAL_NO_ERROR, 0);
        done_cb(obj_hdl, status, read_arg, caller_arg);
}
This generates the following error in src/Protocols/NFS/nfs_proto_tools.c,
line 213:
nfs_RetryableError :NFS3 :CRIT :Possible implementation error:
ERR_FSAL_NO_ERROR managed as an error
From the client side, read/write operations work as expected. If I don't
call the callback function, the NFS operation doesn't complete.
What is the correct usage of the callback functions after successful
operations?
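For contrast, I assume a failure would be reported through the same callback
with a non-success status, something like this (error code chosen
arbitrarily):

        fsal_status_t status = fsalstat(ERR_FSAL_IO, 0);
        done_cb(obj_hdl, status, read_arg, caller_arg);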
Thanks,
Bjorn
5 years, 9 months
Change in ...nfs-ganesha[next]: FSAL_GLUSTER: Fix a fd ref leak in lock_op2
by Soumya (GerritHub)
Soumya has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446592 )
Change subject: FSAL_GLUSTER: Fix a fd ref leak in lock_op2
......................................................................
FSAL_GLUSTER: Fix a fd ref leak in lock_op2
In glusterfs_lock_op2(), if the associated open_state has an open fd,
we dup the fd using glfs_dup() and re-use it. glfs_dup() allocates a
new glfd object which takes a ref on the underlying fd maintained by
the gfapi stack. This ref needs to be released after processing the
fop; otherwise it leads to an fd leak. This patch addresses that.
Change-Id: I264e4e957791ea88caaba9e7f3ae85157ce2dcbf
Signed-off-by: Soumya Koduri <skoduri(a)redhat.com>
---
M src/FSAL/FSAL_GLUSTER/handle.c
1 file changed, 4 insertions(+), 0 deletions(-)
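As a rough sketch of the pattern the commit message describes (not the
actual patch; variable names are assumed), the dup'd glfd has to be released
on every path once the lock fop completes:

    struct flock flock = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int retval;
    glfs_fd_t *my_fd;

    my_fd = glfs_dup(state_fd);          /* takes its own ref on the fd */
    if (my_fd == NULL)
            return fsalstat(ERR_FSAL_IO, errno);

    retval = glfs_posix_lock(my_fd, F_SETLK, &flock);

    glfs_close(my_fd);                   /* release the ref glfs_dup() took */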
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/92/446592/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446592
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I264e4e957791ea88caaba9e7f3ae85157ce2dcbf
Gerrit-Change-Number: 446592
Gerrit-PatchSet: 1
Gerrit-Owner: Soumya <skoduri(a)redhat.com>
Gerrit-MessageType: newchange
5 years, 10 months