Change in ...nfs-ganesha[next]: MDCACHE - Add MDCACHE {} config block
by Daniel Gryniewicz (GerritHub)
Daniel Gryniewicz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929 )
Change subject: MDCACHE - Add MDCACHE {} config block
......................................................................
MDCACHE - Add MDCACHE {} config block
Add a config block named MDCACHE that is a copy of CACHEINODE. Both can
be configured, but MDCACHE will override CACHEINODE. This allows us to
deprecate CACHEINODE.
Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Signed-off-by: Daniel Gryniewicz <dang(a)fprintf.net>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
M src/config_samples/ceph.conf
M src/config_samples/config.txt
M src/config_samples/ganesha.conf.example
M src/doc/man/ganesha-cache-config.rst
M src/doc/man/ganesha-config.rst
6 files changed, 31 insertions(+), 7 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/29/454929/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Gerrit-Change-Number: 454929
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Gryniewicz <dang(a)redhat.com>
Gerrit-MessageType: newchange
4 years, 2 months
lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK
by Kinglong Mee
The latest ganesha/gluster supports SEEK as specified in
https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#section-15.11
From the given sa_offset, find the next data_content4 of type sa_what
in the file. If the server can not find a corresponding sa_what,
then the status will still be NFS4_OK, but sr_eof would be TRUE. If
the server can find the sa_what, then the sr_offset is the start of
that content. If the sa_offset is beyond the end of the file, then
SEEK MUST return NFS4ERR_NXIO.
For a file with the following file map,
Part 1: HOLE 0x0000000000000000 ---> 0x0000000000600000
Part 2: DATA 0x0000000000600000 ---> 0x0000000000700000
Part 3: HOLE 0x0000000000700000 ---> 0x0000000001000000
SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1, sr_offset:0x70000) from ganesha/gluster;
SEEK(0x700000, SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x70000) from ganesha/gluster.
If an application depends on the lseek result when searching for data, it may enter an infinite loop:
while (1) {
    next_pos = lseek(fd, cur_pos, seek_type);
    /* alternate between looking for data and looking for a hole */
    if (seek_type == SEEK_DATA) {
        seek_type = SEEK_HOLE;
    } else {
        seek_type = SEEK_DATA;
    }
    if (next_pos == -1) {
        return;
    }
    cur_pos = next_pos;
}
In both cases the lseek syscall always gets 0x70000 from the NFS client.
But if the underlying filesystem is ext4/f2fs, or the NFS server is knfsd,
lseek(0x700000, SEEK_DATA) fails with ENXIO.
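For illustration, a scan loop can protect itself against the case above by stopping as soon as lseek stops making progress. The sketch below is in Python and is not taken from any of the projects discussed. `find_data_extents` and `make_fake_seek` are hypothetical names, and the "lenient" model (a seek that echoes the requested offset back instead of failing with ENXIO) is only an assumption about the reported behaviour.

```python
import errno
import os

# Linux values; fall back to the numeric constants if the platform lacks them.
SEEK_DATA = getattr(os, "SEEK_DATA", 3)
SEEK_HOLE = getattr(os, "SEEK_HOLE", 4)


def find_data_extents(seek, start=0):
    """Return [(begin, end), ...] for the data extents found from `start`.

    `seek(offset, whence)` must behave like lseek on one open file: return
    the new offset, or raise OSError(ENXIO) when nothing is found.
    """
    extents = []
    pos = start
    while True:
        try:
            data = seek(pos, SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # no more data after pos
            raise
        hole = seek(data, SEEK_HOLE)
        if hole <= data:
            break  # no progress: stop instead of spinning forever
        extents.append((data, hole))
        pos = hole
    return extents


def make_fake_seek(data_extents, size, strict):
    """Hypothetical in-memory stand-in for lseek on a sparse file.

    strict=True raises ENXIO when no data follows the offset;
    strict=False echoes the offset back instead (assumed lenient server).
    """
    def seek(offset, whence):
        if offset >= size:
            raise OSError(errno.ENXIO, "offset beyond end of file")
        if whence == SEEK_DATA:
            for begin, end in data_extents:
                if offset < end:
                    return max(offset, begin)
            if strict:
                raise OSError(errno.ENXIO, "no data after offset")
            return offset
        # SEEK_HOLE: end of the enclosing data extent, or the offset itself
        for begin, end in data_extents:
            if begin <= offset < end:
                return end
        return offset
    return seek
```

With a real file descriptor, `functools.partial(os.lseek, fd)` has the same `seek(offset, whence)` shape and can be passed in directly.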
I want to know:
should I fix ganesha/gluster to return ENXIO for the first case, as knfsd does?
Or should I fix the NFS client to return ENXIO for the first case?
thanks,
Kinglong Mee
4 years, 4 months
Re: [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing
by Daniel Gryniewicz
This is not one I've seen before, and a quick look at the code looks
strange. The only assert in that bit is asserting the parent is a
directory, but the parent directory is not something that was passed in
by Ganesha, but rather something that was looked up internally in
libcephfs. This is beyond my expertise, at this point. Maybe some ceph
logs would help?
Daniel
On 7/15/19 10:54 AM, David C wrote:
> This list has been deprecated. Please subscribe to the new devel list at lists.nfs-ganesha.org.
>
>
> Hi All
>
> I'm running 2.7.3 using the CEPH FSAL to export CephFS (Luminous), it
> ran well for a few days and crashed. I have a coredump, could someone
> assist me in debugging this please?
>
> (gdb) bt
> #0 0x00007f04dcab6207 in raise () from /lib64/libc.so.6
> #1 0x00007f04dcab78f8 in abort () from /lib64/libc.so.6
> #2 0x00007f04d2a9d6c5 in ceph::__ceph_assert_fail(char const*, char
> const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.0
> #3 0x00007f04d2a9d844 in ceph::__ceph_assert_fail(ceph::assert_data
> const&) () from /usr/lib64/ceph/libceph-common.so.0
> #4 0x00007f04cc807f04 in Client::_lookup_name(Inode*, Inode*, UserPerm
> const&) () from /lib64/libcephfs.so.2
> #5 0x00007f04cc81c41f in Client::ll_lookup_inode(inodeno_t, UserPerm
> const&, Inode**) () from /lib64/libcephfs.so.2
> #6 0x00007f04ccadbf0e in create_handle (export_pub=0x1baff10,
> desc=<optimized out>, pub_handle=0x7f0470fd4718,
> attrs_out=0x7f0470fd4740) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/export.c:256
> #7 0x0000000000523895 in mdcache_locate_host (fh_desc=0x7f0470fd4920,
> export=export@entry=0x1bafbf0, entry=entry@entry=0x7f0470fd48b8,
> attrs_out=attrs_out@entry=0x0)
> at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1011
> #8 0x000000000051d278 in mdcache_create_handle (exp_hdl=0x1bafbf0,
> fh_desc=<optimized out>, handle=0x7f0470fd4900, attrs_out=0x0) at
> /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1578
> #9 0x000000000046d404 in nfs4_mds_putfh
> (data=data@entry=0x7f0470fd4ea0) at
> /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:211
> #10 0x000000000046d8e8 in nfs4_op_putfh (op=0x7f03effaf1d0,
> data=0x7f0470fd4ea0, resp=0x7f03ec1de1f0) at
> /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:281
> #11 0x000000000045d120 in nfs4_Compound (arg=<optimized out>,
> req=<optimized out>, res=0x7f03ec1de9d0) at
> /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
> #12 0x00000000004512cd in nfs_rpc_process_request
> (reqdata=0x7f03ee5ed4b0) at
> /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
> #13 0x0000000000450766 in nfs_rpc_decode_request (xprt=0x7f02180c2320,
> xdrs=0x7f03ec568ab0) at
> /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
> #14 0x00007f04df45d07d in svc_rqst_xprt_task (wpe=0x7f02180c2538) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
> #15 0x00007f04df45d59a in svc_rqst_epoll_events (n_events=<optimized
> out>, sr_rec=0x4bb53e0) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
> #16 svc_rqst_epoll_loop (sr_rec=<optimized out>) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1014
> #17 svc_rqst_run_task (wpe=0x4bb53e0) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1050
> #18 0x00007f04df465123 in work_pool_thread (arg=0x7f044c0008c0) at
> /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/work_pool.c:181
> #19 0x00007f04dda05dd5 in start_thread () from /lib64/libpthread.so.0
> #20 0x00007f04dcb7dead in clone () from /lib64/libc.so.6
>
> Package versions:
>
> nfs-ganesha-2.7.3-0.1.el7.x86_64
> nfs-ganesha-ceph-2.7.3-0.1.el7.x86_64
> libcephfs2-14.2.1-0.el7.x86_64
> librados2-14.2.1-0.el7.x86_64
>
> I notice in my Ceph log I have a bunch of slow requests around the time
> it went down, I'm not sure if it's a symptom of Ganesha segfaulting or
> if it was a contributing factor.
>
> Thanks,
> David
5 years, 3 months
GPFS LogCrit on lock_op2
by Frank Filz
On the call, I mentioned I would look at bypassing the permission check for
the file owner for the open_func call in fsal_find_fd with open_for_locks.
It turns out there is a difference between FSAL_GPFS and FSAL_VFS.
FSAL_VFS makes the ultimate call to open_by_handle as root, and therefore
even a non-owner of the file will not have a problem opening the file
read/write.
GPFS calls GPFSFSAL_open which calls fsal_set_credentials so if the
permissions do not allow read/write when open_for_locks occurs, then the
file will instead be opened in the same mode as the OPEN stateid.
I think it would be good to evaluate when GPFSFSAL_open actually needs to be
called, and whether open_func should make a more direct call to
fsal_internal_handle2fd.
Frank
5 years, 4 months
FD hard limit exceeded
by Gin Tan
We are trying to figure out the hard limit on FDs. Does nfs-ganesha
impose a limit?
At the moment we are seeing these errors:
25/07/2019 19:27:59 : epoch 5d393d8c : nas2 : ganesha.nfsd-1681[cache_lru] lru_run :INODE LRU :WARN :Futility count exceeded. Client load is opening FDs faster than the LRU thread can close them.
25/07/2019 19:28:16 : epoch 5d393d8c : nas2 : ganesha.nfsd-1681[cache_lru] lru_run :INODE LRU :WARN :Futility count exceeded. Client load is opening FDs faster than the LRU thread can close them.
25/07/2019 19:28:34 : epoch 5d393d8c : nas2 : ganesha.nfsd-1681[svc_54] mdcache_lru_fds_available :INODE LRU :CRIT :FD Hard Limit Exceeded, waking LRU thread.
25/07/2019 19:29:02 : epoch 5d393d8c : nas : ganesha.nfsd-1681[svc_64] mdcache_lru_fds_available :INODE LRU :CRIT :FD Hard Limit Exceeded, waking LRU thread.
The system limit:
$ cat /proc/sys/fs/nr_open
2097152
And the limit for nfs-ganesha process:
$ cat /proc/1681/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             385977               385977               processes
Max open files            2097152              2097152              files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       385977               385977               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
And the number of open files is
# ls /proc/1681/fd | wc -w
12739
I don't see why we are hitting the FD limit when we only have 12739 open FDs.
It is impacting the NFS clients right now: file creation is fine, but clients
can't open an existing file for writing.
I'm using VFS FSAL, and the software versions are:
nfs-ganesha-2.7.5-1.el7.x86_64
nfs-ganesha-vfs-2.7.5-1.el7.x86_64
Thanks.
Gin
5 years, 5 months
Change in ...nfs-ganesha[next]: python3: ganesha_stats: Fix "TypeError: '<' not supported between ins...
by Madhu Thorat (GerritHub)
Madhu Thorat has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/463535 )
Change subject: python3: ganesha_stats: Fix "TypeError: '<' not supported between instances of 'str' and 'int'"
......................................................................
python3: ganesha_stats: Fix "TypeError: '<' not supported between instances of 'str' and 'int'"
"TypeError: '<' not supported between instances of 'str' and 'int'"
python3 gives the above error for commands like:
ganesha_stats iov3 0, ganesha_stats iov4 0, ganesha_stats pnfs 0,
ganesha_stats total 0.
Fixed by casting export_id to an int.
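A minimal reproduction of the error and of the kind of fix described, for illustration only. This is not the patch itself: `parse_export_id` and `compare_without_cast` are hypothetical names, and the assumption is that the export id arrives as a command-line string.

```python
def compare_without_cast(value):
    """Compare value with 0; return the result, or the TypeError text."""
    try:
        return value < 0
    except TypeError as exc:
        return str(exc)


def parse_export_id(args):
    """Return the export id from arguments such as ["iov3", "0"] as an int."""
    # Command-line arguments are str; convert before any numeric comparison.
    return int(args[1])


# A str compared with an int raises under python3.
error_text = compare_without_cast("0")

# After the cast the comparison is well defined.
export_id = parse_export_id(["iov3", "0"])
is_negative = compare_without_cast(export_id)
```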
Change-Id: I9e5df51c8afc7a2e5dbf5d42f033acdc87ac5993
Signed-off-by: Madhu Thorat <madhu.punjabi(a)in.ibm.com>
---
M src/scripts/ganeshactl/Ganesha/glib_dbus_stats.py
M src/scripts/ganeshactl/ganesha_stats.py
2 files changed, 2 insertions(+), 2 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/35/463535/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/463535
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I9e5df51c8afc7a2e5dbf5d42f033acdc87ac5993
Gerrit-Change-Number: 463535
Gerrit-PatchSet: 1
Gerrit-Owner: Madhu Thorat <madhu.punjabi(a)in.ibm.com>
Gerrit-MessageType: newchange
5 years, 5 months
Should Ganesha collect latency stats for client ?
by Sachin Punadikar
Hello All,
I observed that Ganesha collects latencies for read, write & layout (NFSv4.1
& above) operations while collecting stats per client & per export.
I can understand & appreciate the latency collection "per export", but
"per client" it is not suitable. Latency is not client specific, hence
IMHO it should be removed.
I have posted the patch below for this.
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/463529
--
with regards,
Sachin Punadikar
5 years, 5 months
Change in ...nfs-ganesha[next]: Do not count latency for clients
by Sachin Punadikar (GerritHub)
Sachin Punadikar has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/463529 )
Change subject: Do not count latency for clients
......................................................................
Do not count latency for clients
Currently Ganesha keeps track of latency for read, write and layout
(for NFSv4.1 onwards) operations. Ideally we should not collect
latency-related statistics per client, as latency cannot be client specific.
Instead, generic stats such as the number of operations and errors should
be collected.
Change-Id: I708119b413bb6a777ed877706513211aa879ca22
Signed-off-by: Sachin Punadikar <psachin(a)in.ibm.com>
---
M src/support/server_stats.c
1 file changed, 103 insertions(+), 24 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/29/463529/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/463529
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I708119b413bb6a777ed877706513211aa879ca22
Gerrit-Change-Number: 463529
Gerrit-PatchSet: 1
Gerrit-Owner: Sachin Punadikar <psachin(a)in.ibm.com>
Gerrit-MessageType: newchange
5 years, 5 months
Announce Push of V2.9-dev.2
by Frank Filz
Branch next
Tag:V2.9-dev.2
Release Highlights
* Many patches to make string buffer use safer
* MDCACHE make cih_hash_key and cih_set_latched void
* Provide stats collection start time and duration
* reset should update stats couting start time
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
5671f72 Frank S. Filz V2.9-dev.2
3204e25 Sachin Punadikar reset should update stats couting start time
18db505 Sachin Punadikar Provide stats collection start time and duration
2b17cad Frank S. Filz MDCACHE make cih_hash_key and cih_set_latched void
d4b3079 Frank S. Filz Add fd counts to customer visible log messages about open_fd_count
f1a4e1c Frank S. Filz Make sure utf8string is NUL terminated everywhere
ab992aa Frank S. Filz Use (void) cast on sprintf that should be safe.
7988b16 Frank S. Filz Use (void) cast on snprintf that should be safe
011d160 Frank S. Filz Use display_buffer for logging file handles, add new Log macros
08453db Frank S. Filz Fix LogDebugAlt etc to only log component and log level once
a3d5e3a Frank S. Filz Use display_opaque_bytes instead of sprint_mem for clientid4
7cb6738 Frank S. Filz EXPRTS: make LogClientListEntry more efficient
60d682d Frank S. Filz Miscellaneous safer or more efficient string copy
8229312 Frank S. Filz Replace strmaxcpy with checked calls to strlcpy
0008876 Frank S. Filz PROXY: Make safer string handling and use display_opaque_bytes_flags
2e161cf Frank S. Filz Convert hashtable to use display_buffer
f9a02fc Frank S. Filz Add and use gsh_strdupa instead of alloca(strlen()); strcpy();
860fc84 Frank S. Filz cleaner string handling in sm_notify
6861d8e Frank S. Filz 9P: Safe string handling
757c82c Frank S. Filz cleanup gss_credcache - safe string functions, connect logging
0677057 Frank S. Filz Change string parsing in setup_client_saddr to allow more address formats
1f70e76 Frank S. Filz MainNFSD: safer or more efficient string manipulation
744efc9 Frank S. Filz Rados revovery - reduce RADOS_KEY_MAX_LEN to 21
cd0d08f Frank S. Filz SAL recovery - safe string manipulation
19a1157 Frank S. Filz Add gsh concat functions to safely concatenate strings
7bdd1ea Frank S. Filz VFS: safer string copy in xattrs.c
c464634 Frank S. Filz GLUSTER: use memcpy instead of strcpy in lookup_path and use less memory
b66d9d8 Frank S. Filz GLUSTER: get rid of dead code - fs_specific_has
9e806a7 Frank S. Filz Rework client_mgr to use rpc_tools.c/gsh_rpc.h and related cleanup
f9a09c2 Frank S. Filz Add full support of AF_VSOCK to rpc_tools.c and gsh_rpc.h
1fc5698 Frank S. Filz Clean up rpc_tools.c including fixingup sprint_sockip etc.
dc2c0f1 Frank S. Filz nfs_ip_name cache: Larger hostname buffer and other fixes
49d2aff Frank S. Filz display_buffer improvements
5 years, 6 months