Change in ...nfs-ganesha[next]: Implement a single layer of lookup_path.
by Solomon Boulos (GerritHub)
Solomon Boulos has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487442 )
Change subject: Implement a single layer of lookup_path.
......................................................................
Implement a single layer of lookup_path.
Now correctly fills in the attributes from the response, handles root lookups
with GETATTR3, and attempts to handle "." and ".." path lookups. It's unclear
whether any of this state management (mostly holding onto pointers) is safe.
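To give a feel for the export op being implemented, here is a rough, hypothetical sketch of a single-layer lookup_path with "." and ".." handling. The signature follows the usual FSAL export-op shape as I recall it, and the helper names (proxyv3_getattr_root, proxyv3_lookup_child) and attribute plumbing are illustrative stand-ins, not the code in this change:

#include <string.h>
#include "fsal_api.h"

/* Illustrative stand-ins for the real PROXY_V3 RPC plumbing. */
fsal_status_t proxyv3_getattr_root(struct fsal_export *exp_hdl,
                                   struct fsal_obj_handle **handle,
                                   struct attrlist *attrs_out);
fsal_status_t proxyv3_lookup_child(struct fsal_export *exp_hdl,
                                   const char *name,
                                   struct fsal_obj_handle **handle,
                                   struct attrlist *attrs_out);

/* Sketch only: one layer of lookup, not the actual patch. */
fsal_status_t proxyv3_lookup_path(struct fsal_export *exp_hdl,
                                  const char *path,
                                  struct fsal_obj_handle **handle,
                                  struct attrlist *attrs_out)
{
        /* Root, ".", and ".." (clamped at the export root) all resolve to
         * the root filehandle from MOUNT: GETATTR3, no LOOKUP3 needed. */
        if (strcmp(path, "/") == 0 || strcmp(path, ".") == 0 ||
            strcmp(path, "..") == 0)
                return proxyv3_getattr_root(exp_hdl, handle, attrs_out);

        /* Single layer: one LOOKUP3 of `path` relative to the root,
         * filling attrs_out from the obj_attributes in the reply. */
        return proxyv3_lookup_child(exp_hdl, path, handle, attrs_out);
}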
Change-Id: Ieab9e84c317e267fc876ecd91cbf022856546255
Signed-off-by: Solomon Boulos <boulos(a)google.com>
---
M src/FSAL/FSAL_PROXY_V3/CMakeLists.txt
M src/FSAL/FSAL_PROXY_V3/main.c
M src/FSAL/FSAL_PROXY_V3/proxyv3_fsal_methods.h
M src/FSAL/FSAL_PROXY_V3/rpc.c
A src/FSAL/FSAL_PROXY_V3/utils.c
5 files changed, 421 insertions(+), 26 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/42/487442/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487442
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ieab9e84c317e267fc876ecd91cbf022856546255
Gerrit-Change-Number: 487442
Gerrit-PatchSet: 1
Gerrit-Owner: Solomon Boulos <boulos(a)google.com>
Gerrit-MessageType: newchange
Change in ...nfs-ganesha[next]: Pass along user credentials when supplied.
by Solomon Boulos (GerritHub)
Solomon Boulos has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487441 )
Change subject: Pass along user credentials when supplied.
......................................................................
Pass along user credentials when supplied.
It's pretty easy to accept Ganesha's pre-computed user credentials
(uid, gid, etc.). This was also the moment to finally start hoisting the
RPC client initialization into a dedicated init function for one-time
setup.
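As a sketch of what "accepting Ganesha's pre-computed credentials" amounts to: the caller's uid/gid/group list just gets attached to the RPC client handle as AUTH_UNIX. This uses the classic SunRPC client API (header paths differ under TIRPC), and the mapping comments refer to the caller_uid / caller_gid / caller_glen / caller_garray fields Ganesha keeps; it is an illustration, not the wiring this patch actually does:

#include <rpc/rpc.h>            /* CLIENT, authunix_create() */
#include <sys/types.h>
#include <unistd.h>
#include <limits.h>

/* Sketch only: attach the caller's pre-computed credentials to an RPC
 * client handle as AUTH_UNIX.  Not the actual patch. */
static void set_caller_auth(CLIENT *clnt, uid_t uid, gid_t gid,
                            int nglen, gid_t *garray)
{
        char host[HOST_NAME_MAX] = "localhost";

        gethostname(host, sizeof(host));
        if (clnt->cl_auth != NULL)
                auth_destroy(clnt->cl_auth);
        /* uid/gid/nglen/garray correspond to Ganesha's caller_uid,
         * caller_gid, caller_glen, and caller_garray. */
        clnt->cl_auth = authunix_create(host, uid, gid, nglen, garray);
}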
Change-Id: I33d3372d5304368b9098f2893b27506cd009ebfb
Signed-off-by: Solomon Boulos <boulos(a)google.com>
---
M src/FSAL/FSAL_PROXY_V3/main.c
M src/FSAL/FSAL_PROXY_V3/proxyv3_fsal_methods.h
M src/FSAL/FSAL_PROXY_V3/rpc.c
3 files changed, 51 insertions(+), 21 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/41/487441/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487441
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I33d3372d5304368b9098f2893b27506cd009ebfb
Gerrit-Change-Number: 487441
Gerrit-PatchSet: 1
Gerrit-Owner: Solomon Boulos <boulos(a)google.com>
Gerrit-MessageType: newchange
Change in ...nfs-ganesha[next]: MOUNT3 and LOOKUP3 partly working hardcoded inputs
by Solomon Boulos (GerritHub)
Solomon Boulos has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487440 )
Change subject: MOUNT3 and LOOKUP3 partly working hardcoded inputs
......................................................................
MOUNT3 and LOOKUP3 partly working hardcoded inputs
The first thing the NFS Proxy has to do is run MOUNT. Normally this
involves a trip through the portmapper to guess the port (2050 by
default), so we just skip all that and issue the RPCs we need
directly.
To verify that the result is even remotely correct, though, the next
thing Ganesha wants is LOOKUP. So even though this change doesn't
fill in the result from the LOOKUP, nor pass through credentials
properly, it does correctly send the root filehandle returned from
MOUNT (verified by watching the traffic in tshark).
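For reference, the trick for skipping the portmapper with the classic SunRPC client API is simply to hand clnttcp_create() a sockaddr whose port is already filled in; with a zero port it would consult the portmapper first. A minimal sketch under that assumption (the MOUNT3 program and procedure numbers are the well-known ones; Ganesha's actual proxy code builds its RPCs differently):

#include <rpc/rpc.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define MOUNTPROG       100005
#define MOUNTVERS3      3
#define MOUNTPROC3_MNT  1       /* the MNT procedure we'd clnt_call() */

/* Sketch only: a MOUNT3 client aimed straight at a known port.
 * A non-zero sin_port means clnttcp_create() skips the portmapper. */
static CLIENT *mount_client_direct(const char *host_ip, uint16_t port)
{
        struct sockaddr_in addr;
        int sock = RPC_ANYSOCK;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        inet_pton(AF_INET, host_ip, &addr.sin_addr);

        return clnttcp_create(&addr, MOUNTPROG, MOUNTVERS3, &sock, 0, 0);
}

The MNT call itself is then a clnt_call() of MOUNTPROC3_MNT with the export path, and the filehandle in the reply is what later feeds LOOKUP3.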
Change-Id: I0dd3f2129c1dae0079aab7982734a830a2430c39
Signed-off-by: Solomon Boulos <boulos(a)google.com>
---
M src/FSAL/FSAL_PROXY_V3/CMakeLists.txt
M src/FSAL/FSAL_PROXY_V3/main.c
M src/FSAL/FSAL_PROXY_V3/proxyv3_fsal_methods.h
A src/FSAL/FSAL_PROXY_V3/rpc.c
4 files changed, 444 insertions(+), 6 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/40/487440/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487440
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I0dd3f2129c1dae0079aab7982734a830a2430c39
Gerrit-Change-Number: 487440
Gerrit-PatchSet: 1
Gerrit-Owner: Solomon Boulos <boulos(a)google.com>
Gerrit-MessageType: newchange
Is there any limit on DBUS message size?
by Sachin Punadikar
Hello All,
I am facing a crash in the DBUS-related APIs. I am sending the following data from the
Ganesha server to the ganesha_stats utility:
Array of 20 elements consisting of { DBUS_TYPE_STRING, DBUS_TYPE_UINT64, DBUS_TYPE_UINT64, DBUS_TYPE_UINT64 }
Array of 23 elements consisting of { DBUS_TYPE_STRING, DBUS_TYPE_UINT64, DBUS_TYPE_UINT64, DBUS_TYPE_UINT64 }
Array of 72 elements consisting of { DBUS_TYPE_STRING, DBUS_TYPE_UINT64, DBUS_TYPE_UINT64 }
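For context on where dbus_message_iter_append_basic() enters the picture: an array of (string, uint64, uint64, uint64) structs is normally built with the container/iterator API roughly as below. This is the stock libdbus pattern, not the exact code in server_stats.c, and the OOM-signalling return values are ignored for brevity.

#include <dbus/dbus.h>
#include <stdint.h>

/* Generic libdbus pattern for appending an array of (sttt) structs. */
void append_op_array(DBusMessageIter *iter,
                     const char **names, const uint64_t (*vals)[3], int n)
{
        DBusMessageIter array, entry;

        dbus_message_iter_open_container(iter, DBUS_TYPE_ARRAY, "(sttt)",
                                         &array);
        for (int i = 0; i < n; i++) {
                /* Structs take a NULL contained signature. */
                dbus_message_iter_open_container(&array, DBUS_TYPE_STRUCT,
                                                 NULL, &entry);
                dbus_message_iter_append_basic(&entry, DBUS_TYPE_STRING,
                                               &names[i]);
                dbus_message_iter_append_basic(&entry, DBUS_TYPE_UINT64,
                                               &vals[i][0]);
                dbus_message_iter_append_basic(&entry, DBUS_TYPE_UINT64,
                                               &vals[i][1]);
                dbus_message_iter_append_basic(&entry, DBUS_TYPE_UINT64,
                                               &vals[i][2]);
                dbus_message_iter_close_container(&array, &entry);
        }
        dbus_message_iter_close_container(iter, &array);
}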
I am getting the stack trace below.
(gdb) bt
#0 0x00007fb868109c1f in raise () from /lib64/libpthread.so.0
#1 0x0000000000443421 in crash_handler (signo=11, info=0x7fb860ea4130,
ctx=0x7fb860ea4000)
at
/usr/src/debug/gpfs.nfs-ganesha-2.7.5-ibm055.01.el8.x86_64/MainNFSD/nfs_init.c:244
#2 <signal handler called>
#3 0x00007fb86950e01d in _dbus_marshal_read_uint32 () from
/lib64/libdbus-1.so.3
#4 0x00007fb86950e92a in _dbus_marshal_skip_basic () from
/lib64/libdbus-1.so.3
#5 0x00007fb8694f8cb3 in base_reader_next () from /lib64/libdbus-1.so.3
#6 0x00007fb8694f8b9f in _dbus_type_reader_next () from
/lib64/libdbus-1.so.3
#7 0x00007fb8694f8c70 in base_reader_next () from /lib64/libdbus-1.so.3
#8 0x00007fb8694f8d4d in struct_reader_next () from /lib64/libdbus-1.so.3
#9 0x00007fb8694f8b9f in _dbus_type_reader_next () from
/lib64/libdbus-1.so.3
#10 0x00007fb8694f8e78 in array_reader_next () from /lib64/libdbus-1.so.3
#11 0x00007fb8694f8b9f in _dbus_type_reader_next () from
/lib64/libdbus-1.so.3
#12 0x00007fb8694f6e73 in _dbus_header_cache_revalidate () from
/lib64/libdbus-1.so.3
#13 0x00007fb8694f7796 in _dbus_header_get_field_raw () from
/lib64/libdbus-1.so.3
#14 0x00007fb8694fbe0f in _dbus_message_iter_open_signature.part.4 () from
/lib64/libdbus-1.so.3
#15 0x00007fb8694fdde8 in dbus_message_iter_append_basic () from
/lib64/libdbus-1.so.3
#16 0x000000000051cfe2 in server_dbus_client_all_ops (iter=0x7fb860ea5150,
client=0x7fb834002de0)
at
/usr/src/debug/gpfs.nfs-ganesha-2.7.5-ibm055.01.el8.x86_64/support/server_stats.c:2074
#17 0x000000000044901d in gsh_client_all_ops (args=0x7fb860ea51e0,
reply=0x1a446f0, error=0x7fb860ea5230)
at
/usr/src/debug/gpfs.nfs-ganesha-2.7.5-ibm055.01.el8.x86_64/support/client_mgr.c:704
#18 0x000000000055f883 in dbus_message_entrypoint (conn=0x1a44380,
msg=0x1a44540, user_data=0x7efdd0 <cltmgr_interfaces>)
at
/usr/src/debug/gpfs.nfs-ganesha-2.7.5-ibm055.01.el8.x86_64/dbus/dbus_server.c:560
#19 0x00007fb869502be8 in _dbus_object_tree_dispatch_and_unlock () from
/lib64/libdbus-1.so.3
#20 0x00007fb8694f3384 in dbus_connection_dispatch () from
/lib64/libdbus-1.so.3
#21 0x00007fb8694f3748 in _dbus_connection_read_write_dispatch () from
/lib64/libdbus-1.so.3
#22 0x0000000000560433 in gsh_dbus_thread (arg=0x0) at
/usr/src/debug/gpfs.nfs-ganesha-2.7.5-ibm055.01.el8.x86_64/dbus/dbus_server.c:796
#23 0x00007fb8680ff2de in start_thread () from /lib64/libpthread.so.0
#24 0x00007fb867a0ca63 in clone () from /lib64/libc.so.6
If I reduce the number of elements being sent, then everything works fine,
so it looks like I am hitting some DBUS message size limit.
Is there any way to increase the DBUS message size?
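libdbus does enforce per-connection limits (and a hard protocol cap of 128 MiB), and there is an API for inspecting and setting the cap on messages a connection will accept, sketched below. Whether that limit is really what is being hit here is only my assumption; a crash inside _dbus_header_cache_revalidate() while appending looks more like iterator or memory corruption than a clean over-limit rejection.

#include <dbus/dbus.h>
#include <stdio.h>

/* Sketch: inspect and set the per-connection cap on incoming messages.
 * This governs what *this* connection will accept; the bus daemon has
 * its own max_message_size limit in its configuration. */
void bump_max_message_size(DBusConnection *conn)
{
        long cur = dbus_connection_get_max_message_size(conn);

        printf("current max message size: %ld bytes\n", cur);
        /* Example value only. */
        dbus_connection_set_max_message_size(conn, 64L * 1024 * 1024);
}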
--
with regards,
Sachin Punadikar
4 years, 9 months
Ganesha V4-dev.9 delayed
by Frank Filz
With how crazy this week has been, and only two patches to merge, I'm
going to delay the merge until next week.
Thanks
Frank
Change in ...nfs-ganesha[next]: If nfs_start_grace get called with refs taken, like after nfs_get_gra...
by Gaurav (GerritHub)
Gaurav has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487204 )
Change subject: If nfs_start_grace get called with refs taken, like after nfs_get_grace_status and before nfs_put_grace_status. Then we will have grace_status with GRACE_STATUS_CHANGE_REQ until next nfs_start_grace, till then some of RPCs will send JUKEBOX. I don't see reaper taking any action if GRACE_STATUS_CHANGE_REQ is set.
......................................................................
If nfs_start_grace gets called while refs are still held, e.g. after
nfs_get_grace_status and before nfs_put_grace_status, then grace_status keeps
GRACE_STATUS_CHANGE_REQ set until the next nfs_start_grace, and until then
some RPCs will return JUKEBOX. I don't see the reaper taking any action when
GRACE_STATUS_CHANGE_REQ is set.
We should fail nfs_start_grace if there are still refs and we were not
already in grace, and let the caller retry it.
Also reset GRACE_STATUS_CHANGE_REQ, otherwise it stays set until the next
nfs_start_grace.
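To make the intended behavior concrete, here is a rough sketch of the control flow described above. The representation below (a separate ref count and flags) is a deliberate simplification, not the actual grace_status encoding in nfs4_recovery.c, and this is not the patch itself:

#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the grace machinery -- not Ganesha's real types. */
struct grace_state {
        uint32_t refs;          /* held between nfs_get/put_grace_status */
        bool in_grace;
        bool change_req;        /* stands in for GRACE_STATUS_CHANGE_REQ */
};

/* Fail (so the caller retries) instead of latching a change request
 * that nothing will ever clear. */
static bool start_grace(struct grace_state *g)
{
        if (g->refs != 0 && !g->in_grace) {
                g->change_req = false; /* don't leave it set until the
                                        * next nfs_start_grace */
                return false;
        }
        g->in_grace = true;
        g->change_req = false;
        return true;
}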
Change-Id: I19bba7a1150fd7ac1e61e2a597263ea78fc6b730
Signed-off-by: Gaurav Gangalwar <gaurav.gangalwar(a)gmail.com>
---
M src/MainNFSD/nfs_admin_thread.c
M src/SAL/nfs4_recovery.c
M src/include/sal_functions.h
3 files changed, 20 insertions(+), 6 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/04/487204/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487204
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I19bba7a1150fd7ac1e61e2a597263ea78fc6b730
Gerrit-Change-Number: 487204
Gerrit-PatchSet: 1
Gerrit-Owner: Gaurav <gaurav.gangalwar(a)gmail.com>
Gerrit-MessageType: newchange
CephFS with active-active NFS Ganesha
by Michael Bisig
Hi all,
I am moving this issue/question from ceph-users to the nfs-ganesha devel list, as requested by Daniel Gryniewicz (thanks for pointing me to this list).
I might have a configuration issue, or at least a sub-optimally working Ganesha cluster, and I'm hoping you can help me. :)
I am not sure whether my problem is by design, a bug, or just a configuration issue. Anyway, thanks in advance for your help and time!
Specs:
Ceph v14.2.8
Ganesha v3.0 within Docker container running on Ubuntu 18.04 Image
Config: Please find attached my configuration. (other values are default such as GracePeriod or LeaseTime)
Setup:
Two running Ganesha daemons, which I configured in the grace db (with the rados_cluster backend). The db lies in the cephfsmetadata pool in a separate namespace. I added the two nodes to the db using:
ganesha-rados-cluster add a
ganesha-rados-cluster add b (on the right pool and ns in Ceph, for sure)
Both daemons can read/write to the db, and this is fine. They can also clean up rec-XX files after a restart (i.e. deleting them if they are outdated). I can mount the NFS-exposed path via both daemons. So far so good!
Problem:
When I turn off one daemon (e.g. b), i.e. stop the container, the shutdown works smoothly and the db finally shows:
a E
b NE
I assume that all clients connected to b are stale. But I find that all clients connected to a are stale as well (or at least most tasks hang), meaning that I can neither read from nor write to the mounted filesystem. I can still ls the mountpoint, which means it is not completely broken. This cluster state is not cleaned up; waiting for 5 minutes did not change the behavior on ganesha a. I would assume that at least after some period the clients connected to daemon a can read/write as usual. The db entries do not change either.
If daemon b crashes (instead of being shut down), the clients connected to daemon a can still read/write and are not affected by the crash of b, so the crash situation is fine. This is probably related to the fact that daemon b cannot set the NEED flag in the db. After a while, the running daemon a shows a heartbeat warning, which is certainly expected and a very handy message to let you know that something in the cluster is shaky.
Expectation:
I would expect that a proper shutdown of one daemon does not affect the clients connected to the running ganesha a.
Logs are very clean:
# Situation where I stopped daemon b
11/03/2020 15:46:06 : epoch 5e68d0c3 : a : ganesha.nfsd-1[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
11/03/2020 15:46:31 : epoch 5e68d0c3 : a : ganesha.nfsd-1[reaper] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
--> and here it hangs (no GRACE lift appears, even after waiting 5-10 minutes, which is not nice in an active-active environment)
Once I start the daemon again, everything works like a charm! And the logs show only ONE additional line (compared to above):
11/03/2020 15:46:06 : epoch 5e68d0c3 : a : ganesha.nfsd-1[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
11/03/2020 15:46:31 : epoch 5e68d0c3 : a : ganesha.nfsd-1[reaper] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
11/03/2020 15:54:53 : epoch 5e68d0c3 : a : ganesha.nfsd-1[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
I do not have more informative logs (using the default log level FULL_DEBUG); there are no warnings or errors, and everything seems to work just fine!
Any explanation might help to understand the situation.
Kind regards,
Michael
FW: 2nd Announcement: Spring 2020 NFS Bake-a-thon
by Frank Filz
-----Original Message-----
From: linux-nfs-owner(a)vger.kernel.org
[mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Bill Baker
Sent: Wednesday, March 11, 2020 2:57 PM
To: nfsv4(a)ietf.org; linux-nfs <linux-nfs(a)vger.kernel.org>;
fall-2019-bakeathon(a)googlegroups.com
Subject: 2nd Announcement: Spring 2020 NFS Bake-a-thon
Greetings,
Oracle plans on hosting the Spring 2020 Bake-a-thon at the Oracle offices in
Ann Arbor, MI. The event will run from the morning of May 11th through
Friday, May 15th, 2020.
That said, we know that everyone is concerned about COVID-19 and some
companies are restricting travel. Therefore, we need to know how many
people still plan to attend to see if we still have a quorum for holding the
event.
If you plan to attend, please let me know ASAP. If you typically attend or
had planned to attend, but cannot, let me know that too, so we can get an
estimate of the expected turnout.
Watch these lists for updates in the near future.
--
Bill Baker - Oracle NFS development
Implementing RFC 5661 Section 9.5. Issues with Multiple Open-Owners
by Frank Filz
Currently the way Ganesha handles lock owners for NFS v4.x works for many
cases, but there are issues.
Ganesha links each lock owner with an open owner. It also links each lock
stateid with an open stateid. There can be multiple lock owners per open
owner and in parallel multiple lock stateids per open stateid. But we do not
support a single lock owner being tied to multiple open owners, as RFC 5661
Section 9.5, "Issues with Multiple Open-Owners", details.
One thing we could do is stop associating a lock owner with an open owner.
Looking through the code, this linkage seems to be only used in logging.
However, we need a per-lock-owner, per-file entity to hang lock state off of.
This looks like a stateid, however, Section 9.5 suggests there could be
multiple stateids for each lock owner/file pair, and locks SHOULD be
associated with the most recent stateid they were acquired with.
A problem for Ganesha with this is that FSAL_VFS has an open file
description (and file descriptor) per lock stateid. This is what it ties
locks to (using kernel OFD locks) in order to represent the different lock owners.
But we then can't move a lock from one stateid to another (or at least we
can't move an exclusive lock). So while we COULD have one stateid per lock
owner/open owner/file tuple, it doesn't help us.
One possibility is we create an additional object that represents a lock
owner/file association. That entity holds all the FSAL locks for a given
lock owner/file association, while individual locks are also tied to a
specific stateid in the SAL layer. The SAL could easily handle moving locks
from one stateid to another when the stateids represent the same lock owner.
This solution would add an additional ref-counted object. This new object
would be another form of state_t to keep the FSAL logic the same, but would
no longer have an associated open state (which is fine, vfs_lock_op2 should
never need the open state). Each lock state would however refer to the lock
owner state, and thus if a lock state was passed to a read2 or write2 FSAL
method, the FSAL could use any of 3 file descriptors (the open state, the
lock state, or the new lock owner). The lock state though might not have an
fd associated with it.
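To make the shape of that proposal concrete, here is a rough sketch of what such a lock-owner/file object could look like. The names and layout are illustrative only (the real state_t / state owner / fsal_obj_handle plumbing in SAL and the FSALs is considerably more involved), not a design commitment:

#include <stdint.h>

struct state_owner;             /* stand-in for the NFSv4 lock owner */
struct fsal_obj_handle;         /* the file */

/* Illustrative: one ref-counted object per (lock owner, file) pair,
 * shared by every lock stateid that owner acquires on that file. */
struct lock_owner_file_state {
        int32_t refcount;

        struct state_owner *lock_owner;
        struct fsal_obj_handle *obj;

        /* FSAL-private area: e.g. FSAL_VFS would keep the open file
         * description (and its OFD locks) here, so locks never need to
         * migrate between stateids belonging to the same lock owner. */
        void *fsal_private;
};

The point of the extra indirection is that SAL could retire one lock stateid and reattach its locks to a newer stateid for the same lock owner without the FSAL ever having to transfer an exclusive OFD lock between file descriptors.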
I'm trying to think of another way to implement this, and so far this sounds
like the only path that will work.
Frank