lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK
by Kinglong Mee
The latest ganesha/gluster supports seek according to,
https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#section-15.11
From the given sa_offset, find the next data_content4 of type sa_what
in the file. If the server can not find a corresponding sa_what,
then the status will still be NFS4_OK, but sr_eof would be TRUE. If
the server can find the sa_what, then the sr_offset is the start of
that content. If the sa_offset is beyond the end of the file, then
SEEK MUST return NFS4ERR_NXIO.
For a file with the following filemap,
Part 1: HOLE 0x0000000000000000 ---> 0x0000000000600000
Part 2: DATA 0x0000000000600000 ---> 0x0000000000700000
Part 3: HOLE 0x0000000000700000 ---> 0x0000000001000000
SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1, sr_offset:0x700000) from ganesha/gluster;
SEEK(0x700000, SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x700000) from ganesha/gluster.
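The quoted paragraph can be read against the filemap above with a small model. This is only a sketch of one literal reading of the draft text, not ganesha, gluster, or knfsd code. The names `struct extent`, `struct seek_res`, `model_seek`, and `post_map` are hypothetical. Treating an offset equal to the file size as "beyond the end", and leaving sr_offset at sa_offset when sr_eof is set, are both assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum content { CONTENT_DATA, CONTENT_HOLE };  /* stands in for data_content4 */
enum seek_status { SEEK_OK, SEEK_NXIO };      /* NFS4_OK / NFS4ERR_NXIO */

struct extent {
    enum content type;
    uint64_t start;  /* inclusive */
    uint64_t end;    /* exclusive */
};

struct seek_res {
    enum seek_status status;
    bool sr_eof;
    uint64_t sr_offset;
};

/* The extents must be sorted, contiguous, and cover [0, file size). */
struct seek_res model_seek(const struct extent *map, size_t n,
                           uint64_t sa_offset, enum content sa_what)
{
    struct seek_res res = { SEEK_OK, false, sa_offset };
    uint64_t size;

    assert(map != NULL || n == 0);
    size = n ? map[n - 1].end : 0;

    if (sa_offset >= size) {  /* assumption: "beyond" includes == size */
        res.status = SEEK_NXIO;
        return res;
    }
    for (size_t i = 0; i < n; i++) {
        if (map[i].end <= sa_offset || map[i].type != sa_what)
            continue;
        /* First extent of the wanted type that ends after sa_offset. */
        res.sr_offset = map[i].start > sa_offset ? map[i].start : sa_offset;
        return res;
    }
    res.sr_eof = true;  /* no sa_what found: still SEEK_OK, but sr_eof set */
    return res;
}

/* The filemap from the post. */
static const struct extent post_map[] = {
    { CONTENT_HOLE, 0x000000, 0x600000 },
    { CONTENT_DATA, 0x600000, 0x700000 },
    { CONTENT_HOLE, 0x700000, 0x1000000 },
};
```

Under this reading, a SEEK_DATA request at the start of the trailing hole succeeds with sr_eof set, while a SEEK_HOLE request at the same offset succeeds with sr_eof clear and sr_offset equal to the requested offset.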
If an application depends on the lseek result for data searching, it may enter an infinite loop.
while (1) {
    next_pos = lseek(fd, cur_pos, seek_type);
    if (seek_type == SEEK_DATA) {
        seek_type = SEEK_HOLE;
    } else {
        seek_type = SEEK_DATA;
    }
    if (next_pos == -1) {
        return;
    }
    cur_pos = next_pos;
}
The lseek syscall always gets 0x700000 from the nfs client for those two cases,
but if the underlying filesystem is ext4/f2fs, or the nfs server is knfsd,
lseek(0x700000, SEEK_DATA) gets ENXIO.
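Whichever side ends up being fixed, an application can protect itself by checking that each seek makes forward progress. The sketch below is one way to do that, not code from any of the projects mentioned. `count_data_bytes`, `seek_fn`, and the two stand-in seek functions are hypothetical names; the stand-ins hard-code the filemap from this post, and reporting a stalled seek as EIO is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA          /* Linux values, used only if the headers hide them */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Same signature as lseek(2), so lseek itself can be passed in. */
typedef off_t (*seek_fn)(int fd, off_t offset, int whence);

/*
 * Count the data bytes in fd. Returns -1 with errno set on error,
 * including EIO when a seek result makes no forward progress.
 */
off_t count_data_bytes(int fd, seek_fn seek)
{
    off_t pos = 0, total = 0;

    assert(seek != NULL);
    for (;;) {
        off_t data = seek(fd, pos, SEEK_DATA);
        if (data == -1)
            return errno == ENXIO ? total : -1;  /* ENXIO: no more data */

        off_t hole = seek(fd, data, SEEK_HOLE);
        if (hole == -1)
            return -1;
        if (data < pos || hole <= data) {
            errno = EIO;  /* would spin forever; give up instead */
            return -1;
        }
        total += hole - data;
        pos = hole;
    }
}

/* Stand-in that fails with ENXIO when no data follows the offset. */
static off_t seek_enxio_style(int fd, off_t off, int whence)
{
    (void)fd;
    if (off >= 0x1000000) { errno = ENXIO; return -1; }
    if (whence == SEEK_DATA) {
        if (off < 0x600000) return 0x600000;
        if (off < 0x700000) return off;
        errno = ENXIO;
        return -1;
    }
    /* SEEK_HOLE: inside the data extent the next hole starts at 0x700000. */
    return (off >= 0x600000 && off < 0x700000) ? 0x700000 : off;
}

/* Stand-in where SEEK_DATA inside the trailing hole succeeds in place. */
static off_t seek_reported_style(int fd, off_t off, int whence)
{
    if (whence == SEEK_DATA && off >= 0x700000 && off < 0x1000000)
        return off;
    return seek_enxio_style(fd, off, whence);
}
```

With a real descriptor the call would be `count_data_bytes(fd, lseek)`. The second stand-in makes the walker stop with an error at the trailing hole rather than looping.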
I want to know:
should I fix ganesha/gluster to return ENXIO for the first case, as knfsd does?
Or should I fix the nfs client to return ENXIO for the first case?
thanks,
Kinglong Mee
4 years, 4 months
Correct usage of nfs3_read_cb callback?
by Bjorn Leffler
I'm implementing a new FSAL. At the end of a successful read2() function, I
call the callback as follows:
void myfsal_read2(struct fsal_obj_handle *obj_hdl,
                  bool bypass,
                  fsal_async_cb done_cb,
                  struct fsal_io_arg *read_arg,
                  void *caller_arg)
{
    // ... read data ...
    fsal_status_t status = fsalstat(ERR_FSAL_NO_ERROR, 0);
    done_cb(obj_hdl, status, read_arg, caller_arg);
}
This generates the following error in src/Protocols/NFS/nfs_proto_tools.c,
line 213:
nfs_RetryableError :NFS3 :CRIT :Possible implementation error:
ERR_FSAL_NO_ERROR managed as an error
From the client side, read / write operations work as expected. If I don't
call the callback function, the NFS operation doesn't complete.
What is the correct usage of the callback functions after successful
operations?
Thanks,
Bjorn
5 years, 10 months
Announce Push of V2.8-dev.20
by Frank Filz
Branch next
Tag:V2.8-dev.20
Release Highlights
* MDCACHE - Use atomics for readdir flags
* [GPFS] Sanity checks for max no. of V4 ACE's supported
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
c2da133 Frank S. Filz V2.8-dev.20
f8a75a9 Trishali Nayar [GPFS] Sanity checks for max no. of V4 ACE's supported
23c05a5 Daniel Gryniewicz MDCACHE - Use atomics for readdir flags
5 years, 10 months
Re: Unable to Configure Vanilla NFS Ganesha with XFS
by Daniel Gryniewicz
Hi.
I've moved this to the mailing list (devel(a)lists.nfs-ganesha.org) rather
than the admin email address.
What version of Ganesha is this, and where did you get the packages?
Daniel
On 2/26/19 1:44 AM, Nilesh Somani wrote:
> Hi,
>
> I'm trying to configure NFS-Ganesha with XFS. I'm getting the following
> error when I start the service.
> Inkedganesha_log_LI.jpg
> I've already installed nfs-ganesha-xfs package. The output of symbol
> list is as follows:
>
> nm_output.JPG
>
> Could you please let me know how can I solve this? Is there anything
> that I'm doing wrong or missing? I'm using the XFS configuration
> provided as part of sample_configs.
>
> Thank you.
>
> Regards,
> Nilesh Somani
5 years, 10 months
Re: [nfsv4] file size and getattr
by Trond Myklebust
On Wed, 2019-02-27 at 00:13 +0000, Rick Macklem wrote:
> Trond Myklebust wrote:
> [stuff snipped]
> > Please see the Errata ID 2751
> > http://www.rfc-editor.org/errata/eid2751
>
> I'll admit I hadn't seen this errata before. However, it seems to be
> specific to the File Layout. For the Flexible File Layout...
>
> When I look in RFC-8435, I cannot find anything that states that a
> LayoutCommit is only required for case(s) where a Commit to the Storage
> Server is required. Sec. 2.1 clearly states that a Commit to the Storage
> Server is required before the client does a LayoutCommit when the
> write(s) were not done FILE_SYNC. However, I do not see any indication
> that the LayoutCommit is not to be done for the case where the write(s)
> are done FILE_SYNC.
>
> FF_FLAGS_NO_LAYOUTCOMMIT can be used to indicate to a client that
> LayoutCommits are not required, but this does not depend on how
> the write(s) to the Storage Server were done.
>
> The only way a Flexible File layout Metadata server can know what the
> current file size is (when a read/write layout is issued to a client)
> is to do a Getattr to the Storage Server.
> If a client is not required to do a LayoutCommit when the write(s) to
> the Storage Server are done FILE_SYNC, then the Metadata server must do
> Getattr RPCs to the Storage Server whenever it needs an up to date
> file size if a read/write layout is issued to a client.
>
> This can result in a lot of overhead that can be avoided by requiring
> the LayoutCommit to be done by a client after writing to a Storage
> Server, irrespective of the need for a Commit to the Storage Server.
> As such, I would rather not have this errata applied to RFC-8435.
>
Fair enough. I agree that the errata in question only applies to the
pNFS files layout; however, you were talking about RFC5661 and whether
or not we were interpreting that correctly. Since RFC5661 only refers
to the behaviour of the pNFS files layout, I assumed that was what you
were referring to.
For flexfiles we may have a bug in the layoutcommit case. However note
that the counter argument to what you state above is that _if_ the
server requires a layoutcommit before it will acknowledge a file size
change, then pNFS is likely to underperform for applications such as
databases or VMs where each record is required to be written in stable
mode.
IOW: If all writes that need to be stable are also required to be
acknowledged with a layoutcommit (to the MDS), then your ability to
scale out your server will be in doubt.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust(a)hammerspace.com
5 years, 11 months
Re: [nfsv4] file size and getattr
by Rick Macklem
Trond Myklebust wrote:
[stuff snipped]
> Please see the Errata ID 2751 http://www.rfc-editor.org/errata/eid2751
I'll admit I hadn't seen this errata before. However, it seems to be specific to
the File Layout. For the Flexible File Layout...
When I look in RFC-8435, I cannot find anything that states that a LayoutCommit
is only required for case(s) where a Commit to the Storage Server is required.
Sec. 2.1 clearly states that a Commit to the Storage Server is required before the
client does a LayoutCommit when the write(s) were not done FILE_SYNC.
However, I do not see any indication that the LayoutCommit is not to be done
for the case where the write(s) are done FILE_SYNC.
FF_FLAGS_NO_LAYOUTCOMMIT can be used to indicate to a client that
LayoutCommits are not required, but this does not depend on how
the write(s) to the Storage Server were done.
The only way a Flexible File layout Metadata server can know what the
current file size is (when a read/write layout is issued to a client) is to do a
Getattr to the Storage Server.
If a client is not required to do a LayoutCommit when the write(s) to the
Storage Server are done FILE_SYNC, then the Metadata server must do
Getattr RPCs to the Storage Server whenever it needs an up to date file size
if a read/write layout is issued to a client.
This can result in a lot of overhead that can be avoided by requiring the
LayoutCommit to be done by a client after writing to a Storage Server,
irrespective of the need for a Commit to the Storage Server.
As such, I would rather not have this errata applied to RFC-8435.
rick
5 years, 11 months
file size and getattr
by Marc Eshel
What is the file size returned by the NFS server for getattr after an
unstable write, both to the NFS client that did the write and to other NFS
clients?
As far as I know, most file systems will always return the file size that
includes the unstable writes.
Does the NFSv4 spec allow the server to return a file size that doesn't
include the unstable write to the writer or any other NFS client?
Marc.
5 years, 11 months
Change in ...nfs-ganesha[next]: [GPFS] Sanity checks for max no. of V4 ACE's supported
by Name of user not set (GerritHub)
ntrishal(a)in.ibm.com has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446242 )
Change subject: [GPFS] Sanity checks for max no. of V4 ACE's supported
......................................................................
[GPFS] Sanity checks for max no. of V4 ACE's supported
GPFS can support a maximum of 638 entries for NFS V4 ACLs.
Added sanity checks for the same.
Change-Id: I1dc4aade8a08ed93e8201dcadd9e90f0a3cfa067
Signed-off-by: Trishali Nayar <ntrishal(a)in.ibm.com>
---
M src/FSAL/FSAL_GPFS/fsal_convert.c
M src/FSAL/FSAL_GPFS/fsal_internal.c
M src/FSAL/FSAL_GPFS/fsal_internal.h
3 files changed, 21 insertions(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/42/446242/1
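The kind of check this change describes could look roughly like the sketch below. It is not the code from the patch: `GPFS_MAX_V4_ACES`, `struct acl_view`, and `check_acl_entries` are hypothetical names, and only the limit of 638 entries comes from the change description.

```c
#include <assert.h>
#include <stddef.h>

/* Limit stated in the change description; the macro name is hypothetical. */
#define GPFS_MAX_V4_ACES 638

/* Hypothetical, minimal view of an NFS V4 ACL: only the entry count. */
struct acl_view {
    size_t naces;
};

/* Returns 0 if the ACL fits within the limit, -1 if it has too many ACEs. */
int check_acl_entries(const struct acl_view *acl)
{
    assert(acl != NULL);
    return acl->naces > GPFS_MAX_V4_ACES ? -1 : 0;
}
```

A caller would run the check before converting the ACL and fail the operation when it returns -1.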
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/446242
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I1dc4aade8a08ed93e8201dcadd9e90f0a3cfa067
Gerrit-Change-Number: 446242
Gerrit-PatchSet: 1
Gerrit-Owner: ntrishal(a)in.ibm.com
Gerrit-MessageType: newchange
5 years, 11 months
Re: [nfsv4] file size and getattr
by Rick Macklem
I wrote:
[stuff snipped]
>For the pNFS case, when I implemented a pNFS server for FreeBSD, I assumed
>that the Metadata server only needed to return a correct size for a file
>to a client doing a Getattr with a read/write layout after it did a LayoutCommit.
>This implementation worked ok for the FreeBSD client, but did not work
>correctly for the Linux-4.17-rc2 kernel.
>To fix the server implementation to interoperate with the above Linux kernel,
>I had to add a tunable that makes the server get an up to date size for the
>Getattr operation for a client when a read/write layout is issued to the client.
>Just to be clear, I am referring to the case where the Getattr was done by the
>client that holds the read/write layout and not another client.
>(This does result in additional overheads whenever a client holds a read/write
> layout for the file.)
>I can't remember exactly how the Linux client failed, but I suspect it would
>see a premature EOF when the size returned by the MDS was smaller than
>the actual size of the file on the DS.
>
>I honestly don't know if this is a bug in the Linux client or a misinterpretation
>of RFC5661?
A little more info on this.
If I recall correctly, the Linux client only did LayoutCommits after Commits.
As such, no LayoutCommits were done after writes to a DS that were FILE_SYNC
(or stable, if you prefer). Without the LayoutCommit, the pNFS server didn't
know when it needed to get an up to date size.
The FreeBSD client does a LayoutCommit after writing to a DS whenever it is done
writing, such as an fsync() or close() syscall on the file or an unlock of a write lock
on the file.
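The difference between the two client behaviours can be shown with a toy model. It is a sketch only, not FreeBSD or Linux code. All names are hypothetical, the "Linux" policy is only as recalled above, and the assumption that the MDS learns the size solely from a LayoutCommit (unless it queries the DS) is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum stable_how { WRITE_UNSTABLE, WRITE_FILE_SYNC };

enum client_policy {
    LAYOUTCOMMIT_AFTER_COMMIT_ONLY,  /* as recalled for the Linux client */
    LAYOUTCOMMIT_WHEN_DONE_WRITING   /* as described for the FreeBSD client */
};

struct model {
    uint64_t ds_size;   /* size on the DS */
    uint64_t mds_size;  /* size the MDS last learned from a LayoutCommit */
    bool need_commit;   /* unstable writes are outstanding on the DS */
};

/* A write to the DS grows the file there; the MDS is not told. */
void ds_write(struct model *m, uint64_t off, uint64_t len, enum stable_how how)
{
    assert(m != NULL);
    if (off + len > m->ds_size)
        m->ds_size = off + len;
    if (how == WRITE_UNSTABLE)
        m->need_commit = true;
}

/* The client finishes writing (fsync, close, or unlock). */
void client_done_writing(struct model *m, enum client_policy policy)
{
    bool did_commit;

    assert(m != NULL);
    did_commit = m->need_commit;
    m->need_commit = false;  /* Commit to the DS, if one was needed */
    if (policy == LAYOUTCOMMIT_WHEN_DONE_WRITING || did_commit)
        m->mds_size = m->ds_size;  /* LayoutCommit carries the new size */
}

/* query_ds models a tunable that fetches an up to date size from the DS. */
uint64_t mds_getattr_size(const struct model *m, bool query_ds)
{
    assert(m != NULL);
    return query_ds ? m->ds_size : m->mds_size;
}
```

In this model a FILE_SYNC write under the first policy leaves the MDS with a stale size unless it queries the DS, which is the extra overhead mentioned earlier in the thread.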
rick
5 years, 11 months
Still seeing readdir issues in 2.7.1
by Ashish Sangwan
Hi Dang,
We are still seeing the ls hang issue (which was supposed to be fixed by
b8fe6364c61c4ac0c086c67b4d06685beb55bc5f). On top of that there is one
more issue: sometimes we are getting EOF prematurely for readdir.
For example, while running rm -rf *, even though the command finishes
successfully, doing a subsequent ls in the same directory gives a
new listing (it includes entries which were not seen previously). It
seems both of these issues are related.
Do you think the below patch will help? It's untested (yet) and a shot
in the dark. Basically I want to switch back to the behaviour that
mdcache_readdir_cached() had before these two commits:
5dc6a70ed42275a4f6772b9802e79f23dc25fa73 and
654dd706d22663c6ae6029e0c8c5814fe0d6ff6a. What do you think?
diff --git a/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c b/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
index 20d6365..912324f 100644
--- a/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
+++ b/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
@@ -2826,8 +2826,14 @@
PTHREAD_RWLOCK_unlock(&directory->content_lock);
PTHREAD_RWLOCK_wrlock(&directory->content_lock);
has_write = true;
- first_pass = true;
- chunk = NULL;
+ /* reload the chunk after taking write lock */
+ if (chunk &&
+ mdcache_avl_lookup_ck(directory, next_ck, &dirent)) {
+ chunk = dirent->chunk;
+ } else {
+ chunk = NULL;
+ first_pass = true;
+ }
goto again;
}
Thanks,
Ashish
5 years, 11 months