> About avoiding DSYNC... They are already using NFS_Commit=TRUE with
> ganesha 2.7 (which has the fix for NFS_Commit), so I am not sure what
> avoiding writes with DSYNC from the Ganesha side means here. Any thoughts?
Ganesha can't control the data placement to disk; it is the client that
controls it, so Ganesha can't do anything here. The app is doing O_DIRECT,
so every write will have to hit the disk.
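For illustration, a minimal standalone sketch of why O_DIRECT behaves this
way (the path and the 4096-byte alignment below are assumptions for the
example, not from the report):

#define _GNU_SOURCE               /* O_DIRECT is a GNU extension on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* O_DIRECT bypasses the page cache, so the write below cannot be
     * absorbed in memory; it is issued straight to storage. Some file
     * systems (e.g. tmpfs) reject O_DIRECT entirely. */
    int fd = open("/gpfs/fs1/direct.bin", O_CREAT | O_WRONLY | O_DIRECT, 0666);
    if (fd < 0)
        return 1;

    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0)   /* aligned buffer required */
        return 1;
    memset(buf, 'x', 4096);

    ssize_t n = pwrite(fd, buf, 4096, 0);        /* goes to the disk */

    free(buf);
    close(fd);
    return n == 4096 ? 0 : 1;
}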
> And what do you think about the comment of having a config parameter in
> ganesha to limit concurrent writes to the same file?
Silly parameter to have in Ganesha. If the backend file system needs it,
they can trivially control it. I suggest you recreate the issue and try
GPFS 423 with ganesha 2.7. The results may surprise you! It is possible
that this is a backend file system issue rather than a Ganesha issue.
Regards, Malahal.
On Fri, Oct 30, 2020 at 9:19 PM Madhu P Punjabi <madhu.punjabi(a)in.ibm.com>
wrote:
Hi Malahal,
Thank you for the response.
>> Can they replicate this with NFSv3? If so, then this is not the issue.
>> If this happens only with NFSv4, then that might be the issue.
The tests were done with NFSv3 only; they did not use NFSv4.
Thanks,
Madhu Thorat
----- Original message -----
From: Malahal Naineni <malahal(a)gmail.com>
To: Madhu P Punjabi <madhu.punjabi(a)in.ibm.com>
Cc: ffilzlnx(a)mindspring.com, dang(a)redhat.com, nfs-ganesha <
devel(a)lists.nfs-ganesha.org>
Subject: [EXTERNAL] Re: Question about any configuration parameter that
can be used to limit the number of concurrent write requests for the same
file from ganesha to the FSAL.
Date: Fri, Oct 30, 2020 9:03 PM
Ganesha 2.3 uses the same open fd for all the NFS client threads in the
above program. Ganesha 2.7 may have multiple fds with NFSv4 (this depends
on the client behavior though). Can they replicate this with NFSv3? If so,
then this is not the issue. If this happens only with NFSv4, then that
might be the issue. If GPFS has some way of restricting multiple SYNCs to
the same fd, then Ganesha 2.7 might be defeating it, leading to multiple
SYNCs. Ideally, restricting multiple SYNCs at the file level in GPFS should
solve the issue, assuming this is the case.
As far as SYNCs go, ganesha 2.3 and ganesha 2.7 would both just submit them
to the file system.
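For illustration only, a minimal sketch (not Ganesha or GPFS code) of what
"restricting multiple SYNCs at the file level" could look like: a
hypothetical per-file lock that serializes fsync(), so at most one SYNC per
file is in flight regardless of how many fds reference the file.

#include <pthread.h>
#include <unistd.h>

/* Hypothetical per-file state; the sync_lock field is invented here
 * for illustration and does not exist in Ganesha or GPFS. */
struct file_state {
    int fd;                       /* any fd referring to the file */
    pthread_mutex_t sync_lock;    /* one lock shared by all writers */
};

/* Writers call this instead of fsync(fd) directly, so at most one
 * SYNC per file is outstanding no matter how many fds are open. */
static int serialized_fsync(struct file_state *fs)
{
    pthread_mutex_lock(&fs->sync_lock);
    int rc = fsync(fs->fd);
    pthread_mutex_unlock(&fs->sync_lock);
    return rc;
}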
On Fri, Oct 30, 2020 at 6:43 AM Madhu P Punjabi <madhu.punjabi(a)in.ibm.com>
wrote:
Hi All,
There is an ongoing performance issue for one customer who recently moved
from ganesha 2.3 to ganesha 2.7. They say they saw better performance with
ganesha 2.3.
To measure performance they wrote the program below, which writes 50GB of
data: multiple threads run at the same time, and each thread opens the same
file and writes 1MB chunks to it.
They ran this program from two places: 1) directly on the GPFS node,
writing to the file system, and 2) from an old RHEL 5.5 NFS client.
The function which gets invoked inside each thread is:
void *write_data(void *data) {
    int i;
    struct thread_data *tdata = (struct thread_data *)data;
    for (i = 0; i < tdata->mb; i++) {
        lseek(tdata->fd, tdata->offset, SEEK_SET); // 0 == SEEK_SET
        write(tdata->fd, tdata->data, SIZE);       // 1MB O_DIRECT|O_SYNC write
        fsync(tdata->fd);                          // force the data to disk
        tdata->offset += SIZE;
    }
    return NULL;                                   // missing in the original
}

int main(int argc, char *argv[]) {
    for (i = 0; i < NUM_THREAD; i++) {
        ....  // declarations of i, tdata[], pthread[] elided in original post
        // open output file: every thread opens the same path, so each
        // thread gets its own fd on the same file
        char filepath[256];
        tdata[i].mb = atoi(argv[2]) / NUM_THREAD;
        tdata[i].offset = tdata[i].mb * SIZE * (off_t)i;  // disjoint regions
        snprintf(filepath, sizeof(filepath), "%s/dac.bin", argv[1]);
        tdata[i].fd = open(filepath,
                           O_CREAT|O_RDWR|O_DIRECT|O_SYNC|O_LARGEFILE, 0666);
        if (tdata[i].fd < 0) {
            return -1;
        }
        // write data
        pthread_create(&pthread[i], NULL, &write_data, &tdata[i]);
    }
    for (i = 0; i < NUM_THREAD; i++) {
        pthread_join(pthread[i], NULL);
        close(tdata[i].fd);
    }
    return 0;
}
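For reference, a harness like this would be built and run roughly as
follows (the source file name and mount path are my assumptions; argv[1] is
the target directory and argv[2] the total MB to write, as in the code):

gcc -D_GNU_SOURCE -pthread -o writetest writetest.c
./writetest /mnt/gpfs_export 51200   # 51200 MB = 50GB total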
*Does the above program for measuring performance look problematic?* In the
program, multiple threads run in parallel, each opening the same file and
doing write() and fsync() on it.
They collected GPFS (file system) statistics every 2 minutes or so while
running the program.
In the first case, when they ran the write program directly on the GPFS
node, the average time taken to write data to the file system stayed almost
the same across every 2-minute statistics interval.
But in the second case, when they ran the program from the NFS client, the
average time taken to write data to the file system kept increasing from
one interval to the next. So their view was that when ganesha was used, the
average time taken to write data to disk kept increasing.
We checked with our GPFS file system team, who said that "this is happening
because there are multiple threads performing writes to the same file with
DSYNC options" and suggested checking whether Ganesha has config parameters
to avoid DSYNC on writes and to limit the number of concurrent writes to
the same file.
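For context, my understanding is that "write with DSYNC" here means
synchronous write semantics, where each write must be stable on disk before
it returns, as opposed to buffered writes followed by a single commit
(which is what NFS_Commit aims for). A minimal sketch of the two patterns
(the function names and flags shown are illustrative, not Ganesha or GPFS
APIs):

#include <fcntl.h>
#include <unistd.h>

/* Pattern 1: synchronous writes. With O_DSYNC (or O_SYNC, as in the
 * test program) every write() returns only after the data is stable,
 * so the file system must flush on each call. */
int write_sync_each(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_DSYNC, 0666);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);  /* stable before it returns */
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/* Pattern 2: buffered writes plus one commit at the end, which is the
 * behavior NFS UNSTABLE writes + COMMIT aim for. */
int write_then_commit(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0666);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);  /* may only reach the page cache */
    int rc = fdatasync(fd);           /* one flush makes it stable */
    close(fd);
    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}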
*About avoiding DSYNC... They are already using NFS_Commit=TRUE with
ganesha 2.7 (which has the fix for NFS_Commit), so I am not sure what
avoiding writes with DSYNC from the Ganesha side means here. Any thoughts?*
*And what do you think about the comment of having a config parameter in
ganesha to limit concurrent writes to the same file?*
*Frank, Dan*
I asked a similar question about this on IRC yesterday. Please let me know
if you have any other thoughts, given the additional new information above.
Thanks,
Madhu Thorat.