Ganesha 2.3 uses the same open fd for all the NFS client threads in the above
program. Ganesha 2.7 may have multiple fds (this depends on the client
behavior though) with NFSv4. Can they replicate this with NFSv3? If so, then
this is not the issue. If it happens only with NFSv4, then that might be
the issue. If GPFS has some way of restricting multiple SYNCs to the same
fd, then Ganesha 2.7's use of multiple fds might be defeating it, leading to
multiple SYNCs. Ideally, restricting multiple SYNCs at the file level in GPFS
should solve the issue, assuming this is the case.
As far as SYNCs go, both ganesha 2.3 and ganesha 2.7 would just submit them
to the file system.
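One quick way to check that hypothesis with Ganesha out of the picture would
be a small test run directly on the GPFS node that compares the two fd
patterns. Below is a minimal sketch, not the customer's program; the path,
thread count and sizes are placeholder values I made up:

/*
 * Sketch: compare one shared fd vs one fd per thread directly on GPFS.
 *   SHARED_FD=1 -> all threads write/fsync through one open fd
 *                  (roughly what ganesha 2.3 does for these writes)
 *   SHARED_FD=0 -> each thread has its own fd on the same file
 *                  (roughly what ganesha 2.7 may do with NFSv4)
 * Build with: cc -O2 -pthread fd_sync_test.c -o fd_sync_test
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define NUM_THREAD  8
#define CHUNK       (1024 * 1024)   /* 1MB per write, as in the test program */
#define PER_THREAD  64              /* number of 1MB writes per thread */
#define SHARED_FD   0               /* set to 1 to share a single fd */

struct targ { int fd; off_t offset; };

static void *writer(void *p)
{
    struct targ *t = p;
    char *buf = malloc(CHUNK);
    int i;

    memset(buf, 'x', CHUNK);
    for (i = 0; i < PER_THREAD; i++) {
        pwrite(t->fd, buf, CHUNK, t->offset); /* pwrite: no lseek race on a shared fd */
        fsync(t->fd);                         /* one SYNC per 1MB, as in the test */
        t->offset += CHUNK;
    }
    free(buf);
    return NULL;
}

int main(void)
{
    const char *path = "/gpfs/fs1/sync_test.bin";   /* adjust to the GPFS mount */
    pthread_t tid[NUM_THREAD];
    struct targ targs[NUM_THREAD];
    int i, shared = SHARED_FD ? open(path, O_CREAT | O_RDWR, 0666) : -1;

    for (i = 0; i < NUM_THREAD; i++) {
        targs[i].fd = SHARED_FD ? shared : open(path, O_CREAT | O_RDWR, 0666);
        targs[i].offset = (off_t)i * PER_THREAD * CHUNK;
        pthread_create(&tid[i], NULL, writer, &targs[i]);
    }
    for (i = 0; i < NUM_THREAD; i++) {
        pthread_join(tid[i], NULL);
        if (!SHARED_FD)
            close(targs[i].fd);
    }
    if (SHARED_FD)
        close(shared);
    return 0;
}

If the shared-fd run behaves noticeably better than the per-thread-fd run on
GPFS itself, that would support the multiple-fds theory above.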
On Fri, Oct 30, 2020 at 6:43 AM Madhu P Punjabi <madhu.punjabi(a)in.ibm.com>
wrote:
Hi All,
There is an ongoing performance issue for one customer who recently moved
from ganesha 2.3 to ganesha 2.7. They say they saw better performance with
ganesha 2.3.
To measure performance they wrote the program below, which writes 50GB of
data. In the program multiple threads run at the same time; each thread
opens the same file and writes its share of the data in 1MB chunks.
They ran this program from two places: 1) directly on the GPFS node, writing
data to the file system, and 2) from an old RHEL 5.5 NFS client.
The function which gets invoked inside each thread is:
void *write_data(void *data) {
    int i;
    struct thread_data *tdata = (struct thread_data *)data;

    for (i = 0; i < tdata->mb; i++) {
        lseek(tdata->fd, tdata->offset, 0);
        write(tdata->fd, tdata->data, SIZE);
        fsync(tdata->fd);
        tdata->offset += SIZE;
    }
    return NULL;
}

int main(int argc, char *argv[]) {
    for (i = 0; i < NUM_THREAD; i++) {
        ....
        // open output file
        char filepath[256];
        tdata[i].mb = atoi(argv[2]) / NUM_THREAD;
        tdata[i].offset = tdata[i].mb * SIZE * (off_t)i;
        sprintf(filepath, "%s/dac.bin", argv[1]);
        tdata[i].fd = open(filepath,
                           O_CREAT|O_RDWR|O_DIRECT|O_SYNC|O_LARGEFILE, 0666);
        if (tdata[i].fd < 0) {
            return(-1);
        }

        // write data
        pthread_create(&pthread[i], NULL, &write_data, &tdata[i]);
    }

    for (i = 0; i < NUM_THREAD; i++) {
        pthread_join(pthread[i], NULL);
        close(tdata[i].fd);
    }
}
*Does the above program for measuring performance look problematic?* In the
program multiple threads run in parallel, open the same file, and do write()
followed by fsync() to it.
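If the concern is the per-1MB fsync pattern itself, one way to quantify it
(just a measurement idea, not a recommendation) would be a variant of
write_data that syncs once per thread at the end. It reuses struct
thread_data and SIZE from the program above and uses pwrite instead of
lseek+write; for it to actually avoid per-write syncing, the file would also
have to be opened without O_SYNC in main:

/* Measurement-only variant of write_data: same I/O pattern, but the data is
 * flushed once per thread instead of after every 1MB write. Reuses
 * struct thread_data and SIZE from the program above; assumes the fd was
 * opened without O_SYNC so writes are not individually synchronous. */
void *write_data_single_sync(void *data)
{
    struct thread_data *tdata = (struct thread_data *)data;
    int i;

    for (i = 0; i < tdata->mb; i++) {
        pwrite(tdata->fd, tdata->data, SIZE, tdata->offset);
        tdata->offset += SIZE;
    }
    fsync(tdata->fd);   /* one sync per thread instead of one per write */
    return NULL;
}

Comparing the run times of the two variants would show how much of the
elapsed time comes from the per-write synchronization.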
They collected GPFS (file system) statistics roughly every 2 minutes while
the program ran.
In the first case, when they ran the write program directly on the GPFS
node, the average time taken to write data to the file system stayed almost
the same across every 2-minute statistics interval.
But in the second case, when they ran the program from the NFS client, the
average time taken to write data to the file system kept increasing from one
2-minute interval to the next. So their view is that when ganesha is used,
the average time taken to write data to disk keeps increasing.
We checked with our GPFS file system team, who said that "this is happening
because there are multiple threads performing writes to the same file with
DSYNC options" and suggested checking whether Ganesha has config parameters
to avoid DSYNC on writes and to limit the number of concurrent writes to the
same file.
*About avoiding DSYNC: they are already using NFS_Commit=TRUE with ganesha
2.7 (which has the fix for NFS_Commit), so I do not understand what avoiding
DSYNC on writes from the Ganesha side means here. Any thoughts?*
*And what do you think about the comment about having a config parameter in
ganesha to limit concurrent writes to the same file?*
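Just to see on the application side what "limit concurrent writes to the
same file" would buy, the test program could serialize write+fsync with a
mutex as a rough stand-in for a single concurrent writer per file. A sketch,
again reusing struct thread_data and SIZE from the program above; file_lock
is a test-only global, not anything Ganesha or GPFS provides:

/* Measurement-only variant: serialize write+fsync across threads to
 * approximate a single concurrent writer per file. */
static pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;

void *write_data_serialized(void *data)
{
    struct thread_data *tdata = (struct thread_data *)data;
    int i;

    for (i = 0; i < tdata->mb; i++) {
        pthread_mutex_lock(&file_lock);
        pwrite(tdata->fd, tdata->data, SIZE, tdata->offset);
        fsync(tdata->fd);
        pthread_mutex_unlock(&file_lock);
        tdata->offset += SIZE;
    }
    return NULL;
}

If this serialized run behaves like the direct-on-GPFS case, that would
suggest concurrency on the same file is indeed the factor the GPFS team
pointed at.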
*Frank, Dan*
I had asked a similar question about this on IRC yesterday. Please let me
know if you have any other thoughts, considering the additional new
information above.
Thanks,
Madhu Thorat.