Hello,
You may be running into a common problem with NFS: the client gets no indication that it should invalidate its cached directory entries. Unless the client does something in the directory that causes it to receive fresh attributes from the server, it delays updating its cache. Depending on your workload, you may have luck reducing or eliminating the attribute-cache delay. You can eliminate it entirely with the noac mount option, or reduce or eliminate it for directories only with acdirmin and acdirmax (see "man 5 nfs" for details).
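As a sketch based on your mount output (the values are illustrative, not a recommendation): acdirmin=0/acdirmax=0 disables only the directory attribute cache, while noac disables attribute caching entirely and, per nfs(5), also forces application writes to become synchronous, so it carries a larger performance penalty:

```
# /etc/fstab sketch -- disable directory attribute caching only:
192.168.11.90:/prod_web  /data  nfs4  rw,noatime,nodiratime,vers=4.2,hard,acdirmin=0,acdirmax=0  0  0

# Or disable attribute caching entirely (bigger performance hit):
192.168.11.90:/prod_web  /data  nfs4  rw,noatime,nodiratime,vers=4.2,hard,noac  0  0
```

To experiment, unmount and remount with the new options rather than relying on "mount -o remount", since most NFS-specific options cannot be changed on a live remount.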
At least with a single Ganesha instance you aren't also dealing with essentially the same cache-coherence problem between multiple Ganesha instances…
Frank Filz
From: Renaud Fortier [mailto:renaud.fortier.1@ulaval.ca]
Sent: Thursday, April 20, 2023 5:05 AM
To: support@lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Support] intermittent problem with NFS
Good morning,

I have an intermittent problem that I cannot solve, and today I'm asking for your help because I'm out of ideas. I have two Apache/PHP web servers (the clients) that use GlusterFS volumes via NFS (NFS-Ganesha) as data disks. The problem is that files saved to the server by the PHP applications are sometimes not visible. Sometimes only one of the two clients doesn't see them; other times it's both. It seems random, because in general everything works well. When I notice that a file is affected, I go to the directory ("cd /data/folder") and run "ls -l". I see no warnings or errors in the log files (clients and servers).
Do you have any ideas or ways to help me solve this problem?
Here's some information; let me know if you need anything else. Thank you very much.
Versions:
Linux: Debian 10 (clients and servers)
NFS-Ganesha: 4.0
GlusterFS: 10.4

Client mount:
192.168.11.90:/prod_web on /data type nfs4 (rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.11.150,local_lock=none,addr=192.168.11.90)

GlusterFS brick mount:
/dev/mapper/vg3-prod_web on /data/glusterfs/prod_web/brick1 type xfs (rw,noatime,nouuid,attr2,inode64,noquota)
Ganesha.conf:

NFS_CORE_PARAM {
  NFS_Protocols = 4;
  # To eliminate the error: Cannot bind RQUOTA udp6 socket, error (Address already in use)
  Enable_RQUOTA = false;
  RQUOTA_Port = 0;
}

NFS_KRB5 {
  Active_krb5 = false;
}

# You don't want Grace_Period to be less than Lease_Lifetime; if it is, clients
# may not detect a server crash in time to reclaim lost locks during the grace period.
NFSV4 {
  Lease_Lifetime = 15;
  Grace_Period = 15;
}

LOG {
  ## Default log level for all components
  Default_Log_Level = WARN;
}

EXPORT {
  Export_Id = 3;
  Path = "/prod_web";
  Pseudo = "/prod_web";
  Access_Type = RW;
  Squash = No_root_squash;
  Disable_ACL = true;
  Protocols = "4";
  Transports = "UDP","TCP";
  SecType = "sys";
  FSAL {
    Name = "GLUSTER";
    Hostname = localhost;
    Volume = "prod_web";
  }
}
Gluster Volume Info:

Volume Name: prod_web
Type: Replicate
Volume ID: e918bd26-3318-48b3-8902-1a3b1de4f0f3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster1.fsaa.local:/data/glusterfs/prod_web/brick1/brick
Brick2: gluster2.fsaa.local:/data/glusterfs/prod_web/brick1/brick
Brick3: gluster3.fsaa.local:/data/glusterfs/prod_web/brick1/brick
Options Reconfigured:
storage.build-pgfid: on
diagnostics.brick-log-level: WARNING
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.cache-size: 1GB
performance.parallel-readdir: off
performance.read-ahead: off
cluster.readdir-optimize: on
client.event-threads: 4
server.event-threads: 4
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 150000
auth.allow: 192.168.11.150,192.168.11.151,192.168.11.40
performance.nl-cache: on
performance.nl-cache-timeout: 600
diagnostics.client-log-level: WARNING
cluster.server-quorum-type: none
cluster.enable-shared-storage: enable
Renaud Fortier
IT Administrator

Services communs des ressources informatiques, pédagogiques et technologiques (SCRIPT, FSAA, FFGG, SSE)
Université Laval

T 418 656-2131, ext. 404789
Chat with me on Microsoft Teams

Pavillon Paul-Comtois, room 3100-C
Québec (Québec) G1V 0A6

Confidentiality notice