My first question is: does the whole VM freeze, or just Ganesha?
Ganesha's a userspace app, so it can't really freeze the VM; if the VM 
is frozen, then it's a problem with either the host or the guest OS.

If it's just Ganesha freezing, then you should turn on some logging. 
I'd start with NFS4 at DEBUG, and maybe add FSAL at DEBUG after 
that if nothing useful is in the log.  Then, when it freezes, grab the 
log and look at the end to see what's going on.
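
As a rough sketch of what that could look like in ganesha.conf, assuming
you extend the LOG block you already have (the Components names and
levels here are what I'd expect for 2.8; please check them against the
sample config shipped with your packages before relying on them):

```
LOG {
         # Assumption: raise only the NFS4 component at first; add the
         # FSAL line later if the NFS4 output shows nothing useful.
         Components {
                  NFS4 = DEBUG;
                  FSAL = DEBUG;
         }

         # Unchanged from your existing config.
         Facility {
                  name = FILE;
                  destination = "/var/log/ganesha/ganesha.log";
                  enable = active;
         }
}
```

DEBUG on these components can be very chatty under a recording load like
yours, so keep an eye on the disk space under /var/log/ganesha.
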
Daniel
On 10/25/19 2:41 PM, andre.roberge(a)maskicom.net wrote:
 Hi,
 
 I’m running nfs-ganesha on a virtual server with SR-IOV (Intel XP520) whose job is
basically recording 167 channels of TV feed for a time-shift function. The set-up would
run for hours without any issues, then the VM would just freeze for no apparent reason.
 
 My Ceph cluster is running fine and is barely using the available storage; the load on
the server is minimal.
 
 There is no error on the Ceph side; the nfs-ganesha virtual machine just freezes.
 
 The nfs-ganesha server is set up with BGP to the host using frrouting 7 in a leaf/spine
topology, which is working fine.
 
 Here is my mount for the NFS
   
 10.70.0.67:/cephfs/timeshift1 /mnt/cephfs nfs4 noatime,soft,nfsvers=4.1,async,proto=tcp 0 0
 
 Here is my config for Ganesha
 
 NFS_Core_Param
 {
 }
 
 EXPORT_DEFAULTS {
 	Attr_Expiration_Time = 0;
 }
 
 CACHEINODE {
 	Dir_Chunk = 0;
 
 	NParts = 1;
 	Cache_Size = 1;
 }
 
 
 EXPORT
 {
 	Export_id=20133;
 
 	Path = "/";
 
 	Pseudo = /cephfs;
 
 	Access_Type = RW;
 
 	Protocols = 3,4;
 
 	Transports = TCP;
 
 	SecType = sys,krb5,krb5i,krb5p;
 
 	Squash = No_Root_Squash;
 
 	Attr_Expiration_Time = 0;
 
 	FSAL {
 		Name = CEPH;
 		User_Id = "admin";
 	}
 
          
 }
 EXPORT
 {
 	Export_id=20134;
 
 	Path = "/";
 
 	Pseudo = /cephobject;
 
 	Access_Type = RW;
 
 	Protocols = 3,4;
 
 	Transports = TCP;
 
 	SecType = sys,krb5,krb5i,krb5p;
 
 	Squash = Root_Squash;
 
 	FSAL {
 		Name = RGW;
 		User_Id = "cephnfs";
 		Access_Key_Id ="5XC5JJPHT1TVF7COSH23";
 		Secret_Access_Key = "6347daPBi79srlE3Kw6l4zDA8SMMkJQZjA1ug7LK";
 	}
 
          
 
 }
 
 RGW {
          ceph_conf = "/etc/ceph/ceph.conf";
          cluster = "ceph";
          name = "client.rgw.cephctl1";
          
 }
 
 LOG {
          Facility {
                  name = FILE;
                  destination = "/var/log/ganesha/ganesha.log";
                  enable = active;
          }
 
          
 }
 
 Here is my Ceph status at the freeze time.
   
    data:
      pools:   7 pools, 172 pgs
      objects: 1.43M objects, 3.7 TiB
      usage:   11 TiB used, 192 TiB / 204 TiB avail
      pgs:     172 active+clean
 
 Ceph version
 ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
 
 Host Version Info
 CentOS 7.2 Kernel 5.3.2-1.el7
 
 I also tried these kernels, with the same issue.
 
 CentOS Linux (5.3.2-1.el7.elrepo.x86_64) 7 (Core)
 CentOS Linux (3.10.0-1062.4.1.el7.x86_64) 7 (Core)
 CentOS Linux (3.10.0-1062.1.2.el7.x86_64) 7 (Core)
 
 
 Ganesha Version info :
 
 nfs-ganesha-ceph 2.8.2
 nfs-ganesha-rgw 2.8.2
 libcephfs2 14.2.4
 
 Any guidance on how to resolve this issue would be appreciated
 
 Andre Roberge
 andre.roberge(a)maskicom.net
 
 