One more thing to keep in mind: Getting this wrong is a data corruption
bug, which is the worst possible bug you can have in a storage system,
so you should be pretty sure you've gotten it right.
Daniel
On 4/5/19 9:20 AM, Jeff Layton wrote:
Easier said than done. Bear in mind that all of the recovery backend
machinery exists entirely to deal with server restarts, so you really
do have to be careful not to leave gaps in particular failure
scenarios.
Let's say you do decide to synchronously store open and lock records
in a central RADOS-based database and both Server A and Server B are
using it.
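(Just to make "synchronously store" concrete, here's a minimal sketch
using the python-rados bindings. The pool name, object naming, and
record layout are all made up for illustration -- this is not an
existing ganesha schema:)

    import json
    import rados

    # Hypothetical recovery record for one client; the fields are
    # assumptions for this sketch, not a real ganesha format.
    record = {"clientid": "0x1a2b3c", "server": "ganesha-a", "opens": []}

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("nfs-recovery")  # assumed pool name
        # A synchronous write_full() doesn't return until RADOS has
        # committed the write, so the record is durable before the
        # server replies to the client -- that's the "synchronous" part.
        ioctx.write_full("client_" + record["clientid"],
                         json.dumps(record).encode())
        ioctx.close()
    finally:
        cluster.shutdown()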
Server A crashes, and one of its clients decides to reconnect to
Server B using its recorded clientid/session. Server B says "Oh, this
session was previously held by Server A." Now what happens?
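(A hypothetical ownership check shows the dead end. The record layout
carries over from the sketch above and is equally invented:)

    import json

    def handle_reclaim(ioctx, clientid, my_server_id):
        # Fetch the record a synchronous RADOS store would have
        # written for this client (hypothetical naming, as above).
        rec = json.loads(ioctx.read("client_" + clientid).decode())
        if rec["server"] == my_server_id:
            # Ordinary restart recovery: we held the session, so the
            # client may reclaim the state it previously held.
            return rec
        # The session was previously held by another server. Without
        # a way to transfer the CephFS state (opens, locks, caps) to
        # this one, there is no safe way to honor the reclaim.
        raise NotImplementedError("cross-server takeover doesn't exist")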
You need a mechanism to transfer the CephFS state (opens, locks, caps,
etc.) from Server A to Server B. Nothing like that exists today, but
we do have some tentative plans to allow cephfs clients to reclaim
state they previously held. In principle, that could be extended to
allow "takeover" in some fashion.
But wait...it gets worse!
Suppose we have a 3-node ganesha cluster, and some of Server A's
clients decide to go to Server C instead. Now a simple takeover is not
enough -- you need a way to split that state up granularly.
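(For illustration only, the bookkeeping half of that split is easy to
sketch -- everything here is invented, and it deliberately ignores the
hard part, which is migrating the CephFS caps and locks to each target
without racing other servers or further failures:)

    def partition_state(records, reconnect_map):
        # records: the dead server's recovery records, keyed by clientid.
        # reconnect_map: clientid -> server the client reconnected to.
        per_server = {}
        for clientid, rec in records.items():
            target = reconnect_map.get(clientid)
            if target is None:
                continue  # this client hasn't reappeared anywhere yet
            per_server.setdefault(target, {})[clientid] = rec
        return per_server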
Couple all of this with the basic truism that failures in these sorts
of architectures often cascade. You need to plan for the possibility
that any node can die at any time, and decide how you're going to
handle it. A lot of the original ganesha recovery backend work had
gaping holes in its "takeover" mechanisms, where a failure at an
inopportune time could leave no clients able to recover anything.
This is very much a non-trivial problem in my experience, but don't
let me dissuade you if you've considered these scenarios and have
thoughts on how to address them.