I have been investigating how to implement pNFS for FSAL CEPH [1]. As suggested by Jeff, I spent time reading about the flex file layout and wrote a POC, mainly for my own understanding of the pNFS protocol [2]. Since nfs-ganesha lacks support for the flex file layout, and I only wanted to check that the libcephfs APIs work, I wrote the current code using the file layout. I have a few doubts and would also like to understand the best way to break down the work here.
1. Advantage of using flex over file layout: As I understand it, the advantage is that with the flex file layout the storage nodes can speak NFS version 3 or 4, so we can make the data path simpler. Is there any other advantage, related to how Ceph stores data (CRUSH maps), that makes flex better than the file layout?
2. Implementing flex file layout support: In the existing code we have a file "flex_files_prot.x", but it does not match the latest RFC [1]. There are some other data structures defined in nfsv41.h, but we would still need to implement the functions. I will submit a patch.
3. While working on the POC, I also ran into some implementation issues:
- In LAYOUTGET, we populate the opaque device_id (16 bytes), and in GETDEVICEINFO we should be able to send the network address of the storage device. To get started, I assumed we will co-locate the OSDs with the storage devices, and in GETDEVICEINFO we use the API ceph_ll_osdaddr to get the OSD address. We will need some way to encode the OSD ids in the device_id so that they reach GETDEVICEINFO. It's an implementation detail, but I just want to be sure I am not overlooking something obvious.
- In LAYOUTGET, I am using ceph_ll_get_stripe_osd to get the OSD corresponding to a stripe. But I see an assertion failure at [3], and I still have to investigate whether it is caused by my Ceph setup or by a bug in the code. As there is no documentation for this function, I am not sure exactly what it is doing.
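
To make the device_id question in the first bullet concrete, here is a minimal sketch of one way the OSD id could be packed into the 16 opaque bytes in LAYOUTGET and recovered in GETDEVICEINFO. The byte layout, the FSAL tag, the export id field and the function names (deviceid_encode, deviceid_decode) are all assumptions made up for illustration; none of this is existing nfs-ganesha or libcephfs code, and the lookup of the address via ceph_ll_osdaddr is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEVICEID_SIZE 16 /* size of the opaque deviceid4 in NFSv4.1 */

/*
 * Hypothetical byte layout, chosen only for this sketch:
 *   byte  0     tag identifying the FSAL that issued the device id
 *   bytes 1-3   reserved, must be zero
 *   bytes 4-7   OSD id, big-endian
 *   bytes 8-15  export id, big-endian
 */

/* Store the low nbytes of value at dst, most significant byte first. */
static void put_be(uint8_t *dst, uint64_t value, int nbytes)
{
	for (int i = nbytes - 1; i >= 0; i--) {
		dst[i] = (uint8_t)(value & 0xff);
		value >>= 8;
	}
}

/* Read nbytes from src as a big-endian unsigned integer. */
static uint64_t get_be(const uint8_t *src, int nbytes)
{
	uint64_t value = 0;

	for (int i = 0; i < nbytes; i++)
		value = (value << 8) | src[i];
	return value;
}

/* LAYOUTGET side: build the opaque device id for one OSD. */
static void deviceid_encode(uint8_t out[DEVICEID_SIZE], uint8_t fsal_tag,
			    int32_t osd_id, uint64_t export_id)
{
	assert(osd_id >= 0); /* OSD ids are assumed to be non-negative */
	out[0] = fsal_tag;
	out[1] = out[2] = out[3] = 0;
	put_be(out + 4, (uint32_t)osd_id, 4);
	put_be(out + 8, export_id, 8);
}

/*
 * GETDEVICEINFO side: recover the OSD id and export id.
 * Returns 0 on success, -1 if the id does not follow this layout.
 */
static int deviceid_decode(const uint8_t in[DEVICEID_SIZE],
			   uint8_t expected_tag, int32_t *osd_id,
			   uint64_t *export_id)
{
	uint64_t osd;

	if (in[0] != expected_tag || in[1] || in[2] || in[3])
		return -1;
	osd = get_be(in + 4, 4);
	if (osd > INT32_MAX)
		return -1;
	*osd_id = (int32_t)osd;
	*export_id = get_be(in + 8, 8);
	return 0;
}
```

Writing the bytes explicitly, instead of copying a struct with memcpy, keeps the id independent of host endianness and struct padding, which matters because the client hands the same 16 bytes back unchanged. Whether an export id (or anything beyond the OSD id) is needed in the device_id at all is exactly the kind of detail I would like feedback on.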