I was working with 2.5 originally and wanted to see a fix with minimal
changes. The patch was supposed to be Work In Progress but was squashed
down as I didn't have enough arguments to back it up (ended up with a
timeout patch). Now we have systems exhibiting this both in our lab and at
customer locations (I added IOQ stats, and under stress they showed 1-second
latency from queuing to the writev call alone).
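For reference, the measurement amounts to stamping each entry at enqueue
time and taking the delta just before writev; a rough sketch of the idea
(the struct and field names here are made up, not the actual ntirpc types):

    #include <stdint.h>
    #include <time.h>

    /* Illustrative only: not the real ntirpc queue entry. */
    struct ioq_entry {
        struct timespec enq_ts;   /* stamped when the entry is queued */
        /* ... output buffers, list linkage, etc. ... */
    };

    /* Nanoseconds spent waiting in the queue so far. */
    static int64_t ioq_wait_ns(const struct ioq_entry *e)
    {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return (int64_t)(now.tv_sec - e->enq_ts.tv_sec) * 1000000000LL
               + (now.tv_nsec - e->enq_ts.tv_nsec);
    }

    /* The consumer calls ioq_wait_ns(entry) right before writev() and
     * feeds the value into a max/percentile counter. */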
I would put the IOQ in the xprt itself, as you did. Please post the patch and
let us review it.
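To be clear about what I mean by the IOQ living in the xprt, something along
these lines (types and field names are only placeholders, not the real
ntirpc definitions):

    #include <pthread.h>
    #include <stdbool.h>
    #include <sys/queue.h>

    struct ioq_entry;                     /* queued output entry, as above */

    struct xprt_ioq {
        pthread_mutex_t mtx;              /* protects the list below */
        TAILQ_HEAD(, ioq_entry) pending;  /* queued output entries */
        bool busy;                        /* a worker is draining this queue */
    };

    struct my_xprt {
        int fd;
        struct xprt_ioq ioq;              /* one output queue per transport */
        /* ... rest of the transport state ... */
    };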
Regards, Malahal.
On Fri, Sep 14, 2018 at 7:41 PM, Kropelin, Adam <kropelin(a)amazon.com> wrote:
> -----Original Message-----
> Locally I've experimented with two approaches to address this.
>
> In one, we hash xprts to queues rather than using ifindex. This is similar
> in concept to Malahal's old patch incrementing ifindex (which I was unaware
> of until now) but has the additional benefit of ensuring that traffic for
> any given xprt always lands on the same queue.
Malahal, I re-read your patch and realize now that you rotated ifindex at
creation time, so it has the same behavior as my hash in terms of
xprt-to-queue affinity; I originally thought you were rotating through
queues at enqueue time. So I now think the two approaches are equivalent.
I was just using 'fd & IOQ_IF_MASK' as a cheap hash, so I think your
approach is better in the N-clients-to-M-queues case. But I'm interested in
feedback on the queue-per-xprt approach, as that one is giving me the
scalability and reliability I really need with zero client interference.
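For concreteness, the two mappings being compared look roughly like this
(the queue count and mask are placeholders here, not the real constants):

    /* Assume a power-of-two number of IOQ queues. */
    #define IOQ_QUEUE_COUNT 8
    #define IOQ_IF_MASK     (IOQ_QUEUE_COUNT - 1)

    /* Cheap hash: a given fd (hence a given xprt) always lands on the
     * same queue, but fd values may cluster on a few queues. */
    static inline unsigned ioq_queue_by_fd(int fd)
    {
        return (unsigned)fd & IOQ_IF_MASK;
    }

    /* Rotation at xprt creation time: same per-xprt affinity, but xprts
     * are spread round-robin over the queues regardless of fd values. */
    static unsigned ioq_queue_by_rotation(void)
    {
        static unsigned next;
        return __atomic_fetch_add(&next, 1, __ATOMIC_RELAXED) % IOQ_QUEUE_COUNT;
    }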
--Adam