I was working with 2.5 originally and wanted a fix with minimal changes. The patch was supposed to be Work In Progress, but it was squashed down as I didn't have enough arguments to back it up (we ended up with a timeout patch instead). Now we have systems exhibiting this both in our in-house lab and at customer locations (I added IOQ stats, and they showed 1 second of latency from queuing to calling writev alone under stress).
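
Roughly, the stats boiled down to stamping each entry at queue time and looking at the delta right before writev() is issued for it. The structures and names below are only an illustration of that idea, not the actual patch:

    /* Illustrative sketch only, not the real stats patch: stamp each
     * entry when it is queued, then record the delta just before
     * writev() is finally issued for it. */
    #include <stdint.h>
    #include <time.h>

    struct ioq_entry {
            struct timespec enqueue_ts;   /* taken when the entry is queued */
            /* ... payload, list linkage, etc. ... */
    };

    static void ioq_stamp(struct ioq_entry *e)
    {
            clock_gettime(CLOCK_MONOTONIC, &e->enqueue_ts);
    }

    /* Nanoseconds the entry spent sitting in the queue; under stress
     * this alone was on the order of one second before writev() was
     * even entered. */
    static int64_t ioq_wait_ns(const struct ioq_entry *e)
    {
            struct timespec now;

            clock_gettime(CLOCK_MONOTONIC, &now);
            return (int64_t)(now.tv_sec - e->enqueue_ts.tv_sec) * 1000000000LL
                    + (now.tv_nsec - e->enqueue_ts.tv_nsec);
    }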

I would put the IOQ in the xprt itself, as you did. Please post the patch and let us review it.
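
To make sure we are talking about the same thing, a sketch of what I have in mind is below; the types and fields are illustrative, not the real ntirpc definitions:

    /* Illustrative sketch only.  Instead of a small set of shared output
     * queues selected by ifindex (or a hash), each transport embeds its
     * own output queue, so a slow or bursty client can only back up its
     * own xprt. */
    #include <pthread.h>
    #include <stdbool.h>
    #include <sys/queue.h>

    struct ioq_entry;                         /* opaque here; holds pending output */

    struct xprt_ioq {
            pthread_mutex_t mtx;
            TAILQ_HEAD(, ioq_entry) pending;  /* entries waiting for writev() */
            bool active;                      /* true while a worker is draining it */
    };

    struct my_svc_xprt {
            int fd;
            struct xprt_ioq ioq;              /* per-xprt queue lives in the xprt itself */
            /* ... rest of the transport state ... */
    };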

Regards, Malahal.

On Fri, Sep 14, 2018 at 7:41 PM, Kropelin, Adam <kropelin@amazon.com> wrote:
> -----Original Message-----
> Locally I've experimented with two approaches to address this.
>
> In one, we hash xprts to queues rather than using ifindex. This is similar in
> concept to Malahal's old patch incrementing ifindex (which I was unaware of
> until now) but has the additional benefit of ensuring that traffic for any given
> xprt always lands on the same queue.

Malahal, I re-read your patch and realize now that you rotate ifindex at creation time, so it gives the same xprt-to-queue affinity as my hash. I originally thought you were rotating through queues at enqueue time; since the rotation actually happens at creation, I think the two approaches are equivalent. I was just using 'fd & IOQ_IF_MASK' as a cheap hash, so your approach is probably the better one in the N-clients-to-M-queues case.

That said, I'm most interested in feedback on the queue-per-xprt approach, as that is the one giving me the scalability and reliability I really need, with zero cross-client interference.
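
To be concrete, the cheap hash is nothing more than the following (the queue count here is made up for illustration, and I'm assuming a power-of-two number of queues so the mask works out):

    /* Simplified sketch of the N-clients-to-M-queues variant.  I'm
     * assuming IOQ_IF_MASK is (queue count - 1) with the count a power
     * of two, so the fd maps straight to a slot and a given xprt always
     * lands on the same queue.  Malahal's patch gets the same affinity
     * by assigning the index round-robin at xprt creation time. */
    #define IOQ_QUEUE_COUNT 8                  /* illustrative; power of two */
    #define IOQ_IF_MASK     (IOQ_QUEUE_COUNT - 1)

    static unsigned int ioq_slot_for_fd(int fd)
    {
            return (unsigned int)fd & IOQ_IF_MASK;
    }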

--Adam