On 9/18/18 11:42 AM, Malahal Naineni wrote:
I also put statistics on the send queue. I time-stamp each request when
we queue it and calculate its wait time in the send queue when a thread
actually calls svc_ioq_flushv() on it. On our customer system, the
average wait time was close to a second! I am going to give my xp_fd
hash patch a try and see what the average send queue times say.
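
For readers following along, the measurement described above could be
sketched roughly as follows. This is only an illustration, not the actual
patch: struct ioq_req, struct ioq_wait_stats and the ioq_* helpers are
hypothetical names, and the real ntirpc structures around svc_ioq_flushv()
are not shown. The sketch also assumes the caller serializes access to the
counters (a lock or atomics would be needed in a threaded server).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-request record; NOT the real ntirpc request type. */
struct ioq_req {
	struct timespec queued_at;	/* set when the request is queued */
};

/* Hypothetical accumulator for send-queue wait times. */
struct ioq_wait_stats {
	uint64_t count;		/* requests measured */
	uint64_t total_ns;	/* sum of wait times, in nanoseconds */
};

/* Signed difference end - start, in nanoseconds. */
static int64_t ts_diff_ns(const struct timespec *start,
			  const struct timespec *end)
{
	return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
		+ (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Call at enqueue time: stamp the request with a monotonic clock. */
static void ioq_mark_queued(struct ioq_req *req)
{
	assert(req != NULL);
	clock_gettime(CLOCK_MONOTONIC, &req->queued_at);
}

/* Call where a thread picks the request up to flush it:
 * accumulate how long it sat in the send queue. */
static void ioq_record_wait(struct ioq_wait_stats *st,
			    const struct ioq_req *req,
			    const struct timespec *now)
{
	int64_t waited = ts_diff_ns(&req->queued_at, now);

	if (waited < 0)
		waited = 0;	/* defensive; should not happen with a monotonic clock */
	st->count++;
	st->total_ns += (uint64_t)waited;
}

/* Average wait in milliseconds, or 0.0 if nothing was measured. */
static double ioq_avg_wait_ms(const struct ioq_wait_stats *st)
{
	if (st->count == 0)
		return 0.0;
	return (double)st->total_ns / (double)st->count / 1e6;
}
```

CLOCK_MONOTONIC is used here so that wall-clock adjustments do not skew
the measured wait; the flush path would read the same clock and pass the
result in as now.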
Talked with folks here, and now this sounds like some of your
sockets have gone into TCP Retransmission Timeout.
As I'm sure you know, we've been battling overly aggressive (piggy)
clients for years. You probably have a switch or router in the
path. Nearly everybody runs one version or another of fair queuing,
or RED (Random Early Detection), and either can drop packets from an
aggressive flow.
Really would need a packet trace to see what's going on....