Uneven number of active requests in index/query node

There are 4 index/query nodes in the cluster.
One of the nodes has about double the number of requests of the others, and as a result CPU usage alerts are triggering.

How are the requests being issued? From the SDK in a client app? Which SDK, which version? Is the Cluster connection created multiple times? Is the app executed multiple times? Is it just the CPU that is unevenly distributed, or the number of requests as well? Does system:completed_requests also show the unbalanced distribution?

The requests are executed via the SDK.
Go client SDK, version 1.6.
The cluster connection is established once.
CPU usage is unevenly distributed, along with the requests.
Completed requests show a different pattern.

The request counts are listed below.
Node 3 is the one with the issue.

Active requests

active_requests node
528 node 3
185 node 4
137 node 2
128 node 1

Completed requests

completed_requests node
133 node 1
103 node 2
99 node 3
62 node 4

528 active_requests on one node seems like a lot. If node 3 has 16 cores, each of those 528 requests would get about 1/33 of a core if all the requests were CPU-bound. It’s worth looking into what those requests are and maybe trying to optimize them.
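To make the arithmetic above concrete, here is a minimal Go sketch that computes the per-request share of a core for each node from the active request counts posted earlier. The 16-core figure is only the assumption used in this reply, not a confirmed spec of the nodes, and `coreSharePerRequest` is a hypothetical helper name.

```go
package main

import "fmt"

// coreSharePerRequest returns the fraction of one core available to each
// request, assuming every request is CPU-bound and the cores are shared evenly.
func coreSharePerRequest(cores, activeRequests int) float64 {
	if activeRequests <= 0 {
		return 0 // no requests, nothing to share
	}
	return float64(cores) / float64(activeRequests)
}

func main() {
	const cores = 16 // assumed core count, not confirmed in this thread

	// Active request counts as posted above.
	counts := []struct {
		node   string
		active int
	}{
		{"node 3", 528},
		{"node 4", 185},
		{"node 2", 137},
		{"node 1", 128},
	}

	for _, c := range counts {
		share := coreSharePerRequest(cores, c.active)
		fmt.Printf("%s: %d active, %.3f core per request\n", c.node, c.active, share)
	}
}
```

With these numbers, node 3 ends up with roughly a quarter of the per-request CPU that node 1 has, which is consistent with the CPU alerts firing on that node only.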

If you are using transactions, the requests within the transaction are sent to the same node.

The completed requests look like they are weighted towards the nodes in order - something that would happen if the queries were balanced to nodes in order, and (re)started from node 1 multiple times.
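The pattern described above can be illustrated with a small simulation. This is only a sketch of the hypothesis, not how the SDK actually selects query nodes: `roundRobinCounts` is a hypothetical function, and the batch sizes are made up. Each batch is handed out to the nodes in order, and every new batch starts again from node 1.

```go
package main

import "fmt"

// roundRobinCounts hands out each batch of queries to the nodes in order,
// always starting again from the first node when a new batch begins.
func roundRobinCounts(nodes int, batchSizes []int) []int {
	if nodes <= 0 {
		return nil
	}
	counts := make([]int, nodes)
	for _, size := range batchSizes {
		for i := 0; i < size; i++ {
			counts[i%nodes]++
		}
	}
	return counts
}

func main() {
	// Hypothetical batch sizes; each batch stands for one (re)start
	// of the balancing from node 1.
	batches := []int{1, 2, 3, 4, 5, 6, 7}

	for i, n := range roundRobinCounts(4, batches) {
		fmt.Printf("node %d: %d requests\n", i+1, n)
	}
}
```

Because short batches never reach the later nodes, the totals fall off in node order, which is the same shape as the completed request counts posted above (133, 103, 99, 62).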

Thanks for the insights.
I will review the CPU-bound queries if the issue persists,
and check whether transactions are used.

Will a graceful restart of node 3 help redistribute the requests evenly across all nodes in the cluster?

As far as I know, there is no mechanism to preserve query requests across restarts. The requests would disappear and node 3 might have 0 active requests, but the SDK might retry them.

What timeout is the SDK using on the requests? The default is 75 seconds for query requests. After 75 seconds the query service and/or the SDK should cancel the requests.
