Partitioned index distribution across cluster nodes

Hi

I have a 3-node cluster running Couchbase Server 5.5.2. The index service runs on all three nodes. On this cluster, I have a bucket with UUIDs used as document IDs. I create a number of partitioned indexes like this:

CREATE INDEX idx_attr1 ON bucket1(lower(attr1)) PARTITION BY HASH(META().id);

I see that the load on the nodes is distributed very unequally, even though partitioning by a uniformly distributed attribute (a UUID) should distribute the index more or less uniformly. One of the cluster nodes works hard (mostly CPU, but also some I/O) while the other nodes are quite idle. Note that non-uniform index distribution across the nodes is only my guess, because I have not found a way to check it. Could you help me understand what is going wrong?

Thanks in advance

Hi @LeonidGvirtz
In a partitioned index, HASH(meta().id) determines which partition a given index key goes to, and that should indeed be uniform since you are using UUIDs as meta().id. However, how the partitions themselves are distributed across the available set of indexer nodes is determined by an optimisation algorithm based on the free resources available on each indexer node. Please find more info at Couchbase GSI Index partitioning - The Couchbase Blog.
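To illustrate the first level, here is a minimal Python sketch of the general idea, assuming a simple hash-mod scheme (the indexer's internal hash function may differ; this only shows why uniform keys give uniform partitions):

import hashlib
import uuid
from collections import Counter

NUM_PARTITION = 8  # num_partition defaults to 8 if not specified

def partition_for(doc_id):
    # Hash the partition key (META().id in your index) and map it to a
    # partition. This is a stand-in for the indexer's internal hash
    # function, which may differ; the point is only that uniformly
    # distributed keys fill the partitions uniformly.
    digest = hashlib.sha1(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITION

# UUID document ids land evenly across the partitions ...
counts = Counter(partition_for(str(uuid.uuid4())) for _ in range(80000))
print(sorted(counts.items()))  # roughly 10000 keys in each of the 8 partitions

# ... but which node *hosts* each partition is a separate decision, made by
# the planner from the free resources on each indexer node.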

You mentioned that one of the cluster nodes works harder than the others. Can you please give more details about the activity on the node with the higher CPU/IO consumption? Was an index building during the high resource consumption, or were there ongoing scans?

Also, to find out the exact distribution of partitions across the available nodes, please use the API below and look at "partitionMap":
curl -u username:password node:9102/getIndexStatus
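If it is more convenient, the same information can be pulled out programmatically. A small sketch in Python using only the standard library (replace username, password and node with your own; the field names follow the getIndexStatus output, though the exact envelope can vary between versions):

import base64
import json
import urllib.request

# Same endpoint as the curl command above; replace host and credentials.
url = "http://node:9102/getIndexStatus"
request = urllib.request.Request(url)
token = base64.b64encode(b"username:password").decode()
request.add_header("Authorization", "Basic " + token)

with urllib.request.urlopen(request) as response:
    payload = json.load(response)

# "status" and "partitionMap" are the keys seen in the getIndexStatus
# response; print the partition placement of every partitioned index.
for index in payload.get("status", []):
    print(index.get("name"), index.get("partitionMap"))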

You could also do explicit placement of partitions on specific indexer nodes by using the "nodes" key inside the WITH clause. Example:
CREATE INDEX ih ON customer(state, name, zip, status)
PARTITION BY HASH(state)
WHERE type = "cx"
WITH {"num_partition":16, "nodes":["172.23.125.32:9001", "172.23.125.28:9001", "172.23.93.82:9001", "172.23.45.20:9001"]}
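Note that the node addresses in this example are from a test cluster; substitute the host:port of your own indexer nodes (e.g. node1:8091 in a default install). If num_partition is omitted, it defaults to 8.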

Furthermore, please share the cbcollect logs of all three nodes so that I can get a better picture of your cluster.

Thanks,
Prathibha

Hi Prathibha

getIndexStatus shows that the partitions are spread across all 3 nodes:

"partitionMap": {
  "node1:8091": [2, 8, 6],
  "node2:8091": [1, 7, 3],
  "node3:8091": [5, 4]
},
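For what it's worth, tallying the map confirms a 3/3/2 split (a quick Python sketch over the output above):

partition_map = {
    "node1:8091": [2, 8, 6],
    "node2:8091": [1, 7, 3],
    "node3:8091": [5, 4],
}
print({node: len(parts) for node, parts in partition_map.items()})
# {'node1:8091': 3, 'node2:8091': 3, 'node3:8091': 2}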

The index creation was the only activity on the cluster during my tests. The loaded node showed next to zero idle CPU and constant I/O activity, even though the storage was not saturated at all. The other nodes showed only light CPU activity and next to zero I/O. Note that I ran a number of tests and at least two different nodes took the "hard-working node" role, so it is not a problem with one specific cluster node.

Unfortunately, it will take me some time to provide the cbcollect_info logs because the system is no longer available for my tests. I will update the forum thread when I have collected them.

Thanks
Leonid Gvirtz

Hi @LeonidGvirtz
The distribution of 8 partitions across three nodes looks fine to me. But to understand why only one node shows high CPU/IO utilization, it would help to have the cbcollect logs. Please share them when you get a chance to collect them.

Thanks,
Prathibha