I have a N1QL query which is covered and the predicate contains one array with (ANY x IN array SATISFIES x = “string”) and 4 range predicates.
The index contains ~600k documents.
Now the query performance is heavily dependant on the string I am querying for. For one string (a lot of documents contain that string in the array) the query completes in 368.26ms (index scan time: 00:00.3592). For another string (not so much documents contain that string in the array) it is 12.33s (index scan time: 00:12.3154).
How do I improve the IndexScan time? I thought index nodes need memory, number of cores is not that important. But the index resides completely in memory. Should I increase the number of cores or use faster cores or something different?
I read the recommended blog post and did some optimizations like order of keys in the index.
This is my index definition:
CREATE INDEX idx_score_desc
ON `bucket`(score DESC, upvotes, creationTime, creator.followerCount,countComments, ALL DISTINCT ARRAY t for t IN tags END)
WHERE (`_class` = "some.class.package.Class") AND score > 0
WITH {"nodes": [ "192.168.0.1:8091", "192.168.0.2:8091", "192.168.0.3:8091" ], "defer_build":true, "num_replica": 2};
My query:
SELECT RAW meta().id FROM `bucket`
WHERE `_class` = "some.class.package.Class"
AND creationTime BETWEEN 1501413511 AND 1532949511
AND upvotes BETWEEN 0 AND 29639
AND creator.followerCount BETWEEN -1 AND 16969739
AND countComments BETWEEN -1 AND 10000000
AND score > 0
AND ANY t IN tags SATISFIES t = "tag1" END
ORDER BY score DESC, upvotes, creationTime, creator.followerCount, countComments
LIMIT 16 OFFSET 0
It is pagination query. When query satisfies pagination it stops. If you have lot of t = “tag1” it satisfies early so less index disk reads and scans. If not many more index disk reads/scans.
One option you can try is Move ALL DISTINCT index to leading index key (Rule #11). Check if the query using index order. If not the current query plan is optimal
But back to my main question: What do I need to scale to improve pure IndexScan performance (IndexScan time): Number of cores, faster cores, memory, disks (faster SSDs)?