Hi,
We are in our final testing phase to qualify Couchbase for a large customer project, and at this stage we have a performance question. Perhaps we are doing something wrong.
Our model will look like this:
{
  "g": "global type of records",
  "l": "local filter",
  "a": [ "array filter values" ]
}
For our test we generated 500,000 records like this one:
{"g":"e4bae564-ca56-42de-968b-fcc78a8f604e","l":"66502cf3-e952-4bf4-9355-a16c9e343d3a","a":["66502cf3-e952-4bf4-9355-a16c9e343d3a"]}
The g value is always the same; within a group, the l and a filter values are identical, and a new value is generated every 100 records. The only difference between a and l is the type: one is an array of strings, the other is a plain string.
So in the end we have 500,000 records with the same g, split into 5,000 groups of 100 records sharing the same l and a value.
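If it is relevant, a single test document, if inserted via N1QL, would look like the statement below (the document key "tstp::1" is only an illustrative name):

INSERT INTO TSTP (KEY, VALUE)
VALUES ("tstp::1", {
  "g": "e4bae564-ca56-42de-968b-fcc78a8f604e",
  "l": "66502cf3-e952-4bf4-9355-a16c9e343d3a",
  "a": ["66502cf3-e952-4bf4-9355-a16c9e343d3a"]
});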
The index is:
CREATE INDEX idx_tstp ON TSTP (
  g,
  l,
  DISTINCT ARRAY v FOR v IN a END,
  a
) WHERE g = "e4bae564-ca56-42de-968b-fcc78a8f604e";
Our question is about the performance difference between these two queries:
SELECT g FROM TSTP WHERE l = "66502cf3-e952-4bf4-9355-a16c9e343d3a" AND g = "e4bae564-ca56-42de-968b-fcc78a8f604e" LIMIT 5;

and this one:

SELECT g FROM TSTP WHERE ANY v IN a SATISFIES v = "66502cf3-e952-4bf4-9355-a16c9e343d3a" END AND g = "e4bae564-ca56-42de-968b-fcc78a8f604e" LIMIT 5;
Because of the index, I would have expected more or less the same performance for both, but we go from about 5 ms for the first query to about 300 ms for the second one.
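We can share the full EXPLAIN output of both queries if that helps; for example, the plan for the array query is obtained with:

EXPLAIN SELECT g FROM TSTP WHERE ANY v IN a SATISFIES v = "66502cf3-e952-4bf4-9355-a16c9e343d3a" END AND g = "e4bae564-ca56-42de-968b-fcc78a8f604e" LIMIT 5;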
The issue is that this kind of query will be run quite often in our solution, and we have to guarantee API response times below 1 s.
Is there any way to improve the search for string values in an array?
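For instance, would a dedicated array index be the right approach? Below is only a sketch (the name idx_tstp_array is hypothetical, and we are assuming the array key has to come first so the ANY predicate can drive the index scan):

CREATE INDEX idx_tstp_array ON TSTP (
  DISTINCT ARRAY v FOR v IN a END /* array key leading, so the ANY clause can be pushed to the scan */
) WHERE g = "e4bae564-ca56-42de-968b-fcc78a8f604e";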
Tests were done on: Couchbase 6.6.1, 1 node, 16 cores, 64 GB RAM, 4 GB RAM for the bucket, 4 GB RAM for the index service.
Best regards,
David.