We have a very large index where we match content by some standard fields.
Let’s say we have the following index:
CREATE INDEX ON `bucket`(`large_set`, `medium_set`, `small_set`) WHERE `x`=1
Would it be more efficient to refactor the index to be:
CREATE INDEX ON `bucket`(`small_set`, `medium_set`, `large_set`) WHERE `x`=1
Assuming that all of our queries filter on all of the data in the ON clause?
Our current problem is index counts timing out:
SELECT count(1) FROM `bucket` WHERE `x`=1 AND `large_set` = "L" AND `medium_set` = "M" AND `small_set` = "L"
This is currently giving us problems (timeouts) with Couchbase 4.5.1…
Thanks - H
vsr1
July 18, 2018, 4:01pm
2
If you have equality predicates it doesn’t matter which order you defined. You can try increase index memory and even indexer timeout. cc @deepkaran.salooja
Check query plan it should be using the IndexCountScan. Count might be large. 5.5.0 EE has Index aggregation feature SQL Group by Index | Aggregate Index SQL | Couchbase
Thanks. Indeed the query plan looks clean (this is modified to redact proprietary info, so I did the best I could):
[
{
"plan": {
"#operator": "Sequence",
"~children": [
{
"#operator": "IndexCountScan",
"covers": [
"cover ((`bucket`.`large_set`))",
"cover ((`bucket`.`medium_set`))",
"cover ((`bucket`.`small_set`))",
"cover ((meta(`bucket`).`id`))"
],
"index": "index",
"index_id": "3c601261cc5ad742",
"keyspace": "bucket",
"namespace": "default",
"spans": [
{
"Range": {
"High": [
"\"L\"",
"\"M\"",
"successor("\"S\"))"
],
"Inclusion": 1,
"Low": [
"\"L\"",
"\"M\"",
"\"S\""
]
}
}
],
"using": "gsi"
},
{
"#operator": "IndexCountProject",
"result_terms": [
{
"as": "count",
"expr": "count(1)"
}
]
}
]
},
"text": "SELECT count(1) as count FROM `bucket` WHERE `large_set` = \"L\" AND `medium_set` = \"M\" AND `small_set` = \"S\" AND ...."
}
]