I am trying to understand how Couchbase index choice works, because a specific use case is acting in a confusing manner.
The desired behavior is to provide prefix and suffix wildcard searching on document names for documents of a given subtype (case insensitive). Our doc ID format is <doc type>---<doc name>
. An example query would be for meta().id like 'template---%option%'
, where I would want it to match template---Connor option 1
, template--- option 2
, template---Trevor Option
, etc.
We initially tried to use this index:
CREATE INDEX doc_ids ON bucket ( DISTINCT (ARRAY array_element FOR array_element IN SUFFIXES(LOWER(META().id)) END) ) WHERE (LOWER(META().id) LIKE "template---%")) AND NOT CONTAINS(META().id, "<AUDIT>")
with query:
select meta().id from bucket where lower(meta().id) like "template---%options%" and not contains(meta().id, "<AUDIT>")
However, it does not use the secondary index, and instead uses the primary index.It also takes a full second to run. I don’t understand why the secondary index isn’t chosen. Furthermore, if I change the select statement to select * from bucket
, OR if I just add use index (doc_ids)
, it happily uses the secondary index and runs in 5 ms, indicating that the index is sufficient.
When I changed the index definition statement to add the full meta().id:
CREATE INDEX doc_ids ON bucket ( meta().id, DISTINCT ( ARRAY array_element FOR array_element IN SUFFIXES(LOWER(META().id)) END) ) WHERE (LOWER(META().id) LIKE "template---%")) AND NOT CONTAINS(META().id, "<AUDIT>")
it works as I expect: all queries use the secondary index.
So:
- How is the select clause affecting the index choice?
- Why is it choosing a less efficient index?
- Why do I need to add meta().id to the index’ definition?
Any help or explanatory articles appreciated. Thanks!