We have tried to use query like the following to get documents for a particular environment and document type like the following:
/////////////////////
select * from travel WHERE META().id LIKE “dev1::airline::%”
////////////////////
We want to find all airline documents for a given development environment dev1. We noticed that the above query will use the primary index and perform very slow.
Another option we have is we can create additional fields on the documents which can store the value of development environment, and create index on the document type and development environment, then create query based on these two fields. However, we prefer to create as few index as possible. Since the environment and document type are already part of the document key, i am just wondering whether there are other ways to query the information and avoid the primary index scan. Thanks
If you are doing prefix search on document key with whole document or document key only then primary index is best. Even if use secondary index on different field it will be same.
Could you elaborate perform very slow. How many documents/size you are taking . If you need whole document you can project META().id only and use direct KV get from your SDKs to avoid two hop data.
select META().id from travel WHERE META().id LIKE “dev1::airline::%”
The performance are similar for smaller tenant. However, we have large tenant with over 2 million documents. The query similar to the following takes longer then 4 min to finish, then timout:
/////////////////////
select * from travel WHERE META().id LIKE “dev1::airline::%”
////////////////////
If we use a secondary index on the new fields, the query is able to finish.
In above use case Primary index used as secondary index. Please Post EXPLAIN and indexes definitions of both. Also you can use query profiling and see where it is taking time
Yes. Example: Your bucket has 10million documents.
if there is only 1 document that starts with “dev1::airline::” it will be ms
If there is 1M then it may take secs/minutes based on size
////////
if there is only 1 document that starts with “dev1::airline::” it will be ms
If there is 1M then it may take secs/minutes based on size
///////////
does the above apply to the query which uses secondary index as well? or when there are lots of documents to return (e.g. 0.5 M or 1M), the query uses secondary index will perform better?