Hurrah!! We just released Couchbase Server 4.5 which brings a bag full of wonderful features and enhancements. My earlier post Making the most of your arrays… with Array Indexing introduced the Array Indexing functionality. This blog post includes covering array indexes, support for more operators such as UNNEST, ALL, ANY AND EVERY etc.,

Covering Array Indexes

 Covering Indexes is a performance feature in N1QL, which significantly improves the query performance by avoiding trips to the data-service to fetch any documents. There is no special syntax for creating/using covering indexes. This is an automatic optimization triggered in N1QL, when a query can find all required data in the index itself.  So, covered indexes are just normal GSI indexes that are created with all index-keys covering the data needed for a query.

Typically, array indexes are created with required elements/attributes with-in the array, as the index keys. And, covering indexes don’t need any special attention. However, to create Covering Array Indexes, the array itself MUST also be added to the list of other index keys provided to the CREATE INDEX statement. All details of the array are required for N1QL to properly evaluate predicates in the query.  For example, the following command creates a covering array index on the ‘schedule[].flight’ values in the couchbase ‘travel-sample’ dataset:

Note that the array field ‘schedule’ itself is also required to be specified as an additional index key. To contrast, a non-covering array index doesn’t require that, and is created as follows:

The following SELECT statement shows how to use the covering array index. It finds “the schedules of United Airlines flight routes that have less than 10 scheduled flights in a week”.

Note that the SELECT follows the index-selection rules described in the next sub-section.

  • The predicates in where-clause (array_length(schedule) < 10) and (type = “route”) match those in the index definition.
  • The index keys i.flight and schedule are referenced, using the exact variable  ‘i’ used in the index definition
  • The explain query plan shows the  attributes and filters/predicates covered by the array index ‘isched_covered’

Performance

Covered Array Indexes get the best performance for queries that can leverage the index. For example,  the above query took 8ms in my laptop, when using the covered array index ‘isched_covered‘. However, same query using index ‘def_type‘ took 3sec. That’s a crazy 375x better performance for this query.

Index selection rules

Irrespective of the operators used, the DML must follow the index-selection requirements to make use of the array index.

  1. The WHERE-clause must be used in the SELECT or other DML statements, with matching index keys specified in the CREATE INDEX definition.
  2. For partial indexes, the predicates used in CREATE INDEX definition must be specified in the WHERE-clause of the DML.
  3. Variable names in the WHERE-clause must exactly match corresponding variable names used in the CREATE INDEX definition.

When multiple matching indexes  are available, N1QL might pick any of the matching indexes for executing the query. If you prefer your query to utilize a specific index, you can “suggest” it by using the USE INDEX clause to the DML.

Support for more N1QL operators

Couchbase 4.5 adds support for the operators UNNEST and ANY AND EVERY to work with array indexes.  These operators can be used in the queries with both covered, and non-covered array indexes. Note that:

  • ANY operator is already supported in the Developer Preview release. See example above.
  • EVERY operator is not supported in Couchbase 4.5.  However, note that the ANY AND EVERY operator is supported.  To clarify, EVERY operator evaluates to true for arrays with zero elements, whereas ANY AND EVERY evaluates to true when the array has atleast one matching element.

Using UNNEST with Array Indexes

Array-indexing supports the UNNEST operator which can be used to flatten the array attribute on which the index is created, and use it as part of the query. The UNNEST statement must use the exact same variable name (i.e ‘i ‘ in this example) as used in the CREATE INDEX statement. For example, the following query finds the details of United Airlines flight routes that have less than 10 scheduled flights in a week.

Performance

Covered Array Indexes get the best performance for queries that can leverage the index. For example,  the above query took 8ms in my laptop, when using the covered array index ‘isched_covered. However, same query using index ‘def_type‘ took 28sec. That’s an insane 3500x better performance for this query.

Using ANY AND EVERY  with Array Indexes

Array-indexing supports ANY AND EVERY operator. This can be used to find a boolean match across a non-null set of array elements on which the array index is created. For example, the following query finds  “the schedule of routes that have atleast 1, but less than 10,  all United Airlines flights in a week”.

Using ALL with Array Index

In Couchbase 4.5, array indexes can be created using CREATE INDEX with the keyword DISTINCT ARRAY only. Couchbase 4.5 release adds support for the ALL keyword to create the array index with all values of the array elements. For example,

You can learn more about array indexing and see more examples, such as composite and nested array indexes, in the Couchbase 4.5 documentation, and checkout the demo.

Give it a try, and let me know any questions/comments, or just how awesome it is ;-)

Cheers!!

Author

Posted by Prasad Varakur, Principal Product Manager, Couchbase

Prasad Varakur is a Principal Product Manager, Couchbase. Prasad is Product and Engineering leader in Databases(SQL, noSQL, Bigdata) & Distributed systems.

Leave a reply