N1ql - cover index is way slower than expected

Hi,

I’m trying to convert some of my views into n1ql for testing.

We are on 4.5.1-2844 Enterprise Edition.

My index definition is

 CREATE INDEX `idx_activity` ON `engage`(`clientId`,`username`,`target_form`,`target_id`,`action`,`created_at`) WHERE (`entityType` = "activity")

Index status shows the following stats
68M total indexed
11.9GB data size
44.5GB disk size (this was around 12.9gb and now it is 44.5gb)
187B average item size

My query:

select meta().id from engage where clientId='society6' and  username is not null and  target_form='user' and  target_id='moon326' and  action is not null and  created_at is not null and entityType='activity'

Explain output:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexScan",
          "covers": [
            "cover ((`engage`.`clientId`))",
            "cover ((`engage`.`username`))",
            "cover ((`engage`.`target_form`))",
            "cover ((`engage`.`target_id`))",
            "cover ((`engage`.`action`))",
            "cover ((`engage`.`created_at`))",
            "cover ((meta(`engage`).`id`))"
          ],
          "filter_covers": {
            "cover ((`engage`.`entityType`))": "activity"
          },
          "index": "idx_activity",
          "index_id": "eecccd40fbda9a90",
          "keyspace": "engage",
          "namespace": "default",
          "spans": [
            {
              "Range": {
                "High": [
                  "successor(\"society6\")"
                ],
                "Inclusion": 0,
                "Low": [
                  "\"society6\"",
                  "null",
                  "\"user\"",
                  "\"moon326\"",
                  "null",
                  "null"
                ]
              }
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "Filter",
                "condition": "(((((((cover ((`engage`.`clientId`)) = \"society6\") and (cover ((`engage`.`username`)) is not null)) and (cover ((`engage`.`target_form`)) = \"user\")) and (cover ((`engage`.`target_id`)) = \"moon326\")) and (cover ((`engage`.`action`)) is not null)) and (cover ((`engage`.`created_at`)) is not null)) and (cover ((`engage`.`entityType`)) = \"activity\"))"
              },
              {
                "#operator": "InitialProject",
                "result_terms": [
                  {
                    "expr": "cover ((meta(`engage`).`id`))"
                  }
                ]
              },
              {
                "#operator": "FinalProject"
              }
            ]
          }
        }
      ]
    },
    "text": "\nselect meta().id from engage where \nclientId='society6' and \nusername is not null and \ntarget_form='user' and \ntarget_id='moon326' and \naction is not null and \ncreated_at is not null and\nentityType='activity'"
  }
]

It looks like my query takes covering index, but the query is so slow to the point I never received an output. I had to cancel my query. I’ve tried to play with “is not null” (excluding it out, including it), but the result was same.

  1. Am I doing something wrong?
  2. Somehow my index stat shows 44.5gb disk size. It was at around 12gb. What happened?

Thank you in advance.

The following is right index for the query.

CREATE INDEX idx_activityONengage(clientId,target_form,target_id,username,action,created_at) WHERE (entityType = "activity")

IndexScans are range scan. When maximum number of predicates pushed to indexer(i.e. Spans in EXPLAIN output) it performs well. In our case username IS NOT NULL there is no upper bound and it is second index key. This makes following index keys upper bounds never pushed to indexer. Indexer produces more data needed and query trying to eliminate it (This has been improved in 5.0.0).

Best index key order should be equality query predicate followed by IN predicate followed by <,<= followed by >, >=

Hi vsr1,

Thank you so much. That fixed the issue. However, it’s still in 150~200ms range just to get meta().id. I guess I have to stick with view for this one as well.