Help me understand indexing and querying

I have been reading this and I was a bit confused. If I create an index on a bucket, I don’t have to call that specific index that I created, it’s done automatically indexed for future querying? also, this is what’s in my bucket.

[
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 0.34092514802274876,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:00.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "179933509776140"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 4,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:02.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "179933509776140"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 0.34092514802274876,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:02.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "179933509776140"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 4,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:02.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "2323"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 4,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:02.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "179933509776140"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "bluetoothAddress": "0C:F3:EE:08:A6:A3",
        "bluetoothName": "EMBeacon00000",
        "id1": "0x699ebc80e1f311e39a0f",
        "id2": "0x0cf3ee3bc018",
        "id3": "0x699ebc80e1f311e39a0f",
        "manufacturer": 65194,
        "receiverId": "42008780c4b9b329",
        "serialNumber": "179933509776140",
        "serviceUuid": 65194
      },
      "expiry": 0,
      "id": "BeaconList",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "bluetoothAddress": "0C:F3:EE:08:A6:A3",
        "bluetoothName": "EMBeacon00000",
        "id1": "0x699ebc80e1f311e39a0f",
        "id2": "0x0cf3ee3bc018",
        "id3": "0x699ebc80e1f311e39a0f",
        "manufacturer": 65194,
        "receiverId": "42008780c4b9b329",
        "serialNumber": "2323",
        "serviceUuid": 65194
      },
      "expiry": 0,
      "id": "BeaconList",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "assignedDate": "2003-02-17T09:10:33",
        "serialNumber": "222",
        "uldNumber": "2"
      },
      "expiry": 0,
      "id": "BeaconLink",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "assignedDate": "2003-02-17T09:10:00",
        "serialNumber": "999",
        "uldNumber": "2"
      },
      "expiry": 0,
      "id": "BeaconLink",
      "token": null
    }
  },
  {
    "BeaconBucketOne": {
      "cas": 0,
      "content": {
        "distance": 0.34092514802274876,
        "distanceTesting": 0.25,
        "newDistance": 0.19106446691360568,
        "receivedDate": "2017-03-20T11:31:01.279968-04:00",
        "receiverId": "42008780c4b9b329",
        "rssi": -50,
        "serialNumber": "179933509776140"
      },
      "expiry": 0,
      "id": "BeaconInfo",
      "token": null
    }
  }
]

I have 3 types in this document(assigned in Id)- BeaconLink, BeaconInfo, and BeaconList. I want to index them so querying is optimized. Is the best way to do that by using CREATE INDEX content ON ``BeaconBucketOne``(content); and making another index on id? If I use a query like, string query = "SELECT content.assignedDate,content.serialNumber,content.uldNumber FROM ``BeaconBucketOne`` WHERE id = 'BeaconLink'"; I am getting the results that I want but I’m not sure if I’m doing it in an optimized way. Thank you.

The following articles and book will help you understand index and query.


https://dzone.com/articles/understanding-index-scans-in-couchbase-n1ql-query
https://dzone.com/articles/designing-index-for-query-in-couchbase-n1ql

https://blog.couchbase.com/n1ql-practical-guide/

Thanks for sharing the links about the indexing and querying specifics. However I still have an unanswered question from those links:

  1. Is it a requirement to have a Primary Index at bucket level for running N1QL queries in addition to GSI? I am getting an error (Create Primary Index error) when I run my N1QL queries (using OR condition in query) with just having GSIs created and with no Primary Index at bucket level. Appreciate if some one clarifies this at their earliest possibility as our development team is unable to progress their coding purely depending upon GSIs and not having a Primary Index.

  2. Also I was told that Primary Indexes should not be promoted to PRODUCTION as they are slow and negatively impact the server performance. Is this a true statement? If so, can we exclusively depend upon GSIs?

  3. Are there any specific N1QL examples about usage of OR in the query (just with GSI and no Primary Index)? I could not find any online.
    Thanks in advance.

  1. All the queries that you run with given bucket has qualified secondary indexes, Primary index is not required. If not you get error, you can fix the error by creating proper secondary index or creating primary index.

  2. If you have tuned queries and created proper secondary index after that you can have primary index on production. With out tuning queries if create primary index it can impact performance.

  3. If predicate has OR each OR clause needs to have leading index key (If predicate has AND/OR’s it converts to Disjunctive Normal Form (A OR B) AND C => (A AND C) OR (B AND C)). If not that index will not be selected.
    If each OR clause has qualified different index, you can rewrite query as UNINON ALL
    In 4.6.2+ Each OR clause can use different index.

Thank you for such a quick response vsr1!!

I have just pasted a basic GSI/N1QL query. Can you please pin point what is wrong here with the index and how to fix the issue? Appreciate your help in this regard!! Your answers would greatly help with my understanding about the topic in question.

CREATE INDEX idx2 ON test(itemColor) USING GSI;
CREATE INDEX idx1 ON test(itemBrand) USING GSI;

SELECT itemBrand, itemColor FROM test WHERE itemBrand=“Iphone”
[
{
“itemBrand”: “Iphone”,
“itemColor”: “Black”
}
]

SELECT itemBrand, itemColor FROM test WHERE itemBrand=“Iphone” OR itemColor=“Black”

[
{
“code”: 4000,
“msg”: “No primary index on keyspace test. Use CREATE PRIMARY INDEX to create one.”,
“query_from_user”: “SELECT itemBrand, itemColor FROM test\nWHERE itemBrand = “Iphone” OR itemColor = “Black”\n”
}
]

However, it works with AND:

SELECT itemBrand, itemColor FROM test WHERE itemBrand = “Iphone” AND itemColor = “Black”
[
{
“itemBrand”: “Iphone”,
“itemColor”: “Black”
}
]

Couchbase Indexes are based on b-tree that means leading index key needs to be present in the predicate otherwise index will not be selected.

In case of OR each part of the OR needs to have leading index key .

WHERE itemBrand=“Iphone” OR itemColor=“Black”

idx1 will not be picked because leading key itemBrand is in first part of OR , but not in second part of OR.
idx2 will not be picked because leading key itemColor is not in first part of OR.

This is limitation in current version. This has been addressed with UnionScan in 4.6.2+. Each OR can use different index and results are merged.

You can achieve same things with the following rewrite.

SELECT itemBrand, itemColor FROM test WHERE itemBrand="Iphone" 
UNION 
SELECT itemBrand, itemColor FROM test WHERE  itemColor="Black";

For AND case uses both indexes and does IntersectScan.

Thanks a lot for an excellent explanation… vsr1!!!