Global Secondary Indexes do not provide automatic built-in replicas.So I can place two indexes on two separate nodes (nodeA and nodeB) on the same fields by
CREATE INDEX Index1_beer_name
ON `beer-sample`(name)
WHERE type="beer" USING GSI WITH {"nodes":["nodeA:8091"]};
CREATE INDEX Index2_beer_name
ON `beer-sample`(name)
WHERE type="beer" USING GSI WITH {"nodes":["nodeB:8091"]};
And I also can use USE INDEX in N1QL to hint which index should be used in N1QL.such as
SELECT name FRPM default WHERE type = "beer";
or
SELECT name FRPM default USE INDEX ( Index1_beer_name ) WHERE type = "beer";
or
SELECT name FRPM default USE INDEX ( Index2_beer_name ) WHERE type = "beer";
I have some questions:
If I don’t hint index in N1QL, and assume that this N1QL will use the index that created above,Which Index will be choose by QueryService? And what is the strategy?
If I have 2 Query Services, and I want to use USE INDEX to hint index in N1QL(because my application have several N1QL,and I created a lot of index,If I don’t hint index in N1QL,the query plan might not use th
At the prepare time, query chooses the suitable index. If you’ve given a index name, it will use it. At runtime, we load balance the index scans among the equivalent indexes based on the index definition. In this case both indexes qualify and so it will randomly use one of the 2 indexes (when no index hints are given).
When you use index hints it forces the query to use that index but only if the index qualifies. In the event the index doesn’t exist, you get an error. If the index exists but doesn’t qualify for the given query, it reverts to a primary scan in 4.5.0 and 4.5.1.
Just to be sure that I understand this right. Given the example above, if I add the replica in the index hint and use the statement
SELECT name FRPM default USE INDEX ( Index1_beer_name, Index2_beer_name ) WHERE type = "beer";
load balancing will work fine, I suppose. But
So if I add the replica in the index hint, I will get an error if nodeA with the Index1_beer_name is down?
If yes: Is there any possibility to use index hints and make use of the index replicas for higher availability? Or is the best practice to do without index hints if availability is important?
For now, you are best off avoiding index hints if you want HA.
The problem is that the indexer currently cannot distinguish between a non-existent index and an unavailable index. Both are reported as errors and the query errors out.
If you avoid index hints, you don’t run into this issue.
Thanks for the fast and helpful answer, even if the content is not really good news for us. Are there any plans to make GSI more usable in situations where availability is relevant in the foreseeable future?
The need of different names for index replicas on other nodes means our application must have quite detailed information on the Couchbase infrastructure and work with it quite deep in the application logic. At least if we want/must use index hints or otherwise refer to specific indices. The CREATE INDEX syntax WITH {“nodes”:[“nodename”]} suggests that someone thought about the possibility of creating an index with replicas under the same name on several nodes. That would be great, as the index replication could be transparent to the application. Is this intuitively expected feature postponed, dropped or not planned at all?
Yes, your observation is correct! We intend to provide easier methods to create and manage index replicas which is the reason it is an array. It should be available very soon. Until that point, as @geraldss indicates, it is best to avoid hints in HA situations.
We would like to use GSI Index Replication on multiple nodes (for HA), but would be able to use index hints in some queries (because they are faster with index hint in some specific cases).
Our wishlist would be:
easier way replicate GSI indexes on multiple notes without having to worry about unique index names
a way to use USE INDEX hints in N1QL queries without having to worry about on which node the index is.