Hi Cb Team,
i have some question about indexing sequence and would like to understand , since advised index give some better response time and different index.
i am fetching “orderid” from the documents and did not include “orderid” in my current index since i am using other predicates to filter out the data , now advised index suggested add the “orderid” in my current index , also suggested use where clause isEsOrdTr in the index as well but i also have document which contains isKsOrdTr , isTsOrdTr as well.
Can you please explain
Can you please share a link of any detail document and example for indexing sequence and predicates.
Query
SELECT ARRAY_AGG (b.OrderId) tn
FROM Bucket_Order b
WHERE b.OrderId BETWEEN 0 AND 19
AND b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false
Created index:
CREATE INDEX Order ON Bucket_Order (OrderId,EsOrderTriggerEpoch,isOrdTr)
PARTITION BY hash(OrderId) WITH { “defer_build”:false,“num_partition”:32 }
Advise index by couch base
CREATE INDEX adv_Order ON Bucket_Order (OrderId,OrderNumber,EsOrderTriggerEpoch) WHERE (isEsOrdTr` = false)
[ Note we have 3 types in OrdTr - > 1 : isEsOrdTr , 2 : isKsOrdTr , 3 : isTsOrdTr
SELECT ARRAY_AGG (b.OrderId) AS tn
FROM Bucket_Order b
WHERE b.OrderId BETWEEN 0 AND 19
AND b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false;
CREATE INDEX ix1 ON Bucket_Order (OrderId, EsOrderTriggerEpoch) WHERE isEsOrdTr = false;
CREATE INDEX ix2 ON Bucket_Order (EsOrderTriggerEpoch, OrderId) WHERE isEsOrdTr = false;
See which index performs better and use that (you can add your partition key). Also if you think orderId has duplicates you can use ARRAY_AGG (DISTINCT b.OrderId)
You can avoid the Aggregation also.
SELECT tn
LET tn = (SELECT RAW b.OrderId
FROM Bucket_Order b
WHERE b.OrderId BETWEEN 0 AND 19
AND b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false);
The following will explain why your index( may need to go through lot of index entries) vs recommended make difference. If equality is first or index where clause this can avoid number of index keys lookup can make faster
INDEX ix1 ON Bucket_Order (OrderId, EsOrderTriggerEpoch) WHERE isEsOrdTr = false; - On this index you have suggested Where clause and use equality predicate b.isEsOrdTr=false but it alos contain “True” value as well then how it will work ?
Might be i got your point let me elaborate and correct me if i wrong , basically we need to consider the documents in an index where isEsOrdTr value as “false” in case of value populating as “True” we are not considering in N1QL query then why need to add composite index Ex: ( OrderId , EsOrderTriggerEpoch , isEsOrdTr )
you have also mentioned "you can add your partition key " in your query currently my partition is on orderId number if i will add my partition in my where clause , then still i need to use same field in covering index like below, please explain.
CREATE INDEX ix1 ON Bucket_Order (OrderId, EsOrderTriggerEpoch) WHERE isEsOrdTr = false;
If you want to query on both isEsOrdTr = false and isEsOrdTr=true
CREATE INDEX ix1 ON Bucket_Order (isEsOrdTr, OrderId, EsOrderTriggerEpoch)
PARTITION BY hash( `OrderId` ) WITH { “defer_build”:false,“num_partition”:32 }
If you want query all the values of isEsOrdTr, what ever you have at the beginning is right.
CREATE INDEX Order ON Bucket_Order ( OrderId , EsOrderTriggerEpoch , isOrdTr )
PARTITION BY hash( OrderId ) WITH { “defer_build”:false,“num_partition”:32 }
CREATE INDEX Order ON Bucket_Order ( OrderId , EsOrderTriggerEpoch , isOrdTr )
PARTITION BY hash( OrderId ) WITH { “defer_build”:false,“num_partition”:32 }
but using this index perfomance is very poor 9s for 5k documents , hence now i am thinking use where clause but not sure if i can use where clause and PARTITION BY combine something like below
CREATE INDEX Order ON Bucket_Order ( OrderId , EsOrderTriggerEpoch , isOrdTr )
WHERE isEsOrdTr = false
PARTITION BY hash( OrderId ) WITH { “defer_build”:false,“num_partition”:32 } , but in this index i have question if do PARTITION By OrderId then why need to num_partition”:32 do you have any thoughts …?
Where clause, partition by independent you can use it.
Depends on how many document you have. If small you don’t need PARTITION BY. If you have many you can use default 8 or even less (if have less distinct values more partitions may leave them empty). As you use range on orderId it has to do scatter gather (no partition elimination).
If you want distinct orderId
SELECT RAW b.OrderId
FROM Bucket_Order b
WHERE b.OrderId BETWEEN 0 AND 19
AND b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false
GROUP BY b.OrderId ;
I have tried diffrent options and change the index and run the query this is my results.
CREATE INDEX IDX_Order ON Bucket_Order(EsOrderTriggerEpoch,OrderId)
PARTITION BY hash(OrderId)
WHERE (isEsOrdTr = false)
WITH { “num_partition”:32 }
Executed query - Total documents 800k - response time 871 ms
SELECT tn
LET tn = (SELECT RAW b.OrderId
FROM Bucket_Order b
WHERE b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false
LIMIT 100000);
Note: Spans range { “high”: “1591228785”, “inclusion”: 3, “low”: “1591210768” }
But when i have added additional required filter orderid it took 5 sec to retrive the same amount of orderids
SELECT tn
LET tn = (SELECT RAW b.OrderId
FROM Bucket_Order b
WHERE b.EsOrderTriggerEpoch>=1591210768 AND b.EsOrderTriggerEpoch<=1591228785
AND b.isEsOrdTr=false
AND b.OrderId >= 0 AND 19 <=b.OrderId
LIMIT 100000);
Hi Vsr1,
i have found why my response bumped it up, basically filter by orderId was not included in my index and hence it making a call to a data node to get a order id , i have made query change it worked fine.
CREATE INDEX ix1 ON `Bucket_Order ` ( isEsOrdTr, `EsOrderTriggerEpoch` , `OrderId` )
PARTITION BY hash( `OrderId` );
CREATE INDEX ix2 ON `Bucket_Order ` ( `EsOrderTriggerEpoch` , `OrderId` )
PARTITION BY hash( `OrderId` ) WHERE isEsOrdTr=false;
@mehul, you can check the following:
a. resident percent of the index(UI stats). If it is less than 100, allocate more memory so the scans don’t need to do disk fetch.
b. index partitions can be scanned in parallel if there are enough cores. with 16 partitions on a box, ideally it needs 16 cores for full concurrency. Also check how many cores have been allocated to the index service(UI->Settings).
Hi Deepkaran,
Do you have any document which shows how the data been store into index in the couchabase.
Now i am curioues since i have multicolumn indexes.
I have also checked resident percent and it 1oo , and for cores set up working with DBA team. Once we set up everything, we will run query and see hows query perfrome.
Hi Deepkaran,
I have checked ui-setting but not able to find cores section assigned to a index node.
I saw Ram assigned to a node , analytics , eventing but not cores , can you please navigate me.
Can, you please also explain
N1QL uses stale=ok for a consistency model . It was observed that this query utilized 24 cores completely to achieve an 80% latency of 5ms against a bucket of 20M documents.