Following is a simple query that is expected to return aggregated results from an array within the documents, based on a few predicates.
SELECT ARRAY_AGG({items.id, items.customer, items.customerid, items.itemid,
                  items.itemdesc, items.ts, items.category}) AS items
FROM `orders` AS r
UNNEST r.items AS items
WHERE r._type = "orders"
  AND r.storeid = 147
  AND items.ts >= "1552238267"
  AND items.ts <= "1564804800"
GROUP BY r.agentid
I have around 35K documents in the bucket, of which only about 1,500 satisfy these predicates. The bucket's RAM usage is around 348MB and its disk usage is 95MB.
The document structure is:
"order": {
"_id": "o.12345.12",
"_type": "orders",
"storeid": 1234,
"agentid": "123c345",
"items" : [
{
"id" : "1234",
"customer" : "John Doe",
"customerid" : 1234,
"itemid": 12344,
"itemdesc" : "Coffee table",
"ts": "1559721510883",
"category": "furniture"
},
{
"id" : "1235",
"customer" : "John Doe",
"customerid" : 1234,
"itemid": 12345,
"itemdesc" : "Desk lamp",
"ts": "1559721510900",
"category": "Lights and shades"
}
]
}
}
The query is expected to return one aggregated document per agentid, in the following form:
[
  {
    "items": [
      {
        "id": "1234",
        "customer": "John Doe",
        "customerid": 1234,
        "itemid": 12344,
        "itemdesc": "Coffee table",
        "ts": "1559721510883",
        "category": "furniture"
      },
      {
        "id": "1235",
        "customer": "John Doe",
        "customerid": 1234,
        "itemid": 12345,
        "itemdesc": "Desk lamp",
        "ts": "1559721510900",
        "category": "Lights and shades"
      }
    ]
  }
]
As soon as the query is issued (I'm testing from the Couchbase console for now), the cbq-engine process consumes all of the server's resources and soon crashes.
This is a two-node cluster, with both nodes running the data, index, query, and search services. The server is Community Edition 6.0.0 build 1693. I've created indexes on _type and storeid, and also an array index on ts along with _type and storeid. I can see in EXPLAIN that these indexes are used.
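For reference, the index definitions are roughly equivalent to the following (the index names are placeholders I've made up here, and the exact definitions may differ slightly; the array index uses the standard DISTINCT ARRAY syntax):

CREATE INDEX idx_orders_type ON `orders`(_type);
CREATE INDEX idx_orders_type_storeid ON `orders`(_type, storeid);
CREATE INDEX idx_orders_items_ts ON `orders`(DISTINCT ARRAY i.ts FOR i IN items END, _type, storeid);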
I have data for the last 6 months in the database. The query returns results in reasonable time when the time range in the query is small (e.g. 3-10 days). The crash happens mostly for large ranges, i.e. a couple of months or more.
Is this expected behavior from the db? Looking at the number of documents, I'm sure it shouldn't take more than a couple of GB even if the server had to remap the entire dataset in memory. Is there a way I can optimise the query? (Sorry, I'm a novice at N1QL and Couchbase.) I'd appreciate some help here, as our rollout is delayed due to this issue.
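For what it's worth, one variant I was considering (just a sketch, and I'm not sure it is equivalent or any faster) is to filter and project the array per document with an array comprehension instead of UNNEST, so the fan-out happens inside each document, and use ANY ... SATISFIES so the array index can still be applied:

SELECT ARRAY_FLATTEN(ARRAY_AGG(ARRAY {i.id, i.customer, i.customerid, i.itemid,
                                      i.itemdesc, i.ts, i.category}
                               FOR i IN r.items
                               WHEN i.ts >= "1552238267" AND i.ts <= "1564804800" END), 1) AS items
FROM `orders` AS r
WHERE r._type = "orders"
  AND r.storeid = 147
  AND ANY i IN r.items SATISFIES i.ts >= "1552238267" AND i.ts <= "1564804800" END
GROUP BY r.agentid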