I am tryng to improve the performance of the following query.
The query takes around 500-900ms to run.
By checking the the “plan” of the query, the LET section takes up most of the time (90%) to go through the list of availability of each product to look for the matched location.
The query involves a few join and unnest, I am not sure how to build an index that could speed up the LET part here.
The join includes joining 3 documents together, 2 of them are in the same collection.
Any help will be really appreciated. Thanks
The query
SELECT items.itemId AS __id, META(items).cas AS __cas,
lists._id,
lists.type,
lists.uid,
lists.updatedDate,
{
"itemId":items.itemId,
"item":OBJECT_PUT(products, "availability", ava), /* I want to return the availability object of the locationId that I am querying instead of the whole list of availability */
"score": items.score,
"lastPurchaseDate":items.lastPurchaseDate
} AS joinedItem
FROM `bucket1`.`scope1`.`lists` AS lists USE KEYS "ITEM_LIST:USER100" /* The USER100 here is passed in as a variable when forming the query */
LEFT JOIN `bucket1`.`scope1`.`lists` AS rmlists ON KEYS CONCAT('REMOVED_ITEM_LIST:',lists.uid)
UNNEST lists.items AS items
INNER JOIN `bucket1`.`scope1`.`products` AS products ON KEYS CONCAT('PRODUCTS:',items.itemId)
LET rmItem = FIRST i FOR i IN rmlists.items WHEN i.itemId = items.itemId AND i.deleteDate != '' END, /* I want to filter out any item in the ITEM_LIST that exists in REMOVED_ITEM_LIST
and the rmlists.deleteDate is later than the items.lastPurchaseDate */
ava = FIRST a FOR a IN products.availability WHEN a.locationId = "bbbb-bbbb" END /* The locationId here is a variable that being passed in when forming the query.
I want to get the availability object of the locationId that I am querying */
WHERE (rmItem IS MISSING OR items.lastPurchaseDate > rmItem.deleteDate)
AND (ava IS NOT MISSING AND ava IS NOT NULL)
AND (products.dontShow IS MISSING OR products.dontShow IS NULL OR products.dontShow = false)
/* The following filters will sometimes be added depends on the business logic */
AND items.score > 0
AND ava.status = 1
AND ava.offer = true
AND products.brands IN ['BRAND_1']
AND EVERY cat IN ['CAT_B', 'CAT_C'] SATISFIES cat IN products.category END
/* The following sorting will be added depends on the business logic */
ORDER BY products.productOrder ASC
ORDER BY items.score DESC
LIMIT 50
The data model
Collection: `bucket1`.`scope1`.`lists`
Doc key: REMOVED_ITEM_LIST:USER100
{
"items": [ /* This array could have few hundred items */
{
"itemId": "1111",
"deleteDate": "2024-01-01T07:11:03Z"
},
{
"itemId": "2222",
"deleteDate": "2024-07-28T09:07:14Z"
}
],
"_id": "REMOVED_ITEM_LIST:USER100",
"type": "RemovedItemList",
"name": "RemovedItemList",
"uid": "USER100",
"updatedDate": "2024-07-01T07:11:03Z"
}
Collection: `bucket1`.`scope1`.`lists`
Doc key: ITEM_LIST:USER100
{
"items": [ /* This array could have few hundred items */
{
"itemId": "1111",
"score": 0.5
"lastPurchaseDate": "2024-03-28"
},
{
"itemId": "2222",
"score": 0.1
"lastPurchaseDate": "2024-03-28"
},
{
"itemId": "3333",
"score": 0.9
"lastPurchaseDate": "2024-07-10"
}
],
"type": "ItemList",
"uid": "USER100",
"updatedDate": "2024-07-10",
"_id": "ITEM_LIST:USER100"
}
Collection: `bucket1`.`scope1`.`products`
Doc key: PRODUCTS:1111
{
"productCode" : "1111"
"name": "4TB SSD"
"availability": [ /* This array could have few hundred items */
{
"locationId": "aaaa-aaaa",
"status": 1
},
{
"locationId": "bbbb-bbbb",
"status": 1,
"offer": true
},
{
"locationId": "cccc-cccc",
"status": 1
}
],
"brand": "BRAND_1",
"category": [
"CAT_A",
"CAT_B",
"CAT_C"
]
"type": "product",
"productOrder": 110,
"dontShow": false,
"updated": 1722343233
}