Hey all, I have a complex query that’s very slow when we introduce the ORDER BY. I was wondering if anybody could help me think outside the box to get this faster. Current query is taking around 8-10 seconds when processing ~1500 docs for a user.
We have item
type documents which look like this:
{
"type": "item",
"owner": "dGVzdEBzaG9wc3RyZWFtLmlv",
"labels": [
{
"categoryName": "Price",
"text": "$10"
},
{
"categoryName": "Style",
"text": "Dress"
},
{
"categoryName": "Size",
"text": "M"
}
]
}
Each item has a ‘labels’ array, each label has a ‘text’ and ‘categoryName’
What we need to do is query items, and order them dynamically at query time. The ordering needs to happen on multiple levels. This example below is our current implementation. It will order item docs by Style first in the order of (hat, dress, skirt), then by Size (XXS, XS, S, etc) and finally order by Price (low to high)
SELECT item,
IFNAN(NANIF(ARRAY_POSITION(["HAT","DRESS","SKIRT"], sortLabel_0), -1), 999) as sort_0,
IFNAN(NANIF(ARRAY_POSITION(["XXS","XS","S","M","L","XL","XXL"], sortLabel_1), -1), 999) as sort_1,
IFNAN(NANIF(ARRAY_POSITION(["$5","$10","$20"], sortLabel_2), -1), 999) as sort_2
FROM database AS item
LET sortLabel_0 = (SELECT RAW UPPER(l.text)
FROM database i
USE KEYS META(item).id
LEFT OUTER UNNEST i.labels AS l
WHERE upper(l.categoryName) = "STYLE")[0],
sortLabel_1 = (SELECT RAW UPPER(l.text)
FROM database i
USE KEYS META(item).id
LEFT OUTER UNNEST i.labels AS l
WHERE upper(l.categoryName) = "SIZE")[0],
sortLabel_2 = (SELECT RAW UPPER(l.text)
FROM database i
USE KEYS META(item).id
LEFT OUTER UNNEST i.labels AS l
WHERE upper(l.categoryName) = "PRICE")[0]
WHERE item.owner = "bHVsYXJvZXNoZXJ5bEBpY2xvdWQuY29t"
AND item.type = "item"
AND item._sync.rev IS NOT NULL
AND _deleted IS MISSING
ORDER BY sort_0 ASC, sortLabel_0 DESC, sort_1 ASC, sortLabel_1 DESC, sort_2 ASC, sortLabel_2 DESC, item.created DESC
OFFSET 0 LIMIT 200
This query is basically saying, for each item.label, find the Style and find it’s text position in the Style array. Then do the same for Size and then Price. If any of the texts are not found in the array, order it alphabetically after all the other found ones.
In the query above, labelSort_x is going to be the label text that matches the category, and sort_x is it’s actual position in the “order array”.
We have indexes on doc type, owner, label.text, label.categoryName. The query is fine, until we do Order By.
CREATE INDEX ss_idx_item_label ON database((label.text),(label.categoryName)) WHERE (type = item)
My question is, does anybody seen an immediate way to speed this up - or even a whole new approach to this? Again, the actual order needs to be dynamic. This is slow, because I’m guessing every item doc for this user is unnesting it’s labels and comparing it to the order array, which is very expensive it seems.
We’re new to Couchbase and still learning as much as we can, but this has us stumped.
Thank you very much!
-Tim
Edit: The reason we’re passing in the order to follow is because it’s defined at run / query time