Hi,
I have table purchases with such structure:
SELECT VALUE a
FROM purchases a
LIMIT 2
[
{
"type": "type_x",
"customer": "c4211995-e113-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-tim4-7856-boa3-d60ae9eb7ade",
"price": 13
},
{
"type": "type_x",
"customer": "c4214985-e419-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-ffb4-4626-b1a3-d60ae9eb7ade",
"price": 12
}
]
To speed up this query :
SELECT
customer,
seller,
SUM(price),
count(*)
FROM purchases
WHERE customer IS NOT MISSING
and seller IS NOT MISSING
GROUP BY
customer,
seller
I use such index :
CREATE INDEX adv_customer_seller_price ON `default`:`purchases`(`customer`,`seller`,`price`)
Without index query is completed in 16.1s, then with index it is completed in 358.2ms
Then there is a similar structure, only now data is in an array:
SELECT VALUE a
FROM purchases_array a
LIMIT 2
[
{
"arr": [
{
"type": "type_x",
"customer": "c4214985-e419-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-ffb4-4626-b1a3-d60ae9eb7ade",
"price": 12
},
{
"type": "type_x",
"customer": "c4214985-e419-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-ffb4-4626-b1a3-d60ae9eb7ade",
"price": 12
}
]
},
{
"arr": [
{
"type": "type_x",
"customer": "c4214985-e419-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-ffb4-4626-b1a3-d60ae9eb7ade",
"price": 12
},
{
"type": "type_x",
"customer": "c4214985-e419-4cf1-98cb-93bb06b5847e",
"seller": "44ca450c-ffb4-4626-b1a3-d60ae9eb7ade",
"price": 12
}
]
}
]
I need to speed up this query:
SELECT a.customer,
a.seller,
SUM(a.price),
COUNT(*)
FROM purchases_array p
UNNEST p.arr a
WHERE a.customer IS NOT MISSING
and a.seller IS NOT MISSING
GROUP BY a.customer,
a.seller
It runs in 8.2s without an index.
Index Advisor suggests such index:
CREATE INDEX adv_ALL_arr_seller_customer ON `default`:`purchases_array`(
ALL ARRAY FLATTEN_KEYS(`a`.`seller`,`a`.`customer`) FOR a IN `arr` END
)
But this index did not speed up the query.
I also tried such index:
CREATE INDEX adv_ALL_arr_seller_customer ON `default`:`purchases_array`(
ALL ARRAY FLATTEN_KEYS(`a`.`seller`,`a`.`customer`,`a`.`price`) FOR a IN `arr` END
)
with such query:
SELECT a.customer,
a.seller,
SUM(a.price),
COUNT(*)
FROM purchases_array p
UNNEST p.arr a
WHERE a.customer IS NOT MISSING
and a.seller IS NOT MISSING
and a.price IS NOT MISSING
GROUP BY a.customer,
a.seller
It didn’t help either.
Is there a way to speed up the query, and preferably as fast as in the case of an expanded structure (as in purchases)?