Hello! Is it possible to have a query where you return the total value of records AND the first n records of that query? I have the following query, which works if there are 5 or more records, but doesn’t work for less records than that:
SELECT ARRAY_AGG(gm.userId)[0:5] as users, COUNT(1) as totalUsers
FROM default gm
WHERE gm.docType="GroupMessage"
GROUP BY gm.groupId
Also, it probably fetches the whole array into memory, just to fetch 5 elements, is this correct?
Ideally, in the end, I’d like to have a query where I’d join the array of userIds with the users bucket, while keeping the information of the group chat and group messages. Do you guys have any suggestions?
First grouping needs to complete before apply array slice. i.e. If given group if there are 100 items ARRAY_AGG() needs to collect all items will not terminate after reaching 5.
If you need to JOIN don’t reduce slice during projection reduce during JOIN using this
users[0:LEAST(5,ARRAY_LENGTH(users))]
SELECT gc.message, ARRAY_AGG(users.firstName) as users, innerQuery.totalUsers
FROM (
SELECT ("GroupChat::" || gm.groupId) as chats, ARRAY_AGG("User::" || gm.userId) as users, COUNT(1) as totalUsers
FROM default gm
WHERE gm.docType="GroupMessage"
GROUP BY gm.groupId
) as innerQuery
JOIN default users ON KEYS innerQuery.users0:LEAST(5,ARRAY_LENGTH(innerQuery.users))]
JOIN default gc ON KEYS innerQuery.chats
WHERE users.docType="User"
GROUP BY gc, innerQuery.totalUsers;
Thanks for the fast reply! Could you elaborate on what you mean on the grouping before the slice?
I was able to come up with this query:
SELECT gc.message, ARRAY_AGG(users.firstName) as users, innerQuery.totalUsers
FROM (
SELECT ("GroupChat::" || gm.groupId) as chats, ARRAY_AGG("User::" || gm.userId)[0:5] as users, COUNT(1) as totalUsers
FROM default gm
WHERE gm.docType="GroupMessage"
GROUP BY gm.groupId
) as innerQuery
JOIN default users ON KEYS innerQuery.users
JOIN default gc ON KEYS innerQuery.chats
WHERE users.docType="User"
GROUP BY gc, innerQuery.totalUsers
which gives me what I want. The problems that persist are the following:
If the array has less than 5 elements, the result doesn’t come up
Ideally, I’d like to process as few info as possible in memory. If I could limit the number of elements in the array a priori it would be the best.
Nice, that works! But if I have 1M users in a group message, and have 10 group messages, I’ll be sending all of that info from the inner select to the outer scope, right? There’s no way to limit that right of the bat?
SELECT gc.message, ARRAY_AGG(users.firstName) as users, innerQuery.totalUsers
FROM (
SELECT ("GroupChat::" || gm.groupId) as chats, allusers[0:LEAST(5,ARRAY_LENGTH(allusers)] as users, COUNT(1) as totalUsers
FROM default gm
WHERE gm.docType="GroupMessage"
GROUP BY gm.groupId
LETTING allusers = ARRAY_AGG("User::" || gm.userId)
) as innerQuery
JOIN default users ON KEYS innerQuery.users
JOIN default gc ON KEYS innerQuery.chats
WHERE users.docType="User"
GROUP BY gc, innerQuery.totalUsers;
Yes, this one works as well, and it seems more efficient
But it still does load all of the user entries into an array before limiting it. Do you reckon that performance in this query won’t suffer if the number of elements is too big? I guess that what I am asking is that if there’s a way to use a sort of a LIMIT statement just for the records we’re fetching, or maybe structure the query in a different way…