I have a requirement to group the data by time intervals, (15 min, 30 min, 1 hour, 2 hour, A day etc). And count the number of partySize (its a field in the doc)
To explain my case better below are the sample docs that I created
- INSERT INTO places VALUES (“def01”,{ “createdAt”: “2023-08-07T01:00:00”, “name”: “andrea” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def02”, { “createdAt”: “2023-08-07T01:10:00”, “name”: “alex” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def03”, { “createdAt”: “2023-08-07T03:10:00”, “name”: “bob” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def04”, { “createdAt”: “2023-08-07T03:15:00”, “name”: “ben” , “partySize”:“5”, “tp”:“qp”});
- INSERT INTO places VALUES (“def05”, { “createdAt”: “2023-08-07T04:15:00”, “name”: “chris” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def06”, { “createdAt”: “2023-08-07T03:03:00”, “name”: “bacca” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def07”, { “createdAt”: “2023-08-07T03:17:00”, “name”: “buchi” , “partySize”:“1”, “tp”:“qp”});
- INSERT INTO places VALUES (“def08”,{ “createdAt”: “2023-08-07T01:10:00”, “name”: “andy duplicate” , “partySize”:“1”, “tp”:“qp”});
I was able to achieve what i needed by below query (yet tested only on smaller data set).
1. Trying to group them by intervals of 15 minutes from start and end datetime
SELECT time AS createdAt,
SUM(
(CASE WHEN DATE_DIFF_MILLIS(MILLIS(d.createdAt), MILLIS(time),‘millisecond’) BETWEEN 0 AND 899999 THEN d.partySize ELSE 0 END)
) AS count
FROM places AS d UNNEST DATE_RANGE_STR(“2023-08-07T01:00:00”, “2023-08-07T05:00:00”, ‘minute’, 15) AS time
WHERE d.tp=“qp” AND d.createdAt IS NOT NULL
GROUP BY time
ORDER BY time
Below is the output
[
{
“count”: 3,
“createdAt”: “2023-08-07T01:00:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T01:15:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T01:30:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T01:45:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T02:00:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T02:15:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T02:30:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T02:45:00”
},
{
“count”: 2,
“createdAt”: “2023-08-07T03:00:00”
},
{
“count”: 6,
“createdAt”: “2023-08-07T03:15:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T03:30:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T03:45:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T04:00:00”
},
{
“count”: 1,
“createdAt”: “2023-08-07T04:15:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T04:30:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T04:45:00”
}
]
2. try same query in span of 30 minutes
SELECT time AS createdAt,
SUM(
(CASE WHEN DATE_DIFF_MILLIS(MILLIS(d.createdAt), MILLIS(time),‘minute’) BETWEEN 0 AND 30 THEN d.partySize ELSE 0 END)
) AS count
FROM places AS d UNNEST DATE_RANGE_STR(“2023-08-07T01:00:00”, “2023-08-07T05:00:00”, ‘minute’, 30) AS time
WHERE d.tp=“qp” AND d.createdAt IS NOT NULL
GROUP BY time
ORDER BY time
Below is the output, i have omitted the 0 value ouput but notice the time intervals are seperated by 30 mins
{
“count”: 3,
“createdAt”: “2023-08-07T01:00:00”
},
{
“count”: 8,
“createdAt”: “2023-08-07T03:00:00”
},
{
“count”: 1,
“createdAt”: “2023-08-07T04:00:00”
},
{
“count”: 0,
“createdAt”: “2023-08-07T04:30:00”
}
3. Tried with 1 hour, and this is very the flaw in the query surfaced, my approach was same but the results are incorrect.
SELECT time AS createdAt,
SUM(
(CASE WHEN DATE_DIFF_MILLIS(MILLIS(d.createdAt), MILLIS(time),‘hour’) BETWEEN 0 AND 23 THEN d.partySize ELSE 0 END)
) AS count
FROM places AS d UNNEST DATE_RANGE_STR(“2023-08-07T01:00:00”, “2023-08-07T05:00:00”, ‘hour’, 1) AS time
WHERE d.tp=“qp” AND d.createdAt IS NOT NULL
GROUP BY time
ORDER BY time
I took some inspiration from this post
Please Help @geraldapeoples @vsr1