Using subselect as joined

eyalrubichi · January 31, 2016, 6:29am

When performing a JOIN operation, i have noticed that a subselect can be used on the left side of the join, but when trying it on the right side i could’nt make the query get compiled due to syntax errors.
Is there any way i can do something like:
SELECT * FROM bucket1 JOIN (SELECT * FROM bucket2 where…)

ldoguin · February 1, 2016, 9:21am

Hi, right now we only support Left outer joins. @geraldss can tell you more about this, he is our N1QL master.

cihangirb · February 2, 2016, 1:01am

you can do this using the JOIN syntax today.
SELECT c.state FROM
(SELECT b.brewery_id FROM beer-sample b WHERE type=“beer”) as a
JOIN beer-sample c ON KEYS a.brewery_id
WHERE c.type=“brewery”;

would this help?

eyalrubichi · February 2, 2016, 5:43am

Hi, thanks for the reply.
Yes, i noticed that this syntax works, but the syntax i was asking about should have the subselect on the right side of the JOIN.

for example, if B is my bucket and S is a select expression, i am aware of being able to do SELECT S JOIN B but what about SELECT B JOIN S

geraldss · February 2, 2016, 4:10pm

Hi @eyalrubichi,

No, that syntax is not currently supported. The reason is that N1QL performs JOINs by doing a key-value lookup of the right-hand side. This is not possible for generalized subqueries.

If you post or email your complete query, it may be rewritable using JOIN syntax. The right-hand subquery can usually be inlined into the outer query.

Thanks,
Gerald

eyalrubichi · February 2, 2016, 4:21pm

Hi,
thanks, here is the query that i want to write in an efficient way.
The data exists in one bucket with two ‘schemas’, user and user_device, while each user_device document has a user_id reference (so the relation is many-to-one and not one-to-many).
The ouput of the query should be all the users which the last activity of the recent device they used is between the given timestamps.

Currently i could only do this by using EXCEPT. While the first part runs quickly (small subset of the documents i need but this subset contains also documents that i don’t want), the second part kills that couchbase and calculation takes forever.

If i could use the first part and join this data (by user_id) it would run much faster, but i use CB4.0 and probably have no option of joining the data in the way that i want.

SELECT
meta(usr).id user_id,
usr.email as user_email,
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE
dvc.last_run BETWEEN 1452067057 AND 1452097057
GROUP BY meta(usr).id
EXCEPT
SELECT
meta(usr).id user_id,
usr.email as user_email,
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE
dvc.last_run > 1452097057
GROUP BY meta(usr).id;

geraldss · February 2, 2016, 6:22pm

Can you try the following query, and make sure you have a covering index on dvc(last_run, user_id).

SELECT 	meta(usr).id user_id,
	MIN(usr.email) as user_email
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE	dvc.last_run >= 1452067057
GROUP BY meta(usr).id
HAVING MAX(dvc.last_run) <= 1452097057;

Topic		Replies	Views
Subqueries and join in n1ql SQL++ n1ql	9	4104	January 4, 2016
Join with subquery including empty value SQL++	1	610	April 16, 2019
Subselect result on Join not working SQL++ query , n1ql	10	954	March 24, 2021
Join using joined bucket as key SQL++ query , n1ql , server	8	1658	June 6, 2018
Full outer join SQL++ query , n1ql	2	1251	January 5, 2022

Using subselect as joined

Related topics