Alternative to subquery for this specific use case?

Assuming that the data as follow,

metaid | document
test1 | {“id”:1,“name”:“a”,“deleted”:false}
test2 | {“id”:2,“name”:“a”,“deleted”:false}
test3 | {“id”:3,“name”:“a”,“deleted”:true}

select zd.idx from
(select meta().id as idx, deleted from data where id < 4 order by id DESC limit 1) as zd
where deleted = false;

The expected result is null as the subquery will return test 3 and the outer condition will filter it.

I wonder if there is an alternative to subquery in this specific use case.

Can’t you just do SELECT meta().id FROM data WHERE deleted = true ? Or do you need something more specific?

If you want the meta id of the deleted document with the minimum id, that would be

SELECT meta().id FROM data WHERE id = (SELECT min(t.id) FROM data as t WHERE deleted = true)

If deleted = false needs to apply afterwords you need to use subquery.

Also you can try the following query

SELECT zd.idx  FROM 
(SELECT RAW MAX([id,{"idx":META().id,"deleted":deleted}])[1]  FROM data WHERE id < 4 ) AS zd
WHERE  zd.deleted = false;

The following is another option but this may be expensive due to GROUP BY.
HAVING clause is allowed with GROUP BY. To add HAVING I add GROUP BY non existent field so that all can be put in single group

INSERT INTO default VALUES("test1", {"id":1,"name":"a","deleted":false}),
                    VALUES("test2", {"id":2,"name":"a","deleted":false}),
                    VALUES ("test3",{"id":3,"name":"a","deleted":true});

SELECT maxdoc.idx FROM default WHERE id < 4 GROUP BY nonexistentfield
LETTING maxdoc = (MAX([id,{"idx":META().id,"deleted":deleted}])[1])
HAVING (maxdoc.deleted = false);

@johan_larson: Yes I do need something more specific. The idea is to get the largest id, and should the value, ‘deleted’ is true, return null. Otherwise return test3. Should a new record is inserted test4 { id = 4, deleted = false; }

The following query will still return null.
select zd.idx from
(select meta().id as idx, deleted from data where id < 4 order by id DESC limit 1) as zd
where deleted = false;

@vsr1: Thank you for the ‘group by’ statement. It is clear that the group by statement can be easily expand to the following which returns records of distinct name of the largest id if it is not deleted.

SELECT maxdoc.idx FROM default WHERE id < 4 GROUP BY name
LETTING maxdoc = (MAX([id,{“idx”:META().id,“deleted”:deleted}])[1])
HAVING (maxdoc.deleted = false);

To expand the discussion, I wonder what will be the performance of the group by statement if you are well aware that the record will either be 1 record or none.

If there is 1 group or none should not be any performance impact. If there is aggregate it automatically as dummy group by.

You also mention that it can be expensive due to group by which I would presume is scenario driven. Perhaps I should present my use cases instead.

So there are 2 possible use-cases
INSERT INTO default VALUES(“test1”, {“id”:1,“name”:“a”,“deleted”:false});
INSERT INTO default VALUES(“test2”, {“id”:2,“name”:“a”,“deleted”:false});
INSERT INTO default VALUES(“test3”, {“id”:3,“name”:“b”,“deleted”:false});
INSERT INTO default VALUES(“test4”, {“id”:4,“name”:“c”,“deleted”:false}),
INSERT INTO default VALUES(“test5”, {“id”:5,“name”:“c”,“deleted”:true});

  1. We want the meta.id of each distinct name with id smaller than n. If the document (with the largest id) is deleted, skip this document
    The expected result is test2, test3 if n is 6

SELECT maxdoc.idx FROM default WHERE id < n GROUP BY name
LETTING maxdoc = (MAX([id,{“idx”:META().id,“deleted”:deleted}])[1])
HAVING (maxdoc.deleted = false);

  1. We want the meta.id of a specific name id smaller than n. If the document (with the largest id) is deleted, return null.
    The expected result is test2 if n is 6 and m is a
    The expected result is test4 if n is 5 and m is c
    The expected result is null if n is 6 and m is c

select zd.idx from
(select meta().id as idx, deleted from data where name = m AND id < n order by id DESC limit 1) as zd
where deleted = false;

Do you think, for 2 that I should use this instead
SELECT maxdoc.idx FROM default WHERE name = m and id < n GROUP BY donkeywhichdoesnotexist
LETTING maxdoc = (MAX([id,{“idx”:META().id,“deleted”:deleted}])[1])
HAVING (maxdoc.deleted = false);

Hi @wai.kwang.mak,
Use case 1 you need separate result for each name the query is right and there is no other way you can achieve without group by.

Use Case 2, You can use GROUP BY or SUBQUERY but i prefer subquery.
The reason is non required clause adding will few CPU cycles, what happens tomorrow added this field in document whole results will change. Is this show in performance difference i don’t think so.

@vsr1 Thank you for explanation. I will work towards your suggestion.

Hi expert,
why did not work this query?
I will appreciate if anyone let me know the reason.

Thanks.

Hi @vhp1360,

Could you please post query that you have problem.

I meant the top question

Hi @vhp1360,

The query in post #1 works correctly. @wai.kwang.mak would like to know is there alternative way to write the query without subquery.