Complex N1QL query with UNNEST need other lookup if empty

I have this simplified query that I try to get to work (the original query does some grouping etc. on more fields):

SELECT d.year,
       d.month,
       d.targetspecies,
       sourcecreated,
       IFMISSINGORNULL(d.hoursfished,0)*60 + IFMISSINGORNULL(d.minutesfished,0) minutesfished,
       IFMISSINGORNULL(ARRAY_LENGTH(d.catchkeys),0) catchcount,
       CASE WHEN ARRAY_LENGTH(d.catchkeys) = 0 THEN 1 ELSE 0 END zerotrips
FROM data d
UNNEST d.targetspecies specieskey
LET sourcecreated = IFMISSINGORNULL(d.sourcecreated,(CASE WHEN d.userkey IS NOT VALUED THEN '4' ELSE (CASE WHEN UPPER(d.`key`)=d.`key` THEN '1' ELSE '2' END) END))
WHERE d.type='FishingTrip'
    AND d.assockey='1'
    AND d.statspublic
    AND d.year IS VALUED
    AND d.month IS VALUED
ORDER BY d.year DESC,
         d.month,
         specieskey,
         sourcecreated
LIMIT 1000

Basically, for some of the documents I have an empty d.targetspecies. If I use the “UNNEST” then all of the documents with the empty field are not included in the query result (even if I don’t show the unnested values as above).

If there is an empty d.targetspecies then what I want is really to check all of the keys in d.catchkeys (0-n) and do a lookup to find the value of species in each of those and return the unique array of these values as d.targetspecies. The key for the direct lookup can be built like Catch:<key>.

Edit:
What I am trying to obtain is something like:

SELECT d.year,
       d.month,
       specieskey,
       sourcecreated,
       SUM(IFMISSINGORNULL(d.hoursfished,0)*60) + SUM(IFMISSINGORNULL(d.minutesfished,0)) minutesfished,
       SUM(IFMISSINGORNULL(ARRAY_LENGTH(d.catchkeys),0)) catchcount,
       SUM(CASE WHEN ARRAY_LENGTH(d.catchkeys) = 0 THEN 1 ELSE 0 END) zerotrips
FROM data d
UNNEST d.targetspecies specieskey
LET sourcecreated = IFMISSINGORNULL(d.sourcecreated,(CASE WHEN d.userkey IS NOT VALUED THEN '4' ELSE (CASE WHEN UPPER(d.`key`)=d.`key` THEN '1' ELSE '2' END) END))
WHERE d.type='FishingTrip'
    AND d.assockey='1'
    AND d.statspublic
    AND d.year IS VALUED
    AND d.month IS VALUED
GROUP BY d.year,
         d.month,
         specieskey,
         sourcecreated
ORDER BY d.year DESC,
         d.month,
         specieskey,
         sourcecreated

I have moved out on too deep water for my query knowledge - but can someone perhaps point me in the right direction to solve it - if this can be done?

/John

Basically, for some of the documents I have an empty d.targetspecies. If I use the “UNNEST” then all of the documents with the empty field are not included in the query result (even if I don’t show the unnested values as above).

Have you explored LEFT UNNEST ?

Oh yes, that got rid of the first problem!

So now I can continue to try and do the lookup for any “child” documents if the value for d.targetspecies is empty :+1:

Hmmmm… I know that I can do something like this for 1:1 join:

:
FROM data AS d LEFT
    JOIN data AS c ON 'Catch:'||d.`catchkey` = META(c).id
    AND c.type='Catch'
:

However, if I have:

  "catchkeys": [
    "CA99EEDB3096E806C1258AE1004A7AB6",
    "491F860894B10392C1258AE1004A578B",
    "386BF6ED8F7C4415C1258AE1004A4A8B",
    "491A1F1F6DF20399C1258AE1004A6B17"
  ],

how can I then get the specieskey from each of those documents and combine to a unique list using meta ids like: Catch:CA99EEDB3096E806C1258AE1004A7AB6?

Trying something like this:

:
FROM data d 
LEFT JOIN data c ON 'Catch:'||catchkey = META(c).id
LEFT UNNEST d.catchkeys catchkey
LEFT UNNEST d.targetspecies specieskey
:

… but it doesn’t recognize the catchkey in the LEFT JOIN line.

SELECT d.year, d.month, d.targetspecies, sourcecreated,
       IFMISSINGORNULL(d.hoursfished,0)*60 + IFMISSINGORNULL(d.minutesfished,0) minutesfished,
       IFMISSINGORNULL(ARRAY_LENGTH(d.catchkeys),0) catchcount,
       CASE WHEN ARRAY_LENGTH(d.catchkeys) = 0 THEN 1 ELSE 0 END zerotrips
FROM data d
UNNEST (CASE WHEN ARRAY_LENGTH(d.targetspecies) > 0
            TEHN d.targetspecies
            ELSE (SELECT RAW c.speices FROM data AS c USE KEYS (ARRAY 'Catch:'|| v FOR v IN d.catchkeys END))
            END) specieskey
LET sourcecreated = IFMISSINGORNULL(d.sourcecreated,(CASE WHEN d.userkey IS NOT VALUED THEN '4' ELSE (CASE WHEN UPPER(d.`key`)=d.`key` THEN '1' ELSE '2' END) END))
WHERE d.type='FishingTrip'
    AND d.assockey='1'
    AND d.statspublic
    AND d.year IS VALUED
    AND d.month IS VALUED
ORDER BY d.year DESC,
         d.month,
         specieskey,
         sourcecreated
LIMIT 1000;

Change fields in subquery

1 Like

Oh, this is so cool… :pray: You really are a SQL++ superman! :man_superhero:

I had one slight challenge left (took some testing and verification to find). But if I have a fishing trip without targetspecies and I have more catches of the same type of species then the aggregation counted too many trips and catches (in this sample the double).

This is a simple sample of the data, where I have a fishing trip without targetspecies:

    "catchkeys": [
      "CA99EEDB3096E806C1258AE1004A7AB6",
      "491F860894B10392C1258AE1004A578B",
      "386BF6ED8F7C4415C1258AE1004A4A8B"
    ],
    "key": "4D900788E957FFA1C1258AE1004A4A48",

It has 3 catchkeys and for each of them there is a specieskey (I incorrectly just used species in the question). So for each I have:

{
  "key": "CA99EEDB3096E806C1258AE1004A7AB6",
  "specieskey": "13",
  :
}
:
{
  "key": "491F860894B10392C1258AE1004A578B",
  "specieskey": "13",
  :
}
:
{
  "key": "386BF6ED8F7C4415C1258AE1004A4A8B",
  "specieskey": "32",
  :
}

To get the right species keys back when grouping I just had to change the UNNEST to include DISTINCT:

UNNEST (CASE WHEN ARRAY_LENGTH(d.targetspecies) > 0 THEN d.targetspecies ELSE (
        SELECT DISTINCT RAW c.specieskey
        FROM data AS c USE KEYS (ARRAY 'Catch:'|| v FOR v IN d.catchkeys END)) END) specieskey

… and now it works like a charm! I can’t help being impressed at what you can obtain with SQL++ and Couchbase!

Thanks!

I just wanted to update here in case someone else came across the question and dialogue.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.