Build result document with multiple nested array logic

ID: MSGPQ:201:E81863521C48:2020-10-29

{
  "propCode": "MSGPQ",
  "roomNumber": "201",
  "macAddress": "E81863521C48",
  "date": "2020-10-29",
  "source": "linearTv",
  "lastReportedTime": 1603227647,
  "channels": [
    {
      "channelName": "HBO",
      "channelId": 22,
      "sessions": [
        {
          "isOccupied": 1,
          "startTime": 1603227747,
          "stopTime": 1603227847,
          "duration": 665,
          "created": 1603227747,
          "tunerDetails": [
            {
              "created": 1603228647,
              "signalLocked": 1,
              "signalLevel": 56,
              "signalQuality": 28,
              "errorLevel": 0
            },
            {
              "created": 1603229647,
              "signalLocked": 0,
              "signalLevel": 56,
              "signalQuality": 28,
              "errorLevel": 0
            }
          ]
        },
        {
          "isOccupied": 0,
          "startTime": 1603227648,
          "stopTime": 1603227649,
          "duration": 0,
          "created": 1603227648,
          "tunerDetails": [
            {
              "created": 1603227657,
              "signalLocked": 0,
              "signalLevel": 56,
              "signalQuality": 28,
              "errorLevel": 0
            },
            {
              "created": 1603227667,
              "signalLocked": 0,
              "signalLevel": 56,
              "signalQuality": 28,
              "errorLevel": 0
            }
          ]
        }
      ]
    },
    {
      "channelName": "ESPN",
      "channelId": 850,
      "sessions": [
        {
          "isOccupied": 0,
          "startTime": 1603227747,
          "stopTime": 1603227847,
          "duration": 665,
          "created": 1603227747,
          "tunerDetails": [
            {
              "created": 1603228647,
              "signalLocked": 0,
              "signalLevel": 6,
              "signalQuality": 8,
              "errorLevel": 0
            },
            {
              "created": 1603229647,
              "signalLocked": 1,
              "signalLevel": 6,
              "signalQuality": 2,
              "errorLevel": 0
            }
          ]
        }
      ]
    }
  ]
}

Provided the single document above, I have a query to populate the channels array with the matching channels. The channels array contains two channels and I return one based on the channelName and channelId.

SELECT document.*,
    ARRAY c FOR c IN document.channels WHEN c.channelName = "HBO" AND c.channelId = 22 END AS channels 
FROM connectedroom.ecmp.tvSessions AS document
WHERE document.date BETWEEN "2020-10-20" AND "2020-10-30"
AND document.source = "linearTv"
AND document.propCode = "MSGPQ"
AND ANY c IN document.channels SATISFIES c.channelName = "HBO" AND c.channelId = 22 END
ORDER BY document.date DESC;

Output:

[
  {
    "channels": [
      {
        "channelId": 22,
        "channelName": "HBO",
        "sessions": [
          {
            "created": 1603227747,
            "duration": 665,
            "isOccupied": 1,
            "startTime": 1603227747,
            "stopTime": 1603227847,
            "tunerDetails": [
              {
                "created": 1603228647,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 1,
                "signalQuality": 28
              },
              {
                "created": 1603229647,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 0,
                "signalQuality": 28
              }
            ]
          },
          {
            "created": 1603227648,
            "duration": 0,
            "isOccupied": 0,
            "startTime": 1603227648,
            "stopTime": 1603227649,
            "tunerDetails": [
              {
                "created": 1603227657,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 0,
                "signalQuality": 28
              },
              {
                "created": 1603227667,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 0,
                "signalQuality": 28
              }
            ]
          }
        ]
      }
    ],
    "date": "2020-10-29",
    "lastReportedTime": 1603227647,
    "macAddress": "E81863521C48",
    "propCode": "MSGPQ",
    "roomNumber": "201",
    "source": "linearTv"
  }
]

Now, I would like to go one level deeper into the sessions array. I would like to match on the sessions.duration field and only return sessions with duration > 0.

The expected output would be:

[
  {
    "channels": [
      {
        "channelId": 22,
        "channelName": "HBO",
        "sessions": [
          {
            "created": 1603227747,
            "duration": 665,
            "isOccupied": 1,
            "startTime": 1603227747,
            "stopTime": 1603227847,
            "tunerDetails": [
              {
                "created": 1603228647,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 1,
                "signalQuality": 28
              },
              {
                "created": 1603229647,
                "errorLevel": 0,
                "signalLevel": 56,
                "signalLocked": 0,
                "signalQuality": 28
              }
            ]
          }        
        ]
       }
    ],
    "date": "2020-10-29",
    "lastReportedTime": 1603227647,
    "macAddress": "E81863521C48",
    "propCode": "MSGPQ",
    "roomNumber": "201",
    "source": "linearTv"
  }
]

Current query:

SELECT document.*,
    ARRAY c FOR c IN document.channels WHEN c.channelName = "HBO" AND c.channelId = 22 
    AND (ARRAY s FOR s IN c.sessions WHEN s.duration > 0 END) END AS channels
FROM connectedroom.ecmp.tvSessions AS document
WHERE document.date BETWEEN "2020-10-20" AND "2020-10-30"
AND document.source = "linearTv"
AND document.propCode = "MSGPQ"
AND ANY c IN document.channels SATISFIES c.channelName = "HBO" AND c.channelId = 22 END
AND ANY s IN document.channels.sessions SATISFIES s.duration > 0 END
ORDER BY document.date DESC;

Current results:

{
  "results": []
}

I’m unsure my approach is correct and I’m struggling to build the nested arrays in the result. From my understanding, the SELECT clause builds the result document?

SELECT document.*,
    (SELECT c.*, (SELECT s.*
                  FROM c.sessions AS s
                  WHERE s.duration > 0 ) AS sessions
     FROM document.channels AS c
     WHERE c.channelName = "HBO" AND c.channelId = 22) AS channels
FROM connectedroom.ecmp.tvSessions AS document
WHERE document.date BETWEEN "2020-10-20" AND "2020-10-30"
AND document.source = "linearTv"
AND document.propCode = "MSGPQ"
AND ANY c IN document.channels SATISFIES c.channelName = "HBO" AND c.channelId = 22 AND ANY s IN c.sessions SATISFIES s.duration > 0 END END
ORDER BY document.date DESC;

Subquery FROM caluse uses ARRAY field from parent document. Using subquery you could full SQL statement, Join/Filter/Aggregates/Order/limit etc

1 Like

Thank you for the quick response. This works great.

I’m struggling with the subquery SELECT vs ARRAY clause. What if I want to exclude the filter on the channels array and only filter on the sessions array?

You could do that by omitting WHERE clause on that subquery. One easy to understand, it must go to plan, execution like any query

(SELECT c.*, (SELECT s.*
                  FROM c.sessions AS s
                  WHERE s.duration > 0 ) AS sessions
     FROM document.channels AS c) AS channels

You could use any one of them ARRAY works if u are going to use as signle filed. If not it makes much complex, like follows as u need to update field you must use object functions (if need cascade them). It runs as built in expression(s)

FYI: In your query you have right track as these are nested ANY/ARRAY of nested must moved into outer once.

ARRAY OBJECT_PUT(c, "sessions", ARRAY s FOR s IN c.sessions WHEN s.duration > 0 EN)
FOR c IN document.channels
WHEN c.channelName = "HBO" AND c.channelId = 22
END
ARRAY OBJECT_PUT(c, "sessions", ARRAY s FOR s IN c.sessions WHEN s.duration > 0 EN)
FOR c IN document.channels
END
1 Like

I see, just omit the WHERE clause on channels and the fields on the SATISFIES clause:

AND ANY c IN document.channels SATISFIES ANY s IN c.sessions SATISFIES s.duration > 0 END END

As you mention, this follows the plan and execution flow
FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> DISTINCT -> ORDER BY -> LIMIT

Ok, thank you for helping with this clarification. Yes, the OBJECT functions are more complex. Since I’m not doing computations and I intend to use multiple fields and traverse multiple nested arrays, I will stick with the SELECT subqueries. As you can see, I must filter on a condition one level lower in the tunerDetails array as well.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.