Cbq-engine consuming massive memory for simple queries

jeffhoward001 · January 19, 2017, 11:46pm

The following very simple query makes the cbq-engine consume 10GB of memory. The data set inside the GROUP statement only returns approx. 4,000 records. The “drg-IDC-9” document is about 212KB in size.

Is it possible that it’s trying to tack on the entire 212KB json doc onto the end of each row returned inside the grouping? If so, is there a better way to write this? Inside the drg-IDC-9 document, there’s an array of codes. What we’re trying to do is join to the document to get the corresponding weight where the visits.drgCode matches the msDrg code in the drg-IDC-9 document.

SELECT SUBSTR(visits.dischargeDate,0,10) As dischargeDate,
AVG(FIRST c.weight for c IN codes.drgs WHEN c.msDrg = visits.drgCode END) As avgWeight
FROM bucket visits
JOIN bucket codes ON KEYS "drg-IDC-9"
WHERE meta(visits).id LIKE "visit-%" AND visits.patientClass LIKE "I%" AND visits.tenantName = "xxxxxxxx" 
AND visits.drgCode != "" AND visits.drgType LIKE "MS%"
AND SUBSTR(visits.dischargeDate,0,10) >= "2016-07-01" AND SUBSTR(visits.dischargeDate,0,10) < "2016-08-01"
GROUP BY SUBSTR(visits.dischargeDate,0,10)

geraldss · January 20, 2017, 3:06pm

The memory usage is not per query. It is for a steady state of processing many concurrent queries, continuously. As a first step, you can run a few clients concurrently, issue queries in a loop with a realistic pause, and then measure and report the memory usage.

The query engine and garbage collector may be holding on to memory for ongoing and future euse.

jeffhoward001 · January 20, 2017, 4:42pm

Thanks geraldss, that’s a self-remedying problem because the cbq-engine crashes if we increase the date boundary more than a few days Plus, we’re also doing this in a test environment where there are not steady-state queries running.

This issue is reproducible from a freshly-started service. It will go from several KB of memory to 10GB (following by crashing) in just a few minutes.

What I believe is happening is it’s jamming that entire 212KB json document onto the end of each row returned by the join. Can you confirm that is what would take place?

geraldss · January 20, 2017, 6:30pm

HI @jeffhoward001,

The JOIN is being performed before the WHERE, so it may be loading unnecessary documents in the JOIN.

Try putting the left side + WHERE in a subquery:

SELECT ...
FROM (
    SELECT ...
    FROM bucket AS visits
    WHERE ...
) AS visits
JOIN bucket AS codes ...
GROUP BY ...

I hope you have all the necessary indexes on visits. How big is codes.drgs?

It is possible to rewrite this to only load codes.drgs, but it is complicated. Try the query rewrite first.

jeffhoward001 · January 20, 2017, 9:33pm

I’m not sure if I can properly rewrite the query in that way. This is hitting on a bigger topic that we’ve struggled with in N1QL. I know the indexed-joins allow you to join on individual properties instead of document keys, but is this possible for nested properties?

Here’s an example: We have a single document that contains all the DRG codes in a nested array:

{
  "docId": "drg-IDC-9",
  "version": "IDC-9",
  "drgs": [
    {
      "msDrg": "001",
      "weight": 26.2466
    },
    {
      "msDrg": "002",
      "weight": 14.6448
    },
    {
      "msDrg": "003",
      "weight": 17.6569
    }
}

In a more SQL-like psudeo-code, all we really want to do is join the visits.drgCode on codes.drgs.msDrg to do a “look-up” for the relative weight (simplified version):

SELECT visits.dischargeDate, visits.drgCode, codes.weight 
FROM bucket visits
JOIN bucket codes ON KEYS "drg-IDC-9" WHERE visits.drgCode = codes.drgs.msDrg

It seems like that would be a simple enough query to write with N1QL, but I can’t find any good examples of joining to a nested property.

geraldss · January 20, 2017, 9:35pm

If you post two sample documents here, visits and codes, and the ANSI SQL query, we can propose a N1QL query.

And yes, you can properly rewrite your original query as I suggested.

jeffhoward001 · January 20, 2017, 11:01pm

I rewrote the query with a sub-select, and the cbq-engine is using much less memory, although the query is pretty slow. The bigger question at this point is, is the FIRST/FOR/IN/WHEN really the best way to iterate over the codes.drgs array, or is there another way to write this where we’re actually “joining” directly to the codes in the codes.drgs array?

Sample of the “drg-IDC-9” document:

{
  "docId": "drg-IDC-9",
  "version": "IDC-9",
  "drgs": [
    {
      "msDrg": "001",
      "weight": 26.2466
    },
    {
      "msDrg": "002",
      "weight": 14.6448
    },
    {
      "msDrg": "003",
      "weight": 17.6569
    }
}

Sample of the “visits” document:

{
  "dischargeDate": "2016-09-20T15:57:00-07:00",
  "drgCode": "001",
  "visitNumber": "123456890"
}

and here’s the simplest form of what we’re tying to accomplish:

SELECT visits.dischargeDate, visits.drgCode, codes.weight 
FROM visits
INNER JOIN "drg-IDC-9" codes  ON visits.drgCode = drgs.msDrg

geraldss · January 21, 2017, 2:16am

Here you go.

SELECT
        SUBSTR(visits.dischargeDate,0,10) As dischargeDate,
        AVG(FIRST c.weight for c IN drgs WHEN c.msDrg = visits.drgCode END) As avgWeight
FROM
        bucket visits
LET
        drgs = (SELECT RAW codes.drgs FROM bucket AS codes USE KEYS "drg-IDC-9")[0]
WHERE
        META(visits).id LIKE "visit-%" AND visits.patientClass LIKE "I%" AND visits.tenantName = "xxxxxxxx"
        AND visits.drgCode != "" AND visits.drgType LIKE "MS%"
        AND SUBSTR(visits.dischargeDate,0,10) >= "2016-07-01" AND SUBSTR(visits.dischargeDate,0,10) < "2016-08-01"
GROUP BY
        SUBSTR(visits.dischargeDate,0,10)
;

geraldss · January 21, 2017, 3:09am

If you are using 4.5.1 or above, you should also set max-parallelism to 2 x number of cores.

jeffhoward001 · January 23, 2017, 7:12pm

Thanks Geraldss. How do we set the max parallelism? Is that a query hint, or a server level setting?

ingenthr · January 23, 2017, 9:19pm

In this case, it’s a separate parameter outside the statement itself. Assuming you’re using one of the Couchbase SDKs @jeffhoward001, you’ll find a maxParallelism setting on N1QL parameters to set this on a per-statement basis. The N1QL from the SDK page for each language should cover this.

jeffhoward001 · January 24, 2017, 8:12pm

@geraldss, the provided query works, but it’s performance is significantly slower than an alternative version we created with individual documents for each msDrg using a traditional join (e.g. individual docs - drg-ms-001, drg-ms-002, etc). I assume this is because the index scan using the key name is more efficient than the FIRST/FOR/IN iterating over the drg-IDC-9 document?

Is there any plan for couchbase to be able to join on covering indexes where it doesn’t need to load the actual doc? For example, if we had a covering index for the “drg-IDC-9” document:

CREATE INDEX idx_drgs_covering ON bucket(drgs.msDrg,drgs.weight) WHERE meta().id = "drg-IDC-9"

Which would allow you to efficiently perform a direct look-up of the data without needing to load the actual document:

SELECT visit.dischargeDate, drgs.weight
FROM bucket visit
JOIN bucket drgs ON visit.drgCode = drgs.msDrg

geraldss · January 24, 2017, 8:45pm

@jeffhoward001, can you post the alternative version. Thanks.

jeffhoward001 · January 24, 2017, 9:46pm

Definitely, but to understand the new query it’s important to note that we took the contents of the previous nested “drgs” array out of the “drgs-IDC-9” document, and broke out each nested set of properties into their own document, e.g.:

{
  "docId": "drg-IDC-9",
  "version": "IDC-9",
  "drgs": [
    {
      "msDrg": "001",
      "weight": 26.2466
    },
    {
      "msDrg": "002",
      "weight": 14.6448
    },
    {
      "msDrg": "003",
      "weight": 17.6569
    }
}

was converted into:

{
  "docId": "drg-ms-001",
  "msDrg": "001",
  "weight": 26.2466
}
{
  "docId": "drg-ms-002",
  "msDrg": "002",
  "weight": 14.6448
}
{
  "docId": "drg-ms-003",
  "msDrg": "003",
  "weight": 17.6569
}

Then we did a traditional join like we normally would for non-nested content:

SELECT visits.dischargeDate, codes.weight
FROM bucket visits
JOIN bucket codes ON KEYS "drg-ms-" || visits.drgCode

This form is fine for the purposed of working through this project. But long-term we want to be sure we can write high-performing queries that can join to properties inside of nested arrays since we may not always have the liberty to expand out nested arrays into individual documents.

geraldss · January 24, 2017, 11:07pm

Ok, got it. For the query with a LET, did you set max-parallelism?

For the JOIN you want, we would need to provide non-key JOINs. Non-key JOINs are provided by Couchbase Analytics, which is in Developer Preview. For N1QL, non-key JOINs are a major undertaking and are not yet scheduled.

Your particular example with the index makes sense. We will use your example to think about this.

Topic		Replies	Views
Cbq_engine consumes all the memory available and crashes eventually Couchbase Server n1ql , array-query	4	2107	August 12, 2019
Cbq-engine for memory and cpu question Couchbase Server 40-rc , n1ql	4	3257	February 2, 2016
High CPU and RAM usage by 'cbq-engine' Couchbase Server	7	3776	November 13, 2018
[2497238.174658] Memory cgroup out of memory: Kill process 128809 (cbq-engine) score 1307 or sacrifice child Couchbase Server query	2	506	June 21, 2023
Rather fast subqueries make original query really slow even with empty data sets SQL++	9	3237	June 1, 2016

Cbq-engine consuming massive memory for simple queries

Related topics