Index on UNNEST JOIN

manusyone · December 12, 2017, 2:38pm

Hi there,

I am looking to find an index that can optimize a query with an UNNEST join, where fields and filters are from both sides of the JOIN. We currently have 15k users on the database, and the structure of one user is the following:

"user": {
  "name": "John Doe",
  "birthday": "2000-01-01",
  "active": true,
  "email": "john@doe.com",
  "type": "User",
  "stores": [{
     "status": "client"
     "storeId": 1,
     "signup": "2017-01-01"
     "special": true

   },
   {
     "status": "client"
     "storeId": 2,
     "special": false
   }]
}

Here is an example of a query:

SELECT users.name, users.email, users.otherProperties, stores.signup, stores.otherProperties
FROM default users
UNNEST users.stores as stores
WHERE users.type="User"
AND users.active=true
AND stores.status="client"
AND stores.special=true
ORDER BY stores.signup DESC
LIMIT 10

With 15 k documents, a query like this would take more than 5 seconds to run. Please note that the fields on the query, the filters and the order statement are dynamic. Is there a way of making this query perform better?

Thank you for your time,
Manuel

vsr1 · December 12, 2017, 2:52pm

Check both indexes and see which performs better. The UNNEST alias needs to match with Array index variable.

CREATE INDEX ix1 ON default(DISTINCT ARRAY brands.status FOR brands IN stores END) WHERE type = "User" AND active = true;
OR
CREATE INDEX ix1 ON default(DISTINCT ARRAY brands.special FOR brands IN stores END) WHERE type = "User" AND active = true;

SELECT users.name, users.email, users.otherProperties, brands.signup
FROM default users UNNEST users.stores AS brands
WHERE users.type="User" AND users.active=true
AND brands.status="client" AND brands.special=true
ORDER BY stores.signup DESC LIMIT 10;

You can also use following technique push both the predicates to indexer when both equality predicates

CREATE INDEX ix3 ON default(DISTINCT ARRAY [brands.status,brands.active] FOR brands IN stores END) WHERE type = "User" AND active = true;

SELECT users.name, users.email, users.otherProperties, brands.signup
FROM default users UNNEST users.stores AS brands
WHERE users.type="User" AND users.active=true
AND [brands.status,brands.special] = ["client",true]
ORDER BY stores.signup DESC LIMIT 10;

manusyone · December 12, 2017, 3:19pm

Hi @vsr1, thank you for the prompt reply. I have tested the indexes, but the query still takes more than 5 seconds to run. The mentioned indexes work well for these filters, but since they are dynamic it is possible that the filters don’t exist or exist in conjunction with other filters, meaning that the index will not be as efficient (or not be used in the query at all).

For example, another possible query could be:

SELECT users.otherProperties, stores.signup
FROM default users UNNEST users.stores AS stores
WHERE users.type="User" AND users.birthday >= "1990-01-01"
AND stores.storeId=1 AND stores.signup>="2017-01-01"
ORDER BY stores.signup DESC LIMIT 10;

It seems impossible to me to write an index that could cover such a dynamic range of filters. Would it be wiser to restructure the current data model to accommodate our performance requirements?

Thanks!

EDIT: The use case for this kind of queries is a user table, that a different store administrator can access, as well as a global administrator. In this user table you can check the user information, and filter and sort a paginated result set.

vsr1 · December 12, 2017, 5:00pm

It might be taking time because ORDER which needs to produce all possible results even though there is LIMIT 10.

You can create index like this. This singup and birthday, type applied during index scan produce qualified document and UNNEST further filters from array elements not qualified.

CREATE INDEX ix1 ON default(birthday, ARRAY v.signup FOR v IN stores END) WHERE type = "User";

    SELECT users.otherProperties, stores.signup
    FROM default users UNNEST users.stores AS stores
    WHERE users.type="User" AND users.birthday >= "1990-01-01"
    AND ANY v IN users.stores SATISFIES v.signup>="2017-01-01" 
    AND stores.storeId=1 AND stores.signup>="2017-01-01"
    ORDER BY stores.signup DESC LIMIT 10;

manusyone · December 12, 2017, 5:28pm

The query now took 8.5 seconds to run. Here is the EXPLAIN:

{
    "requestID": "dd8b1e3d-2e91-4103-b608-457cb8351972",
    "clientContextID": "725b88a6-4e53-451e-8cee-f9aa79dff8cf",
    "signature": "json",
    "results": [
        {
            "plan": {
                "#operator": "Sequence",
                "~children": [
                    {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "IndexScan",
                                "index": "ix1",
                                "index_id": "1458255227e07249",
                                "keyspace": "default",
                                "namespace": "default",
                                "spans": [
                                    {
                                        "Range": {
                                            "Inclusion": 1,
                                            "Low": [
                                                "\"1990-01-01\""
                                            ]
                                        }
                                    }
                                ],
                                "using": "gsi"
                            },
                            {
                                "#operator": "Fetch",
                                "as": "users",
                                "keyspace": "default",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Parallel",
                                "~child": {
                                    "#operator": "Sequence",
                                    "~children": [
                                        {
                                            "#operator": "Unnest",
                                            "as": "stores",
                                            "expr": "(`users`.`stores`)"
                                        }
                                    ]
                                }
                            },
                            {
                                "#operator": "Parallel",
                                "~child": {
                                    "#operator": "Sequence",
                                    "~children": [
                                        {
                                            "#operator": "Filter",
                                            "condition": "((((((`users`.`type`) = \"User\") and (\"1990-01-01\" <= (`users`.`birthday`))) and ((`stores`.`storeId`) = \"b44412f1-5595-44fc-a90d-0983fb60ec73\")) and any `v` in (`users`.`stores`) satisfies (\"2017-01-01\" <= (`v`.`signup`)) end) and (\"2017-01-01\" <= (`stores`.`signup`)))"
                                        },
                                        {
                                            "#operator": "InitialProject",
                                            "result_terms": [
                                                {
                                                    "expr": "(`users`.`name`)"
                                                },
                                                {
                                                    "expr": "(`stores`.`signup`)"
                                                }
                                            ]
                                        }
                                    ]
                                }
                            }
                        ]
                    },
                    {
                        "#operator": "Order",
                        "limit": "10",
                        "sort_terms": [
                            {
                                "desc": true,
                                "expr": "(`stores`.`signup`)"
                            }
                        ]
                    },
                    {
                        "#operator": "Limit",
                        "expr": "10"
                    },
                    {
                        "#operator": "FinalProject"
                    }
                ]
            },
            "text": "SELECT users.name, stores.signup\nFROM default users USE INDEX(ix1)\nUNNEST users.stores AS stores\nWHERE users.type=\"User\" \nAND users.birthday >= \"1990-01-01\"\nAND stores.storeId=\"b44412f1-5595-44fc-a90d-0983fb60ec73\"\nAND ANY v IN users.stores SATISFIES v.signup>=\"2017-01-01\" END\nAND stores.signup >=\"2017-01-01\"\nORDER BY stores.signup DESC \nLIMIT 10"
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "3.603619ms",
        "executionTime": "3.55958ms",
        "resultCount": 1,
        "resultSize": 4568
    }
}

The user document I have put here is a simplified version of the actual document. We are creating an index on the birthday, but every user field could potentially be a part of the filter. Should we index all of those fields that can be a part of the filter?

vsr1 · December 12, 2017, 6:01pm

Typo DISTINCT keyword is missing

CREATE INDEX ix1 ON default(birthday, DISTINCT ARRAY v.signup FOR v IN stores END) WHERE type = “User”;

Also checkout https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/adaptive-indexing.html

Topic		Replies	Views
Covering Indexes in JOIN query using UNNEST SQL++	4	1242	March 22, 2019
Creating Index when 2 documents having array fields as predicates are joined SQL++ n1ql	6	1185	October 14, 2017
Index suggestion for LET, JOIN and UNNEST SQL++ query , index	6	113	October 29, 2024
Need help to create Array Index on SELECT with UNNEST SQL++ index	2	821	March 11, 2022
How do I create an index to cope with UNNEST'ed queries? SQL++ n1ql	10	1470	August 8, 2018

Index on UNNEST JOIN

Related topics