Query Array with variable key

RV · February 13, 2017, 8:03am

Hi
I am trying to query the following array

{
“version”:1,
“owners”: {
“ownerABC”:{
“location”:“CA”,
…
},
“ownerDEF”:{
“location”:“AK”,
…
},

"ownerXYZ":{
     "location":"NY",
     ......
},

…
}

}

I want to fetch all documents where ‘ownerABC’ is in the owners array.

Since the key in the owners array is variable, I am not sure how to go about doing this.
I would also appreciate help with creating an index for this query.
Thanks!
RV

atom_yang · February 13, 2017, 11:11am

you can create index by:

CREATE INDEX `idx_obj_names` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END) USING GSI;

And you can use the N1QL

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END

RV · February 13, 2017, 4:49pm

Thank you so much, that worked!!!
I had another question, if you don’t mind.
Can a create an index one more level down and query on both the outer index and the nested index.
So if my document looks like this
{
“version”:1,
“owners”: {
“ownerABC”:{
“location”:“CA”,
“property”: {
“cars”: {
“Chevrolet”: 2,
“Cadillac”: 3
},
“homes”: []
},
…
},
…
}

Can I query where owner==‘ownerABC’ and property.cars==‘Chevrolet’

atom_yang · February 14, 2017, 2:50am

yes,
you can create index by:

CREATE INDEX `idx_obj_names_cars` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END,property.cars) USING GSI;

And you can use the N1QL

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND property.cars=='Chevrolet'

RV · February 14, 2017, 7:59am

Hmm that returns nothing. Is it because cars is an array? If I remove the
AND property.cars=='Chevrolet’
it works

atom_yang · February 14, 2017, 9:20am

This N1QL should be

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND a.property.cars=='Chevrolet'

you can use

explain SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND a.property.cars=='Chevrolet'

to check if the index be used.

RV · February 14, 2017, 9:50am

So I used this query ( i have a testbucket)

_explain SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) _
SATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’

This is the result. It shows that the index is being used and the resultCount =1

_
“requestID”: “5c05aa96-b824-4218-953d-513ef25fb72a”,
“clientContextID”: “383f9887-b4cb-4607-8a47-6cc61af7c535”,
“signature”: “json”,
“results”: [
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “DistinctScan”,
“scan”: {
"#operator": “IndexScan”,
“index”: “idx_obj_names_cars”,
“index_id”: “f54295e59dc06fd5”,
“keyspace”: “testbucket”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“ownerABC”",
"“Chevrolet”"
],
“Inclusion”: 3,
“Low”: [
"“ownerABC”",
"“Chevrolet”"
]
}
}
],
“using”: “gsi”
}
},
{
"#operator": “Fetch”,
“as”: “a”,
“keyspace”: “testbucket”,
“namespace”: “default”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “(any obj in object_names((a.owners)) satisfies (obj = “ownerABC”) end and (((a.property).cars) = “Chevrolet”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “a”,
“star”: true
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) \nSATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’”
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “11.235584ms”,
“executionTime”: “11.197927ms”,
“resultCount”: 1,
“resultSize”: 2853
}
}

But when I run without the explain I get this

_SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) _
SATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’

_
“requestID”: “ad06fe2e-0e70-4512-961f-3a12c95ca53b”,
“clientContextID”: “ce4addec-8615-4ea2-83d9-413612eb201e”,
“signature”: {
"": ""
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “8.257401ms”,
“executionTime”: “8.210229ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

atom_yang · February 14, 2017, 10:04am

My mistake, you should add this index

CREATE INDEX `idx_car_names` ON `default`(DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

and use this N1QL

SELECT a.* FROM default a 
WHERE ANY obj IN object_names(a.owners) SATISFIES obj == "ownerABC" END
  AND ANY car IN object_names(a.property.cars) SATISFIES car == "Chevrolet" END

Notice: you can not create index with more than one array expression in the index, that means if you add the index by

CREATE INDEX `idx_obj_car_names` ON `default`(DISTINCT ARRAY obj FOR obj IN object_names(owners) END,DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

you will get the error

[
  {
    "code": 5000,
    "msg": "GSI CreateIndex() - cause: Multiple expressions with ALL are found. Only one array expression is supported per index.",
    "query_from_user": "CREATE INDEX `idx_obj_car_names` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END,DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;"
  }
]

RV · February 14, 2017, 7:36pm

That didnt work.

So I deleted the previous owner/make index and have only this one index

CREATE INDEX idx_car_names ON default(DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

Then, if I run this I get nothing

_SELECT a.* FROM testbucket a _
WHERE ANY car IN object_names(a.property.cars) SATISFIES car == “Chevrolet” END

Again, if I do an explain on the above Select I see that it is using the idx_car_names index and the resultCount =1

{
“requestID”: “12160748-fae5-4a4c-822a-bce243371edb”,
“clientContextID”: “915f0382-1620-4dad-8698-b1a6742eb3ba”,
“signature”: “json”,
“results”: [
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “DistinctScan”,
“scan”: {
"#operator": “IndexScan”,
“index”: “idx_car_names”,
“index_id”: “f02d220a47c82d9b”,
“keyspace”: “testbucket”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“Chevrolet”"
],
“Inclusion”: 3,
“Low”: [
"“Chevrolet”"
]
}
}
],
“using”: “gsi”
}
},
{
"#operator": “Fetch”,
“as”: “a”,
“keyspace”: “testbucket”,
“namespace”: “default”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “any car in object_names(((a.property).cars)) satisfies (car = “Chevrolet”) end”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “a”,
“star”: true
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “SELECT a.* FROM testbucket a \nWHERE ANY car IN object_names(a.property.cars) SATISFIES car == “Chevrolet” END”
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “8.956874ms”,
“executionTime”: “8.911329ms”,
“resultCount”: 1,
“resultSize”: 2666
}
}

atom_yang · February 14, 2017, 7:48pm

RV:

“#operator”: “DistinctScan”, “scan”: { “#operator”: “IndexScan”, “index”: “idx_car_names”, “index_id”: “f02d220a47c82d9b”, “keyspace”: “testbucket”, “namespace”: “default”, “spans”: [ { “Range”: { “High”: [ “"Chevrolet"” ], “Inclusion”: 3, “Low”: [ “"Chevrolet"”

This means that the index works, can you post the result of the following N1QL

SELECT object_names(a.property.cars) FROM testbucket a

to check if there is a data match “Chevrolet”?

RV · February 14, 2017, 7:59pm

It doesn’t seem to pull anything

SELECT object_names(a.property.cars) FROM testbucket a

{
“requestID”: “dfda57b8-5246-4f0d-a2e8-8a3f31362d8c”,
“clientContextID”: “1b27afd9-05d5-465d-8998-35767418eded”,
“signature”: {
"$1": “array”
},
“results”: [
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “23.857004ms”,
“executionTime”: “23.816607ms”,
“resultCount”: 16,
“resultSize”: 32
}
}

And this is the actual sample document, (In case I messed up how I created it)

{
“version”: 1,
“owners”: {
“ownerABC”: {
“location”: “CA”,
“property”: {
“cars”: {
“Chevrolet”: 2,
“Cadillac”: 3
},
“bikes”: {
“Yamaha”: 2,
“Suzuki”: 3
}
}
}
}
}

atom_yang · February 14, 2017, 8:03pm

try this N1QL

SELECT object_names(a.property.cars) FROM default a WHERE a.property.cars is not missing

RV · February 14, 2017, 8:07pm

SELECT object_names(a.property.cars) FROM testbucket a WHERE a.property.cars is not missing

{
“requestID”: “b8362d8b-d735-4a59-81bd-11136c69d855”,
“clientContextID”: “f9c0a8e7-e422-4b00-a9df-2e24a0369893”,
“signature”: {
"$1": “array”
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “109.701231ms”,
“executionTime”: “109.664269ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

This is my index
Definition: CREATE INDEX idx_car_names ON testbucket((distinct (array car for car in object_names((property.cars)) end

RV · February 14, 2017, 8:10pm

The italicizing it messing with the text.
I ran
SELECT object_names(a.property.cars) FROM testbucket a WHERE a.property.cars is not missing

{
“requestID”: “682db9ec-0819-464b-922d-e88183980983”,
“clientContextID”: “0fceeac0-c0a0-4622-ae1c-87a9964efdae”,
“signature”: {
"$1": “array”
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “70.337743ms”,
“executionTime”: “70.299933ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

atom_yang · February 14, 2017, 8:18pm

so the car should be quote by owners.ownerABC.property.cars
do you means that ownerABC here is a variable key?

RV · February 14, 2017, 8:21pm

Yes, that was one on the things I thought might be a challenge. The keys under “owners” are variable.
e.g. ownerABC, ownerDEF, ownerGHI , etc

atom_yang · February 14, 2017, 9:06pm

maybe @geraldss can give some suggestion.

geraldss · February 15, 2017, 1:59am

You can use WITHIN instead of IN to get recursive traversal.

atom_yang · February 15, 2017, 2:28am

Index

CREATE INDEX owners_car ON `default`(OBJECT_NAMES(OBJECT_VALUES(owners)[*].property[*].cars[*])) USING GSI;

N1QL

SELECT a.*
 FROM default a  
WHERE "Chevrolet" IN OBJECT_NAMES(OBJECT_VALUES(a.owners)[*].property[*].cars[*])

Should work.

RV · February 15, 2017, 4:47am

YES! This worked. Thanks a ton, appreciate the time!!

And if I want to combine owners and car this works too

SELECT a.*
FROM testbucket a
WHERE “GMC” IN OBJECT_NAMES(OBJECT_VALUES(a.owners)[].property[].cars[*])
and ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == “ownerABC” END

Also is there a way to limit the response to only the owner that satisfies the criteria. So say the document has ownerABC, ownerXYZ, OwnerEFG. Since ownerABC matches, it returns the whole document with all the owners. Can I return just the obj that matches?

Thanks!

Topic		Replies	Views
Couchbase index for variable array in query SQL++ query , n1ql , index	2	1648	January 17, 2017
Nested Array Indexing in multiple levels SQL++	2	2168	January 23, 2018
N1QL Query assistance SQL++	3	676	February 20, 2019
N1QL with array SQL++	4	604	November 5, 2020
Is it possible to use array indexing and querying with object collections/dictionaries, not arrays? SQL++ query , n1ql	6	1601	December 7, 2017

Query Array with variable key

Related topics