Query Array with variable key

Hi
I am trying to query the following array

{
“version”:1,
“owners”: {
“ownerABC”:{
“location”:“CA”,

},
“ownerDEF”:{
“location”:“AK”,

},

"ownerXYZ":{
     "location":"NY",
     ......
},


}

}

I want to fetch all documents where ‘ownerABC’ is in the owners array.

Since the key in the owners array is variable, I am not sure how to go about doing this.
I would also appreciate help with creating an index for this query.
Thanks!
RV

you can create index by:

CREATE INDEX `idx_obj_names` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END) USING GSI;

And you can use the N1QL

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END
2 Likes

Thank you so much, that worked!!!
I had another question, if you don’t mind.
Can a create an index one more level down and query on both the outer index and the nested index.
So if my document looks like this
{
“version”:1,
“owners”: {
“ownerABC”:{
“location”:“CA”,
“property”: {
“cars”: {
“Chevrolet”: 2,
“Cadillac”: 3
},
“homes”: []
},

},

}

Can I query where owner==‘ownerABC’ and property.cars==‘Chevrolet’

yes,
you can create index by:

CREATE INDEX `idx_obj_names_cars` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END,property.cars) USING GSI;

And you can use the N1QL

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND property.cars=='Chevrolet'

Hmm that returns nothing. Is it because cars is an array? If I remove the
AND property.cars=='Chevrolet’
it works

This N1QL should be

SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND a.property.cars=='Chevrolet'

you can use

explain SELECT a.* FROM default a WHERE ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == "ownerABC" END AND a.property.cars=='Chevrolet'

to check if the index be used.

So I used this query ( i have a testbucket)

_explain SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) _
SATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’

This is the result. It shows that the index is being used and the resultCount =1

_
“requestID”: “5c05aa96-b824-4218-953d-513ef25fb72a”,
“clientContextID”: “383f9887-b4cb-4607-8a47-6cc61af7c535”,
“signature”: “json”,
“results”: [
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “DistinctScan”,
“scan”: {
"#operator": “IndexScan”,
“index”: “idx_obj_names_cars”,
“index_id”: “f54295e59dc06fd5”,
“keyspace”: “testbucket”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“ownerABC”",
"“Chevrolet”"
],
“Inclusion”: 3,
“Low”: [
"“ownerABC”",
"“Chevrolet”"
]
}
}
],
“using”: “gsi”
}
},
{
"#operator": “Fetch”,
“as”: “a”,
“keyspace”: “testbucket”,
“namespace”: “default”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “(any obj in object_names((a.owners)) satisfies (obj = “ownerABC”) end and (((a.property).cars) = “Chevrolet”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “a”,
“star”: true
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) \nSATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’”
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “11.235584ms”,
“executionTime”: “11.197927ms”,
“resultCount”: 1,
“resultSize”: 2853
}
}

But when I run without the explain I get this

_SELECT a.* FROM testbucket a WHERE ANY obj IN OBJECT_NAMES(a.owners) _
SATISFIES obj == “ownerABC” END AND a.property.cars==‘Chevrolet’

_
“requestID”: “ad06fe2e-0e70-4512-961f-3a12c95ca53b”,
“clientContextID”: “ce4addec-8615-4ea2-83d9-413612eb201e”,
“signature”: {
"": ""
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “8.257401ms”,
“executionTime”: “8.210229ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

My mistake, you should add this index

CREATE INDEX `idx_car_names` ON `default`(DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

and use this N1QL

SELECT a.* FROM default a 
WHERE ANY obj IN object_names(a.owners) SATISFIES obj == "ownerABC" END
  AND ANY car IN object_names(a.property.cars) SATISFIES car == "Chevrolet" END

Notice: you can not create index with more than one array expression in the index, that means if you add the index by

CREATE INDEX `idx_obj_car_names` ON `default`(DISTINCT ARRAY obj FOR obj IN object_names(owners) END,DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

you will get the error

[
  {
    "code": 5000,
    "msg": "GSI CreateIndex() - cause: Multiple expressions with ALL are found. Only one array expression is supported per index.",
    "query_from_user": "CREATE INDEX `idx_obj_car_names` ON `default`(DISTINCT ARRAY obj FOR obj IN OBJECT_NAMES(owners) END,DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;"
  }
]

That didnt work.

So I deleted the previous owner/make index and have only this one index

CREATE INDEX idx_car_names ON default(DISTINCT ARRAY car FOR car IN OBJECT_NAMES(property.cars) END) USING GSI;

Then, if I run this I get nothing

_SELECT a.* FROM testbucket a _
WHERE ANY car IN object_names(a.property.cars) SATISFIES car == “Chevrolet” END

Again, if I do an explain on the above Select I see that it is using the idx_car_names index and the resultCount =1

{
“requestID”: “12160748-fae5-4a4c-822a-bce243371edb”,
“clientContextID”: “915f0382-1620-4dad-8698-b1a6742eb3ba”,
“signature”: “json”,
“results”: [
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “DistinctScan”,
“scan”: {
"#operator": “IndexScan”,
“index”: “idx_car_names”,
“index_id”: “f02d220a47c82d9b”,
“keyspace”: “testbucket”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“Chevrolet”"
],
“Inclusion”: 3,
“Low”: [
"“Chevrolet”"
]
}
}
],
“using”: “gsi”
}
},
{
"#operator": “Fetch”,
“as”: “a”,
“keyspace”: “testbucket”,
“namespace”: “default”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “any car in object_names(((a.property).cars)) satisfies (car = “Chevrolet”) end”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “a”,
“star”: true
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “SELECT a.* FROM testbucket a \nWHERE ANY car IN object_names(a.property.cars) SATISFIES car == “Chevrolet” END”
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “8.956874ms”,
“executionTime”: “8.911329ms”,
“resultCount”: 1,
“resultSize”: 2666
}
}

This means that the index works, can you post the result of the following N1QL

SELECT object_names(a.property.cars) FROM testbucket a

to check if there is a data match “Chevrolet”?

It doesn’t seem to pull anything

SELECT object_names(a.property.cars) FROM testbucket a

{
“requestID”: “dfda57b8-5246-4f0d-a2e8-8a3f31362d8c”,
“clientContextID”: “1b27afd9-05d5-465d-8998-35767418eded”,
“signature”: {
"$1": “array”
},
“results”: [
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “23.857004ms”,
“executionTime”: “23.816607ms”,
“resultCount”: 16,
“resultSize”: 32
}
}

And this is the actual sample document, (In case I messed up how I created it)

{
“version”: 1,
“owners”: {
“ownerABC”: {
“location”: “CA”,
“property”: {
“cars”: {
“Chevrolet”: 2,
“Cadillac”: 3
},
“bikes”: {
“Yamaha”: 2,
“Suzuki”: 3
}
}
}
}
}

try this N1QL

SELECT object_names(a.property.cars) FROM default a WHERE a.property.cars is not missing

SELECT object_names(a.property.cars) FROM testbucket a WHERE a.property.cars is not missing

{
“requestID”: “b8362d8b-d735-4a59-81bd-11136c69d855”,
“clientContextID”: “f9c0a8e7-e422-4b00-a9df-2e24a0369893”,
“signature”: {
"$1": “array”
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “109.701231ms”,
“executionTime”: “109.664269ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

This is my index
Definition: CREATE INDEX idx_car_names ON testbucket((distinct (array car for car in object_names((property.cars)) end

The italicizing it messing with the text.
I ran
SELECT object_names(a.property.cars) FROM testbucket a WHERE a.property.cars is not missing

{
“requestID”: “682db9ec-0819-464b-922d-e88183980983”,
“clientContextID”: “0fceeac0-c0a0-4622-ae1c-87a9964efdae”,
“signature”: {
"$1": “array”
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “70.337743ms”,
“executionTime”: “70.299933ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

so the car should be quote by owners.ownerABC.property.cars
do you means that ownerABC here is a variable key?

Yes, that was one on the things I thought might be a challenge. The keys under “owners” are variable.
e.g. ownerABC, ownerDEF, ownerGHI , etc

maybe @geraldss can give some suggestion.

1 Like

You can use WITHIN instead of IN to get recursive traversal.

Index

CREATE INDEX owners_car ON `default`(OBJECT_NAMES(OBJECT_VALUES(owners)[*].property[*].cars[*])) USING GSI;

N1QL

SELECT a.*
 FROM default a  
WHERE "Chevrolet" IN OBJECT_NAMES(OBJECT_VALUES(a.owners)[*].property[*].cars[*])

Should work.

3 Likes

YES! This worked. Thanks a ton, appreciate the time!!

And if I want to combine owners and car this works too

SELECT a.*
FROM testbucket a
WHERE “GMC” IN OBJECT_NAMES(OBJECT_VALUES(a.owners)[].property[].cars[*])
and ANY obj IN OBJECT_NAMES(a.owners) SATISFIES obj == “ownerABC” END

Also is there a way to limit the response to only the owner that satisfies the criteria. So say the document has ownerABC, ownerXYZ, OwnerEFG. Since ownerABC matches, it returns the whole document with all the owners. Can I return just the obj that matches?

Thanks!

1 Like