Array Index problem

allen.zhao · January 19, 2018, 6:03pm

Hi Folk,

I got an Nested Array Index problem. I followed the reference and blog, but no luck. Need help from expertise.

Doc Structure:

{
	"type": "data",
	"name":"abc",
	"dc":"a1",
	...
	"Server": [
		{
			"Name": "webextsrv1",
			"Port": 1234,
			...
		},
		{
			...
		},
		...
	],
	"Cluster": [
		{
			"Name": "webext",
			"ClusterAddress": "x.x.x.x",
		},
		...
	],
	"Application": [
		{
			"Name": "myweb",
			"Version": "1.0",
			"Order": 100,
			...
		},
		...
	],
	...
}

Index:

CREATE INDEX `data-server1` ON `EM-NP`(ALL ARRAY a.Name FOR a IN Application END, Application)
WHERE (`type` = "data")

CREATE INDEX `data-server2` ON `EM-NP`(ALL Application)
WHERE (`type` = "data")

Count query:

select count(*)
from `EM-NP` e
USE INDEX (`data-serverX`)
UNNEST e.`Application` a
WHERE e.`type` = "data" AND a.Name like '%'  <== this is required in 4.5 to enable index.

Paged data query:

select meta(e).id as data_id, 
e.`name`, e.dc, ...
t.ownerteam, t.supportteam, t.dmz, t.enabled, 
a.* 
from `EM-NP` e 
USE INDEX (`data-serverX`)
LEFT join `EM-NP` t ON KEYS e.spec_id 
UNNEST e.`Application` a 
where e.`type` = 'data' AND a.`Name` like '%'
ORDER BY e.dc asc 
OFFSET 0 LIMIT 25

Problem 1, for index data-server1:

For count query, query time looks good, but count is incorrect, much less than it should be. E.g: 5000 vs 25000 (should be)

For data query: index is being used, however, when I request data after 5000, it resturns []

Problem 2, for index data-server2:

For both query, it seems index is not being used, however, count query speed is acceptable.
For data query, it take more than 1 minute to return data.

I also found a funny thing: if I change the query to this, it is faster (half of the time, 30+ seconds)

select meta(e).id as data_id, 
e.*,
t.*, 
a.* 
from `EM-NP` e

I tried both indices in couchbase 4.5 and 5.0 community edition, same issue.

All I want is some indices for nested arrays in each docs. I have about 2000 docs, each doc contains a bunch of arrays of data different type, such as Cluster, Application, Server… I need return a page of type data based on dynamic query conditions. Everything works fine, except the query execution time is not acceptable.

Appreciate for any help.

Allen

vsr1 · January 19, 2018, 6:34pm

Check your indexer.log make sure there are no skipped items due to index key size.
a.Name like ‘%’ means you want all the Name that are string. Is that is your use case or Are you looking something a.Name like ‘Prefix%’.

CREATE INDEX `data-server1` ON `EM-NP`(ALL ARRAY a.Name FOR a IN Application END) WHERE (`type` = "data")
SELECT SUM(ARRAY_COUNT(ARRAY 1 FOR v IN e.Application WHEN v.Name LIKE '%' END))
FROM `EM-NP` e WHERE e.`type` = "data" AND ANY a IN e.Application SATISFIES a.Name LIKE '%' END;

CREATE INDEX `data-server1c` ON `EM-NP`(ALL ARRAY a.Name FOR a IN Application END, Application) WHERE (`type` = "data")

data-server1c index makes cover but if the array is too big pre-5.0 it may skip indexing the document.

data-server2 index doesn’t qualify because the predicate needs to be matched with array index key. In your case it is o
bject so you need to give a = {“Name”…}

data query involves JOIN/UNNEST i.e 2 JOINS and the documents can be multiplier that is why it taking time

allen.zhao · January 19, 2018, 8:20pm

You are right. There are lots skipped items.

If I want to keep the doc structure, is there any known work-around?

All my code are based on this structure.

Regards,

Allen

vsr1 · January 19, 2018, 8:27pm

In 5.0.0 you should not have problem (no index key size limit or it might be very large number) .
Pre 5.0.0 remove Application as index key. This will uses non covered can (i.e fetch the document)

CREATE INDEX `data-server1` ON `EM-NP`(ALL ARRAY a.Name FOR a IN Application END)
WHERE (`type` = "data");

Based on COUNT query the following performs better. UNNEST acts as like JOIN. In your case you need only count not actual unnested document the following works better.

SELECT SUM(ARRAY_COUNT(ARRAY 1 FOR v IN e.Application WHEN v.Name LIKE ‘%’ END))
FROM EM-NP e WHERE e.type = “data” AND ANY a IN e.Application SATISFIES a.Name LIKE ‘%’ END;

Topic		Replies	Views
Query won't use a nested array index SQL++ query , index	2	801	October 21, 2019
Strange issue with index SQL++ n1ql	5	1209	April 2, 2019
Unsing array indexing what's wrong? SQL++	3	1294	September 30, 2016
Problem with Array Indexing SQL++	6	2216	April 9, 2016
Array index not being used by query SQL++ query , n1ql , index	6	2406	January 11, 2018