GSI on parsed document key

Hello!

I have a question about building a GSI on document keys (with some key transformation): will it be more efficient than building a GSI on document fields? And what is the right way to do so via N1QL?

I have documents of 2 types in my bucket:

  1. user

with keys like
user:userId
for example
user:1234

and JSON data like
{
  "docType": "user",
  "userId": 1234,
  "userattr1": "abc",
  ...
}

  2. user device

with keys like
device:userId:deviceType:deviceId
for example
device:1234:MOB:12345-12345-12345

and JSON data like
{
  "docType": "device",
  "userId": 1234,
  "deviceId": "12345-12345-12345",
  "deviceType": "MOB",
  "deviceattr1": "abc",
  ...
}

And I have the following queries:

  1. get user by key
  2. get device by key
  3. get devices by user id
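The first two are plain key lookups; in N1QL terms they would be something like this (using the example keys above):

SELECT * FROM mybucket USE KEYS "user:1234";
SELECT * FROM mybucket USE KEYS "device:1234:MOB:12345-12345-12345";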

Document updates happen more often than document creation.
There is no problem with query-by-key performance, but once we reached about 50 million documents in the bucket, MapReduce view updates became too slow, because we require strong view query consistency.

So I need to query devices by user id, and I want to try N1QL (GSI) indexes instead of MapReduce views.

I've built an index like this:
CREATE INDEX idx_user_doc ON mybucket(userId, docType)

and use a query like:
SELECT * FROM mybucket WHERE userId = 12345 AND docType = 'device'

But for my purpose I don't really need to use the JSON fields; all the information I need (the user id) is in the document key.
I wonder whether it would be more efficient to build the index on just the document key, so that Couchbase doesn't need to parse the JSON data on every document attribute update. The index would then be updated only when a document is created or deleted, not on every attribute update (those don't change the key).

What is the right way to build the index in this case?
Can I parse META().id somehow when creating the index, so that I can extract the userId from the device document key and build the index on that?
Will it be more efficient than building the index on JSON attributes?
And again, I'd like to have consistent reads when querying devices by user id. It's OK if the JSON itself is a bit outdated; what I don't want is to miss the document itself (so key consistency is required, but data consistency is not).

I have a feeling that maybe I need to use a primary index instead, but how would I parse the keys, and would it be efficient?

Edit: I have found the SPLIT function, so I can split the key by ":", I guess, but I still wonder whether I should do it at all, and whether it should be a GSI rather than a primary index.
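For reference, if I understand SPLIT correctly, applying it to one of my device keys returns an array with the user id at position [1]:

SELECT RAW SPLIT("device:1234:MOB:12345-12345-12345", ":");
/* returns ["device", "1234", "MOB", "12345-12345-12345"] */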

Thanks in advance,
Cho

Yes, you can index the parsed document key.
Example:

CREATE INDEX idx_device_user ON mybucket(SPLIT(META().id, ":")[1]) WHERE SPLIT(META().id, ":")[0] = "device";
SELECT RAW META(t).id FROM mybucket AS t WHERE SPLIT(META(t).id, ":")[0] = "device" AND SPLIT(META(t).id, ":")[1] = "12345";
SELECT RAW t FROM mybucket AS t WHERE SPLIT(META(t).id, ":")[0] = "device" AND SPLIT(META(t).id, ":")[1] = "12345";

Remember that the userId obtained from the document key is a string.
If you need the whole document, project the document key and use the SDKs to fetch the documents asynchronously, and don't access any document fields in the query. If you do access fields, the expressions on the document key become overhead, and indexing the document fields might be the better option.
UPDATE always needs the whole document; if updates are the majority of your operations, indexing the document fields might be better. You can also consider the SDKs' sub-document API for updates.
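For illustration (the value "xyz" is just a placeholder), even a single-field N1QL UPDATE like the one below reads and writes back the whole document, whereas the sub-document API sends only the changed field:

UPDATE mybucket AS t
USE KEYS "device:1234:MOB:12345-12345-12345"
SET t.deviceattr1 = "xyz";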
Don't create a primary index.

You can experiment and measure the performance to see which option performs better.
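For example, EXPLAIN shows which index the planner picks and whether the index covers the query, so you can compare the two approaches:

EXPLAIN SELECT RAW META(t).id FROM mybucket AS t
WHERE SPLIT(META(t).id, ":")[0] = "device" AND SPLIT(META(t).id, ":")[1] = "12345";
/* the plan should show an index scan on the key-based index with a "covers" list,
   because the query only needs the document key */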

https://blog.couchbase.com/create-right-index-get-right-performance/

Thanks a lot, it works!
I'll run more tests once the indexing of the production-size test data set has finished.
Cheers!