Hello!
I have a question about building a GSI on document keys (with some key transformation): will it be more efficient than building a GSI on document fields? And what is the right way to do this in N1QL?
I have documents of 2 types in my bucket:
- user
with keys like
user:userId
for example
user:1234
and json data like
{
  "docType": "user",
  "userId": 1234,
  "userattr1": "abc",
  …
}
- user device
with keys like
device:userId:deviceType:deviceId
for example
device:1234:MOB:12345-12345-12345
and json data like
{
  "docType": "device",
  "userId": 1234,
  "deviceId": "12345-12345-12345",
  "deviceType": "MOB",
  "deviceattr1": "abc",
  …
}
And I have the following queries:
- get user by key
- get device by key
- get devices by user id
Documents are updated more often than they are created.
Query-by-key performance is not a problem, but once the bucket reached about 50 million documents, MR view updates became too slow, because we require strong view query consistency.
So I need to query devices by user id, and I want to try N1QL indexes instead of MR views.
I’ve built an index like this:
CREATE INDEX idx_user_doc ON mybucket(userId, docType)
and I use a query like this:
SELECT * FROM mybucket WHERE userId = 12345 AND docType = 'device'
But for my purpose I don’t really need the JSON fields; all the info I need (the user id) is in the document key.
I wonder whether it would be more efficient to build the index on just the document key, so that Couchbase doesn’t need to parse the JSON data on each document attribute update. The index would then be updated only when a document is created or deleted, not on each attribute update (those don’t change the key).
What is the right way to build the index in this case?
Can I parse META().id somehow when creating the index, so that I can extract userId from the device document key and build the index on that?
Will it be more efficient than building the index on JSON attributes?
Also, I’d like to have consistent reads when querying devices by user id. It’s OK if the JSON itself is a bit outdated; what I don’t want is to miss a document altogether (so key consistency is required, but data consistency is not).
I have a feeling that maybe I need to use the primary index instead, but how would I parse the keys, and would it be efficient?
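To make the primary index idea concrete, here is an untested sketch. Since device keys start with device:userId:, a prefix match on META().id might be enough, with no key parsing at all. It assumes a primary index exists on mybucket and that the planner turns the LIKE prefix into a range scan on the keys, which I have not verified.

```sql
/* Assumption: the bucket has a primary index, e.g. created with: */
CREATE PRIMARY INDEX ON mybucket;

/* Prefix match on the document key. The trailing ":" after the user id
   keeps user 1234 from also matching keys of user 12345. */
SELECT META().id
FROM mybucket
WHERE META().id LIKE "device:1234:%";
```

Selecting only META().id would fit my case, since I mainly need the keys and can then fetch the documents by key.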
Edit: I have found the SPLIT function, so I guess I can split the key by ":", but I still wonder whether I should do it at all if it’s a GSI and not the primary index.
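Roughly what I have in mind with SPLIT, as a sketch only: the index name idx_device_user is made up, and I have not measured whether this beats the index on JSON fields. It indexes the second segment of the key (the user id) and is restricted to device documents by key prefix.

```sql
/* Hypothetical index on the userId segment of the document key.
   For "device:1234:MOB:12345-12345-12345", SPLIT(META().id, ":")[1] is "1234". */
CREATE INDEX idx_device_user
ON mybucket(SPLIT(META().id, ":")[1])
WHERE META().id LIKE "device:%";

/* The query repeats the indexed expression and the index condition.
   SPLIT returns strings, so the user id is compared as "1234", not 1234. */
SELECT META().id
FROM mybucket
WHERE SPLIT(META().id, ":")[1] = "1234"
  AND META().id LIKE "device:%";
```

Because the indexed expression depends only on the key, my hope is that attribute updates would not change the index entry, but that is exactly the part I am unsure about.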
Thanks in advance,
Cho