I would always expect mapGet() to be quicker than a N1QL query, given it is a simpler (and hence less flexible, but faster) API.
The numbers you are quoting seem quite high - what exactly are you timing? Specifically, are you including the time to perform the initial bucket connect (couchbase.getBucket()) in your measurements?
Given that’s a one-off task (you should connect once at the start of the application and re-use the same Bucket object) it doesn’t make sense to include in your timings.
getBucket is indeed done outside of the measurement. The timing is the elapsed time between two System.nanoTime() calls, one placed just before the for loop and one just after its closing brace.
E.g.
long start = System.nanoTime();
for (xxx) {
    // operation under test
}
long end = System.nanoTime();
long elapsed = end - start; // elapsed time in nanoseconds
I ran the test on the very box where one of the Couchbase nodes resides, to minimize latency.
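For reference, the measurement described above can be sketched as a small self-contained harness. This is only an illustration under assumptions: the class name TimingSketch, the helper timeNanos, the warm-up count, and the iteration count are all hypothetical, and the timed operation is a stand-in rather than a real Couchbase call. The untimed warm-up loop is an addition not present in the original test; it is there because JIT compilation can inflate the first iterations of a Java benchmark.

```java
public class TimingSketch {

    /**
     * Runs op `warmup` times without timing it, then returns the total
     * elapsed nanoseconds for `iterations` timed runs.
     */
    static long timeNanos(Runnable op, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            op.run(); // untimed warm-up (assumption: not part of the original test)
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            op.run(); // operation under test
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 1000; // hypothetical iteration count
        long[] sink = new long[1]; // keeps the stand-in work from being optimized away

        // Stand-in workload; in the real test this would be the query or mapGet call.
        long elapsed = timeNanos(
                () -> sink[0] += System.identityHashCode(new Object()),
                100, iterations);

        System.out.printf("total: %.3f ms, average: %.3f us per call%n",
                elapsed / 1e6, elapsed / 1e3 / iterations);
    }
}
```

Reporting the average per call alongside the total makes results from runs with different iteration counts comparable.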
I will probably spend more time next week walking through the different method calls and the execution plan of the query, just to be sure that the timing is not off due to an incorrect setup.
Here are my assumptions; please correct me if I am wrong:
The cost of compiling N1QL is a fixed cost that depends on the complexity of the query. In this case, 7 ms per query.
The cost of REQUEST_PLUS depends on how often the index is updated (in our case, we use ForestDB).
Here are my questions:
Does mapGet depend on an index? If not, am I right that the performance of mapGet depends on whether the document is in memory?
Does mapGet guarantee that it reads the latest version of the document? If not, can it be configured to do so?
My general use case is this: I currently fetch the subset with select outline from bucket where xx = 1 AND yy = 2 AND zz = 3, and I am deciding whether to split this into two calls: select meta().id from bucket where xx = 1 AND yy = 2 AND zz = 3, followed by fetching a subset of each document. My understanding is that fetching this way (two calls) relieves the query engine of parsing the JSON in order to return the result.
Here are the results from my test, running the Java application on the box that hosts Couchbase, so I/O should be minimal.
query outline with no index (adhoc = false, REQUEST_PLUS): succeed (1010 ms)
query outline with no index (adhoc = true, REQUEST_PLUS): succeed (1708 ms)
query outline with no index (adhoc = false, NOT_BOUNDED): succeed (593 ms)
query outline with no index (adhoc = true, NOT_BOUNDED): succeed (1378 ms)
mapget outline: succeed (134 ms)
query metadata with index (adhoc = false, REQUEST_PLUS): succeed (564 ms)
query metadata with index (adhoc = true, REQUEST_PLUS): succeed (1554 ms)
query metadata with index (adhoc = false, NOT_BOUNDED): succeed (399 ms)
query metadata with index (adhoc = true, NOT_BOUNDED): succeed (1309 ms)
mapget metadata: succeed (100 ms)
The index was created with: create index sessionBuckettest on sessionBucket (meta().id, metadata);
The code around the query (where y = outline or metadata, and x is the N1qlParams as defined in the brackets):
mapGet() would use a streaming parser over the item internal to the server where the item is in the active vbucket. It does not depend on an index. It does indeed need to be in memory, and it’ll be fetched if needed.
Since the request is going to the active vbucket, and that system is responsible for the item, it’ll always operate against the latest.
I note that there isn’t any API for doing a mapGet from a replica. In the typical use of get (if you replicate at least once),
try {
    doc = b.get(id);
} catch (Exception e1) {
    doc = b.getFromReplica(id, ReplicaMode.FIRST).get(0);
}
Is that necessary with mapGet? If yes (assuming that mapGet always fetches from the ‘primary’ node and never from a replica), does it mean that I should at least try
try {
    doc = b.mapGet(id, subfield, JsonObject.class);
} catch (Exception e1) {
    doc = b.mapGet(id, subfield, JsonObject.class); // <- this will fail over to the replica if auto failover takes place
}
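The try-then-fall-back shape being asked about can be sketched generically, independent of Couchbase. This is only an illustration and does not settle whether mapGet needs it: the class FallbackSketch and the helper withFallback are hypothetical names, and the two suppliers are stand-ins for the primary call and whatever second attempt (replica read or retry) turns out to be appropriate.

```java
import java.util.function.Supplier;

public class FallbackSketch {

    /**
     * Returns the result of `primary`; if it throws a RuntimeException,
     * returns the result of `fallback` instead. An exception thrown by
     * `fallback` propagates to the caller.
     */
    static <T> T withFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a primary read that fails, e.g. because the active node is unreachable.
        Supplier<String> failingPrimary = () -> {
            throw new IllegalStateException("active node unavailable");
        };
        // Stand-in for the second attempt.
        Supplier<String> secondAttempt = () -> "value from second attempt";

        String value = FallbackSketch.<String>withFallback(failingPrimary, secondAttempt);
        System.out.println(value);
    }
}
```

Keeping the fallback in one helper means the decision about what the second attempt should be is made in a single place rather than repeated at every call site.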