N1QL vs. Views performance with Node sdk

After fiddling with the Ruby sdk, I decided to redo some benchmark scripts with the node sdk so I could work with prepared indexes.

The good news is that the async nature of node makes these scripts way faster. What took about 30 seconds with the ruby sdk takes from 5 to 10 seconds with node.

The bad news is that 1) you can overload couchbase with too many concurrent n1ql calls and 2) even with prepped indexes n1ql comes out about 50% slower than traditional views:

Average time with
views: 4.7s
http (calling n1ql queries over http): 5.5s
sdk (calling n1ql queries via the sdk methods): 7.2s

This is with DP 4.6

The trouble here is that while it may make sense to move from views to n1ql to make things more flexible and hopefully reduce the view indexing load, if it will end up slowing things down by 50%, we can’t really justify it.

Hey @james,

Could you share the code that you are using for testing? I suspect that the additional error handling and other behaviours in libcouchbase to improve the developer experience might be causing the http ↔ sdk gap, but would need to see more of the code to truly understand.

Cheers, Brett

Sure @brett19,

For http, I’m just sending a request to the api:

var fullUrl = 'http://localhost:8093/query/service?prepared="get_jobs_by_cpu_id"&max_parallelism=0&$cpu_id="' + cpu_id + '"';
request({url: fullUrl, json:true}, function(err,res,body){
    console.log("Got via http: " + cpu_id.red, body.results.length);
});

For the sdk, it looks like this:

var q = "select * from master where _type='printer_job' and printer_cpu_id=$cpu_id";
var n1q = N1qlQuery.fromString(q);
if (rowsSent == 1) {
    // only set adhoc false the first time
    console.log("Jobs adhoc: false".red);
    n1q.adhoc(false);
}

bucket.query(
    n1q,
    { 'cpu_id': cpu_id },
    function(err,result){
        if (err) {
            console.log("Error getting jobs:", err);
            return;
        }

        console.log("Jobs for " + cpu_id.green + ": " + result.length);
    }
);

Hey @james,

Looks like you are only setting adhoc to false once. The SDK actually internally tracks the prepared statements. You should be sending the same query every time, with .adhoc(false) on all of them to ensure that the query is properly prepared and then run as a prepared statement. The reason the SDK works this way is that using prepared statements becomes transparent and you are no longer required to keep track of this state yourself.

Try this instead:

var qs = "select * from master where _type='printer_job' and printer_cpu_id=$cpu_id";
var q = N1qlQuery.fromString(qs).adhoc(false);

bucket.query(
    q,
    { 'cpu_id': cpu_id },
    function(err,result){
        if (err) {
            console.log("Error getting jobs:", err);
            return;
        }

        console.log("Jobs for " + cpu_id.green + ": " + result.length);
    }
);

P.S. Using max_parallelism=0 on the HTTP side, but not the SDK side will not produce a fair comparison between the two. You should remove the option from your HTTP side, or alternatively add it to the SDK query with something like:

q.options.max_parallelism=0;

Cheers, Brett

Thanks @brett19

I switched to adhoc(false) for all the calls and it gave some interesting results. The time to the first response from the db grew (almost doubled) but the time from the first response to the last shrank a little.

This is for a smaller data run (1000 calls):

no adhoc() set:

  • Time to first response 0.293s
  • Total time 2.644s

adhoc(false) for all calls:

  • Time to first response: 0.576s
  • Total time 3.089s

Not sure what to make of this except that it seems couchbase may be taking some time to prep the statement “on the fly”.