CouchBase - Java SDK - Connection Pool

Hi,

I am new to CouchBase, and need some help regarding connection pool, what configuration should be done to execute parallel execution through multiple threads.
We have one server node, in the below example one CouchbaseCluster is created, the opened bucket is reused in all threads.

We are connecting to the Couchbase Server through Java SDK, and we have some issues with the concurrent, parallel db access. I made a test, executing the same simple select query: select meta().id, lon, lat, instances from Airports where documentType =‘abc’ limit(10000).
Normally, one execution takes like 600 ms. If I run the above test with one thread, repeating the selects e.g. 9 times, everything is OK, total test takes 5400 ms.
Now if I execute 3 threads in parallel, executing the same select, each thread starting in the same time, each execution takes 1800 ms. Just like only one select is executing in the couchbase, and each thread is waiting for the other to finish its select.
So the 3 parallel thread/3 executions gives me the same time, as 1 thred/9 executions.
Also in CouchBase console it seems only one query is executed at a time.

What could be the problem, whythe parallel executionselects in CouchBase is not working?

Thank you

Which SDK version are you using? With tweaking we can get every version to your perf requirements but if you use 2.4.2 chances are OOTB performance is much better even without tweaking!

Thanks for you reply.

I am using 2.4.2 version of the java-client, but with the same results as for 2.3.6. A slightly difference that I noticed in 2.4.2 that the parallel threads blocked each other till each finished its select (but maybe this is just in the logging).

There is no possibility to configure the ‘connection pool’ size, either on server or client side? Only with tweaking?
I am used with SQL Server, and at first sight it seems strange that only one query can be executed at a time in couch.
In production, at high load, with many requests this will not cause blocks on some level?

Can you show me your code how you are using the SDK please? It is thread safe and concurrent, no pool needed and you need to reuse the bucket as a singleton throughout the lifetime of the app!

Please see below.
There is a main class, which calls a Thread class.
I made two tests: first in the sub class with the actual couch query, then with a simple Thread.sleep().
I would expect that the test with the query would act the same as the Thread.sleep() test: when executed with 3 parallel threads - each thread executed 3 times would have a 3 times faster execution time than 1 thread executed 9 times. (and the average time per thread/query would remain the same, not almost 3 times slower than in the 3 parallel threads case).
Thank you.

package couch;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CouchbasePerfTestController {
public static void main(String args[]) throws Exception {

    CouchbaseCluster cluster = CouchbaseCluster.create(Collections.singletonList("192.168.96.110"));
    Bucket bucket = cluster.openBucket("Hotel", "xxxxxx");

    try {
        int maxThreads = 1; // OR the second test with 3 parallel threads
        int batch = 9;

        int threadIndex = 1;
        int maxCycles =  batch/maxThreads;

        List<Thread> allThreads = new ArrayList<>();

        long startTime = System.currentTimeMillis();

        while (threadIndex <= maxThreads) {
            CouchThread couchThread = new CouchThread();

            couchThread.setCycles(maxCycles);
            couchThread.setBucket(bucket);

            Thread t = new Thread(couchThread);
            t.start();

            allThreads.add(t);

            threadIndex++;
        }

        for (Thread t : allThreads) {
            t.join();
        }

        long endTime = System.currentTimeMillis() - startTime;

        double avg = endTime / (batch/maxThreads);
        System.out.println("*** Total time " + endTime + " millis, average is " + avg + " millis ");

        bucket.close();
        cluster.disconnect();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

}

package couch;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.query.N1qlQuery;

public class CouchThread implements Runnable {

private int cycles;
private Bucket bucket;

public int getCycles() {
    return cycles;
}

public void setCycles(int cycles) {
    this.cycles = cycles;
}

public Bucket getBucket() {
    return bucket;
}

public void setBucket(Bucket bucket) {
    this.bucket = bucket;
}

public void run() {

    int innerIndex = 1;

    while (innerIndex <= cycles) {
        long start = System.currentTimeMillis();

// System.out.println(“Current thread start time:” + Thread.currentThread().getId() + “, index:” + innerIndex + " — " + start);

        //1. Call query
        N1qlQuery query = N1qlQuery.simple("select meta().id, lon, lat, instances from Hotel where documentType ='bas' limit(10000)");
        bucket.query(query);

          //2. Thread sleep

// try {
// Thread.sleep(500);
// } catch (InterruptedException e) {
// e.printStackTrace();
// }

        long end = System.currentTimeMillis();
        System.out.println("Current thread time:" + Thread.currentThread().getId()+ ", index:" + innerIndex + " --- time: " + (end-start));

        innerIndex++;
    }
}

}

Another approach with Executors, with the same result (in this approach I could not see the execution time for each thread individually).
Sorry for the mess, I tried, but I couldn’t add the code as attachment.

package couch;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;

import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CouchbasePerfTestController {
public static void main(String args[]) throws Exception {

    CouchbaseCluster cluster = CouchbaseCluster.create(Collections.singletonList("192.168.96.110"));
    Bucket bucket = cluster.openBucket("Hotel", "xxxxxxx");

    int maxThreads = 3;
    int batch = 9;

    ExecutorService executor = Executors.newFixedThreadPool(maxThreads);

    Stream<Integer> intStream = new Random().ints(batch, -1, 1 ).boxed();
    List<Integer> intList = intStream.collect( Collectors.toList() );
    long startTime = System.currentTimeMillis();

    //1. Call query
    List<Callable<N1qlQueryResult>> callableList = intList
            .stream()
            .map(item -> (Callable<N1qlQueryResult>) (() -> bucket.query(N1qlQuery.simple("select meta().id, lon, lat, instances from Hotel where documentType ='bas' limit(10000)"))))
            .collect(Collectors.toList());

    //2. Thread sleep

// List<Callable> callableList = intList
// .stream()
// .map(item -> (Callable) (() -> {
// try {
// Thread.sleep(500);
// } catch (InterruptedException e) {
// e.printStackTrace();
// }
// return 1;
// }))
// .collect(Collectors.toList());

    List<N1qlQueryResult> ids = executor.invokeAll(callableList)
            .stream()
            .map(future -> {
                try {
                    return future.get();
                }
                catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            })
            .collect( Collectors.toList() );

    long endTime = System.currentTimeMillis() -startTime;

    double avg = endTime / intList.size();

    cluster.disconnect();

    System.out.println("*** Total took " + endTime + " millis for " + intList.size() + " calls, average is " + avg + " millis ");
}

}

Hi Daschl,

Have you had a chance to look to the above examples?

Thank you

@CoucheEb would you be able to provide a sample against the travel-sample bucket which exhibits the same problems? Also with the index definition used, or ideally one that is already provided. I’m asking since I don’t have your dataset and its hard to figure out whats going on otherwise.

The other thing you can do is, on the environment, turn down the emit time for the latency metrics which will dump JSON to the console and you can inspect the time it takes until a response comes back from the server. This can help isolate client from server latency.

@daschl Thanks for the advice, I will try your suggestions, and will come back with results.