How can I get and set multiple keys on the Couchbase server by batching them, so that I make one network call containing multiple keys instead of the many parallel network calls that usually happen with the async APIs?
Since documents reside on different nodes, there is no server API that accepts multiple documents. Each KV operation may go to a different server, so each one is a separate request. Some SDKs expose multi-document APIs, but they are implemented by calling the single-document server APIs. With the asynchronous SDK APIs, the requests can be issued concurrently and the responses processed concurrently.
Yeah, but suppose I have n keys: the asynchronous way would then involve making n parallel calls. In my current scenario I have around 8k requests, each holding 1500 keys, so I was wondering whether it is possible to pass the 1500 keys in one network call and let Couchbase handle the retrieval. Otherwise we could potentially end up with 1.2 million parallel calls.
The SDKs have thresholds for the number of in-flight requests. When that threshold is reached the SDKs will queue the request. When the maximum queue length is reached requests are rejected by the SDK.
But long before that, it would be beneficial for your application itself to limit the number of concurrent calls. Which is better: 1.2 million calls that are each 1% complete, or 12,000 calls that are 100% complete and 1.188 million calls that are 0% complete?
Also having 1.2 million concurrent calls (and also one request for 1.2 million documents) requires memory to hold that request, and that whole response.
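A minimal sketch of limiting concurrent calls on the application side, using only the JDK. Everything named here is an assumption for illustration: `BoundedBulkGet`, `getAll`, and `fakeFetch` are hypothetical, `fetch` stands in for whatever asynchronous single-document get you use, and the limit of 256 is arbitrary. It is not how the SDK itself queues requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class BoundedBulkGet {

    // Fetches every key while keeping at most maxInFlight requests outstanding.
    // `fetch` is a hypothetical stand-in for an async single-document get.
    static <V> Map<String, V> getAll(List<String> keys,
                                     Function<String, CompletableFuture<V>> fetch,
                                     int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        Map<String, V> results = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> pending = new ArrayList<>(keys.size());

        for (String key : keys) {
            // Blocks the submitting thread once maxInFlight requests are outstanding.
            permits.acquireUninterruptibly();
            CompletableFuture<V> request;
            try {
                request = fetch.apply(key);
            } catch (RuntimeException e) {
                permits.release(); // the request never started, so return the permit
                throw e;
            }
            pending.add(request
                // Null values (e.g. a missing document) are skipped in this sketch.
                .thenAccept(value -> { if (value != null) results.put(key, value); })
                // Release the permit whether the request succeeded or failed.
                .whenComplete((ignored, error) -> permits.release()));
        }

        // Wait for the tail of the batch; join() rethrows if any request failed.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(16);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        // Simulated fetch that records how many requests are outstanding at once.
        Function<String, CompletableFuture<String>> fakeFetch = key -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            return CompletableFuture.supplyAsync(() -> {
                inFlight.decrementAndGet();
                return "value-of-" + key;
            }, pool);
        };

        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1500; i++) keys.add("key-" + i);

        Map<String, String> docs = getAll(keys, fakeFetch, 256);
        pool.shutdown();
        System.out.println(docs.size() + " documents, peak in-flight " + peak.get());
    }
}
```

The semaphore makes the producer wait instead of piling up requests, so memory is bounded by the limit rather than by the total number of keys. In this sketch a single failed request makes the whole batch fail at `join()`; a real application would probably want per-key error handling.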
If you want to experiment with concurrent calls in the Java SDK, there is a test app on GitHub: mikereiche/loaddriver
For 1.2 million documents you should consider alternatives such as eventing.
Hi @Utkarsh3746 ,
As Mike noted, the Couchbase Java SDK coalesces network packets behind the scenes to limit the number of syscalls. However, you are right to be concerned about issuing too many requests at once. Some kind of throttling or backpressure is required.
The Couchbase Java SDK’s reactive API is geared towards this sort of use case. At the end of this post I’ve shared a link to an example that uses the reactive API to get a whole bunch of documents at once. With some elbow grease, you could adapt it for upsert as well.
Project Reactor operators have a “concurrency” parameter that limits how many things can happen at once. In this example it’s used to limit the number of in-flight requests per batch to 256 (you could tune it to suit your application / workload).
Thanks,
David
Hi, thanks, I will definitely check that out. I have another question. For my use case, if I group the keys by the bucket they have to be fetched from, can I use the REST APIs for bulk-get, or perhaps something like N1QL queries, to reduce the number of network calls?
There is no KV REST API for bulk-get.
When you use N1QL, the query engine makes the network requests to the data nodes to retrieve the documents, so using N1QL adds a network request rather than removing one. You will also find that retrieving a document via N1QL is much slower than via KV, due to the additional processing. But certainly do your own benchmarking.