Hi,
I tried to convert our very large Couchbase application to the new SDK 3.0.1. Our main concern is performance. Our application relies mainly on the get-by-key feature and must be able to retrieve a large number of keys as fast as possible.
The typical usage is: receive a batch of keys, retrieve all the documents, return the documents.
On our benchmark test:
- Couchbase Cluster: 1 node, 8 CPU 4 GHz, 64 GB RAM, Ubuntu 18.04
- Application Server: 1 node, 16 CPU 4 GHz, 64 GB RAM, Ubuntu 18.04
- Connection: wired 1 Gb/s (no firewall, no proxy)
Operation: 100000 * GetAsync(Key) by partition range (TransformBlock, BufferBlock), partition size 50
Couchbase Connection Pool: Min = 25, Max = 25
SDK 2.7: 5-6 s to retrieve the 100000 records (bucket ops > 20K).
SDK 3.0.1: 380 s to retrieve the same 100000 records (bucket ops = 300-400).
Our code is more or less:
return await keys.SelectParallelRangeAsync(k=> collection.GetAsync(k), 50).ConfigureAwait(false);
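SelectParallelRangeAsync is a helper:
// Requires: using System; using System.Collections.Generic; using System.Linq;
//           using System.Threading.Tasks; using System.Threading.Tasks.Dataflow;
public static async Task<T2[]> SelectParallelRangeAsync<T, T2>(this IEnumerable<T> sequence, Func<T, Task<T2>> action, int batchSize)
{
    // Run up to batchSize invocations of action concurrently.
    var batcher = new TransformBlock<T, T2>(doc => action(doc), new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = batchSize });
    var buffer = new BufferBlock<T2>();
    batcher.LinkTo(buffer);
    foreach (var item in sequence)
    {
        batcher.Post(item);
    }
    batcher.Complete();
    // Wait until every posted item has been processed and handed to the buffer.
    await batcher.Completion.ConfigureAwait(false);
    if (buffer.TryReceiveAll(out var results))
    {
        return results.ToArray();
    }
    return null;
}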
For now we are staying on 2.7 for performance reasons. Please advise.
Best regards,
David.