I have created a Flink (Java) job to stream data from a Kafka topic to Couchbase.
In my Flink logic, I run several N1QL queries before upserting to Couchbase.
I have created a GSI index, and each query takes only around 30 ms when tested individually.
I have loaded 1.6 million records into Kafka to be processed into Couchbase.
Throughput is only around 200 ops/sec, which is too slow to process the whole data set.
For example, before inserting a document into Couchbase, I need to query the existing bucket with some criteria from the document, then append the result to my document before inserting it.
Essentially, I am trying to build something like Eventing, but in Java.
My query is simple: a select from the bucket with a filter on an indexed field, and it is quite fast.
Is there any parameter to tune the performance?
What is the best practice for this requirement?
My N1QL query runs before each upsert of the data.
I have increased queryEndpoints and kvEndpoints, but I still only get around 300 ops/sec.
I need more than 1,000 ops per second.
Is there any other parameter/setting to tune or to maximize the performance?
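For reference, this is roughly how I raised those settings (Couchbase Java SDK 2.x; the endpoint counts are just the values I tried):

```java
// SDK 2.x environment tuning sketch; values are illustrative, not recommendations.
CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
    .queryEndpoints(8)  // more parallel N1QL connections per node
    .kvEndpoints(4)     // more parallel KV connections per node
    .build();
Cluster cluster = CouchbaseCluster.create(env, "couchbase://my-host"); // hostname illustrative
```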
How are you writing the documents to Couchbase? Are you upserting one document at a time using blocking methods? I wonder if you would see better performance if you could batch the documents into a tumbling window and upsert each window using async upserts.
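To illustrate the batching half of that idea: the sketch below just groups documents into fixed-size batches the way a tumbling (count) window would — in a real job, Flink's own window operators would do this grouping, and each batch would then be written with the SDK's async bulk-upsert pattern instead of one blocking upsert per document. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class WindowBatcher {
    // Groups documents into fixed-size batches, mimicking a tumbling count
    // window. Each full batch would then be written with one async bulk
    // upsert (in SDK 2.x, roughly:
    //   Observable.from(batch).flatMap(doc -> bucket.async().upsert(doc))
    // ) rather than a blocking upsert per document.
    static <T> List<List<T>> toBatches(List<T> docs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                docs.subList(i, Math.min(i + batchSize, docs.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 10; i++) docs.add("doc-" + i);
        List<List<String>> batches = toBatches(docs, 4);
        System.out.println(batches.size() + " batches, first has "
            + batches.get(0).size() + " docs");
    }
}
```

The win comes from keeping many upserts in flight at once, so the per-operation network round trip stops being the bottleneck.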
Alternatively, I wonder if you could combine the query and the update into a single N1QL statement. I’m not a N1QL guru so I don’t know what that statement would look like… just putting the idea out there.
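If the data you need to look up already lives in a bucket, something along the lines of N1QL's MERGE might cover the lookup-plus-upsert in one statement. This is only a sketch — the bucket names `target` and `source` and the field names are made up:

```sql
-- Hypothetical: enrich-and-upsert in one statement instead of query-then-upsert.
MERGE INTO target t
USING source s ON KEY s.docId
WHEN MATCHED THEN
  UPDATE SET t.extra = s.extra
WHEN NOT MATCHED THEN
  INSERT { "docId": s.docId, "extra": s.extra };
```

Someone more fluent in N1QL than me should sanity-check whether this fits the actual enrichment logic.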