Slow responses using Java SDK KV operation

Dear all,

I have a simple Java class that fetches records from an Oracle DB and looks each one up by key in Couchbase. For the lookup in Couchbase I use this statement:
LookupInResult result = collection.lookupIn("test-bucket:" + rs.getString(1), Collections.singletonList(LookupInSpec.get("LastUpdate")));
I understand that KV operations are fastest when the key is known, but I receive messages like this while my class is running:

[LatencyMetricsAggregatedEvent][600s] Aggregated Latency Metrics: {"operations":{"kv":{"lookup_in":{"total_count":402501,"percentiles_us":{"50.0":1335.295,"90.0":1490.943,"99.0":1695.743,"99.9":2670.591,"100.0":115867.647}}}},"meta":{"emit_interval_s":600}}

Why is the retrieval so costly, and why are the 90th percentile and above so high?
What am I missing here? We are using Community Edition 6.0.0 build 1693 ‧ IPv4. It is an installation with 6 Couchbase nodes.

Regards

Hi Srinivas,

Why is the retrieval so costly, and why are the 90th percentile and above so high?

The “us” in “percentiles_us” indicates the values are in microseconds. If you divide by 1,000 to get milliseconds, are the numbers still larger than you were expecting?

Thanks,
David

@david.nault: Thanks for your quick response. Maybe a more appropriate question would be: how do I read this stat, and what does it mean? Is it the total time taken, in milliseconds, for the count of records mentioned?

Regards

Hi Srinivas,

The total_count and emit_interval_s values tell us there were 402,501 lookupIn requests in the last 600 seconds (ten minutes).
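As a rough sanity check, the same two values also give the request rate. The sketch below is plain arithmetic on the numbers from the log line, not an SDK API; the class and method names are made up for illustration.

```java
import java.util.Locale;

public class LookupRate {

    // Average number of requests per second over the reporting interval.
    public static double opsPerSecond(long totalCount, long intervalSeconds) {
        return (double) totalCount / intervalSeconds;
    }

    // Average wall-clock time between consecutive requests, in milliseconds.
    public static double millisPerOp(long totalCount, long intervalSeconds) {
        return intervalSeconds * 1000.0 / totalCount;
    }

    public static void main(String[] args) {
        long totalCount = 402_501L;   // "total_count" from the log line
        long intervalSeconds = 600L;  // "emit_interval_s" from the log line

        System.out.printf(Locale.ROOT, "%.1f ops/s%n", opsPerSecond(totalCount, intervalSeconds)); // 670.8 ops/s
        System.out.printf(Locale.ROOT, "%.2f ms between requests%n", millisPerOp(totalCount, intervalSeconds)); // 1.49 ms between requests
    }
}
```

If the lookups are issued one at a time from a single thread, an average gap of about 1.49 ms between requests is roughly what a median latency of 1.3 ms would produce. That is an inference from the numbers, not something the metrics state.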

Let’s look at just this part:

"percentiles_us": {
  "50.0": 1335.295,
  "90.0": 1490.943,
  "99.0": 1695.743,
  "99.9": 2670.591,
  "100.0": 115867.647
}

Rounding to the nearest tenth of a millisecond, we can say that in the last 10 minutes:

  • 50% of the lookupIn operations completed in about 1.3 milliseconds or less
  • 90% completed in about 1.5 milliseconds or less
  • 99% completed in about 1.7 milliseconds or less
  • 99.9% completed in about 2.7 milliseconds or less
  • 100% completed in about 115.9 milliseconds or less (this is the slowest single operation).
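The conversion itself is just a division by 1,000. Here is a minimal sketch, with the percentile values hard-coded from the log line above; the class and method names are hypothetical and not part of the SDK.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PercentilesToMillis {

    // Converts microseconds to milliseconds, rounded to one decimal place.
    public static double toMillis(double micros) {
        return Math.round(micros / 1000.0 * 10.0) / 10.0;
    }

    public static void main(String[] args) {
        // "percentiles_us" values copied from the log line (microseconds).
        Map<String, Double> percentilesUs = new LinkedHashMap<>();
        percentilesUs.put("50.0", 1335.295);
        percentilesUs.put("90.0", 1490.943);
        percentilesUs.put("99.0", 1695.743);
        percentilesUs.put("99.9", 2670.591);
        percentilesUs.put("100.0", 115867.647);

        for (Map.Entry<String, Double> e : percentilesUs.entrySet()) {
            System.out.printf(Locale.ROOT, "p%s: %.1f ms%n", e.getKey(), toMillis(e.getValue()));
        }
        // Prints:
        // p50.0: 1.3 ms
        // p90.0: 1.5 ms
        // p99.0: 1.7 ms
        // p99.9: 2.7 ms
        // p100.0: 115.9 ms
    }
}
```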

Outliers at the high end might be due to things like JVM warmup, garbage collection, network hiccups, busy server, etc.

The SDK documentation for these metrics also says:

[the latency values] are aggregated across all nodes and include potential retries, so this really shows end-to-end latency from a users/SDK perspective.

Thanks,
David
