Measure Latency Between CB and Storage layer

Hi Team,

We have a requirement to measure the latency between the CB layer and the storage layer.
Will the cbstats command help with this?

Thanks,
Debasis

By “storage layer” do you mean durability (secondary storage, disk, whatever)? For what purpose? The stats of the disk should give a good indication of what is going on. I imagine the difference in time between a mutation that has durability vs one that does not would be a good indication of the time for storage.
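A minimal sketch of that comparison, assuming you can wrap each kind of write in a Python callable. The names `time_calls` and `storage_overhead` are hypothetical helpers, and the two lambdas in the demo are stand-ins for a real durable and non-durable mutation, not Couchbase SDK calls:

```python
import statistics
import time


def time_calls(operation, count):
    """Run `operation` `count` times and return the per-call latencies in seconds."""
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def storage_overhead(durable_samples, plain_samples):
    """Median durable latency minus median non-durable latency, in seconds."""
    return statistics.median(durable_samples) - statistics.median(plain_samples)


if __name__ == "__main__":
    # Stand-in operations (assumptions): replace them with a mutation that
    # waits for persistence and the same mutation without durability.
    plain = time_calls(lambda: None, 100)
    durable = time_calls(lambda: sum(range(1000)), 100)
    print(f"estimated storage overhead: {storage_overhead(durable, plain):.6f} s")
```

The median is used rather than the mean so that a few slow outliers do not dominate the estimate.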

The cbstats documentation (cbstats | Couchbase Docs) has a section on timings in its navigation panel (timings | Couchbase Docs).

The two stats you want are:
1. kv_vb_queue_age_seconds measures how long, on average, items in the in-memory queue wait before being written to disk.

2. kv_ep_commit_time_seconds/kv_ep_commit_num gives the average response time of a write to disk.
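As a rough sketch, the ratio in item 2 can be computed from a metrics dump in Prometheus text format. The sample text, the bucket label, and the values below are made up for illustration, and `parse_metrics` and `average_commit_seconds` are hypothetical helper names. The parser sums each metric across all label sets, which is an assumption that only makes sense if both metrics are cumulative:

```python
def parse_metrics(text):
    """Sum sample values per metric name from Prometheus text exposition format."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/HELP/TYPE lines
        if "{" in line:
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            name, _, rest = line.partition(" ")
        value = float(rest.split()[0])  # ignore an optional trailing timestamp
        totals[name] = totals.get(name, 0.0) + value
    return totals


def average_commit_seconds(metrics):
    """Return commit time divided by commit count, or None if nothing was committed."""
    commits = metrics.get("kv_ep_commit_num", 0.0)
    if commits == 0:
        return None
    return metrics.get("kv_ep_commit_time_seconds", 0.0) / commits


# Made-up sample values, not real output from a cluster.
SAMPLE = """\
# illustrative sample only
kv_ep_commit_num{bucket="example"} 2000
kv_ep_commit_time_seconds{bucket="example"} 4.0
"""

print(average_commit_seconds(parse_metrics(SAMPLE)))
```

Check the metric type in the docs before trusting the ratio: dividing only yields an average if the numerator accumulates over the same commits that the denominator counts.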

During rebalance you’ll see the above stats increase, sometimes a lot, but don’t worry: you’re doing more writes to disk than normal.

source: Data Service Metrics | Couchbase Docs

@househippo Thanks for sharing the metric names. We are running a performance test with the cbc-pillowfight tool, so will these metrics help? They look useful for KV operations. Along with write latency, we also need the read latency between CB and the storage layer.

Thanks,
Debasis

@househippo, thanks! I want to ensure we’re interpreting these metrics correctly—could you confirm the following?

  • kv_vb_queue_age_seconds: Like most other stats, I assume this is cumulative since the last restart/reset. The documentation defines it as “Sum of disk queue item age in seconds.” If, since the last restart, there have been 100 total items, where 10 items spent 1 millisecond in the queue and the remaining 90 items spent 2 milliseconds, then this metric would be: ((10×1)+(90×2))/100 = 1.9 milliseconds
  • kv_ep_commit_time_seconds: The doc defines this as “Number of milliseconds of most recent commit.” Did you mean kv_ep_commit_time_total_seconds instead? Since it’s also defined as “Number of milliseconds of most recent commit,” dividing it by kv_ep_commit_num seems to make more sense. Both appear to be cumulative since the last restart/reset.
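The arithmetic in the first bullet can be checked with a short sketch. The function name `mean_queue_age` and the `(item_count, age_ms)` grouping are hypothetical, and the numbers are the ones from the example above:

```python
def mean_queue_age(groups):
    """Weighted mean age for a list of (item_count, age_ms) groups."""
    total_items = sum(count for count, _ in groups)
    total_age = sum(count * age for count, age in groups)
    return total_age / total_items


# 10 items queued for 1 ms and 90 items queued for 2 ms.
groups = [(10, 1.0), (90, 2.0)]
print(mean_queue_age(groups))  # 1.9
```

If the metric is literally a sum, as the quoted definition suggests, the raw value in this scenario would be the numerator (190 ms), and the 1.9 ms average would only appear after dividing by the item count.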

Would appreciate your confirmation. Thanks!

If you look at the docs: Data Service Metrics | Couchbase Docs.

It will tell you whether a metric is a counter or a gauge. Generally, here is how Couchbase writes to disk, via a process called ep_engine. The Fill is how many items I want to write to disk. The Drain is how many I’m actually writing. The queue_age is how long an item has been in the queue. The disk_commit_time is generally the response time of the disk for a write. Watch out: I’ve seen avg_age reach 700 seconds with a commit time of only a few milliseconds.

Q: WHY?
A: You ran out of IOPS and the cloud provider is throttling you.
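A toy model of that fill/drain behaviour, not how ep_engine is actually implemented: items arrive at a fixed fill rate, the disk drains a fixed number per second, and the age of the oldest queued item is tracked. The function name and the rates are illustrative assumptions:

```python
from collections import deque


def simulate_queue(fill_per_sec, drain_per_sec, seconds):
    """Return the age in seconds of the oldest queued item after each second."""
    queue = deque()  # holds the arrival time of every queued item
    oldest_age = []
    for now in range(seconds):
        queue.extend([now] * fill_per_sec)            # Fill: new items to persist
        for _ in range(min(drain_per_sec, len(queue))):
            queue.popleft()                           # Drain: items actually written
        oldest_age.append(now - queue[0] if queue else 0)
    return oldest_age


print(simulate_queue(100, 100, 10))  # drain keeps up with fill
print(simulate_queue(100, 50, 10))   # drain throttled to half the fill rate
```

In the throttled run each individual write can still complete quickly, yet the queue age keeps climbing because fewer writes are allowed per second than arrive. That is the pattern of a high queue age alongside a low commit time.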
