Difference between ep_queue_size, ep_diskqueue_items and disk_write_queue

I’m building my own monitoring dashboards in Grafana for my Couchbase cluster and got confused by the following metrics:

  • ep_queue_size - Number of items queued for storage.
  • ep_diskqueue_items - Total items in disk queue.
  • disk_write_queue - Disk write queue, items.

The graphs for ep_queue_size and ep_diskqueue_items are identical. The graph for disk_write_queue matches the other two most of the time, but sometimes it is twice as large.

Could someone please explain the difference between the three metrics?

A general note about the stats data: it’s very difficult to find a good explanation of the metrics in the official documentation, and some metrics seem to be undocumented altogether.

This is the canonical documentation for ep-engine stats: https://github.com/membase/ep-engine/blob/master/docs/stats.org

In terms of the specific stats you asked about:

  • ep_queue_size and ep_diskqueue_items are aliases for the same statistic: the number of items waiting to be persisted to disk. The only reason there are two names is historical. There was an effort to give stats more consistent / user-friendly names, but we didn’t want to delete the “old” stat in case some users were depending on it.
  • disk_write_queue - this is a compound stat generated by ns_server (hence why you won’t find it listed in the stats.org link above). I just checked the ns_server source and it’s the sum of ep_queue_size and ep_flusher_todo. As per stats.org, the definition of ep_flusher_todo is:

ep_flusher_todo - Number of items currently being written
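
To make the relationship concrete, here is a minimal Python sketch of that sum. It assumes the stats have already been fetched into a plain dict of name to value; the function name and the sample numbers are made up for illustration, and this is not the actual ns_server code.

```python
def disk_write_queue(stats):
    """Recompute the compound stat as ep_queue_size + ep_flusher_todo.

    `stats` is assumed to be a dict of stat name -> numeric value (or numeric
    string). Missing stats are treated as 0, which is an assumption of this sketch.
    """
    queued = int(stats.get("ep_queue_size", 0))     # items waiting to be persisted
    flushing = int(stats.get("ep_flusher_todo", 0))  # items currently being written
    return queued + flushing


# Hypothetical samples: flusher idle vs. flusher busy with as many items as are queued.
idle = {"ep_queue_size": 1200, "ep_flusher_todo": 0}
busy = {"ep_queue_size": 1200, "ep_flusher_todo": 1200}

print(disk_write_queue(idle))  # 1200, same as ep_queue_size
print(disk_write_queue(busy))  # 2400, twice ep_queue_size
```

So disk_write_queue matches the other two graphs whenever ep_flusher_todo is zero, and it only looks doubled at moments when the flusher is writing about as many items as are still queued.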

Thanks for the explanation, and for the stats.org link in particular! :)

Hi,
Were you able to monitor Couchbase in Grafana?
If so, which data source type did you use? If you have the steps, could you please share them?