When sizing, it's helpful to know the average document size. The query below will output that.
SELECT AVG(LENGTH(ENCODE_JSON(t))) AS size
FROM `travel-sample` AS t
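Note: because the query touches every document and has no filter a secondary index could serve, it presumably needs a primary index on the bucket before it will run. A minimal sketch:
CREATE PRIMARY INDEX ON `travel-sample`;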
Thanks for posting this tip, @melboulos
Thanks. Certainly useful.
But I'm curious to ask the Couchbase experts on this forum: will it get the answer by scanning every document (if yes, that can be very expensive and result in a lot of CPU utilization and IO), or will it get the answer from some metadata?
Update: tried it myself by executing the above query on a small bucket with just 200k docs.
Observations:
(1) Since it is a N1QL query, it needs an index. Since we want to query all docs, a primary index is needed, which is not ideal.
(2) It took pretty long.
Conclusion: Since it took pretty long even on a bucket with just 200k docs and a resident ratio of 100%, it leads me to believe that it actually scans every doc. Which means it is a very expensive operation to try on a production cluster.
Does somebody have another way? A faster/more efficient way to get the average size of docs in a bucket? If a precise value cannot be arrived at, a somewhat-accurate value will also do.
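One possible way to keep the cost bounded (an untested sketch, assuming a FROM-clause subquery is acceptable here and that an arbitrary sample of 10,000 documents is representative enough) is to average over a limited sample instead of the whole bucket:
SELECT AVG(LENGTH(ENCODE_JSON(d))) AS avg_size
FROM (SELECT RAW t FROM `travel-sample` AS t LIMIT 10000) AS d;
The inner query still needs an index to pick the sample, but the LIMIT should cap how many documents are actually fetched and encoded.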
Thanks
@couchbase team, do we have any other way apart from N1QL to get the average document size in a bucket, as N1QL gets expensive when we have a humongous amount of data?