Hello,
our team is currently struggling with Couchbase performance. We are evaluating Couchbase for usage in our application as the storage backend for sensor data. The sensor delivers (roughly) 4 to 5 results per second which are transmitted as JSON strings describing the measurement. Those messages all have the same format and the different event types they represent are encoded in the message as a string. The main reasons for considering Couchbase for evaluation are:
- measurement data is already transmitted as JSON formatted strings
- we will need scalability in the future as our system should be able to support mutliple of these sensors
- availability is of higher importance than consistency w.r.t. the CAP theorem
I ran some test queries on two different setups:
- Production environment which runs our live system
Hardware/System configuration:
- 4 cores
- 32 GB RAM
- Couchbase installed on 500 GB SSD (16% of disk occupied)
- CentOS Linux release 7.5.1804 (kernel 3.10.0-862.9.1.el7.x86_64)
Couchbase Community Edition 5.0.1 (build 5003) in docker container on 1 Node
Memory quotas:
- Data Service: 12500
- Index Service: 4096
- Search Service: 512
Index Storage Mode: Standard Global Secondary
4 Buckets:
- A (6 GB memory quota, bucket type: Couchbase), currently holds ~ 28.000.000 docs of three types, same format
- B (2 GB memory quota, bucket type: Couchbase), currently holds ~ 1.000 docs
- C (500 MB memory quota, bucket type: Couchbase), currently holds 1 doc (for testing purposes)
- D (2 GB memory quota, bucket type: Couchbase), currently holds ~ 30 docs (rather new)
Documents in A have the following format:
{
“_class”: ,
“latitude”: ,
“longitude”: ,
“positionMeasure1”: ,
“positionMeasure2”: ,
“eventType”: ,
“time”: , “2018-08-09T06:42:47.219Z”,
“originalMessage”: <map<string, any>>
}
Each of this messages represents a single measurement of a sensor which contains two different measures for position (represented as doubles), GPS coordinates, the type of the measured event (in our application eventType is an enum representing this) and a time stamp in the format “yyyy-mm-ddTHH:MM:SS.sssZ”. The originalMessage is a map containing additional information on the event.
Currently, the eventType has three realizations, let’s call them eventType1, eventType2 and eventType3. The vast majority of the documents in A are of type eventType1, followed by eventType2 and a clear minority of eventType3.
The bucket A has been indexed as follows:
CREATE PRIMARY INDEX aIndex
ON A
CREATE INDEX time
ON A
(time
)
CREATE INDEX eventTypeAndTime
ON A
(eventType
,time
)
- Test environment which runs isolated
Hardware/System configuration:
- 12 cores
- 64 GB RAM
- Couchbase installed on SSD
- Debian 4.9.65-3+deb9u2
Couchbase Enterprise 5.5.0 running in a docker container on this single node
Memory quotas (per default setup):
- Data Service: 37900
- Index Service: 512
- Search Service: 512
Index Storage Mode: Standard Global Secondary
Only one bucket A (others omitted as not relevant)
Same indexes created:
CREATE PRIMARY INDEX aIndex
ON A
CREATE INDEX time
ON A
(time
)
CREATE INDEX eventTypeAndTime
ON A
(eventType
,time
)
On these two systems I ran the following queries from the web UI:
- Query 1: SELECT COUNT(*) FROM
A
WHERE eventType == “eventType1” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z” - Query 2: SELECT COUNT(*) FROM
A
WHERE eventType == “eventType2” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z” - Query 3: SELECT * FROM
A
WHERE eventType == “eventType1” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z” - Query 4: SELECT * FROM
A
WHERE eventType == “eventType2” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z” - Query 5: SELECT time FROM
A
WHERE eventType == “eventType1” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z” - Query 6: SELECT time FROM
A
WHERE eventType == “eventType2” AND time BETWEEN “2018-07-11T00:00:00.000Z” AND “2018-07-25T00:00:00.000Z”
On the production environment I got the following results:
- uses IndexCountScan2 on the eventTypeAndTime index as expected, runs approx. 9.5s.
- uses IndexCountScan2 on the eventTypeAndTime index as expected, runs approx. 640ms.
- uses IndexCountScan2 on the eventTypeAndTime index as expected, runs either into timeout or web UI can’t be loaded (> 2.000.000 results).
- uses IndexCountScan2 on the eventTypeAndTime index as expected, runs approx. 1m (> 150.000 results).
- uses IndexCountScan2 (covering) on the eventTypeAndTime index as expected, runs approx. 48s.
- uses IndexCountScan2 (covering) on the eventTypeAndTime index as expected, runs approx. 1.4s.
The test environment gave the following numbers:
- 1.3s (IndexScan3: 1.3s, Project: 0)
- 120ms (IndexScan3: 120ms, Project: 0)
- Can’t be evaluated this way, browser says web UI can’t be loaded.
- 7.5s (IndexScan3: 270ms, Fetch: 2.9s, Filter for eventType: 1.1s, Project: 50ms)
- 10s (IndexScan3 (covering): 4.3s, Filter for eventType: 2.5s, Project: 2.4s)
- 640ms (IndexScan3 (covering): 270ms, Filter for eventType: 160ms, Project: 150ms)
On of the main use cases we have in mind would be to load data from the bucket A for eventType2 for, e.g., two weeks as demonstrated by the queries above to do some statistical analysis on the data. However, in our opinion, the performance is not good enough in these cases. In particular when considering those queries that can not use a covering index, the timings we observe are not acceptable. We have gone over our setup and also the DB configuration several times but have not found a way of substantially improving our situation. We are aware that Couchbase is designed to run on multiple nodes to which the services can be distributed. However we have currently set up one node only as we are monitoring only one sensor and thought this should do the job. So my question now is: Is there anything obvious we have made wrong in our setup? Should the data be stored in a different way? Maybe the indexing is not done properly? Is the hardware properly sized (I have gone through the basic sizing instructions …)
Any help is highly appreciated and I’ll gladly supply additional information/data should that be needed.