We are currently using Couchbase Community 4.0 in a small, 3 node cluster. We have not deployed dedicated Index servers; the load on the cluster is modest which doesn’t seem to justify dedicated resources.
However, I am curious to know what type of discretion should be used in the number of indexes created, other than pure hardware limitations. The reason I ask is that a couple of years ago, when we first started exploring the use of Couchbase (in the pre-index, pre-N1QL days), we created a large number of Views (probably 20 design documents with approximately 5 views in each). We started running into performance problems and later found out that we were well beyond the number of Views that Couchbase would reliably handle.
To avoid putting ourselves in that situation again, I’d like to know the constraints (beyond hardware) or “best practices” in terms of the number of indexes. We’re not looking to get too crazy, but are considering creating about 30-40 different indexes.
Hello,
Here is my feedback on your question.
GSI (Global Secondary Indexes), available starting with 4.0, provide better performance than Views, for several reasons:
They are global, not local to vBuckets. This means key locality for range queries is much better: the index holds all of its entries in order, so a range scan is very fast.
The indexer keeps as much of the index as it can in RAM (standard GSI still persists it to disk), whereas Views rely on on-disk structures.
4.5 provides even more performance with MOI (Memory Optimized Indexes). You need to switch to MOI at the cluster level before you start creating indexes.
As for the number of indexes, there is no maximum number of indexes per cluster or bucket that I know of.
You always need to weigh the trade-off between the space your indexes consume and the query efficiency they give you. You also need to think about index maintenance, although in Couchbase index maintenance is asynchronous, i.e. it does not affect the performance of writes of new documents.
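To keep an eye on how many indexes each bucket has accumulated, one option is to query the `system:indexes` catalog through N1QL and group the rows by bucket. The following is only a rough sketch: the helper name `indexes_per_bucket` and the sample response (bucket and index names included) are made up for illustration, and fetching the real response from your cluster is left out.

```python
import json
from collections import defaultdict

# N1QL statement against the system:indexes catalog.
LIST_INDEXES = "SELECT name, keyspace_id, state FROM system:indexes"


def indexes_per_bucket(response):
    """Group index names by bucket from a decoded query-service response."""
    grouped = defaultdict(list)
    for row in response.get("results", []):
        grouped[row["keyspace_id"]].append(row["name"])
    return {bucket: sorted(names) for bucket, names in grouped.items()}


# Hypothetical response body for LIST_INDEXES; the names are invented.
sample = json.loads("""
{
  "results": [
    {"name": "idx_order_status", "keyspace_id": "orders", "state": "online"},
    {"name": "idx_order_date", "keyspace_id": "orders", "state": "online"},
    {"name": "idx_user_email", "keyspace_id": "users", "state": "online"}
  ]
}
""")

for bucket, names in sorted(indexes_per_bucket(sample).items()):
    print(bucket, len(names), names)
```

Running the statement periodically and watching the per-bucket counts is a cheap way to notice index sprawl before it turns into the kind of problem described with Views.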
To avoid possible contention between the Data and Index services, it is worth adding new nodes that run the Index service isolated from the Data service. My recommendation would be to add two more nodes to your cluster: one for indexes and one for queries.
I'm sure others will provide more feedback.
I’ve been looking into this same thing recently as well. The biggest limitation that I’ve found is that, as the number of indexes increases, the time between when you save a document and when that document can be found in a query increases.
Example - if you save a document and then immediately run a query where the result set should contain the document you just saved, the probability that your new document will appear in the result set goes down as the number of indexes increases.
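If a particular query really must see the document you just saved, the N1QL REST API accepts a `scan_consistency` parameter. Setting it to `request_plus` makes the query wait until the index has caught up with mutations made before the request, at the cost of higher latency. Below is a minimal sketch of building such a request. It assumes a query node on `localhost` with the default port 8093, a made-up bucket called `orders`, and no authentication; `build_query` is a hypothetical helper, not part of any SDK.

```python
import urllib.parse
import urllib.request

# Assumed endpoint: the query service on a local node, default port.
QUERY_URL = "http://localhost:8093/query/service"


def build_query(statement, wait_for_index=True):
    """Build a POST request for the N1QL query service.

    With wait_for_index=True the request asks for request_plus consistency;
    otherwise the service default (not_bounded) applies.
    """
    params = {"statement": statement}
    if wait_for_index:
        params["scan_consistency"] = "request_plus"
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(QUERY_URL, data=body, method="POST")


# "orders" and its "status" field are hypothetical.
req = build_query("SELECT META(o).id FROM orders o WHERE o.status = 'new'")
print(req.data.decode("utf-8"))

# Sending it requires a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

It is probably best to reserve `request_plus` for the few read-your-own-write queries that need it, since the wait tends to grow along with the indexing lag described above.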