We are using Couchbase (via the Java SDK) to implement a highly available session store.
After some experimentation, we increasingly doubt that Couchbase can cover this use case well.
The main obstacle seems to be the failover procedure, with its hard lower limit of 30 seconds before automatic failover kicks in. For a highly available session store, a 30-second outage when the master node fails is unacceptable. Previous discussion: Node failure blocks Java client
Are we correct in assuming that a clustered, HA session store is not a use case that Couchbase covers well?
I’ve moved this into the Couchbase Server category since I guess your question is more about our auto-failover capabilities than the Java SDK.
High-performance caching is one of the things Couchbase does very well, since that is where it came from. When we talk about “high availability”, the discussion quickly turns to consistency vs. availability, and Couchbase, with the way it partitions documents, clearly falls on the consistency side.
That doesn’t mean Couchbase isn’t reliable, of course. Note that you can manually fail over nodes at any time; there doesn’t have to be a 30s window, and you can automate that via our APIs. The reason auto failover has a 30-second minimum is that many heuristics are needed to make sure that a node has actually failed and is not just slow to respond. Keep in mind that failover is not “reversible”: you have to rebalance the node back in (although with delta recovery that is much faster these days than it used to be).
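As a rough illustration of automating a manual failover, here is a minimal sketch that posts to the cluster manager's REST endpoint `/controller/failOver` with an `otpNode` form parameter, using the JDK 11+ `java.net.http` client. The class name, method names, host, and credentials are all made up for the example, and you should check the endpoint and its parameters against the REST documentation for your server version. Deciding that a node is really dead is the hard part, and it is left to your own monitoring here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Couchbase Java SDK.
final class ManualFailoverSketch {

    // Form body for the failover call; otpNode usually looks like "ns_1@<host>".
    static String formBody(String otpNode) {
        return "otpNode=" + URLEncoder.encode(otpNode, StandardCharsets.UTF_8);
    }

    // Builds the POST request against the cluster manager (assumed to listen on port 8091).
    static HttpRequest buildRequest(String baseUrl, String user, String password, String otpNode) {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/controller/failOver"))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(otpNode)))
                .build();
    }

    // Sends the request and returns the HTTP status code.
    // Call this only after your own health checks have decided the node is down.
    static int failOver(String baseUrl, String user, String password, String otpNode)
            throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(baseUrl, user, password, otpNode),
                      HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

A call might look like `ManualFailoverSketch.failOver("http://cb1.example.com:8091", "Administrator", "password", "ns_1@10.0.0.5")`, where every value is a placeholder. Since failover is not reversible, a script like this should be conservative about when it fires.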
Finally, while I’m not a product manager, I can tell you that for the next major release we upped our game on auto failover timings and were able to significantly reduce that minimum time. Be sure to watch for announcements on that front.