Application loses communication with Couchbase cluster

Using Couchbase Community 4.0.0-4051 with a cluster of 3 server nodes.

During a hardware upgrade of one server node in a 3-node Couchbase cluster, our Java application loses communication with the cluster.
The process we are following is:

  • Perform a graceful failover on the selected server node
  • Power down the server, upgrade the RAM, and power the server back on
  • Select delta recovery and rebalance the cluster

After the rebalance of the server node, the Java application loses communication with the Couchbase cluster. The application connects to the cluster using the command-line parameter ‘couchbase.nodes.hostNames=servernode1,servernode2,servernode3’.
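
To make the connection setup concrete, here is a minimal sketch, assuming the application simply splits this property on commas to build its list of seed nodes. The `SeedNodes` class and `parse` helper are hypothetical names used only for illustration, and the real application may read the property differently. With the Couchbase Java SDK 2.x, a list like this could be passed to `CouchbaseCluster.create(nodes)`; that call is left as a comment so the sketch runs without the SDK.

```java
import java.util.ArrayList;
import java.util.List;

class SeedNodes {
    // Hypothetical helper: turns "host1,host2,host3" into a list of host names,
    // skipping blanks and surrounding whitespace.
    static List<String> parse(String hostNames) {
        List<String> nodes = new ArrayList<>();
        if (hostNames == null) {
            return nodes;
        }
        for (String host : hostNames.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                nodes.add(trimmed);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Property name taken from the post; the default value is only for the demo.
        String raw = System.getProperty(
                "couchbase.nodes.hostNames",
                "servernode1,servernode2,servernode3");
        List<String> nodes = parse(raw);
        System.out.println("Seed nodes: " + nodes);
        // Assumption: with the Java SDK 2.x on the classpath, the list would be used like
        // Cluster cluster = CouchbaseCluster.create(nodes);
    }
}
```

Listing all three nodes means the client has more than one host to try when it first connects, but it does not by itself change how open connections behave when a node is failed over.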

Any ideas on configuration changes to help fix this issue so the application is not affected by a server node being down or rebalanced?

Typically, when you’re taking a node out of the cluster you’ll click “Remove” and then rebalance, rather than do a failover. At least originally, failover was not intended for these kinds of maintenance operations. While it’s “graceful” with respect to replication within the cluster, it’s not graceful to the connections from the client library. The docs cover this under remove.

If you have the hardware available, the best approach would be to add a node and remove a node in a single rebalance operation. Then you wouldn’t see any impact on the running application as long as you have all of your services replicated within the cluster.

I’ll note that the docs aren’t very helpful with respect to when to use failover and when to use remove. Maybe @anil can help get a documentation improvement going.

Thank you for your response, ingenthr.