Starting with an up-and-running 2-node cluster of CB 7.1.3 EE on CentOS 7:
[rgr@cb7-a ~]# /opt/couchbase/bin/couchbase-cli server-list -c 127.0.0.1 -u Administrator -p admin123
ns_1@192.168.99.151 192.168.99.151:8091 healthy active
ns_1@cb7-a.infra.somewhere.com cb7-a.infra.somewhere.com:8091 healthy active
[rgr@cb7-a ~]# /opt/couchbase/bin/couchbase-cli bucket-list -c 127.0.0.1 -u Administrator -p admin123
conv_session_info
bucketType: membase
numReplicas: 1
ramQuota: 536870912
ramUsed: 331122208
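For reference, the same node state can also be checked via the REST API; something like this (assuming jq is available) prints health and membership per node:

curl -s -u Administrator:admin123 http://127.0.0.1:8091/pools/default | \
  jq '.nodes[] | {hostname, status, clusterMembership}'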
After simulating a failed node/server by shutting it down, its state changes to unhealthy, as expected:
(no auto-failover or similar is in place for this test)
[rgr@cb7-a ~]# /opt/couchbase/bin/couchbase-cli server-list -c 127.0.0.1 -u Administrator -p admin123
ns_1@192.168.99.151 192.168.99.151:8091 unhealthy active
ns_1@cb7-a.infra.somewhere.com cb7-a.infra.somewhere.com:8091 healthy active
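Stopping just the Couchbase service on that node (instead of shutting down the whole server) should produce the same unhealthy state, e.g.:

sudo systemctl stop couchbase-server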
Now I want to force a failover, so that the replica items on the remaining server are activated:
(the curl command is trimmed a bit, as posting it in full is not allowed here)
curl /controller/failOver -d 'otpNode=ns_1@192.168.99.151'
HTTP/1.1 504 Gateway Time-out
Bummer, it fails (as the node can’t be reached), so I try harder:
curl /controller/failOver -d 'otpNode=ns_1@192.168.99.151' -d allowUnsafe=true
HTTP/1.1 200 OK
That worked, but now the 2nd node is gone from the cluster:
[rgr@cb7-a ~]# /opt/couchbase/bin/couchbase-cli server-list -c 127.0.0.1 -u Administrator -p admin123
ns_1@cb7-a.infra.somewhere.com cb7-a.infra.somewhere.com:8091 healthy active
whereas with CB up to 6.6 it would remain in the cluster as ‘unhealthy inactiveFailed’.
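For comparison, a hard failover can also be issued via couchbase-cli, though I’m not aware of a CLI counterpart of allowUnsafe (hence the raw REST call above):

/opt/couchbase/bin/couchbase-cli failover -c 127.0.0.1 -u Administrator -p admin123 \
  --server-failover 192.168.99.151:8091 --hard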
Since the node has been removed from the cluster, it can’t be added back via recovery after it has been started again:
[rgr@cb7-a ~]# /opt/couchbase/bin/couchbase-cli recovery -c 127.0.0.1 -u Administrator -p admin123 --server-recovery 192.168.99.151
ERROR: Server not found 192.168.99.151:8091
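If the 7.x behaviour is indeed to eject the node, then presumably the only way back is to treat it as a brand-new node: wipe/re-initialize it first (it still carries the old cluster config), then re-add it and rebalance, roughly:

/opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1 -u Administrator -p admin123 \
  --server-add 192.168.99.151:8091 --server-add-username Administrator \
  --server-add-password admin123 --services data
/opt/couchbase/bin/couchbase-cli rebalance -c 127.0.0.1 -u Administrator -p admin123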