Hi,
I have two Couchbase Server nodes in one cluster, and each node also runs Sync Gateway. While testing Sync Gateway and Couchbase Server failover, I found that Sync Gateway kept trying to reach the offline Couchbase Server node for data until I restarted Sync Gateway on the node that was still online.
Steps I performed:
- Two nodes, A and B, were online in one cluster, with Sync Gateway installed on each node.
- I shut down node B physically, without using the Couchbase failover feature.
- Then I tried to access the database through Sync Gateway at example.com:4984/test, and Sync Gateway on node A was trying to reach the Couchbase data on node B.
- There were no issues when just opening the Sync Gateway root at example.com:4984.
- I even failed over node B and removed it from the cluster, but the same Sync Gateway was still trying to get data from node B.
- After that, I restarted Sync Gateway on node A, and it then started connecting to the bucket on node A.
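The checks in the steps above could be repeated with a small script along these lines. This is only a sketch: the helper names `probe`, `describe`, and `run_checks` are made up for illustration, the database name `test` is taken from the URL above, and the long default timeout is an assumption based on the roughly 68-second response times in the logs.

```python
import urllib.error
import urllib.request


def probe(url, timeout=120.0):
    """Return the HTTP status code for url, or an error string if no HTTP response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The gateway answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError) as err:
        # Nothing answered at all (refused, timed out, DNS failure, ...).
        return f"no response: {err}"


def describe(result):
    """Turn a probe() result into a short human-readable label (hypothetical wording)."""
    if result == 200:
        return "ok"
    if isinstance(result, int) and result >= 500:
        return "server error (the gateway may be pointing at an unreachable Couchbase node)"
    if isinstance(result, int):
        return f"unexpected status {result}"
    return "gateway unreachable"


def run_checks(base_url, db="test"):
    """Probe the gateway root and one database path, returning a label for each."""
    paths = ["/", f"/{db}/"]
    return {path: describe(probe(base_url.rstrip("/") + path)) for path in paths}
```

Calling `run_checks("http://example.com:4984")` after shutting down node B would be expected to show the root path as fine and the database path as a server error, if the behaviour matches what is described above.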
As it stands, this does not seem to be a redundant setup, because I had to restart Sync Gateway on the online server, which takes mobile users offline.
Is this by design, or am I doing something wrong here?
Please help, as I want the app to keep working even if one Couchbase Server node and one Sync Gateway go offline.
Sync Gateway logs:
15:13:23.679151 2015-08-07T15:13:23.679+10:00 WARNING: Couldn’t interpret error type *net.OpError, value dial tcp 10.10.10.20:11210: connection timed out – base.ErrorAsHTTPStatus() at error.go:63
15:13:23.679252 2015-08-07T15:13:23.679+10:00 HTTP: #047: --> 500 Internal error: dial tcp 10.10.10.20:11210: connection timed out (68139.1 ms)
15:14:24.272075 2015-08-07T15:14:24.272+10:00 WARNING: Couldn’t interpret error type *net.OpError, value read tcp 10.10.10.20:11210: connection timed out – base.ErrorAsHTTPStatus() at error.go:63
15:14:24.272211 2015-08-07T15:14:24.272+10:00 HTTP: #006: --> 500 Internal error: read tcp 10.10.10.20:11210: connection timed out (1036224.3 ms)
15:14:30.861035 2015-08-07T15:14:30.861+10:00 HTTP: #048: GET /kodoprod/
15:14:30.861111 2015-08-07T15:14:30.861+10:00 WARNING: Couldn’t interpret error type *net.OpError, value dial tcp 10.10.10.20:11210: connection timed out – base.ErrorAsHTTPStatus() at error.go:63
15:14:30.861288 2015-08-07T15:14:30.861+10:00 HTTP: #048: --> 500 Internal error: dial tcp 10.10.10.20:11210: connection timed out (68137.1 ms)
15:14:30.897841 2015-08-07T15:14:30.897+10:00 HTTP: #050: GET /favicon.ico/
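To see at a glance which backend address the gateway keeps timing out on, the log lines above could be summarised with something like the following. It is a rough sketch, not part of Sync Gateway: the function name `timed_out_endpoints` and the regular expression are assumptions based only on the line format shown here. Only WARNING lines are counted, so the matching HTTP 500 lines are not counted twice.

```python
import re
from collections import Counter

# Matches e.g. "dial tcp 10.10.10.20:11210: connection timed out"
TIMEOUT_RE = re.compile(
    r"(?:dial|read) tcp (\d{1,3}(?:\.\d{1,3}){3}):(\d+): connection timed out"
)


def timed_out_endpoints(lines):
    """Count connection timeouts per (ip, port) across WARNING log lines."""
    counts = Counter()
    for line in lines:
        if "WARNING" not in line:
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

For the excerpt above, every timeout points at 10.10.10.20 on port 11210, which is the Couchbase data port on the node that was shut down.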
Thanks,
Karunakar