Curious about experiences with view queries using the different stale settings ("ok", "update_after", "false") during failover and, separately, rebalance, on Couchbase 4.0 Community. In general, should it be assumed that anything other than "stale: ok" has the potential for significant application impact or failure?
For example, imagine losing a node in a 3-node cluster and then failing over. Replicas and "view index replicas" are enabled, yet most of the views start to re-index, and during this period "stale: false" queries time out over and over until the respective views finish their post-failover re-index. Things can stay this way for tens of minutes, and even "stale: ok" queries occasionally time out. This seems to indicate that any system built on "stale: false" is suspect and subject to failure if it's feeding anything of consequence. Back-to-back timeouts are not going to get you a response.
Does this match others' experiences? 1) that a failover goes into a protracted re-index despite replica indexes being maintained (which seems counter-intuitive)? and 2) that "stale: ok" with retry is the only safe way to make it through a failover and subsequent rebalance if you need to avoid timeouts and get reasonable query response times during the recovery (say 10s or less per query)?
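For context, the fallback pattern I'm describing in (2) looks roughly like the sketch below. This is not the Couchbase SDK API; `query_view`, its parameters, and all the timeout/backoff values are illustrative stand-ins for whatever client you're using:

```python
import time

def query_with_retry(query_view, max_attempts=5, per_try_timeout=10.0,
                     initial_delay=1.0, backoff=2.0):
    """Retry a view query, falling back from stale=false to stale=ok.

    query_view(stale, timeout) is a hypothetical callable standing in for
    an SDK view query; it is assumed to return a list of rows or raise
    TimeoutError. All names and default values here are illustrative.
    """
    stale = False  # first try the consistent ("stale: false") query
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return query_view(stale=stale, timeout=per_try_timeout)
        except TimeoutError:
            # After the first timeout (e.g. during a post-failover
            # re-index), accept possibly-stale results so the
            # application keeps answering during recovery.
            stale = "ok"
            if attempt == max_attempts:
                raise
            time.sleep(delay)
            delay *= backoff

# Example: a stub that times out once (as during a re-index), then succeeds.
calls = {"n": 0}
def stub(stale, timeout):
    calls["n"] += 1
    if calls["n"] == 1 and stale is False:
        raise TimeoutError("view still re-indexing")
    return ["row1", "row2"]
```

The point of the question is whether this kind of degrade-to-stale fallback is effectively mandatory during failover/rebalance, or whether "stale: false" can be made reliable.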