Let’s say we have 1 cluster with 4 nodes: 2 nodes in the US and 2 nodes in Europe.
We have replication for the bucket set to 2 (so 3 nodes will hold the same document after replication).
My question is…
When a read request comes into the cluster, will it:
read from the closest or most performant node? (Are the reads sent in parallel to multiple nodes, and the first result wins?) (I am currently assuming a get request by primary key.)
Let’s assume nodes 1, 2, and 3 have the document in question, and the read request is received on node 4.
Now let’s assume the latency from node 4 to nodes 1 and 2 is, say, 1 second, but between nodes 3 and 4 it’s only 20 ms. Will the document in question always come from node 3, or is this entirely random?
As a side question:
If replication is set to 2 (meaning the document should be replicated to two additional nodes besides the master), and we do a set() with replicateTo.TWO, what will happen if 2 of the 4 nodes are down? Will this fail, or will it return success? I’m assuming it will fail. (Is there a way to have it succeed even if it can only replicate to 1 node, iff the 2nd replica it needs is simply not available?)
I’m hoping to be able to test this out in the next few days.
When the request comes to the cluster, it will go to one node only: the node where the active document is stored.
Replicas should not be accessed by the client. They are there in case of a failure. If a node disappears, the replicas on the other nodes will be rebalanced.
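To make the routing above a bit more concrete, here is a minimal sketch of how a client can pick the single node that holds the active copy of a key. It is a simplified illustration, not the SDK's actual code: the class name, the tiny vBucket count, and the plain CRC32-modulo hash are assumptions chosen for readability (a real cluster uses far more vBuckets, and the client's exact hash may differ).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Simplified illustration of key-based routing; not the SDK's actual code.
final class VBucketRoutingSketch {
    // Hypothetical, small vBucket count so the map is easy to read.
    static final int NUM_VBUCKETS = 8;

    // Map a document key to a vBucket by hashing the key (assumed hash scheme).
    static int vBucketFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % NUM_VBUCKETS);
    }

    // activeMap[v] is the index of the node holding the active copy of vBucket v.
    static int activeNodeFor(String key, int[] activeMap) {
        return activeMap[vBucketFor(key)];
    }

    public static void main(String[] args) {
        // Four nodes (0..3), each active for two of the eight vBuckets.
        int[] activeMap = {0, 1, 2, 3, 0, 1, 2, 3};
        System.out.println("foo -> node " + activeNodeFor("foo", activeMap));
    }
}
```

The point of the sketch is that the target node depends only on the key and the map, not on latency or on which node is closest, so the same key always resolves to the same active node until the map changes.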
What you want to do here is have one cluster in Europe and one in the US. Then use XDCR between those clusters to keep them in sync.
As for the side question, if you have two replicas and four nodes, it means that three nodes contain the document. So if two of them fail you should be OK, assuming that the failing nodes were the replicas or that the rebalance is over.
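For the replicateTo.TWO part of the side question, a toy model may help with the reasoning. It assumes a write can only be confirmed by replica nodes that are currently reachable; the class and method names are hypothetical and not part of the Couchbase SDK, and the "best effort" fallback is just one possible application-level policy, not something the library is known to do for you.

```java
import java.util.List;

// Toy model of a "replicate to N" requirement; hypothetical names, not SDK code.
final class ReplicateToSketch {
    // True when enough replica nodes are reachable to satisfy the requirement.
    static boolean canSatisfy(int requiredReplicas, List<Boolean> replicaNodeUp) {
        long up = replicaNodeUp.stream().filter(Boolean::booleanValue).count();
        return up >= requiredReplicas;
    }

    // Hypothetical fallback: lower the requirement until it can be met (possibly to zero).
    static int bestEffortReplicas(int requestedReplicas, List<Boolean> replicaNodeUp) {
        int required = requestedReplicas;
        while (required > 0 && !canSatisfy(required, replicaNodeUp)) {
            required--;
        }
        return required;
    }
}
```

Under this model, a strict requirement of two replicas cannot be met when one of the two replica nodes for the document is down, while an application that retries with the value from bestEffortReplicas would accept a single replica.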
Thanks for the reply. The reason we are trying not to use XDCR is that it’s async, and we are trying to force the system to keep a replica copy in each region (US/EU). This would ‘guarantee’ that reads in each region get the most recent value, with the added benefit of ‘fast reads’ since they would hit a local replica (at least that is the goal).
After a bit more poking around, it does look like it’s possible to read from a replica:
asyncReplicaGet("foo");
This could of course still get routed to a replica in a remote region, which is not ideal. I also see that ConnectionFactory has a parameter shouldOptimize, which may in theory optimize the routes in the cluster based on the fastest route; it’s not entirely clear what this really does.
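For what it’s worth, the "first result wins" idea from the original question could be built at the application level, roughly as sketched here. This is only a simulation under stated assumptions: simulatedRead is a hypothetical stand-in for a real network read (it just sleeps), and nothing here claims the client does this by default. Note also that the fastest copy may be a replica that has not yet received the latest write.

```java
import java.util.concurrent.CompletableFuture;

// Simulation of racing two reads; hypothetical names, no real cluster involved.
final class FirstReadWinsSketch {
    // Stand-in for a network read: returns value after latencyMillis.
    static CompletableFuture<String> simulatedRead(String value, long latencyMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(latencyMillis);
                result.complete(value);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                result.completeExceptionally(e);
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive for the slower read
        worker.start();
        return result;
    }

    // Completes with whichever of the two reads finishes first.
    static CompletableFuture<String> firstOf(CompletableFuture<String> a,
                                             CompletableFuture<String> b) {
        return a.applyToEither(b, v -> v);
    }

    public static void main(String[] args) {
        String winner = firstOf(
                simulatedRead("from-remote-node", 1000),
                simulatedRead("from-local-node", 20)).join();
        System.out.println(winner);
    }
}
```

The trade-off is extra load (every read is issued twice) and the possibility of returning a stale value, so this only makes sense where slightly old data is acceptable.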