I have two clusters: one with three nodes, containing one bucket (with 2 replicas), and another with just one node. I have set up XDCR from the 3-node cluster to the 1-node cluster. The number of documents in the 3-node cluster is 1.8m (+ 3.8m replicas), but the number of documents replicated to the 1-node cluster is only 600k, exactly 1/3 of all documents. In the web console -> XDCR replication, there are a lot of messages like:
Error replicating vbucket 988. Please see logs for details.
The xdcr_error log has two types of issues:
- out of 25 docs, succ to send 24 docs, fail to send others (by error type, enoent: 1, not-my-vb: 0, einval: 0, timeout: 0 other errors: 0
  (This is a 2.2.0 issue which, according to Jira, is fixed in 2.5.0. However, it should only cause single documents to go missing, not 66.6% of them. Sadly, this fix is not available for Community Edition.)
- WARNING! Database delete purger current sequence is ahead of replicator starting sequence …that means one or more deletion is lost (vb: 62, purger seq: 2051, repl start seq: 0).
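To check whether the first error type alone could explain the gap, the "out of N docs" summary lines can be tallied. This is only a rough sketch: it assumes every summary line in xdcr_error has exactly the format quoted above, and the function name `tally_xdcr_errors` is made up for illustration, not part of any Couchbase tooling.

```python
import re
from collections import Counter

# Matches the per-batch summary, e.g. "out of 25 docs, succ to send 24 docs".
# Assumption: the log format is exactly as quoted in this post.
SUMMARY_RE = re.compile(r"out of (\d+) docs, succ to send (\d+) docs")
# Matches each "name: count" error counter on the same line.
ERROR_RE = re.compile(r"(enoent|not-my-vb|einval|timeout|other errors): (\d+)")


def tally_xdcr_errors(lines):
    """Sum attempted/sent document counts and per-type error counts."""
    totals = Counter()
    for line in lines:
        match = SUMMARY_RE.search(line)
        if not match:
            continue  # not a batch summary line, skip it
        totals["attempted"] += int(match.group(1))
        totals["sent"] += int(match.group(2))
        for name, count in ERROR_RE.findall(line):
            totals[name] += int(count)
    return totals


# The sample line is copied from the log excerpt above.
sample = [
    "out of 25 docs, succ to send 24 docs, fail to send others "
    "(by error type, enoent: 1, not-my-vb: 0, einval: 0, timeout: 0 other errors: 0"
]
totals = tally_xdcr_errors(sample)
failed = totals["attempted"] - totals["sent"]
print("attempted:", totals["attempted"], "sent:", totals["sent"], "failed:", failed)
```

In practice the `sample` list would be replaced by the lines of the real log file. If the total of failed sends is far below the roughly 1.2m missing documents, this error is probably not the main cause.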
I previously used XDCR from a 1-node cluster to a 1-node cluster (without replicas within the datacenters), and it worked fine. The problems started when I added 2 nodes to one of the clusters and re-created the buckets and XDCR (with num_replicas=2 in the 3-node cluster).
Any suggestions?