We have set up Couchbase cross data center replication (XDCR) between two data centers, with bi-directional (two-way) replication enabled.
We are observing replication issues from one data center.
The XDCR logs show errors indicating that replication from one particular node is not happening. Around 340k mutations (operations to be replicated) have been pending for 5 days.
The error we see in the Couchbase logs is:
" The bootstrap node listed in the reference: 18d54377abe5218f4a0286506b5b473b is not valid as it has been moved to a different cluster than the original target cluster."
Here are our current Couchbase setup details:
· Community Edition 6.0.0 build 1693, IPv4
· 5-node cluster (8 CPU, 16 GB RAM, CentOS 7)
· 10 buckets
· 8 ephemeral buckets
· 2 Couchbase buckets
· RAM quota of 800 MB per bucket
· Enabled --services data,index,query on one node
· The same setup exists in both data centers, and replication is enabled between the two clusters
Hi yessara, were any nodes added or removed from either cluster after the bi-directional replication was set up? Especially the nodes used as XDCR Remote Cluster hosts?
Let us call the two clusters C1 and C2.
Yes, we stopped all Couchbase services on all the nodes in C1 and re-created the cluster (C1). After that, the replication from C2 -> C1 stopped automatically. We enabled the replication again from C2 -> C1 and also enabled replication from C1 -> C2. One of the nodes in cluster C2 is having this issue with replication.
We did not add or remove any nodes explicitly; only restarts were performed in both C1 and C2.
On C1, please delete the C1->C2 replication, delete the remote cluster reference, and create both again. The replication should catch up.
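Likely background (an inference, not something from your logs): an XDCR remote cluster reference records the UUID of the cluster it points to, and re-creating a cluster generates a new cluster UUID, so XDCR refuses to bootstrap from a node whose cluster UUID no longer matches the one recorded in the reference. A minimal sketch for comparing the two over the REST API, with hypothetical hostnames and placeholder credentials:

```python
import requests

AUTH = ("Administrator", "password")  # placeholder credentials

# Cluster UUID as reported by the target cluster itself.
c2_uuid = requests.get("http://c2-node1:8091/pools", auth=AUTH).json()["uuid"]

# UUIDs recorded in C1's remote cluster references.
refs = requests.get("http://c1-node1:8091/pools/default/remoteClusters",
                    auth=AUTH).json()
for ref in refs:
    print(ref["name"], "recorded uuid:", ref["uuid"],
          "matches live C2 uuid:", ref["uuid"] == c2_uuid)
```

If the recorded UUID differs from the live one, deleting and re-creating the reference refreshes it.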
Thanks Pavithra,
Do you mean delete C2 -> C1 and add it again? Because one of the nodes on C2 reports the bootstrap-node-moved issue.
Also, how do we do the delete and re-create seamlessly? We have data and ongoing traffic.
No, C1->C2. The bootstrapping error is on the target cluster.
yessara,
Yes, only delete the C1->C2 replication and the C2 remote cluster reference from C1 and add them back again. There is no need to modify anything on C2. This will be seamless. Once the C1->C2 replication is re-created, it will catch up.
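For reference, the whole procedure can be scripted against the REST API from any C1 node. A rough sketch, with placeholder hostnames, bucket names, and credentials; the replication id can be read from /pools/default/tasks:

```python
import requests
from urllib.parse import quote

C1 = "http://c1-node1:8091"           # placeholder: any node in C1
AUTH = ("Administrator", "password")  # placeholder credentials

# 1. Delete the C1->C2 replication. The replication id has the form
#    "<remote-cluster-uuid>/<source-bucket>/<target-bucket>" and must be
#    URL-encoded when passed to cancelXDCR.
repl_id = quote("<remote-cluster-uuid>/mybucket/mybucket", safe="")
requests.delete(f"{C1}/controller/cancelXDCR/{repl_id}", auth=AUTH)

# 2. Delete the remote cluster reference to C2 (here named "C2").
requests.delete(f"{C1}/pools/default/remoteClusters/C2", auth=AUTH)

# 3. Re-create the remote cluster reference, pointing at a current C2 node.
requests.post(f"{C1}/pools/default/remoteClusters", auth=AUTH, data={
    "name": "C2",
    "hostname": "c2-node1:8091",
    "username": "Administrator",
    "password": "password",
})

# 4. Re-create the replication; the new replication catches up on the
#    pending mutations.
requests.post(f"{C1}/controller/createReplication", auth=AUTH, data={
    "fromBucket": "mybucket",
    "toCluster": "C2",
    "toBucket": "mybucket",
    "replicationType": "continuous",
})
```

With multiple replicated buckets, steps 1 and 4 repeat once per bucket.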
Deleting the replication and the remote cluster reference and then adding them back again worked. Thanks Pavithra!