Couchbase XDCR only replicates 90% of data

sliu · January 11, 2016, 7:31pm

We’re currently using Couchbase version 3.0.1 Community Edition (build-1444) and are experiencing an issue with XDCR.

Currently we have 2 data centers, A and B, in production with XDCR enabled (1 billion docs in A and B). We’re doing a unidirectional XDCR from datacenter A to a new datacenter C for the 1 billion docs. We find that it only copies 90% of the docs (900 million) to datacenter C. Datacenter C still has XDCR requests coming in, but the doc count holds at around 900 million. This issue has persisted for about one day.

Could anyone give me some help on this? Thanks a lot.

anil · January 11, 2016, 9:49pm

Hi, going through your scenario, it seems that 10% of the data is not getting replicated every time. Can you check whether your documents have a Time To Live (TTL) set? If yes, the next setting I would ask you to check is ‘Metadata Purge Interval’; this could explain why 10% of the data is not getting replicated. If not, I would suggest that you open an issue in our JIRA tracker, referencing this post and providing cbcollect_info from the cluster, along with any additional steps that might help us reproduce it.

sliu · January 11, 2016, 10:25pm

Hi Anil,

Thank you for your reply. Our docs have a TTL of about 30 days. Does that look OK?

Our Metadata Purge Interval is only 0.04 in clusters A, B, and C. Is this an issue? What value do you recommend?

Thanks.
Steve

anil · January 12, 2016, 12:58am

Hi Steve, it looks like the purge is happening quite frequently, about every hour (0.04 days), and deleting the docs and metadata. I cannot say with certainty that this is the issue without knowing all the details, but I would recommend setting the Metadata Purge Interval to the default of 3 days and checking whether that fixes the issue.

Anil Kumar
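
For readers following along: the "every hour" reading of 0.04 follows if the Metadata Purge Interval is expressed in days, as the reply above implies. A minimal sketch of that arithmetic, where the helper name `purge_interval_hours` is made up for illustration and is not part of any Couchbase tool:

```python
def purge_interval_hours(days: float) -> float:
    """Convert a Metadata Purge Interval given in days to hours.

    Assumption: the setting is expressed in days, as the thread implies.
    """
    return days * 24.0


# Values mentioned in this thread: the configured 0.04, the default of 3,
# and the 6 days tried later.
for days in (0.04, 3.0, 6.0):
    print(f"{days} days = {purge_interval_hours(days):.2f} hours")
```

So 0.04 works out to a little under one hour between purges, compared with 72 hours at the default of 3 days.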

sliu · January 12, 2016, 6:39pm

Hi Anil,

I changed the Metadata Purge Interval to 6 days, and after one night the number in cluster C increased by 1%. So it looks like there is some other issue here.

I also checked the XDCR errors in the web UI and saw the errors below. These errors appear both in our production XDCR (A <-> B) and in this backup XDCR (A -> C). Does this mean there are some bad docs in the bucket, so XDCR will not sync the docs in that bucket?

2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 455. Please see logs for details.
2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 446. Please see logs for details.
2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 433. Please see logs for details.

Thanks.
Steve
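
When many such lines accumulate, it can help to see which vbuckets fail and how often before digging into the detailed logs. A rough sketch, assuming every error line has exactly the format of the three lines quoted in the post above; the regular expression and the `failing_vbuckets` helper are illustrative only and are not part of Couchbase:

```python
import re
from collections import Counter

# The three error lines quoted in the post above, used as sample input.
LOG_LINES = [
    "2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 455. Please see logs for details.",
    "2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 446. Please see logs for details.",
    "2016-01-12 18:29:52 [Vb Rep] Error replicating vbucket 433. Please see logs for details.",
]

# Assumed line format; other XDCR error formats are not handled here.
VB_ERROR = re.compile(r"\[Vb Rep\] Error replicating vbucket (\d+)\.")


def failing_vbuckets(lines):
    """Count how many error lines mention each vbucket id."""
    counts = Counter()
    for line in lines:
        match = VB_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


counts = failing_vbuckets(LOG_LINES)
for vbucket, n in sorted(counts.items()):
    print(f"vbucket {vbucket}: {n} error(s)")
```

A handful of vbuckets failing repeatedly would point in a different direction than errors spread evenly across all of them, which is the kind of detail worth including in a JIRA report.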

anil · January 13, 2016, 1:48am

Hi Steve,

To find the root cause, I would suggest turning on verbose logging using the Advanced XDCR Settings. I would also suggest that you open an issue in our JIRA tracker, referencing this post and providing cbcollect_info from the cluster, along with any additional steps that might help us reproduce it.

Anil Kumar
