Couchbase DCP skip intermediate states

We have a Couchbase cluster and a Kafka Connect source connector that reads all the mutations from particular buckets, scopes, and collections.
I’ve seen multiple times that if the source connector runs into an issue and restarts after a while, all of the intermediate mutations for a particular document are skipped. Instead, the Kafka consumer receives events only for the initial state and, most likely, the last state of the document.
I’ve read somewhere that this is due to stream compaction. Could you please help me understand this behaviour?

Hi @nitin1878 !

The behavior you describe is documented here: Delivery Guarantees | Couchbase Docs

If this is a problem for your use case, and you’re using Couchbase Server 7.2 or later with a bucket backed by the Magma storage engine, the issue can be mitigated (but not completely resolved in all cases) by enabling Change History | Couchbase Docs

Thanks,
David

There’s also a nice concise explanation of compaction here: What is compaction term related to Couchbase - Stack Overflow

It was posted almost a decade ago, but I don’t think the concept has changed in any significant way.
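
To make the effect concrete, here is a toy model in Python of why a restarted connector sees only the first and last versions. It is a sketch for intuition, not Couchbase's actual implementation: `ToyBucket`, `upsert`, and `stream_since` are hypothetical names, and the only assumption is that the store keeps just the latest version of each document together with a sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBucket:
    """Hypothetical store that keeps only the latest version of each document."""
    seqno: int = 0
    latest: dict = field(default_factory=dict)  # key -> (seqno, value)

    def upsert(self, key, value):
        # Each write gets a new sequence number and overwrites the old version.
        self.seqno += 1
        self.latest[key] = (self.seqno, value)
        return self.seqno

    def stream_since(self, start_seqno):
        # Replay after a restart: only versions that still exist can be sent.
        items = [(s, k, v) for k, (s, v) in self.latest.items() if s > start_seqno]
        return sorted(items)


bucket = ToyBucket()
received = []

# Connector is online: it sees the first mutation as it happens.
checkpoint = bucket.upsert("doc1", "v1")
received.append(("doc1", "v1"))

# Connector is down: three more mutations hit the same document.
for value in ["v2", "v3", "v4"]:
    bucket.upsert("doc1", value)

# Connector restarts and resumes from its last saved sequence number.
for _, key, value in bucket.stream_since(checkpoint):
    received.append((key, value))

print(received)  # [('doc1', 'v1'), ('doc1', 'v4')]
```

The versions `v2` and `v3` were overwritten before the connector came back, so there is nothing left to replay for them. Keeping those older versions around is what Change History is meant to help with.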
