Compaction Issue causes data loss?

Hey everyone,

We recently had an issue with our cluster (currently running 4.5). Compaction was not completing: it seemed to be running continually on a single node, and the disk was getting close to full. We canceled the compaction and had to bounce the node to get things stable. We are using a 4-node cluster with a replication factor of 2.

What we are seeing now is that some documents are missing and others have reverted to an earlier version.

Is it possible that the scenario described above could cause documents to be missing or reverted to a previous version?

Thanks,

-Damon

Compaction keeps the old file and creates a new compacted file. During compaction no writes happen and if compaction is cancelled it will go back to using the old file. So, there should be no data loss.
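To illustrate why a cancelled compaction should not lose data, here is a minimal Python sketch of the copy-then-swap idea. It is a toy model, not Couchbase's actual storage code: the JSON-lines file format, the `read_latest` and `compact` helpers, the `.compact` suffix, and the `cancel_after` parameter are all assumptions made up for this example.

```python
import json
import os
import tempfile


def read_latest(path):
    """Return {key: value} holding the last record written for each key."""
    latest = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            latest[record["key"]] = record["value"]
    return latest


def compact(path, cancel_after=None):
    """Copy the live records into a new file, then swap it in.

    cancel_after is a hypothetical knob that simulates cancelling the
    compaction once that many records have been copied.
    Returns True if the compacted file replaced the original.
    """
    tmp_path = path + ".compact"
    latest = read_latest(path)
    cancelled = False
    with open(tmp_path, "w", encoding="utf-8") as out:
        for copied, (key, value) in enumerate(latest.items()):
            if cancel_after is not None and copied >= cancel_after:
                cancelled = True
                break
            out.write(json.dumps({"key": key, "value": value}) + "\n")
        if not cancelled:
            # Make sure the new file is on disk before it replaces the old one.
            out.flush()
            os.fsync(out.fileno())
    if cancelled:
        # The old file was never modified, so dropping the partial copy is enough.
        os.remove(tmp_path)
        return False
    # Atomic swap: readers see either the old file or the new one.
    os.replace(tmp_path, path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = os.path.join(d, "store.log")
        with open(store, "w", encoding="utf-8") as f:
            for key, value in [("a", 1), ("b", 1), ("a", 2)]:
                f.write(json.dumps({"key": key, "value": value}) + "\n")
        print(compact(store, cancel_after=1), read_latest(store))
        print(compact(store), read_latest(store))
```

In this sketch the original file is only ever replaced after the new file is completely written, so cancelling partway through leaves the old data intact.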

There are two issues:

  1. Compaction not completing.
  2. Data loss after “bouncing” the node

For 1, we need to look at the logs for clues as to why compaction is not completing.
For 2, bouncing removes the active node from the cluster. Any data that had not yet been replicated to other nodes can be lost, which may be the reason for the data loss.
