We are running a 3-node cluster of Couchbase Community Edition 4.0.0-4051.
As of today, some documents seem to have disappeared. Examining the cluster nodes, I found that there is a problem with one node.
Its memcached.log rotates like crazy, showing thousands of entries like:
2017-03-28T21:32:37.612737+02:00 WARNING (BUCKET) Fatal error in persisting SET ``DOC-ID-1'' on vb 6!!! Requeue it...
2017-03-28T21:32:37.612804+02:00 WARNING (BUCKET) Fatal error in persisting SET ``DOC-ID-134'' on vb 6!!! Requeue it...
2017-03-28T21:32:37.612872+02:00 WARNING (BUCKET) Fatal error in persisting SET ``DOC-ID-14'' on vb 6!!! Requeue it...
2017-03-28T21:32:37.612957+02:00 WARNING (BUCKET) Fatal error in persisting SET ``DOC-ID-341'' on vb 6!!! Requeue it...
2017-03-28T21:32:37.613035+02:00 WARNING (BUCKET) Fatal error in persisting SET ``DOC-ID-94'' on vb 6!!! Requeue it...
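To see whether only vb 6 is affected or other vBuckets as well, the warnings can be tallied per vBucket. This is only a rough sketch: it assumes every warning has exactly the format shown above, and the name `count_persist_errors` and the regular expression are my own, not part of any Couchbase tool.

```python
import re
from collections import Counter

# Matches the warning format quoted above; the format is assumed to be
# the same for every affected document.
PERSIST_ERROR = re.compile(
    r"Fatal error in persisting SET ``(?P<key>.*?)'' on vb (?P<vb>\d+)!!!"
)


def count_persist_errors(lines):
    """Return a Counter mapping vBucket id -> number of failed SET warnings."""
    counts = Counter()
    for line in lines:
        match = PERSIST_ERROR.search(line)
        if match:
            counts[int(match.group("vb"))] += 1
    return counts
```

It can be fed an open log file directly, for example `count_persist_errors(open("memcached.log", errors="replace"))`, where the log file name depends on the current rotation.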
Trying to run cbtransfer results in the following error:
error: could not read couch store file: /data/couchbase/data/BUCKET/6.couch.14; exception: malformed data in file
This particular file is a JSON document:
{
"ep_max_checkpoints" : "2",
"ep_tap_queue_fill" : "0",
"ep_flushall_enabled" : "1",
"ep_tap_backlog_limit" : "5000",
"mem_used" : "8752200",
"ep_tap_queue_backfillremaining" : "0",
"ep_chk_persistence_timeout" : "10",
"vb_pending_queue_size" : "0",
"vb_pending_ops_create" : "0",
"ep_dcp_count" : "0",
"ep_alog_sleep_time" : "1440",
...
"ep_item_eviction_policy" : "value_only",
"ep_vb_total" : "0",
"ep_total_new_items" : "0",
"vb_replica_meta_data_memory" : "0",
"vb_replica_ops_create" : "0",
"ep_tap_bg_fetched" : "0",
"vb_replica_queue_fill" : "0",
"ep_diskqueue_fill" : "0",
"ep_max_num_workers" : "3"
}
All other files with this naming scheme are binary data files. The file’s date corresponds to a reboot of the cluster.
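To check the remaining files on each node, something like the following could be used. It is a minimal sketch under a few assumptions: the bucket directory is the one from the error message above (`/data/couchbase/data/BUCKET`), a healthy data file is binary and does not start with `{`, and the function name `find_json_couch_files` is made up for this post. It only reads the files and does not modify anything.

```python
import glob
import json
import os


def find_json_couch_files(bucket_dir):
    """Return paths of *.couch.* files in bucket_dir whose content parses as JSON."""
    suspicious = []
    for path in sorted(glob.glob(os.path.join(bucket_dir, "*.couch.*"))):
        with open(path, "rb") as fh:
            # Cheap pre-check so large binary data files are not read in full.
            head = fh.read(1024)
            if not head.lstrip().startswith(b"{"):
                continue
            fh.seek(0)
            data = fh.read()
        try:
            json.loads(data)
        except ValueError:
            # Starts with "{" but is not valid JSON (this also covers
            # undecodable bytes); treat it as a normal data file.
            continue
        suspicious.append(path)
    return suspicious
```

Calling `find_json_couch_files("/data/couchbase/data/BUCKET")` on the broken node should list at least `6.couch.14` if the assumptions hold.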
What is going on here? Is there a way to fix this problem?
Thanks in advance!
Stefan