Rebalance failing - possibly due to primary index running out of space?

mvgetz · July 21, 2016, 2:07pm

We created a primary index on a bucket with 450M documents, and the build ran out of space on the indexing volume. This caused the indexing thread on that node to restart continually. A graceful removal of that node from the cluster did not work, so we did a hard failover. Since then, rebalances have not worked. I cleaned the data off the node and added it back to the cluster, but that did not help.
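In hindsight, a free-space check on the indexing volume before building the index might have caught this. Below is a minimal sketch of such a check, not anything Couchbase provides. The function names, the 10% reserve, and the idea of passing in an estimated index size are all assumptions for illustration; I do not have a reliable formula for how large a primary index over 450M documents would be.

```python
import shutil


def has_headroom(free_bytes, total_bytes, needed_bytes, reserve_fraction=0.10):
    """Return True if writing needed_bytes still leaves the reserve free.

    reserve_fraction is a hypothetical safety margin (10% of the volume),
    not a Couchbase recommendation.
    """
    reserve = total_bytes * reserve_fraction
    return free_bytes - needed_bytes >= reserve


def volume_has_headroom(path, needed_bytes, reserve_fraction=0.10):
    """Run the same check against the volume that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return has_headroom(usage.free, usage.total, needed_bytes, reserve_fraction)
```

Calling `volume_has_headroom(index_path, estimated_index_bytes)` with the node's index directory and a rough size estimate (both supplied by you) would return False when the build is likely to fill the volume.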

There are lots of errors in the logs:
memcached.log
2016-07-20T21:59:20.487794-05:00 WARNING (stats) Notified the timeout on checkpoint persistence for vbucket 877, id 0, cookie 0x7fe182ab9a80
2016-07-20T21:59:20.487841-05:00 WARNING 121: Slow SEQNO_PERSISTENCE operation on connection (127.0.0.1:58703 => 127.0.0.1:11209): 31000 ms
2016-07-20T21:59:20.498013-05:00 WARNING (stats) Notified the timeout on checkpoint persistence for vbucket 876, id 0, cookie 0x7fe182af4780
2016-07-20T21:59:20.498067-05:00 WARNING 122: Slow SEQNO_PERSISTENCE operation on connection (127.0.0.1:54497 => 127.0.0.1:11209): 31000 ms
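To see how widespread these warnings are, the log can be summarized per vbucket. This is a rough sketch that assumes every entry follows the same wording as the lines above; the regular expressions and the `summarize` function are my own, not part of any Couchbase tooling.

```python
import re
from collections import Counter

# Patterns copied from the wording of the memcached.log warnings above.
TIMEOUT_RE = re.compile(
    r"Notified the timeout on checkpoint persistence for vbucket (\d+)"
)
SLOW_RE = re.compile(
    r"Slow SEQNO_PERSISTENCE operation on connection \([^)]*\): (\d+) ms"
)


def summarize(log_text):
    """Count persistence timeouts per vbucket and collect slow-op durations."""
    timeouts = Counter(int(vb) for vb in TIMEOUT_RE.findall(log_text))
    slow_ms = [int(ms) for ms in SLOW_RE.findall(log_text)]
    return timeouts, slow_ms


sample = (
    "2016-07-20T21:59:20.487794-05:00 WARNING (stats) Notified the timeout on "
    "checkpoint persistence for vbucket 877, id 0, cookie 0x7fe182ab9a80\n"
    "2016-07-20T21:59:20.487841-05:00 WARNING 121: Slow SEQNO_PERSISTENCE "
    "operation on connection (127.0.0.1:58703 => 127.0.0.1:11209): 31000 ms\n"
)

timeouts, slow_ms = summarize(sample)
print(sorted(timeouts))  # vbucket ids that timed out
print(slow_ms)           # durations of slow operations, in ms
```

Running `summarize` over the full memcached.log would show whether the timeouts are limited to a few vbuckets or spread across all of them.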

There are many others in the various logs. Here is the collected info from the node that ran out of space while creating the primary index on the stats bucket.

The cluster is still working and it does not look like we have lost any data yet.

Any help would be appreciated!
Thanks!
Mark
