Hi there, I'm currently using Couchbase 3.0.1.
I have 3 buckets on the server, with 1 node only.
Just now I ran into a problem: the server kept raising "Write Commit Failure" alerts.
I tried to cbbackup the data out; one of the buckets hangs at "100.0% (58461/estimated 58475 msgs)", while the other two hang at the very beginning (before the backup folder is even created).
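For reference, the backup command was along these lines (host, credentials, and backup directory are placeholders, not my real values):

    /opt/couchbase/bin/cbbackup http://localhost:8091 /vol1/backup -u Administrator -p password -b iptb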
Here are some logs I found:
Mon Jun 8 22:22:09.972486 HKT 3: (iptb) Warning: couchstore_save_local_document failed error=error reading file [errno = 0: 'Success']
Mon Jun 8 22:22:09.972702 HKT 3: (iptb) Warning: failed to save local doc, name=/vol1/couchbase_data/iptb/491.couch.491
Mon Jun 8 22:22:09.972912 HKT 3: (iptb) Warning: failed to set new state, active, for vbucket 491
Mon Jun 8 22:22:09.972994 HKT 3: (iptb) VBucket snapshot task failed!!! Rescheduling
Mon Jun 8 22:22:09.973943 HKT 3: (iptb) Warning: couchstore_save_local_document failed error=error reading file [errno = 0: 'Success']
Mon Jun 8 22:22:09.974074 HKT 3: (iptb) Warning: failed to save local doc, name=/vol1/couchbase_data/iptb/311.couch.311
Mon Jun 8 22:22:09.974167 HKT 3: (iptb) Warning: failed to set new state, active, for vbucket 311
Mon Jun 8 22:22:09.974238 HKT 3: (iptb) VBucket snapshot task failed!!! Rescheduling
Now my service is not able to start (there will be a write commit failure, leading to data loss).
Please tell me how I can recover my service!
I know this is a basic question, but how much free disk space do you have?
Also, what operating system are you running?
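For example, the output of these (standard CentOS commands):

    df -h
    cat /etc/redhat-release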
Original environment:
CentOS 6.4
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_seed8gram2core-lv_root
6.6G 5.1G 1.3G 81% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 32M 428M 7% /boot
/dev/sdb1 20G 9.8G 9.0G 53% /vol1
Then I copied the data to these environments:
CentOS 6.4
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root 14G 2.9G 11G 22% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 32M 428M 7% /boot
/dev/sdb1 50G 897M 46G 2% /vol1
CentOS 6.5
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_tw-lv_root 50G 12G 35G 26% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 485M 32M 428M 7% /boot
/dev/mapper/vg_tw-lv_home 172G 8.9G 155G 6% /home
Couchbase data is in /vol1/ in each case.
All gave the same result.
Thanks
Also, when I try to list the documents of the bucket that hangs at 100%, it shows:
"Error: internal (memcached_error)"
Meanwhile, I found this in the log:
[ns_server:info,2015-06-09T11:04:59.429,babysitter_of_ns_1@127.0.0.1:<0.78.0>:ns_port_server:log:169]memcached<0.78.0>: Tue Jun 9 11:04:59.228044 HKT 3: (iptb) couchstore_all_docs failed for database file of vbucket = 311 rev = 146, errCode = 4294967294
memcached<0.78.0>: Tue Jun 9 11:04:59.230594 HKT 3: (iptb) couchstore_all_docs failed for database file of vbucket = 311 rev = 146, errCode = 4294967294
memcached<0.78.0>: Tue Jun 9 11:04:59.247649 HKT 3: (iptb) couchstore_all_docs failed for database file of vbucket = 491 rev = 273, errCode = 4294967294
memcached<0.78.0>: Tue Jun 9 11:04:59.251020 HKT 3: (iptb) couchstore_all_docs failed for database file of vbucket = 491 rev = 273, errCode = 4294967294
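A note in case it helps: 4294967294 is -2 interpreted as an unsigned 32-bit value (2^32 - 2 = 4294967294), so couchstore is returning error code -2 for those vbucket files, which again points at the files themselves being unreadable. If anyone wants to inspect an affected file directly, I believe the couch_dbdump tool shipped in /opt/couchbase/bin can read a vbucket file on its own (file name guessed from the vbucket/rev in the log above):

    /opt/couchbase/bin/couch_dbdump /vol1/couchbase_data/iptb/311.couch.146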
Is no one going to look at this case?
When you say you "copied the data": how did you achieve that? I'm wondering whether you copied/rsynced the files, restored from a backup, used XDCR, or something else.
The issue could be file permissions, corrupted data files, hardware failure, etc.
Is this a single node or a cluster? What is the health state of Couchbase? Are some of your processes being killed by the OS for lack of RAM?
Please provide more details.
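For instance, the node and cluster state can be pulled from the REST API (credentials are placeholders):

    curl -u Administrator:password http://localhost:8091/pools/default | python -m json.tool

The per-node "status" and "clusterMembership" fields in that output will tell you whether the node is considered healthy.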
To get your data out of your original node (write commit failure), did you try cbbackup with couchstore-files://[path to data dir] instead of http://[cluster name]:8091?
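Something along these lines (data path taken from your logs; the backup directory is a placeholder):

    /opt/couchbase/bin/cbbackup couchstore-files:///vol1/couchbase_data /tmp/cb-backup

This reads the couch files straight from disk rather than going through memcached, so it can sometimes get data out of a node whose service won't start.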
I just copied the data and configuration files. File permissions were set the same as on a brand-new Couchbase Server installation.
The Couchbase status was healthy ("running" from service couchbase-server status).
I don't see any killed services in the kernel log, and there's no sign of RAM pressure.
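(Roughly what I checked; /var/log/messages is the CentOS default kernel log:)

    dmesg | grep -i -E 'oom|kill'
    grep -i oom /var/log/messages
    free -m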
I've listed all the keys with views and inserted them into a brand-new Couchbase server. The new server is running fine, but the old server's problem is still a mystery. At the very least, I want to know how to prevent this from happening again…
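Roughly how the key export looked (design doc and view names are just what I happened to use; 8092 is the view API port):

    # create a view that emits every key
    curl -X PUT -H 'Content-Type: application/json' \
      http://localhost:8092/iptb/_design/keys \
      -d '{"views":{"all":{"map":"function (doc, meta) { emit(meta.id, null); }"}}}'

    # list all keys from the view
    curl 'http://localhost:8092/iptb/_design/keys/_view/all?stale=false'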
It also failed with couchstore-files://[path to data dir].