Why is my Couchbase Indexer failing?

Joshua_Fox · May 7, 2019, 12:02pm

Logs show this, repeatedly.

>  Service 'indexer' exited with status 2. Restarting. Messages:
>     goproj/src/github.com/couchbase/plasma/page.go:826 +0x111
>     github.com/couchbase/plasma.(*Plasma).Persist(0xc435096400, 0x7f0498362e00, 0xc436f38600, 0xc4374c1b80, 0x0, 0x0)
>     goproj/src/github.com/couchbase/plasma/persistor.go:139 +0x174
>     github.com/couchbase/plasma.(*Plasma).PersistAll2.func1(0x7f0498362e00, 0x0, 0x0, 0xffffffffffffffff, 0x21, 0x21)
>     goproj/src/github.com/couchbase/plasma/persistor.go:182 +0x5a
>     github.com/couchbase/plasma.(*Plasma).VisitPartition(0xc435096400, 0x0, 0x0, 0xffffffffffffffff, 0xc440d28ce0, 0x0, 0x0)
>     goproj/src/github.com/couchbase/plasma/page_visitor.go:64 +0x1ef
>     github.com/couchbase/plasma.(*Plasma).PageVisitor.func1(0xc440d28cf0, 0xc440d28d00, 0x1, 0x1, 0xc435096400, 0xc440d28ce0, 0x0, 0x0, 0xffffffffffffffff)
>     goproj/src/github.com/couchbase/plasma/page_visitor.go:40 +0x89
>     created by github.com/couchbase/plasma.(*Plasma).PageVisitor
>     goproj/src/github.com/couchbase/plasma/page_visitor.go:41 +0x1ae
>     [goport(/opt/couchbase/bin/indexer)] 2019/05/07 15:00:54 child process exited with status 2

Couchbase was working fine until today.

The Java SDK always gives an error “Indexer In Warmup State. Please retry the request later.” The indexing GUI has red margins, as if the indexes are unbuilt. Dropping indexes from the GUI or the Java SDK always fails.

Restarting the Couchbase server does not help.

This thread shows the same error message on Windows. It says that this can happen if the data is corrupted on power-off. I certainly hope that that did not happen. Any machine can potentially lose power, and unrecoverable corruption of production data is a reason to strictly avoid Couchbase.

This is “Enterprise Edition 6.0.1 build 2037,” a single node on a development laptop running Ubuntu 18.10.

deepkaran.salooja · May 8, 2019, 12:37am

Please share the indexer.log or share the cbcollect from UI->Logs. We’ll need to look at the full stack to understand the exact problem.

Joshua_Fox · May 8, 2019, 7:33am

UI -> Logs, then “Collect Logs”, gives this:

>  Error: Unable to collect logs from the following nodes: 127.0.0.1
>  Node errors:
>  127.0.0.1:
>      File "/opt/couchbase/bin/cbcollect_info", line 331
>        except OSError, e:
>                      ^
>    SyntaxError: invalid syntax

My Ubuntu 18.10 uses Python 3 as the system Python, but I see that cbcollect still uses Python 2. (When installing Couchbase, I also had to wrestle with Python 2/3 errors.)
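For anyone hitting the same message: `except OSError, e:` is the Python 2 spelling of an exception handler, and the Python 3 shipped with Ubuntu 18.10 refuses to parse it, so the script dies before it runs anything. Here is a minimal illustration of the Python 3 form. This is not the actual cbcollect_info code, and `remove_if_present` is a made-up helper just to show the syntax.

```python
import errno
import os


def remove_if_present(path):
    """Delete path. Return True if it was deleted, False if it did not exist.

    Hypothetical helper, used only to illustrate the exception syntax.
    """
    try:
        os.remove(path)
        return True
    # Python 2 wrote this handler as `except OSError, e:`.
    # Python 3 requires the `as` keyword to bind the exception.
    except OSError as e:
        if e.errno == errno.ENOENT:
            return False  # nothing to delete
        raise  # any other OS error is unexpected, so re-raise it
```

The `as` form also works on Python 2.6 and later, so a script written this way runs under both interpreters.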

I did not uninstall or install Python recently, but I did attempt to install cbc. I think that the indexer started failing before that, but in any case, the instructions for installing cbc failed with the message shown below.

(Note that sudo was required for the following command.)

sudo perl couchbase-csdk-setup
...

Running apt-get -qq update..
Running: apt-get -q install libcouchbase2-core libcouchbase2-libevent libcouchbase2-bin libcouchbase-dev
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package libcouchbase2-core
E: Unable to locate package libcouchbase2-libevent
E: Unable to locate package libcouchbase2-bin
E: Unable to locate package libcouchbase-dev
Couldn't install! at couchbase-csdk-setup line 207, <STDIN> line 2.
...

Joshua_Fox · May 8, 2019, 9:39am

> Please share the indexer.log or share the cbcollect from UI->Logs.

Here it is, with index names redacted. (It is 20 MB, and as a text file it cannot be attached inside the forum.)
https://drive.google.com/file/d/1US40mQBBbaeZ7zGA_B-kA9BifYRCDk_y/view?usp=sharing

Joshua_Fox · May 8, 2019, 9:40am

Further analysis shows that /opt/couchbase/var/lib/couchbase/data/@2i has 36 GB and may have filled my 100 GB system disk. I thought I had configured Couchbase to put data on my 1 TB data disk, but I can see that this may be the root cause.

But in that case… why don’t the various error messages (above) say “disk space low”?
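In case it helps others check for the same thing, here is a rough sketch of how the index directory size and the free space on its filesystem can be measured. The path is the one from my machine; `dir_size_bytes` and `free_fraction` are names I made up, and the script may need to run as root to read everything under /opt/couchbase.

```python
import os
import shutil


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping unreadable entries."""
    total = 0
    # os.walk silently skips directories it cannot list (default onerror=None).
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or permission denied
    return total


def free_fraction(path):
    """Return the free share (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    # Index storage path from this thread; adjust for your installation.
    index_dir = "/opt/couchbase/var/lib/couchbase/data/@2i"
    if os.path.isdir(index_dir):
        print("index dir: %.1f GB" % (dir_size_bytes(index_dir) / 1e9))
        print("free on that filesystem: %.1f%%" % (100 * free_fraction(index_dir)))
    else:
        print("no such directory:", index_dir)
```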

amit.kulkarni · May 9, 2019, 12:36pm

Hi @Joshua_Fox,

The Couchbase Server Web UI shows an alert when the disk is getting full on any of the cluster nodes.

Joshua_Fox · May 12, 2019, 8:59am

Amit, thank you. I suggest that:

1. Couchbase should shut down cleanly when only 3% of the disk remains free. The user can then clean up and restart. This is better than entering an unclear error state and potentially borking the OS.
2. The advance warning is valuable, but when the disk does fill up, Couchbase should clearly report "disk full" in its errors.
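To make the suggestion concrete, here is a small sketch of the kind of guard I have in mind. It is only an illustration of the idea, not how Couchbase works: the 3% threshold is my own figure, and `disk_is_nearly_full` and `check_before_write` are hypothetical names.

```python
import shutil

# The 3% figure suggested above; an assumption, not a Couchbase setting.
MIN_FREE_FRACTION = 0.03


def disk_is_nearly_full(free_bytes, total_bytes, min_free_fraction=MIN_FREE_FRACTION):
    """Return True when the free share of the disk is below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def check_before_write(path, min_free_fraction=MIN_FREE_FRACTION):
    """Raise an explicit 'disk full' error instead of failing obscurely later."""
    usage = shutil.disk_usage(path)
    if disk_is_nearly_full(usage.free, usage.total, min_free_fraction):
        raise RuntimeError(
            "disk full: only %d of %d bytes free on %s"
            % (usage.free, usage.total, path)
        )
```

A service could call something like `check_before_write` ahead of each flush and shut down cleanly on the error, so the message the user sees names the real cause.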

amit.kulkarni · May 13, 2019, 4:23pm

Hi @Joshua_Fox,

Thanks a lot for sharing the log file. Looks like it may be a newly identified bug. I have opened MB-34153.

Please note that Couchbase indexing should not panic when the disk fills up. It should keep retrying the write operation until the error goes away. So any panic is a potential bug. Thanks for reporting it.
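As a rough illustration of the retry behavior described here (a generic sketch only, not the indexer's actual code, which is written in Go), a write can be retried with backoff for as long as the OS reports "no space left on device". The function name `write_with_retry` and its parameters are hypothetical.

```python
import errno
import time


def write_with_retry(write_fn, max_attempts=None, initial_delay=0.5,
                     max_delay=30.0, sleep=time.sleep):
    """Call write_fn until it succeeds, retrying only on ENOSPC (disk full).

    max_attempts=None retries forever; other OS errors are re-raised at once.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return write_fn()
        except OSError as e:
            if e.errno != errno.ENOSPC:
                raise  # not a disk-full error, do not mask it
            if max_attempts is not None and attempt >= max_attempts:
                raise  # give up after the configured number of attempts
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

The point of the design is that a full disk becomes a wait state that clears once space is freed, rather than a crash.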

Starting with Couchbase Server 5.5, the indexing service detects on-disk corruption (if any) and ensures availability of non-corrupt indexes (MB-28139). Please note that the corruption (as mentioned in the other forum post) could have been caused by events outside Couchbase’s control (for example, misbehaving hardware).

Finally,
Thanks for the suggestion on better handling of the disk full scenario. I will make a suggestion for this internally.

Joshua_Fox · May 14, 2019, 7:41am

Thank you. I understand that resource exhaustion is difficult to handle correctly, so my 2 cents as a user may help here.

amit.kulkarni (May 13) wrote:

> Hi @Joshua_Fox,
>
> Thanks a lot for sharing the log file. Looks like it may be a newly identified bug. I have opened MB-34153.
>
> Please note that Couchbase indexing should not panic when the disk fills up. It should keep retrying the write operation until the error goes away.

Really? Shouldn’t Couchbase stop indexing when the disk is more than 97% full? After all, you don’t want to make things worse.

> So any panic is a potential bug. Thanks for reporting it.
>
> Starting with Couchbase Server 5.5, the indexing service detects on-disk corruption (if any) and ensures availability of non-corrupt indexes (MB-28139). Please note that the corruption (as mentioned in the other forum post) could have been caused by events outside Couchbase’s control (for example, misbehaving hardware).

I was running Couchbase in a GCE VM but also on my laptop, which is where I experienced disk exhaustion.

This may or may not be an exceptional use case, but here goes:

1. Couchbase did not allow me to define a data directory on my large data disk (it simply rejected such directory choices), so I left the data on my smaller system disk.
2. The data was protected deep inside a system directory (/opt), so Baobab and similar tools didn’t even show me the cause of the disk exhaustion until I retried with sudo.
3. Soon after I forcibly deleted this data, my OS died. (The fan spun noisily for a week. After it was cleaned, Ubuntu would not boot up and I had to reinstall an Ubuntu OS, which did work.)

I do *not* think that Couchbase caused it, but it is true that these occurred at the same time. One can also consider a hardware failure causing the problems with Couchbase (though I don’t think that happened).
