We are using 4.5.0-2601 Enterprise Edition (build-2601). I should note that we upgraded from 4.1.1 a couple days ago.
The Circular Write Mode compaction was scheduled to start at 2:22AM. A series of core dumps occurred starting at 2:22AM and up to 3:58AM, when all disk space had been consumed, on the Couchbase Node with the index.
Is this a known issue? And if not, what information (core, logs, etc.) do you need to investigate this? (The core file is ~ 10GB).
Here are logs from the babysitter.log file for the first core dump.
[ns_server:info,2016-09-08T02:22:13.143-04:00,babysitter_of_ns_1@127.0.0.1:<0.122.0>:supervisor_cushion:handle_info:58]Cushion managed supervisor for indexer failed: {abnormal,1}
[error_logger:error,2016-09-08T02:22:13.143-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]** Generic server <0.123.0> terminating
** Last message in was {#Port<0.3538>,{exit_status,1}}
** When Server state == {state,#Port<0.3538>,indexer,
{["[goport] 2016/09/08 02:22:13 /opt/couchbase/bin/indexer terminated: signal: aborted (core dumped)",
"\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/net/http/transport.go:853 +0x10a6",
“created by net/http.(*Transport).dialConn”,
"\t/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc8231abfb0 sp=0xc8231abfa8"],
[“runtime.goexit()”]},
indexer,undefined,[],0}
** Reason for termination ==
** {abnormal,1}
[error_logger:error,2016-09-08T02:22:13.144-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: ns_port_server:init/1
pid: <0.123.0>
registered_name: []
exception exit: {abnormal,1}
in function gen_server:terminate/6 (gen_server.erl, line 744)
ancestors: [<0.122.0>,<0.121.0>,ns_child_ports_sup,ns_babysitter_sup,
<0.56.0>]
messages: [{‘EXIT’,#Port<0.3538>,normal}]
links: [<0.122.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 196650
stack_size: 27
reductions: 8440393
neighbours:
[ns_server:debug,2016-09-08T02:22:13.144-04:00,babysitter_of_ns_1@127.0.0.1:<0.129.0>:supervisor_cushion:init:39]starting ns_port_server with delay of 5000
[error_logger:error,2016-09-08T02:22:13.144-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]** Generic server <0.122.0> terminating
** Last message in was {die,{abnormal,1}}
** When Server state == {state,indexer,5000,
{1473,284053,337615},
undefined,infinity}
** Reason for termination ==
** {abnormal,1}
[error_logger:error,2016-09-08T02:22:13.145-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: erlang:apply/2
pid: <0.121.0>
registered_name: []
exception exit: {abnormal,1}
in function restartable:loop/4 (src/restartable.erl, line 69)
ancestors: [ns_child_ports_sup,ns_babysitter_sup,<0.56.0>]
messages: []
links: [<0.69.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 2586
stack_size: 27
reductions: 1482
neighbours:
[error_logger:error,2016-09-08T02:22:13.145-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,ns_child_ports_sup}
Context: child_terminated
Reason: {abnormal,1}
Offender: [{pid,<0.121.0>},
{name,
{indexer,"/opt/couchbase/bin/goport",[],
[use_stdio,exit_status,stderr_to_stdout,stream,
{log,“indexer.log”},
{env,
[{“GOPORT_ARGS”,
"["/opt/couchbase/bin/indexer","-vbuckets=1024","-cluster=127.0.0.1:8091","-adminPort=9100","-scanPort=9101","-httpPort=9102","-streamInitPort=9103","-streamCatchupPort=9104","-streamMaintPort=9105","-storageDir=/opt/couchbase/var/lib/couchbase/data/@2i","-diagDir=/opt/couchbase/var/lib/couchbase/crash","-nodeUUID=9cb6edda57a624f08193c894ef30c1ce","-storageMode=forestdb"]"},
{“GOTRACEBACK”,“crash”},
{“CBAUTH_REVRPC_URL”,
“http://%40:67c09ea0f990c89ccbc3532fd6b7b362@127.0.0.1:8091/index”}]}]}},
{mfargs,
{restartable,start_link,
[{supervisor_cushion,start_link,
[indexer,5000,infinity,ns_port_server,
start_link,
[#Fun<ns_child_ports_sup.2.49698737>]]},
86400000]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,worker}]
[error_logger:info,2016-09-08T02:22:13.161-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================PROGRESS REPORT=========================
supervisor: {local,ale_dynamic_sup}
started: [{pid,<0.131.0>},
{name,‘sink-indexer’},
{mfargs,
{ale_disk_sink,start_link,
[‘sink-indexer’,
"/opt/couchbase/var/lib/couchbase/logs/indexer.log",
[{rotation,
[{compress,true},
{size,41943040},
{num_files,10},
{buffer_size_max,52428800}]}]]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,worker}]
[ns_server:debug,2016-09-08T02:22:13.220-04:00,babysitter_of_ns_1@127.0.0.1:<0.128.0>:restartable:start_child:98]Started child process <0.129.0>
MFA: {supervisor_cushion,start_link,
[indexer,5000,infinity,ns_port_server,start_link,
[#Fun<ns_child_ports_sup.2.49698737>]]}
[error_logger:info,2016-09-08T02:22:13.220-04:00,babysitter_of_ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================PROGRESS REPORT=========================
supervisor: {local,ns_child_ports_sup}
started: [{pid,<0.128.0>},
{name,
{indexer,"/opt/couchbase/bin/goport",[],
[use_stdio,exit_status,stderr_to_stdout,stream,
{log,“indexer.log”},
{env,
[{“GOPORT_ARGS”,
"["/opt/couchbase/bin/indexer","-vbuckets=1024","-cluster=127.0.0.1:8091","-adminPort=9100","-scanPort=9101","-httpPort=9102","-streamInitPort=9103","-streamCatchupPort=9104","-streamMaintPort=9105","-storageDir=/opt/couchbase/var/lib/couchbase/data/@2i","-diagDir=/opt/couchbase/var/lib/couchbase/crash","-nodeUUID=9cb6edda57a624f08193c894ef30c1ce","-storageMode=forestdb"]"},
{“GOTRACEBACK”,“crash”},
{“CBAUTH_REVRPC_URL”,
“http://%40:67c09ea0f990c89ccbc3532fd6b7b362@127.0.0.1:8091/index”}]}]}},
{mfargs,
{restartable,start_link,
[{supervisor_cushion,start_link,
[indexer,5000,infinity,ns_port_server,
start_link,
[#Fun<ns_child_ports_sup.2.49698737>]]},
86400000]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,worker}]
[ns_server:info,2016-09-08T02:22:13.681-04:00,babysitter_of_ns_1@127.0.0.1:<0.76.0>:ns_port_server:log:210]memcached<0.76.0>: 2016-09-08T02:22:13.480478-04:00 NOTICE (bwecl) DCP (Producer) eq_dcpq:secidx:proj-bwecl-MAINT_STREAM_TOPIC_50:70:b9:ed:fe:b7:a:54-11994220998227306265/2 - (vb 344) Stream closing, sent until seqno 268958 remaining items 0, reason: The stream closed early because the conn was disconnected
And the initial log from the indexer.log
2016-09-08T02:21:57.362-04:00 [Info] ForestDBSlice::OpenSnapshot SliceId 0 IndexInstId 955245833351623398 Creating New Snapshot SnapshotInfo: seqnos: 110248364, 110248364, 0 committed:false
2016-09-08T02:21:57.362-04:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 955245833351623398 PartitionId: 0 SliceId: 0 Crc64: 985532960051374772 (SnapshotInfo: seqnos: 110248364, 110248364, 0 committed:false) SnapCreateDur 52.804µs SnapOpenDur 287.405µs
[FDB ERR] Read error: read offset 199762582933037056 exceeds the file’s current offset 121453547520 in a database file ‘/opt/couchbase/var/lib/couchbase/data/@2i/bwecl_bwidx_bwecl_sp_startTime___269554588572028ac529f7faceb54b18_955245833351623398_0.index/data.fdb.2’
[FDB ERR] Failed to read the B±Tree block (block id: 48770161848886, block address: 0x7f333da2c000)
fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0x0 pc=0x7f3559c72b22]
runtime stack:
runtime.throw(0x1214980, 0x2a)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/panic.go:530 +0x90 fp=0x7f3555ff8208 sp=0x7f3555ff81f0
runtime.sigpanic()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/sigpanic_unix.go:12 +0x5a fp=0x7f3555ff8258 sp=0x7f3555ff8208
goroutine 11138540 [syscall, 3 minutes, locked to thread]:
runtime.cgocall(0xc8a648, 0xc820195b38, 0x0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/cgocall.go:123 +0x11b fp=0xc820195ae8 sp=0xc820195ab8
github.com/couchbase/indexing/secondary/fdb._Cfunc_fdb_compact_upto(0x7f3554415aa0, 0x7f3544175b00, 0x1a7b91b, 0x0)
??:0 +0x41 fp=0xc820195b38 sp=0xc820195ae8
github.com/couchbase/indexing/secondary/fdb.(*File).CompactUpto(0xc823aa9680, 0xc823e77400, 0x91, 0xc824a039a0, 0x0, 0x0)
/home/couchbase/jenkins/workspace/watson-unix/goproj/src/github.com/couchbase/indexing/secondary/fdb/file.go:113 +0x32d fp=0xc820195bc8 sp=0xc820195b38
github.com/couchbase/indexing/secondary/indexer.(*fdbSlice).Compact(0xc8220f3680, 0xecf644b27, 0xc83b4bd6ca, 0x1998ca0, 0x0, 0x0)
…