Hi,
The nodes on my Couchbase server seem to be constantly going into pending status, then eventually going down, and then coming back up again after some time.
This happens as I try to upload more and more documents to the nodes using the Java SDK. My bucket currently holds about 490 million documents with full eviction. It’s running on 3 nodes with about 46.8 GB of memory and about 2.1 TB of HDD space in total. Below is what I’ve got from error.log on one of the nodes.
[stats:error,2015-06-30T8:03:28.331,ns_1@ec2-54-153-141-152.ap-southeast-2.compute.amazonaws.com:<0.925.0>:stats_collector:handle_info:124]Exception in stats collector: {exit,
{noproc,
{gen_server,call,
['ns_memcached-Sample',{stats,<<>>},180000]}},
[{gen_server,call,3,
[{file,"gen_server.erl"},{line,188}]},
{ns_memcached,do_call,3,
[{file,"src/ns_memcached.erl"},{line,1399}]},
{stats_collector,grab_all_stats,1,
[{file,"src/stats_collector.erl"},{line,84}]},
{stats_collector,handle_info,2,
[{file,"src/stats_collector.erl"},
{line,116}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,604}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]}
[stats:error,2015-06-30T8:03:28.332,ns_1@ec2-54-153-141-152.ap-southeast-2.compute.amazonaws.com:<0.925.0>:stats_collector:handle_info:124]Exception in stats collector: {exit,
{noproc,
{gen_server,call,
['ns_memcached-Sample',{stats,<<>>},180000]}},
[{gen_server,call,3,
[{file,"gen_server.erl"},{line,188}]},
{ns_memcached,do_call,3,
[{file,"src/ns_memcached.erl"},{line,1399}]},
{stats_collector,grab_all_stats,1,
[{file,"src/stats_collector.erl"},{line,84}]},
{stats_collector,handle_info,2,
[{file,"src/stats_collector.erl"},
{line,116}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,604}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]}
[ns_server:error,2015-06-30T8:03:37.090,ns_1@ec2-54-153-141-152.ap-southeast-2.compute.amazonaws.com:ns_doctor<0.329.0>:ns_doctor:update_status:229]The following buckets became not ready on node 'ns_1@ec2-54-153-141-152.ap-southeast-2.compute.amazonaws.com': ["Sample",
"office"], those of them are active ["Sample",
"office"]
[ns_server:error,2015-06-30T8:04:24.298,ns_1@ec2-54-153-141-152.ap-southeast-2.compute.amazonaws.com:ns_log<0.277.0>:ns_log:handle_cast:210]unable to notify listeners because of badarg
I’m confused about what is causing this problem and would like to understand it so that I can prevent it from happening in production.
Does anyone know why? Is the server possibly not big enough?