java.lang.RuntimeException: java.util.concurrent.TimeoutException

Hi, I have a problem with my cluster of 8 server nodes.
The problem is that when one of the server nodes is down, the Couchbase client loops without upserting anything and returns with this response:
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:75)
at com.couchbase.client.java.CouchbaseBucket.upsert(CouchbaseBucket.java:353)
at com.couchbase.client.java.CouchbaseBucket.upsert(CouchbaseBucket.java:348)

Thanks for any advice

The design is that we will continue trying an operation until timeout. If the item you’re trying to work with is on the down node, it will timeout. If you failover the node rather than just leave it down, you should see operations recover. There’s more on this in the docs.
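
As an aside, the operation timeout can also be passed explicitly per call; a minimal sketch, assuming an open bucket and a document (the 5 seconds is just an example value):

// The SDK retries internally until the timeout elapses; if the document's
// active vBucket lives on the down (not failed-over) node, this still
// fails with a TimeoutException once the 5 seconds are up.
bucket.upsert(doc, 5, TimeUnit.SECONDS);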

Thanks for the answer.
The problem persists even when the node comes back up.
The system always responds with a timeout.
At the moment the only solution is to restart the service that writes the entries to the cluster.
The cluster connection settings are:

CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
	.connectTimeout(180 * 1000)     // 180 seconds
	.keepAliveInterval(3600 * 1000) // 3600 seconds
	.build();
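
The environment is then passed in when creating the cluster (the hostname here is a placeholder):

Cluster cluster = CouchbaseCluster.create(env, "192.168.1.10");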

Which SDK version are you using?

using:
java-client-2.2.3.jar

Very similar to the problem we’re having here:

But our cluster nodes are definitely up.

@unhuman

did you find a solution?
Thanks

@Alessandro.79

Not yet. We have checked to ensure we have network path on all ports (8091, 8092, 8093, 11210, 11211) - we do. We are trying now flipping off Full Ejection. We are also trying to test from other locations, but that’s not set up yet.

@Alessandro.79

We just turned off Full Ejection, which restarted the bucket. Our test was then able to run successfully. This isn’t confirmation that it fixes the problem (what is your bucket setting?), but it’s working for now.

@Alessandro.79 Looks like I spoke too soon. The problem came back.

@Alessandro.79 can you try 2.3.5 just as a sanity check please?

@Alessandro.79

The application server (Couchbase client) errors mentioned above do indeed seem to have stopped with the change to the eviction policy (full → value). We had some flaky tests that caused me to jump the gun on saying that we weren’t fixed.

Since it was late Friday, I’m hesitant to confirm for sure that we’re good, but I’m at least hopeful at this point. I don’t think the client version (2.2.5 → 2.3.5) had any involvement, but we’re not rolling back to test. We’ll take 2.3.5 as a benefit of this process.

Also, make sure your Java application is allocated enough memory; we have definitely seen oddball errors when the JVM runs out of memory.

I’ll update again if we see anything else.

Thanks for the info!
For now my workaround is: when I catch a “java.lang.RuntimeException: java.util.concurrent.TimeoutException”, I close the connection and re-open it.
Maybe that way I’ve solved it.
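
Roughly, the workaround looks like the sketch below (ReconnectingWriter, reconnect(), and safeUpsert() are my own names, not SDK API):

import java.util.concurrent.TimeoutException;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;

public class ReconnectingWriter {
    private Cluster cluster;
    private Bucket bucket;

    public ReconnectingWriter() {
        reconnect();
    }

    // Tear down any existing connection and build a fresh one.
    private void reconnect() {
        if (cluster != null) {
            cluster.disconnect();
        }
        cluster = CouchbaseCluster.create("192.168.1.10", "192.168.1.11");
        bucket = cluster.openBucket("test");
    }

    // Upsert; on a timeout, rebuild the connection and retry once.
    public void safeUpsert(JsonDocument doc) {
        try {
            bucket.upsert(doc);
        } catch (RuntimeException e) {
            if (e.getCause() instanceof TimeoutException) {
                reconnect();
                bucket.upsert(doc);
            } else {
                throw e;
            }
        }
    }
}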

If you solve it that way, it would seem you’re just working around a client bug. But, certainly an interesting approach.

We are seeing new errors now:

    {"timestamp":"2016-12-05T15:45:46.695Z","level":"WARN","thread":"cb-io-1-2","logger":"com.couchbase.client.deps.io.netty.channel.AbstractChannel",
"message":"Force-closing a channel whose registration task was not accepted by an event loop: [id: 0x7d28468c]","context":"default",
"exception":"java.util.concurrent.RejectedExecutionException: event executor terminated
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:800)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:345)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:338)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:743)
    at com.couchbase.client.deps.io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:422)
    at com.couchbase.client.deps.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:72)
    at com.couchbase.client.deps.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:60)
    at com.couchbase.client.deps.io.netty.channel.MultithreadEventLoopGroup.register(MultithreadEventLoopGroup.java:64)
    at com.couchbase.client.deps.io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:320)
    at com.couchbase.client.deps.io.netty.bootstrap.Bootstrap.doConnect(Bootstrap.java:134)
    at com.couchbase.client.deps.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:90)
    at com.couchbase.client.core.endpoint.BootstrapAdapter.connect(BootstrapAdapter.java:50)
    at com.couchbase.client.core.endpoint.AbstractEndpoint$4.call(AbstractEndpoint.java:300)
    at com.couchbase.client.core.endpoint.AbstractEndpoint$4.call(AbstractEndpoint.java:297)
    at rx.Single$1.call(Single.java:90)
    at rx.Single$1.call(Single.java:70)
    at rx.Single$2.call(Single.java:171)

and

{"timestamp":"2016-12-05T15:45:46.698Z","level":"ERROR","thread":"cb-io-1-2","logger":"com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.rejectedExecution",
"message":"Failed to submit a listener notification task. Event loop shut down?","context":"default",
"exception":"java.util.concurrent.RejectedExecutionException: event executor terminated
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:800)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:345)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:338)
    at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:743)
    at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:767)
    at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:435)
    at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
    at com.couchbase.client.deps.io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:852)

@daschl Please see ^ and provide guidance

Found this: Java client don't connect another cluster node when first is down - #17 by daschl

I don’t know if that will fix our current issue. We’re going to also try going back to 2.2.5 to see if this issue happens with that version. This is an app that was recently updated from 1.4.x to 2.2.5 that started exhibiting this behavior.

The close-and-reopen connection workaround is not working…
I’m still looking for the reason why this happens.
Have you found a solution?

We haven’t been encountering these problems of late. I wouldn’t say we’re fixed, but for now we’ve got other things to look at. We also had too much going on directly on the server, so we eliminated some design docs, etc. That may have helped…

Our cluster has stabilized so that has helped our application stabilize, I assume. Doesn’t mean we’re good forever, though.

I have the same problem with java-client 2.4.3.
My code is like this:

Cluster cluster = CouchbaseCluster.create("192.168.1.10", "192.168.1.11");
Bucket bucket = cluster.openBucket("test");
SerializableDocument doc = SerializableDocument.create("TestKey", 5000000, "Test");
bucket.upsert(doc);
Thread.sleep(50000);
// during the sleep I take down 192.168.1.10, then try to read the entry back
SerializableDocument loaded = bucket.get("TestKey", SerializableDocument.class); // this call throws the exception

After one node goes down, it throws java.lang.RuntimeException: java.util.concurrent.TimeoutException and the log shows continuous timeout errors.
This should not happen. If one node is down, it should return the value from another node.

Actually, it shouldn’t. It’s specifically designed not to automatically read from a replica unless there is a failover. Please see the documentation on reading from replicas.
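
For reference, an explicit replica read in the 2.x SDK looks roughly like this sketch (the class and method names are just for illustration; reading from a replica is a deliberate trade-off, since it can return stale data):

import java.util.List;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.ReplicaMode;
import com.couchbase.client.java.document.JsonDocument;

public final class ReplicaReads {

    // Read the active copy; only on failure fall back to the replicas.
    public static JsonDocument getWithReplicaFallback(Bucket bucket, String id) {
        try {
            return bucket.get(id);
        } catch (RuntimeException e) {
            // ReplicaMode.ALL polls the active copy plus all configured replicas.
            List<JsonDocument> copies = bucket.getFromReplica(id, ReplicaMode.ALL);
            return copies.isEmpty() ? null : copies.get(0);
        }
    }
}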

Also, if you have a question about a scenario, it’s probably better to start a new topic rather than pick up an old loosely related one. Thanks!

This has come back and bitten us again. Googling around, it appears to be a Netty error (Couchbase bundles a repackaged Netty). I found this open Netty issue: https://github.com/netty/netty/issues/5304

We can’t reproduce this reliably, but it has recurred; restarting the application clears it.

Thanks for any help!

-H