We have a Spring Boot application that uses the Couchbase Java SDK client 3.0.4 against a Couchbase Server 6.5 cluster running inside Kubernetes.
We are trying to execute 50 HTTP requests per second across 4 pods against 6 Couchbase nodes (3 data and 3 query) holding 500,000 items, and 50% of the requests take longer than 5 s to execute.
The main error is:
com.couchbase.client.core.error.AmbiguousTimeoutException: QueryRequest
at com.couchbase.client.java.AsyncUtils.block(AsyncUtils.java:51)
at com.couchbase.client.java.Cluster.query(Cluster.java:393)
at com.carrefour.fr.cs.slot.infra.repository.Couchbase.query(Couchbase.java:38)
More detail:
com.couchbase.client.core.error.AmbiguousTimeoutException: QueryRequest {"cancelled":true,"completed":true,"coreId":"0x57cfd6b600000001","idempotent":false,"reason":"TIMEOUT","requestId":2210,"requestType":"QueryRequest","retried":12,"retryReasons":["ENDPOINT_NOT_AVAILABLE","ENDPOINT_TEMPORARILY_NOT_AVAILABLE"],"service":{"operationId":"24d826b0-0574-4de5-86ed-3b46a13a3c9e","statement":"select agendaType from slots where type = 'AGENDA' and metiCode = $metiCode","type":"query"},"timeoutMs":10000,"timings":{"totalMicros":10504532}}] with root cause
We have checked the indexes and everything looks fine.
The server did not capture any slow queries.
Same here. I tried SDK 3.0.6 and 3.0.8 using the reactive API against Couchbase Server 6.5.1, and got this exception on upsert and mutateIn. The only way to continue upserting data is to restart the container.
I also sometimes face the same issue with SDK 3.0.9 and Couchbase Server 6.5.1. When my application restarts, all replace/upsert operations start throwing AmbiguousTimeoutException repeatedly. The strange thing is that N1QL queries run successfully. The replace/upsert operations only become stable after restarting the application again.
We experienced a similar issue.
In our case, we ultimately concluded that the issue was caused by the Java service running out of heap memory, which was completely unexpected. We thought it was a problem with the Couchbase server, but it turned out that GC pauses on the client were causing the delay.
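If you want to rule out client-side heap or GC pressure before looking at the server, a quick check along these lines may help. This is only a minimal sketch using the standard JMX beans in the JDK; the class name GcPressureCheck and both helper methods are made up for illustration and are not part of the Couchbase SDK.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the Couchbase SDK.
class GcPressureCheck {

    /** Fraction of the maximum heap currently in use, or -1 if the JVM reports no maximum. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Accumulated GC time in milliseconds since JVM start, summed over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if this collector does not report time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double fraction = heapUsedFraction();
        String heap = fraction < 0 ? "unknown" : String.format("%.1f%%", fraction * 100);
        System.out.println("heap used: " + heap + ", total GC time: " + totalGcTimeMillis() + " ms");
    }
}
```

Logging these two values periodically (or just enabling GC logging on the JVM) should show whether heap usage sits close to the maximum and whether GC time climbs sharply around the moments the timeouts appear.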
ENDPOINT_NOT_AVAILABLE on startup of an application against a healthy cluster usually indicates that the SDK has not had enough time to complete initialization before requests were made (initialization is asynchronous). The SDK method waitUntilReady can be called by the application to wait asynchronously for initialization to complete before proceeding to send requests.
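To illustrate the idea of holding back requests until initialization has finished, here is a small sketch that uses only the JDK. The StartupGate class and its methods are hypothetical stand-ins; with the Couchbase Java SDK 3.x the corresponding step would be a call such as cluster.waitUntilReady(Duration.ofSeconds(10)) right after Cluster.connect(...), before the application starts serving traffic.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for an asynchronously initializing client.
class StartupGate {
    // Completed once the asynchronous initialization has finished.
    private final CompletableFuture<Void> ready = new CompletableFuture<>();

    /** Called by the initialization code once the client is usable. */
    void markReady() {
        ready.complete(null);
    }

    /** Returns true once markReady() has been called. */
    boolean isReady() {
        return ready.isDone();
    }

    /** Blocks until initialization completes, or throws TimeoutException after the timeout. */
    void awaitReady(Duration timeout) throws TimeoutException, InterruptedException {
        try {
            ready.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            throw new IllegalStateException("initialization failed", e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        StartupGate gate = new StartupGate();
        // Simulate asynchronous initialization that takes about 200 ms.
        CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            gate.markReady();
        });
        // With the real SDK this is where cluster.waitUntilReady(...) would go.
        gate.awaitReady(Duration.ofSeconds(5));
        System.out.println("ready: " + gate.isReady());
    }
}
```

The point is the ordering: the wait happens once at startup, and only after it returns does the application accept requests. If the wait times out, failing fast at startup is usually easier to diagnose than a stream of request timeouts later.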