Random ObjectThreadException errors in Couchbase Python driver when using N1QL queries from multiple threads

When using Python SDK version 3.0.1 or later, a cluster.query call fails randomly when issued from multiple threads.
The simple script below reproduces this quite easily (increase the number of threads if it does not reproduce; normally, at 10 threads, more than half of the queries fail).

COUCHBASE_URI = "changeme"
COUCHBASE_USER = "changeme"
COUCHBASE_PASS = "changeme"

import traceback
from concurrent.futures import ThreadPoolExecutor
from couchbase.cluster import Cluster, ClusterOptions
from couchbase_core.cluster import PasswordAuthenticator

cluster = Cluster(
    COUCHBASE_URI,
    ClusterOptions(PasswordAuthenticator(COUCHBASE_USER, COUCHBASE_PASS)),
)

def query():
    try:
        q = cluster.query("SELECT * FROM ['test'] as ks")
        print(list(q))
    except:
        traceback.print_exc()

pool = ThreadPoolExecutor(10)
for i in range(10):
    pool.submit(query)
pool.shutdown(wait=True)  # wait for all queries to finish before exiting

The observed traceback is quite telling:

Traceback (most recent call last):
  File "/srv/ota-lite/ota_lite/test_couchbase_lockmode.py", line 16, in query
    q = cluster.query("SELECT * FROM ['test'] as ks")
  File "/usr/lib/python3.8/site-packages/couchbase/cluster.py", line 592, in query
    return self._maybe_operate_on_an_open_bucket(CoreClient.query,
  File "/usr/lib/python3.8/site-packages/couchbase/cluster.py", line 611, in _maybe_operate_on_an_open_bucket
    if self._is_6_5_plus():
  File "/usr/lib/python3.8/site-packages/couchbase/cluster.py", line 547, in _is_6_5_plus
    response = self._admin.http_request(path="/pools").value
  File "/usr/lib/python3.8/site-packages/couchbase/management/admin.py", line 159, in http_request
    return self._http_request(type=LCB.LCB_HTTP_TYPE_MANAGEMENT,
couchbase.exceptions.ObjectThreadException: <Couldn't lock. If LOCKMODE_WAIT was passed, then this means that something has gone wrong internally. Otherwise, this means you are using the Connection object from multiple threads. This is not allowed (without an explicit lockmode=LOCKMODE_WAIT constructor argument, C Source=(src/oputil.c,661)>

The problem lies in the _is_6_5_plus check, which internally uses a shared, thread-unsafe connection to determine the Couchbase server version. A better approach would be to let the user pass the server version as a configuration option instead.

There are two workarounds:

  1. Set LOCKMODE_WAIT on the cluster object, which obviously results in a dramatic performance hit.
  2. Call CoreClient.query directly and pass a per-thread bucket object to it (this requires a dozen lines of obscure, hackish code).
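The second workaround boils down to the standard per-thread-object pattern. Here is a minimal, self-contained sketch of that pattern, where the hypothetical FakeBucket class stands in for a real per-thread bucket (actually opening a bucket and calling CoreClient.query are omitted):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class FakeBucket:
    """Hypothetical stand-in for a per-thread Couchbase bucket/CoreClient."""
    def __init__(self):
        self.owner = threading.get_ident()

_tls = threading.local()

def get_bucket():
    # Lazily create one connection object per thread, so no single
    # connection is ever shared across threads.
    if not hasattr(_tls, "bucket"):
        _tls.bucket = FakeBucket()
    return _tls.bucket

def query(results):
    bucket = get_bucket()
    # In the real workaround this is where CoreClient.query would be
    # invoked with the per-thread bucket.
    results.append((threading.get_ident(), id(bucket)))

results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(16):
        pool.submit(query, results)

# Each worker thread ends up with exactly one bucket object of its own.
per_thread = {}
for tid, bucket_id in results:
    per_thread.setdefault(tid, set()).add(bucket_id)
assert all(len(buckets) == 1 for buckets in per_thread.values())
```

Because every thread only ever touches its own object, no lockmode tricks are needed; the cost is one open connection per worker thread.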

At the moment we’ve adopted the second workaround in our code, though it would be nice if the Couchbase maintainers fixed the Python SDK itself.

It looks like this bug was introduced while fixing an even worse bug: https://issues.couchbase.com/browse/CCBC-1204.
While that fix may be a relief for single-threaded applications, it doesn’t really help multi-threaded ones, since it introduced the more dangerous traceback shown above. Moreover, after investigating the code a little, it seems that this fix might cause normal KV operations to fail randomly when an application uses both KV and N1QL (and we’ve seen quite a few such failures in our application logs).

Hello @vkhoroz, first of all thank you very much for looking into what is potentially causing this defect.
@jcasey can you look at this ?

Also @vkhoroz, just for your awareness, there is an async Couchbase Python library (acouchbase) that you can use for async operations.

Yes, this is an issue that we are tracking; it has a dependency on LCB (libcouchbase).

In the interim, if possible, you might look into the acouchbase API. Docs on querying can be found here.
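The reason the async API sidesteps this bug is that all queries run concurrently on a single event loop in one thread, so no connection object is ever touched from two threads at once. A minimal sketch of that pattern, with a hypothetical fake_query coroutine standing in for an acouchbase cluster.query call (the real call would await rows from the server):

```python
import asyncio

async def fake_query(n):
    # Hypothetical stand-in for an acouchbase query; the real coroutine
    # would await rows from the server instead of sleeping.
    await asyncio.sleep(0.01)
    return f"row-{n}"

async def main():
    # All ten "queries" run concurrently on one event loop and one
    # thread, so no locking (and no LOCKMODE_WAIT) is needed.
    return await asyncio.gather(*(fake_query(i) for i in range(10)))

rows = asyncio.run(main())
```

With the real acouchbase API the structure is the same: gather the query coroutines instead of submitting jobs to a thread pool.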