Hi,
situation:
- 3-node cluster with server 2.1.1
- client libcouchbase 2.7.3 (and 2.7.4)
I simplified the tests down to running “cbc cat” on the master node. The applications that use the library behave the same way.
#Case 1: FAIL
With 2.7.3 (2.7.4 gives essentially the same output):
# cbc-cat MY_KEY -U "couchbase://couch2.mydomain.com/session?dnssrv=off&detailed_errcodes=on&bootstrap_on=http" -v
0ms [I0] {23898} [INFO] (instance - L:402) Version=2.7.3, Changeset=095afbb1a83bfef8def6d50bf50ee494b6c3c67f
0ms [I0] {23898} [INFO] (instance - L:403) Effective connection string: couchbase://couch2.mydomain.com/session?dnssrv=off&detailed_errcodes=on&bootstrap_on=http&console_log_level=2&. Bucket=session
0ms [I0] {23898} [INFO] (connection - L:458) <couch2.mydomain.com:8091> (SOCK=0x17711c0) Starting. Timeout=2000000us
3ms [I0] {23898} [INFO] (connection - L:130) <couch2.mydomain.com:8091> (SOCK=0x17711c0) Connected established
4ms [I0] {23898} [WARN] (htconfig - L:128) <couch2.mydomain.com:8091> Got 404 on config stream. Assuming terse URI not supported on cluster
36ms [I0] {23898} [INFO] (confmon - L:153) Setting new configuration. Received via HTTP
36ms [I0] {23898} [INFO] (connection - L:458) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Starting. Timeout=2500000us
37ms [I0] {23898} [INFO] (connection - L:130) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Connected established
38ms [I0] {23898} [ERROR] (negotiation - L:131) <couch2.mydomain.com:11210> (SASLREQ=0x178b000) Error: 0x2d, IO Error
38ms [I0] {23898} [ERROR] (server - L:467) <NOHOST:NOPORT> (SRV=0x17738f0,IX=0) Connection attempt failed. Received LCB_ESOCKSHUTDOWN (0x2D) from libcouchbase, received 0 from operating system
38ms [I0] {23898} [INFO] (bootstrap - L:164) Not requesting a config refresh because of throttling parameters. Next refresh possible in 9998ms or 99 errors. See LCB_CNTL_CONFDELAY_THRESH and LCB_CNTL_CONFERRTHRESH to modify the throttling settings
53ms [I0] {23898} [INFO] (connection - L:458) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Starting. Timeout=2500000us
54ms [I0] {23898} [INFO] (connection - L:130) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Connected established
54ms [I0] {23898} [ERROR] (negotiation - L:131) <couch2.mydomain.com:11210> (SASLREQ=0x178afb0) Error: 0x2d, IO Error
54ms [I0] {23898} [ERROR] (server - L:467) <NOHOST:NOPORT> (SRV=0x17738f0,IX=0) Connection attempt failed. Received LCB_ESOCKSHUTDOWN (0x2D) from libcouchbase, received 0 from operating system
54ms [I0] {23898} [INFO] (bootstrap - L:164) Not requesting a config refresh because of throttling parameters. Next refresh possible in 9981ms or 98 errors. See LCB_CNTL_CONFDELAY_THRESH and LCB_CNTL_CONFERRTHRESH to modify the throttling settings
84ms [I0] {23898} [INFO] (connection - L:458) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Starting. Timeout=2500000us
85ms [I0] {23898} [INFO] (connection - L:130) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Connected established
86ms [I0] {23898} [ERROR] (negotiation - L:131) <couch2.mydomain.com:11210> (SASLREQ=0x17731a0) Error: 0x2d, IO Error
86ms [I0] {23898} [ERROR] (server - L:467) <NOHOST:NOPORT> (SRV=0x17738f0,IX=0) Connection attempt failed. Received LCB_ESOCKSHUTDOWN (0x2D) from libcouchbase, received 0 from operating system
86ms [I0] {23898} [INFO] (bootstrap - L:164) Not requesting a config refresh because of throttling parameters. Next refresh possible in 9950ms or 97 errors. See LCB_CNTL_CONFDELAY_THRESH and LCB_CNTL_CONFERRTHRESH to modify the throttling settings
131ms [I0] {23898} [INFO] (connection - L:458) <couch2.mydomain.com:11210> (SOCK=0x178aef0) Starting. Timeout=2500000us
...
2373ms [I0] {23898} [INFO] (bootstrap - L:164) Not requesting a config refresh because of throttling parameters. Next refresh possible in 7662ms or 82 errors. See LCB_CNTL_CONFDELAY_THRESH and LCB_CNTL_CONFERRTHRESH to modify the throttling settings
2536ms [I0] {23898} [WARN] (retryq - L:142) Failing command (seq=0) from retry queue with error code 0x2d
MY_KEY The remote host closed the connection (0x2d)
#Case 2: OK
With 2.7.1, the connection works (same cluster, same key; I removed the options that 2.7.1 does not support from the connection string):
# cbc-cat MY_KEY -U "couchbase://couch2.mydomain.com/session?bootstrap_on=http" -v
0ms [I0] {1570} [INFO] (instance - L:401) Version=2.7.1, Changeset=63c54b1b63db427ab3cf9a131c08ad591044b798
0ms [I0] {1570} [INFO] (instance - L:402) Effective connection string: couchbase://couch2.mydomain.com/session?bootstrap_on=http&console_log_level=2&. Bucket=session
4ms [I0] {1570} [INFO] (instance - L:135) DNS SRV lookup failed: DNS/Hostname lookup failed
4ms [I0] {1570} [INFO] (connection - L:450) <couch2.mydomain.com:8091> (SOCK=0x79bbb0) Starting. Timeout=2000000us
6ms [I0] {1570} [INFO] (connection - L:116) <couch2.mydomain.com:8091> (SOCK=0x79bbb0) Connected
184ms [I0] {1570} [WARN] (htconfig - L:128) <couch2.mydomain.com:8091> Got 404 on config stream. Assuming terse URI not supported on cluster
216ms [I0] {1570} [INFO] (confmon - L:153) Setting new configuration. Received via HTTP
216ms [I0] {1570} [INFO] (connection - L:450) <couch2.mydomain.com:11210> (SOCK=0x7af3c0) Starting. Timeout=2500000us
217ms [I0] {1570} [INFO] (connection - L:116) <couch2.mydomain.com:11210> (SOCK=0x7af3c0) Connected
MY_KEY CAS=0x17e673a1ea37234, Flags=0x3000002. Size=3
bla
#Case 3: OK
With 2.7.3 against a single-server instance (also 2.1.1) that is configured with an IP address rather than a full domain name, the connection works:
# cbc-cat MY_KEY -U "couchbase://127.0.0.1/default?dnssrv=off&detailed_errcodes=on&bootstrap_on=http" -v
0ms [I0] {10566} [INFO] (instance - L:402) Version=2.7.3, Changeset=095afbb1a83bfef8def6d50bf50ee494b6c3c67f
0ms [I0] {10566} [INFO] (instance - L:403) Effective connection string: couchbase://127.0.0.1/default?dnssrv=off&detailed_errcodes=on&bootstrap_on=http&console_log_level=2&. Bucket=default
0ms [I0] {10566} [INFO] (connection - L:458) <127.0.0.1:8091> (SOCK=0x1919120) Starting. Timeout=2000000us
0ms [I0] {10566} [INFO] (connection - L:130) <127.0.0.1:8091> (SOCK=0x1919120) Connected established
1ms [I0] {10566} [WARN] (htconfig - L:128) <127.0.0.1:8091> Got 404 on config stream. Assuming terse URI not supported on cluster
42ms [I0] {10566} [INFO] (confmon - L:153) Setting new configuration. Received via HTTP
42ms [I0] {10566} [INFO] (connection - L:458) <127.0.0.1:11210> (SOCK=0x1930930) Starting. Timeout=2500000us
42ms [I0] {10566} [INFO] (connection - L:130) <127.0.0.1:11210> (SOCK=0x1930930) Connected established
MY_KEY CAS=0x2a29017b986, Flags=0x0. Size=3
bla
I MUST use “dnssrv=off” here, or the bootstrap fails.
remarks:
For case 1:
- “dnssrv=off” makes no difference.
- Using an IP address in the connection string makes no difference: once bootstrapped, a full DNS name is used.
- Removing the options to match case 2 reduces the amount of information printed, but does not make the connection work.
- Forcing DNS SRV (couchbase+dnssrv://…, with dnssrv=off removed) does not work: we get the message “Not scheduling HTTP provider since no nodes have been configured for HTTP bootstrap”, which is strange, because without it the bootstrap works over HTTP.
- And sorry, I would really prefer not to upgrade from 2.1.1 right now.
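For anyone who wants to repeat these combinations systematically, here is a minimal sketch (Python, standard library only) that builds the connection-string variants tried above. The helper names `connection_string` and `variants` are made up for illustration, and 192.0.2.10 is a placeholder IP address, not a real node; only the options already shown in this post are used.

```python
from itertools import product
from urllib.parse import urlencode


def connection_string(host, bucket, options):
    """Build couchbase://host/bucket?key=value&... from a dict of options."""
    base = "couchbase://%s/%s" % (host, bucket)
    query = urlencode(options)  # keeps the insertion order of the dict
    return base + "?" + query if query else base


def variants(hosts, bucket, option_sets):
    """Return one connection string per (host, option set) combination."""
    return [connection_string(h, bucket, o) for h, o in product(hosts, option_sets)]


# Hostname from case 1 plus a placeholder IP (hypothetical address).
HOSTS = ["couch2.mydomain.com", "192.0.2.10"]
OPTION_SETS = [
    {"dnssrv": "off", "detailed_errcodes": "on", "bootstrap_on": "http"},
    {"bootstrap_on": "http"},
]

for conn in variants(HOSTS, "session", OPTION_SETS):
    # Print the command to run by hand; nothing is executed here.
    print('cbc-cat MY_KEY -U "%s" -v' % conn)
```

Each printed line can then be run against the cluster and the outputs compared side by side.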
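#Preliminary conclusion:
The cause may lie in the DNS lookups: the only setup where 2.7.3 works (case 3) is the one where the server is configured with an IP address instead of a DNS name.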
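One way to test this hypothesis from the client machine is to check how the node names resolve there. Below is a minimal sketch, assuming Python 3 is available on the client; `resolve_hosts` is a hypothetical helper name, and the `resolver` parameter exists only so the function can be exercised without real DNS. It checks plain forward lookups only, not SRV records, and it does not reproduce what libcouchbase does internally.

```python
import socket


def resolve_hosts(hosts, port=11210, resolver=socket.getaddrinfo):
    """Map each host to its sorted list of addresses, or to an error string."""
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, port, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address for both IPv4 and IPv6.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[host] = "lookup failed: %s" % exc
    return results
```

Calling `resolve_hosts(["couch2.mydomain.com"])` on the client, with the list extended to every node name that appears in the cluster configuration, would show whether each name resolves and to which address. Port 11210 is the data port seen in the logs above.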