Hi,
we have an issue where our NodeJS clients are receiving Server timeouts every half a minute. Requests are running well, but then fail for a few seconds.
Our setup:
Server side:
4.0.0-4051 Community Edition (build-4051)
Debian Wheezy
no replicas
XDCR, one server per location
Client side:
Node v4.4.1
couchnode: 2.14.
I did run the application with LCB_LOGLEVEL=5 and the initial output looks like this:
0ms [I0] {21516} [INFO] (instance - L:385) Version=2.5.6, Changeset=fa6ce043d715a3b0fd25712673c4763f782b08ee
0ms [I0] {21516} [INFO] (instance - L:386) Effective connection string: couchbase://10.240.0.9/default. Bucket=default
0ms [I0] {21516} [DEBUG] (instance - L:65) Adding host 10.240.0.9:8091 to initial HTTP bootstrap list
0ms [I0] {21516} [DEBUG] (instance - L:65) Adding host 10.240.0.9:11210 to initial CCCP bootstrap list
2ms [I0] {21516} [DEBUG] (confmon - L:89) Preparing providers (this may be called multiple times)
2ms [I0] {21516} [DEBUG] (confmon - L:99) Provider FILE is DISABLED
2ms [I0] {21516} [DEBUG] (confmon - L:96) Provider CCCP is ENABLED
2ms [I0] {21516} [DEBUG] (confmon - L:96) Provider HTTP is ENABLED
2ms [I0] {21516} [DEBUG] (confmon - L:99) Provider MCRAW is DISABLED
2ms [I0] {21516} [TRACE] (confmon - L:292) Start refresh requested
3ms [I0] {21516} [TRACE] (confmon - L:271) Current provider is CCCP
3ms [I0] {21516} [INFO] (cccp - L:118) Requesting connection to node 10.240.0.9:11210 for CCCP configuration
3ms [I0] {21516} [DEBUG] (lcbio_mgr - L:416) <10.240.0.9:11210> (HE=0x35cac70) Creating new connection because none are available in the pool
3ms [I0] {21516} [DEBUG] (lcbio_mgr - L:321) <10.240.0.9:11210> (HE=0x35cac70) Starting connection on I=0x356f1b0
3ms [I0] {21516} [INFO] (connection - L:450) <10.240.0.9:11210> (SOCK=0x356f330) Starting. Timeout=2000000us
3ms [I0] {21516} [TRACE] (connection - L:344) <10.240.0.9:11210> (SOCK=0x356f330) Received completion handler. Status=0. errno=0
3ms [I0] {21516} [INFO] (connection - L:116) <10.240.0.9:11210> (SOCK=0x356f330) Connected
3ms [I0] {21516} [WARN] (lcbio - L:103) FIXME: Unknown iops/os error code 95. Using NETWORK_ERROR
3ms [I0] {21516} [INFO] (lcbio - L:112) Translating errno=95, lcb=0x10 to NETWORK_ERROR
3ms [I0] {21516} [INFO] (connection - L:121) <10.240.0.9:11210> (SOCK=0x356f330) Couldn’t set TCP_NODELAY
3ms [I0] {21516} [DEBUG] (lcbio_mgr - L:271) <10.240.0.9:11210> (HE=0x35cac70) Received result for I=0x356f1b0,C=0x356f330; E=0x0
3ms [I0] {21516} [DEBUG] (lcbio_mgr - L:223) <10.240.0.9:11210> (HE=0x35cac70) Assigning R=0x34b5ed0 SOCKET=0x356f330
3ms [I0] {21516} [DEBUG] (ioctx - L:101) <10.240.0.9:11210> (CTX=0x35cb8c0,unknown) Pairing with SOCK=0x356f330
4ms [I0] {21516} [DEBUG] (negotiation - L:355) <10.240.0.9:11210> (SASLREQ=0x35630f0) Found feature 0x3 (TCP NODELAY)
4ms [I0] {21516} [DEBUG] (ioctx - L:151) <10.240.0.9:11210> (CTX=0x35cb8c0,sasl) Destroying. PND=0,ENT=1,SORC=1
4ms [I0] {21516} [DEBUG] (ioctx - L:101) <10.240.0.9:11210> (CTX=0x35cbae0,unknown) Pairing with SOCK=0x356f330
4ms [I0] {21516} [DEBUG] (ioctx - L:151) <10.240.0.9:11210> (CTX=0x35cbae0,bc_cccp) Destroying. PND=0,ENT=1,SORC=1
4ms [I0] {21516} [INFO] (lcbio_mgr - L:491) <10.240.0.9:11210> (HE=0x35cac70) Placing socket back into the pool. I=0x356f1b0,C=0x356f330
4ms [I0] {21516} [INFO] (confmon - L:174) Setting new configuration. Received via CCCP
4ms [I0] {21516} [DEBUG] (bootstrap - L:55) Instance configured!
4ms [I0] {21516} [DEBUG] (confmon - L:89) Preparing providers (this may be called multiple times)
4ms [I0] {21516} [DEBUG] (confmon - L:99) Provider FILE is DISABLED
4ms [I0] {21516} [DEBUG] (confmon - L:96) Provider CCCP is ENABLED
4ms [I0] {21516} [DEBUG] (confmon - L:99) Provider HTTP is DISABLED
4ms [I0] {21516} [DEBUG] (confmon - L:99) Provider MCRAW is DISABLED
10004ms [I0] {21516} [DEBUG] (lcbio_mgr - L:464) <10.240.0.9:11210> (HE=0x35cac70) Idle connection expired
ready
10527ms [I0] {21516} [DEBUG] (lcbio_mgr - L:416) <10.240.0.9:11210> (HE=0x35cac70) Creating new connection because none are available in the pool
10527ms [I0] {21516} [DEBUG] (lcbio_mgr - L:321) <10.240.0.9:11210> (HE=0x35cac70) Starting connection on I=0x356f010
10527ms [I0] {21516} [INFO] (connection - L:450) <10.240.0.9:11210> (SOCK=0x356f070) Starting. Timeout=2500000us
10528ms [I0] {21516} [TRACE] (connection - L:344) <10.240.0.9:11210> (SOCK=0x356f070) Received completion handler. Status=0. errno=0
10528ms [I0] {21516} [INFO] (connection - L:116) <10.240.0.9:11210> (SOCK=0x356f070) Connected
10528ms [I0] {21516} [WARN] (lcbio - L:103) FIXME: Unknown iops/os error code 95. Using NETWORK_ERROR
10528ms [I0] {21516} [INFO] (lcbio - L:112) Translating errno=95, lcb=0x10 to NETWORK_ERROR
10528ms [I0] {21516} [INFO] (connection - L:121) <10.240.0.9:11210> (SOCK=0x356f070) Couldn’t set TCP_NODELAY
10528ms [I0] {21516} [DEBUG] (lcbio_mgr - L:271) <10.240.0.9:11210> (HE=0x35cac70) Received result for I=0x356f010,C=0x356f070; E=0x0
10528ms [I0] {21516} [DEBUG] (lcbio_mgr - L:223) <10.240.0.9:11210> (HE=0x35cac70) Assigning R=0x36021a0 SOCKET=0x356f070
10528ms [I0] {21516} [TRACE] (server - L:522) <10.240.0.9:11210> (SRV=0x35cb800) Session not yet negotiated. Negotiating
10528ms [I0] {21516} [DEBUG] (ioctx - L:101) <10.240.0.9:11210> (CTX=0x351cb80,unknown) Pairing with SOCK=0x356f070
10529ms [I0] {21516} [DEBUG] (negotiation - L:355) <10.240.0.9:11210> (SASLREQ=0x356f290) Found feature 0x3 (TCP NODELAY)
10529ms [I0] {21516} [DEBUG] (ioctx - L:151) <10.240.0.9:11210> (CTX=0x351cb80,sasl) Destroying. PND=0,ENT=1,SORC=1
10529ms [I0] {21516} [DEBUG] (ioctx - L:101) <10.240.0.9:11210> (CTX=0x3578880,unknown) Pairing with SOCK=0x356f070
10529ms [I0] {21516} [DEBUG] (server - L:543) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Setting initial timeout=2498ms
Afterwards the application is running fine until these message pop up and requests timeout:
246856ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890920, opaque=42726, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890970, opaque=42727, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x38909c0, opaque=42728, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890a10, opaque=42729, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890a60, opaque=42730, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890ab0, opaque=42731, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890b00, opaque=42732, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890b50, opaque=42733, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890ba0, opaque=42734, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890bf0, opaque=42735, opcode=0x0) with error 0x17
246857ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3890c40, opaque=42736, opcode=0x0) with error 0x17
246857ms [I0] {21516} [TRACE] (confmon - L:292) Start refresh requested
246857ms [I0] {21516} [ERROR] (server - L:463) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Server timed out. Some commands have failed
246857ms [I0] {21516} [DEBUG] (server - L:467) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Scheduling next timeout for 711 ms
246857ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
246857ms [I0] {21516} [TRACE] (confmon - L:271) Current provider is CCCP
246857ms [I0] {21516} [INFO] (cccp - L:110) Re-Issuing CCCP Command on server struct 0x35cb800 (10.240.0.9:11210)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42726)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42727)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42728)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42729)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42730)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42731)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42732)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x1, SEQ=42733)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42734)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42735)
246858ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=42736)
247393ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
or
375263ms [I0] {21516} [DEBUG] (server - L:467) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Scheduling next timeout for 2500 ms
378064ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3ef7790, opaque=80545, opcode=0x0) with error 0x17
378064ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3ef77e0, opaque=80546, opcode=0x0) with error 0x17
378065ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3ef7830, opaque=80547, opcode=0x0) with error 0x17
378065ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3ef7880, opaque=80548, opcode=0x0) with error 0x17
378065ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x3ef78d0, opaque=80549, opcode=0x0) with error 0x17
378065ms [I0] {21516} [TRACE] (confmon - L:292) Start refresh requested
378065ms [I0] {21516} [ERROR] (server - L:463) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Server timed out. Some commands have failed
378065ms [I0] {21516} [DEBUG] (server - L:467) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Scheduling next timeout for 4 ms
378066ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
378066ms [I0] {21516} [TRACE] (confmon - L:271) Current provider is CCCP
378066ms [I0] {21516} [INFO] (cccp - L:110) Re-Issuing CCCP Command on server struct 0x35cb800 (10.240.0.9:11210)
378067ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=80545)
378067ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=80546)
378067ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=80547)
378067ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=80548)
378067ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=80549)
378069ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
or
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409eb40, opaque=114050, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409eb90, opaque=114051, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ebe0, opaque=114052, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ec30, opaque=114053, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ec80, opaque=114054, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ecd0, opaque=114055, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ed20, opaque=114056, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ed70, opaque=114057, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409edc0, opaque=114058, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ee10, opaque=114059, opcode=0x0) with error 0x17
519179ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409ee60, opaque=114060, opcode=0x0) with error 0x17
519180ms [I0] {21516} [WARN] (server - L:380) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Failing command (pkt=0x409eeb0, opaque=114061, opcode=0x0) with error 0x17
519180ms [I0] {21516} [TRACE] (confmon - L:292) Start refresh requested
519180ms [I0] {21516} [ERROR] (server - L:463) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Server timed out. Some commands have failed
519180ms [I0] {21516} [DEBUG] (server - L:467) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Scheduling next timeout for 2500 ms
519180ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
519180ms [I0] {21516} [TRACE] (confmon - L:271) Current provider is CCCP
519180ms [I0] {21516} [INFO] (cccp - L:110) Re-Issuing CCCP Command on server struct 0x35cb800 (10.240.0.9:11210)
519180ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114050)
519180ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114051)
519180ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114052)
519180ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114053)
521314ms [I0] {21516} [ERROR] (cccp - L:133) NOHOST:NOPORT Got I/O Error=0x17
521314ms [I0] {21516} [INFO] (confmon - L:202) Provider ‘CCCP’ failed
521314ms [I0] {21516} [TRACE] (confmon - L:226) Maximum provider reached. Resetting index
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114054)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114055)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114056)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114057)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114058)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114059)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114060)
521315ms [I0] {21516} [WARN] (server - L:240) <10.240.0.9:11210> (SRV=0x35cb800,IX=0) Found stale packet (OP=0x0, RC=0x0, SEQ=114061)
521316ms [I0] {21516} [INFO] (confmon - L:166) Not applying configuration received via CCCP. No changes detected. A.rev=33, B.rev=33
Hope you could help me on this. Did a lot of research but could not find any solution.
Thanks,
Jens