Got LiteCore error: Connection reset by peer

We are trying to use Couchbase Lite Android 2.1.0 and Couchbase Lite iOS 2.0.3 with Sync Gateway 2.1.1 and Couchbase Server 4.6. **Everything seems perfectly fine on iOS**, but on Android, when we start the replication, the WebSocket keeps closing. Below are the logs we were able to pull. Please help me understand and fix the issue.

`
11-13 17:40:53.960 32242-32242/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}]: Starting

11-13 17:40:53.965 32242-32242/com.example.apple.couchbasesample I/LiteCore [Actor]: Starting Scheduler<0xeadb4628> with 8 threads

11-13 17:40:53.966 32242-32242/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1}==> N8litecore4repl10ReplicatorE /data/user/0/com.example.apple.couchbasesample/files/my-database.cblite2/ ->ws://syncgateway.myapp.co/app-dev/_blipsync

11-13 17:40:53.967 32242-32242/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} Push=one-shot, Pull=one-shot, Options={{auth:{password:"********", type:“Basic”, username:“admin”}, headers:{User-Agent:“CouchbaseLite/2.1.0-176 (Java; Android 8.1.0; SM-T595) Build/0 Commit/a38462c LiteCore/ (176)”}}}

11-13 17:40:53.967 32242-32290/com.example.apple.couchbasesample I/LiteCoreJNI: socket_open() socket -> 0xeadc56e0 socketFactoryContext -> 0xd801827

11-13 17:40:53.967 32242-32290/com.example.apple.couchbasesample W/C4Socket: C4Socket.open() socket -> 3940308704

11-13 17:40:53.968 32242-32290/com.example.apple.couchbasesample W/C4Socket: C4Socket.open() clazz -> com.couchbase.lite.internal.replicator.CBLWebSocket

11-13 17:40:53.970 32242-32242/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}] is connecting, progress 0/0, error: null
C4ReplicatorListener.statusChanged() status -> C4ReplicatorStatus{activityLevel=2, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=0, errorCode=0, errorInternalInfo=0}

11-13 17:40:53.972 32242-32296/com.example.apple.couchbasesample I/Sync: statusChanged() c4Status -> C4ReplicatorStatus{activityLevel=2, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=0, errorCode=0, errorInternalInfo=0}

11-13 17:40:53.973 32242-32290/com.example.apple.couchbasesample E/WS: CBLWebSocket.socket_open()

11-13 17:40:53.974 32242-32296/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}] is connecting, progress 0/0, error: null

11-13 17:40:54.019 32242-32297/com.example.apple.couchbasesample D/OpenGLRenderer: HWUI GL Pipeline

11-13 17:40:54.028 32242-32242/com.example.apple.couchbasesample D/InputTransport: Input channel constructed: fd=83

11-13 17:40:54.029 32242-32242/com.example.apple.couchbasesample D/ViewRootImpl@acc0fc3[MainActivity]: setView = DecorView@26fdb79[MainActivity] TM=true MM=false

11-13 17:40:54.040 32242-32290/com.example.apple.couchbasesample D/NetworkSecurityConfig: No Network Security Config specified, using platform default

11-13 17:40:54.049 32242-32242/com.example.apple.couchbasesample D/ViewRootImpl@acc0fc3[MainActivity]: dispatchAttachedToWindow

11-13 17:40:54.079 32242-32290/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} activityLevel=connecting: connectionState=1

11-13 17:40:54.081 32242-32288/com.example.apple.couchbasesample I/LiteCore [Sync]: {DBWorker#2}==> N8litecore4repl8DBWorkerE ->ws://syncgateway.myapp.co/app-dev/_blipsync
{DBWorker#2} activityLevel=idle: pendingResponseCount=0, eventCount=1

11-13 17:40:54.081 32242-32289/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} No local checkpoint ‘cp-JZqS6npdiHxtkunu+WVyZl3Y0Dg=’
{Repl#1} activityLevel=connecting: connectionState=1

11-13 17:40:54.081 32242-32291/com.example.apple.couchbasesample I/LiteCore [Sync]: {Pull#3}==> N8litecore4repl6PullerE ->ws://syncgateway.myapp.co/app-dev/_blipsync
{Pull#3} activityLevel=busy: pendingResponseCount=0, _caughtUp=0, _waitingForChangesCallback=0, _pendingRevMessages=0, _activeIncomingRevs=0

11-13 17:40:54.081 32242-32289/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} pushStatus=busy, pullStatus=busy, dbStatus=idle, progress=0/0
{Repl#1} activityLevel=connecting: connectionState=1
{Repl#1} pushStatus=busy, pullStatus=busy, dbStatus=idle, progress=0/0
{Repl#1} activityLevel=connecting: connectionState=1

11-13 17:40:54.100 32242-32242/com.example.apple.couchbasesample V/Surface: sf_framedrop debug : 0x4f4c, game : false, logging : 0

11-13 17:40:54.105 32242-32242/com.example.apple.couchbasesample D/ViewRootImpl@acc0fc3[MainActivity]: Relayout returned: old=[0,0][0,0] new=[0,0][1920,1200] result=0x7 surface={valid=true 3453126656} changed=true

11-13 17:40:54.107 32242-32297/com.example.apple.couchbasesample I/Adreno: QUALCOMM build : 160a517, I47625b5b56
Build Date : 06/13/18
OpenGL ES Shader Compiler Version: EV031.22.00.01_06_07
Local Branch :
Remote Branch :
Remote Branch :
Reconstruct Branch :

11-13 17:40:54.114 32242-32297/com.example.apple.couchbasesample I/Adreno: PFP: 0x005ff087, ME: 0x005ff063

11-13 17:40:54.121 32242-32297/com.example.apple.couchbasesample I/zygote: android::hardware::configstore::V1_0::ISurfaceFlingerConfigs::hasWideColorDisplay retrieved: 0

11-13 17:40:54.122 32242-32297/com.example.apple.couchbasesample I/OpenGLRenderer: Initialized EGL, version 1.4

11-13 17:40:54.122 32242-32297/com.example.apple.couchbasesample D/OpenGLRenderer: Swap behavior 2

11-13 17:40:54.128 32242-32297/com.example.apple.couchbasesample D/libGLESv1: STS_GLApi : DTS, ODTC are not allowed for Package : com.example.apple.couchbasesample

11-13 17:40:54.129 32242-32297/com.example.apple.couchbasesample D/OpenGLRenderer: eglCreateWindowSurface = 0xcdad6c30, 0xcdd28808

11-13 17:40:54.188 32242-32242/com.example.apple.couchbasesample D/ViewRootImpl@acc0fc3[MainActivity]: MSG_RESIZED_REPORT: frame=Rect(0, 0 - 1920, 1200) ci=Rect(0, 36 - 0, 0) vi=Rect(0, 36 - 0, 0) or=2

11-13 17:40:54.189 32242-32242/com.example.apple.couchbasesample D/ViewRootImpl@acc0fc3[MainActivity]: MSG_WINDOW_FOCUS_CHANGED 1

11-13 17:40:54.193 32242-32242/com.example.apple.couchbasesample V/InputMethodManager: Starting input: tba=android.view.inputmethod.EditorInfo@76a2104 nm : com.example.apple.couchbasesample ic=null

11-13 17:40:54.193 32242-32242/com.example.apple.couchbasesample D/InputMethodManager: startInputInner - Id : 0

11-13 17:40:54.193 32242-32242/com.example.apple.couchbasesample I/InputMethodManager: startInputInner - mService.startInputOrWindowGainedFocus

11-13 17:40:54.199 32242-32254/com.example.apple.couchbasesample D/InputTransport: Input channel constructed: fd=90

11-13 17:40:54.313 32242-32242/com.example.apple.couchbasesample V/InputMethodManager: Starting input: tba=android.view.inputmethod.EditorInfo@e48540f nm : com.example.apple.couchbasesample ic=null

11-13 17:40:54.314 32242-32242/com.example.apple.couchbasesample D/InputMethodManager: startInputInner - Id : 0

11-13 17:41:09.335 32242-32298/com.example.apple.couchbasesample W/WS: WebSocketListener.onFailure() response -> null
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:209)
at java.net.SocketInputStream.read(SocketInputStream.java:139)
at okio.Okio$2.read(Okio.java:139)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:345)
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:217)
at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:212)
at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:147)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
at java.lang.Thread.run(Thread.java:764)

11-13 17:41:09.336 32242-32298/com.example.apple.couchbasesample I/LiteCoreJNI: [NATIVE] closed() socket -> 0xeadc56e0

11-13 17:41:09.337 32242-32293/com.example.apple.couchbasesample I/LiteCoreJNI: socket_dispose() socket -> 0xeadc56e0

11-13 17:41:09.339 32242-32293/com.example.apple.couchbasesample W/C4Socket: C4Socket.dispose() handle -> 3940308704

11-13 17:41:09.340 32242-32295/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} Connection closed with errno 104: “Connection reset by peer” (state=1)

11-13 17:41:09.341 32242-32290/com.example.apple.couchbasesample I/LiteCore [Sync]: {DBWorker#2} activityLevel=idle: pendingResponseCount=0, eventCount=1

11-13 17:41:09.341 32242-32288/com.example.apple.couchbasesample I/LiteCore [Sync]: {Push#4}==> N8litecore4repl6PusherE ->ws://syncgateway.myapp.co/app-dev/_blipsync
{Push#4} activityLevel=stopped: pendingResponseCount=0, caughtUp=0, changeLists=0, revsInFlight=0, blobsInFlight=0, awaitingReply=0, revsToSend=0, pendingSequences=0
{Pull#3} activityLevel=busy: pendingResponseCount=0, _caughtUp=0, _waitingForChangesCallback=0, _pendingRevMessages=0, _activeIncomingRevs=0

11-13 17:41:09.342 32242-32295/com.example.apple.couchbasesample E/LiteCore [Sync]: {Repl#1} Got LiteCore error: Connection reset by peer (2/104)

11-13 17:41:09.342 32242-32295/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} activityLevel=stopped: connectionState=-1
{Repl#1} now stopped

11-13 17:41:09.347 32242-32295/com.example.apple.couchbasesample I/Sync: C4ReplicatorListener.statusChanged() status -> C4ReplicatorStatus{activityLevel=0, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=2, errorCode=104, errorInternalInfo=1000}

11-13 17:41:09.350 32242-32295/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#1} activityLevel=stopped: connectionState=-1

11-13 17:41:09.351 32242-32296/com.example.apple.couchbasesample I/Sync: statusChanged() c4Status -> C4ReplicatorStatus{activityLevel=0, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=2, errorCode=104, errorInternalInfo=1000}

11-13 17:41:09.354 32242-32296/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}]: Transient error (C4Error{domain=2, code=104, internalInfo=1000}); will retry in 2 sec…

11-13 17:41:09.359 32242-32296/com.example.apple.couchbasesample V/Sync: com.couchbase.lite.AndroidNetworkReachabilityManager@79cb607: startListening() registering com.couchbase.lite.AndroidNetworkReachabilityManager$NetworkReceiver@d99d334 with context android.app.Application@710f55d

11-13 17:41:09.369 32242-32242/com.example.apple.couchbasesample V/Sync: NetworkReceiver.onReceive() Online -> true

11-13 17:41:09.371 32242-32296/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}] is offline, progress 0/0, error: CouchbaseLiteException{domain=‘POSIXErrorDomain’, code=104, msg=Connection reset by peer}

11-13 17:41:09.372 32242-32242/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}]: Server may now be reachable; retrying…

11-13 17:41:09.373 32242-32242/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}]: Retrying…

11-13 17:41:09.377 32242-32242/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#5}==> N8litecore4repl10ReplicatorE /data/user/0/com.example.apple.couchbasesample/files/my-database.cblite2/ ->ws://syncgateway.myapp.co/app-dev/_blipsync
{Repl#5} Push=one-shot, Pull=one-shot, Options={{auth:{password:"********", type:“Basic”, username:“admin”}, headers:{User-Agent:“CouchbaseLite/2.1.0-176 (Java; Android 8.1.0; SM-T595) Build/0 Commit/a38462c LiteCore/ (176)”}}}

11-13 17:41:09.377 32242-32289/com.example.apple.couchbasesample I/LiteCoreJNI: socket_open() socket -> 0xeadc5ec0 socketFactoryContext -> 0xd801827

11-13 17:41:09.378 32242-32289/com.example.apple.couchbasesample W/C4Socket: C4Socket.open() socket -> 3940310720
C4Socket.open() clazz -> com.couchbase.lite.internal.replicator.CBLWebSocket

11-13 17:41:09.379 32242-32289/com.example.apple.couchbasesample E/WS: CBLWebSocket.socket_open()

11-13 17:41:09.379 32242-32242/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}] is connecting, progress 0/0, error: null

11-13 17:41:09.380 32242-32242/com.example.apple.couchbasesample I/Sync: C4ReplicatorListener.statusChanged() status -> C4ReplicatorStatus{activityLevel=2, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=0, errorCode=0, errorInternalInfo=0}

11-13 17:41:09.381 32242-32296/com.example.apple.couchbasesample I/Sync: statusChanged() c4Status -> C4ReplicatorStatus{activityLevel=2, progressUnitsCompleted=0, progressUnitsTotal=0, progressDocumentCount=0, errorDomain=0, errorCode=0, errorInternalInfo=0}

11-13 17:41:09.383 32242-32296/com.example.apple.couchbasesample I/Sync: Replicator[<-> Database@c5bfae6{name=‘my-database’} URLEndpoint{url=ws://syncgateway.myapp.co/app-dev}] is connecting, progress 0/0, error: null

11-13 17:41:09.392 32242-32289/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#5} activityLevel=connecting: connectionState=1

11-13 17:41:09.393 32242-32289/com.example.apple.couchbasesample I/LiteCore [Sync]: {DBWorker#6}==> N8litecore4repl8DBWorkerE ->ws://syncgateway.myapp.co/app-dev/_blipsync
{DBWorker#6} activityLevel=idle: pendingResponseCount=0, eventCount=1

11-13 17:41:09.394 32242-32289/com.example.apple.couchbasesample I/LiteCore [Sync]: {Repl#5} No local checkpoint ‘cp-JZqS6npdiHxtkunu+WVyZl3Y0Dg=’
{Repl#5} activityLevel=connecting: connectionState=1
{Pull#7}==> N8litecore4repl6PullerE ->ws://syncgateway.myapp.co/app-dev/_blipsync
{Pull#7} activityLevel=busy: pendingResponseCount=0, _caughtUp=0, _waitingForChangesCallback=0, _pendingRevMessages=0, _activeIncomingRevs=0
{Repl#5} pushStatus=busy, pullStatus=busy, dbStatus=idle, progress=0/0
{Repl#5} activityLevel=connecting: connectionState=1
{Repl#5} pushStatus=busy, pullStatus=busy, dbStatus=idle, progress=0/0
{Repl#5} activityLevel=connecting: connectionState=1

11-13 17:41:24.590 32242-32309/com.example.apple.couchbasesample W/WS: WebSocketListener.onFailure() response -> null
`

From the error, it seems the connection to SG was lost (java.net.SocketException: Connection reset). I couldn’t see anything specifically wrong in the log. Is it possible that there is a network connection issue with the Android emulator or device?

No @pasin, the internet connection is working perfectly fine. Also, if the network were the problem, we should see a similar error on iOS as well, but iOS is working fine.

@ajaykoppisetty Did you ever find your issue? I am having a similar problem where the replication on the device keeps getting closed with a 104 error. Sometimes it does not even get closed; it just stops replicating and then, after a while, receives the 104.

@meirrosendorff
Any hints here? What version of CBL are you using? For Android, I suspect? Logs?

@blake.meike
Thank you for the response.
I am using CBL 2.8.6 and Sync Gateway 2.7.
Yes, this is for Android.
After some investigation, I believe the 104 error is related to Android placing my app in the background and the network being disconnected. I don’t know if you have seen this before?

I seem to be having a separate (or possibly related) issue where syncing just stops on some devices. In this case the replicators seem to be sitting in the idle state without any syncing occurring. Unfortunately I don’t have any logs, as this is only being reported by some of my clients at their sites and I have yet to reproduce it locally, so I will have to ask another question once I have managed to reproduce the issue. The issue does seem to be specific to some clients, which makes me suspect it is network related, but then the replicators should be going offline, not sitting idle. There didn’t seem to be anything significant in the SG logs.
Would you have any advice on where to start investigating this?
Apologies, I know it is a vague problem. We are still trying to reproduce it; once we have, I will hopefully be able to give a more concrete example.

These are the Sync Gateway logs I have for warnings and errors.
We had confirmed issues on 31/01/2021 between 12:00 and 16:00.

sg_error.zip (836.2 KB)
sg_warn.zip (1.6 MB)

Android doesn’t really have a “background” state. It is entirely possible, though, that the OS will kill your application, while the replication is running. For Android, I highly recommend scheduled one-shot replications, instead of continuous, FWIW.
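To make the "scheduled one-shot" idea concrete, here is a minimal sketch in plain Java. It assumes a hypothetical `Runnable` that starts a one-shot replication and blocks until it finishes; the class and method names are illustrative and are not part of Couchbase Lite. A plain executor only lives as long as the app process, so on a real device a platform scheduler would probably be the better host for the same logic.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative only: runs a blocking one-shot sync task on a fixed delay. */
final class OneShotSyncScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable oneShotSync; // hypothetical: blocks until the sync ends

    OneShotSyncScheduler(Runnable oneShotSync) {
        this.oneShotSync = oneShotSync;
    }

    /** Schedules a sync now, then again `delay` after each run completes. */
    void start(long delay, TimeUnit unit) {
        executor.scheduleWithFixedDelay(this::runOnce, 0, delay, unit);
    }

    /** Runs one sync unless one is already in progress; returns true on success. */
    boolean runOnce() {
        if (!running.compareAndSet(false, true)) {
            return false; // a sync is already in flight, skip this tick
        }
        try {
            oneShotSync.run();
            return true;
        } catch (RuntimeException e) {
            // Swallow the failure so the executor keeps scheduling later runs.
            return false;
        } finally {
            running.set(false);
        }
    }

    void stop() {
        executor.shutdownNow();
    }
}
```

Catching the exception inside `runOnce` matters because `scheduleWithFixedDelay` cancels all later runs if a scheduled task throws.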

The logs you have included show the client abruptly closing a connection. Again, FWIW, this often happens when some network device between the client and the SGW doesn’t handle the WebSocket protocol correctly.

As for tracking down the issue:

  1. Get me a log. If I can’t see this, it will be much harder for me to diagnose it or be sure that I’ve fixed it. Perhaps a CustomLogger that, when enabled, logs to a remote?
  2. Try 3.0. I’ve done some work in this area, based on guesses about @ajaykoppisetty’s original issue. It may work better for you.
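For the first point, one possible shape for the "logs to a remote" part is a small bounded buffer that a custom logger forwards every message to, with the app uploading batches whenever it has connectivity. This is only a sketch: `RemoteLogBuffer` and its methods are hypothetical names, the upload itself is left out, and how a custom logger is registered with Couchbase Lite depends on the SDK version.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative only: keeps the most recent log lines until they are uploaded. */
final class RemoteLogBuffer {
    private final int capacity;
    private final Deque<String> lines = new ArrayDeque<>();

    RemoteLogBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Called from the custom logger; drops the oldest line when full. */
    synchronized void append(String level, String domain, String message) {
        if (lines.size() == capacity) {
            lines.removeFirst();
        }
        lines.addLast(System.currentTimeMillis() + " " + level + "/" + domain + ": " + message);
    }

    /** Returns everything buffered so far, oldest first, and clears the buffer. */
    synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(lines);
        lines.clear();
        return batch;
    }
}
```

The app would call `drain()` periodically and send the batch to whatever endpoint it controls. Bounding the buffer keeps memory use predictable on devices that stay disconnected for a long time, at the cost of losing the oldest lines.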

Thanks so much. We need continuous replication because the devices must stay as closely aligned with each other as possible. What we do is monitor the state of the replicator and, when it goes offline, start a new one.

Out of interest, which part of the logs tells you that the client is abruptly closing the connection?

I know we are using devices that go through an APN. These are logs from the firewall; not sure if anything jumps out at you.

I am going to see what I can do about releasing an app version with a custom logger. What level of logging would be needed? Just warnings and errors?

Are you referring to the 3.0 version of CBL or SG?

Really appreciate all the help

Hah! I hear you… Lots of people “need” faster than light travel, too. Unfortunately…

You really cannot, on Android, expect continuous replication to work, except when the app is in the foreground. FWIW, if the app isn’t in the foreground, there’s really not much reason to keep it in sync… If a sync falls in the forest and nobody is around to hear it…

Second guessing the replicator, like that, seems like a pretty major kludge, to me. If the replicator is OFFLINE, it will retry the connection with exponential backoff, as it should. Restarting it… well… hmmm…
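For anyone unfamiliar with the term, exponential backoff just means the wait before each retry doubles until it reaches a cap. The sketch below shows the general technique only; the class name, base delay and cap are made up for illustration and are not the values or code Couchbase Lite uses internally.

```java
/** Illustrative retry-delay calculator; not Couchbase Lite's actual schedule. */
final class RetryBackoff {
    private final long baseMillis;
    private final long maxMillis;

    RetryBackoff(long baseMillis, long maxMillis) {
        if (baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("need 0 < baseMillis <= maxMillis");
        }
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at max. */
    long delayMillis(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Double the delay, guarding against overflow and the cap.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }
}
```

With a 2-second base and a 60-second cap, `new RetryBackoff(2000, 60000)` yields 2000, 4000, 8000, 16000, 32000 and then 60000 ms for every later attempt. Replacing the replicator each time it goes offline would restart any such schedule from the beginning.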

I see lots of log messages that look like this:

2022-02-03T08:59:51.254+02:00 [ERR] c:[794adee1] Invalid response to 'changes' message: RPY#138 -- EOF.  Body:  -- rest.(*blipSyncContext).Logf() at blip_sync.go:262

Those indicate that the client is closing the connection.
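If it helps to see how often this happens and on which connections, a small scanner over the error log can count these lines. This is a rough sketch: the pattern is written against the single line quoted above, and I am assuming the bracketed value after `c:` identifies the connection context. `SgwEofScanner` is a hypothetical helper, not a Sync Gateway tool.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: counts "Invalid response to 'changes' ... EOF" errors per context. */
final class SgwEofScanner {
    // Group 1: timestamp, group 2: the value inside c:[...] (assumed context id).
    private static final Pattern EOF_LINE = Pattern.compile(
            "^(\\S+) \\[ERR\\] c:\\[([^\\]]+)\\] Invalid response to 'changes' message: .*-- EOF\\.");

    /** Returns a map from context id to the number of matching lines, in first-seen order. */
    static Map<String, Integer> countByContext(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = EOF_LINE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the lines of the error log and comparing the timestamps of the matches with the window in which clients reported problems would show whether the two line up.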

I don’t have enough information to draw conclusions from your firewall log. If those Policy Violation records are source: client, then I think you might have your culprit.

Oh… and CBL 3.0. IANA SGW guy.