davidj
January 15, 2018, 6:05pm
1
My iOS app, with CB Lite 1.4.1, uses continuous replication from sync gateway version 1.4 (Couchbase server 4.6.2-3905) to pull documents. When the app is first run it starts pulling down data and eventually the replication goes idle, but not all documents have been replicated. If I kill and restart the app, more documents are replicated. It may require several restarts before all of the documents are replicated from sync gateway. I’ve turned on logging and I see this pattern in the log:
First app run…
2018-01-15 17:00:43.158 [info][tid:com.facebook.react.JavaScript] Starting Couchbase Sync Gateway
replication with https://sgw.CENSORED/master/
17:00:43.163‖ Router: POST http://lite.couchbase./_replicate + body stream
Accept: application/json
Content-Type: application/json
Connection: keep-alive
Authorization: Basic dXNlcjpwdw==
Content-Length: 246
17:00:43.164‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] STARTING ...
17:00:43.165‖ Database: {{ Begin transaction (level 1)...
17:00:43.165‖ Database: }} Commit transaction (level 1)
17:00:43.167‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/]: Reachability state =
<sgw.CENSORED>:reachable (02), suspended=0
17:00:43.167‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/]: Going online
17:00:43.167‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/]: No local checkpoint; not getting
remote one
17:00:43.167‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] Progress: set active = 1
17:00:43.167‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: Starting...
17:00:43.168‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: POST
//sgw.CENSORED/master/_changes?feed=normal&heartbeat=300000&style=all_docs {
"Accept-Encoding" = gzip;
Authorization = "Basic CENSORED";
"Content-Length" = 80;
"Content-Type" = "application/json";
Host = "sgw.CENSORED";
"User-Agent" = "CouchbaseLite/1.3 (iOS 1.4.1/86c63659b56d87406eac3b1e4db631fe67098139)";
}
17:00:43.168‖ ChangeTracker: Changes feed using proxy settings {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
"__SCOPED__" = {
en4 = {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
};
};
}
17:00:43.168‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: Started...
<https://sgw.CENSORED/master/_changes?feed=normal&heartbeat=300000&style=all_docs>
17:00:43.169‖ Router: Response -- status=200, body=24 bytes
Server: CouchbaseLite 1.4.1 (build 88)
Content-Type: application/json
17:00:43.169‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: Event 1 on
<__NSCFInputStream: 0x60800010f300>
17:00:46.952‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/]: Server is Couchbase Sync
Gateway/1.4.0
17:00:46.953‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] Progress: 0 / 100
17:00:46.954‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] queued 100 remote revisions from
seq=973093:878993 (100 in bulk, 0 individually)
17:00:46.954‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] bulk-fetching 100 remote
revisions...
17:00:46.955‖ RemoteRequest: CBLBulkDownloader[/master/_bulk_get]: Starting...
17:00:46.955‖ RemoteRequest: CBLRemoteSession starting CBLBulkDownloader[/master/_bulk_get]
17:00:46.957‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] Progress: 0 / 200`
Eventually getting to this point…
17:07:05.065‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] Progress: 92712 / 92800
17:07:05.066‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] queued 100 remote revisions from
seq=973093:972921 (100 in bulk, 0 individually) 17:07:05.066‖ Sync:
CBLRestPuller[https://sgw.CENSORED/master/] bulk-fetching 100 remote
revisions...
17:07:05.066‖ RemoteRequest: CBLBulkDownloader[/master/_bulk_get]: Starting...
17:07:05.066‖ RemoteRequest: CBLRemoteSession starting CBLBulkDownloader[/master/_bulk_get]
17:07:05.070‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: EndEncountered
<__NSCFInputStream: 0x60800010f300>
17:07:05.070‖ ChangeTracker: CBLSocketChangeTracker[0x6080001da7c0 master]: RESUME
17:07:05.073‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/]: Caught up with changes!
At that point no more documents are replicated. After the restart replication starts again from this 973093 and finds more documents:
17:22:08.747‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] queued 15 remote revisions from
seq=973093:887952 (15 in bulk, 0 individually)
17:22:08.747‖ Sync: CBLRestPuller[https://sgw.CENSORED/master/] bulk-fetching 15 remote revisions...
In this case I had to restart my app two more times before it completed replication. It looks like the rest puller thinks it’s caught up when it isn’t. Any ideas what’s wrong here?
EDIT: I just tried this on the Android version of my app and I’m seeing similar behavior. I had to force stop the app a couple of times to get all documents replicated.
Thanks,
David
traun
January 15, 2018, 8:34pm
2
Are there any errors in the Sync Gateway logs?
CBLite has additional log levels that can be enabled which might help:
In particular LogSync + LogSyncVerbose
Also, a tool like Wireshark or HTTPScoop running on the same machine as Sync Gateway will allow you to look at the HTTP requests, which might reveal a pattern. Eg, do you see any HTTP errors and retries?
These resources might help to understand the replication protocol:
davidj
January 15, 2018, 8:45pm
3
I’ve been tailing the sgw log and have not seen any errors. I already have the log levels cranked up. The back end is a sync gateway cluster talking to a cb cluster in a docker swarm. I think next I’ll load a local sync gateway with the same set of docs and see if I can reproduce, in which case it would be easier to monitor the network traffic.
davidj
January 17, 2018, 1:22am
4
I’ve not seen any HTTP errors monitoring the network and sync gateway logs. I modified the replicate request to specify a poll
value of 30000ms and websocket
false, which I believe should disable websockets in case that’s part of the problem. However, I still see the behavior that the first time the app replicates with sync gateway it eventually reaches a state where no more changes are received, but upon killing and restarting it resumes replicating.
The last replication request before I killed the client emitted this log:
00:47:12.420‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: Starting...
00:47:12.421‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: POST //CENSORED.amazonaws.com/master/_changesfeed=normal&heartbeat=300000&style=all_docs&since=94107 {
"Accept-Encoding" = gzip;
Authorization = "Basic CENSORED";
"Content-Length" = 94;
"Content-Type" = "application/json";
Host = "CENSORED.amazonaws.com";
"User-Agent" = "CouchbaseLite/1.3 (iOS 1.4.1/86c63659b56d87406eac3b1e4db631fe67098139)";
}
00:47:12.422‖ ChangeTracker: Changes feed using proxy settings {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
"__SCOPED__" = {
en4 = {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
};
};
}
00:47:12.422‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: Started... <http://CENSORED.amazonaws.com/master/_changes?feed=normal&heartbeat=300000&style=all_docs&since=94107>
00:47:12.423‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: Event 1 on <__NSCFInputStream: 0x618000311670>
00:47:12.601‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: EndEncountered <__NSCFInputStream: 0x618000311670>
00:47:12.601‖ ChangeTracker: CBLSocketChangeTracker[0x6180003c0e10 master]: Next poll of _changes feed in 30 sec...
Then, after killing and restarting the app, replication resumes with this logged:
2018-01-17 00:49:27.826 [info][tid:com.facebook.react.JavaScript] Starting Couchbase Sync Gateway replication with http://CENSORED.amazonaws.com/master
00:49:27.832‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master] STARTING ...
00:49:27.835‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master]: Reachability state = <CENSORED.amazonaws.com>:reachable (02), suspended=0
00:49:27.835‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master]: Going online
00:49:27.835‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master] Progress: set active = 1
00:49:27.835‖ RemoteRequest: CBLRemoteJSONRequest[GET http://CENSORED.amazonaws.com/master/_local/4fc015c942e4c8a18d80fe9f33d93e6d141d7b74]: Starting...
00:49:27.836‖ RemoteRequest: CBLRemoteSession starting CBLRemoteJSONRequest[GET http://CENSORED.amazonaws.com/master/_local/4fc015c942e4c8a18d80fe9f33d93e6d141d7b74]
00:49:27.836‖ Router: Response -- status=200, body=24 bytes
Server: CouchbaseLite 1.4.1 (build 88)
Content-Type: application/json
00:49:28.559‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master]: Server is Couchbase Sync Gateway/1.4.0
00:49:28.560‖ RemoteRequest: CBLRemoteJSONRequest[GET http://CENSORED.amazonaws.com/master/_local/4fc015c942e4c8a18d80fe9f33d93e6d141d7b74]: Got response, status 200
00:49:28.560‖ RemoteRequest: CBLRemoteSession done with CBLRemoteJSONRequest[GET http://CENSORED.amazonaws.com/master/_local/4fc015c942e4c8a18d80fe9f33d93e6d141d7b74]
00:49:28.560‖ RemoteRequest: CBLRemoteJSONRequest[GET http://CENSORED.amazonaws.com/master/_local/4fc015c942e4c8a18d80fe9f33d93e6d141d7b74]: Finished loading
00:49:28.560‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master]: Replicating from lastSequence=94106:30367
00:49:28.560‖ ChangeTracker: CBLSocketChangeTracker[0x6000001d7160 master]: Starting...
00:49:28.561‖ ChangeTracker: CBLSocketChangeTracker[0x6000001d7160 master]: POST //CENSORED.amazonaws.com/master/_changes?feed=normal&heartbeat=300000&style=all_docs&since=94106%3A30367 {
"Accept-Encoding" = gzip;
Authorization = "Basic CENSORED";
"Content-Length" = 102;
"Content-Type" = "application/json";
Host = "CENSORED.amazonaws.com";
"User-Agent" = "CouchbaseLite/1.3 (iOS 1.4.1/86c63659b56d87406eac3b1e4db631fe67098139)";
}
00:49:28.561‖ ChangeTracker: Changes feed using proxy settings {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
"__SCOPED__" = {
en4 = {
ExceptionsList = (
"*.local",
"169.254/16"
);
FTPPassive = 1;
};
};
}
00:49:28.561‖ ChangeTracker: CBLSocketChangeTracker[0x6000001d7160 master]: Started... <http://CENSORED.amazonaws.com/master/_changes?feed=normal&heartbeat=300000&style=all_docs&since=94106:30367>
00:49:28.562‖ ChangeTracker: CBLSocketChangeTracker[0x6000001d7160 master]: Event 1 on <__NSCFInputStream: 0x6080001152a0>
00:49:31.597‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master] Progress: 0 / 100
00:49:31.602‖ Sync: CBLRestPuller[http://CENSORED.amazonaws.com/master] queued 11 remote revisions from seq=94106:30368 (11 in bulk, 0 individually)
In the first case it’s requesting changes since 94107
and getting nothing. In the second case it’s requesting changes since 94106:30367
and finding documents. When the client has eventually replicated all documents it again requests changes since 94107
and is idle. Why is the client apparently prematurely getting to the the 94107
sequence number? What’s happening when the pull replication starts again after the app is restarted that allows the client to figure out it needs to start at 94106:30367
instead of where it previously left off at 94107
?
jens
January 17, 2018, 6:56pm
5
This is mysterious. I’ve filed a Github issue and pasted in your initial description (thanks for the detailed info & logs!) Let’s continue there.