davidj
November 1, 2019, 11:02pm
1
I’m working on an iOS app with CB Lite 2.6.1. Testing pull replication of a fairly large set of small documents from Sync Gateway in the simulator, I had no problem - it replicated all 700K documents fairly quickly. On my iPhone 7 test device it works fine until it hits roughly the 350K document mark, then starts getting connection timeouts and eventually stalls out completely, stuck in the connecting state. I’m unable to replicate beyond 429K documents.
While this is happening, any other activity on the database, such as a query or saving or fetching a document, essentially hangs, taking 60+ seconds to complete, if it completes at all.
CPU is only at about 45%, memory usage fluctuates between 80 MB and 120 MB, and disk throughput is around 100 MB/s. Any idea what’s going on here?
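For context, here is roughly how the replication is set up - a minimal sketch using the CouchbaseLiteSwift 2.x API, where the database name and Sync Gateway URL are placeholders rather than the real ones:

```swift
import Foundation
import CouchbaseLiteSwift

// Placeholder database name and Sync Gateway URL.
let database = try! Database(name: "master")
let target = URLEndpoint(url: URL(string: "wss://sgw.example.com/master")!)

var config = ReplicatorConfiguration(database: database, target: target)
config.replicatorType = .pull      // pull-only replication
config.continuous = false

let replicator = Replicator(config: config)
_ = replicator.addChangeListener { change in
    // Log the activity level (stuck in .connecting in my case) and progress.
    let status = change.status
    print("state: \(status.activity), progress: \(status.progress.completed)/\(status.progress.total)")
    if let error = status.error {
        print("replication error: \(error)")
    }
}
replicator.start()
```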
David
davidj
November 1, 2019, 11:29pm
2
After killing the app and restarting, now it crashes during replication:
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000124
VM Region Info: 0x124 is not in any region. Bytes before following region: 4335976156
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
__TEXT 000000010271c000-0000000102b00000 [ 3984K] r-x/r-x SM=COW ...seLiteExample
Termination Signal: Segmentation fault: 11
Termination Reason: Namespace SIGNAL, Code 0xb
Terminating Process: exc handler [2976]
Triggered by Thread: 3
Thread 3 name: Dispatch queue: Repl->wss://sgw.dev.XXXX/master/_blipsync
Thread 3 Crashed:
0 CouchbaseLiteSwift 0x00000001039da204 std::__1::__function::__func<litecore::repl::Replicator::getRemoteCheckpoint()::$_0, std::__1::allocator<litecore::repl::Replicator::getRemoteCheckpoint()::$_0>, void (litecore::blip::MessageProgress const&)>::operator()(litecore::blip::MessageProgress const&) + 811524 (Replicator.cc:479)
1 CouchbaseLiteSwift 0x00000001039da1f8 std::__1::__function::__func<litecore::repl::Replicator::getRemoteCheckpoint()::$_0, std::__1::allocator<litecore::repl::Replicator::getRemoteCheckpoint()::$_0>, void (litecore::blip::MessageProgress const&)>::operator()(litecore::blip::MessageProgress const&) + 811512 (Replicator.cc:0)
2 CouchbaseLiteSwift 0x00000001039fdaec std::__1::__function::__func<litecore::repl::Worker::sendRequest(litecore::blip::MessageBuilder&, std::__1::function<void (litecore::blip::MessageProgress const&)>)::$_0, std::__1::allocator<litecore::repl::Worker::sendRequest(litecore::blip::MessageBuilder&, std::__1::function<void (litecore::blip::MessageProgress const&)>)::$_0>, void (litecore::blip::MessageProgress)>::operator()(litecore::blip::MessageProgress&&) + 957164 (Worker.cc:0)
3 CouchbaseLiteSwift 0x00000001039fd850 invocation function for block in std::__1::function<void (litecore::blip::MessageProgress)> litecore::actor::Actor::_asynchronize<litecore::blip::MessageProgress>(std::__1::function<void (litecore::blip::MessageProgress)>)::'lambda'(litecore::blip::MessageProgress)::operator()(litecore::blip::MessageProgress) + 956496 (Actor.hh:0)
4 CouchbaseLiteSwift 0x0000000103a7d458 litecore::actor::GCDMailbox::safelyCall(void () block_pointer) const + 1479768 (GCDMailbox.cc:91)
5 CouchbaseLiteSwift 0x0000000103a7d52c invocation function for block in litecore::actor::GCDMailbox::enqueue(void () block_pointer) + 1479980 (GCDMailbox.cc:102)
6 libdispatch.dylib 0x0000000188e25610 _dispatch_call_block_and_release + 24
7 libdispatch.dylib 0x0000000188e26184 _dispatch_client_callout + 16
8 libdispatch.dylib 0x0000000188dd2464 _dispatch_lane_serial_drain$VARIANT$mp + 608
9 libdispatch.dylib 0x0000000188dd2e58 _dispatch_lane_invoke$VARIANT$mp + 420
10 libdispatch.dylib 0x0000000188ddc340 _dispatch_workloop_worker_thread + 588
11 libsystem_pthread.dylib 0x0000000188e75fa4 _pthread_wqthread + 276
12 libsystem_pthread.dylib 0x0000000188e78ae0 start_wqthread + 8
jens
November 4, 2019, 9:39pm
3
Yikes! Please file a bug report. This sounds like at least three different issues (replicator stalling, slow db access, crash), but we’ll figure it out there.
jens
November 4, 2019, 9:52pm
4
The crash is known, but we don’t have a release with the fix yet. The workaround is to avoid pull-only replications. (This doesn’t happen on all pull-only replications, but if the previous replication was aborted by a crash or other disconnect, a subsequent pull-only replication is likely to trigger it.)
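In other words, something along these lines - a sketch with placeholder names, assuming the same database and endpoint setup you already have:

```swift
import Foundation
import CouchbaseLiteSwift

// Workaround sketch: configure push-and-pull rather than pull-only.
// Database name and endpoint URL are placeholders.
let database = try! Database(name: "master")
let target = URLEndpoint(url: URL(string: "wss://sgw.example.com/master")!)

var config = ReplicatorConfiguration(database: database, target: target)
config.replicatorType = .pushAndPull   // not .pull, so the replication isn't pull-only
config.continuous = false

let replicator = Replicator(config: config)
replicator.start()
```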
davidj
November 7, 2019, 3:16pm
5