I’m testing the latency of the _changes feed to see how practical its use is as a replacement for more traditional http api type calls to a server. Its seems to be very slow. In a simple test case it took 1.664 seconds to index and return a document.
2016-09-20T15:55:30.155+01:00 HTTP: #009: PUT /sync_gateway/dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 (ADMIN)
2016-09-20T15:55:30.155+01:00 CRUD+: Invoking sync on doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” rev 1-9fe23fc803a90eb9b8db7658b3abb244
2016-09-20T15:55:30.155+01:00 CRUD: Doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” in channels "{callbacks}"
2016-09-20T15:55:30.155+01:00 Cache: SAVING #3
2016-09-20T15:55:30.156+01:00 CRUD: Stored doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / "1-9fe23fc803a90eb9b8db7658b3abb244"
2016-09-20T15:55:30.156+01:00 HTTP+: #009: --> 201 (0.8 ms)
2016-09-20T15:55:30.159+01:00 HTTP: #010: GET /sync_gateway/_changes?feed=longpoll&channels=callbacks&include_docs=true&since=%220%22&filter=sync_gateway%2Fbychannel&timeout=10000 (ADMIN)
2016-09-20T15:55:30.159+01:00 Changes+: Int sequence multi changes feed…
2016-09-20T15:55:30.159+01:00 Changes: MultiChangesFeed({callbacks}, {Since:0 Limit:0 Conflicts:false IncludeDocs:true Wait:true Continuous:false Terminator:0xc82046b9e0 HeartbeatMs:0 TimeoutMs:10000 ActiveOnly:false}) …
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:30.159+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 0 changes valid from #1
2016-09-20T15:55:30.159+01:00 DIndex+: [changesFeed] Found 0 changes for channel callbacks
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed waiting…
2016-09-20T15:55:30.159+01:00 Changes+: Waiting for “sync_gateway”'s count to pass 0
2016-09-20T15:55:31.823+01:00 Cache: Received #3 after 1667ms (“dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / “1-9fe23fc803a90eb9b8db7658b3abb244”)
2016-09-20T15:55:31.823+01:00 Cache: Deferring #3 (1 now waiting for #2…#2)
2016-09-20T15:55:31.823+01:00 Cache: Received #2 after 1675ms (“person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482” / “1-d913db1c33482c3423efaef679061a89”)
2016-09-20T15:55:31.823+01:00 Cache: #2 ==> channel ""
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel "callbacks"
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel ““
2016-09-20T15:55:31.823+01:00 Changes+: Notifying that “sync_gateway” changed (keys=”{, callbacks}”) count=4
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(*uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:31.823+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 1 changes valid from #1
2016-09-20T15:55:31.823+01:00 DIndex+: [changesFeed] Found 1 changes for channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: Sending seq:3 from channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed sending &{Seq:3 ID:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 Deleted:false Removed:{} Doc:map[owner_id:person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482 schema:glowinthedark.co.uk/test/dog type:dog _rev:1-9fe23fc803a90eb9b8db7658b3abb244 _id:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 channels:[callbacks] name:fly namespace:glowinthedark.co.uk/test] Changes:[map[rev:1-9fe23fc803a90eb9b8db7658b3abb244]] Err: allRemoved:false branched:false}
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 HTTP+: #010: --> 200 OK (0.0 ms)
This is roughly 100 times the time it takes couchdb to perform the same action
Is there any way to configure sync gateway to improve this indexing performance? As is it cannot be used as a replacement for traditional api type calls if I need a timely response to my client.
There’s most of the delay right there — I’m not up-to-date on SG implementation details, but it looks like that’s the latency for Couchbase Server’s change notification after the document is written. Which is longer than I’d expect. What sort of load is the server under?
This was a test on my dev machine with no load on the server other than this test. This means that the Couchbase Server wouldn’t have the proper memory/cpu resources of the production server. Might this cause the index/cache to be slow? When syncing between two iPads using the same dev machine it seems to be much quicker than this.
I had two other thoughts:
Might the fact that I’m testing a single isolated document inserted with a PUT rather than a from a replication mean it takes longer to index? Is there some form of input buffer that waits for inactivity before flushing?
Does the SG web sockets changes feed differ internally and so get changes sooner?
After some more tests it seems that there is something ticking every two seconds. (or with a 2 second timeout) Something like a buffer that flushes every two seconds. The latency in the changes feed is almost entirely caused by this buffering and not by computation or io.
The latency varies between roughly 0.05 - 2.0 for a random request. The latency of successive longpoll changes requests are deterministic if timed from the return of the previous request. They always return 2.0 seconds after the previous one.
Is this expected behaviour? Is this 2 seconds of buffering configurable? and if so are there other performance implications of changing it?
That’s odd; low latency is important to us, so I don’t think there would be anything like a 2-sec timer involved in the changes-feed plumbing. However, I haven’t worked on development of SG for over a year, so I’m no longer in touch with its inner workings.
I had a quick look at the code and got as far as listener.tapNotifier.Wait() on line 155 in change_listener.go
looks like it blocks here waiting for changes from the tap feed. Maybe the tap feed batches up it change sets and sends them every 2 seconds rather than streaming them as they happen.
@jens so it looks like this is known issue with very low priority in the issues.
I understand that with mutli-master replication and eventual consistency low latency is not a design priority, but it’s still nice to have. This bug adds an arbitrary 0 - 2000 ms to _changes on top of Couchbase’s “sub-millisecond” latency!