_changes feed latency is very high

paulharter · September 20, 2016, 3:14pm

Hi

I’m testing the latency of the _changes feed to see how practical its use is as a replacement for more traditional http api type calls to a server. Its seems to be very slow. In a simple test case it took 1.664 seconds to index and return a document.

2016-09-20T15:55:30.155+01:00 HTTP: #009: PUT /sync_gateway/dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 (ADMIN)
2016-09-20T15:55:30.155+01:00 CRUD+: Invoking sync on doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” rev 1-9fe23fc803a90eb9b8db7658b3abb244
2016-09-20T15:55:30.155+01:00 CRUD: Doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” in channels "{callbacks}"
2016-09-20T15:55:30.155+01:00 Cache: SAVING #3
2016-09-20T15:55:30.156+01:00 CRUD: Stored doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / "1-9fe23fc803a90eb9b8db7658b3abb244"
2016-09-20T15:55:30.156+01:00 HTTP+: #009: --> 201 (0.8 ms)
2016-09-20T15:55:30.159+01:00 HTTP: #010: GET /sync_gateway/_changes?feed=longpoll&channels=callbacks&include_docs=true&since=%220%22&filter=sync_gateway%2Fbychannel&timeout=10000 (ADMIN)
2016-09-20T15:55:30.159+01:00 Changes+: Int sequence multi changes feed…
2016-09-20T15:55:30.159+01:00 Changes: MultiChangesFeed({callbacks}, {Since:0 Limit:0 Conflicts:false IncludeDocs:true Wait:true Continuous:false Terminator:0xc82046b9e0 HeartbeatMs:0 TimeoutMs:10000 ActiveOnly:false}) …
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:30.159+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 0 changes valid from #1
2016-09-20T15:55:30.159+01:00 DIndex+: [changesFeed] Found 0 changes for channel callbacks
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed waiting…
2016-09-20T15:55:30.159+01:00 Changes+: Waiting for “sync_gateway”'s count to pass 0
2016-09-20T15:55:31.823+01:00 Cache: Received #3 after 1667ms (“dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / “1-9fe23fc803a90eb9b8db7658b3abb244”)
2016-09-20T15:55:31.823+01:00 Cache: Deferring #3 (1 now waiting for #2…#2)
2016-09-20T15:55:31.823+01:00 Cache: Received #2 after 1675ms (“person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482” / “1-d913db1c33482c3423efaef679061a89”)
2016-09-20T15:55:31.823+01:00 Cache: #2 ==> channel ""
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel "callbacks"
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel ““
2016-09-20T15:55:31.823+01:00 Changes+: Notifying that “sync_gateway” changed (keys=”{, callbacks}”) count=4
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(*uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:31.823+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 1 changes valid from #1
2016-09-20T15:55:31.823+01:00 DIndex+: [changesFeed] Found 1 changes for channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: Sending seq:3 from channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed sending &{Seq:3 ID:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 Deleted:false Removed:{} Doc:map[owner_id:person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482 schema:glowinthedark.co.uk/test/dog type:dog _rev:1-9fe23fc803a90eb9b8db7658b3abb244 _id:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 channels:[callbacks] name:fly namespace:glowinthedark.co.uk/test] Changes:[map[rev:1-9fe23fc803a90eb9b8db7658b3abb244]] Err: allRemoved:false branched:false}
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 HTTP+: #010: --> 200 OK (0.0 ms)

This is roughly 100 times the time it takes couchdb to perform the same action

Is there any way to configure sync gateway to improve this indexing performance? As is it cannot be used as a replacement for traditional api type calls if I need a timely response to my client.

Thanks

Paul

jens · September 21, 2016, 6:22am

There’s most of the delay right there — I’m not up-to-date on SG implementation details, but it looks like that’s the latency for Couchbase Server’s change notification after the document is written. Which is longer than I’d expect. What sort of load is the server under?

paulharter · September 21, 2016, 11:26am

Hi Jens,

This was a test on my dev machine with no load on the server other than this test. This means that the Couchbase Server wouldn’t have the proper memory/cpu resources of the production server. Might this cause the index/cache to be slow? When syncing between two iPads using the same dev machine it seems to be much quicker than this.

I had two other thoughts:

Might the fact that I’m testing a single isolated document inserted with a PUT rather than a from a replication mean it takes longer to index? Is there some form of input buffer that waits for inactivity before flushing?

Does the SG web sockets changes feed differ internally and so get changes sooner?

I’m testing with:

==== Couchbase Sync Gateway/1.3.0(274;8c3ee28) ====
Couchbase Server Version: 4.5.0-2601 Enterprise Edition (build-2601)

Thanks

P

paulharter · September 21, 2016, 1:32pm

Hi Jens,

After some more tests it seems that there is something ticking every two seconds. (or with a 2 second timeout) Something like a buffer that flushes every two seconds. The latency in the changes feed is almost entirely caused by this buffering and not by computation or io.

The latency varies between roughly 0.05 - 2.0 for a random request. The latency of successive longpoll changes requests are deterministic if timed from the return of the previous request. They always return 2.0 seconds after the previous one.

Is this expected behaviour? Is this 2 seconds of buffering configurable? and if so are there other performance implications of changing it?

Thanks

P

jens · September 21, 2016, 6:25pm

That’s odd; low latency is important to us, so I don’t think there would be anything like a 2-sec timer involved in the changes-feed plumbing. However, I haven’t worked on development of SG for over a year, so I’m no longer in touch with its inner workings.

@adamf, @andy — does this ring a bell at all?

paulharter · September 22, 2016, 10:46am

I had a quick look at the code and got as far as listener.tapNotifier.Wait() on line 155 in change_listener.go

looks like it blocks here waiting for changes from the tap feed. Maybe the tap feed batches up it change sets and sends them every 2 seconds rather than streaming them as they happen.

paulharter · September 24, 2016, 8:46am

If this buffering isn’t by design as part of how the changes are delivered from CB I’ll raise it as an issue on GitHub.

jens · September 24, 2016, 3:59pm

Please do file an issue. It’s definitely an error to have changes-feed latency that long on an unloaded server. Thanks!

benjamin_glatzeder · September 25, 2016, 1:59pm

Could you link the issue so that I can follow it. Thanks!

paulharter · September 25, 2016, 11:15pm

The issue is here:

paulharter · October 17, 2016, 12:42pm

@jens so it looks like this is known issue with very low priority in the issues.

github.com/couchbase/sync_gateway

Documents written by Sync Gateway appear in 2-second clusters on the TAP feed

opened 09:55PM - 28 Jan 15 UTC

closed 05:40PM - 05 Sep 18 UTC

adamcfraser

performance chore P3: low

Looking at the timestamps for incoming events on the TAP feed, they are showing …up in buffered sets, 2 seconds apart. Original issue here: https://forums.couchbase.com/t/how-to-deal-with-sync-gateway-couchbase-server-tap-notification-delay/2644/ I was able to replicate this in a local environment (running gateload against my local instance) - timestamps for incoming TAP entries look something like this when listening to a TAP feed that should be sending continuous updates: 10:53:22.187906 10:53:22.188207 … 50 more updates … 10:53:22.203737 10:53:24.186818 <— 2 sec jump 10:53:24.187307 10:53:24.187607 … 50 more updates… 10:53:24.201941 10:53:26.187628 <— 2 sec jump 10:53:26.187723 10:53:26.187864 … I don't see the same results when using something other than sync gateway to write the records. When using the go-couchbase populate.go to write to the server, the events showing up on the SG tap feed are evenly distributed. Based on this it seems like something in the way SG is doing writes. Needs followup to get to the root cause.

I understand that with mutli-master replication and eventual consistency low latency is not a design priority, but it’s still nice to have. This bug adds an arbitrary 0 - 2000 ms to _changes on top of Couchbase’s “sub-millisecond” latency!

jens · October 17, 2016, 4:40pm

You should comment on that issue, then. I don’t work on SG so I’m not the best person to complain to…

paulharter · October 17, 2016, 6:38pm

@jens Sorry - I nudged you as I was frustrated by the low priority of this issue. Adam has updated me now on the progress.

Topic		Replies	Views
Changes latency Sync Gateway	4	2058	May 20, 2015
Sync gateway Changes Feed Very, Very Slow Sync Gateway java , android	1	961	May 29, 2020
Sync gateway taking 30 seconds to compute changes_view? Sync Gateway	10	3222	September 11, 2015
_changes feed timeouts an SG stuck Sync Gateway	2	1485	April 20, 2016
Sync gateway taking 13 - 20s to get changes_view Sync Gateway	4	1880	June 14, 2016

_changes feed latency is very high

Related topics