_changes feed latency is very high

Hi

I’m testing the latency of the _changes feed to see how practical its use is as a replacement for more traditional http api type calls to a server. Its seems to be very slow. In a simple test case it took 1.664 seconds to index and return a document.

2016-09-20T15:55:30.155+01:00 HTTP: #009: PUT /sync_gateway/dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 (ADMIN)
2016-09-20T15:55:30.155+01:00 CRUD+: Invoking sync on doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” rev 1-9fe23fc803a90eb9b8db7658b3abb244
2016-09-20T15:55:30.155+01:00 CRUD: Doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” in channels "{callbacks}"
2016-09-20T15:55:30.155+01:00 Cache: SAVING #3
2016-09-20T15:55:30.156+01:00 CRUD: Stored doc “dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / "1-9fe23fc803a90eb9b8db7658b3abb244"
2016-09-20T15:55:30.156+01:00 HTTP+: #009: --> 201 (0.8 ms)
2016-09-20T15:55:30.159+01:00 HTTP: #010: GET /sync_gateway/_changes?feed=longpoll&channels=callbacks&include_docs=true&since=%220%22&filter=sync_gateway%2Fbychannel&timeout=10000 (ADMIN)
2016-09-20T15:55:30.159+01:00 Changes+: Int sequence multi changes feed…
2016-09-20T15:55:30.159+01:00 Changes: MultiChangesFeed({callbacks}, {Since:0 Limit:0 Conflicts:false IncludeDocs:true Wait:true Continuous:false Terminator:0xc82046b9e0 HeartbeatMs:0 TimeoutMs:10000 ActiveOnly:false}) …
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:30.159+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 0 changes valid from #1
2016-09-20T15:55:30.159+01:00 DIndex+: [changesFeed] Found 0 changes for channel callbacks
2016-09-20T15:55:30.159+01:00 Changes+: MultiChangesFeed waiting…
2016-09-20T15:55:30.159+01:00 Changes+: Waiting for “sync_gateway”'s count to pass 0
2016-09-20T15:55:31.823+01:00 Cache: Received #3 after 1667ms (“dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891” / “1-9fe23fc803a90eb9b8db7658b3abb244”)
2016-09-20T15:55:31.823+01:00 Cache: Deferring #3 (1 now waiting for #2#2)
2016-09-20T15:55:31.823+01:00 Cache: Received #2 after 1675ms (“person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482” / “1-d913db1c33482c3423efaef679061a89”)
2016-09-20T15:55:31.823+01:00 Cache: #2 ==> channel "
"
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel "callbacks"
2016-09-20T15:55:31.823+01:00 Cache: #3 ==> channel “
2016-09-20T15:55:31.823+01:00 Changes+: Notifying that “sync_gateway” changed (keys=”{
, callbacks}”) count=4
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed: channels expand to channels.TimedSet{“callbacks”:channels.VbSequence{VbNo:(*uint16)(nil), Sequence:0x0}} …
2016-09-20T15:55:31.823+01:00 Cache: getCachedChanges(“callbacks”, {1 0 0 0 %!d(string=) 0 0 0 %!d(string=)}) --> 1 changes valid from #1
2016-09-20T15:55:31.823+01:00 DIndex+: [changesFeed] Found 1 changes for channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: Sending seq:3 from channel callbacks
2016-09-20T15:55:31.823+01:00 Changes+: MultiChangesFeed sending &{Seq:3 ID:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 Deleted:false Removed:{} Doc:map[owner_id:person_c9ac2a64-e4ae-4f4a-a8c9-5b3ab463a482 schema:glowinthedark.co.uk/test/dog type:dog _rev:1-9fe23fc803a90eb9b8db7658b3abb244 _id:dog_a5cb1bb8-2b82-418f-883a-4b2eb0d6e891 channels:[callbacks] name:fly namespace:glowinthedark.co.uk/test] Changes:[map[rev:1-9fe23fc803a90eb9b8db7658b3abb244]] Err: allRemoved:false branched:false}
2016-09-20T15:55:31.823+01:00 Changes: MultiChangesFeed done
2016-09-20T15:55:31.823+01:00 HTTP+: #010: --> 200 OK (0.0 ms)

This is roughly 100 times the time it takes couchdb to perform the same action

Is there any way to configure sync gateway to improve this indexing performance? As is it cannot be used as a replacement for traditional api type calls if I need a timely response to my client.

Thanks

Paul

There’s most of the delay right there — I’m not up-to-date on SG implementation details, but it looks like that’s the latency for Couchbase Server’s change notification after the document is written. Which is longer than I’d expect. What sort of load is the server under?

Hi Jens,

This was a test on my dev machine with no load on the server other than this test. This means that the Couchbase Server wouldn’t have the proper memory/cpu resources of the production server. Might this cause the index/cache to be slow? When syncing between two iPads using the same dev machine it seems to be much quicker than this.

I had two other thoughts:

Might the fact that I’m testing a single isolated document inserted with a PUT rather than a from a replication mean it takes longer to index? Is there some form of input buffer that waits for inactivity before flushing?

Does the SG web sockets changes feed differ internally and so get changes sooner?

I’m testing with:

==== Couchbase Sync Gateway/1.3.0(274;8c3ee28) ====
Couchbase Server Version: 4.5.0-2601 Enterprise Edition (build-2601)

Thanks

P

Hi Jens,

After some more tests it seems that there is something ticking every two seconds. (or with a 2 second timeout) Something like a buffer that flushes every two seconds. The latency in the changes feed is almost entirely caused by this buffering and not by computation or io.

The latency varies between roughly 0.05 - 2.0 for a random request. The latency of successive longpoll changes requests are deterministic if timed from the return of the previous request. They always return 2.0 seconds after the previous one.

Is this expected behaviour? Is this 2 seconds of buffering configurable? and if so are there other performance implications of changing it?

Thanks

P

That’s odd; low latency is important to us, so I don’t think there would be anything like a 2-sec timer involved in the changes-feed plumbing. However, I haven’t worked on development of SG for over a year, so I’m no longer in touch with its inner workings.

@adamf, @andy — does this ring a bell at all?

I had a quick look at the code and got as far as listener.tapNotifier.Wait() on line 155 in change_listener.go

looks like it blocks here waiting for changes from the tap feed. Maybe the tap feed batches up it change sets and sends them every 2 seconds rather than streaming them as they happen.

1 Like

If this buffering isn’t by design as part of how the changes are delivered from CB I’ll raise it as an issue on GitHub.

1 Like

Please do file an issue. It’s definitely an error to have changes-feed latency that long on an unloaded server. Thanks!

Could you link the issue so that I can follow it. Thanks!

The issue is here:

1 Like

@jens so it looks like this is known issue with very low priority in the issues.

I understand that with mutli-master replication and eventual consistency low latency is not a design priority, but it’s still nice to have. This bug adds an arbitrary 0 - 2000 ms to _changes on top of Couchbase’s “sub-millisecond” latency!

You should comment on that issue, then. I don’t work on SG so I’m not the best person to complain to…

@jens Sorry - I nudged you as I was frustrated by the low priority of this issue. Adam has updated me now on the progress.