I’m seeing some behaviour on my sync gateway that I can’t make sense of. When deleting a document through the admin REST API, the oldDoc argument in the sync function is null in some cases.
One of the documents for which this happens looks like this when requesting it through the sync gateway:
{
“_id”: “cb0badc4b0e964173247fa0cb2bab898”,
“_rev”: “663-05d0ac14fc567b26f81413f85e09edff”,
“type”: “foo”,
“userId”:“bar”
}
When running a GET cb0badc4b0e964173247fa0cb2bab898?open_revs=all, I see two leaf revisions:
–9aad31a4adae350f872f600bdf6a3f44c7ccc8f50a9140a2f1e254e955ac
Content-Type: application/json
Since rev 2507 is deleted, rev 663 should always be picked.
When I now try to DELETE cb0badc4b0e964173247fa0cb2bab898?rev=663-05d0ac14fc567b26f81413f85e09edff, my sync function fails with many “No old revision” errors until it finally fails with in one of my guards:
2018-04-10T11:27:58.825Z CRUD+: No old revision "cb0badc4b0e964173247fa0cb2bab898" / "2506-2a6b78ebefb0549fafe80795fcf5b92c"
...
2018-04-10T11:27:59.005Z CRUD+: No old revision "cb0badc4b0e964173247fa0cb2bab898" / "1508-059f837867266744554dc2dee463481d"
Sync fn rejected: new=map[_deleted:true _id:cb0badc4b0e964173247fa0cb2bab898 _rev:2507-83b1c60d11c7f2f4379232f92775d946] old= --> 403 No parent revision
Two things here are confusing.
Shouldn’t oldDoc be the body of rev 663 instead of null, since DELETE equals to an update with _deleted=true on that revision?
Why is the log output showing _rev:2507-83b1c60d11c7f2f4379232f92775d946 although I specifically tried to delete ?rev=663-05d0ac14fc567b26f81413f85e09edff
It seems to me that the sync gateway is picking the wrong revision to delete. Am I missing something or did I run into a bug?
Deleting the winner via DELETE …/1e801180ef17205a772314bb96c60dfb?rev=4559-5b9429c0ca943c139c698dca7f17ac28 fails and generates log output with the other revision 5067 on the SG:
2018-04-11T11:16:36.437Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4074-90a3c6cd32f703f70ab0d7eb5514f18b"
2018-04-11T11:16:36.437Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4073-bee68f85909e260529a320a5a43a6374"
2018-04-11T11:16:36.437Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4072-054f8c6fe07882c509629d73ae650c27"
2018-04-11T11:16:36.438Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4071-112f37ecc9b035df98609a20ac11600d"
2018-04-11T11:16:36.438Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4070-e5880de6816b9b8c1237db764b9fa011"
2018-04-11T11:16:36.438Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4069-0c8cf138af2bda87110968fd75e8ab61"
2018-04-11T11:16:36.438Z CRUD+: No old revision "1e801180ef17205a772314bb96c60dfb" / "4068-f3c1a75ec137cfcc0c77518d70374314"
2018-04-11T11:16:36.438Z Sync fn rejected: new=map[_deleted:true _id:1e801180ef17205a772314bb96c60dfb _rev:5067-77fa7c9e633cda5ee7a87bcb5851ff87] old= --> 403 No parent revision
The high revision numbers were due to a client-side bug that caused a lot of updates, which is probably also what broke the document. Here is a gist to the full document (with some properties stripped). Unfortunately it’s a little bigger than my previous example.
It is also worth noting that in both cases I could see the parent revision (4558-89cfa15d7f5331ad22a1cf2585b226da for this doc) in the sync metadata, but could not access it
on the SG:
GET .../1e801180ef17205a772314bb96c60dfb?rev=4558-89cfa15d7f5331ad22a1cf2585b226da
{
"error": "not_found",
"reason": "missing"
}
@adamf the difference in generation numbers for these revisions is high. Maybe a factor? I looked at the gist, and the revisions diverge at generation 4068. That seems awfully suspicious given the problem happens with rev 5067.
This looks like a consequence of the fact that when a tombstone causes an existing revision becomes the active revision, the sync function is executed based on the new active revision. In that scenario Sync Gateway appears to be walking up the revision tree looking for an ancestor body to use as old doc (the ‘No old revision’ logging you’re seeing), not finding one (which is expected in the circumstances), and ending up with an empty oldDoc.
I suspect that revs_limit plays into this, since I have only observed this behaviour when the revisions diverge heavily (as you were saying, about 1000 in the second example, possibly more in the first).
The way I understand the problem at the moment is that deleting rev 5066 caused rev 4559 to become active (since it is the only non-deleted leaf). This revision has no history anymore, since it has been pruned which also explains the “not_found” on the parent rev 4885.
My question still is why I’m not simply receiving the body of 4559 as oldDoc. I can create new revisions with 4559 as parent, so implicitly deleting it via _deleted=true should be a workaround. And why does the sync gateway not start walking the revision tree at rev 4559 but at 5066?