If two people edit the same document separately and the updated document is then replicated, we end up with an additional document with an ID of “_sync:rb:/PY2nTSD4RxlajUNSVz/P3Nc2kYxB5VkuTEwPjz5Z9s=”.
We are doing CB Server to CB Server replication using the Sync Gateway.
These documents are causing problems for our application because they show up in query results like our normal documents, but they don’t have a valid ID and don’t really represent a proper entity within the system.
What are these “_sync:rb:/” documents and what should we do with them?
Hi, “_sync:rb” documents are copies of older/alternative revisions which are created when a document has conflicts (like in your case where the same revision has been updated by two people).
I’ve walked through the “Resolving Conflicts” documentation and have struck the following problems.
The method the documentation outlines for creating a conflict doesn’t create any “_sync:rb” documents.
I successfully implemented the algorithm for resolving the conflict and it resolved the conflict for the test scenario from the documentation.
It also detected the conflict for our application and “resolved” it - but it doesn’t seem to do anything about deleting the “_sync:rb” documents. They still exist in the database and are still causing problems for our application.
There doesn’t seem to be any visible linkage between the “_sync:rb” documents and the application document. They don’t share the same ID or revision numbers. I know it’s the same document by looking at other attributes in the document, but I can’t see how CB is maintaining any linkage between the documents.
I’m still stuck with the same problem of these “_sync:rb” documents cluttering up the database. I could just do a brute-force delete on them, but that seems wrong. Any help appreciated, and thank you in advance.
“_sync:rb” documents are only created if it’s deemed inefficient to store the document body inline in the parent’s own metadata. This threshold is currently 250 bytes, so I suspect the test you did was using a small enough document that it was inlined instead.
These documents absolutely are directly correlated to an existing doc and rev. We generate the key from those two properties by base64-encoding the SHA-256 hash of the docID/revID.
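As a rough illustration of that derivation, here is a minimal Python sketch. The function name `rev_body_key` is hypothetical, and the exact way the docID and revID are combined before hashing (plain concatenation here) is an assumption on my part, not something confirmed in this thread. The output shape does match the ID quoted above: the “_sync:rb:” prefix followed by 44 characters of standard base64.

```python
import base64
import hashlib

# Prefix taken from the document IDs quoted in this thread.
REV_BODY_PREFIX = "_sync:rb:"


def rev_body_key(doc_id: str, rev_id: str) -> str:
    """Derive a "_sync:rb:" style key from a document ID and revision ID.

    ASSUMPTION: the hash input is doc_id immediately followed by rev_id.
    The thread only says the key is the base64 of a SHA-256 hash of both.
    """
    digest = hashlib.sha256()
    digest.update(doc_id.encode("utf-8"))
    digest.update(rev_id.encode("utf-8"))
    # Standard base64 alphabet, which is why "/" and "=" can show up in keys.
    encoded = base64.b64encode(digest.digest()).decode("ascii")
    return REV_BODY_PREFIX + encoded
```

If the assumption holds, you could work out which document a given “_sync:rb” key belongs to by computing the key for each candidate docID/revID pair and comparing it with the key you see in the bucket.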
Please do be aware that Sync Gateway has to store data inside the Couchbase Server bucket in order for it to work properly. You cannot go around deleting “_sync: …” documents and expect everything to still work OK in Sync Gateway.
If these documents are causing issues for your application, can you not filter out documents with a “_sync:” prefix? We’ve namespaced all of our metadata documents for this reason.
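A minimal sketch of that prefix filter, done client-side in Python. It assumes each query result row is a dict carrying the document key under an `"id"` field (for example from selecting `META().id AS id`); the helper names are hypothetical.

```python
SYNC_PREFIX = "_sync:"  # namespace used for Sync Gateway metadata documents


def is_sync_gateway_key(doc_id: str) -> bool:
    """Return True when the key belongs to the "_sync:" namespace."""
    return doc_id.startswith(SYNC_PREFIX)


def application_rows(rows):
    """Keep only rows whose key is outside the "_sync:" namespace.

    ASSUMPTION: every row is a dict with the document key stored under "id".
    """
    return [row for row in rows if not is_sync_gateway_key(row["id"])]


rows = [
    {"id": "customer::42", "name": "Alice"},
    {"id": "_sync:rb:/PY2nTSD4RxlajUNSVz/P3Nc2kYxB5VkuTEwPjz5Z9s="},
]
print(application_rows(rows))  # only the customer::42 row remains
```

The same check can probably be pushed into the query itself with a predicate on `META().id`. Keep in mind that `_` is a single-character wildcard in a N1QL `LIKE` pattern, so it may need escaping there.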
Thanks, that explains the gaps in my understanding of how things work.
But it leads to the next problem. In order to filter out the “_sync:” documents we’re going to have to modify every piece of N1QL in the entire application and that’s a bit fiddly when you have to start dealing with joins as well.
Is there any global query setting to filter out these Couchbase-specific documents? They don’t belong to the application, so why should a normal-looking SELECT query return them?
This particular scenario should improve with the introduction of collections in upcoming versions of Couchbase Server and Couchbase Mobile which will allow you to separate documents into different collections inside the same bucket.
No. As mentioned earlier in this thread, the _sync:rb documents arise from documents that have active conflicts. These need resolving even if we store the conflict data inline instead of creating a separate document.
Trying to fit your documents under the size threshold to avoid separate _sync:rb docs being created is just avoiding one particular symptom of having conflicting revisions in the database.