Operational Transforms Instead of "Most Revisions Wins"

I understand that Couchbase uses a “most revisions wins” strategy for resolving sync conflicts. Has there been any discussion about adopting a more advanced operational transform (OT) approach instead?

Example

Consider this case, where two users, Alice and Bob, start with identical state on a document:

"someDoc": {
    "title": "Blah",
    "children": ["A", "B", "C"]
}

Alice adds a new child (D) while her device is briefly disconnected from the Internet:

"someDoc": {
    "title": "Blah",
    "children": ["A", "B", "C", "D"]
}

Meanwhile, Bob, who remains connected, makes two back-to-back changes on his end: he deletes A and then deletes B. He does not yet have Alice’s change (adding D), so he ends up in this state, which syncs to the cloud:

"someDoc": {
    "title": "Blah",
    "children": ["C"]
}

If I understand the docs correctly, when Alice reconnects and sync completes, Bob’s document “wins” because it has more revisions. So the final state that both Alice and Bob get is:

"someDoc": {
    "title": "Blah",
    "children": ["C"]
}

But this is pretty naive conflict resolution. What we really want is to integrate the changes that each user has made (an operational transform), so that we end up with this final state:

"someDoc": {
    "title": "Blah",
    "children": ["C", "D"]
}
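To make the desired outcome concrete, here is a minimal sketch of a three-way merge for the children array, written in Python purely for illustration. It assumes the app can get hold of the common ancestor (the state both users started from) and that the entries are unique IDs; `merge_children` is a hypothetical helper, not part of any Couchbase API.

```python
def merge_children(base, mine, theirs):
    """Three-way merge of an ordered list of unique IDs (hypothetical helper)."""
    # Anything present in the common ancestor but missing on either side was deleted.
    removed = (set(base) - set(mine)) | (set(base) - set(theirs))
    merged = [c for c in base if c not in removed]
    # Anything absent from the ancestor was added; append it once, in order.
    for side in (theirs, mine):
        for c in side:
            if c not in base and c not in merged:
                merged.append(c)
    return merged


base = ["A", "B", "C"]
alice = ["A", "B", "C", "D"]   # added D while offline
bob = ["C"]                    # deleted A, then B

print(merge_children(base, alice, bob))  # ['C', 'D']
```

This only handles adds and deletes of unique values. Reordering, or edits that depend on array positions, would need more bookkeeping than this sketch does.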

In my app’s case, the entries in the children array are UUIDs of other documents in the object graph, so losing one to the ether corrupts the graph. (I’m currently minimizing the chances of this by disabling edits when sync is offline, but that’s… not great.)

I understand that Couchbase lets me write custom conflict resolvers. But applying operational transforms correctly is a big job that requires a lot of tracking. It’s exactly the sort of thing that a paid sync service should provide, no?

Realm Sync has (had) it. Is there any roadmap for achieving this in Couchbase Sync Gateway?

Operational transforms require live event propagation between clients, as in Google Docs, for example. They’re not appropriate for offline sync.

You may be mistaking them for CRDTs, which do work offline. There are a lot of CRDT algorithms & data structures, with different costs/benefits for different use cases. They all have pretty high time & space overhead compared to simpler conflict resolution. We see them as something for an application developer to choose to use, not something we provide out of the box.
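As a rough illustration of what a CRDT looks like and where the overhead comes from, here is a minimal sketch of an observed-remove set (OR-Set) in Python. The class and method names are made up for this example. It models an unordered set, so it is not a drop-in replacement for an ordered children array, which would need a sequence CRDT.

```python
import copy
import uuid


class ORSet:
    """Minimal observed-remove set; an illustrative sketch, not production code."""

    def __init__(self):
        self.adds = {}        # element -> set of unique tags, one per add
        self.removed = set()  # tombstones: tags that have been removed

    def add(self, element):
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        # Tombstone only the tags this replica has actually observed.
        self.removed |= self.adds.get(element, set())

    def merge(self, other):
        # Merge is a union of both parts, so it is commutative and idempotent.
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed

    def values(self):
        return {e for e, tags in self.adds.items() if tags - self.removed}


origin = ORSet()
for child in ("A", "B", "C"):
    origin.add(child)

alice = copy.deepcopy(origin)
bob = copy.deepcopy(origin)
alice.add("D")       # offline edit
bob.remove("A")
bob.remove("B")

alice.merge(bob)
bob.merge(alice)
print(sorted(alice.values()), sorted(bob.values()))  # ['C', 'D'] ['C', 'D']
```

Note that every add carries a unique tag and every removal leaves a tombstone behind. That metadata is the kind of space overhead mentioned above.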

Nah, it can be done with an OpLog: a history of operations that have occurred on each device. Replay the OpLog and you have the transform without live propagation. (As a bonus, you could then even implement undo support!)
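Here is a minimal sketch of the OpLog idea in Python, assuming each device records its edits as add/remove operations keyed by child ID. `Op`, `replay`, and `invert` are hypothetical names for this example. Because the operations are keyed by value, they can be replayed in any order; operations that refer to array positions would need real transformation logic, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str   # "add" or "remove"
    child: str  # ID of the child document


def replay(state, oplog):
    """Apply a device's logged operations on top of a given state."""
    result = list(state)
    for op in oplog:
        if op.kind == "add" and op.child not in result:
            result.append(op.child)
        elif op.kind == "remove" and op.child in result:
            result.remove(op.child)
    return result


def invert(op):
    """Inverse operation, usable as a naive undo (re-added items go to the end)."""
    return Op("remove" if op.kind == "add" else "add", op.child)


base = ["A", "B", "C"]
alice_log = [Op("add", "D")]
bob_log = [Op("remove", "A"), Op("remove", "B")]

server_state = replay(base, bob_log)         # Bob syncs first
merged = replay(server_state, alice_log)     # Alice's log is replayed on top
print(merged)                                # ['C', 'D']

undone = replay(merged, [invert(alice_log[0])])
print(undone)                                # ['C']
```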

Realm did it: see here

It is possible for me to add my own Collection and write transaction history, sure. But if I’m paying $1,000/mo for Capella, this is the kind of thing I want that money used for. I pay for DBaaS precisely so I don’t HAVE to spend my time reinventing the wheel. I want to outsource conflict resolution to really smart database engineers who can do a better job and cover it with 7,000 unit tests. Querying my custom “transaction” collection inside the custom JavaScript conflict resolution handler: I’m not even sure that’s possible, for one, and the performance would probably be terrible.

Would this conflict resolution strategy be slower than the existing option? Yes. But if it were a choice, it would be the RIGHT tradeoff for many apps.

Capella and Couchbase have been a little frustrating because each time I say, “Hey, the competition has feature X and that’s really state-of-the-art. Can you guys do that?” the answer is, “Build it yourself.” If I have to build my own database system, what’s the point of paying for one, you know?

I’m not an expert on these data structures, but I would call what Realm did more of an event-based CRDT, whose implementation is based on OT principles. ¯\_(ツ)_/¯

I’m just a humble engineer/architect so I can’t speak to the majority of your post. I can say that Couchbase is a smaller company than Mongo.

Querying my custom “transaction” collection inside the custom JavaScript conflict resolution handler

JavaScript? Hang on – if you’re talking about App Services, that’s not where conflict resolution would be happening. App Services only accepts revisions that fit the linear history of the document; when a client tries to push a conflict, the push fails and the client instead pulls the server revision and resolves it locally. So, conflict resolution happens on-device and in the language of your app. There’s no reason it should be slow.
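The flow described above can be sketched as a toy model in Python. Everything here is hypothetical (`ToyServer`, `sync`, `resolve`, integer revision numbers) and none of it is the actual App Services or Couchbase Lite API. It only illustrates the shape of the exchange: the server rejects a push that does not extend its current revision, and the client pulls, resolves locally, and pushes again.

```python
class ToyServer:
    """Accepts a push only if it extends the current revision (toy model)."""

    def __init__(self, body):
        self.rev = 1
        self.body = body

    def push(self, parent_rev, body):
        if parent_rev != self.rev:
            return False      # conflict: the push does not fit the linear history
        self.rev += 1
        self.body = body
        return True


def sync(server, parent_rev, local_body, resolve):
    """Client side: on rejection, pull the server revision, resolve, retry."""
    while not server.push(parent_rev, local_body):
        remote_rev, remote_body = server.rev, server.body
        local_body = resolve(local_body, remote_body)   # runs on the device
        parent_rev = remote_rev
    return local_body


base = ["A", "B", "C"]


def resolve(local, remote):
    # App-defined policy: keep the server's list, then re-apply local additions.
    return list(remote) + [c for c in local if c not in base and c not in remote]


server = ToyServer(list(base))
server.push(1, ["B", "C"])    # Bob deletes A
server.push(2, ["C"])         # Bob deletes B

final = sync(server, 1, ["A", "B", "C", "D"], resolve)   # Alice reconnects
print(final, server.body, server.rev)  # ['C', 'D'] ['C', 'D'] 4
```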

If I have to build my own database system, what’s the point of paying for one, you know?

But you’re not building your own database system. Conflict resolution is a great feature, but it’s frosting atop an extremely tall layer cake that you don’t need to think about because a lot of people spent 15 years assembling it, and a lot of people are, uh, maintaining the cake for you so it doesn’t … tip over or get stale or something. (Great analogy!)

I would love us to build a CRDT-based conflict resolution system for you. I’m hoping that, as Couchbase continues to grow, I’ll get the resources to work on that. 🤞

That would be excellent. Conflict resolution is hard and exactly the kind of thing I want to outsource to an SDK. Handling it myself (A) takes me away from building the actual features of my apps and (B) is very difficult to justify to clients as a development expense.