Partial update (mutateIn) inside transactions?

Hi,

Is there any way to do a partial update inside a transaction?

All I need is to update a single field, so I would like to avoid having to fetch the entire document and do a full update. The documents can be relatively big, and performance for this use case is critical. For that reason I would also like to avoid using N1QL.

To clarify, the transaction involves updates to several documents.

Thanks,
Lucas

Hi @Lucas_Majerowicz

There is no support for key-value Sub-Document operations inside transactions at this time. Thanks for raising this though; requests like this help us know where to focus our engineering effort in the future.

It would be worth testing the performance with N1QL. It won’t be quite as efficient as a key-value Sub-Document operation, for reasons including the query parsing cost, but with USE KEYS there is no index lookup required and the query service can fetch the documents directly from the key-value service. And while the query service will be fetching the full documents, that traffic stays intra-cluster.
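
For illustration, a partial update via query inside a transaction might look roughly like this with the Java SDK. This is a minimal sketch only: bucket, key and field names are placeholders, and it assumes a release where N1QL statements can run inside the transaction lambda.

```java
import com.couchbase.client.java.Cluster;

public class PartialUpdateInTxnSketch {
    public static void run(Cluster cluster) {
        cluster.transactions().run(ctx -> {
            // USE KEYS skips the index lookup: the query service fetches the
            // document directly from the key-value service by its key.
            ctx.query("UPDATE `orders` USE KEYS 'order::1001' SET status = 'shipped'");
        });
    }
}
```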

Hi @graham.pople ,

Thanks for your prompt response.

Follow-up question: say I have 100k ‘elements’ to process. For each element I need to do 1 full document update and 2 partial document updates. It’s important to have atomicity for the 3 updates of every element, but I don’t need atomicity across the entire 100k elements. If the process fails in the middle, that’s ok as long as elements are left in a consistent state: either 0 or 3 updates applied for every element.

To maximize performance, would it be better to have multiple parallel single-element transactions (100k transactions each having 3 updates)? Or maybe having each transaction update multiple elements would be better (fewer but bigger transactions, e.g. 10k transactions each one having 30 updates)?

Note that each partial document update would be done using N1QL as you suggested. So each transaction would be 1 key-value replace and 2 queries with USE KEYS, along the lines of the sketch below.
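
Roughly what I have in mind per element (a sketch only: bucket, collection and field names are placeholders, and in real code the keys would be passed as query parameters rather than concatenated):

```java
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;

public class PerElementTxnSketch {
    public static void process(Cluster cluster, Collection collection,
                               String mainId, JsonObject newContent,
                               String sideId1, String sideId2) {
        cluster.transactions().run(ctx -> {
            // 1 full document update via key-value.
            var mainDoc = ctx.get(collection, mainId);
            ctx.replace(mainDoc, newContent);

            // 2 partial updates via N1QL with USE KEYS (keys inlined here
            // only to keep the sketch short).
            ctx.query("UPDATE `data` USE KEYS '" + sideId1 + "' SET counter = counter + 1");
            ctx.query("UPDATE `data` USE KEYS '" + sideId2 + "' SET lastProcessed = NOW_STR()");
        });
    }
}
```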

Thanks again,
Lucas

Hi @Lucas_Majerowicz
Great question. Under the hood, each transaction attempt writes three times to a document called an Active Transaction Record (PENDING, COMMITTED, and removal). And each key-value document write (insert, replace, remove) is doubled: once for staging, once for committing. So 100k transactions each doing 1 element (3 updates) will be 100k * (3 ATR writes + 3 staging writes + 3 committing writes) = 900k total writes, while 10k transactions would be 10k * (3 ATR writes + 30 staging writes + 30 committing writes) = 630k total writes. So the fixed per-attempt overhead of the ATR writes clearly steers towards fewer-but-larger transactions.
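
If it helps, here is that arithmetic as a throwaway helper (purely illustrative, not an SDK API):

```java
public class TxnWriteEstimate {
    // Each element = 1 replace + 2 partial updates = 3 document writes, each
    // doubled (staged + committed), plus 3 ATR writes per transaction attempt.
    static long estimateWrites(long totalElements, int elementsPerTxn) {
        long txns = totalElements / elementsPerTxn;
        long docWritesPerTxn = 3L * elementsPerTxn;
        return txns * (3 + 2 * docWritesPerTxn);
    }

    public static void main(String[] args) {
        System.out.println(estimateWrites(100_000, 1));  // 900000
        System.out.println(estimateWrites(100_000, 10)); // 630000
    }
}
```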

But there is a balancing act. If a transaction conflicts with another, it must roll back and retry, which adds more writes. Smaller transactions will probably be less likely to conflict. So it really depends on your workload: if conflicts are unlikely in your case, you can probably push towards bigger transactions.

Those are the main performance considerations. The transactions protocol is very well distributed: writes will be distributed pretty evenly across all key-value nodes thanks to auto-sharding, and there are 1,024 Active Transaction Records by default to distribute over, so there aren’t any hotspots there that you need to avoid.

One final thing to note, since you are pushing these updates through N1QL, is that the query service handles statements serially. E.g. 3 query nodes will be processing at most 3 statements at a time, regardless of how many parallel transactions are in flight. This is another factor that steers me towards recommending fewer-but-larger transactions.

Hi @graham.pople,

Thanks for the detailed answer. I’ll need to run some tests and see what transaction size gives the best performance.

Best,
Lucas

I realize this is old (I was looking for something else), but why not have the application get the document, apply the full update followed by the two partial updates in memory, and then replace the document? Even without a transaction, that single replace will be atomic.
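
Something along these lines (a sketch assuming all three changes land on the same document; field names are made up). The CAS on the replace guards against concurrent writers:

```java
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;
import com.couchbase.client.java.kv.GetResult;

import static com.couchbase.client.java.kv.ReplaceOptions.replaceOptions;

public class SingleDocUpdateSketch {
    public static void update(Collection collection, String id, JsonObject newPayload) {
        GetResult got = collection.get(id);
        JsonObject content = got.contentAsObject();

        content.put("payload", newPayload);                     // the "full" update
        content.put("counter", content.getInt("counter") + 1);  // partial update 1
        content.put("status", "processed");                     // partial update 2

        // One replace, applied atomically; it fails (and could be retried)
        // if the document changed since the get.
        collection.replace(id, content, replaceOptions().cas(got.cas()));
    }
}
```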