The Java Transactions SDK prints the following in our application:
Only one Transactions object should be created per application, but have found 2 doing background cleanup of transactions. This will degrade app performance and should be fixed immediately.
We do create two Transactions objects per application (an application meaning one JVM). However, each Transactions object is for a separate bucket, because the application itself uses two buckets at a time: one regular bucket and one ephemeral.
Here is how the Transactions objects are being created:
Transactions.create(
    javaCluster,
    TransactionConfigBuilder.create()
        .logOnFailure(true, Event.Severity.TRACING)
        .cleanupLostAttempts(true)
        .cleanupClientAttempts(true)
        .metadataCollection(javaTransactionsCollection)
        .numATRs(couchbaseConfiguration.numATRs)
        .durabilityLevel(couchbaseConfiguration.durabilityLevel)
        .cleanupWindow(couchbaseConfiguration.cleanupWindow)
        .expirationTime(couchbaseConfiguration.expirationTime)
        .keyValueTimeout(couchbaseConfiguration.keyValueTimeout)
        .build()
)
This code runs once per bucket, and each time a separate collection is provided:
.metadataCollection(javaTransactionsCollection)
It seems logical to have a separate transaction metadata collection, named transaction, for each bucket the application uses. In that case, the following warning seems unnecessary:
Only one Transactions object should be created per application, but have found 2 doing background cleanup of transactions. This will degrade app performance and should be fixed immediately.
Is it okay to have a separate Transactions metadata collection for each bucket? Or would it be better to use one bucket’s metadata collection to record transactions?
Thanks,
Z
Hi @zoltan.zvara
This is great timing, as just 2 days ago I published an update to our docs describing when it’s safe to use multiple Transactions objects, on the "Using Couchbase Transactions" page of the Couchbase Docs. For example, when using metadata collections, it is fine to have multiple Transactions objects.
However, the code should not log that warning if metadata collections are in use. I’ve just found that it currently does log it in that case. I will get that fixed for the next release.
Is it okay to have a separate Transactions metadata collection for each bucket? Or would it be better to use one bucket’s metadata collection to record transactions?
Also worth considering is the default behaviour, without custom metadata collections. In that mode the library is allowed to create a set of metadata documents on each bucket. The metadata document used will be on the same bucket and vbucket as the first mutated document in the transaction. This could lead to a small performance win if you are using any of the persist Durability levels (i.e. where the disk write must complete), as it is more likely that those first two writes will be performed in the same underlying disk flush.
But to answer your questions: yes, it is ok to have separate metadata collections per bucket, but it is perhaps not optimal. Some factors to consider as part of the choice:
- Each set of metadata adds a small amount of cleanup overhead. By default you have 1,024 documents in each metadata set, and the cleanup process is tuned to read all documents every 60 seconds - around 17 reads per second. So the overhead is very small, and perhaps not something to worry about unless dealing with dozens/hundreds of metadata sets.
- The default behaviour - if metadata collections are not used - is for each Transactions object to do cleanup checking of each bucket. This gives you some useful redundancy. With metadata collections, if all your apps checking bucket A go down, then nothing is doing cleanup on bucket A.
- OTOH, having a separate metadata collection per bucket does provide nice separation of concerns, and isolates the buckets from each other.
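To put the cleanup overhead from the first point into numbers, here is a minimal sketch of the arithmetic only. It does not use the Transactions SDK. The class and method names are hypothetical, and the defaults (1,024 documents per metadata set, a 60-second cleanup window) are the figures quoted above. If numATRs or cleanupWindow are configured differently, the inputs would presumably change accordingly.

```java
import java.util.Locale;

// Hypothetical helper: estimates the background cleanup read rate.
// It only reproduces the arithmetic from the post and is not part of any Couchbase API.
public class CleanupOverhead {

    // Figures quoted in the post: 1,024 metadata documents per set,
    // all of them read once per 60-second cleanup window.
    public static final int DEFAULT_DOCS_PER_SET = 1024;
    public static final int DEFAULT_WINDOW_SECONDS = 60;

    // Reads per second needed to scan every metadata document once per window.
    public static double readsPerSecond(int metadataSets, int docsPerSet, int windowSeconds) {
        return (double) metadataSets * docsPerSet / windowSeconds;
    }

    public static void main(String[] args) {
        // One set, two sets (the two-bucket case in this thread), and larger counts.
        for (int sets : new int[] {1, 2, 50, 200}) {
            double rate = readsPerSecond(sets, DEFAULT_DOCS_PER_SET, DEFAULT_WINDOW_SECONDS);
            System.out.printf(Locale.ROOT, "%d metadata set(s): ~%.1f reads/second%n", sets, rate);
        }
    }
}
```

With two metadata sets, as in the two-bucket setup discussed here, this works out to roughly 34 reads per second in total. The cost grows linearly with the number of sets, so it only starts to matter at dozens or hundreds of them.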
I would say that we added custom metadata collections more for a microservices style use-case, where each microservice is isolated to only access a very limited number of collections. But, it can certainly also be used for your purpose.