Data modeling - Migrating from mongoDb to couchbase

I want to migrate my banking system from mongodb to couchbase
In mongodb I have 2 collection transactions and activity

The transactions collection contains _id,sourceUserId,destinationUserId,amount,tax,… fields , And I shard it by _id

The activity collection , is a collection that depends on transactions , When I insert/update a document in transactions I insert/update 2 documents in activity

The activity collection contains _id, transactionId,userId,sourceUserId,destinationUserId,amount,tax,… fields , And I shard it by userId

When user A send money to user B we have these 3 documents

transactions collection :
_id=1,sourceUserId=A,destinationUserId=B,amount=100,tax=0

activity collection :
_id=100,transactionId=1,userId=A,sourceUserId=A,destinationUserId=B,amount=100,tax=0

_id=101,transactionId=1,userId=B,sourceUserId=A,destinationUserId=B,amount=100,tax=0

transactions collection is only for internal usage and activity collection is for report to end user

So when user A need to fetch the transaction activities , I target a shard as actitity collection is sharded by userId

How can I migrate to couchbase?
As activity collection is duplicate data of transactions collection , Is it needed to migrate this collection? As there is no shard key in couchbase?

What is the best model design for couchbase?

1- Just migrate transactions and create two GSI on sourceUserId and destinationUserId , when user A need to fetch activities , I can run SELECT * FROM myBucket WHERE type="transaction" AND (sourceUserId="A" OR destinationUserId="A") , Is it good? As I have heavy read

2-I can create view (But I dont know how , and it is better that GSI approach?)

3-Can I have same data model as mongodb? 1 transaction + 2 activity document , How can I efficiently fetch user A activities

4-Is there any better ways?

Hi @socketman2016, approach 1 seems good and is where I’d start. You can always try out a view later on to see if it’s faster (or Couchbase Analytics, which is currently in beta).

@graham.pople thank you very much

As I have RDBMS , cassandra and mongodb background and I am newbie in couchbase ,Couchbase in a little strange for me!!

In other databases we have shard/partitioning and there is a SHARD KEY , and in multi tenant application always we shard data by tenant id

In couchbase we have a bucket , 1024 vBucket and each document has a unique KEY , I know , as far as we have KEY we can retrieve doc very fast

In mongoDB we have a Targeted Operations vs. Broadcast Operations

image

image

I know that in multi tenant application we must focus on Targeted Operations

In couchbase , Views use a process called scatter-gather , is it simillar to Broadcast Operations? or not?

approach 1 seems good and is where I’d start.

How exactly it works? I cannot understand how exactly couchbase works
As far as I know , When I run SELECT * FROM myBucket WHERE type="transaction" AND (sourceUserId="A" OR destinationUserId="A") , As I have two GSI on sourceUserId and destinationUserId , couchbase firstly scan these indexes (index service) and at this time we know KEYS then easily retrieve docuement based on KEYS and then filter it type="transaction" , Right? or not?
If right? there is no Targeted Operations in couchbase, Right?

Another question , As far as I know , The low cardinality field is not a good choice for indexing , Is it right in couchbase too? I always have a type="something" in where clause

View results are persisted and regularly updated, so they’re very fast to query. Though they do use scatter-gather, it’s just to pull the persisted data from all nodes - e.g. they’re not analogous to Mongo’s broadcast operations. Still, N1QL is the newer technology and should be your default over views for standard queries (though views still have some advantages, e.g. for some aggregations, and reduce operations - you can read more in the view docs).

I work on the SDKs rather than N1QL so I’m no expert on that area, but yes I believe your understanding is correct on what N1QL does under the hood. I would think indexes sourceUserId+type and destinationUserId+type make a lot of sense, but I’ll leave my N1QL colleagues to confirm that.

Is the low cardinality field is not a good choice for indexing in couchbase ? I always have a type="something" in where clause must I index it even I have 5 types

You can include type as index key or you can create partial indexes in cocuahbse. cc @keshav_m

Checkout the following topic in https://blog.couchbase.com/wp-content/uploads/2017/10/N1QL-A-Practical-Guide-2nd-Edition.pdf
Designing Index For Query In Couchbase N1QL
To better understand and help you checkout OPTIMIZATION & INDEXING sections