When I’m out at events, I get a lot of questions regarding the differences between MongoDB and Couchbase Server as they are both in the NoSQL space and are both document databases. One particular question is related to data modeling. MongoDB uses BSON and Couchbase uses JSON, so wouldn’t the data model be different?
We’re going to take a look at some MongoDB document models and see how to accomplish the same in Couchbase with minimal to no effort.
Let me start by saying that both MongoDB and Couchbase have some great documentation on modeling NoSQL documents. Both refer to a similar set of practices that we’ll explore.
Embedded Data Models
Given that you’re able to have incredibly complex NoSQL documents with nested objects and nested arrays, we can do things a little differently than in a relational database.
Take the following MongoDB document for example:
{
    "_id": <ObjectId>,
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": {
        "city": "Mountain View",
        "state": "California"
    },
    "social_media": [
        {
            "type": "twitter",
            "url": "https://www.twitter.com/nraboy"
        },
        {
            "type": "mastodon",
            "url": "https://toot.cafe/@nraboy"
        }
    ]
}
In the above example, we have nested objects and arrays embedded into a single document that is defined by an object id in some MongoDB Collection.
In Couchbase, the same MongoDB document might look something like this:
{
    "_type": "person",
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": {
        "city": "Mountain View",
        "state": "California"
    },
    "social_media": [
        {
            "type": "twitter",
            "url": "https://www.twitter.com/nraboy"
        },
        {
            "type": "mastodon",
            "url": "https://toot.cafe/@nraboy"
        }
    ]
}
You’ll probably notice that the above Couchbase example is nearly identical to the MongoDB example, with the exception of the _id property becoming a _type property. In Couchbase, the document id exists as a meta property rather than within the document itself. Couchbase Server releases before 7.0 have no concept of Collections, so documents are usually differentiated by a type property, but it is not a requirement.
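To make the id-versus-type difference concrete, here is a minimal Python sketch that splits a MongoDB-style document into a Couchbase-style key and body. The function name mongo_to_couchbase and the placeholder id value are assumptions made up for illustration; nothing here talks to either database.

```python
from typing import Any, Dict, Tuple


def mongo_to_couchbase(doc: Dict[str, Any], doc_type: str) -> Tuple[str, Dict[str, Any]]:
    """Split a MongoDB-style document into a (key, body) pair.

    The "_id" leaves the body and becomes the document key, and a "_type"
    property is added so documents of different kinds can be told apart.
    """
    key = str(doc["_id"])
    body = {"_type": doc_type}
    body.update({k: v for k, v in doc.items() if k != "_id"})
    return key, body


mongo_doc = {
    "_id": "person1",  # placeholder; a real MongoDB document would hold an ObjectId
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": {"city": "Mountain View", "state": "California"},
}

key, body = mongo_to_couchbase(mongo_doc, "person")
print(key)
print(body)
```

The nested address object is carried over untouched, which is the point: only the place where the id lives changes.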
Embedding objects and arrays into a single document is great because it limits the number of steps required to do common operations on the data. However, as the document becomes larger, performance could become an issue.
This is where the next document modeling approach comes into play.
Normalized or Referred Data Models
Anyone coming from a relational database knows that data should be normalized across multiple tables within the database. These tables are then joined in some fashion when the data needs to be accessed together.
The same concept can still be applied in NoSQL, even if it does not work exactly the way it does in a relational database.
Going back to the first example, let’s do what MongoDB refers to as a normalized data model:
{
    "_id": <ObjectId1>,
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": <ObjectId2>,
    "social_media": [
        {
            "type": "twitter",
            "url": "https://www.twitter.com/nraboy"
        },
        {
            "type": "mastodon",
            "url": "https://toot.cafe/@nraboy"
        }
    ]
}
The embedded model changed slightly by moving address into its own document. In this case, the address property now equals an object id and the new document looks like this:
{
    "_id": <ObjectId2>,
    "city": "Mountain View",
    "state": "California"
}
Splitting the document into multiple documents could help in several areas. The documents are now smaller, so operations on them can potentially be faster. The data is also normalized in the sense that multiple people can share the same address without the duplication that the embedded model would require.
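As a rough illustration of that split, the Python sketch below moves an embedded address into its own document and leaves a reference behind. The helper name refer_address and the keys person1, person2, and address1 are hypothetical, and a plain dict stands in for the database.

```python
import copy
from typing import Any, Dict, Tuple


def refer_address(person: Dict[str, Any], address_key: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (person_with_reference, address_document) without mutating the input."""
    person = copy.deepcopy(person)
    address_doc = person["address"]      # the embedded object becomes its own document
    person["address"] = address_key      # the person now only holds a reference
    return person, address_doc


embedded = {
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": {"city": "Mountain View", "state": "California"},
}

person1, address1 = refer_address(embedded, "address1")

# A second, made-up person can point at the same address document.
person2 = {"first_name": "Example", "last_name": "Person", "address": "address1"}

documents = {"person1": person1, "person2": person2, "address1": address1}
print(documents)
```

The address is stored once, and both people refer to it by key.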
Per the MongoDB documentation:
… However, client-side applications must issue follow-up queries to resolve the references. In other words, normalized data models can require more round trips to the server.
The application layer is responsible for managing the relationships in the normalized model. The application will also need to make more requests against the database.
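The extra round trips can be sketched in Python with an in-memory stand-in for the database. FakeStore and load_person_with_address are hypothetical names invented for this example, not part of any MongoDB driver; the counter just makes the follow-up lookup visible.

```python
from typing import Any, Dict


class FakeStore:
    """In-memory stand-in for a document database that counts lookups."""

    def __init__(self, docs: Dict[str, Dict[str, Any]]) -> None:
        self._docs = docs
        self.round_trips = 0

    def get(self, key: str) -> Dict[str, Any]:
        self.round_trips += 1
        return self._docs[key]


def load_person_with_address(store: FakeStore, person_key: str) -> Dict[str, Any]:
    """Resolve the address reference in the application layer."""
    person = dict(store.get(person_key))               # first round trip
    person["address"] = store.get(person["address"])   # follow-up round trip
    return person


store = FakeStore({
    "person1": {"first_name": "Nic", "last_name": "Raboy", "address": "address1"},
    "address1": {"city": "Mountain View", "state": "California"},
})

person = load_person_with_address(store, "person1")
print(person["address"]["city"], store.round_trips)
```

Every additional reference in a document adds another lookup of this kind, which is the cost the quote above describes.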
Now what would the normalized model look like in Couchbase? Instead of calling it normalized, it is often called a referred model in Couchbase, and it would look like this:
{
    "_type": "person",
    "first_name": "Nic",
    "last_name": "Raboy",
    "address": "address1",
    "social_media": [
        {
            "type": "twitter",
            "url": "https://www.twitter.com/nraboy"
        },
        {
            "type": "mastodon",
            "url": "https://toot.cafe/@nraboy"
        }
    ]
}
Remember, the id is stored as meta information and most documents have a property to define what type of document it is. Looking at the address property, we are using the key of a different document. The concept is the same as what MongoDB has with object ids, but in MongoDB, the object id is a data type.
Looking at the document containing address data, we have something like this:
{
    "_type": "address",
    "city": "Mountain View",
    "state": "California"
}
The document key or id of the above document would be address1 to match what is expected in the other document.
Here is the kicker though. The referred documents in Couchbase can be joined in a single server-side operation through N1QL rather than forcing the application layer to take care of it.
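As a sketch of what such a join could look like, the Python snippet below builds a N1QL statement that follows the address key from each person document. The bucket name example and the helper build_join_query are assumptions for illustration, and the statement is only printed here; in practice it would be sent through a Couchbase SDK or the query tooling.

```python
def build_join_query(bucket: str) -> str:
    """Build a N1QL lookup join from person documents to their address documents."""
    return (
        "SELECT p.first_name, p.last_name, a.city, a.state "
        f"FROM `{bucket}` AS p "
        f"JOIN `{bucket}` AS a ON KEYS p.address "  # follow the key stored in p.address
        'WHERE p._type = "person"'
    )


# "example" is a hypothetical bucket name.
query = build_join_query("example")
print(query)
```

The ON KEYS clause uses the value of the address property as the key of the document to join, so the relationship is resolved on the server instead of through follow-up requests from the application.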
Conclusion
What I could have probably summarized in a single sentence is that data modeling in MongoDB and data modeling in Couchbase are the same. It doesn’t matter whether one uses BSON or JSON; the concepts apply to both. If you were to switch from MongoDB to Couchbase, everything you knew about MongoDB documents could be carried over.
The core differences lie in how you query the documents. It is much easier to query for data in Couchbase through N1QL and the other query strategies, regardless of how you’ve chosen to model your documents.
If you’re interested in seeing a modeling and querying video I recorded, check it out here.
For more information on using Couchbase, check out the Couchbase Developer Portal which contains examples and other documentation.