Migrating Buckets to Collections & Scopes via Eventing: Part 1
First I want to point out an excellent blog written by Shivani Gupta, How to Migrate to Scopes & Collections in Couchbase 7.0, which covers in great detail other methods of migrating bucket-based documents to Scopes and Collections in Couchbase. I encourage you to also read about the multiple non-Eventing methods that Shivani touches upon.
Whether you’re new to Couchbase or a seasoned vet, you’ve likely heard about Scopes and Collections. If you’re ready to try them, this article helps you make it happen.
Scopes and Collections are a new feature introduced in Couchbase Server 7.0 that allows you to logically organize data within Couchbase. To learn more, read this introduction to Scopes and Collections.
You should take advantage of Scopes and Collections if you want to map your legacy RDBMS to a document database or if you’re trying to consolidate hundreds of microservices and/or tenants into a single Couchbase cluster (resulting in much lower TCO).
Using Eventing for Scopes & Collections Migration
In this article, I’ll discuss the mechanics of another high-performance method to migrate from an older Couchbase version to Scopes and Collections in Couchbase 7.0: the Eventing service.
You only need the Data Service (or KV) and Eventing to migrate from buckets to collections. In a well-tuned, large Couchbase cluster, you can migrate over 1 million documents a second. Yes, no N1QL, and no index needed.
In the follow-up post (Part 2), I will provide a simple, fully automated methodology to do large migrations with dozens (or even hundreds) of data types via a simple Perl script.
Prerequisites: Learning about Eventing
In this article, we will use the latest version of Couchbase (7.0.2), but prior 7.0 versions work fine as well.
If you are not familiar with Couchbase or the Eventing service, please walk through the following resources, including at least one Eventing example:
- Set up a working Couchbase 7.0 server as per the directions under “Start Here!”
- Understand how to deploy a basic Eventing function as per the directions in the Data Enrichment example. Look at “Case 2” where we will only use the “source” bucket:
- Two buckets “bulk” and “rr100” of size 100 MB.
- A source keyspace “bulk.data.source”.
- An eventing scratchpad keyspace “rr100.eventing.metadata”.
- See the documentation for detailed steps on how to create a bucket.
Eventing Function: ConvertBucketToCollections
Eventing allows you to write pure business logic. The Eventing service takes care of the entire infrastructure needed to manage and scale your function (horizontally and vertically) across multiple nodes in a performant and reliable fashion.
All Eventing functions have two entry points: OnUpdate(doc, meta) and OnDelete(meta, options). Note that we’re not worried about the latter entry point in this example.
When a document changes or mutates (insert, upsert, replace, etc.), a copy of the document and some metadata about the document are passed to a small JavaScript entry point, OnUpdate(doc, meta).
Eventing Functions can be deployed with two different Deployment Feed Boundaries, either “From now” or “Everything“. The latter allows access to every current document in a Bucket in Couchbase 6.6 or a Keyspace (Bucket/Scope/Collection) in Couchbase 7.0.
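To make the difference between the two feed boundaries concrete, here is a minimal conceptual sketch in plain JavaScript. It is only a mental model and not how the Eventing service is implemented: simulateFeed, collectIds and the sample document keys are hypothetical names invented for this illustration.

```javascript
// Conceptual model only: NOT the Eventing service's implementation.
// existingDocs: documents already in the keyspace when the function is deployed.
// laterMutations: [id, doc] pairs written after deployment.
function simulateFeed(boundary, existingDocs, laterMutations, onUpdate) {
  if (boundary === "Everything") {
    // "Everything" also delivers each document that already exists.
    for (const [id, doc] of Object.entries(existingDocs)) {
      onUpdate(doc, { id });
    }
  }
  // Both boundaries deliver mutations that arrive after deployment.
  for (const [id, doc] of laterMutations) {
    onUpdate(doc, { id });
  }
}

// Returns the document ids a handler would see for the given boundary.
function collectIds(boundary) {
  const existingDocs = {
    "beer::1": { type: "beer" },       // hypothetical existing document
    "brewery::1": { type: "brewery" }, // hypothetical existing document
  };
  const laterMutations = [["beer::2", { type: "beer" }]]; // hypothetical new write
  const seen = [];
  simulateFeed(boundary, existingDocs, laterMutations, (doc, meta) => seen.push(meta.id));
  return seen;
}

console.log(collectIds("Everything")); // all three ids
console.log(collectIds("From now"));   // only "beer::2"
```

Under this model, a migration of data that already sits in a bucket needs the “Everything” boundary, since “From now” would never deliver the existing documents.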
The scriptlet ConvertBucketToCollections from the main Eventing docs shows how to utilize Eventing to take data from a source bucket to a destination bucket and split your data into collections.
Step 1: Load Sample Data
In the Couchbase UI, select “Settings/Sample Buckets”. Check beer-sample and click on the button “Load Sample Data”.
Step 2: Make the Needed Keyspaces
This example requires three buckets: “beer-sample” (i.e., your document store to migrate), “rr100” (i.e., a scratchpad for Eventing that can be shared with other Eventing functions) and “bulk” (the bucket in which to create your migrated collections). The “rr100” and “bulk” buckets should each have a minimum size of 100 MB.
In the Couchbase UI, select “Buckets” and hit the “ADD BUCKET” link in the upper right.
Create two Buckets with size 100 MB, “rr100” (for the Eventing storage or scratch pad) and “bulk” (for the migration target).
In Bucket “rr100″ create scope “eventing“.
In the Scope “rr100.eventing” create the collection “metadata“.
In Bucket “bulk” create scope “data“.
In the Scope “bulk.data” create the collections “beer” and “brewery“.
At this point you should have three (3) buckets: “beer-sample”, “bulk” and “rr100”. The “bulk” bucket contains the collections “beer” and “brewery” in the scope “data”, and the “rr100” bucket contains the collection “metadata” in the scope “eventing”.
Step 3: Create the Eventing Function
In the Couchbase UI, select “Eventing” and hit the “ADD FUNCTION” link in the upper right.
The settings for the Eventing Function are as follows:
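As a text summary, the settings implied by the script and by the keyspaces created in Step 2 can be sketched as the JavaScript object below. This is a reading aid only, not an Eventing import or export format. The field names are hypothetical, and the feed boundary, the access levels and the constant values are assumptions inferred from how the script uses them.

```javascript
// Reading aid only: NOT an Eventing export/import format; field names are hypothetical.
const functionSettings = {
  name: "ConvertBucketToCollections",
  sourceKeyspace: "beer-sample._default._default", // the documents to migrate
  metadataKeyspace: "rr100.eventing.metadata",     // the Eventing scratchpad
  feedBoundary: "Everything",                      // assumption: existing documents must be processed
  bucketBindings: {
    // alias used in the script -> keyspace; read/write access is assumed
    // because the script writes to the targets and may delete from the source
    src_col:     { keyspace: "beer-sample._default._default", access: "read and write" },
    beer_col:    { keyspace: "bulk.data.beer",                access: "read and write" },
    brewery_col: { keyspace: "bulk.data.brewery",             access: "read and write" },
  },
  constantBindings: {
    DO_COPY: true,    // assumption: required for documents to appear in the target collections
    DO_DELETE: false, // assumption: leaves the source documents in place
  },
};

console.log(Object.keys(functionSettings.bucketBindings));
```

Setting DO_DELETE to true instead would turn the copy into a move, because the script removes each source document once its copy exists in the target collection.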
Hit the button “Save” then paste this script in the Function Editor panel:
    function OnUpdate(doc, meta) {
        // Route documents of type 'beer' to the beer collection.
        if (doc.type === 'beer') {
            if (DO_COPY) beer_col[meta.id] = doc;
            if (DO_DELETE) {
                if (!beer_col[meta.id]) { // safety check
                    log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
                } else {
                    delete src_col[meta.id];
                }
            }
        }
        // Route documents of type 'brewery' to the brewery collection.
        if (doc.type === 'brewery') {
            if (DO_COPY) brewery_col[meta.id] = doc;
            if (DO_DELETE) {
                if (!brewery_col[meta.id]) { // safety check
                    log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
                } else {
                    delete src_col[meta.id];
                }
            }
        }
    }
Your code editor should look like:
Hit the button “Save and Return”
What the ConvertBucketToCollections does
The OnUpdate(doc, meta) logic will process all data in the beer-sample._default._default keyspace and will perform the following on any past (historical) and any new (future) mutations:
- First, the doc.type property is checked in two near-identical code blocks to see if it matches either beer or brewery. If there’s a match, continue.
- A global constant DO_COPY (provided via the Function’s settings as a Constant Binding alias) is checked to see if the item should be copied.
- If DO_COPY is true, the document is written to the target collection or keyspace beer_col or brewery_col (defined via the Function’s settings as Bucket Binding aliases), depending on the code block that matched.
- A global constant DO_DELETE (provided via the Function’s settings as a Constant Binding alias) is checked to see if the item should be removed from the source keyspace or collection.
- If DO_DELETE is true, the document is removed from the collection or keyspace src_col (defined via the Function’s settings as a Bucket Binding alias), but only if a copy already exists in the target collection. Otherwise the delete is skipped and a message is logged.
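The steps above can be tried outside Couchbase with a small local simulation. In this sketch, plain JavaScript objects stand in for the Bucket Binding aliases and an array collects the log messages. The two near-identical blocks are folded into one helper, so it follows the same decisions as the handler without being the scriptlet itself. runMigration and the sample documents are hypothetical names for this illustration.

```javascript
// Local simulation only: plain objects stand in for the Eventing bucket
// bindings, and log() is replaced by an array that collects messages.
function runMigration(sourceDocs, { DO_COPY, DO_DELETE }) {
  const src_col = { ...sourceDocs }; // stands in for beer-sample._default._default
  const beer_col = {};               // stands in for bulk.data.beer
  const brewery_col = {};            // stands in for bulk.data.brewery
  const messages = [];
  const log = (msg) => messages.push(msg);

  // Same decisions as the OnUpdate handler, with both type blocks folded together.
  function OnUpdate(doc, meta) {
    const target = doc.type === "beer" ? beer_col
                 : doc.type === "brewery" ? brewery_col
                 : null;
    if (!target) return;             // documents of other types are ignored
    if (DO_COPY) target[meta.id] = doc;
    if (DO_DELETE) {
      if (!target[meta.id]) {        // safety check: never delete without a copy
        log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
      } else {
        delete src_col[meta.id];
      }
    }
  }

  // Feed every source document through the handler once.
  for (const [id, doc] of Object.entries(sourceDocs)) OnUpdate(doc, { id });
  return { src_col, beer_col, brewery_col, messages };
}

const sample = {
  beer_1: { type: "beer", name: "Example Ale" },           // hypothetical document
  brewery_1: { type: "brewery", name: "Example Brewery" }, // hypothetical document
  other_1: { type: "note" },                               // hypothetical unmatched type
};

// Copy only: targets are filled, the source is untouched.
const copyOnly = runMigration(sample, { DO_COPY: true, DO_DELETE: false });
console.log(Object.keys(copyOnly.beer_col), Object.keys(copyOnly.src_col).length);

// Delete without copy: the safety check skips the deletes and logs instead.
const deleteOnly = runMigration(sample, { DO_COPY: false, DO_DELETE: true });
console.log(deleteOnly.messages.length, Object.keys(deleteOnly.src_col).length);
```

With both constants set to true, the simulation behaves like a move: matched documents end up only in the target objects, and unmatched documents stay in the source.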
We could increase the workers from 1 to the number of vCPUs for better performance, but our dataset is trivial so we just leave the worker count as one (1). Note: The setting for workers is found in the expandable section Settings in the middle of the Function Settings dialog.
Deploying the Eventing Function
Now it’s time to deploy the Eventing function. We’ve reviewed a bit of the code and the design of the ConvertBucketToCollections migration script, and now it’s time to see everything working together.
At this point, we have a function in JavaScript so we need to add it to our Couchbase cluster and deploy it into an active state.
Hit the button “Deploy“.
The Eventing Service takes about 18 seconds to deploy your Eventing Function, at which point you should immediately see 7303 items processed. Since the dataset is static, you are finished once all items have been processed.
Hit the button “Undeploy“.
Looking at the Migrated Data
Now that we are done using the Eventing Function, we can inspect the Buckets and Collections to see what happened.
In the Couchbase UI, select “Buckets”
Now select “Scopes & Collections” for the bucket “bulk”, then expand the scope “data”.
In the Couchbase UI, select “Documents“, then select the Keyspace “bulk.data.beer” and you will see the migrated documents in that collection.
In the Couchbase UI, select “Documents“, then select the Keyspace “bulk.data.brewery” and you will see the migrated documents in that collection.
Let’s Improve the Eventing Function
Remember, Eventing can enrich data on the fly, and if we are truly splitting up a bucket (circa Couchbase 6.x) into separate collections (circa Couchbase 7.0), we no longer need the type property. So let’s modify our Function to transform our data, too.
For example, given the document with key “abhi_brewery” in our source data in beer-sample._default._default:
    {
      "name": "Abhi Brewery",
      "city": "",
      "state": "",
      "code": "",
      "country": "India",
      "phone": "",
      "website": "",
      "type": "brewery",
      "updated": "2011-09-27 00:35:48",
      "description": "",
      "address": []
    }
Here’s the modification to our Eventing Function:
    function OnUpdate(doc, meta) {
        if (!doc.type) return;
        var type = doc.type; // remember the type before it may be dropped
        if (DROP_TYPE) delete doc.type;
        if (type === 'beer') {
            if (DO_COPY) beer_col[meta.id] = doc;
            if (DO_DELETE) {
                if (!beer_col[meta.id]) { // safety check
                    // use the saved type: doc.type may have been deleted above
                    log("skip delete copy not found type=" + type + ", meta.id=" + meta.id);
                } else {
                    delete src_col[meta.id];
                }
            }
        }
        if (type === 'brewery') {
            if (DO_COPY) brewery_col[meta.id] = doc;
            if (DO_DELETE) {
                if (!brewery_col[meta.id]) { // safety check
                    log("skip delete copy not found type=" + type + ", meta.id=" + meta.id);
                } else {
                    delete src_col[meta.id];
                }
            }
        }
    }
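To see what the modified function does to the “abhi_brewery” document, the first lines of the handler can be mirrored in a small local sketch. The helper routeAndStrip is hypothetical and exists only for this illustration. Unlike the handler, it works on a copy so that the input object stays unchanged.

```javascript
// Local illustration only: mirrors the first lines of the modified handler.
function routeAndStrip(doc, DROP_TYPE) {
  if (!doc.type) return null;     // documents without a type are skipped
  const type = doc.type;          // remember the type before it is removed
  const out = { ...doc };         // shallow copy; the real handler mutates doc directly
  if (DROP_TYPE) delete out.type;
  return { collection: type, doc: out };
}

const abhi = {
  name: "Abhi Brewery", city: "", state: "", code: "", country: "India",
  phone: "", website: "", type: "brewery",
  updated: "2011-09-27 00:35:48", description: "", address: [],
};

const result = routeAndStrip(abhi, true);
console.log(result.collection);    // "brewery"
console.log("type" in result.doc); // false
console.log(result.doc.name);      // "Abhi Brewery"
```

The type is captured before it is deleted because the routing decision still depends on it after the property has been removed from the document.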
Since we add one new global constant, DROP_TYPE, we also add it to the Function’s settings as a Constant Binding alias.
Final Thoughts
If you found this article helpful and are interested in continuing to learn about Eventing, see the Couchbase Eventing Service documentation.
Now that you understand the mechanics of using Eventing to migrate your buckets to scopes and collections, please explore the follow-up post (Part 2), where I provide a simple, fully automated methodology to do large migrations with dozens of data types via a simple Perl script.
Resources
- Download: Download Couchbase Server 7.0
- Eventing Scriptlet: Function: ConvertBucketToCollections
References
- Couchbase Eventing documentation
- What’s New: Couchbase Server 7.0
- How to Migrate to Scopes & Collections in Couchbase 7.0
- Other Couchbase blogs on Eventing
I would love to hear from you on how you liked the capabilities of Couchbase and the Eventing service, and how they benefit your business going forward. Please share your feedback via the comments below or in the Couchbase forums.