I have a demo environment that we use for testing and development of our mobile app and server-side web app.
It makes sense to clone the current production environment now and again with a “clean” set of updated data - and without a lot of sync/replication history floating around…
So I have previously done a full backup (bucket name is “data”):
/opt/couchbase/bin/cbbackup couchbase://db1.prod-env.dk:8091 /home/couchbase/backup -b data -u xxxx -p yyyy
And then restored it on the test server:
/opt/couchbase/bin/cbrestore /home/couchbase/backup couchbase://db2.test-env.dk:8091 -b data -u xxxx -p yyyy
There is also a Sync.Gateway connected to the test environment (not in the production environment yet).
So this leads me to the following questions for best practice:
Should I delete the data bucket from the test server prior to restoring?
If so - should I do this on both my nodes in the cluster?
Should I do anything on the Sync.Gateway server prior to the restore?
I guess I should do a resync on the Sync.Gateway server after the restore?
Anything else I need to do to get a “clean” start on the test environment (losing e.g. replication history)?
Hi @jda and apologies for the delay in getting back to you.
Backup and Restore this way is a perfectly acceptable approach to getting a demo/dev environment and you shouldn’t necessarily need to do anything else. You could also consider using XDCR from your production to development/testing to keep a copy up to date. Note that you can ‘pause’ XDCR without any ill effects.
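As a rough sketch of the XDCR option (using the hostnames and credentials from your commands; the remote cluster name “test” is just a label, and the exact flags can vary a little between Couchbase Server versions), the CLI setup run against the production cluster would look something like:
/opt/couchbase/bin/couchbase-cli xdcr-setup -c db1.prod-env.dk:8091 -u xxxx -p yyyy --create --xdcr-cluster-name test --xdcr-hostname db2.test-env.dk:8091 --xdcr-username xxxx --xdcr-password yyyy
/opt/couchbase/bin/couchbase-cli xdcr-replicate -c db1.prod-env.dk:8091 -u xxxx -p yyyy --create --xdcr-cluster-name test --xdcr-from-bucket data --xdcr-to-bucket data
Pausing and resuming is then done with xdcr-replicate --pause / --resume (using the replicator id from --list), or from the XDCR tab in the web console.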
To answer your questions:
Should I delete the data bucket from the test server prior to restoring?
[pk] - Most likely, yes…though it’s not strictly necessary. A ‘restore’ in Couchbase is more akin to a merge…it will overwrite documents that already exist, but it won’t delete documents that are in the bucket and not in the backup. So a delete and recreate of the bucket will give you the cleanest state. Note also that our Enterprise Edition backup tool has more options for backing up different components, managing backup archives, performing incremental backups, etc.
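If you do go the delete-and-recreate route, a minimal sketch with couchbase-cli (same credentials as your commands; the 1024 MB RAM quota is just a placeholder - match whatever the bucket uses today) would be:
/opt/couchbase/bin/couchbase-cli bucket-delete -c db2.test-env.dk:8091 -u xxxx -p yyyy --bucket data
/opt/couchbase/bin/couchbase-cli bucket-create -c db2.test-env.dk:8091 -u xxxx -p yyyy --bucket data --bucket-type couchbase --bucket-ramsize 1024
Then run your cbrestore as before.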
If so - should I do this on both my nodes in the cluster?
[pk] - If the nodes are in the same cluster, there is no need to manage them separately. Deleting and recreating the bucket from anywhere will act on the cluster as a whole.
Should I do anything on the Sync.Gateway server prior to the restore?
[pk] - In theory no, but shutting it down first will give you the cleanest behavior. Otherwise, as you do the restore, SGW will be trying to import and resync those items as they come in. In a dev environment this probably doesn’t matter, but if you had clients or applications accessing SGW they would see a stream of documents and sync activity.
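For example, assuming Sync Gateway was installed from the official package and runs as the sync_gateway service (adjust to however you launch it):
sudo service sync_gateway stop
(delete/recreate the bucket and run cbrestore on the test cluster)
sudo service sync_gateway start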
I guess I should do a resync on the Sync.Gateway server after the restore?
[pk] - SGW should take care of all of this for you either when you start it back up again or incrementally as you are loading data.
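If you ever do want to force it manually, Sync Gateway has a _resync endpoint on its admin port (4985 by default), and the database has to be offline while it runs. A sketch, assuming the SGW database is also named “data” and that you run this on the Sync Gateway host itself:
curl -X POST http://localhost:4985/data/_offline
curl -X POST http://localhost:4985/data/_resync
curl -X POST http://localhost:4985/data/_online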
Anything else I need to do to get a “clean” start on the test environment (losing e.g. replication history)?
[pk] - Depends a bit on how “clean” you want it to be. Keeping replication history is actually a necessary part of managing a multi-master system and also shouldn’t interfere with the functioning of your demo application. So unless you’re looking to accomplish something specific, I would suggest leaving everything as the defaults out of the box.
Thanks for reaching out and asking these questions. Are you seeing any behavior that you think is unexpected?
Thanks for getting back! Yes, I was starting to wonder if it was a silly question - or a difficult one.
Thanks for the feedback - and no, I’m not experiencing problems right now. I’m just trying to get to know the platform better. So after having done a lot of testing in the demo/dev environment, it is sometimes nice to have a clean start before testing new features.
So from your answer I take it that the “replication history” lives in the Couchbase server - and not in the Sync.Gateway. Then it makes sense to do as you describe.
No, you’ll need to take a backup of the whole cluster if you want all the data. Couchbase automatically and transparently shards all of the buckets across all of the nodes with the data service enabled on them.
So there is no compression in the backup? It appears to be more or less the same size as the database shown in the web console… That’s why I thought I only had to back up one of the two nodes…
There’s also compression on the server-side in the Enterprise Edition.
Keep in mind that the Enterprise Edition is free to download and develop on, so if you think you’d like to take advantage of some of those capabilities, you can do so now and then get a license when you’re ready to go into production.
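For reference, the Enterprise backup tool mentioned above is cbbackupmgr, which keeps backups in a compressed archive and supports incrementals. A rough equivalent of your cbbackup command (the archive path and repo name here are just placeholders) would be:
/opt/couchbase/bin/cbbackupmgr config --archive /home/couchbase/backup-archive --repo demo
/opt/couchbase/bin/cbbackupmgr backup --archive /home/couchbase/backup-archive --repo demo --cluster couchbase://db1.prod-env.dk:8091 --username xxxx --password yyyy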
The customer for this first project on Couchbase is a university with (very) limited financial resources… which is why the Community Edition is a very good match in this case.
But I plan on working with this platform in my future projects.
Could be… - but not in the next couple of months, as I have a really hard deadline that has also been challenged by some of the other issues I have reported here (not this one).