I have a demo environment that we use for testing and development of our mobile app and server-side web app.
It makes sense to clone the current production environment now and again with a “clean” set of updated data - and without a lot of sync/replication history floating around…
So I have previously done a full backup (bucket name is “data”):
/opt/couchbase/bin/cbbackup couchbase://db1.prod-env.dk:8091 /home/couchbase/backup -b data -u xxxx -p yyyy
And then restored it on the test server:
/opt/couchbase/bin/cbrestore /home/couchbase/backup couchbase://db2.test-env.dk:8091 -b data -u xxxx -p yyyy
There is also a Sync.Gateway connected to the test environment (not in the production environment yet).
So this leads me to the following questions for best practice:
Should I delete the data bucket from the test server prior to restoring?
If so - should I do this on both my nodes in the cluster?
Should I do anything on the Sync.Gateway server prior to the restore?
I guess I should do a resync on the Sync.Gateway server after the restore?
Anything else I need to do to get a “clean” start on the test environment (losing e.g. replication history)?
Hi @jda and apologies for the delay in getting back to you.
Backup and Restore this way is a perfectly acceptable approach to getting a demo/dev environment and you shouldn’t necessarily need to do anything else. You could also consider using XDCR from your production to development/testing to keep a copy up to date. Note that you can ‘pause’ XDCR without any ill effects.
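As a rough sketch of the XDCR option (using the hostnames and credentials from your commands; the remote cluster name “test” is just a label, and the exact flags can vary a little between Couchbase Server versions), the CLI setup run against the production cluster would look something like:
/opt/couchbase/bin/couchbase-cli xdcr-setup -c db1.prod-env.dk:8091 -u xxxx -p yyyy --create --xdcr-cluster-name test --xdcr-hostname db2.test-env.dk:8091 --xdcr-username xxxx --xdcr-password yyyy
/opt/couchbase/bin/couchbase-cli xdcr-replicate -c db1.prod-env.dk:8091 -u xxxx -p yyyy --create --xdcr-cluster-name test --xdcr-from-bucket data --xdcr-to-bucket data
Pausing and resuming is then done with xdcr-replicate --pause / --resume (using the replicator id from --list), or from the XDCR tab in the web console.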
To answer your questions:
Should I delete the data bucket from the test server prior to restoring?
[pk] - Most likely, yes…though it’s not strictly necessary. A ‘restore’ in Couchbase is more akin to a merge…it will overwrite documents that already exist, but it won’t delete documents that are in the bucket and not in the backup. So a delete and recreate of the bucket will give you the cleanest state. Note also that our Enterprise Edition backup tool has more options for backing up different components, managing backup archives, performing incremental backups, etc.
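If you do go the delete-and-recreate route, a minimal sketch with couchbase-cli (same credentials as your commands; the 1024 MB RAM quota is just a placeholder - match whatever the bucket uses today) would be:
/opt/couchbase/bin/couchbase-cli bucket-delete -c db2.test-env.dk:8091 -u xxxx -p yyyy --bucket data
/opt/couchbase/bin/couchbase-cli bucket-create -c db2.test-env.dk:8091 -u xxxx -p yyyy --bucket data --bucket-type couchbase --bucket-ramsize 1024
Then run your cbrestore as before.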
If so - should I do this on both my nodes in the cluster?
[pk] - If the nodes are in the same cluster, there is no need to manage them separately. Deleting and recreating the bucket from anywhere will act on the cluster as a whole.
Should I do anything on the Sync.Gateway server prior to the restore?
[pk] - In theory no, but shutting it down first will give you the cleanest behavior. Otherwise, as you do the restore, SGW will be trying to import and resync those items as they come in. In a dev environment this probably doesn’t matter, but if you had clients or applications accessing SGW they would see a stream of documents and sync activity.
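For example, assuming Sync Gateway was installed from the official package and runs as the sync_gateway service (adjust to however you launch it):
sudo service sync_gateway stop
(delete/recreate the bucket and run cbrestore on the test cluster)
sudo service sync_gateway start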
I guess I should do a resync on the Sync.Gateway server after the restore?
[pk] - SGW should take care of all of this for you either when you start it back up again or incrementally as you are loading data.
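If you ever do want to force it manually, Sync Gateway has a _resync endpoint on its admin port (4985 by default), and the database has to be offline while it runs. A sketch, assuming the SGW database is also named “data” and that you run this on the Sync Gateway host itself:
curl -X POST http://localhost:4985/data/_offline
curl -X POST http://localhost:4985/data/_resync
curl -X POST http://localhost:4985/data/_online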
Anything else I need to do to get a “clean” start on the test environment (losing e.g. replication history)?
[pk] - Depends a bit on how “clean” you want it to be. Keeping replication history is actually a necessary part of managing a multi-master system and also shouldn’t interfere with the functioning of your demo application. So unless you’re looking to accomplish something specific, I would suggest leaving everything as the defaults out of the box.
Thanks for reaching out and asking these questions. Are you seeing any behavior that you think is unexpected?
Thanks for getting back! Yes, I was starting to wonder if it was a silly question - or a difficult one.
Thanks for the feedback - and no, I’m not experiencing problems right now. I’m just trying to get to know the platform better. So after having done a lot of testing in the demo/dev environment, it is sometimes nice to have a clean start before testing new features.
So from your answer I take it that the “replication history” lives in the Couchbase server - and not in the Sync.Gateway. Then it makes sense to do as you describe.
No, you’ll need to take a backup of the whole cluster if you want all the data. Couchbase automatically and transparently shards all of the buckets across all of the nodes with the data service enabled on them.
So there is no compression in the backup? It appears to be more or less the same size as the database shown in the web console… That’s why I thought I only had to back up one of the two nodes…
There’s also compression on the server-side in the Enterprise Edition.
Keep in mind that the Enterprise Edition is free to download and develop on, so if you think you’d like to take advantage of some of those capabilities, you can do so now and then get a license when you’re ready to go into production.
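For reference, the Enterprise backup tool mentioned above is cbbackupmgr, which keeps backups in a compressed archive and supports incrementals. A rough equivalent of your cbbackup command (the archive path and repo name here are just placeholders) would be:
/opt/couchbase/bin/cbbackupmgr config --archive /home/couchbase/backup-archive --repo demo
/opt/couchbase/bin/cbbackupmgr backup --archive /home/couchbase/backup-archive --repo demo --cluster couchbase://db1.prod-env.dk:8091 --username xxxx --password yyyy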
The customer for this first project on Couchbase is a university with (very) limited financial resources… which is why the Community Edition is a very good match in this case.
But I plan on working with this platform in my future projects.
Could be… - but not in the next couple of months, as I have a really hard deadline that has also been challenged by some of the other issues I have reported here (not this one).