Export millions of documents

Hi all,

We have a bucket of 400M+ documents on a Couchbase 7.2.3 instance, and we need to export them to an external disk as a backup.

I tried to use cbexport, but I’ve seen some limitations:

  1. No metadata is exported, so the document ID is missing from the output.
  2. To keep the scope/collection information, it has to be written into the documents as extra fields, which the application using those documents does not handle.

Is there something that I’m missing?
Is there a better way to perform this kind of activity?

Adding a bit of context:
The Couchbase instance is deployed on a Kubernetes Cluster.
I’m using another pod that is not part of the cluster, but has the same image version, in order to attach the disk and use the cbexport utility.

to an external disk as a backup

Maybe use cbbackupmgr? https://docs.couchbase.com/server/current/manage/manage-backup-and-restore/manage-backup-and-restore.html

In the end, we transferred the contents of the backup disk (generated by the cbbackupmgr tool) to the external HDD.

We’ll then use that disk to restore the data with the cbbackupmgr restore command.
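For anyone landing here later, the backup/restore flow described above could look roughly like the sketch below. It is not the exact set of commands used in this thread: the archive path, repository name, cluster address and credentials are all placeholders, and the run helper is a hypothetical wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from the thread.
ARCHIVE="${ARCHIVE:-/backup/archive}"        # directory on the backup disk
REPO="${REPO:-full_export}"                  # hypothetical repository name
CLUSTER="${CLUSTER:-couchbase://127.0.0.1}"  # cluster to back up or restore to
CB_USER="${CB_USER:-Administrator}"
CB_PASS="${CB_PASS:-YourPasswordHere}"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# One-time step: create the backup repository inside the archive directory.
create_repo() {
  run cbbackupmgr config --archive "$ARCHIVE" --repo "$REPO"
}

# Back up the cluster into the repository.
take_backup() {
  run cbbackupmgr backup --archive "$ARCHIVE" --repo "$REPO" \
    --cluster "$CLUSTER" --username "$CB_USER" --password "$CB_PASS"
}

# Restore from the archive (for example after copying it to another disk).
restore_backup() {
  run cbbackupmgr restore --archive "$ARCHIVE" --repo "$REPO" \
    --cluster "$CLUSTER" --username "$CB_USER" --password "$CB_PASS"
}
```

Calling create_repo, then take_backup, prints the two commands to review; restore_backup would be run later with ARCHIVE pointing at wherever the copied archive is mounted. Unlike cbexport, the archive is not plain JSON, so it is meant to be read back by cbbackupmgr rather than by other tools.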

@fabiosst With the export command, try the --include-key 'meta().id' parameter to get the document ID. The image shows an export of the travel-sample data with the ID included in the output. Perhaps that will work for you.

--include-key
Couchbase stores data as key value pairs where the value is a JSON document and the key is an identifier for retrieving that document. By default cbexport will only export the value portion of the document. If you wish to include the key in the exported document then this option should be specified. The value passed to this option should be the field name that the key is stored under. If the value passed already exists as a field in the document, it will be overridden with the key. If the JSON document is not an object it will be turned into one and the value added to a field named ‘value’. If the key value passed is ‘value’ then the key will not be written. It will display a warning for any document it has converted into an object.

/opt/couchbase/bin/cbexport json -c couchbase://127.0.0.1 -u Administrator -p YourPasswordHere --no-ssl-verify -b archive -o /tmp/lines.json -f lines -t 4 --include-key 'meta().id'
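As a quick sanity check after an export like this, something along these lines could confirm that the key field made it into every document. It is a rough sketch that assumes the lines format (one JSON document per line); check_export is a hypothetical helper name, and the match is a plain text search for the quoted field name, so a document that happens to contain that same string as a value would be counted as having the key.

```shell
#!/bin/sh
# Sketch: count the documents in a cbexport "lines" file and how many of them
# lack the key field. Usage: check_export <file> <key-field-name>
check_export() {
  file="$1"
  field="$2"
  # Every line is one document, so the line count is the document count.
  total=$(grep -c '' "$file")
  # Lines that do not contain the quoted field name at all (fixed-string match).
  missing=$(grep -vcF "\"$field\"" "$file")
  echo "total=$total missing_key=$missing"
}
```

For the command above this would be called as check_export /tmp/lines.json 'meta().id'. A non-zero missing_key count would suggest the option was not picked up, for example because of a mistyped dash or quote.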
