I have a requirement to export bucket data into multiple JSON files, with 100 documents in each file.
I have tried cbtransfer, but it copies everything into a single, server-specific CSV file that contains duplicate records.
Can someone suggest how I can achieve this?
You could, if you have a Query node, use the REST API with an ordered SELECT * query with OFFSET and LIMIT clauses to segment your data.
If you are on Linux/macOS and you have a tool such as "jq" installed to trim the additional output, a script like this would achieve your aim:
$ cat t.sh
#!/usr/bin/bash
# Count the documents in the bucket so we know how many pages to fetch.
S="SELECT count(1) c FROM \`travel-sample\` t"
C=$(curl -su Administrator:password -d "metrics=false&statement=${S}" http://localhost:8093/query/service | jq '.results[0].c')
# Fetch the documents 100 at a time, writing each page to its own numbered file.
for ((i=0; i<$C; i+=100))
do
  curl -su Administrator:password -d "metrics=false&statement=SELECT t.* FROM \`travel-sample\` t ORDER BY meta().id OFFSET $i LIMIT 100" http://localhost:8093/query/service | jq '.results' > export_$((i/100)).out
done
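To run it (assuming the script is saved as t.sh and the credentials and bucket name above match your cluster), something like:

$ chmod +x t.sh && ./t.sh
$ ls export_*.out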
(I'm not saying this is the best way, just a way to achieve your aim - each export file containing an anonymous array of documents.)
HTH.
You could try the following two commands:
cbexport json -c couchbase://127.0.0.1 -u $CB_USERNAME -p $CB_PASSWORD -f lines -b source_bucket -o all_output.json
split -d -a 10 -l 100 all_output.json
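If your split is the GNU coreutils version, you can also give the pieces a more descriptive name; this is just a sketch, with the part_ prefix and .json suffix being arbitrary choices:

split -d -a 10 -l 100 all_output.json part_ --additional-suffix=.json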
I think you can avoid the intermediate file all_output.json. If your system supports /dev/stdout, you can use a pipeline:
cbexport json -c couchbase://127.0.0.1 -u $CB_USERNAME -p $CB_PASSWORD -f lines -b source_bucket -o /dev/stdout | egrep -v '(^$)' | split -d -a 10 -l 100
You can also speed things up if you have lots of CPU cores by adding -t 16 to the cbexport command.
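Putting those pieces together, a full pipeline might look like the sketch below (the bucket name, credentials, thread count and part_ prefix are placeholders, and --additional-suffix again assumes GNU split; split reads from stdin when given - as the input file):

cbexport json -c couchbase://127.0.0.1 -u $CB_USERNAME -p $CB_PASSWORD -f lines -b source_bucket -t 16 -o /dev/stdout | egrep -v '(^$)' | split -d -a 10 -l 100 - part_ --additional-suffix=.json

Note that each resulting file is in JSON Lines format (one document per line); if you need an actual JSON array per file, something like jq -s '.' part_0000000000.json will wrap those lines in an array.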