Hi team,
I am facing an issue when I am trying to export data from a bucket using cbtransfer tool. My environment is as follows:
Couchbase version: Version: 2.1.1 community edition (build-764)
Cluster details:3 node couchbase cluster
One of the buckets is DIMENSION_METADATA. At the moment there are only 19 documents in this bucket.
I am trying to use cbtransfer tool to export the data as csv file. I was under the impression that this tool would export all the data belonging to that bucket from all the couchbase nodes into the specified csv file. From my experiments I felt that the tool is clearing the file before appending the contents from each node. In the following example I had expected 19+1(header of csv) lines in my csv output file but received only 5+1(including header) lines in my output file.
EXAMPLE1*****************************************************START
[root@smaster1 ~]# /opt/couchbase/bin/cbtransfer http://localhost:8091 csv:/tmp/dm_1.csv -b DIMENSION_METADATA -u Administrator -p password -v
2016-04-01 15:48:55,403: mt cbtransfer…
2016-04-01 15:48:55,404: mt source : http://localhost:8091
2016-04-01 15:48:55,404: mt sink : csv:/tmp/dm_1.csv
2016-04-01 15:48:55,404: mt opts : {‘username’: ‘’, ‘source_vbucket_state’: ‘active’, ‘destination_vbucket_state’: ‘active’, ‘verbose’: 1, ‘dry_run’: False, ‘extra’: {‘max_retry’: 10.0, ‘rehash’: 0.0, ‘data_only’: 0.0, ‘nmv_retry’: 1.0, ‘conflict_resolve’: 1.0, ‘cbb_max_mb’: 100000.0, ‘try_xwm’: 1.0, ‘batch_max_bytes’: 400000.0, ‘report_full’: 2000.0, ‘batch_max_size’: 1000.0, ‘report’: 5.0, ‘design_doc_only’: 0.0, ‘recv_min_bytes’: 4096.0}, ‘single_node’: False, ‘bucket_destination’: None, ‘destination_operation’: None, ‘vbucket_list’: None, ‘threads’: 4, ‘key’: None, ‘password’: ‘’, ‘id’: None, ‘bucket_source’: ‘DIMENSION_METADATA’}
2016-04-01 15:48:55,531: mt bucket: DIMENSION_METADATA
2016-04-01 15:48:55,626: w0 source : http://localhost:8091(DIMENSION_METADATA@50.50.50.127:8091)
2016-04-01 15:48:55,627: w0 sink : csv:/tmp/dm_1.csv(DIMENSION_METADATA@50.50.50.127:8091)
2016-04-01 15:48:55,627: w0 : total | last | per sec
2016-04-01 15:48:55,627: w0 batch : 22 | 22 | 231.5
2016-04-01 15:48:55,627: w0 byte : 123 | 123 | 1294.3
2016-04-01 15:48:55,627: w0 msg : 7 | 7 | 73.7
2016-04-01 15:48:55,674: w0 source : http://localhost:8091(DIMENSION_METADATA@smaster2.product.company.com:8091)
2016-04-01 15:48:55,674: w0 sink : csv:/tmp/dm_1.csv(DIMENSION_METADATA@smaster2.product.company.com:8091)
2016-04-01 15:48:55,674: w0 : total | last | per sec
2016-04-01 15:48:55,674: w0 batch : 15 | 15 | 318.9
2016-04-01 15:48:55,674: w0 byte : 110 | 110 | 2338.6
2016-04-01 15:48:55,674: w0 msg : 7 | 7 | 148.8
2016-04-01 15:48:55,720: w0 source : http://localhost:8091(DIMENSION_METADATA@snfs2.product.company.com:8091)
2016-04-01 15:48:55,720: w0 sink : csv:/tmp/dm_1.csv(DIMENSION_METADATA@snfs2.product.company.com:8091)
2016-04-01 15:48:55,721: w0 : total | last | per sec
2016-04-01 15:48:55,721: w0 batch : 10 | 10 | 218.3
2016-04-01 15:48:55,721: w0 byte : 83 | 83 | 1811.9
2016-04-01 15:48:55,721: w0 msg : 5 | 5 | 109.2
[####################] 100.0% (19/19 msgs)
bucket: DIMENSION_METADATA, msgs transferred…
: total | last | per sec
batch : 47 | 47 | 219.0
byte : 316 | 316 | 1472.3
msg : 19 | 19 | 88.5
2016-04-01 15:48:55,750: mt warning: cannot save bucket design on a CSV destination
done
[root@smaster1 ~]# wc -l /tmp/dm_1.csv
6
EXAMPLE1*****************************************************END
However if I try to export the data on per-node basis using the --single-node parameter, then I can find all keys in the csv files generated (although the records are in different files).
EXAMPLE2*****************************************************START
[root@smaster1 ~]# /opt/couchbase/bin/cbtransfer http://50.50.50.127:8091 csv:/tmp/dm_2.csv -b DIMENSION_METADATA -u Administrator -p password -v --single-node
2016-04-01 16:03:32,393: mt cbtransfer…
2016-04-01 16:03:32,393: mt source : http://50.50.50.127:8091
2016-04-01 16:03:32,393: mt sink : csv:/tmp/dm_2.csv
2016-04-01 16:03:32,393: mt opts : {‘username’: ‘’, ‘source_vbucket_state’: ‘active’, ‘destination_vbucket_state’: ‘active’, ‘verbose’: 1, ‘dry_run’: False, ‘extra’: {‘max_retry’: 10.0, ‘rehash’: 0.0, ‘data_only’: 0.0, ‘nmv_retry’: 1.0, ‘conflict_resolve’: 1.0, ‘cbb_max_mb’: 100000.0, ‘try_xwm’: 1.0, ‘batch_max_bytes’: 400000.0, ‘report_full’: 2000.0, ‘batch_max_size’: 1000.0, ‘report’: 5.0, ‘design_doc_only’: 0.0, ‘recv_min_bytes’: 4096.0}, ‘single_node’: True, ‘bucket_destination’: None, ‘destination_operation’: None, ‘vbucket_list’: None, ‘threads’: 4, ‘key’: None, ‘password’: ‘’, ‘id’: None, ‘bucket_source’: ‘DIMENSION_METADATA’}
2016-04-01 16:03:32,521: mt bucket: DIMENSION_METADATA
2016-04-01 16:03:32,575: w0 source : http://50.50.50.127:8091(DIMENSION_METADATA@50.50.50.127:8091)
2016-04-01 16:03:32,575: w0 sink : csv:/tmp/dm_2.csv(DIMENSION_METADATA@50.50.50.127:8091)
2016-04-01 16:03:32,575: w0 : total | last | per sec
2016-04-01 16:03:32,575: w0 batch : 15 | 15 | 279.7
2016-04-01 16:03:32,575: w0 byte : 123 | 123 | 2293.9
2016-04-01 16:03:32,575: w0 msg : 7 | 7 | 130.5
[####################] 100.0% (7/7 msgs)
bucket: DIMENSION_METADATA, msgs transferred…
: total | last | per sec
batch : 15 | 15 | 160.8
byte : 123 | 123 | 1318.4
msg : 7 | 7 | 75.0
2016-04-01 16:03:32,617: mt warning: cannot save bucket design on a CSV destination
done
[root@smaster1 ~]# wc -l /tmp/dm_2.csv
8 /tmp/dm_2.csv
[root@smaster1 ~]# /opt/couchbase/bin/cbtransfer http://smaster2.product.company.com:8091 csv:/tmp/dm_3.csv -b DIMENSION_METADATA -u Administrator -p password -v --single-node
2016-04-01 16:02:04,811: mt cbtransfer…
2016-04-01 16:02:04,811: mt source : http://smaster2.product.company.com:8091
2016-04-01 16:02:04,811: mt sink : csv:/tmp/dm_3.csv
2016-04-01 16:02:04,811: mt opts : {‘username’: ‘’, ‘source_vbucket_state’: ‘active’, ‘destination_vbucket_state’: ‘active’, ‘verbose’: 1, ‘dry_run’: False, ‘extra’: {‘max_retry’: 10.0, ‘rehash’: 0.0, ‘data_only’: 0.0, ‘nmv_retry’: 1.0, ‘conflict_resolve’: 1.0, ‘cbb_max_mb’: 100000.0, ‘try_xwm’: 1.0, ‘batch_max_bytes’: 400000.0, ‘report_full’: 2000.0, ‘batch_max_size’: 1000.0, ‘report’: 5.0, ‘design_doc_only’: 0.0, ‘recv_min_bytes’: 4096.0}, ‘single_node’: True, ‘bucket_destination’: None, ‘destination_operation’: None, ‘vbucket_list’: None, ‘threads’: 4, ‘key’: None, ‘password’: ‘’, ‘id’: None, ‘bucket_source’: ‘DIMENSION_METADATA’}
2016-04-01 16:02:04,945: mt bucket: DIMENSION_METADATA
2016-04-01 16:02:05,005: w0 source : http://smaster2.product.company.com:8091(DIMENSION_METADATA@smaster2.product.company.com:8091)
2016-04-01 16:02:05,005: w0 sink : csv:/tmp/dm_3.csv(DIMENSION_METADATA@smaster2.product.company.com:8091)
2016-04-01 16:02:05,005: w0 : total | last | per sec
2016-04-01 16:02:05,005: w0 batch : 15 | 15 | 251.2
2016-04-01 16:02:05,005: w0 byte : 110 | 110 | 1842.0
2016-04-01 16:02:05,005: w0 msg : 7 | 7 | 117.2
[####################] 100.0% (7/7 msgs)
bucket: DIMENSION_METADATA, msgs transferred…
: total | last | per sec
batch : 15 | 15 | 152.2
byte : 110 | 110 | 1116.1
msg : 7 | 7 | 71.0
2016-04-01 16:02:05,046: mt warning: cannot save bucket design on a CSV destination
done
[root@smaster1 ~]# wc -l /tmp/dm_3.csv
8 /tmp/dm_3.csv
[root@smaster1 ~]# /opt/couchbase/bin/cbtransfer http://snfs2.product.company.com:8091 csv:/tmp/dm_4.csv -b DIMENSION_METADATA -u Administrator -p password -v --single-node
2016-04-01 16:00:13,267: mt cbtransfer…
2016-04-01 16:00:13,267: mt source : http://snfs2.product.company.com:8091
2016-04-01 16:00:13,267: mt sink : csv:/tmp/dm_4.csv
2016-04-01 16:00:13,267: mt opts : {‘username’: ‘’, ‘source_vbucket_state’: ‘active’, ‘destination_vbucket_state’: ‘active’, ‘verbose’: 1, ‘dry_run’: False, ‘extra’: {‘max_retry’: 10.0, ‘rehash’: 0.0, ‘data_only’: 0.0, ‘nmv_retry’: 1.0, ‘conflict_resolve’: 1.0, ‘cbb_max_mb’: 100000.0, ‘try_xwm’: 1.0, ‘batch_max_bytes’: 400000.0, ‘report_full’: 2000.0, ‘batch_max_size’: 1000.0, ‘report’: 5.0, ‘design_doc_only’: 0.0, ‘recv_min_bytes’: 4096.0}, ‘single_node’: True, ‘bucket_destination’: None, ‘destination_operation’: None, ‘vbucket_list’: None, ‘threads’: 4, ‘key’: None, ‘password’: ‘’, ‘id’: None, ‘bucket_source’: ‘DIMENSION_METADATA’}
2016-04-01 16:00:16,194: mt bucket: DIMENSION_METADATA
…2016-04-01 16:00:16,247: w0 source : http://snfs2.product.company.com:8091(DIMENSION_METADATA@snfs2.product.company.com:8091)
2016-04-01 16:00:16,248: w0 sink : csv:/tmp/dm_4.csv(DIMENSION_METADATA@snfs2.product.company.com:8091)
2016-04-01 16:00:16,248: w0 : total | last | per sec
2016-04-01 16:00:16,248: w0 batch : 11 | 11 | 207.5
2016-04-01 16:00:16,248: w0 byte : 83 | 83 | 1565.6
2016-04-01 16:00:16,248: w0 msg : 5 | 5 | 94.3
[####################] 100.0% (5/5 msgs)
bucket: DIMENSION_METADATA, msgs transferred…
: total | last | per sec
batch : 11 | 11 | 159.4
byte : 83 | 83 | 1202.4
msg : 5 | 5 | 72.4
2016-04-01 16:00:16,266: mt warning: cannot save bucket design on a CSV destination
done
[root@smaster1 ~]# wc -l /tmp/dm_4.csv
6 /tmp/dm_4.csv
EXAMPLE2*****************************************************END
If we take the count of records, excluding the headers, in dm_2.csv, dm_3.csv and dm_4.csv, then we will get 19.
Can you please confirm whether the behaviour of EXAMPLE1 is correct?
If it is not correct, could you please tell me whether this behaviour is different on newer releases of couchbase?
Thanks,
Richards.