I have a JSON file with around 6 million entries. I am using the Java SDK to write to Couchbase, but writing one entry at a time is taking hours. I am fairly new to Couchbase and would like to know the best way to speed up writes to the database. Would leveraging Spark help in this case? Do you have any sample programs for PySpark and Couchbase? Any other suggestions would also be great.
Thank you for the reply. My data gets updated frequently (every week), and I have some other data that gets updated daily. So I need to push the data into Couchbase as part of a script, rather than using cbimport to pull the data in. Also, the records need to be updated/overwritten.
Is probably a good place for you to start. Try reading your file and submitting a batch of JSON documents at a time, rather than one document at a time.
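To make the batching idea concrete, here is a minimal sketch in plain Java. It is not taken from the Couchbase docs: the class name `BatchUpsertSketch`, the `upsertInBatches` helper, and the `upsert` callback are all hypothetical names used for illustration. The callback stands in for whatever write call you use; with the Java SDK 3.x that would typically be something like `collection.upsert(id, JsonObject.fromJson(json))`, which also covers your update/overwrite requirement. The batch size and thread count are assumptions you would need to tune for your cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Hypothetical helper: sends documents in batches, with the writes inside
// each batch running concurrently instead of one blocking call at a time.
final class BatchUpsertSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // docs: (document id, JSON string) pairs.
    // upsert: stand-in for the real write call (assumption: in SDK 3.x,
    //         something like collection.upsert(id, JsonObject.fromJson(json))).
    // Returns the number of documents written.
    static int upsertInBatches(List<Map.Entry<String, String>> docs,
                               int batchSize,
                               int threads,
                               BiConsumer<String, String> upsert) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int written = 0;
        try {
            for (List<Map.Entry<String, String>> batch : partition(docs, batchSize)) {
                List<CompletableFuture<Void>> inFlight = new ArrayList<>();
                for (Map.Entry<String, String> doc : batch) {
                    inFlight.add(CompletableFuture.runAsync(
                            () -> upsert.accept(doc.getKey(), doc.getValue()), pool));
                }
                // Wait for the whole batch before starting the next one.
                // join() throws CompletionException if any write failed.
                CompletableFuture.allOf(
                        inFlight.toArray(new CompletableFuture<?>[0])).join();
                written += batch.size();
            }
        } finally {
            pool.shutdown();
        }
        return written;
    }
}
```

Waiting for each batch to finish before starting the next keeps the number of in-flight requests bounded, so a 6 million document load does not queue everything in memory at once. For a real load you would also stream the file in chunks rather than building one large list, and add retry handling for failed writes.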