Hi everyone,
I am using the Couchbase transport plugin to integrate with Elasticsearch. The guides state that you should not store the fields in Elasticsearch, but instead retrieve only the keys and then use a multi-get to fetch the doc contents from Couchbase.
I am not storing individual fields in Elasticsearch; however, the “_source” field has the contents of the doc.
In my case, the storage cost is only a small fraction of the cost of the Elasticsearch VM.
Is saving on storage cost in Elasticsearch the only reason for this pattern?
Storage cost aside, would storing the contents in Elasticsearch be more efficient (better performance, fewer network calls and less code)?
We make that recommendation so that both Elasticsearch and Couchbase play to their strengths. Elasticsearch is great for free-text indexing and querying, while Couchbase is designed to be an operational database of record.
You probably also want to avoid having two copies of the same data, in different systems, with potential drift between the two.
As Matthew mentioned, Couchbase and Elasticsearch are optimized for different things. Elasticsearch doesn’t specifically cache the source documents, whereas Couchbase is fully optimized for storage and retrieval. It’s generally much more efficient to retrieve the source data from Couchbase than from Elasticsearch. In particular, it’s usually much faster to run a query in Elasticsearch to get the set of document keys you’re interested in and then retrieve those documents in bulk from Couchbase.
If you’re talking about small amounts of data, then it really doesn’t matter all that much. However, once you start scaling up (and out), the difference adds up quickly.
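For what it's worth, here is a rough sketch of the "query for keys, then bulk-get" pattern described above. It is not tied to any real client API: search_ids and bulk_get are hypothetical callables that you would implement as thin wrappers around your Elasticsearch search call (returning hit IDs only) and your Couchbase multi-get. The in-memory fakes and their data are made up purely so the example runs on its own.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical signatures (assumptions, not a real client API):
#   SearchFn  - runs the search and returns matching document keys only
#   BulkGetFn - fetches many documents by key in a single round trip
SearchFn = Callable[[str], List[str]]
BulkGetFn = Callable[[Iterable[str]], Dict[str, dict]]


def search_then_fetch(query: str, search_ids: SearchFn, bulk_get: BulkGetFn) -> List[dict]:
    """Query the search index for keys, then load the documents in one bulk call."""
    keys = search_ids(query)
    if not keys:
        return []
    docs = bulk_get(keys)
    # Keep the search ranking order, and skip keys that no longer exist in
    # the database (the index can briefly lag behind deletes).
    return [docs[k] for k in keys if k in docs]


# In-memory stand-ins so the sketch is runnable without either service.
_FAKE_INDEX = {"beer": ["doc2", "doc1", "doc9"]}
_FAKE_BUCKET = {"doc1": {"name": "Pale Ale"}, "doc2": {"name": "Stout"}}


def fake_search_ids(query: str) -> List[str]:
    return _FAKE_INDEX.get(query, [])


def fake_bulk_get(keys: Iterable[str]) -> Dict[str, dict]:
    return {k: _FAKE_BUCKET[k] for k in keys if k in _FAKE_BUCKET}


if __name__ == "__main__":
    print(search_then_fetch("beer", fake_search_ids, fake_bulk_get))
```

Note that "doc9" is in the fake index but not in the fake bucket, so it is silently dropped. With this pattern the database stays the source of truth, and a stale index entry costs you a missed hit rather than stale content.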