They are about the connection between Hadoop and Couchbase. Does anyone know whether this has anything to do with the Lambda architecture, or if and how it could fit into it? Or is it an architecture of its own?
Let’s say the common components are Couchbase, Kafka, Spark / Storm, and Hadoop. They can be arranged in any order, and the data can flow in any direction. Lambda architecture is just one arrangement and flow, and it is a solution to a very specific problem. We see lots of arrangements and flows.
Kafka to Storm/Spark to Couchbase to Hadoop
Couchbase to Kafka to Hadoop
Log to Hadoop to Couchbase
Couchbase to Hadoop to Couchbase
The first one is what the big data site focuses on. However, you can create your own architecture by reordering components or adding and removing them. That’s the fun part.
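To make the "arrangement and flow" idea a bit more concrete, here is a rough sketch that treats each flow listed above as an ordered list of component names. The flow labels (stream_processing, round_trip, and so on) and the helper functions are made up for illustration; nothing here talks to a real Couchbase, Kafka, or Hadoop client.

```python
# Each flow is an ordered list of component names taken from the list above.
# The dictionary keys are hypothetical labels, not official architecture names.
FLOWS = {
    "stream_processing": ["Kafka", "Storm/Spark", "Couchbase", "Hadoop"],
    "operational_export": ["Couchbase", "Kafka", "Hadoop"],
    "log_analysis": ["Log", "Hadoop", "Couchbase"],
    "round_trip": ["Couchbase", "Hadoop", "Couchbase"],
}


def hops(flow):
    """Return the (source, sink) pairs for an ordered list of components."""
    return list(zip(flow, flow[1:]))


def is_round_trip(flow):
    """True when the data ends up back in the component it started from."""
    return len(flow) > 2 and flow[0] == flow[-1]


for name, flow in FLOWS.items():
    suffix = " (round trip)" if is_round_trip(flow) else ""
    print(f"{name}: {' -> '.join(flow)}{suffix}")
```

Reordering, adding, or removing a component is then just editing one of the lists, and only the last flow sends data back to where it started.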
Unfortunately, I do not understand why, or in which cases, it would be useful to replicate data between Couchbase and Hadoop in both directions. To me it would make sense for the data to flow from Couchbase to Hadoop or vice versa, but not in both directions. Could you explain why and how this could be useful?