I’m evaluating Couchbase vs. MongoDB for storing JSON documents.
Is there a way to index documents in real time? For example, if a user writes data and a query is then issued against an index, I want the results to include the data that was just written.
Here is a scenario. I have millions of documents like below:
{ userid:1234, schemaid:87777, recordid:7797, field1:"fff", field2:"dddd",…}
The fields vary by schemaid, but userid, schemaid, and recordid are always present. In most cases, I want to query for a specific userid and schemaid and fetch all matching documents with all of their fields. I assume an index is needed for performance. If I have to create an index, can I return the entire doc in the value field of the map() function? What is the performance and storage impact of this approach?
Alternatively, I can move the userid, schemaid, and recordid properties into the key. In that case, can I still query on a substring with a wildcard, something like {userid:1234,schemaid:87777,recordid:%, and get the matching records? I'm aware of the memory footprint of the key.
Thanks.
I assume this is the right DL monitored by Couchbase engineers. It will be helpful to know the answer before moving forward.
I don’t fully understand your questions, but I can address your first question:
Is there a way to index documents real-time?
No, indexes are defined beforehand, and queries are made in real time.
With Couchbase views, you can either define a separate index for each document field or define a grouped index, as in this date example:
// grouped key
function (doc, meta) {
  if (doc.date) {
    var date = new Date(doc.date);
    // getMonth() is zero-based (January is 0)
    emit([date.getFullYear(), date.getMonth(), date.getDate()], null);
  }
}
See link below on how to query.
http://www.couchbase.com/docs//couchbase-manual-2.0/couchbase-views-writing-querying-grouping.html
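For the userid/schemaid scenario in the question, a grouped key might look roughly like the sketch below. This is only a local illustration, not code from the Couchbase docs: mapByUserSchema, runMap, rangeFor, and the document ids are made-up names, and emit is passed in as a parameter so the snippet runs outside the server (in a real view, emit is a global). It emits null as the value instead of the whole doc, on the assumption that you fetch the document by its id afterwards, which keeps the index from storing a second copy of every document.

```javascript
// Hypothetical map function for the documents in the question.
// In a real view, emit() is a global; here it is passed in so the sketch runs locally.
function mapByUserSchema(doc, meta, emit) {
  if (doc.userid !== undefined && doc.schemaid !== undefined) {
    // Emit null as the value: the index stays small and the document
    // can be fetched by its id (row.id) afterwards.
    emit([doc.userid, doc.schemaid, doc.recordid], null);
  }
}

// Local stand-in for the view engine: collect every emitted row.
function runMap(docs) {
  const rows = [];
  for (const [id, doc] of Object.entries(docs)) {
    mapByUserSchema(doc, { id: id }, (key, value) =>
      rows.push({ id: id, key: key, value: value })
    );
  }
  return rows;
}

// Key range covering every recordid for one userid/schemaid pair.
// In view collation, objects sort after numbers and strings,
// so {} works as an upper bound for the third key element.
function rangeFor(userid, schemaid) {
  return { startkey: [userid, schemaid], endkey: [userid, schemaid, {}] };
}

// Made-up sample documents, keyed by hypothetical document ids.
const sampleDocs = {
  "rec::7797": { userid: 1234, schemaid: 87777, recordid: 7797, field1: "fff" },
  "rec::7798": { userid: 1234, schemaid: 87777, recordid: 7798, field2: "dddd" },
  "note::1": { title: "no userid, so not indexed" }
};

const rows = runMap(sampleDocs);
const range = rangeFor(1234, 87777);
console.log(rows.map((row) => JSON.stringify(row.key)).join(" "));
console.log("startkey=" + JSON.stringify(range.startkey) +
            " endkey=" + JSON.stringify(range.endkey));
```

The startkey/endkey pair takes the place of the wildcard on recordid: every row whose key starts with [1234, 87777] falls inside the range.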
Also, N1QL is a new feature that is in developer preview. It is a SQL-like query language for Couchbase that uses real-time algorithms in addition to predefined indexes to run dynamically generated, complex queries.
http://www.couchbase.com/communities/n1ql
UPDATE (for comments below)
There is a stale parameter for queries, which has three options:
stale=ok - Stale views are OK.
stale=false - Waits for view to be updated before returning results.
stale=update_after - (default) Returns immediately available results, but triggers an update to occur after results are returned.
http://docs.couchbase.com/couchbase-manual-2.2/#couchbase-views-writing-stale
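To make the stale option concrete, here is a rough sketch of how a view query URL with stale=false could be put together for the view REST API on port 8092. The helper name viewUrl and the host, bucket, design document, and view names are placeholders I made up; only the URL shape and the stale, startkey, and endkey parameters come from the views documentation. The key range assumes the grouped date key from the map function above, so month 0 is January.

```javascript
// Build a view query URL for the Couchbase 2.x view REST API (port 8092).
// Host, bucket, design document, and view names are placeholders.
function viewUrl(target, params) {
  const query = Object.entries(params)
    .map(([name, value]) => {
      // stale takes a bare token (ok, false, update_after);
      // key-like parameters must be JSON-encoded.
      const raw = name === "stale" ? String(value) : JSON.stringify(value);
      return encodeURIComponent(name) + "=" + encodeURIComponent(raw);
    })
    .join("&");
  return "http://" + target.host + ":8092/" + target.bucket +
         "/_design/" + target.ddoc + "/_view/" + target.view + "?" + query;
}

// All rows for January 2014, waiting for the index to be updated first.
const url = viewUrl(
  { host: "localhost", bucket: "default", ddoc: "dates", view: "by_date" },
  { stale: "false", startkey: [2014, 0, 1], endkey: [2014, 0, 31] }
);
console.log(url);
```

Using stale=false on every query trades latency for freshness, since each request waits for the index update to finish before results come back.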
N1QL DP3 is supposed to come out in March. I don't know when a production-ready build will be released, but the answer at the link below gives good insight.
http://www.couchbase.com/communities/q-and-a/couchbase-commitment-n1ql
Thanks, here are my follow-up questions:
- Is there a way to index documents in real time? – I should not see any stale data from the index.
- Any ETA on when N1QL will be released?