Can I index the Meta object?

We are engineering an ETL pipeline from CB.

I need fast access to doc keys and doc CAS (in order to know which docs have changed). Basically, we will maintain a mapping of keys to CAS, and when the ETL runs we will compare and see if the CAS has changed to know if there is any work to do…

What is the best way to get this information efficiently?

I am thinking to do an index like this

('type', meta) to support select meta().id, meta().cas from default where type = x

Can I index the Meta?
How would I write this index?
Is there an obvious better way to do what I’m doing?

Thank you for your help.

Only meta().id, meta().cas, and meta().expiration are indexable.

https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/indexing-meta-info.html

NOTE: The index is eventually consistent, so it might not have up-to-date info.
create index ix1 on default(type, meta().cas);
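
The comparison step described in the question (a stored key-to-CAS mapping checked against fresh query results) could be sketched roughly as below. This is only an illustration: `diffCas` is a hypothetical helper, and the rows are assumed to arrive as `{ id, cas }` objects with the CAS already as a string. CAS is a 64-bit value, so a client that parses it as a plain JavaScript number may lose precision.

```javascript
// Sketch: diff a stored key -> CAS snapshot against fresh query rows.
// Assumption: rows look like { id: "doc1", cas: "1650000000000000000" },
// with CAS kept as a string because it can exceed Number.MAX_SAFE_INTEGER.
function diffCas(previous, rows) {
  const changed = [];          // keys that are new or whose CAS differs
  const current = new Map();   // snapshot to store for the next ETL run
  for (const row of rows) {
    const cas = String(row.cas);
    current.set(row.id, cas);
    if (previous.get(row.id) !== cas) {
      changed.push(row.id);
    }
  }
  // Keys in the old snapshot that no longer appear were deleted or expired.
  const removed = [...previous.keys()].filter((id) => !current.has(id));
  return { changed, removed, current };
}
```

Each ETL run would process `changed`, handle `removed`, and persist `current` as the next baseline. Because the index is eventually consistent, a very recent mutation may only show up on the following run.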

Hi Naftali,

If your goal is to do some processing whenever a document changes, the Eventing service might be useful. Or possibly the Couchbase Kafka connector, if you’ve got a Kafka broker handy.

Thanks,
David

Hi @naftali,

In recent versions of Couchbase, the leading characters of the CAS string act like a timestamp in milliseconds since the epoch. You might use this to your advantage.

For example, if you used Eventing (or the Couchbase Kafka connector) you could ignore CAS values older than your last ETL run. Say I only want data changed in the most recent hour: in Eventing I could set LOOKBACKSEC to 3600 and run an Eventing Function with a feed boundary of Everything.

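function OnUpdate(doc, meta) {
    // LOOKBACKSEC is assumed to be defined for the Function (e.g. 3600 for one hour)
    if (LOOKBACKSEC > 0) {
        var cutoff_millis = Date.now() - LOOKBACKSEC * 1000;
        // the first 13 digits of the CAS string are millis since epoch
        var doc_millis = parseInt(meta.cas.substring(0, 13), 10);
        if (doc_millis < cutoff_millis) return;
    }
    // ..... your code here: the document changed in the last hour .....
}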

You deploy it and just leave the eventing function “live” and it will emit anything newer than an hour and continue to emit current values on changes (subject to DCP dedup).
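
If it helps to test the cutoff logic outside Eventing, the check can be pulled into a small pure function. This is a sketch only: `changedWithin` is a hypothetical name, and it assumes (as described above) that the first 13 digits of the CAS string are milliseconds since the epoch.

```javascript
// Returns true when the CAS-derived timestamp falls inside the lookback window.
// casString:   decimal CAS as a string; its first 13 digits are assumed to be
//              milliseconds since the epoch.
// lookbackSec: window size in seconds; 0 or less disables the cutoff.
// nowMillis:   current time in millis (passed in so the function is testable).
function changedWithin(casString, lookbackSec, nowMillis) {
  if (lookbackSec <= 0) return true; // no cutoff configured
  const docMillis = parseInt(casString.substring(0, 13), 10);
  return docMillis >= nowMillis - lookbackSec * 1000;
}
```

Inside the handler this would be called as `changedWithin(meta.cas, LOOKBACKSEC, Date.now())`, returning early when it yields false.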

In the above Eventing function you might use the curl() function to emit changed items to an external REST endpoint. Refer to "Function: Basic cURL POST" in the Couchbase Docs for how to use the curl() call.

Note that no indexes are required, but you need a fast REST endpoint and a lot of workers for your Eventing function if you have lots of changes (or mutations) to emit.
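
As a rough illustration of that idea, the handler could post each changed key and CAS to the endpoint. Everything specific here is an assumption: `etlEndpoint` stands for a URL binding alias you would define in the Function settings, `/changes` is a made-up path, and `buildRequest` is a hypothetical helper. Check the linked docs for the exact curl() options in your version.

```javascript
// Hypothetical helper: builds the request object passed to curl().
// "/changes" is an invented path on your own REST service.
function buildRequest(doc, meta) {
  return {
    path: "/changes",
    body: { id: meta.id, cas: meta.cas, doc: doc }
  };
}

function OnUpdate(doc, meta) {
  try {
    // "etlEndpoint" is assumed to be a URL binding alias on this Function.
    var response = curl("POST", etlEndpoint, buildRequest(doc, meta));
    if (response.status !== 200) {
      log("POST failed for", meta.id, "status", response.status);
    }
  } catch (e) {
    // A failed call is only logged here; real code would need a retry strategy.
    log("curl error for", meta.id, e);
  }
}
```

Sending only `meta.id` and `meta.cas` (and dropping `doc` from the body) keeps the payload small if the ETL job fetches the documents itself.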

Best

Jon Strabala
Principal Product Manager - Server

amazing feature. thanks for sharing.

I was facing the same problem, but after seeing your reply I solved it. Thank you for sharing; this is really helpful for me.

i was facing same problem..