Delay between adding data and reading it back through Full Text Search

Hello,
I have an API backed by a Full Text Search (FTS) node. The scenario is:

  • A user adds data through the API; it goes to a Couchbase bucket backed by FTS.
  • Less than 4 seconds later, the user calls the API again to get the data using the Search SDK in Couchbase. The data doesn't come back at first. There is a slight delay, but the data is eventually visible.

I am using the Java SDK, and as far as I can tell it provides only NOT_BOUNDED in SearchScanConsistency.java.

However, according to the Scala docs (Full Text Search (FTS) | Couchbase Docs), the Scala SDK does have Read-Your-Own-Writes (RYOW) consistency.

Since the reads are done through a separate API, we don't have the mutation state, so we cannot use consistentWith() either :frowning:.

Is there something similar in Java? I'm wondering why one language has that functionality and the other doesn't.

Will switching to N1QL for FTS solve that issue for me?

Regards

Hey @herat_acharya,

RYOW is available in Java SDK as well - Search | Couchbase Docs

And if you have followed the sizing guidelines for the cluster, then the KV mutations should be visible in the Search index/service pretty quickly. One basic sanity check is to make sure the FTS service has enough RAM quota.

Note - To get the latest updates from the SDK team, please always reach out to the relevant SDK forums. (Java SDK - Couchbase Forums)

Cheers!

Thanks for the reply, @sreeks.
The way RYOW scan consistency is achieved in the Java SDK is by getting the mutation token.

As I mentioned earlier, the writes happen in a completely separate API from the reads, so how would I be able to get a mutation token? The reads also fetch a lot of data that was written to the FTS index at different points in time. How would I be able to use consistentWith() without having a mutation token? And since the search query returns bulk results, would the mutation token even work that way?

From the example in the link, it appears the read happens immediately after the write, so the mutation state is available. That is not my scenario.

Also, is the mutation state per record entry? How does it work with a bulk read?

It would be easier if, like N1QL, the SearchScanConsistency class had another enum value such as AT_PLUS that would wait for the index to be up to date.

Regards

Yes, at_plus semantics are yet to be supported by FTS.
We shall try to prioritize that.

at_plus (RYOW) is supported by FTS and via N1QL SEARCH.
scan_plus/request_plus aren’t available for FTS though, be it directly or via N1QL.

Yikes, my bad. I was trying to say request_plus. FTS doesn’t support request_plus at the moment.

And your concern about RYOW/reading the latest KV state would be best addressed by request_plus.

Thank you @sreeks @abhinav. Just to check that I am on the right path and understand correctly what you have mentioned above: is AT_PLUS only supported by FTS through N1QL SEARCH? Or can we also do it in the Java SDK without using consistentWith, since I don't have the mutation state?

https://docs.couchbase.com/server/current/fts/fts-consistency.html Is this the right way to do AT_PLUS?

Essentially, how can I pass AT_PLUS to my search query? My server version is 6.6 and I am using Java SDK version 3.3.2.

Regards,
Herat

AT_PLUS is supported by FTS - when queried directly or via N1QL.

Here’s an example of how the search request would look - Search Request JSON Properties | Couchbase Docs

While running this from N1QL, you will have the option to embed the entire search request as part of the SEARCH() arguments, ref: Search Functions | Couchbase Docs

I’m fairly certain you can use the AT_PLUS capability from the Java SDK. If it’s not documented here already - Search | Couchbase Docs , let’s ping @daschl for some guidance on it.
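
For illustration, here is a minimal sketch of what a request body with at_plus consistency might look like, built with plain Java string handling. The layout (a ctl.consistency object with a level and per-index vectors keyed by "vbucketId/vbucketUUID") follows my reading of the linked Search Request JSON docs. The class name AtPlusRequestSketch, the index name, and the vector values are made up for the example, so please check the docs for your server version before relying on it.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical helper, not part of any Couchbase SDK.
final class AtPlusRequestSketch {

    /**
     * Builds a search request body that asks for "at_plus" consistency.
     * vectors maps "vbucketId/vbucketUUID" to a sequence number.
     * Assumption: inputs contain no characters that need JSON escaping.
     */
    static String build(String indexName, String queryString, Map<String, Long> vectors) {
        StringJoiner entries = new StringJoiner(",");
        // Copy into a TreeMap so the key order (and the output) is deterministic.
        for (Map.Entry<String, Long> e : new TreeMap<>(vectors).entrySet()) {
            entries.add("\"" + e.getKey() + "\":" + e.getValue());
        }
        return "{\"query\":{\"query\":\"" + queryString + "\"},"
             + "\"ctl\":{\"consistency\":{\"level\":\"at_plus\","
             + "\"vectors\":{\"" + indexName + "\":{" + entries + "}}}}}";
    }

    public static void main(String[] args) {
        Map<String, Long> vectors = new TreeMap<>();
        vectors.put("607/205096593892159", 2L); // made-up vector entry
        System.out.println(build("my-index", "hotel", vectors));
    }
}
```

Note that the vectors still have to come from somewhere, typically the mutation tokens returned by the writes, so this does not remove the need to know what was written.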

Thank you @abhinav. While looking at SearchScanConsistency.java, I could see only the NOT_BOUNDED option. That's why I was confused as to why the Scala SDK has AT_PLUS and the Java SDK doesn't.

I would have neither the consistency vectors mentioned in the Consistency docs for FTS nor the mutation state :frowning:. These two are the only documented ways I could find. :frowning:

package com.couchbase.client.java.search;

/**
 * An enum listing the various consistency levels for FTS searches
 * that don't need additional parameters (like a mutation token vector).
 *
 * @author Simon Baslé
 * @since 2.3
 */
public enum SearchScanConsistency {

    /**
     * This is the default, the indexer does not wait for certain index updates until it returns the current hits.
     *
     * <p>This is also the fastest mode, because we avoid the cost of obtaining the vector,
     * and we also avoid any wait for the index to catch up to the vector.</p>
     */
    NOT_BOUNDED {
        @Override
        public String toString() {
            return "";
        }
    }

}

Hi @herat_acharya.

Just for the record, to use AT_PLUS with Java SDK, pass the mutation tokens to SearchOptions.consistentWith. As you’ve said, this won’t solve your problem since you don’t have the mutation tokens.

It sounds like you want something like N1QL’s REQUEST_PLUS mode, which waits for the indexer to catch up with the current state of the Key/Value service before executing the query. This is not currently supported by FTS. We’re tracking this feature request as MB-18428.

Thanks,
David

Thank you @david.nault. Do you know when this will be taken up? I have read a few posts asking for RYOW; it would be a good addition to the mix, since N1QL using GSI indexes already supports it.

As a temporary solution, is there a way to get the mutation state or the consistency vectors completely asynchronously from the writes that I am doing? Meaning:

If I do the writes at time T0, can I get the latest mutation state/vectors at time T1 by querying the DB?

do you know when this will be taken up

I’m not involved in FTS planning :sweat_smile:. If you add yourself as a watcher on the Jira issue, you’ll receive email notifications as the status changes.

If your organization has a Couchbase Enterprise subscription, it wouldn’t hurt to file a support ticket expressing interest in the feature.

Thanks,
David

Possible workaround: Since MutationState is thread-safe, and only remembers the highest sequence number for each partition, you could share a single static instance, update it with the mutation token from every KV write, and pass it to every FTS query where you want “request_plus” semantics. A limitation is that you wouldn’t necessarily see changes made by other processes. You might also take a performance hit, since you’d be updating the same concurrent map instance after every KV write.

That's an interesting idea, @david.nault.
For now, I realized that we had kept the default RAM setting for the FTS node, which is 512 MB. That is very low, hence we were seeing a delay. Our instance allowed us to bump the RAM to 20 GB, which made it a lot quicker, with negligible delay. But this is just the dev environment; when this goes to production we expect a lot of calls through our API, so the RAM might not be enough. That being said, we do have a Couchbase Enterprise subscription, since a lot of our teams are using it, but we are the first ones to try FTS. We have raised a request through a Couchbase support ticket.

@david.nault Regarding your idea, would I have to maintain the static MutationState in a concurrent map for each partition, or would just a single static instance of MutationState do?

Regards,
Herat

A single static instance would do.
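
To illustrate why one shared instance is enough, here is a minimal, SDK-free sketch of the bookkeeping described in the workaround above: keep only the highest sequence number seen for each partition. HighWaterMarks and SharedStateDemo are hypothetical stand-ins written for this example, not the SDK's MutationState (real mutation tokens also carry more than a partition id and sequence number, which this model leaves out).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of "remember the highest sequence number per partition".
final class HighWaterMarks {
    private final ConcurrentHashMap<Integer, Long> highestSeqnoByPartition =
            new ConcurrentHashMap<>();

    /** Records a write; an older seqno for the same partition is ignored. */
    void record(int partition, long seqno) {
        highestSeqnoByPartition.merge(partition, seqno, Long::max);
    }

    /** Returns a copy of the current per-partition high-water marks. */
    Map<Integer, Long> snapshot() {
        return new HashMap<>(highestSeqnoByPartition);
    }
}

final class SharedStateDemo {
    // One process-wide instance, shared by the write path and the read path.
    static final HighWaterMarks SHARED = new HighWaterMarks();

    public static void main(String[] args) {
        SHARED.record(12, 100L);
        SHARED.record(12, 90L);  // older than 100 for partition 12, so it is dropped
        SHARED.record(731, 5L);
        System.out.println(SHARED.snapshot());
    }
}
```

In real code, the write path would add the mutation token from each KV write to the shared MutationState, and the read path would pass that same instance to SearchOptions.consistentWith. As noted above, writes made by other processes would not be covered.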