FTS index rebuilding process

Nitesh_Gupta · March 1, 2021, 9:34am

Hi,
We are on CB version 6.0.2
I want to understand how index rebuilding works when a field is added to or removed from an FTS index.
What happens when we add/remove a field from an FTS index? Does the index rebuild happen in the background (that is, while the index is rebuilding, can our application still send queries to the search service)?
OR
Do we need to take downtime during the index rebuild?

Note: during the index rebuild, I don't care about the newly added field. As long as search is returning data, we are okay. I mean, will search remain available? (At the least it should return the older response instead of an empty response.)

My second doubt:
As I understand it, an index rebuild can happen in two ways: either by update/insert/delete of data or by a change in the index definition. If a rebuild triggered by a definition change is in progress while data is also being modified/added, will FTS be consistent once the full index build completes?

Thanks
Nitesh

sreeks · March 1, 2021, 12:55pm

Hi @Nitesh_Gupta,

Yes. Whenever a user edits the index definition, it results in an index rebuild, except for a few cases in recent releases (>6.5):

Adding/removing replica partition count won’t result in index rebuild.
Any scorch storage property change won’t result in an index rebuild.

All other index definition changes result in a rebuild. During this time, live traffic would be affected.

Ideally, these shouldn’t be a major concern for the production systems since most of these fields to index/search would have been finalized during the dev period itself.

Use Index-Alias

Now, for accommodating any production time index maintenance tasks, we have the index-alias feature which users can use to manage the index rebuilds/recreations without affecting the live traffic.
ref - Create a Search Index | Couchbase Docs
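
To make the alias idea concrete, here is a rough sketch of the usual pattern: the application always queries an alias, the changed definition is built as a separate new index, and once that index has caught up the alias is pointed at it. The code only builds the alias definition and the REST request against the Search service; it does not send anything. The class name, the alias/index names, and the credentials are made up for illustration, and the port (8094) and payload shape should be checked against the docs for your server version. Updating an alias that already exists may also need its current UUID in the body, which this sketch leaves out.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of any Couchbase SDK.
final class AliasSwap {

    // JSON definition of an alias that points at exactly one target index.
    // Names are inserted as-is, so they must not contain quotes.
    static String aliasDefinition(String aliasName, String targetIndex) {
        return "{"
            + "\"type\":\"fulltext-alias\","
            + "\"name\":\"" + aliasName + "\","
            + "\"sourceType\":\"nil\","
            + "\"params\":{\"targets\":{\"" + targetIndex + "\":{}}}"
            + "}";
    }

    // Builds (but does not send) the PUT request that creates the alias.
    static HttpRequest putAlias(String host, String user, String password,
                                String aliasName, String targetIndex) {
        String credentials = Base64.getEncoder().encodeToString(
            (user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(
                URI.create("http://" + host + ":8094/api/index/" + aliasName))
            .header("Content-Type", "application/json")
            .header("Authorization", "Basic " + credentials)
            .PUT(HttpRequest.BodyPublishers.ofString(
                aliasDefinition(aliasName, targetIndex)))
            .build();
    }
}
```

With this pattern the old index keeps serving queries through the alias while the new one builds, so the application never sees an empty or half-built index.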

Indexes never get rebuilt by DML/CRUD operations on the documents; an index is rebuilt only when its definition changes.
An FTS index is always streaming the latest changes into itself and should give the latest results.

Users can use consistency level “at_plus” and provide vectors for verifying this.

Whenever we need to read your own writes (RYOW), we are supposed to pass a consistency vector conveying that to the FTS back end.
Today FTS supports only the `at_plus` consistency level.

In short, one has to use consistentWith(MutationState) in the API calls while searching, where the MutationState is derived from the previous write ops that one wants to read.

examples:

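import com.couchbase.client.java.json.JsonObject;
import com.couchbase.client.java.kv.MutationResult;
import com.couchbase.client.java.kv.MutationState;
import com.couchbase.client.java.search.SearchQuery;
import com.couchbase.client.java.search.result.SearchResult;
import static com.couchbase.client.java.search.SearchOptions.searchOptions;

// Write a document and keep its mutation token (cluster/collection are already connected).
MutationResult mutationResult = collection.upsert("key", JsonObject.create());
MutationState mutationState = MutationState.from(mutationResult.mutationToken().get());
// Search only once the index has caught up with that write (at_plus).
SearchResult searchResult = cluster.searchQuery(
  "index",
  SearchQuery.queryString("query"),
  searchOptions().consistentWith(mutationState)
);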
https://docs.couchbase.com/java-sdk/current/howtos/full-text-searching-with-sdk.html

https://docs.couchbase.com/java-sdk/2.7/scan-consistency-examples.html
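
For clients that call the Search REST endpoint directly instead of going through an SDK, the same `at_plus` request could look roughly like the body built below. This is only a sketch: the class and parameter names are made up, the 10000 ms timeout is an arbitrary choice, and the vector key format (vbucket id / vbucket UUID mapped to a sequence number) is as I understand it from the Search REST docs, so verify it against your server version. The body would be POSTed to /api/index/{indexName}/query.

```java
import java.util.Locale;

// Hypothetical helper, not part of any Couchbase SDK.
final class AtPlusRequest {

    // Builds a Search request body asking the index to have processed at least
    // the given sequence number of one vbucket before answering.
    // Values are inserted as-is, so they must not contain quotes.
    static String body(String indexName, String queryString,
                       int vbucketId, long vbucketUuid, long seqno) {
        return String.format(Locale.ROOT,
            "{\"query\":{\"query\":\"%s\"},"
          + "\"ctl\":{\"timeout\":10000,"
          + "\"consistency\":{\"level\":\"at_plus\","
          + "\"vectors\":{\"%s\":{\"%d/%d\":%d}}}}}",
            queryString, indexName, vbucketId, vbucketUuid, seqno);
    }
}
```

The vbucket id, vbucket UUID and sequence number are the same pieces of information the mutation token carries in the SDK example.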

Nitesh_Gupta · March 1, 2021, 1:07pm

Thanks @sreeks for the explanation.

What are the scorch storage properties?

And from the explanation, do I understand correctly that we need to take downtime during the index rebuild to avoid inconsistent results?

sreeks · March 1, 2021, 1:46pm

No.
If you are not ready to use, or don't want to use, the index-alias feature (the recommended practice) to guard against runtime maintenance of your production index, then such rebuilds can affect your consistency.

In all circumstances, clients with strict consistency requirements can decorate their queries with consistency levels and vectors to protect against such consistency worries.

Nitesh_Gupta · March 2, 2021, 11:08am

Hi,
What are scorch storage properties?

sreeks · March 2, 2021, 12:27pm

Those are the configuration knobs for changing the storage-level properties of an index, e.g. compaction aggressiveness.
We don't expose them to users unless they run into issues with the default configuration values.

We want the FTS users not to worry about those intricacies.
