FTS Search design for messages search

hiren.rojasara · August 16, 2022, 5:47am

Hi, I am developing a messaging app in which there can be N groups, and each group can have N messages. I am thinking of storing each message as a document in Couchbase.

For that I am using one collection, called "messages", which would hold documents like:
doc key → msg/1
doc value → {
"msg": "couchbase is Great",
"groupId": "group1"
}

doc key → msg/2
doc value → {
"msg": "elasticsearch is great",
"groupId": "group2"
}

Now I want to provide search within a group, so I was thinking of using a type mapping on the search index.
The type field would be "groupId", and while creating the search index I would specify "messages.group1", so the search index would be created for type group1 only.

So if I search group1 for the keyword "great", it returns only msg/1.

My question is: is it a good design to create a separate search index per group? Our system could have more than 1 million groups, so there would be 1 million search indexes in the system.

If I instead create one search index for the whole collection, then a term like "Hi" could be found in many messages across many groups. Internally the index would hold a long doc ID set against the "Hi" term, and that might affect the query result.

So what would be a better design for this kind of scenario?

Thanks in advance.

sreeks · August 16, 2022, 6:23am

Hi,

A separate search index per group is not recommended at all, as scaling the number of indexes into the thousands won't work smoothly.

If you create a single collection-level search index, you can always scope your queries down to whichever group you are interested in, using AND clauses or conjunction queries in Search. That is, you just add one more search clause for the groupId you are interested in.
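To illustrate, here is a minimal sketch in Python that builds such a request body. The `conjuncts` array, `field`, `match`, and `size` keys follow the Search query JSON; the helper name `group_scoped_query` is made up for this example, and it assumes `groupId` and `msg` are both indexed fields in the single collection-level index.

```python
import json


def group_scoped_query(group_id: str, text: str, limit: int = 10) -> dict:
    """Build a Search request body: match `text` in msg, restricted to one group.

    Hypothetical helper for illustration; assumes the index maps both
    the "groupId" and "msg" fields.
    """
    return {
        "query": {
            # Every clause in "conjuncts" must match (logical AND).
            "conjuncts": [
                {"field": "groupId", "match": group_id},
                {"field": "msg", "match": text},
            ]
        },
        "size": limit,
    }


body = group_scoped_query("group1", "great")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against the index. Depending on how `groupId` is analyzed in your index, an exact-term clause may suit that field better than `match`; that choice is left open here.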

Cheers!

hiren.rojasara · August 16, 2022, 9:44am

Hi @sreeks, thanks for your reply.

I wanted to know how Couchbase/Bleve search performs the conjunction operation. Any information on that would be very helpful.

For example, if I pass two queries in a conjunction:
{ "match": "group1", "field": "groupId" },
{ "prefix": "god", "field": "msg" }

Assumption 1:
Both queries are executed, in parallel or sequentially, and the intersection happens after both result sets are available. The intersected results are then returned.

Assumption 2:
The first query is executed, and the second query is applied as a filter only on the results of the first query. Those filtered results are then returned.

sreeks · August 16, 2022, 10:37am

The internal Bleve searchers perform the intersection and return the results that satisfy all of the clauses in a conjunction query.

Assumption 1 is closer to what happens internally.
Bleve doesn’t have the explicit concept of filters in a query.
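As a rough illustration of the intersection idea (a toy model only, not Bleve's actual code), each clause can be thought of as producing a sorted list of doc keys, and the conjunction keeps the keys that appear in every list. The `build_index` and `intersect` helpers below are invented for this sketch, and tokenization is just lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_index(docs):
    """Map (field, token) -> sorted list of doc keys (a toy posting list)."""
    postings = defaultdict(set)
    for key, doc in docs.items():
        for field, value in doc.items():
            for token in value.lower().split():
                postings[(field, token)].add(key)
    return {term: sorted(keys) for term, keys in postings.items()}


def intersect(a, b):
    """Walk two sorted posting lists in step, keeping keys present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


docs = {
    "msg/1": {"msg": "couchbase is Great", "groupId": "group1"},
    "msg/2": {"msg": "elasticsearch is great", "groupId": "group2"},
}
index = build_index(docs)

# One posting list per clause, then intersect them.
group_hits = index.get(("groupId", "group1"), [])
text_hits = index.get(("msg", "great"), [])
hits = intersect(group_hits, text_hits)
print(hits)  # ['msg/1']
```

In this toy model the walk ends as soon as the shorter list is exhausted, so a very common term in `msg` costs little when the `groupId` list is short.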

Cheers!
