FTS index design for message search

Hi, I am developing a messaging app in which there can be N groups, and each group can have N messages. I am thinking of storing each message as a document in Couchbase.

I am using a single collection for this, called “messages”, which would hold documents like:
doc key → msg/1
doc value → {
  "msg": "couchbase is Great",
  "groupId": "group1"
}

doc key → msg/2
doc value → {
  "msg": "elasticsearch is great",
  "groupId": "group2"
}

Now I want to provide search within a group, so I was thinking of using a type mapping on the search index.
The type field would be “groupId”, and while creating the search index I would specify “messages.group1”, so the index would be created for type group1 only.

So if I search group1 for the keyword “great”, it returns only msg/1.

My question is: is it a good design to create a separate search index per group? Our system could have more than 1 million groups, so there would be 1 million search indexes in the system.

If I instead create one search index for the whole collection, then a term such as “Hi” could be found in many messages across many groups. Internally the index would hold a long set of doc IDs for the “Hi” keyword, and that might affect the query results.

So what would be a better design for this kind of scenario?

Thanks in advance.

Hi,

A separate search index per group is not recommended at all, as scaling the number of indexes to the order of thousands won’t work smoothly.

If you create a single collection-level search index, you can always scope your queries down to whichever group you are interested in by using AND clauses, that is, conjunction (“conjuncts”) queries in Search. In other words, you just add one more search clause for the groupId you are interested in.
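
As a rough sketch of that scoping, a conjunction query body can pair a clause on groupId with a clause on msg. The Python below only builds and prints the JSON request body. The helper name `build_group_query` is hypothetical, the field names come from the sample documents in this thread, and it assumes groupId is indexed with the keyword analyzer so the match clause compares the whole ID.

```python
import json


def build_group_query(group_id: str, text: str, limit: int = 10) -> dict:
    """Build a Search request body restricted to a single group.

    Hypothetical helper for illustration; field names follow the
    sample documents in this thread.
    """
    return {
        "query": {
            # Every clause listed under "conjuncts" must match (logical AND).
            "conjuncts": [
                # Assumes groupId is indexed with the keyword analyzer.
                {"match": group_id, "field": "groupId"},
                {"match": text, "field": "msg"},
            ]
        },
        "size": limit,
    }


if __name__ == "__main__":
    print(json.dumps(build_group_query("group1", "great"), indent=2))
```

The same single index then serves every group; only the groupId clause changes per request.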

Cheers!

Hi @sreeks, thanks for your reply.

I would like to know how Couchbase/Bleve search performs the conjunction operation. Any information on that would be very helpful.

For example, if I pass two queries in a conjunction:
{ "match": "group1", "field": "groupId" },
{ "prefix": "god", "field": "msg" }

Assumption 1:
Both queries are executed, in parallel or sequentially, and the intersection happens once both result sets are available. The intersected results are then returned.

Assumption 2:
The first query is executed, and the second query is applied as a filter only on the results of the first. Those filtered results are then returned.

The internal Bleve searchers perform the intersection and fetch the results that satisfy all of the clauses in a conjunction query.

Assumption 1 is closer to what happens internally.
Bleve doesn’t have an explicit concept of filters in a query.
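
To make Assumption 1 concrete, here is a simplified illustration of intersecting the results of two clauses. It is not taken from Bleve's source. It assumes each clause yields a sorted list of document numbers, and the function name `intersect_postings` and the sample numbers are made up for the example.

```python
def intersect_postings(a: list[int], b: list[int]) -> list[int]:
    """Intersect two sorted lists of document numbers.

    Simplified illustration only, not Bleve's actual implementation.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Document satisfies both clauses, so it is a hit.
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # list a is behind, move it forward
        else:
            j += 1  # list b is behind, move it forward
    return result


if __name__ == "__main__":
    group_docs = [1, 4, 7]          # hypothetical docs matching the groupId clause
    text_docs = [1, 2, 3, 4, 5, 8]  # hypothetical docs matching the msg clause
    print(intersect_postings(group_docs, text_docs))
```

Neither clause acts as a filter over the other here; both contribute candidates, and only documents present in both lists are returned.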

Cheers!