Hi CB Team,
We have a scenario where we want to index (using FTS) documents that contain both static and dynamic fields.
Static fields: Fields common to all documents, e.g. make, model, country.
Dynamic fields: Fields that appear only in some documents, e.g. mpg (car), rateOfClimb (aircraft), stroke (ship).
Sample documents to illustrate the scenario:
{
  "id": "vehicle::1000",
  "type": "vehicle",
  "category": "car",
  "make": "Hyundai",
  "model": "Sonata",
  "country": "South Korea",
  "airbags": true,
  "engineType": "gasoline",
  "horsepower": 500,
  "mpg": 30
}
{
  "id": "vehicle::1001",
  "type": "vehicle",
  "category": "aircraft",
  "make": "Boeing",
  "model": "747",
  "country": "USA",
  "engine": "Rolls-Royce",
  "thrust": 59450,
  "range": 4000,
  "rateOfClimb": 6000
}
{
  "id": "vehicle::1002",
  "type": "vehicle",
  "category": "ship",
  "make": "Marine Shipbuilding Co",
  "model": "Coral Princess",
  "country": "France",
  "stroke": 2500,
  "propeller": "screw",
  "pistonSpeed": 10
}
Note: Every field in these documents will be included in the index.
Note: We will run text, numeric, and range queries against any field.
Note: We would like to keep the JSON documents as flat as possible, i.e. no nested fields.
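To make the query requirement concrete, here is a rough sketch of the two kinds of FTS request bodies we have in mind: a match query on a static field and a numeric range query on a dynamic field. The helper names (match_query, numeric_range_query) and the values are hypothetical and only for illustration; the field names come from the sample documents above.

```python
import json


def match_query(field, text):
    """Build an FTS request body for a text match on a single field."""
    return {"query": {"match": text, "field": field}}


def numeric_range_query(field, minimum=None, maximum=None):
    """Build an FTS request body for a numeric range on a single field.

    Either bound may be omitted; each bound that is given is inclusive.
    """
    query = {"field": field}
    if minimum is not None:
        query["min"] = minimum
        query["inclusive_min"] = True
    if maximum is not None:
        query["max"] = maximum
        query["inclusive_max"] = True
    return {"query": query}


# Text search on a static field shared by all documents.
text_search = match_query("make", "Hyundai")

# Range search on a dynamic field that only car documents carry.
range_search = numeric_range_query("mpg", minimum=25, maximum=35)

print(json.dumps(text_search, indent=2))
print(json.dumps(range_search, indent=2))
```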
In our research, we found that the "default type mapped" index would be a perfect fit for all of our functional requirements. Our concern is on the non-functional side (performance and scalability). According to Couchbase's FTS best practices and optimization guide: "The default dynamic mapping produces larger indexes and is potentially unsuitable for production deployments." (Refer: Full-Text Search Indexing Best Practices & Tips - Part 1).
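For reference, the kind of index definition we are asking about looks roughly like the sketch below. It is a simplified approximation, not an exact export from our cluster: the index name and bucket name are placeholders, the sourceType value is an assumption that may differ between Couchbase Server versions, and only the settings relevant to the default dynamic mapping are shown.

```python
import json


def dynamic_index_definition(index_name, bucket_name):
    """Return a simplified FTS index definition that relies on the
    default dynamic mapping, so every field in every document is indexed."""
    return {
        "type": "fulltext-index",
        "name": index_name,
        # Assumption: older servers use "couchbase"; newer ones may use "gocbcore".
        "sourceType": "couchbase",
        "sourceName": bucket_name,
        "params": {
            "mapping": {
                # The default mapping is enabled and dynamic: no fields are
                # listed explicitly, so all of them are picked up.
                "default_mapping": {"enabled": True, "dynamic": True},
                "default_analyzer": "standard",
                "default_type": "_default",
                "type_field": "type",
                "index_dynamic": True,
                "store_dynamic": False,
            }
        },
    }


# Placeholder names, for illustration only.
definition = dynamic_index_definition("vehicle-dynamic-idx", "vehicles")
print(json.dumps(definition, indent=2))
```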
Questions:
- Is the "default type mapped" index a production-worthy solution when documents have dynamic fields, every field has to be indexed, and the bucket holds a few tens of millions of documents?
- If not, what is the best approach to deal with this situation?
- If yes, do you have any performance numbers or metrics that you could share with us?
Thanks,
Vishnu