FTS Custom Analyzer help

We index a document type=user (say).

We index the email field.

We want to support non-exact matching, for which the standard analyzer works.

I’m assuming that I can index the same field twice using a different Analyzer and providing an alternative ‘searchable as’ id. Is this a correct assumption?

But we also want to support exact matches.
The simple analyzer won't work for that because it breaks up the address into terms.
The keyword analyzer also won't work because we need me@example.com to match Me@ExampLe.COm.

Thus I'm trying to create a keyword analyzer with a to_lower step, but it's not clear to me from the UI options how I can do that. I'm not sure what combination of drop-down character filters/tokenizer/token filters to select to get that result.

From the definition of standard, I am hopeful that unicode + to_lower without the stop filter will do the trick:
UPDATE: According to http://analysis.blevesearch.com/analysis, that sadly doesn't work.

standard : Analysis by means of the Unicode tokenizer, the to_lower token filter, and the stop token filter.

I’m assuming that I can index the same field twice using a different Analyzer and providing an alternative ‘searchable as’ id. Is this a correct assumption?

Yes, this is allowed. Note, however, that you'll have to search them as separate fields.
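
For illustration, the child field mapping in the index definition JSON could look roughly like the sketch below, with the email field indexed twice under two different "searchable as" names. The names "email_exact" and "keyword_to_lower" are just placeholders I've picked, not anything FTS prescribes:

```json
"properties": {
  "email": {
    "enabled": true,
    "fields": [
      {
        "name": "email",
        "type": "text",
        "analyzer": "standard",
        "index": true
      },
      {
        "name": "email_exact",
        "type": "text",
        "analyzer": "keyword_to_lower",
        "index": true
      }
    ]
  }
}
```

Queries that need the exact-match behaviour would then target the field "email_exact" explicitly.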

Also, if it's keyword-with-to_lower analysis you want to achieve, just create a custom analyzer with the "single" tokenizer and the to_lower token filter; you should then be set to match: ["me@example.com", "Me@ExampLe.COm"].
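
As a sketch, the corresponding custom analyzer entry in the index definition's analysis section would look something like this (same placeholder analyzer name as above):

```json
"analysis": {
  "analyzers": {
    "keyword_to_lower": {
      "type": "custom",
      "char_filters": [],
      "tokenizer": "single",
      "token_filters": ["to_lower"]
    }
  }
}
```

With this analyzer, both "me@example.com" and "Me@ExampLe.COm" should reduce to the single token "me@example.com".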

Thank you.

Testing it out with the awesome build-your-own-and-try-it feature, I did find that I have to combine the tokenizer = 'single' with the token filter = 'to_lower' to get the result.
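
For reference, a query sketch against the exact-match field via the query JSON (using the placeholder field name email_exact from above); as far as I understand, a match query runs the query text through that field's analyzer, so the mixed-case input is lowercased before matching:

```json
{
  "query": {
    "match": "Me@ExampLe.COm",
    "field": "email_exact"
  }
}
```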
