FTS Custom Analyzer help

We index a document type=user (say).

We index the email field.

We want to support non-exact matching, for which the standard analyzer works.

I’m assuming that I can index the same field twice using a different Analyzer and providing an alternative ‘searchable as’ id. Is this a correct assumption?

But we also want to support exact matches.
The simple analyzer won't work for that because it breaks up the address into terms.
The keyword analyzer also won't work because we need me@example.com to match Me@ExampLe.COm.

Thus I'm trying to create a keyword analyzer with a to_lower step, but it's not clear to me from the UI options how I can do that. I'm not sure what combination of drop-down character filters/tokenizer/token filters to select to get that result.

From the definition of standard, I am hopeful that unicode + to_lower without the stop filter will do the trick:
UPDATE: According to http://analysis.blevesearch.com/analysis, that sadly doesn't work.

standard : Analysis by means of the Unicode tokenizer, the to_lower token filter, and the stop token filter.

I’m assuming that I can index the same field twice using a different Analyzer and providing an alternative ‘searchable as’ id. Is this a correct assumption?

Yes, this is allowed. Note, however, that you'll have to search them as separate fields.
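
For illustration, the child field mapping in the index definition JSON could look roughly like the sketch below, with the email field indexed twice under two different "searchable as" names. The names "email_exact" and "keyword_to_lower" are just placeholders I've picked, not anything FTS prescribes:

```json
"properties": {
  "email": {
    "enabled": true,
    "fields": [
      {
        "name": "email",
        "type": "text",
        "analyzer": "standard",
        "index": true
      },
      {
        "name": "email_exact",
        "type": "text",
        "analyzer": "keyword_to_lower",
        "index": true
      }
    ]
  }
}
```

Queries that need the exact-match behaviour would then target the field "email_exact" explicitly.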

Also, if it's keyword-with-to_lower analysis you want to achieve, just create a custom analyzer with the "single" tokenizer and the to_lower token filter; you should then be set to match: ["me@example.com", "Me@ExampLe.COm"].
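
As a sketch, the corresponding custom analyzer entry in the index definition's analysis section would look something like this (same placeholder analyzer name as above):

```json
"analysis": {
  "analyzers": {
    "keyword_to_lower": {
      "type": "custom",
      "char_filters": [],
      "tokenizer": "single",
      "token_filters": ["to_lower"]
    }
  }
}
```

With this analyzer, both "me@example.com" and "Me@ExampLe.COm" should reduce to the single token "me@example.com".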

Thank you.

Testing it out with the awesome build-your-own-and-try-it feature, I did find that I have to combine the tokenizer = 'single' with the token filter = 'to_lower' to get the result.
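
For reference, a query sketch against the exact-match field via the query JSON (using the placeholder field name email_exact from above); as far as I understand, a match query runs the query text through that field's analyzer, so the mixed-case input is lowercased before matching:

```json
{
  "query": {
    "match": "Me@ExampLe.COm",
    "field": "email_exact"
  }
}
```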
