Hello,
I am using FTS (full-text search) on a field that holds a person's name. The name can contain diacritics (é, è, ë, ñ, ø, ç, …). Which analyser should I use for that field so that, if my Couchbase database contains the data below:
“Hervé Villechaizé”
“Hérve Villechaize”
“Herve Villechaize”
“Hërve Villechaizè”
and a user searches with plain English characters, “Herve Villechaize”, the system returns all 4 documents?
Does such an analyser exist? Or what customizations should I make to the existing analysers?
Thanks in advance,
Natalia
Thanks for the pointer. I tried to implement it, and it works when I search with English letters: the results include the names with diacritics as well. But when I search with a diacritic character, “hervé”, the search does not return anything.
I saw a reply about _all, but I did not understand how to implement it. Can you please advise?
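You are using a “term” query there for the term with the diacritic.
Term queries are non-analytic (ref: Search Request JSON Properties | Couchbase Docs).
So the custom analyser is not applied to the search text at query time, and the raw “hervé” is compared against the folded tokens in the index. A match query is analysed, so use that instead.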
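To illustrate the difference, here is a minimal Python sketch. It does not call Couchbase; `analyse` is a stand-in for the custom analyser (it lowercases and strips accents that decompose under Unicode NFKD, so a letter like ø is not covered), and the field name `name` in the request bodies is an assumption, not something taken from the index in this thread.

```python
import unicodedata

def analyse(text):
    """Stand-in analyser: split on whitespace, lowercase, strip combining accents."""
    tokens = []
    for word in text.split():
        decomposed = unicodedata.normalize("NFKD", word.lower())
        tokens.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return tokens

# Index time: the stored name is analysed, so only folded tokens end up in the index.
indexed_tokens = set(analyse("Hervé Villechaizé"))

def term_query(term):
    # Non-analytic: the term is looked up exactly as typed.
    return term in indexed_tokens

def match_query(text):
    # Analytic: the search text goes through the same analyser first.
    return any(token in indexed_tokens for token in analyse(text))

term_hit = term_query("hervé")    # raw "hervé" is not among the folded tokens
match_hit = match_query("hervé")  # folded to "herve" before the lookup

# Roughly corresponding search request bodies ("name" is a hypothetical field).
term_body = {"query": {"term": "hervé", "field": "name"}}
match_body = {"query": {"match": "hervé", "field": "name"}}
```

In this sketch the term lookup misses and the match lookup hits, which mirrors the behaviour described above.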
Indeed, the search with the diacritic now returns results. But now I have another issue, with spaces. If I search for “Herve Villechaize” and the database also contains other people, “Neil Shervell” and “Emma Haize”, they are returned by this query as well.
How do I construct the query correctly so that it returns all variations of “Herve Villechaize” (that is, with English letters and with diacritics) and no other names?
Thanks
The default operator for a match query is “or”, meaning the tokens produced by analysing the query text are OR-ed, so a document that matches any one of them is returned. Setting the operator to “and” forces the query to require all of the tokens in that particular field.
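The effect of the operator can be sketched in plain Python. This is only a simulation under stated assumptions: `analyse` stands in for the custom analyser (lowercase plus accent stripping via NFKD), “Herve Smith” is a made-up extra document that shares one token with the search text, and `name` is a hypothetical field name.

```python
import unicodedata

def analyse(text):
    """Stand-in analyser: split on whitespace, lowercase, strip combining accents."""
    tokens = []
    for word in text.split():
        decomposed = unicodedata.normalize("NFKD", word.lower())
        tokens.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return tokens

docs = [
    "Hervé Villechaizé",
    "Hérve Villechaize",
    "Herve Villechaize",
    "Hërve Villechaizè",
    "Herve Smith",  # hypothetical document that shares only one token
]

def match(text, operator="or"):
    """Return the documents that satisfy the analysed query tokens under the operator."""
    wanted = analyse(text)
    combine = all if operator == "and" else any
    return [d for d in docs if combine(t in set(analyse(d)) for t in wanted)]

or_hits = match("Herve Villechaize")                   # any token is enough
and_hits = match("Herve Villechaize", operator="and")  # every token is required

# Roughly corresponding search request body ("name" is a hypothetical field).
and_body = {
    "query": {"match": "Herve Villechaize", "field": "name", "operator": "and"}
}
```

With “or”, the made-up “Herve Smith” is returned alongside the four variants; with “and”, only the four variants remain.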