@petojurkovic,
There are two parts to the question: (1) how to filter based on the specific email and (2) how to index and search the data field.
For (1), index the email field using the keyword analyzer, which emits the contents of the email field as a single token. I’m assuming you sanitize your input and that the email field contains only a single, well-formed email address.
Then, your search will be a compound search that always includes a term search that must match the email address. A term search isn’t analyzed, so again, make sure you pass the entire valid email as input, exactly as it is stored.
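As a rough sketch of that compound search, the request body below combines a term query on email with a match query over the default field, wrapped in a conjunction so that both clauses must match. The function names build_filtered_query and run_filtered_query are made up for illustration, the host, port, and index name come from my local test setup, and the helper assumes its arguments contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: prints an FTS request body that requires an exact
# email match (term query, not analyzed) AND a full text match on the text.
# Assumes neither argument contains double quotes or backslashes.
build_filtered_query() {
  email=$1
  text=$2
  printf '{"query": {"conjuncts": [{"term": "%s", "field": "email"}, {"match": "%s"}]}}\n' "$email" "$text"
}

# Hypothetical wrapper: posts the body to the index's query endpoint.
# Host, port, and index name are assumptions from a local test cluster.
run_filtered_query() {
  build_filtered_query "$1" "$2" |
    curl -s -XPOST -H "Content-Type: application/json" \
      http://127.0.0.1:8094/api/index/petojurkovic/query -d @-
}
```

Calling run_filtered_query 'email@address.tld' 'searchableValue' would send the query; build_filtered_query on its own just prints the body so you can inspect it first.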
For (2), I read this a few times, and I think what you want is to find the string “searchableValue” no matter where it appears in the structure under “data”. Dynamic mapping should work for this. In the example you gave with the LIKE, I’m not sure exactly what kind of full text search behavior you want, but you can experiment with that part of the query until it does what you need. Regular expression search doesn’t perform as well as match or match phrase, so I recommend you use those if they meet your needs. (More about FTS query types)
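To make the match versus match phrase choice concrete, here is a small sketch that prints either kind of clause for a piece of text. The helper name build_text_clause and its mode argument are hypothetical, and it again assumes the text contains no double quotes or backslashes. With no field given, the clause runs against the index’s default field.

```shell
#!/bin/sh
# Hypothetical helper: prints a single FTS query clause for the given text.
# Mode "phrase" asks for the analyzed terms to appear together, in order;
# any other mode falls back to a plain match query on the individual terms.
build_text_clause() {
  mode=$1
  text=$2
  case $mode in
    phrase) printf '{"match_phrase": "%s"}\n' "$text" ;;
    *)      printf '{"match": "%s"}\n' "$text" ;;
  esac
}
```

Either clause can take the place of the full text part of a compound query, next to the term clause on email.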
I created a document with ID petojurkovic and "type": "petojurkovic" in my travel-sample bucket to illustrate this (a bit lazy of me!). The REST API calls for the index definition and the query are below. For your query, you probably want to use one of the SDKs instead of the REST call (because the SDKs know the cluster topology, and they are probably easier to use, too).
For the index definition, to make testing easier, I turned on “Store Dynamic Fields” in the advanced index settings. This just writes a copy of the data in every field into the index, so when you search in the web admin, you see result snippets and highlighting. I also turned off dynamic indexing at the top level and only turned it on for the data field, which makes your index a little more selective / efficient.
curl -XPUT -H "Content-Type: application/json" \
http://127.0.0.1:8094/api/index/petojurkovic \
-d '{
"type": "fulltext-index",
"name": "petojurkovic",
"uuid": "3830cac09bb9ffb4",
"sourceType": "couchbase",
"sourceName": "travel-sample",
"sourceUUID": "3dd7f72189ec1a3952e2c267bc5a061d",
"planParams": {
"maxPartitionsPerPIndex": 32,
"numReplicas": 0,
"hierarchyRules": null,
"nodePlanParams": null,
"pindexWeights": null,
"planFrozen": false
},
"params": {
"doc_config": {
"mode": "type_field",
"type_field": "type"
},
"mapping": {
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"display_order": "1",
"dynamic": true,
"enabled": false
},
"default_type": "_default",
"index_dynamic": true,
"store_dynamic": true,
"type_field": "type",
"types": {
"petojurkovic": {
"display_order": "0",
"dynamic": false,
"enabled": true,
"properties": {
"data": {
"display_order": "0",
"dynamic": true,
"enabled": true
},
"email": {
"dynamic": false,
"enabled": true,
"fields": [
{
"analyzer": "keyword",
"display_order": "0",
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "email",
"store": true,
"type": "text"
}
]
}
}
}
}
},
"store": {
"kvStoreName": "mossStore"
}
},
"sourceParams": {
"clusterManagerBackoffFactor": 0,
"clusterManagerSleepInitMS": 0,
"clusterManagerSleepMaxMS": 2000,
"dataManagerBackoffFactor": 0,
"dataManagerSleepInitMS": 0,
"dataManagerSleepMaxMS": 2000,
"feedBufferAckThreshold": 0,
"feedBufferSizeBytes": 0
}
}'
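This is how you would do what’s described above with a query string query (easiest for me, possibly not the type of query you want). The + prefix makes the email clause mandatory. searchableValue has no prefix, so it is optional and only raises the score of documents that contain it; prefix it with + as well if it must match.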
curl -XPOST -H "Content-Type: application/json" \
http://127.0.0.1:8094/api/index/petojurkovic/query \
-d '{
"explain": true,
"fields": [
"*"
],
"highlight": {},
"query": {
"query": "+email:email@address.tld searchableValue"
}
}'
Hope that helps you get started right; let me know if you have more questions. Good luck!