Hi,
As far as I know, Couchbase uses an inverted index.
So why does search time increase significantly when the number of documents doubles, even though the total hits for the query remain the same?
Thanks,
Rajeev
It shouldn’t affect performance significantly, provided you have used the right index definition, apt querying, and appropriate sizing for your FTS cluster.
Without any of this info, it is hard to comment on what’s happening in your case.
Cheers!
This is the index definition that I am using.
{
"type": "fulltext-index",
"name": "ffprofiles",
"uuid": "c0d8fa4dfeea33b7",
"sourceType": "couchbase",
"sourceName": "devcpl",
"sourceUUID": "254ed07175a79e45ea78aa01fa786c80",
"planParams": {
"maxPartitionsPerPIndex": 171
},
"params": {
"doc_config": {
"docid_prefix_delim": "::",
"docid_regexp": "",
"mode": "docid_prefix",
"type_field": "type"
},
"mapping": {
"analysis": {},
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"default_analyzer": "",
"dynamic": true,
"enabled": false
},
"default_type": "_default",
"docvalues_dynamic": true,
"index_dynamic": true,
"store_dynamic": false,
"type_field": "_type",
"types": {
"FF": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"PIIPayload": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"ProfileInfo": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"Profile": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"Customer": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"Address": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"AddressLine": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"_text": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "_text",
"store": true,
"type": "text"
}
]
}
}
}
}
},
"PersonName": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"GivenName": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"_text": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "_text",
"store": true,
"type": "text"
}
]
}
}
},
"Surname": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"_text": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "_text",
"store": true,
"type": "text"
}
]
}
}
}
}
},
"Telephone": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"_attr": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"PhoneNumber": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "PhoneNumber",
"store": true,
"type": "text"
}
]
}
}
}
}
},
"_attr": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"BirthDate": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "BirthDate",
"store": true,
"type": "datetime"
}
]
}
}
}
}
}
}
},
"UniqueID": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"_attr": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"properties": {
"ID": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "ID",
"store": true,
"type": "text"
}
]
},
"ID_Context": {
"default_analyzer": "",
"dynamic": false,
"enabled": true,
"fields": [
{
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "ID_Context",
"store": true,
"type": "text"
}
]
}
}
}
}
}
}
}
}
}
}
}
}
},
"store": {
"indexType": "scorch",
"kvStoreName": "mossStore"
}
},
"sourceParams": {}
}
Please provide a sample document, the query, and the FTS cluster sizing details.
If you already have a license, then please raise a CBSE with all these details…
Sample doc:
**Query:**
{
"explain": true,
"fields": [
"*"
],
"highlight": {},
"query": {
"query": " \"Zh Omarova 29 Apt 11 Almaty 050012 Kazakhstan\" "
}
}
A 256 MB RAM quota for FTS looks clearly low. I'm not sure how many documents you are indexing now, but raising it to 700 MB or 1 GB would be better.
If you plan to query the entire address as in your query, use the keyword analyzer for the AddressLine _text field rather than the default.
Field-scoping the queries helps speed them up.
While indexing, you may also skip the “_all” option to save index space, but that mandates field-scoped queries.
A few indexing tips are here - https://blog.couchbase.com/full-text-search-indexing-best-practices/
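To make the keyword-analyzer and field-scoping suggestions concrete, here is a rough sketch based on the index definition and query posted in this thread. It has not been tested against your cluster. The first fragment shows how the `fields` entry for AddressLine `_text` might look with the keyword analyzer and with `_all` skipped. Turning off `store` and `include_term_vectors` is my assumption and only makes sense if you don't need the value returned or highlighted in results.

```json
{
  "name": "_text",
  "type": "text",
  "analyzer": "keyword",
  "index": true,
  "store": false,
  "include_in_all": false,
  "include_term_vectors": false
}
```

With that mapping, a field-scoped version of the query could look like the one below. The field path is simply the chain of property names from the FF type mapping down to `_text`.

```json
{
  "query": {
    "match": "Zh Omarova 29 Apt 11 Almaty 050012 Kazakhstan",
    "field": "PIIPayload.ProfileInfo.Profile.Customer.Address.AddressLine._text"
  }
}
```

Note that the keyword analyzer indexes the whole field value as a single token, so this only matches when the query text equals the stored address exactly, including case and spacing. If you need partial or case-insensitive matching, keep an analyzer that tokenizes and use a field-scoped phrase query instead.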
Recommend a CBSE here for taking this further…
Thank you for your insights.
I can’t raise a CBSE as I don’t have a license…
The Alfresco Full Text Search (FTS) query text can be used standalone or it can be embedded in CMIS-SQL using the contains() predicate function. The CMIS specification supports a subset of FTS. The full power of FTS cannot be used while also maintaining portability between CMIS repositories.
FTS is exposed directly by the interface, which adds its own template, and is also used as its default field. When FTS is embedded in CMIS-SQL, only the CMIS-SQL-style property identifiers (cmis:name) and aliases, CMIS-SQL column aliases, and the special fields listed can be used to identify fields. The SQL query defines tables and table aliases after from and join clauses. If the SQL query references more than one table, the contains() function must specify a single table to use by its alias. All properties in the embedded FTS query are added to this table and all column aliases used in the FTS query must refer to the same table. For a single table, the table alias is not required as part of the contains() function.
When FTS is used standalone, fields can also be identified using prefix:local-name and {uri}local-name styles.
Query time boosts allow matches on certain parts of the query to influence the score more than others.
All query elements can be boosted: terms, phrases, exact terms, expanded terms, proximity (only in field groups), ranges, and groups.