N1QL query with many SEARCH(FTS) - concat results

Hi! I’m trying to figure out how to do the N1QL query that would use multiple SEARCH predicates. Currently it’s returning an empty set of data, while there are some results in both queries (when I run them separately). Any idea how to effectively join it? I would also like to use scoring so it needs to be sorted DESC by score.

 SELECT 
   SEARCH_META(s1) AS meta1, 
   SEARCH_META(s2) AS meta2
 FROM `bucket` i 
WHERE SEARCH(i, {"query": "sap* ras* irnq0oamyl*"}, {"index": "ind1", "out": "s1"}) 
AND SEARCH(i, {"query": "krnlpzfmyl*"}, {"index": "ind2", "out": "s2"})

If I run those SEARCH queries separately then I get results:

[
    {
        "meta1": {
            "id": "796aee75-fc25-4723-bcd9-fa21be52df5c",
            "score": 0.17407765752383322
        }
    },
    {
        "meta1": {
            "id": "b7abd3cc-a049-4c42-a0f4-0d22e33b475c",
            "score": 0.0647523879789343
        }
    },
    {
        "meta1": {
            "id": "bb55c912-045d-467e-9aaa-6457481179bb",
            "score": 0.052859253129068286
        }
    },
    {
        "meta1": {
            "id": "ff25fee1-cb48-4887-aa14-eb13b7289b69",
            "score": 0.230408143187017
        }
    }
    ]

And

 	[
    {
        "meta2": {
            "id": "b7abd3cc-a049-4c42-a0f4-0d22e33b475c",
            "score": 0.14433756470680237
        }
    },
    {
        "meta2": {
            "id": "bb55c912-045d-467e-9aaa-6457481179bb",
            "score": 0.3456122335610931
        }
    }
    ]

Using Couchbase Server v6.6

The query doing AND that means per same document both conditions must be true.

NOTE: Each SEARCH() does separately and do Intersection on the document key. SERCH_SCORE(), SEARCH_META() mostly available when single index is used. In case of Intersect those might not available. In that case it returns MISSING.

Project META(i).id you should see results

You can use single index(search function) and do conjunction/disjunction in side search. cc @abhinav

Yes I’d recommend using a single FTS index as opposed to the two you have: ind1, ind2.
If you can share both your index definitions, I’d be able to recommend a unified definition and the SEARCH(…) query to best suit your scoring needs.

Thanks for an instant answer!
So my indexes are set on the same bucket. Nothing fancy, each index on a different field. That’s it. See screenshots below.

In addition could you also add output of this in your request want to see what do u mean empty set of data. Is some fields missing or documents missing.

SELECT 
   MTEA(i).id,
   SEARCH_META(s1) AS meta1, 
   SEARCH_META(s2) AS meta2
 FROM `bucket` i 
WHERE SEARCH(i, {"query": "sap* ras* irnq0oamyl*"}, {"index": "ind1", "out": "s1"}) 
AND SEARCH(i, {"query": "krnlpzfmyl*"}, {"index": "ind2", "out": "s2"});

Documents are missing. Empty array.

{
  "results": []
}

Ok, I’d recommend creating a single index, with child fields field1 and field2 within the default mapping like you have. lets call this index “ind”. Also since you’re essentially doing wildcard searches - I’d recommend using the “keyword” analyzer for both fields.

Here’s your new query now …

 SELECT SEARCH_META() ,
 FROM `bucket` i 
WHERE SEARCH(i, {
	"query": {
		"conjuncts": [{
			"disjuncts": [{
				"field": "field1",
				"wildcard": "sap*"
			}, {
				"field": "field1",
				"wildcard": "ras*"
			}, {
				"field": "field1",
				"wildcard": "irnq0oamyl*"
			}]
		}, {
			"field": "field2",
			"wildcard": "krnlpzfmyl*"
		}]
	}
});
1 Like

Hum. It should have the following two documents .

        "id": "b7abd3cc-a049-4c42-a0f4-0d22e33b475c",
        "id": "bb55c912-045d-467e-9aaa-6457481179bb",

Try the following see if you get expected documents.

SELECT META().id, *
 FROM `bucket` i 
WHERE SEARCH(i, {"query": "sap* ras* irnq0oamyl*"}, {"index": "ind1", "out": "s1"}) ;

SELECT META().id, *
 FROM `bucket` i 
WHERE  SEARCH(i, {"query": "krnlpzfmyl*"}, {"index": "ind2", "out": "s2"});

Alright, so it’s not that simple comparing to a standard FTS query. But anyway I got the point now!
By the way:

  1. why would you recommend keyword analyzer as a default one?
  2. can I add “fuzziness”: “2” to this query or should I use direct match instead of wildcards?
  3. how about the performance? is there any difference comparing to a pure FTS query?

A wildcard query is a non-analytic query. Meaning it does not apply any analysis to the search terms. Using a keyword analyzer to index your data will essentially preserve the content as is so the search terms from non-analytic queries can be found.

“fuzziness”: 2 can be added to match queries not “wildcards”. The performance of the query will directly depend on the number of tokens your search criteria will match. So I’d advise you to design your query as narrow and precise as possible for better performance.

FTS Queries via N1QL will go through an additional hop of N1QL getting results from FTS and then delivering them to you, so you may notice (very slightly) higher latencies as opposed to directly running FTS queries. The advantages of using N1QL for your FTS queries are multi-fold - you will be able to combine results from multiple indexes (FTS or GSI) for one.

1 Like

@maciej.tomaszewski,

{“query”: “sap* ras* irnq0oamyl*”} is not a valid query for the SEARCH function. It can be either of these …

SELECT 
   SEARCH_META(s1) AS meta1, 
   SEARCH_META(s2) AS meta2
 FROM `bucket` i 
WHERE SEARCH(i, "sap* ras* irnq0oamyl*", {"index": "ind1", "out": "s1"}) 
AND SEARCH(i,  "krnlpzfmyl*", {"index": "ind2", "out": "s2"});

OR

SELECT 
   SEARCH_META(s1) AS meta1, 
   SEARCH_META(s2) AS meta2
 FROM `bucket` i 
WHERE SEARCH(i, {"query":{"query":"sap* ras* irnq0oamyl*"}}, {"index": "ind1", "out": "s1"}) 
AND SEARCH(i,  {"query":{"query":"krnlpzfmyl*"}}, {"index": "ind2", "out": "s2"});
2 Likes

Oh… funny think but you are right! It worked when I dropped "query" part for the 2nd argument. Now it works as expected but I think I’m gonna switch to what @abhinav proposed. It looks like a more advanced solution.

1 Like

Agree it. I was more looking why it gave empty result.

You got my respect for that! Really good job in debugging my problem :smiley:
Based on what @abhinav wrote ...N1QL for your FTS queries are multi-fold - you will be able to combine results from multiple indexes (FTS or GSI) for one. I guess there is also another adventage - using WHERE conditions which are not available in FTS itself. Like narrowing down the number of indexed documents to only those with let’s say “company_id” = 1 and “country” = “us” (while all of them exists in the same index). Am I right or is there another way to achieve that?

My advice is if you can achieve directly FTS index (searching from the SDKs, REST API), go with it.

N1QL it has other advantages , If you want Join, aggregation etc… i.e Instead of GSI leverage FTS index do complete SQL.

The following articles gives details when to use N1QL GSI, N1QL FTS