How can I combine object and object?

ezpz · June 13, 2024, 2:54pm

I saw the docs page "Example: Running a Simple Vector Similarity Query", which says: "The Search Service combines the Vector search results from a knn object with the traditional query object by using an OR function. If the same documents match the knn and query objects, the Search Service ranks those documents higher in search results."
So the docs say the knn object and the query object are combined with OR. Can I change that to AND, i.e. 'query' object AND 'knn' object?

jon.strabala · June 20, 2024, 8:09pm

Currently you cannot do this in the basic JSON syntax. However, you can write a hybrid SQL++ vector search like the following:

SELECT color, verbs, brightness
FROM `vector-sample`.color.rgb AS t1
WHERE 
    brightness < 20
AND SEARCH(t1, {
  "query": {  "match_none": {} },
  "knn": [{
    "field": "colorvect_l2",
    "vector": [0.0, 0.0, 128.0],
    "k": 3 }]}
)

Alternatively, you could create a 1D vector enum_vect to represent a category (or a department or an org_id) and do something like:

SELECT id
FROM `vector-sample`.color.rgb AS t1
WHERE 
  brightness < 20 
AND 
  enum_vect[0] = 27
AND 
SEARCH(t1, {
  "query": {  "match_none": {} },
  "knn_operator": "and",
  "knn": [
    {   "field": "colorvect_l2",
        "vector": [0.0, 0.0, 127.7],
        "k": 3 
    },{ "field": "enum_vect",
        "vector": [27],
        "k": 30 
    }
  ]
})

Note that for the 1D enum_vect we use a high k to get more matches, and we set ( "knn_operator": "and" ) to AND our two vectors together.

Next, we add the SQL++ predicate ( AND enum_vect[0] = 27 ) so that only category 27 is returned. Without it, a "near" category can leak into the results, because the vector side of enum_vect is still approximate.
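To make the leakage concrete, here is a minimal Python sketch that imitates the two kNN clauses with a brute-force top-k and then applies the exact category check. The documents, the color_dist values, and the top_k helper are all made up for illustration; this is not how the Search Service works internally, only the set logic of "AND the two vector clauses, then filter exactly".

```python
def top_k(docs, distance, k):
    """Brute-force stand-in for one kNN clause: ids of the k closest docs."""
    ranked = sorted(docs, key=distance)
    return {d["id"] for d in ranked[:k]}

# Hypothetical documents: a category code plus a precomputed color distance.
docs = [
    {"id": "a", "category": 27, "color_dist": 0.10},
    {"id": "b", "category": 27, "color_dist": 0.50},
    {"id": "c", "category": 26, "color_dist": 0.20},
    {"id": "d", "category": 28, "color_dist": 0.30},
    {"id": "e", "category": 10, "color_dist": 0.05},
]

# Clause 1: the 3 nearest colors. Clause 2: the 4 nearest categories to 27.
color_hits = top_k(docs, lambda d: d["color_dist"], k=3)
category_hits = top_k(docs, lambda d: abs(d["category"] - 27), k=4)

# "knn_operator": "and" keeps only docs matched by both clauses.
both = color_hits & category_hits

# The exact predicate (like enum_vect[0] = 27) removes the near-miss category.
exact = {d["id"] for d in docs if d["id"] in both and d["category"] == 27}

print(sorted(both))   # includes "c", which is category 26
print(sorted(exact))  # only category 27 remains
```

With a high k on the category clause, neighbours such as 26 and 28 are pulled in on purpose; the exact predicate is what throws them out again.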

Furthermore, remember that every time you use "knn" you are doing an approximate search. Depending on the completion order of the scatter-gather operations, especially when the vector search has to sort identical values, you might get different results and a different number of items back. So the intersection between the vectors might differ between runs of the same query. Yes, I know this is a bit weird.
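A small Python sketch of why ties can change the outcome between runs. The hit lists, scores, and the top_k_ids helper are hypothetical; the only point is that when several hits share a score, which ones survive the top-k cut depends on the order they arrived in, and that in turn changes the intersection with another clause.

```python
def top_k_ids(hits, k):
    """Keep the k best-scoring hits; tied scores keep their arrival order."""
    # sorted() is stable, so ties are resolved by position in the input list.
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Same hits and scores, but the partitions answered in a different order.
run_1 = [("p1", 0.9), ("p2", 0.5), ("p3", 0.5), ("p4", 0.5)]
run_2 = [("p4", 0.5), ("p1", 0.9), ("p3", 0.5), ("p2", 0.5)]

first = top_k_ids(run_1, 2)
second = top_k_ids(run_2, 2)

# Hypothetical result set of a second kNN clause that gets ANDed in.
other_clause = {"p2", "p9"}

print(first, set(first) & other_clause)
print(second, set(second) & other_clause)
```

Both runs see identical data, yet the ANDed result has one item in the first run and none in the second.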

jon.strabala · September 9, 2024, 11:12pm

This sort of pre-filter hack works best when all vectors (yes, both colorvect_l2 and enum_vect) are indexed with dot_product; otherwise you can get some very large scores that do not sort.

SELECT id
FROM `vector-sample`.color.rgb AS t1
WHERE 
  t1.brightness < 20 
AND 
  t1.enum_vect[0] = 10
AND
SEARCH(t1, {
  "query": {  "match_none": {} },
  "knn_operator": "and",
  "knn": [
    {   "field": "colorvect_l2",
        "vector": [0.0, 0.0, 127.0001],
        "k": 3 
    },{ "field": "enum_vect",
        "vector": [10.0001],
        "k": 300
    }
  ]
})

In all cases we add a small number to the query vector to avoid a perfect vector match, which makes the scores so large they don't sort. But we don't do this in the SQL++ part, where the comparison stays exact ( enum_vect[0] = 10 ).
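As a rough illustration of why a perfect match is awkward, here is a Python sketch that assumes a distance-based score equal to the inverse of the squared L2 distance. That scoring formula and the l2_score helper are assumptions made for this example, not something confirmed in this thread; the point is only that a zero distance has no finite inverse, while a tiny offset keeps the score finite.

```python
import math

def l2_score(query, stored):
    """Assumed score: inverse of squared Euclidean distance (illustrative only)."""
    dist_sq = sum((q - s) ** 2 for q, s in zip(query, stored))
    # A perfect match has distance 0, so the inverse is unbounded.
    return math.inf if dist_sq == 0 else 1.0 / dist_sq

stored = [0.0, 0.0, 127.0]

exact = l2_score([0.0, 0.0, 127.0], stored)      # perfect match
nudged = l2_score([0.0, 0.0, 127.0001], stored)  # small offset, as in the query above

print(exact)
print(nudged)
```

Under this assumption the nudged query still scores very high, but it stays a finite number that can be compared with other hits.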

