Full text search returns wrong total_hits when "from" is changed

I have 461061 documents in the index. When I send this request:

{
  "fields": [
    "*"
  ],
  "highlight": {},
  "sort": [
    "name"
  ],
  "size": 20,
  "from": 0,
  "query": {
    "must": {
      "conjuncts": [
        {
          "field": "class_type",
          "match": "REGION"
        }
      ]
    }
  }
}

it returns

],
"total_hits": 461061,
"max_score": 1.4851010726785485,
"took": 133725788,
"facets": null
}

But if I send the same request with a larger "from",

{
  "fields": [
    "*"
  ],
  "highlight": {},
  "sort": [
    "name"
  ],
  "size": 20,
  "from": 461000,
  "query": {
    "must": {
      "conjuncts": [
        {
          "field": "class_type",
          "match": "REGION"
        }
      ]
    }
  }
}

it returns

"hits":[],
"total_hits": 154063,
"max_score": 1.4825419134327702,
"took": 5241947075,
"facets": null

total_hits changed, even though this is exactly the same index.
Why does it return this?
Why does Couchbase respond by abruptly reducing the number of documents in the index?
Is my Couchbase going crazy? What should I do?

couchbase-server -v
Couchbase Server 6.0.0-1693 (CE)

I think the Couchbase full text search max_result_window is the problem.

This does not work for me.

@horoyoi_o,

I would like a few more details:

Can you please share the status part of your search response for the failing case as well?
e.g.: {"status":{"total":N,"failed":N1,"successful":N2}}

Does this failure occur consistently, and only for this high-offset paged query?

If the Size+From arguments in the SearchRequest exceeded the current bleveMaxResultWindow limit, you would have got an explicit error rather than an incorrect total_hits count like the one above.

How many FTS nodes does your cluster have?

Cheers!

I am experiencing the same issue:

Our Couchbase cluster is 6.0.2
Node SDK 2.6.9
Hosted in Serverless Amazon Linux 2

I am trying to use the “total hits” returned from an FTS search as the basis for paging through those results. For example:

The query returns 4940 total hits when I pass in offset 0 and limit 100.
As I increase the offset from 0 up to 2800, the total hits remain consistently what I would expect, 4940.

However, as I page beyond 2800, the total hit count becomes erratic and the number swings wildly. I ran the same query with offset 2900 three consecutive times, and each time I got a different “total hits”.



Any ideas?

JG

Hi @The_Cimmerian,

Not sure whether you have the same problem reported earlier in this thread.
In the former case, the issue occurred when the requested pages were beyond the bleveMaxResultWindow limit of 10K.
But in your case, the offset/limit looks well within the 10K range, so I don’t see any obvious reason for this to fail.
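
To make the comparison concrete, here is a minimal sketch of the window check as described in this thread. It assumes the 10K limit mentioned above and assumes a request is rejected when from + size is greater than the limit; the constant and function names are made up for illustration, and the real limit is a server setting that may differ on your cluster.

```python
# Assumption: the 10K bleveMaxResultWindow value mentioned in this thread.
# The actual value is configured on the server and may be different.
ASSUMED_MAX_RESULT_WINDOW = 10_000


def exceeds_result_window(from_: int, size: int,
                          max_window: int = ASSUMED_MAX_RESULT_WINDOW) -> bool:
    """Return True when from + size goes past the assumed result window."""
    return from_ + size > max_window


# The first request in this thread: from=461000, size=20.
print(exceeds_result_window(461000, 20))  # True
# The later report: offset 2900, limit 100.
print(exceeds_result_window(2900, 100))   # False
```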

A couple of questions:

  1. Does it work with direct curl calls to the server?
  2. Can you share the status part of your search response for the failing case?
    e.g.: {"status":{"total":N,"failed":N1,"successful":N2}}

This part helps in understanding whether all the index partitions were reachable.
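
If you page on total_hits in application code, a check along these lines may help. It is only a sketch: it assumes the response has already been parsed into a Python dict with the "status" shape shown above, the function name is hypothetical, and the sample numbers are invented for illustration.

```python
def total_hits_is_complete(response: dict) -> bool:
    """Return True only if every index partition answered the query.

    When some partitions fail, total_hits covers only the partitions
    that responded, so it should not be used for paging.
    """
    status = response.get("status")
    if not isinstance(status, dict):
        # No status section: we cannot tell, so treat it as incomplete.
        return False
    return (status.get("failed", 0) == 0
            and status.get("successful", 0) == status.get("total", 0))


# Illustrative responses (made-up numbers, not from a real cluster).
complete = {"status": {"total": 6, "failed": 0, "successful": 6}, "total_hits": 100}
partial = {"status": {"total": 6, "failed": 2, "successful": 4}, "total_hits": 63}

print(total_hits_is_complete(complete))  # True
print(total_hits_is_complete(partial))   # False
```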

Please go ahead and create a Support ticket to handle this further as it might need more data collection.

thanks,

Thank you for your assistance. Below is the redacted response I receive from the same FTS Query via the REST API:
I should also point out that there are 3 nodes in our cluster.

{
  "status": {
    "total": 12,
    "failed": 4,
    "successful": 8,
    "errors": {
      "screened_name_request_test_3f33ca20613e41ce_13aa53f3": "remote: query got status code: 429, queryURL: https://ec2-35-174-184-155.compute-1.amazonaws.com:18094/api/index/screened_name_request_test/query, buf: {\"ctl\":{\"timeout\":0,\"consistency\":null},\"pindexNames\":[\"screened_name_request_test_3f33ca20613e41ce_13aa53f3\",\"screened_name_request_test_3f33ca20613e41ce_aa574717\"],\"query\":{\"must\":{\"conjuncts\":[{\"start\":\"2020-06-20T00:00:00-04:00\",\"end\":\"2020-06-24T23:59:59-04:00\",\"inclusive_start\":true,\"inclusive_end\":true,\"field\":\"completed\"},{\"regexp\":\"aircomm::dp::screened::(name|address)::request::[0-9]{4,10}\",\"field\":\"_id\"}]}},\"size\":2910,\"from\":0,\"highlight\":null,\"fields\":[\"*\"],\"facets\":null,\"explain\":false,\"sort\":[\"-_score\"],\"includeLocations\":false}, resp: &http.Response{Status:\"429 Too Many Requests\", StatusCode:429, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Date\":[]string{\"Thu, 25 Jun 2020 11:28:21 GMT\"}, \"Content-Length\":[]string{\"674\"}, \"Content-Type\":[]string{\"application/json\"}, \"X-Content-Type-Options\":[]string{\"nosniff\"}}, Body:(*http.bodyEOFSignal)(0xc4269e3d40), ContentLength:674, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc421508c00), TLS:(*tls.ConnectionState)(0xc4274018c0)}, err: <nil>",
      "screened_name_request_test_3f33ca20613e41ce_18572d87": "remote: query got status code: 429, queryURL: https://ec2-18-207-241-164.compute-1.amazonaws.com:18094/api/index/screened_name_request_test/query, buf: {\"ctl\":{\"timeout\":0,\"consistency\":null},\"pindexNames\":[\"screened_name_request_test_3f33ca20613e41ce_18572d87\",\"screened_name_request_test_3f33ca20613e41ce_6ddbfb54\"],\"query\":{\"must\":{\"conjuncts\":[{\"start\":\"2020-06-20T00:00:00-04:00\",\"end\":\"2020-06-24T23:59:59-04:00\",\"inclusive_start\":true,\"inclusive_end\":true,\"field\":\"completed\"},{\"regexp\":\"aircomm::dp::screened::(name|address)::request::[0-9]{4,10}\",\"field\":\"_id\"}]}},\"size\":2910,\"from\":0,\"highlight\":null,\"fields\":[\"*\"],\"facets\":null,\"explain\":false,\"sort\":[\"-_score\"],\"includeLocations\":false}, resp: &http.Response{Status:\"429 Too Many Requests\", StatusCode:429, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Content-Type\":[]string{\"application/json\"}, \"X-Content-Type-Options\":[]string{\"nosniff\"}, \"Date\":[]string{\"Thu, 25 Jun 2020 11:28:21 GMT\"}, \"Content-Length\":[]string{\"674\"}}, Body:(*http.bodyEOFSignal)(0xc4269e3c80), ContentLength:674, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc4204b4700), TLS:(*tls.ConnectionState)(0xc42bb3c630)}, err: <nil>",
      "screened_name_request_test_3f33ca20613e41ce_6ddbfb54": "remote: query got status code: 429, queryURL: https://ec2-18-207-241-164.compute-1.amazonaws.com:18094/api/index/screened_name_request_test/query, buf: {\"ctl\":{\"timeout\":0,\"consistency\":null},\"pindexNames\":[\"screened_name_request_test_3f33ca20613e41ce_18572d87\",\"screened_name_request_test_3f33ca20613e41ce_6ddbfb54\"],\"query\":{\"must\":{\"conjuncts\":[{\"start\":\"2020-06-20T00:00:00-04:00\",\"end\":\"2020-06-24T23:59:59-04:00\",\"inclusive_start\":true,\"inclusive_end\":true,\"field\":\"completed\"},{\"regexp\":\"aircomm::dp::screened::(name|address)::request::[0-9]{4,10}\",\"field\":\"_id\"}]}},\"size\":2910,\"from\":0,\"highlight\":null,\"fields\":[\"*\"],\"facets\":null,\"explain\":false,\"sort\":[\"-_score\"],\"includeLocations\":false}, resp: &http.Response{Status:\"429 Too Many Requests\", StatusCode:429, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Content-Type\":[]string{\"application/json\"}, \"X-Content-Type-Options\":[]string{\"nosniff\"}, \"Date\":[]string{\"Thu, 25 Jun 2020 11:28:21 GMT\"}, \"Content-Length\":[]string{\"674\"}}, Body:(*http.bodyEOFSignal)(0xc4269e3c80), ContentLength:674, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc4204b4700), TLS:(*tls.ConnectionState)(0xc42bb3c630)}, err: <nil>",
      "screened_name_request_test_3f33ca20613e41ce_aa574717": "remote: query got status code: 429, queryURL: https://ec2-35-174-184-155.compute-1.amazonaws.com:18094/api/index/screened_name_request_test/query, buf: {\"ctl\":{\"timeout\":0,\"consistency\":null},\"pindexNames\":[\"screened_name_request_test_3f33ca20613e41ce_13aa53f3\",\"screened_name_request_test_3f33ca20613e41ce_aa574717\"],\"query\":{\"must\":{\"conjuncts\":[{\"start\":\"2020-06-20T00:00:00-04:00\",\"end\":\"2020-06-24T23:59:59-04:00\",\"inclusive_start\":true,\"inclusive_end\":true,\"field\":\"completed\"},{\"regexp\":\"aircomm::dp::screened::(name|address)::request::[0-9]{4,10}\",\"field\":\"_id\"}]}},\"size\":2910,\"from\":0,\"highlight\":null,\"fields\":[\"*\"],\"facets\":null,\"explain\":false,\"sort\":[\"-_score\"],\"includeLocations\":false}, resp: &http.Response{Status:\"429 Too Many Requests\", StatusCode:429, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Date\":[]string{\"Thu, 25 Jun 2020 11:28:21 GMT\"}, \"Content-Length\":[]string{\"674\"}, \"Content-Type\":[]string{\"application/json\"}, \"X-Content-Type-Options\":[]string{\"nosniff\"}}, Body:(*http.bodyEOFSignal)(0xc4269e3d40), ContentLength:674, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc421508c00), TLS:(*tls.ConnectionState)(0xc4274018c0)}, err: <nil>"
    }
  },
  "request": {
    "query": {
      "must": {
        "conjuncts": [{
            "start": "2020-06-20T00:00:00-04:00",
            "end": "2020-06-24T23:59:59-04:00",
            "inclusive_start": true,
            "inclusive_end": true,
            "field": "completed"
          },
          {
            "regexp": "xxxx::dp::screened::(name|address)::request::[0-9]{4,10}",
            "field": "_id"
          }
        ]
      }
    },
    "size": 10,
    "from": 2900,
    "highlight": null,
    "fields": [
      "*"
    ],
    "facets": null,
    "explain": false,
    "sort": [
      "-_score"
    ],
    "includeLocations": false
  },
  "hits": [],
  "total_hits": 1655,
  "max_score": 0.08077975277185859,
  "took": 29724104,
  "facets": null
}

I find support has been much slower to respond than this forum, FWIW. It takes two days just for the emails back and forth, explaining everything twice: once in the opening ticket description and then again when they ask for more information.

Jay

Hi @The_Cimmerian,

As I suspected, there are 12 index partitions in total, and the queries to 4 of them failed. Hence you see the inconsistencies in the final result.

Note the HTTP status code of 429 here: it indicates that the remote nodes hosting these 4 partitions rejected those queries due to lack of memory for executing them. I.e., if a node went ahead and executed those queries, its memory usage would spike beyond the configured FTS memory quota, so it rejects them.
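
To spot this case programmatically, one option is to pull the status code out of each partition's error string. This is a rough sketch, assuming the error messages keep the "status code: NNN" wording seen in the response above and that "errors" maps partition names to strings; the function name and the shortened sample messages are made up for illustration.

```python
import re
from typing import List

# Matches the "status code: 429" fragment seen in the error strings above.
STATUS_CODE_RE = re.compile(r"status code: (\d+)")


def rejected_partitions(response: dict, code: int = 429) -> List[str]:
    """Return the names of partitions whose error reports the given HTTP code."""
    errors = response.get("status", {}).get("errors") or {}
    rejected = []
    for pindex, message in errors.items():
        match = STATUS_CODE_RE.search(str(message))
        if match and int(match.group(1)) == code:
            rejected.append(pindex)
    return sorted(rejected)


# Shortened, hypothetical error strings in the same style as the real ones.
sample = {
    "status": {
        "total": 3, "failed": 2, "successful": 1,
        "errors": {
            "pindex_a": "remote: query got status code: 429, queryURL: (omitted)",
            "pindex_b": "remote: query got status code: 500, queryURL: (omitted)",
        },
    }
}

print(rejected_partitions(sample))  # ['pindex_a']
```

If this list is non-empty, the query was rejected for memory reasons on at least one node, and retrying later or raising the quota is more useful than trusting the partial total_hits.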

For a brief explanation of 429 errors, see the FTS FAQs: Troubleshooting and FAQs | Couchbase Docs

Better sizing of the cluster is the key here.
If the cluster nodes have enough RAM available, you can raise the FTS RAM quota to help with these errors. Always remember to limit the FTS quota to a maximum of 70% of the available RAM on the node.
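
As a quick worked example of the 70% guideline, here is a small sketch. The 16 GB node size is hypothetical and the helper name is made up; plug in the RAM actually available on your nodes.

```python
def max_fts_quota_mb(available_ram_mb: int, percent: int = 70) -> int:
    """Upper bound for the FTS quota as a percentage of available RAM, in MB."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return available_ram_mb * percent // 100


# Hypothetical node with 16 GB (16384 MB) of available RAM.
print(max_fts_quota_mb(16384))  # 11468
```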

Cheers!
Sreekanth