Inconsistent data between queries

Hi,

I have an issue with my data. I have a collection whose documents contain a timestamp indicating the start of the document. The documents contain some other fields, and I have created an index over the start timestamp plus a boolean field and another timestamp.

When I now query the index for the oldest documents, i.e., a query of the following form:

SELECT META(my_collection).id,
       startDate
FROM `my_bucket`._default.my_collection AS my_collection
WHERE NOT attr1
    AND ts2 IS NOT NULL
ORDER BY startDate
LIMIT 10

the query returns sorted documents as expected.

However, when I now try to retrieve the oldest document from my collection via

SELECT *  FROM `my_bucket`._default.my_collection AS my_collection
where meta(my_collection).id = 'MyCollection:ID-FROM-QUERY'

the document is not found in the collection.

If I try to delete the documents from the collection, nothing changes. The index still returns the same documents.

The problem initially looked to me like an inconsistency in the index. Therefore, I deleted the index and created it again.

Any idea what I could do to fix this issue?

Indexes are maintained asynchronously and are eventually consistent.
The document might have been deleted by someone else.
Check which index is used and see if there are any pending mutations.
Also try with request_plus scan consistency.
If that still doesn't help, post the EXPLAIN output and the index definitions.
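
For example, you can prefix the query with EXPLAIN to see which index the planner picks:

EXPLAIN SELECT META(my_collection).id,
       startDate
FROM `my_bucket`._default.my_collection AS my_collection
WHERE NOT attr1
    AND ts2 IS NOT NULL
ORDER BY startDate
LIMIT 10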

Hi @vsr1, thanks for your reply!

I just ran the queries from my Python code and added QueryOptions(scan_consistency=QueryScanConsistency.REQUEST_PLUS). Still the same result.

The document might have been deleted by someone else.

Unlikely due to the structure of this project.

Check which index is used and see if there are any pending mutations.

The first query uses an index adv_ts2_attr1_startDate. The advisor also says that the existing indexes are sufficient. The index is defined as follows:

 CREATE INDEX adv_ts2_attr1_startDate ON `default`:`my_bucket`.`_default`.`my_collection`(`ts2`,`attr1`,`startDate`)

The explain of the query is as follows:

{
    "plan": {
        "#operator": "Sequence",
        "~children": [
            {
                "#operator": "Sequence",
                "~children": [
                    {
                        "#operator": "IndexScan3",
                        "as": "my_collection",
                        "bucket": "my_bucket",
                        "covers": [
                            "cover ((`my_collection`.`ts2`))",
                            "cover ((`my_collection`.`attr1`))",
                            "cover ((`my_collection`.`startDate`))",
                            "cover ((meta(`my_collection`).`id`))"
                        ],
                        "index": "adv_ts2_attr1_startDate",
                        "index_id": "ba3328a9b6a34801",
                        "keyspace": "my_collection",
                        "namespace": "default",
                        "scope": "_default",
                        "spans": [
                            {
                                "range": [
                                    {
                                        "inclusion": 0,
                                        "index_key": "`ts2`",
                                        "low": "null"
                                    },
                                    {
                                        "inclusion": 0,
                                        "index_key": "`attr1`",
                                        "low": "null"
                                    }
                                ]
                            }
                        ],
                        "using": "gsi"
                    },
                    {
                        "#operator": "Parallel",
                        "~child": {
                            "#operator": "Sequence",
                            "~children": [
                                {
                                    "#operator": "Filter",
                                    "condition": "((not cover ((`my_collection`.`attr1`))) and (cover ((`my_collection`.`ts2`)) is not null))"
                                },
                                {
                                    "#operator": "InitialProject",
                                    "result_terms": [
                                        {
                                            "expr": "cover ((meta(`my_collection`).`id`))"
                                        },
                                        {
                                            "expr": "cover ((`my_collection`.`startDate`))"
                                        },
                                        {
                                            "expr": "cover ((`my_collection`.`attr1`))"
                                        },
                                        {
                                            "expr": "cover ((`my_collection`.`ts2`))"
                                        }
                                    ]
                                }
                            ]
                        }
                    }
                ]
            },
            {
                "#operator": "Order",
                "limit": "100",
                "sort_terms": [
                    {
                        "expr": "cover ((`my_collection`.`startDate`))"
                    }
                ]
            },
            {
                "#operator": "Limit",
                "expr": "100"
            }
        ]
    }
}

The second query uses the PRIMARY INDEX, which sounds reasonable.

Thanks for your support. This is much appreciated.

Hmm. Force a non-covered query and see what you get.


SELECT META(my_collection).id,  
       startDate,
       xxx
FROM `my_bucket`._default.my_collection AS my_collection
WHERE NOT attr1
    AND ts2 IS NOT NULL
ORDER BY startDate
LIMIT 10

I ran the query in its non-covered form. The results were more recent than the ones I obtained when the index covered the query. I also verified that those results can be queried using the primary index.

This confirms that the index seems to contain documents that were already deleted. It just surprises me that those documents are still in the index after I deleted and recreated it.

Is there any way to “force” a delete of the index or “force” its creation entirely from scratch?

Thanks!

Drop the index and try recreating it. cc @amit.kulkarni
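
For example, a rough sketch using the definition you posted above (adjust the names if your real index differs):

DROP INDEX adv_ts2_attr1_startDate ON `default`:`my_bucket`.`_default`.`my_collection`

CREATE INDEX adv_ts2_attr1_startDate ON `default`:`my_bucket`.`_default`.`my_collection`(`ts2`,`attr1`,`startDate`)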

I deleted the index adv_ts2_attr1_startDate as well as the primary index, and created both again. Nothing changed. I tested this even before writing here.

Is the data completely static during these tests?

The most puzzling element from what you’ve observed is where the index is obtaining the keys. Unless of course the data is dynamic and changing - in which case we know that the index is updated asynchronously so could for a period still refer to a key that has been deleted. (It should be a very short period assuming the index was up-to-date before the activity.)

What happens if you move the data (or a sample thereof) to a separate test collection; is the same problem then apparent there too?

(INSERT INTO bkt.scp.newCollection (KEY k, VALUE v) SELECT meta().id k, v FROM bkt.scp.oldCollection v)

I also presume you’ve tried to restart the cluster after dropping the index to ensure nothing is lingering before recreation? (I realise this is a “turn it off and on again” suggestion, but it might help nonetheless. I’m not pinning any real hope on it though.)

HTH.

Are there any index replicas? If yes, did you try deleting all index replicas?
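
One quick way to see which index entries exist for the collection is system:indexes; replicas, if any, are also visible on the Indexes page in the UI. A sketch using the index name from above:

SELECT * FROM system:indexes WHERE name = 'adv_ts2_attr1_startDate'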

@dh

Is the data completely static during these tests?

Yes, this is static data, in particular the documents we are talking about. The documents are sometimes updated within a few days but are static afterwards. The documents that are still appearing in the index are months old and haven't been touched since then.

What happens if you move the data (or a sample thereof) to a separate test collection; is the same problem then apparent there too?

(INSERT INTO bkt.scp.newCollection (KEY k, VALUE v) SELECT meta().id k, v FROM bkt.scp.oldCollection v)

Yes, this resulted in an error as well. The query ended with:

“Request execution aborted as the number of errors raised has reached the maximum permitted.” (code: 1195)

The errors have the following form:

  {
    "code": 12009,
    "msg": "DML Error, possible causes include concurrent modification. Failed to perform INSERT on key MyCollection:ID - cause: Error in INSERT of key: MyCollection:ID",
    "reason": {
      "caller": "couchbase:1983",
      "cause": {
        "cause": {
          "caller": "memcached:432",
          "cause": {
            "attempts": "2",
            "cause": "TMPFAIL"
          },
          "code": 5502,
          "key": "datastore.couchbase.bucket.action",
          "message": "Unable to complete action after 2 attempts"
        },
        "key": "MyCollection:ID"
      },
      "code": 12036,
      "key": "datastore.couchbase.insert.error",
      "message": "Error in INSERT of key: MyCollection:ID"
    }
  }

I also presume you’ve tried to restart the cluster after dropping the index to ensure nothing is lingering before recreation? (I realise this is a “turn it off and on again” suggestion, but it might help nonetheless. I’m not pinning any real hope on it though.)

I have just tested the full procedure again: deleted the primary index as well as the GSI, then restarted Couchbase and created the indexes again. Still the same problem.

@hkishore

Are there any index replicas? If yes, did you try deleting all index replicas?

No, the index is not replicated.

The bucket might have run out of RAM quota (which would explain the TMPFAIL above). Try increasing the bucket RAM size.

I am still puzzled about what is going on.

Also check in the UI for any pending mutations.

Try this (rough query sketches below):

  1. Run the original (covered) query and note down the document key in question.
  2. Run the same (covered) query adding AND META().id = "document key" and see if you get a result.
  3. Run the same query as in step 2 but add xxx to the projection (making it non-covered) and see.
  4. Run SELECT * FROM my_bucket._default.my_collection AS s USE KEYS "document key" and see if the document is present.
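
Roughly like this for steps 2-4, using 'MyCollection:ID-FROM-QUERY' as a placeholder for the actual document key:

Step 2 (covered, restricted to the key in question):

SELECT META(my_collection).id,
       startDate
FROM `my_bucket`._default.my_collection AS my_collection
WHERE NOT attr1
    AND ts2 IS NOT NULL
    AND META(my_collection).id = 'MyCollection:ID-FROM-QUERY'
ORDER BY startDate
LIMIT 10

Step 3 (same, but non-covered because xxx is not in the index):

SELECT META(my_collection).id,
       startDate,
       xxx
FROM `my_bucket`._default.my_collection AS my_collection
WHERE NOT attr1
    AND ts2 IS NOT NULL
    AND META(my_collection).id = 'MyCollection:ID-FROM-QUERY'
ORDER BY startDate
LIMIT 10

Step 4 (direct key lookup, no index involved):

SELECT *
FROM `my_bucket`._default.my_collection AS s
USE KEYS 'MyCollection:ID-FROM-QUERY'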

Apart from the suggestions by @vsr1, can you also try the checks below?

  1. [Trivial check] In your second query, you have
    where meta(my_collection).id = 'MyCollection:ID-FROM-QUERY'. If you are using ID-FROM-QUERY exactly as it appears in the first query's result, can you change it to where meta(my_collection).id = 'ID-FROM-QUERY'? The right-hand side should be the exact ID.
  2. When we delete and recreate the index, it gets rebuilt using the mutations from storage. So if the index is still returning the ID, that document mutation has to have been sent by the storage.

  3. SELECT * FROM my_bucket._default.my_collection AS my_collection where meta(my_collection).id = 'MyCollection:ID-FROM-QUERY'
    If the above query uses the primary index, can you delete the primary index (including replicas) and rerun the query (see the example below)? This can pinpoint whether any issue lies within the primary index.
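
For example, assuming the primary index was created with the default name, something along these lines:

DROP PRIMARY INDEX ON `my_bucket`.`_default`.`my_collection`

A lookup via USE KEYS (as in step 4 above) keeps working even with the primary index dropped, since it does not go through any index.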

@vsr1

Also check in the UI for any pending mutations.

Where would I look in the UI for pending mutations? I checked various options and didn’t find any.

Run the original (covered) query and note down the document key in question.
Run the same (covered) query adding AND META().id = "document key" and see if you get a result.

This query found the result document.

Run the same query as in step 2 but add xxx to the projection (making it non-covered) and see.

Running this query returned an empty result set.

Run SELECT * FROM my_bucket._default.my_collection AS s USE KEYS "document key" and see if the document is present.

Running this query returned an empty result set as well.
FYI: I verified this query and the previous one with other keys to confirm that the queries themselves are correct.

@hkishore

[Trivial check] In your second query, you have where meta(my_collection).id = 'MyCollection:ID-FROM-QUERY'. If you are using ID-FROM-QUERY exactly as it appears in the first query's result, can you change it to where meta(my_collection).id = 'ID-FROM-QUERY'? The right-hand side should be the exact ID.

That was just an issue of my notation (as noted above, I verified queries with other document keys). Sorry for the confusion.

When we delete and recreate the index, it gets rebuilt using the mutations from storage. So if the index is still returning the ID, that document mutation has to have been sent by the storage.

I agree with the argument. Is there a way to verify this and to find the raw data on disk?

If the above query uses the primary index, can you delete the primary index (including replicas) and rerun the query? This can pinpoint whether any issue lies within the primary index.

The queries of the format SELECT * FROM my_bucket._default.my_collection AS my_collection where meta(my_collection).id = 'MyCollection:ID-FROM-QUERY' do not return a document.

In the UI, click Indexes on the left side.
Choose the bucket/scope/collection and you will see the list of indexes.
Click on the index to expand it, then click on Index Stats.

The info you provided tells me the index has the entry but the data node doesn't.
I really don't know what is going on; even after a drop and create, the index is not up to date.

Also check indexer.log/projector.log for any errors. See Manage Logging | Couchbase Docs.

@vsr1 Thanks for the hint! However, I didn’t see any remaining mutations in the Index Stats.

I also checked the log files. I saw some errors, but nothing seems to be related to the issue. E.g., I see some errors hinting at Prometheus (e.g., “Prometheus http request failed”). Looking at the descriptions, the errors are typically timeouts or “Service Unavailable”. However, the timestamps don't seem to correlate with my queries.

@0xef06b0 I think you should try the above suggestion (after increasing the bucket quota), as the earlier experiments seem to suggest that the KV store somehow still has the deleted document. That's likely how those entries end up in the index even after the index is dropped and recreated.

I agree with the argument. Is there a way to verify this and to find the raw data on disk?

  1. Can you try querying the document from the Documents tab in the UI, using the “Document ID” field? Does it return a value for the ID in question, and what about other IDs?

  2. Is there any Unicode/special character in the missing ID that is not in the other IDs? It seems like you are querying using the Python SDK; make sure any Unicode characters are handled correctly. Refer: Documents | Couchbase Docs

As you know which key is missing from the data service, you can use cbindex to inspect the index directly:
cbindex -auth Administrator:password -type=scanAll -bucket default -scope s1 -collection c1 -index ix1 | grep <document key>
Then check in the UI whether that document key is visible.

Hi all, after trying the other suggestions above, we decided to restart our entire Docker infrastructure - not only the Couchbase container. This fixed the issue: the documents no longer appear in the index.

I’m still puzzled about the issue, in particular because I re-created the indexes, restarted the Couchbase service, etc.

I don’t know enough details about the Dockerization of Couchbase but we are running it in a pretty basic setup with /opt/couchbase/var mounted to a directory of the user. I don’t know where index data is stored or similar.

In any case, thanks again to everybody who provided input to this issue!
