I am using Couchbase Server 6.6.0 and am trying to set the TTL using the Java SDK, as shown below, so that documents expire immediately. I need to do this in bulk, as my operations will be in the millions. My intention is to expire the documents immediately and then let Couchbase Server delete them. After the code below runs, I still see the documents in the web console via a N1QL query, but when I run my code again to look for those records, it can't find them. So something did happen, but I'm not sure what state the documents are in. I also noticed that the expiration meta tag is still set to 0 in the web console (though I think this is because I did not set up an index on the expiry field).
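Roughly, the code looks like this (a simplified sketch of what I'm running - Java SDK 2.x with RxJava batching; the cluster address, credentials, bucket name, and the source of the IDs are all placeholders):

import java.util.Collections;
import java.util.List;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;

import rx.Observable;

public class BulkExpire {
    public static void main(String[] args) {
        CouchbaseCluster cluster = CouchbaseCluster.create("localhost");
        cluster.authenticate("user", "password"); // placeholder credentials
        Bucket bucket = cluster.openBucket("mybucket");

        List<String> documentIds = nextBatch(); // e.g. 10,000 IDs per batch

        // Touch each document with a 1-second TTL so it expires almost
        // immediately; the async API fans the requests out in parallel.
        Observable
            .from(documentIds)
            .flatMap(id -> bucket.async().touch(id, 1))
            .toList()
            .toBlocking()
            .single();

        cluster.disconnect();
    }

    private static List<String> nextBatch() {
        return Collections.emptyList(); // placeholder for the real ID source
    }
}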
I tried to retrieve one of the documents in the web console, and I can still see its data. Why is this happening, and what do I need to do to ensure the documents are completely deleted? I tried turning auto-compaction on with a 1-hour interval in the web console, but that didn't purge them.
I tried a "DELETE ... USE KEYS" approach; however, I found that it wasn't actually deleting all the documents, even though it was accepting the commands. I had to run it multiple times, and things were slowly getting deleted, so I didn't find it a reliable way to delete at the volume I was working with. I didn't try remove(), but I believe DELETE ... USE KEYS is similar under the hood? I set the TTL to 1; 0 means no expiration, from my understanding.
Yes, I tried your recommended approach (setting the TTL to 1 second), but the background cleanup process doesn't seem to be working, as I can still query these documents in Couchbase. I would prefer this approach if possible. Any ideas on how I can debug this further or ensure the cleanup process is working as expected?
You might have "tombstones" - either be patient or adjust your bucket settings.
Expiry Pager: Scans for items that have expired, and erases them from memory and disk; after which, a tombstone remains for a default period of 3 days. The expiry pager runs every 60 minutes by default: for information on changing the interval, see cbepctl set flush_param. For more information on item-deletion and tombstones, see Expiration.
You can tune this way down under Buckets → Edit → Advanced bucket settings.
Then check "Override the default auto-compaction settings?" and adjust the "Metadata Purge Interval".
Yes, DELETE with USE KEYS and bucket.remove() issue the same underlying request to the Key-Value server.
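For illustration, a direct Key-Value bulk removal might look something like this (a sketch against the 2.x async API; bucket and documentIds are assumed from the earlier code):

// Remove a batch of documents directly via the Key-Value service.
// remove() errors for IDs that no longer exist, so already-deleted
// documents are skipped here rather than failing the whole batch.
Observable
    .from(documentIds)
    .flatMap(id -> bucket.async().remove(id)
        .onErrorResumeNext(Observable.empty()))
    .toList()
    .toBlocking()
    .single();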
Can I assume that your queries were using N1QL (as opposed to the Key-Value API, e.g. bucket.get(), lookupIn() or exists())? If so, they will have been hitting an index, and my guess is that this index had not yet been told by the Key-Value server about the removed documents. This eventual consistency is a great thing, as it means you can decide at read time what level of consistency you require - e.g. you can request that the N1QL read be consistent with any mutations made by the time of the query. Please take a look at the scan consistency docs for details: https://docs.couchbase.com/java-sdk/2.7/scan-consistency-examples.html
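In SDK 2.x that looks something like this (a sketch; the statement is illustrative and bucket is assumed from your earlier code):

import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.consistency.ScanConsistency;

// REQUEST_PLUS makes the query wait until the index has caught up with
// all mutations made before the query was issued.
N1qlParams params = N1qlParams.build()
    .consistency(ScanConsistency.REQUEST_PLUS);
N1qlQueryResult result = bucket.query(
    N1qlQuery.simple("SELECT META().id FROM mybucket LIMIT 10", params));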
As for why you are seeing the current behaviour - e.g. setting a TTL and seeing Key-Value lookups reflect the expiry, but the documents still existing when you do a N1QL query - it comes down to the same eventual consistency. The Key-Value service has not yet told the N1QL index about the change, but it will do so when the expiry pager runs or compaction is run. Please see these docs for more details: https://docs.couchbase.com/server/current/learn/buckets-memory-and-storage/expiration.html#post-expiration-purging (specifically the Post-Expiration Purging section).
I tried setting auto compaction on for 1 hour in the web console, but that didn’t purge it.
The expiries should be processed on compaction - can I double-check that you were running your query at least an hour after doing the queryBucket.touch(), i.e. definitely after compaction had run?
Still, the TTL discussion is a bit of a red herring, as IMO queryBucket.remove() is the way to go, together with setting the scanConsistency if you're doing subsequent queries.
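i.e. something along these lines (a sketch; queryBucket, the IDs, and the check query are assumptions on my part):

// Remove the documents via the Key-Value API...
for (String id : documentIds) {
    queryBucket.remove(id); // throws if the document no longer exists
}

// ...then query with REQUEST_PLUS so the index reflects the removals
// before the query is answered.
N1qlQueryResult check = queryBucket.query(N1qlQuery.simple(
    "SELECT COUNT(*) AS cnt FROM mybucket",
    N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS)));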
You might be doing a covered query. Unless the Expiry Pager has run and removed the documents, REQUEST_PLUS will not help: no mutation is recorded, so the indexer is never updated.
If the expiration time has passed but the Expiry Pager has not yet run, directly retrieving the document via the SDK or N1QL causes it to be marked as deleted at that point.
If you use a non-covered query, you will not get expired documents back.
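For example (the index and field names here are hypothetical):

// Hypothetical index: CREATE INDEX idx_email ON mybucket(email)

// Covered query: everything it needs is in the index, so documents are
// never fetched, and expired-but-unpurged items can still appear.
bucket.query(N1qlQuery.simple(
    "SELECT email FROM mybucket WHERE email IS NOT MISSING"));

// Non-covered query: 'name' is not in the index, so each document is
// fetched from the Key-Value service, which filters out expired items.
bucket.query(N1qlQuery.simple(
    "SELECT email, name FROM mybucket WHERE email IS NOT MISSING"));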
How do I know what my Expiry Pager settings are? Is there a way to retrieve that information easily? My subsequent query has an index on it. BTW, I am no longer "expiring" the documents; I am deleting them, as per the suggestions above. Does the same mechanism apply, i.e. do I need to wait until the Expiry Pager runs? I thought I didn't need to if I am using REQUEST_PLUS and doing a delete.
Yes, that's what I tried; I pasted my Java code above, which runs the query after I do the delete (the query uses an existing index). I am batching my deletes 10,000 at a time:
DELETE FROM mybucket USE KEYS $documentIds
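which I execute from the SDK roughly like this (a sketch of the named-parameter form in SDK 2.x; bucket and documentIds are as above):

import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.query.N1qlQuery;

// One batch of up to 10,000 IDs, bound as the $documentIds parameter.
JsonObject namedParams = JsonObject.create()
    .put("documentIds", JsonArray.from(documentIds));
bucket.query(N1qlQuery.parameterized(
    "DELETE FROM mybucket USE KEYS $documentIds", namedParams));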
If my query is using an existing index and I am using REQUEST_PLUS consistency, I should never see the document, right? Because I do, and this is where I am currently stuck.
Before I go for Eventing: for option #2, how would you suggest I do the delete - which SDK call? I thought I was already doing that under the hood with the DELETE ... USE KEYS option, as described above.