I'm using Couchbase 6.0.1.
I have 1 bucket containing 100 million documents.
I'm trying to create a search index with the filter type_ = 'Customer' (this should match around 20 million documents).
My total bucket size is around 150 GB,
and my search node storage is 200 GB.
When I check my search node's storage, it is out of space (more than 197 GB consumed).
My questions:
Why does it consume all of my storage?
If I filter by type, it should store only the filtered documents, right? But when I check, the total doc count is 100 million.
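For context, the definition I'm creating looks roughly like this (a sketch only: the index name, bucket name, host, and credentials are placeholders, submitted through the FTS REST API on port 8094):

```python
import requests

# Sketch of an FTS index definition that indexes only documents whose
# "type_" field equals "Customer", by disabling the default mapping and
# enabling a single type mapping.
index_definition = {
    "type": "fulltext-index",
    "name": "customer-index",      # placeholder index name
    "sourceType": "couchbase",
    "sourceName": "my-bucket",     # placeholder bucket name
    "params": {
        "doc_config": {
            "mode": "type_field",  # read the document type from a field
            "type_field": "type_",
        },
        "mapping": {
            # With the default mapping disabled, documents that match no
            # type mapping below should be skipped by the indexer.
            "default_mapping": {"enabled": False, "dynamic": True},
            "types": {
                "Customer": {"enabled": True, "dynamic": True},
            },
        },
    },
}

resp = requests.put(
    "http://search-node:8094/api/index/customer-index",  # placeholder host
    auth=("Administrator", "password"),                   # placeholder creds
    json=index_definition,
)
resp.raise_for_status()
```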
What is the index type you are using in the index definition? Is it "scorch"? If not, update it to "scorch"
(this would result in an index rebuild, but also a reduced index size).
Total doc count indicates the number of documents processed so far, not the actual number of items in the index. The stat label has been corrected in the latest releases.
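If you want to check this over the REST API, the engine is recorded under params.store.indexType in the index definition. A minimal sketch, assuming placeholder host, index name, and credentials (and, if I recall correctly, the response wraps the definition in an "indexDef" field):

```python
import requests

# Fetch the current index definition and inspect which storage engine it uses.
resp = requests.get(
    "http://search-node:8094/api/index/customer-index",  # placeholder host/name
    auth=("Administrator", "password"),                   # placeholder creds
)
resp.raise_for_status()
index_def = resp.json()["indexDef"]

store = index_def.get("params", {}).get("store", {})
print(store.get("indexType"))  # e.g. "upside_down" or "scorch"

# Switching to scorch means setting the store param and re-submitting the
# definition (note: this triggers a full index rebuild):
index_def.setdefault("params", {})["store"] = {"indexType": "scorch"}
```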
Yes, actually I've reloaded the data multiple times because that's part of the testing.
Do you mean that when I reload with the same doc IDs, it will consume more disk space?
I've tried to recreate the index with additional properties:
Yes, updates and deletes have an impact on the index size, as we use append-only storage. Reclaiming the obsolete data happens during the background compaction cycles, which run concurrently at a slower pace in the background. You may find more context at point 5 here: Full-Text Search Service Production Systems: 7 Useful Tips
The recommendations there would only restrict a storage property like the maximum segment size and won't result in any data corruption.
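As a rough sketch of what that restriction looks like in the index definition (the option names "scorchMergePlanOptions" and "maxSegmentSize", and the value used here, are assumptions on my part; please verify them against the blog post and your release):

```python
# Sketch: restricting the maximum scorch segment size through the index
# definition's store params, along the lines of tip 5 in the linked post.
# "scorchMergePlanOptions"/"maxSegmentSize" and the value are assumptions;
# verify against your Couchbase version before applying.
store_params = {
    "indexType": "scorch",
    "scorchMergePlanOptions": {
        "maxSegmentSize": 10000000,  # hypothetical cap on docs per merged segment
    },
}
```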