Hi experts, I have 2 nodes, each with a hard disk dedicated to the index. The index disk on my second server is getting full. After restarting the CB server the problem is solved, but only for two or three days, and then I have to restart it again. This is happening only on the second server.
The configuration (not in production):
2 x server (1 disk for data / 1 disk for index)
OS: Debian
CB: 4.5.0-2601
Can you provide details on how many indexes there are on the second server? Also the data size, disk size, and fragmentation of each index (available in the UI), as well as the storage mode (Standard or Memory Optimized) and the compaction mode (under Settings in the UI).
Also, can you check which files are taking up the most space when you get close to 100%?
Right now the index disk on server-2 is full, and today I'm seeing that the index disk on server-1 is reaching 96%; I think it has the same symptom.
If I add up the index files, they never reach the total capacity of the disk (230 GB). I tried to find hidden files or any other explanation but couldn't find anything, and I executed manual compaction more than once and nothing happened. The only workaround I've found is to restart the CB server, but that doesn't solve the underlying issue.
As the fragmentation is 0% for all the indexes, running compaction is not going to make a difference.
The indexer doesn't use any hidden/temp files during execution. While compaction is running, there can be extra disk usage, as data is copied to a new file while the old file is still around. So one possibility is that you see high disk usage while compaction is in progress, and restarting the server aborts the compaction and cleans up the extra copy of the data.
But your screenshots show fragmentation at 0%, which doesn't suggest compaction would even be triggered.
In general, "du" should be able to tell you what is taking up space on the disk. Unless you can figure out which files are taking up the extra space, it is difficult to say what the source of the problem is.
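For example, something along these lines, comparing what du can account for against what the filesystem reports (a sketch; the /mnt/index mount point is a placeholder, so adjust the paths to your setup):

du -ah /mnt/index | sort -rh | head -20    # largest files/directories on the index disk
du -sh /mnt/index/@2i/*                    # per-index usage under the GSI directory
df -h /mnt/index                           # used space as the filesystem sees it

One common cause of du totals not matching df is files that were deleted while a process still holds them open; a restart releases them, which would match the behavior you describe.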
We are experiencing the same issue with index data. Here's our cluster: 9 nodes with 256 GB of memory each, and on each node we have 3 disks of 1 TB each. My setup when configuring the server:
After the cluster was set up, we loaded it with some data, not much compared to the capacity of the cluster:
Then I created a primary index for metrics-metadata, which worked fine. But when I created a GSI on that bucket, after the index was 100% ready I got a warning about disk space being full. One of the nodes (and only one) is 95% full on the index disk. Here's the output of du:
While the data itself takes only 22 GB on the data disk, the index file is filling up the index disk. And I noticed it was still growing even after the index showed 100% ready. I had to drop the index; after a while, the index disk space was freed up.
Is it normal to have this huge index file with such a small amount of data? And why is this only happening on one of the 9 nodes? Can we spread the index file across the cluster?
Following up on the above post: the situation on our staging cluster is even worse. We have the same setup and the same indexes. The secondary index filled up the disk of one node while it was at 95% ready. Now it's stuck there and cannot finish. And if I try to drop that index from the command line, I get an error stating the index is "not found".
@hclc, which Couchbase version are you using? If you are on 4.5.0, please switch to Circular Write Mode (change the compaction mode in the UI under "Settings" → "Auto Compaction" → "Index Fragmentation" → "Circular Write Mode").
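If you prefer to script it rather than use the UI, the same setting should be reachable through the cluster REST API; a minimal sketch, assuming the standard /controller/setAutoCompaction endpoint on port 8091 and placeholder credentials (verify the exact parameters against the 4.5 REST docs before relying on this):

curl -u Administrator:password -X POST \
  http://localhost:8091/controller/setAutoCompaction \
  -d indexCompactionMode=circular \
  -d parallelDBAndViewCompaction=false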
In general, the indexer saves 5 snapshots of the data (at different points in time) for recovery. This can lead to disk usage that is higher than the data size. The storage engine also has write amplification due to its MVCC architecture. And while compaction is in progress, data on disk gets duplicated as the new compacted file is created.
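If disk space is a concern, the number of recovery snapshots can be reduced via the indexer settings endpoint; a sketch of how this is commonly done (assuming the indexer admin port 9102 and the indexer.settings.recovery.max_rollbacks key, both of which you should verify for your version):

curl -u Administrator:password -X POST http://localhost:9102/settings \
  -d '{"indexer.settings.recovery.max_rollbacks": 2}'

This is what the later posts in this thread refer to as setting the "rollback point" to 2.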
Thanks for your information, deepkaran. However, we are using version 4.0. Is there any workaround for this version? Also, we have 9 nodes in this cluster with plenty of storage; can we save the index files on more than one disk?
Thank you! We cannot use a WHERE clause because we want to index the whole bucket. But I set the rollback point and will give it another try.
Again, a very important question (I think): is there a way to split the index file across multiple nodes? Couchbase is a clustered system, so why does the index have to be on one node and one disk? Is there any workaround for this?
You can manually partition your index so it is placed on multiple nodes, e.g.:
CREATE INDEX productName_index1 ON bucket_name(productName, ProductID) WHERE type="product" AND productName BETWEEN "A" AND "K" USING GSI WITH {"nodes":"node1:8091"};
CREATE INDEX productName_index2 ON bucket_name(productName, ProductID) WHERE type="product" AND productName BETWEEN "K" AND "Z" USING GSI WITH {"nodes":"node2:8091"};
With the indexes above, a search for productName = "APPLE WATCH" will be scanned by productName_index1, while productName = "SAMSUNG WATCH" will end up on productName_index2.
I set the rollback points to 2 and added the WHERE clause; however, the index is still filling up my hard drive (over 850 GB now), and the index is only 95% ready. Based on my experience last time, I cannot drop the index at this point; I will get an error saying "index not found".
What options do I have? Do I have to watch the index grow and completely fill my disk? Is there a way to stop the index from building?
Then again, the total bucket size is only 150 GB (in memory and on disk), so why does the index have to be this big? Is there something I did wrong?
Now I have successfully dropped all the indexes I created (manually partitioned as suggested in the response above), but I see the disk usage on one node is still growing.
It is the folder named "/mnt/storage2/couchbase/data/@2i", and it is close to filling 100% of the 1 TB disk now.
What is the content of this folder? How can we stop it from eating up the disk space?
@2i is the directory for GSI indexes. You may be able to identify which index is growing, and the indexer log may tell you more about what the indexer is doing on that node.
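For example, each index gets its own files under @2i named after the index, so something like this can show which one is growing (using your path from above; the log path assumes a default Linux install):

du -sh /mnt/storage2/couchbase/data/@2i/*                    # size per index
watch -n 60 du -sh /mnt/storage2/couchbase/data/@2i/*        # recheck each minute to spot growth
tail -f /opt/couchbase/var/lib/couchbase/logs/indexer.log    # follow what the indexer is doing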
Thanks! Inside that folder there was a file (or folder) named after my index. It kept growing for about 15 minutes after I dropped the index, until my disk was 100% full, and then it finally cleaned itself up.
@hclc, as I mentioned in my earlier comment, the high disk usage is due to the high write amplification of the storage engine's MVCC architecture. And while compaction is in progress, data on disk gets duplicated as the new compacted file is created, which takes a lot more space.