Couchbase backup. No space left on device

Good day.
I have a Couchbase Server cluster running in AKS.
I configured Couchbase backups with these parameters:

backups:
  default-backup:
    name: couchbase-backup
    strategy: full_incremental
    full:
      schedule: "0 3 * * 0"
    incremental:
      schedule: "0 3 * * 1-6"
    successfulJobsHistoryLimit: 1
    failedJobsHistoryLimit: 3
    backoffLimit: 2
    backupRetention: 720h
    logRetention: 168h
    size: 5Gi
    ephemeralVolume: false
    objectStore:
      useIAM: false
      secret: azure-secret
      uri: az://cbbackup
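For reference, this is roughly how I check what the chart actually created (the exact PVC name depends on the operator, so I just list them all):

# Inspect the CouchbaseBackup resource rendered by the Helm release
kubectl -n apps get couchbasebackup couchbase-backup -o yaml

# List PVCs in the namespace to find the 5Gi backup volume
kubectl -n apps get pvc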

and I configured cloud backup against an Azure storage account.
After a few days I realized that my backups were not working properly.

I checked and found some very strange behavior. This is the log from the backup pod:

kubectl logs pod/couchbase-backup-incremental-28221300-ljtch -n apps
/usr/local/lib/python3.8/dist-packages/requests/__init__.py:109: RequestsDependencyWarning: urllib3 (2.0.2) or chardet (None)/charset_normalizer (3.1.0) doesn't match a supported version!
  warnings.warn(
Traceback (most recent call last):
  File "/usr/local/bin/backup.py", line 1243, in <module>
    Backup(context).run()
  File "/usr/local/bin/backup.py", line 379, in run
    self._setup_logging()
  File "/usr/local/bin/backup.py", line 1156, in _setup_logging
    file_handler = logging.FileHandler(filename=self._get_logs_abs_path(),
  File "/usr/lib/python3.8/logging/__init__.py", line 1147, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
OSError: [Errno 28] No space left on device: '/data/scriptlogs/backup/2023-08-29T03_01_03.log'

"No space left on device"? I checked the free space and found that only 1.5 GB out of the 5 GB volume is used.
As proof, here is the output of kubectl get couchbasebackup couchbase-backup -o yaml:

apiVersion: couchbase.com/v2
kind: CouchbaseBackup
metadata:
  annotations:
    meta.helm.sh/release-name: couchbase-release
    meta.helm.sh/release-namespace: apps
  creationTimestamp: "2023-08-02T17:31:13Z"
  generation: 109
  labels:
    app.kubernetes.io/managed-by: Helm
    cluster: couchbase-release
  name: couchbase-backup
  namespace: apps
  resourceVersion: "115777197"
  uid: 27cbfa84-1cd7-40e6-a1df-3579f7c4aacf
spec:
  backoffLimit: 2
  backupRetention: 720h
  ephemeralVolume: false
  failedJobsHistoryLimit: 3
  full:
    schedule: 0 3 * * 0
  incremental:
    schedule: 0 3 * * 1-6
  logRetention: 168h
  objectStore:
    secret: azure-secret
    uri: az://cbbackup
    useIAM: false
  services:
    analytics: true
    bucketConfig: true
    bucketQuery: true
    clusterAnalytics: true
    clusterQuery: true
    data: true
    eventing: true
    ftsAliases: true
    ftsIndexes: true
    gsIndexes: true
    views: true
  size: 5Gi
  strategy: full_incremental
  successfulJobsHistoryLimit: 1
  threads: 1
status:
  archive: az://cbbackup/archive
  backups:
  - full: 2023-08-15T20_19_18.026423127Z
    incrementals:
    - 2023-08-16T03_00_23.152342302Z
    - 2023-08-17T03_00_26.114923499Z
    - 2023-08-18T03_00_32.024195992Z
    - 2023-08-19T03_00_30.619394995Z
    name: couchbase-release-2023-08-15T20_19_15
  - full: 2023-08-20T03_00_34.305474025Z
    incrementals:
    - 2023-08-21T03_00_40.506386234Z
    - 2023-08-22T03_00_43.216405629Z
    - 2023-08-23T03_00_50.281174379Z
    - 2023-08-24T03_00_55.846861426Z
    - 2023-08-25T03_00_57.820162133Z
    - 2023-08-26T03_01_06.659720506Z
    name: couchbase-release-2023-08-20T03_00_17
  - full: 2023-08-27T03_01_10.868515494Z
    incrementals:
    name: couchbase-release-2023-08-27T03_00_31
  capacityUsed: 1512Mi
  cronjob: deprecated
  duration: 62s
  failed: true
  job: deprecated
  lastFailure: "2023-08-28T03:03:58Z"
  lastRun: "2023-08-28T03:02:55Z"
  lastSuccess: "2023-08-27T03:07:23Z"
  pod: deprecated
  repo: couchbase-release-2023-08-27T03_00_31
  running: false

Question: what’s going on???

And I have another question:
if I use cloud storage for backups, why is a local copy also stored on disk? Why is there no deletion or something like that? After all, that's why I set up cloud storage in the first place: so that I don't have to store data locally or think about free space.

If I understand the message correctly, it's the device holding the log file that is full.

OK.
That log file is in the same folder, /data:
OSError: [Errno 28] No space left on device: '/data/scriptlogs/backup/2023-08-29T03_01_03.log'
The /data folder is the volume mounted into the couchbase-backup pod, and it contains the logs AND the local copy of the backup.
For this folder I reserved 5 GB:
size: 5Gi

How is that possible?
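This is roughly how I checked it, assuming a backup job pod is still running (the pod name below is from one of my failed jobs and will differ in your case):

# Check overall usage of the mounted backup volume
kubectl -n apps exec -it couchbase-backup-incremental-28221300-ljtch -- df -h /data

# Break down what actually consumes the space (logs vs. staging data)
kubectl -n apps exec -it couchbase-backup-incremental-28221300-ljtch -- du -sh /data/*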

Tonight I increased the volume size from 5 GB to 50 GB. Nothing else was changed.
And tonight the backup completed!
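For anyone else hitting this, I changed it roughly like this (the chart name is from memory and may differ in your setup, and whether the existing PVC is expanded in place depends on the operator version and on the storage class allowing volume expansion):

# Bump the backup volume size in the Helm values and upgrade the release
helm upgrade couchbase-release couchbase/couchbase-operator -n apps \
  --reuse-values \
  --set backups.default-backup.size=50Gi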

apiVersion: couchbase.com/v2
kind: CouchbaseBackup
metadata:
  annotations:
    meta.helm.sh/release-name: couchbase-release
    meta.helm.sh/release-namespace: apps
  creationTimestamp: "2023-07-25T20:23:47Z"
  generation: 76
  labels:
    app.kubernetes.io/managed-by: Helm
    cluster: couchbase-release
  name: couchbase-backup
  namespace: apps
  resourceVersion: "68633673"
  uid: 0e16e87c-5a70-4839-b38f-328afab46cbf
spec:
  backoffLimit: 2
  backupRetention: 720h
  ephemeralVolume: false
  failedJobsHistoryLimit: 3
  full:
    schedule: 0 3 * * 0
  incremental:
    schedule: 0 3 * * 1-6
  logRetention: 168h
  objectStore:
    secret: azure-secret
    uri: az://cbbackup
    useIAM: false
  services:
    analytics: true
    bucketConfig: true
    bucketQuery: true
    clusterAnalytics: true
    clusterQuery: true
    data: true
    eventing: true
    ftsAliases: true
    ftsIndexes: true
    gsIndexes: true
    views: true
  size: 50Gi
  strategy: full_incremental
  successfulJobsHistoryLimit: 1
  threads: 1
status:
  archive: az://cbbackup/archive
  backups:
  - full: 2023-08-06T03_00_30.291787438Z
    incrementals:
    - 2023-08-07T03_00_33.555753246Z
    - 2023-08-08T03_00_34.791049911Z
    - 2023-08-09T03_00_37.128669934Z
    - 2023-08-10T03_00_37.373483327Z
    - 2023-08-11T03_00_38.681273507Z
    - 2023-08-12T03_00_44.99418999Z
    name: couchbase-release-2023-08-06T03_00_18
  - full: 2023-08-13T03_00_48.572596861Z
    incrementals:
    - 2023-08-14T03_00_52.647386054Z
    name: couchbase-release-2023-08-13T03_00_23
  - incrementals:
    - 2023-08-15T20_54_26.952198931Z
    name: couchbase-release-2023-08-15T20_27_55
  - full: 2023-08-17T08_23_13.19901395Z
    incrementals:
    name: couchbase-release-2023-08-17T07_56_31
  - full: 2023-08-29T20_29_40.933511093Z
    incrementals:
    - 2023-08-30T03_00_31.30381511Z
    name: couchbase-release-2023-08-29T20_29_26
  capacityUsed: 1189Mi
  cronjob: deprecated
  duration: 205s
  failed: false
  job: deprecated
  lastFailure: "2023-08-17T08:28:01Z"
  lastRun: "2023-08-30T03:00:19Z"
  lastSuccess: "2023-08-30T03:03:45Z"
  pod: deprecated
  repo: couchbase-release-2023-08-29T20_29_26
  running: false

What’s wrong with this? Or what’s wrong with me?

I wonder if the issue was that it ran out of inodes? It's possible to run out of inodes while there is still space available on the device. What does "df -hi" show?
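For example, from inside a pod that mounts the same volume (the pod name is a placeholder):

# -i reports inode usage instead of block usage; -h makes the numbers human-readable
kubectl -n apps exec -it <backup-pod> -- df -hi /data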

Here’s a comprehensive thread on the error: filesystems - Python causing: IOError: [Errno 28] No space left on device: '../results/32766.html' on disk with lots of space - Stack Overflow

This post hypothesizes that the error has something to do with the Python process running out of memory: python - No space on device while gzipping - Stack Overflow

why is there no deletion or something like that?

backupRetention: 720h

Did I understand correctly that backupRetention: 720h applies to the LOCAL copy?

I thought this parameter controlled how long a backup is kept, and since I use cloud storage, that it meant the retention time in the cloud (in my case, an Azure storage account).
Am I correct in understanding that it is actually the retention time for the local copy? How do I then clean old archives out of cloud storage? Manually?
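Is the expected way something like this? (Just my guess from the cbbackupmgr docs; the repository name is one of mine, and the Azure credentials would still have to be supplied via the object-store options described there.)

# Guess: prune an old repository from the cloud archive with cbbackupmgr remove
cbbackupmgr remove \
  --archive az://cbbackup/archive \
  --repo couchbase-release-2023-08-15T20_19_15 \
  --obj-staging-dir /tmp/backup-staging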

I'm sorry for such annoying questions, but I can't get my head around one idea: why should I still have to think about local backups (storage, volumes, space) if I chose to use cloud storage?

As far as I can tell from the documentation, when cloud storage is used, there should not be any backup data saved to local disk (only some staging of metadata). Am I missing something? Which documentation are you following, @hyunjuV?

  • When using cbbackupmgr to backup to cloud storage, cbbackupmgr uses a staging directory to store metadata for low-latency interactions. Data is streamed directly to the cloud storage. The staging directory is described here in the documentation, where it’s noted that the “staging directory can become quite large during a normal backup depending on the number of documents being backed up, and the size of their keys.”

  • You can calculate the approximate size needed for the staging directory using the formula provided in the documentation: https://docs.couchbase.com/server/current/backup-restore/cbbackupmgr-cloud.html#disk-requirements
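To make that concrete, here is a back-of-the-envelope sketch; the per-item overhead below is a placeholder, so substitute the exact figures from the page linked above:

# Rough staging-directory estimate: items * (average key size + per-item metadata overhead)
NUM_ITEMS=50000000      # e.g. 50 million documents
AVG_KEY_SIZE=64         # bytes
OVERHEAD=32             # bytes per item, placeholder only
echo "$(( NUM_ITEMS * (AVG_KEY_SIZE + OVERHEAD) / 1024 / 1024 )) MiB of staging space"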


Thank you for your response; that is the behavior I expected.
But what I'm seeing is different. You can see my first post in this thread if you have time.

The error is an operating-system error, so if the operating system is reporting it when you don't think the condition exists, your best avenue for investigation is the provider of the operating system. I did provide suggestions for investigation (check whether inodes are available, etc.), but it's not clear if you followed up.

Tonight I increased the volume size from 5 GB to 50 GB. Nothing else was changed.

The number of available inodes is based on the volume size, so when you resized the volume, the number of available inodes changed as well.
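For example, ext4 allocates roughly one inode per 16 KiB of capacity by default, so the inode budget grows with the volume (the exact ratio depends on how the disk was formatted; you can confirm it on the node):

#   5 GiB  ->  5 * 1024 * 1024 KiB / 16 KiB per inode ≈ 327,680 inodes
#  50 GiB  -> 50 * 1024 * 1024 KiB / 16 KiB per inode ≈ 3,276,800 inodes
# Confirm on the node hosting the volume (device path is a placeholder):
tune2fs -l /dev/<backup-volume-device> | grep -i 'inode count'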