On July 9th, 2024, the photo management software I self-host (Immich) stopped working. The API logs showed an error about something not working with "Redis". The error in particular was the following:
MISCONF Redis is configured to save RDB snapshots, but it's currently unable to persist to disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error
Ok, the error is with Redis, let's check the logs that it's reporting:
1:M 10 Jul 2024 22:39:04.073 * 10000 changes in 60 seconds. Saving...
1:M 10 Jul 2024 22:39:04.075 * Background saving started by pid 213
213:C 10 Jul 2024 22:39:04.138 # Write error while saving DB to the disk(fsync): Disk quota exceeded
1:M 10 Jul 2024 22:39:04.175 # Background saving error
1:M 10 Jul 2024 22:39:10.033 * 10000 changes in 60 seconds. Saving...
1:M 10 Jul 2024 22:39:10.034 * Background saving started by pid 214
214:C 10 Jul 2024 22:39:10.100 # Write error while saving DB to the disk(fsync): Disk quota exceeded
Ah, OK. Redis is failing to save its snapshot file, and the error being reported is that the disk quota has been exceeded. I've configured the PVC in Kubernetes to be 30GB, and I would be really impressed if the Redis database were that big: I may have some data, but not enough for a Redis database to be that large. Still, you can check usage on each mounted filesystem by running the df -i command in the container (note that -i reports inodes rather than bytes). Let's try that.
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
overlay 30276800 600293 29676507 2% /
tmpfs 999296 17 999279 1% /dev
10.0.0.200:/volume1/nfs-volume/redis-pvc 0 0 0 - /data
/dev/sda2 30276800 600293 29676507 2% /etc/hosts
shm 999296 1 999295 1% /dev/shm
tmpfs 999296 9 999287 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 999296 1 999295 1% /proc/asound
tmpfs 999296 1 999295 1% /proc/scsi
tmpfs 999296 1 999295 1% /sys/firmware
As you can see from the result, I'm not anywhere near maxing out any of the storage on this pod... On top of that, the data is being saved over NFS to a Synology box I have on my network. The Synology box has a max capacity of 2 TB and I know for a fact that only 1 TB of that is used. Why am I getting a disk quota exceeded error?
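A side note on the check above: the NFS mount reports 0 inodes, so df -i says very little about /data. A byte-level view next to the inode view may be more telling. The helper below is a minimal sketch; show_usage is a hypothetical name, and whether the numbers reflect a quota enforced on the NFS server depends on how that server exports it, so treat the output as a hint rather than proof.

```shell
# Hypothetical helper: print byte-level and inode-level usage for one path.
# Assumes a df that supports -P/-k (POSIX) and -i (GNU and most Linux builds).
show_usage() {
    dir="${1:-.}"
    echo "== bytes (1K blocks) =="
    df -Pk "$dir"
    echo "== inodes =="
    df -i "$dir"
}
```

Inside the Redis container this would be called as show_usage /data, where /data is the mount point from the df output above.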
Looking online, I found that the quick fix was to stop Redis from rejecting writes when a background save fails, as written in this gist link on GitHub.
$ redis-cli
> config set stop-writes-on-bgsave-error no
Redis calls this out in their config file as well. There's even a question and answer on one of their pages as seen here.
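Keep in mind that this setting only stops Redis from refusing writes; the snapshots themselves can keep failing in the background. One way to see whether they still fail is the rdb_last_bgsave_status field in the output of redis-cli info persistence. The sketch below pulls that field out; bgsave_status is a hypothetical helper name, and it assumes the INFO text arrives on stdin.

```shell
# Hypothetical helper: read `INFO persistence` text on stdin and print the
# value of rdb_last_bgsave_status ("ok" or "err").
bgsave_status() {
    # INFO lines end in CRLF, so strip the carriage returns first
    tr -d '\r' | awk -F: '$1 == "rdb_last_bgsave_status" { print $2 }'
}
```

Usage would look like redis-cli info persistence | bgsave_status. If it still prints err after the config change, writes are being accepted but nothing is reaching disk.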
Debugging the issue
Restarting Redis or Immich didn't solve the issue. Stumped that turning it off and on again didn't fix anything, I decided to see if I could replicate the issue. Because it's a homelab and not a bank, I did a little old
$ kubectl exec redis -it -- bash
And then went to the directory that was connected to the file share and tried to write hello world to a new random file.
echo "hello world" > example.file
To my surprise, this posted the same error I had seen before, saying that I have exceeded my quota share. I wasn't sure if Synology had the ability to set file system limits but the investigative journalist inside of me decided to check it out.
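The manual write test can be wrapped into a small reusable probe. This is only a sketch: probe_write and the .write-probe file name are made up for illustration, and a tiny write like this may not trip every kind of quota error.

```shell
# Hypothetical helper: try to create a small file in a directory, then
# remove it. Returns 0 if the write succeeded, 1 otherwise.
probe_write() {
    dir="${1:-.}"
    probe="$dir/.write-probe.$$"
    if printf 'hello world\n' > "$probe"; then
        rm -f "$probe"
        echo "write ok: $dir"
        return 0
    fi
    # Clean up any partial file; the shell has already printed the real error
    rm -f "$probe" 2>/dev/null
    echo "write FAILED: $dir" >&2
    return 1
}
```

Run from inside the container as probe_write /data. A failure here, independent of Redis, points at the storage layer rather than at the application.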
I logged into Synology and went to my shared folders that I use for my Kubernetes cluster, and lo and behold this is what I see...
Well, actually not this, because this was me fixing the issue, but you gotta believe me, it showed that I used 256GB of my 256GB quota.
Lessons learned
Synology lets you set a quota that caps how much storage a shared folder can use. I must have set this up about 2 years ago and never revisited it. The issue is now solved and my Immich uploads are (partially) working again 🙃