Back Up Your Databases
I've had some issues with my persistent volumes on Kubernetes. I was lucky enough to be able to fix them, but that's not always the case. I decided to make regular backups of my MySQL databases, and since I'm using Kubernetes, why not use the CronJob resource to make those backups as well.
Plan of Action
I wanted to achieve the following:
- Use a custom container image that hosts the backup script.
- Run a pod on a time schedule that mounts a NFS volume to the pod.
- Create a backup with a timestamp suffix and store it to NFS every day.
- Run a weekly job that keeps only the last 7 days of backups on disk.
The solution that I came up with is to use Kubernetes CronJobs that will run on the time schedules of our choice.
Container Image
First we build our container image, in the file Dockerfile:
FROM alpine:3.19.1
# Install dependencies
RUN apk --no-cache add mysql-client
# Copy binary
COPY bin/db-backup /usr/bin/db-backup
RUN chmod +x /usr/bin/db-backup
# Execute
CMD ["db-backup"]
Then we need to define our backup script in bin/db-backup:
#!/usr/bin/env sh
# MySQL credentials
DB_USER="${DB_USER:-}"
DB_PASS="${DB_PASS:-}"
DB_HOST="${DB_HOST:-}"
# Backup directory
BACKUP_DIR="${BACKUP_DIR:-/data/backups}"
DATE=$(date +"%Y%m%d%H%M")
BACKUP_FILE="$BACKUP_DIR/all_databases_$DATE.sql.gz"
# Function to log and exit on error
log_and_exit() {
  echo "$(date +"%Y-%m-%d %H:%M:%S") - $1"
  exit 1
}
# Check if required environment variables are set
if [ -z "$DB_USER" ] || [ -z "$DB_PASS" ] || [ -z "$DB_HOST" ]; then
  log_and_exit "Error: One or more required environment variables (DB_USER, DB_PASS, DB_HOST) are not set."
fi
# Ensure the backup directory exists
mkdir -p "$BACKUP_DIR"
# Dump all databases and gzip the output
mysqldump -u "$DB_USER" -p"$DB_PASS" -h "$DB_HOST" --all-databases | gzip > "$BACKUP_FILE"
# Verify the backup file
if [ -f "$BACKUP_FILE" ]; then
  echo "[$DATE] Backup successful: $BACKUP_FILE"
else
  log_and_exit "Backup failed!"
fi
As you can see, we are relying on environment variables that need to be present in our runtime environment.
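As a side note, restoring from one of these backups is just the dump in reverse; a sketch, where the backup file name is only an example and the same credential variables are assumed to be set:

```shell
# Decompress the dump and stream it back into MySQL
# (file name below is an example timestamped backup)
gunzip -c /data/backups/all_databases_202401010200.sql.gz | \
  mysql -u "$DB_USER" -p"$DB_PASS" -h "$DB_HOST"
```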
Then continue to build the container image and push it to your registry; I have published mine at ruanbekker/mysql-backups:alpine-latest.
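The build and push can be done with the standard docker workflow (substitute your own registry and repository name; mine is shown here):

```shell
# Build the image from the directory containing the Dockerfile
docker build -t ruanbekker/mysql-backups:alpine-latest .
# Push it to the registry
docker push ruanbekker/mysql-backups:alpine-latest
```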
Backup CronJob
Now that we have our container image published, we can define our CronJob that will run a backup every morning at 2AM, in templates/mysql-backup-job.yaml:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-db-backup
  namespace: databases
spec:
  schedule: "0 2 * * *" # Runs daily at 2:00 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: db-backup
              image: "ruanbekker/mysql-backups:alpine-latest"
              imagePullPolicy: Always
              env:
                - name: DB_HOST
                  value: "mysql.databases.svc.cluster.local"
                - name: BACKUP_DIR
                  value: "/backup"
                - name: DB_USER
                  valueFrom:
                    secretKeyRef:
                      name: mysql-db-backup-secrets
                      key: DB_USER
                - name: DB_PASS
                  valueFrom:
                    secretKeyRef:
                      name: mysql-db-backup-secrets
                      key: DB_PASS
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: mysql-backup-pvc
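To verify the job works without waiting for the schedule, you can trigger a one-off run from the CronJob spec (the job name mysql-db-backup-manual is just an example):

```shell
# Create a Job from the CronJob's template and run it immediately
kubectl create job --from=cronjob/mysql-db-backup mysql-db-backup-manual -n databases
# Follow the logs of the resulting pod
kubectl logs -n databases -l job-name=mysql-db-backup-manual -f
```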
We can see we are referencing some environment variables from Secrets, so let's create those secrets in templates/secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: mysql-db-backup-secrets
  namespace: databases
type: Opaque
data:
  # echo -n 'admin' | base64
  DB_USER: YWRtaW4=
  DB_PASS: YWRtaW4=
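The data values are base64-encoded, as the comment hints. To generate (and double-check) the value for the example password admin:

```shell
# Encode the plaintext credential for the Secret's data field
encoded=$(printf '%s' 'admin' | base64)
echo "$encoded"   # YWRtaW4=
# Decode it again to verify the round trip
printf '%s' "$encoded" | base64 -d
```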
That includes our DB_USER and DB_PASS. Ensure that DB_HOST is set to the endpoint of your MySQL host (as you can see, mine is the service endpoint inside the cluster), and that BACKUP_DIR, the backup directory inside of your pod, matches the volumeMounts section.
Backup Cleanup
Then lastly, the job that will clean up the old backups once a week can be defined in templates/mysql-cleanup-job.yaml:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup-cleanup
  namespace: databases
spec:
  schedule: "0 3 * * 0" # Runs weekly at 3:00 AM on Sundays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: db-backup-cleanup
              image: "alpine:3.19.1"
              command:
                - /bin/sh
                - -c
                - |
                  find /backup -type f -mtime +7 -name '*.sql.gz' -exec rm {} \;
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: mysql-backup-pvc
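If you want to see what that find expression would delete before trusting it with real backups, you can dry-run it with -print against a throwaway directory (the directory and file names below are just for the demo; touch -d assumes GNU coreutils):

```shell
# Create a demo directory with one stale and one fresh backup file
demo_dir=$(mktemp -d)
touch -d '10 days ago' "$demo_dir/all_databases_old.sql.gz"
touch "$demo_dir/all_databases_new.sql.gz"
# Same expression as the CronJob, but -print instead of -exec rm
matched=$(find "$demo_dir" -type f -mtime +7 -name '*.sql.gz' -print)
echo "$matched"   # only the 10-day-old file is listed
rm -r "$demo_dir"
```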
Persistent Volume Claim
I have an NFS StorageClass named nfs, and we need to create a PersistentVolumeClaim that both jobs will use, as both need access to the data on that storage path. Inside templates/mysql-pvc-for-jobs.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-backup-pvc
  namespace: databases
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs
  resources:
    requests:
      storage: 5Gi
Deploy
Once we have everything defined, we can deploy it all using kubectl apply -f templates/. Just make sure you review the namespaces, storage classes, and so on.
You can view the resources using:
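A couple of kubectl commands along those lines (namespace matching the manifests above):

```shell
# List the CronJobs, any spawned Jobs, and their pods
kubectl get cronjobs,jobs,pods -n databases
# Check that the PVC is bound to the NFS storage class
kubectl get pvc mysql-backup-pvc -n databases
```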