Running a database in production is hard. It usually looks simple at first, until something goes wrong: someone deletes a table, you get hacked, or data gets corrupted. Now you need to restore a backup (better hope that you made one!). Or the load on your database grows and your system no longer keeps up, so you need to scale or modify your setup.
I’ve been searching for a good Kubernetes-native solution for running real databases. As part of the research, I’m trying out different database operators and writing a tutorial on how to use each of them. If you’d like to know about the others, make sure to subscribe to our newsletter.
This is a tutorial (and a video!) on using AppsCode’s KubeDB and Stash to create a database on Kubernetes and set up a reliable system for regular backups.
“KubeDB by AppsCode simplifies and automates routine database tasks such as provisioning, patching, backup, recovery, failure detection, and repair for various popular databases on private and public clouds.” (KubeDB website)
In this tutorial I will:
- Install the KubeDB operator and create a simple PostgreSQL database
- Install Stash/KubeDB Enterprise (for backup and restore)
- Configure a backup job that runs on a schedule
- Share some closing remarks
1. Install the KubeDB operator and create a simple PostgreSQL database
Installing the KubeDB operator is described in the KubeDB guides but it comes down to adding the Appscode Helm repository and then installing a couple of Helm charts, as shown below. Because I like repeatable steps, I always create a Makefile, and this is what I will be showing you.
From what I understand it’s mandatory to install the operator in the kube-system namespace.
repository:
	helm repo add appscode https://charts.appscode.com/stable/

update:
	helm repo update

# check what the latest version is with:
versions:
	helm search repo appscode/kubedb

# installing the operator
install-community:
	helm install kubedb-community appscode/kubedb \
		--version v0.16.1 \
		--namespace kube-system

# installing the latest supported version
install-catalog:
	helm install kubedb-catalog appscode/kubedb-catalog \
		--version v0.16.1 \
		--namespace kube-system

This should have installed what you need. You can check with:
kubectl get postgresversions
The next step would be to install an actual PostgreSQL database. We can use the following minimal kubespec to create the “Postgres” resource (my-postgres.yaml):
apiVersion: kubedb.com/v1alpha2
kind: Postgres
metadata:
  name: my-postgres
spec:
  version: "10.6-v3"
  storageType: Durable
  storage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
  terminationPolicy: DoNotTerminate
For more details see the guide on Postgres
This is the moment to check that you are in the namespace that you want. By the way: I use kubens for this purpose. Together with kubectx, it is a great helper for switching namespaces and clusters.
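If you do not have kubens installed, plain kubectl can tell you which namespace is active. The following is a minimal sketch; the function name current_namespace is my own, and it assumes that an empty namespace in your kubeconfig means "default".

```shell
# Print the namespace of the current kubectl context.
# An unset namespace in the kubeconfig is treated as "default" (assumption).
current_namespace() {
  ns="$(kubectl config view --minify --output 'jsonpath={..namespace}')"
  printf '%s\n' "${ns:-default}"
}
```

Calling current_namespace right before kubectl apply is a cheap way to avoid creating the database in the wrong place.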
To create the database you can simply run
kubectl apply -f my-postgres.yaml
Now check what resources are being created:
kubectl get all
If all worked well, you should see at least the following resources:
pod/my-postgres-0
service/my-postgres
statefulset.apps/my-postgres
appbinding.appcatalog.appscode.com/my-postgres
postgres.kubedb.com/my-postgres
2. Installing Stash/KubeDB Enterprise (for backup and restore)
2.1 Get the license
To get access to the automated backup and restore features of KubeDB, you’ll need the ‘enterprise’ license. That license includes the ability to use Stash for the backup and restore of databases.
Currently, you start by fetching a 14-day trial license from the license issuer. This is enough to get you started. After that, you can get in touch with KubeDB and sign an agreement for pay-as-you-go licensing, which also gives you priority support.
I think it’s very reasonably priced (your own server + license comes out to less than what you’d pay for AWS managed databases).
2.2 Install the operators
# Install KubeDB-enterprise
install-enterprise:
	helm install kubedb-enterprise appscode/kubedb-enterprise \
		--version v0.2.1 \
		--namespace kube-system \
		--set-file license=kubedb-enterprise-license.txt
	# install stash
	helm install stash-enterprise appscode/stash-enterprise \
		--version v0.11.7 \
		--namespace kube-system \
		--set-file license=stash-enterprise-license.txt
2.3 Install stash-postgres-addon
Next, you’ll need to install the Postgres add-on for Stash. It contains the logic for backing up and restoring this particular database type.
stash-postgres-addon:
	curl -fsSL https://github.com/stashed/catalog/raw/v2020.11.17/deploy/helm3.sh | bash -s -- --catalog=stash-postgres
If you have successfully installed the stash-postgres addon, you should be able to retrieve a list of task options like so:
kubectl get tasks
This prints a list of tasks, one of which (postgres-backup-10.14.0-v3) is used in the backup configuration in step 3.2.
3. Configure a backup job
3.1 Configure a backup repository
First, we’ll configure a place where the backups will go. Here we will use an S3 bucket. Stash uses Restic, and Restic supports symmetric encryption so that your backups are not stored in plaintext. Make sure you store the Restic password safely outside of your cluster. Otherwise, if the cluster is lost, you cannot decrypt the backups and you would still have no actual data!
You will need to create a secret with three key-value pairs: the Restic password, the access key ID, and the secret access key for the bucket. I find it easiest to create a file named secrets and put the key-value pairs in there like so:
$ cat > secrets <<'EOF'
RESTIC_PASSWORD=a_strong_password
AWS_ACCESS_KEY_ID=key
AWS_SECRET_ACCESS_KEY=secret
EOF
Then create this secret from that file
kubectl create secret generic bucket-secrets \
  --from-env-file="secrets"
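Rather than typing a password by hand, you can generate one. The sketch below writes the same secrets file with a random Restic password and permissions that only allow your own user to read it. The AWS values are the same placeholders as above, and using /dev/urandom with base64 is just one way to get a random string.

```shell
#!/bin/sh
# Sketch: create the env file for the bucket secret with a random Restic password.
# The AWS values are placeholders; replace them with your bucket credentials.
set -eu

umask 077   # files created from here on are readable by the current user only

# 32 random bytes from the kernel, base64-encoded into a single line
RESTIC_PASSWORD="$(head -c 32 /dev/urandom | base64 | tr -d '\n')"

cat > secrets <<EOF
RESTIC_PASSWORD=${RESTIC_PASSWORD}
AWS_ACCESS_KEY_ID=key
AWS_SECRET_ACCESS_KEY=secret
EOF

echo "wrote ./secrets; copy RESTIC_PASSWORD to a safe place outside the cluster"
```

After creating the Kubernetes secret from this file, store the password in a password manager or vault and delete the local file.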
Now you can create the repository resource (the storage definition). I have used a Minio server, but any bucket should work. Please note the ‘http://’ prefix on the endpoint address. Since we run the bucket on the cluster, we do not want to use TLS, and omitting this prefix causes a TLS error.
# repository.yaml
apiVersion: stash.appscode.com/v1alpha1
kind: Repository
metadata:
  name: minio-repo
spec:
  backend:
    s3:
      endpoint: http://minio.minio.svc.cluster.local:7777
      bucket: database-backups
      prefix: /backups/demo/
    storageSecretName: bucket-secrets
As before, we apply this resource with
kubectl apply -f repository.yaml
3.2 Create the backupconfiguration
Now we are ready to create the backup configuration. Here is an example:
# backupconfiguration.yaml
apiVersion: stash.appscode.com/v1beta1
kind: BackupConfiguration
metadata:
  name: minio-backup
spec:
  driver: Restic
  repository:
    name: minio-repo
  task:
    name: postgres-backup-10.14.0-v3
  target:
    ref:
      apiVersion: appcatalog.appscode.com/v1alpha1
      kind: AppBinding
      name: my-postgres
  schedule: "0 * * * *" # backup every hour, on the hour
  paused: false
  backupHistoryLimit: 1
  retentionPolicy:
    name: "keep-some"
    keepLast: 15
    keepHourly: 24
    keepDaily: 30
    keepMonthly: 12
    prune: true
The ‘task’ field is a reference to the particular backup task that you want to run. From the output of kubectl get tasks, select the version that matches the major version of your database.
The target.ref.name field must contain the name you gave the database when you created it in step 1 (in my case my-postgres).
The retentionPolicy tells Restic which backups to keep and which to clean up to free space. Personally I really like this feature, as it allows you to do more frequent backups without bloating your storage.
Once applied, you should see a new backupsession (kubectl get backupsession) being started each time the schedule fires (set a shorter cron interval while you are testing). If the session starts but problems occur, a good way to debug is to look at the pod that is started as part of this process; you may need to look into the init container.
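The debugging steps above can be wrapped in a small helper. This is a sketch: the function name debug_backup is my own, and you have to look up the name of the backup pod yourself (for example with kubectl get pods) and pass it in.

```shell
# List backup sessions (oldest first), then describe the given backup pod
# and print the logs of all its containers, init containers included.
debug_backup() {
  pod="$1"   # name of the pod that was started for the backup
  kubectl get backupsession --sort-by=.metadata.creationTimestamp
  kubectl describe pod "$pod"
  kubectl logs "$pod" --all-containers=true --prefix=true
}
```

The --prefix flag labels every log line with its pod and container, which makes it easier to spot output that comes from the init container.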
Finally, you can check the repository (kubectl get repository) to see whether the backup has been registered there. Use your favorite tool to check if the backup files have also arrived where you expect.
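One such tool is Restic itself. The sketch below only builds the repository URL that the restic CLI expects for an S3-compatible bucket. It assumes that Stash keeps the Restic repository directly at bucket/prefix and that the bucket is reachable from your machine at localhost:7777 (for example through a port-forward to the Minio service); neither is confirmed by this tutorial, and the function name is hypothetical.

```shell
# Build a restic repository URL of the form s3:<endpoint>/<bucket>/<prefix>.
# Assumption: Stash stores the restic repository at <bucket>/<prefix>.
restic_repo_url() {
  endpoint="${1%/}"      # e.g. http://localhost:7777, trailing slash removed
  bucket="$2"            # e.g. database-backups
  prefix="${3#/}"        # strip one leading slash
  prefix="${prefix%/}"   # strip one trailing slash
  printf 's3:%s/%s/%s\n' "$endpoint" "$bucket" "$prefix"
}

restic_repo_url http://localhost:7777 database-backups /backups/demo/
```

With RESTIC_PASSWORD, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY exported in your shell, passing this URL to restic via -r together with the snapshots command should list the backups that Stash has written.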
If this all works you have completed your automatic backup configuration!
4. Closing remarks
While this tutorial stops here, if you plan to run a database in production you should make sure to regularly test a full restore of your database.
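As a starting point for such a test, here is a sketch of a Stash RestoreSession that restores the latest snapshot from the repository configured above. It only writes the manifest to a file. The resource name my-postgres-restore is hypothetical, and the task name postgres-restore-10.14.0-v3 is an assumption based on the name of the backup task; check kubectl get tasks for the restore task that your installation actually provides.

```shell
#!/bin/sh
# Sketch: write a RestoreSession manifest for the repository from step 3.1.
# Names marked "assumption" are illustrative and not taken from the Stash docs.
set -eu

cat > restoresession.yaml <<'EOF'
apiVersion: stash.appscode.com/v1beta1
kind: RestoreSession
metadata:
  name: my-postgres-restore            # assumption: any name works
spec:
  task:
    name: postgres-restore-10.14.0-v3  # assumption: verify with kubectl get tasks
  repository:
    name: minio-repo
  target:
    ref:
      apiVersion: appcatalog.appscode.com/v1alpha1
      kind: AppBinding
      name: my-postgres                # database to restore into
  rules:
  - snapshots: [latest]
EOF

echo "wrote restoresession.yaml"
```

Apply it with kubectl apply -f restoresession.yaml, preferably against a fresh test database rather than your production instance, and follow the progress with kubectl get restoresession.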
Overall I have found that the KubeDB and Stash operators are reasonably well documented, and the support is good.
Finally: Be sure to sign up for the Leafcloud newsletter, as I hope to complete evaluations of alternatives and a comparison as well.