Backing Up K8ssandra With MinIO

K8ssandra ships with Medusa for Apache Cassandra® to handle the backup and restore of your Cassandra nodes. Medusa was recently upgraded to support all S3-compatible backends, including MinIO, the popular Kubernetes-native object storage suite. Let’s see how to set up K8ssandra and MinIO to back up Cassandra in just a few steps.

Deploying MinIO

Much like K8ssandra, MinIO can easily be deployed through Helm.

First, add the MinIO repository to your local list:

helm repo add minio https://helm.min.io/

The MinIO Helm charts allow you to do several things at install time:

  • Set credentials to access MinIO
  • Create a bucket for your backups, which can be set as the default

You can create a bucket named k8ssandra-medusa, use minio_key/minio_secret as credentials, and deploy MinIO in a new namespace called minio by running the following command:

helm install --set accessKey=minio_key,secretKey=minio_secret,defaultBucket.enabled=true,defaultBucket.name=k8ssandra-medusa minio minio/minio -n minio --create-namespace

Note: creating the bucket isn’t mandatory at this point; it can also be done later through the MinIO user interface.
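For example, if you prefer the command line over the UI, the MinIO client (mc) can create the bucket instead. A minimal sketch, assuming MinIO is reachable at localhost:9000 (for instance through the port-forward shown below); the myminio alias name is arbitrary:

# Point the MinIO client at the deployment, then create the backup bucket
mc alias set myminio http://localhost:9000 minio_key minio_secret
mc mb myminio/k8ssandra-medusa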

After the helm install completes, you should see something similar to this in the minio namespace:

% kubectl get all -n minio
NAME                        READY   STATUS    RESTARTS   AGE
pod/minio-5fd4dd687-gzr8j   1/1     Running   0          109s

NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/minio   ClusterIP   10.96.144.61   <none>        9000/TCP   109s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio   1/1     1            1           109s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-5fd4dd687   1         1         1       109s

Using port forwarding, you can expose the MinIO UI in your browser on port 9000:

% kubectl port-forward service/minio 9000 -n minio
Forwarding from 127.0.0.1:9000 -> 9000
Forwarding from [::1]:9000 -> 9000

You can now log in to MinIO at http://localhost:9000 using the credentials specified at install time (if you used the command above, these are minio_key and minio_secret):

Log in to MinIO

Once logged in, you can see that the k8ssandra-medusa bucket has been created and is currently empty:

K8ssandra Medusa Bucket Inside MinIO

Deploying K8ssandra

Now that MinIO is up and running, you can create a namespace to install K8ssandra in, along with a secret that allows Medusa to access the bucket. Create a medusa_secret.yaml file with the following content:

apiVersion: v1
kind: Secret
metadata:
 name: medusa-bucket-key
type: Opaque
stringData:
 # Note that this currently has to be set to medusa_s3_credentials!
 medusa_s3_credentials: |-
   [default]
   aws_access_key_id = minio_key
   aws_secret_access_key = minio_secret

Now create the k8ssandra namespace and the Medusa secret with the following commands:

kubectl create namespace k8ssandra
kubectl apply -f medusa_secret.yaml -n k8ssandra

You should now see the medusa-bucket-key secret in the k8ssandra namespace:

% kubectl get secrets -n k8ssandra
NAME                  TYPE                                  DATA   AGE
default-token-twk5w   kubernetes.io/service-account-token   3      4m49s
medusa-bucket-key     Opaque                                1      45s

You can then deploy K8ssandra with the following custom values file (default values will be used for anything not customized here):

medusa:
  enabled: true
  storage: s3_compatible
  storage_properties:
      host: minio.minio.svc.cluster.local
      port: 9000
      secure: "False"
  bucketName: k8ssandra-medusa
  storageSecret: medusa-bucket-key

Save the above file as k8ssandra_medusa_minio.yaml, then install K8ssandra with the following command:

helm install k8ssandra k8ssandra/k8ssandra -f k8ssandra_medusa_minio.yaml -n k8ssandra

Now wait for the Cassandra cluster to be ready using the following wait command:

kubectl wait --for=condition=Ready cassandradatacenter/dc1 --timeout=900s -n k8ssandra

You should now see a list of pods similar to this:

% kubectl get pods -n k8ssandra
NAME                                                  READY   STATUS      RESTARTS   AGE
k8ssandra-cass-operator-547845459-dwg68               1/1     Running     0          6m36s
k8ssandra-dc1-default-sts-0                           3/3     Running     0          5m56s
k8ssandra-dc1-stargate-776f88f945-p9twg               0/1     Running     0          6m36s
k8ssandra-grafana-75b9cb64cc-kndtc                    2/2     Running     0          6m36s
k8ssandra-kube-prometheus-operator-5bdd97c666-qz5vv   1/1     Running     0          6m36s
k8ssandra-medusa-operator-d766d5b66-wjt7j             1/1     Running     0          6m36s
k8ssandra-reaper-5f9bbfc989-j59xk                     1/1     Running     0          2m48s
k8ssandra-reaper-operator-858cd89bdd-7gfjj            1/1     Running     0          6m36s
k8ssandra-reaper-schema-4gshj                         0/1     Completed   0          3m3s
prometheus-k8ssandra-kube-prometheus-prometheus-0     2/2     Running     1          6m32s
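
One of the three containers in the Cassandra pod is Medusa itself. Optionally, you can check its logs to confirm it started cleanly and can reach MinIO; a quick sketch, assuming the default container name medusa that K8ssandra gives it:

kubectl logs k8ssandra-dc1-default-sts-0 -c medusa -n k8ssandra | tail -n 20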

Create and back up some data

Extract the username and password needed to access Cassandra into variables (the password is generated randomly for each install unless explicitly set at install time):

% username=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.username}" | base64 --decode)
% password=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.password}" | base64 --decode)

Connect through CQLSH on one of the nodes:

% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password

Copy/paste the following statements into the CQLSH prompt and hit Enter:

CREATE KEYSPACE medusa_test  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE medusa_test;
CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT, state TEXT);
INSERT INTO users (email, name, state) VALUES ('alice@example.com', 'Alice Smith', 'TX');
INSERT INTO users (email, name, state) VALUES ('bob@example.com', 'Bob Jones', 'VA');
INSERT INTO users (email, name, state) VALUES ('carol@example.com', 'Carol Jackson', 'CA');
INSERT INTO users (email, name, state) VALUES ('david@example.com', 'David Yang', 'NV');

Check that the rows were inserted correctly:

SELECT * FROM medusa_test.users;

 email             | name          | state
-------------------+---------------+-------
 alice@example.com |   Alice Smith |    TX
   bob@example.com |     Bob Jones |    VA
 david@example.com |    David Yang |    NV
 carol@example.com | Carol Jackson |    CA

(4 rows)

Now back up this data and check that the files have been created in your MinIO bucket.

To do so, use the following command:

helm install my-backup k8ssandra/backup -n k8ssandra --set name=backup1,cassandraDatacenter.name=dc1

Since the backup process is asynchronous, you can monitor its completion by running the following command:

kubectl get cassandrabackup backup1 -n k8ssandra -o jsonpath={.status.finishTime}

As long as this does not output a date and time, the backup is still running. Given the small amount of data and the locally accessible backend, the backup should complete quickly.
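
If you’d rather block until the backup finishes, here is a minimal bash polling sketch (the 5-second interval is arbitrary):

# Poll until finishTime is set on the CassandraBackup resource
until [ -n "$(kubectl get cassandrabackup backup1 -n k8ssandra -o jsonpath='{.status.finishTime}')" ]; do
  sleep 5
done
echo "Backup backup1 complete."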

Now refresh the MinIO UI and you will see some files in the k8ssandra-medusa bucket:

Files in K8ssandra Medusa Bucket

It should show an index folder (the Medusa backup index) and another folder for each Cassandra node in the cluster (in this case, just one node).
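
You can also list the backup files from the command line with the MinIO client, reusing the myminio alias created earlier (this assumes the port-forward to the MinIO service is still running):

mc ls --recursive myminio/k8ssandra-medusa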

Delete the data and restore the backup

Truncate the table and check that it is empty:

% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password

TRUNCATE medusa_test.users;

SELECT * FROM medusa_test.users;

 email | name | state
-------+------+-------


(0 rows)

Now restore the backup that was previously taken:

helm install restore-test k8ssandra/restore --set name=restore-backup1,backup.name=backup1,cassandraDatacenter.name=dc1 -n k8ssandra

This process will take a little longer because it requires stopping the StatefulSet pod and performing the restore as part of the init containers before starting the Cassandra container. You can monitor the progress with this command:

watch -d kubectl get cassandrarestore restore-backup1 -o jsonpath={.status} -n k8ssandra
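
In a second terminal, you can also watch the Cassandra pod get terminated and recreated with its restore init container; a sketch, assuming the datacenter label that cass-operator applies to its pods:

kubectl get pods -l cassandra.datastax.com/datacenter=dc1 -n k8ssandra -w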

The restore is complete once the finishTime value appears in the output:

{"finishTime":"2021-03-23T13:58:36Z","restoreKey":"83977399-44dd-4752-b4c4-407273f0339e","startTime":"2021-03-23T13:55:35Z"}

Verify that you can read the data from the previously truncated table:

% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password -e "SELECT * FROM medusa_test.users"

 email             | name          | state
-------------------+---------------+-------
 alice@example.com |   Alice Smith |    TX
   bob@example.com |     Bob Jones |    VA
 david@example.com |    David Yang |    NV
 carol@example.com | Carol Jackson |    CA

(4 rows)

You have successfully recovered your lost data in just a few commands!

Many backends available

While MinIO is an obvious choice in the Kubernetes world, it’s not the only S3-compatible backend that K8ssandra can use. K8ssandra has supported AWS S3 and Google Cloud Storage as Medusa backends since version 1.0.0, and a variety of other solutions can work as well, whether running on premises (including Ceph, Cloudian, Riak S2, and Dell EMC ECS) or in cloud environments (including IBM Cloud Object Storage and OVHcloud Object Storage).
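
To illustrate how little changes between backends, here is what the Medusa section of the values file might look like for AWS S3 instead of MinIO. This is a hypothetical sketch; the region property and bucket name are illustrative placeholders, so check the Medusa documentation for your backend’s exact settings:

medusa:
  enabled: true
  storage: s3
  storage_properties:
      region: us-east-1            # hypothetical region
  bucketName: my-s3-backups        # hypothetical bucket name
  storageSecret: medusa-bucket-key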

Check out the K8ssandra backup/restore documentation for more detailed instructions, and let us know if you have questions; we’d love to help! If you are looking to learn Cassandra, or would like backups handled for you on a managed Cassandra service, head over to the Astra DB website and try the free tier.
