Do you have a running OpenShift cluster powering your production microservices and are you worried about etcd data backups? In this guide we show you how to easily back up etcd and push the backup data to an AWS S3 object store. etcd is the key-value store for OpenShift Container Platform, which persists the state of all resource objects.
In OpenShift cluster administration, it is good and recommended practice to back up your cluster's etcd data regularly and store it in a secure location, ideally outside the OpenShift Container Platform environment. This can be an NFS share, a secondary server in your infrastructure, or a cloud environment.
It is also recommended to take etcd backups during non-peak usage hours, as the operation is blocking in nature. Make sure an etcd backup is taken after any OpenShift cluster upgrade, because during cluster restoration an etcd backup taken from the same z-stream release must be used. For example, an OpenShift Container Platform 4.6.3 cluster must use an etcd backup that was taken from 4.6.3.
Step 1: Login to one Master Node in the Cluster
The etcd cluster backup must be performed with a single invocation of the backup script on one master host. Do not take a backup from each master host.
Log in to one master node, either through SSH or a debug session:
# SSH Access
ssh core@<master_node_ip_or_dns_name>
# Debug session
oc debug node/<node_name>
For a debug session you need to change your root directory to the host:
sh-4.6# chroot /host
If the cluster-wide proxy is enabled, be sure that you have exported the NO_PROXY, HTTP_PROXY, and HTTPS_PROXY environment variables.
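A minimal example of exporting these variables, assuming a hypothetical proxy at proxy.example.net:3128 (adjust the values to match your cluster's proxy configuration):
# Hypothetical proxy values - replace with your own
export HTTP_PROXY=http://proxy.example.net:3128
export HTTPS_PROXY=http://proxy.example.net:3128
export NO_PROXY=localhost,127.0.0.1,.cluster.local,.svc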
Step 2: Perform etcd Backup on OpenShift 4.x
Access to the cluster as a user with the cluster-admin role is required to perform this operation.
Before you proceed, check whether the cluster-wide proxy is enabled:
oc get proxy cluster -o yaml
If the proxy is enabled, the httpProxy, httpsProxy, and noProxy fields will have values set.
Run the cluster-backup.sh script to initiate the etcd backup process, passing the path where the backup will be saved:
mkdir /home/core/etcd_backups
sudo /usr/local/bin/cluster-backup.sh /home/core/etcd_backups
Here is my command execution output:
Certificate /etc/kubernetes/static-pod-certs/configmaps/etcd-serving-ca/ca-bundle.crt is missing. Checking in different directory
Certificate /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-serving-ca/ca-bundle.crt found!
found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-13
found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-6
found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-6
found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-7
b056e4cb492c8f855be6b57fc22c202eb2ccf6538b91c06ceb3923a7c6b898b1
etcdctl version: 3.5.9
API version: 3.5
{"level":"info","ts":"2023-08-30T00:27:13.272889Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/home/core/etcd_backups/snapshot_2023-08-30_002712.db.part"}
{"level":"info","ts":"2023-08-30T00:27:13.280339Z","logger":"client","caller":"[email protected]/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2023-08-30T00:27:13.280487Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://192.168.56.11:2379"}
{"level":"info","ts":"2023-08-30T00:27:14.044229Z","logger":"client","caller":"[email protected]/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2023-08-30T00:27:14.167032Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://192.168.56.11:2379","size":"104 MB","took":"now"}
{"level":"info","ts":"2023-08-30T00:27:14.167168Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/home/core/etcd_backups/snapshot_2023-08-30_002712.db"}
Snapshot saved at /home/core/etcd_backups/snapshot_2023-08-30_002712.db
Deprecated: Use `etcdutl snapshot status` instead.
{"hash":1265701168,"revision":442134,"totalKey":9006,"totalSize":103821312}
snapshot db and kube resources are successfully saved to /home/core/etcd_backups
List files in the backup directory:
$ ls -1 /home/core/etcd_backups/
snapshot_2023-08-30_002712.db
static_kuberesources_2023-08-30_002712.tar.gz
$ du -sh /home/core/etcd_backups/*
100M /home/core/etcd_backups/snapshot_2023-08-30_002712.db
72K /home/core/etcd_backups/static_kuberesources_2023-08-30_002712.tar.gz
There will be two files in the backup:
- snapshot_<datetimestamp>.db: This file is the etcd snapshot.
- static_kuberesources_<datetimestamp>.tar.gz: This file contains the resources for the static pods. If etcd encryption is enabled, it also contains the encryption keys for the etcd snapshot.
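The backup output above recommends etcdutl snapshot status over the deprecated etcdctl command. If the etcdutl binary is available in your environment (it ships with etcd 3.5, though on a RHCOS master it may only be present inside the etcd container), you can optionally sanity-check the snapshot before shipping it off the node:
etcdutl snapshot status /home/core/etcd_backups/snapshot_2023-08-30_002712.db --write-out=table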
Step 3: Push Backup to AWS S3 (From Bastion Server)
Log in to the bastion server and copy the backup files from the master node:
scp -r core@serverip:/home/core/etcd_backups ~/
Download the AWS CLI tool:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
Install unzip tool:
sudo yum -y install unzip
Extract the downloaded file:
unzip awscliv2.zip
Install AWS CLI:
$ sudo ./aws/install
You can now run: /usr/local/bin/aws --version
Confirm installation by checking the version:
$ aws --version
aws-cli/2.8.12 Python/3.10.8 Darwin/22.5.0 source/x86_64 prompt/off
Create an S3 bucket for the OpenShift backups:
$ aws s3 mb s3://openshiftbackups
make_bucket: openshiftbackups
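Because this bucket will hold cluster state, you may also want to block all public access to it. This is an optional hardening step, shown here against the bucket created above:
aws s3api put-public-access-block \
  --bucket openshiftbackups \
  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true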
Create an IAM User:
aws iam create-user --user-name backupsonly
Create an AWS policy for the backups user that only allows S3 get, list, and put actions:
cat >aws-s3-uploads-policy.json<<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": "*"
        }
    ]
}
EOF
Create the policy:
aws iam create-policy --policy-name upload-only-policy --policy-document file://aws-s3-uploads-policy.json
Assign AWS Policy to IAM User:
aws iam attach-user-policy --policy-arn arn:aws:iam::<accountid>:policy/upload-only-policy --user-name backupsonly
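To confirm the attachment, you can optionally list the policies attached to the user:
aws iam list-attached-user-policies --user-name backupsonly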
You can now create an access key for an IAM user to test:
$ aws iam create-access-key --user-name backupsonly
{
    "AccessKey": {
        "UserName": "backupsonly",
        "AccessKeyId": "AKIATWFKCYAHF74SCFEP",
        "Status": "Active",
        "SecretAccessKey": "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",
        "CreateDate": "2023-03-16T12:14:39+00:00"
    }
}
Take note of the access key ID and secret access key and use them in the configuration:
aws configure # On OCP Bastion server
Provide:
- AWS Access Key ID
- AWS Secret Access Key
- Default region
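If you prefer not to store the credentials on disk with aws configure, you can instead export them as environment variables for the current session (placeholder values shown, substitute your own):
export AWS_ACCESS_KEY_ID=<access_key_id>
export AWS_SECRET_ACCESS_KEY=<secret_access_key>
export AWS_DEFAULT_REGION=<region>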
Try uploading the files to the S3 bucket:
$ aws s3 cp etcd_backups/ s3://openshiftbackups/etcd --recursive
upload: etcd_backups/static_kuberesources_2023-03-16_134036.tar.gz to s3://openshiftbackups/etcd/static_kuberesources_2023-03-16_134036.tar.gz
upload: etcd_backups/snapshot_2023-03-16_134036.db to s3://openshiftbackups/etcd/snapshot_2023-03-16_134036.db
Confirm:
$ aws s3 ls s3://openshiftbackups/etcd/
2023-03-16 16:00:59 1549340704 snapshot_2023-03-16_134036.db
2023-03-16 16:00:59 77300 static_kuberesources_2023-03-16_134036.tar.gz
Step 4: Automated Backups to AWS S3 (From Bastion Server)
We can write a script that performs the following:
- Login from bastion to master node
- Initiate backup of etcd
- Copy backup data from master node to the bastion server
- Delete backup data on master node
- Copy backup data to the S3 bucket
- Delete local data upon successful upload to S3
Create script file on Bastion server:
vim backup_etcd_s3.sh
Here is the script, which can be modified further for more advanced use cases:
#!/bin/bash
MASTER_NAME="master01.example.net"
USERNAME="core"
BACKUPS_DIR=~/etcd_backups
S3_BUCKET="openshiftbackups/etcd"
# Create the local backups directory if it doesn't exist
[ -d ${BACKUPS_DIR} ] && echo "Directory Exists" || mkdir ${BACKUPS_DIR}
# Log in to the master node and run the backup
ssh ${USERNAME}@${MASTER_NAME} 'mkdir /home/core/etcd_backups' 2>/dev/null
ssh ${USERNAME}@${MASTER_NAME} 'sudo /usr/local/bin/cluster-backup.sh /home/core/etcd_backups'
# Copy the backup data to the bastion and capture the exit status of the copy
scp -r ${USERNAME}@${MASTER_NAME}:/home/core/etcd_backups/* ${BACKUPS_DIR}/
RESULT="$?"
# Clean the etcd backups directory on the master node only if the copy succeeded
if [ ${RESULT} -eq 0 ]; then
    ssh ${USERNAME}@${MASTER_NAME} 'rm -rf /home/core/etcd_backups/*'
fi
# Upload the backup data to AWS S3
aws s3 cp ${BACKUPS_DIR}/ s3://${S3_BUCKET} --recursive
# List bucket contents
aws s3 ls s3://${S3_BUCKET}/
# Clean local backups older than 1 day
find ${BACKUPS_DIR}/ -type f -mtime +1 -delete
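Make the script executable and do a manual test run before scheduling it:
chmod +x backup_etcd_s3.sh
./backup_etcd_s3.sh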
Create a cron job that runs the script daily at 3am:
$ crontab -e
0 3 * * * /path/to/backup_etcd_s3.sh
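If you want a record of each run, you can redirect the script's output to a log file in the crontab entry (the log path here is just an example, pick any location the cron user can write to):
0 3 * * * /path/to/backup_etcd_s3.sh >> $HOME/etcd_backup_s3.log 2>&1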
Conclusion
In this article we've looked at how you can back up OpenShift etcd and push the data to an S3 bucket. In our next guide we will look at how to perform a restore from such a backup.