Welcome to our guide on setting up Persistent Volumes Dynamic Provisioning using GlusterFS and Heketi for your Kubernetes / OpenShift clusters. GlusterFS is a free and open source scalable network filesystem suitable for data-intensive tasks such as cloud storage and media streaming. It utilizes common off-the-shelf hardware. In my setup, I’ve opted to deploy GlusterFS as a hyper-converged service on the Kubernetes nodes. This will unlock the power of dynamically provisioned, persistent GlusterFS volumes in Kubernetes.
We’ll use the gluster-kubernetes project, which provides Kubernetes administrators a mechanism to easily deploy GlusterFS as a native storage service onto an existing Kubernetes cluster. Here, GlusterFS is managed and orchestrated like any other app in Kubernetes. heketi is a RESTful volume management interface for GlusterFS that lets you create and manage Gluster volumes through an API.
Infrastructure Requirements
Below are the basic requirements for the setup.
- There must be at least three nodes
- Each node must have at least one raw block device attached for use by heketi
- Each node must have the following ports opened for GlusterFS communications: 2222 for GlusterFS pod’s sshd, 24007 for GlusterFS Daemon, 24008 for GlusterFS Management, 49152 to 49251 for each brick created on the host.
- The following kernel modules must be loaded:
- dm_snapshot
- dm_mirror
- dm_thin_pool
- Each node must have the mount.glusterfs command available (see the quick check after this list).
- The GlusterFS client version installed on the nodes should be as close as possible to the server version.
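A quick way to confirm the mount helper is present on a node (package installation is covered in Step 2 below):

command -v mount.glusterfs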
Step 1: Setup Kubernetes / OpenShift Cluster
This setup assumes you have a running Kubernetes / OpenShift (OKD) cluster. Refer to our guides on how to quickly spin up a cluster for test or production use.
- Deploy Kubernetes cluster on Rocky / AlmaLinux 8
- Deploy Kubernetes Cluster on Ubuntu 22.04
- Deploy Production Ready Kubernetes Cluster with Ansible & Kubespray
- Setup Kubernetes Cluster on Ubuntu 18.04
Step 2: Install GlusterFS client and configure the firewall
If you’re using a Red Hat based Linux distribution, install the glusterfs-fuse package, which provides the mount.glusterfs command.
sudo yum -y install glusterfs-fuse
For Ubuntu / Debian:
sudo apt install glusterfs-client
Load all the required kernel modules:
for i in dm_snapshot dm_mirror dm_thin_pool; do sudo modprobe $i; done
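modprobe only loads the modules until the next reboot. To have them loaded automatically at boot, you can drop them into a systemd modules-load.d file (the file name glusterfs.conf is my own choice):

cat <<EOF | sudo tee /etc/modules-load.d/glusterfs.conf
dm_snapshot
dm_mirror
dm_thin_pool
EOF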
Check if the modules are loaded.
$ sudo lsmod | egrep 'dm_snapshot|dm_mirror|dm_thin_pool'
dm_thin_pool 66358 0
dm_persistent_data 75269 1 dm_thin_pool
dm_bio_prison 18209 1 dm_thin_pool
dm_mirror 22289 0
dm_region_hash 20813 1 dm_mirror
dm_log 18411 2 dm_region_hash,dm_mirror
dm_snapshot 39103 0
dm_bufio 28014 2 dm_persistent_data,dm_snapshot
dm_mod 124461 5 dm_log,dm_mirror,dm_bufio,dm_thin_pool,dm_snapshot
Check the installed version:
$ glusterfs --version
glusterfs 3.12.2
Also open the required ports on the firewall (CentOS / RHEL / Fedora):
for i in 2222 24007 24008 49152-49251; do
sudo firewall-cmd --add-port=${i}/tcp --permanent
done
sudo firewall-cmd --reload
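Confirm the ports were added:

sudo firewall-cmd --list-ports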
Step 3: Check Kubernetes Cluster status
Verify the Kubernetes installation by making sure all nodes in the cluster are Ready:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 Ready master 146m v1.26.5
worker01 Ready <none> 146m v1.26.5
worker02 Ready <none> 146m v1.26.5
worker03 Ready <none> 146m v1.26.5
To view the exact version of Kubernetes running, use:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"fcf512e2763f3b98bcc8e3fb087cd8cb80f8ca83", GitTreeState:"clean", BuildDate:"2022-08-15T05:48:10Z", GoVersion:"go1.18.4", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.5", GitCommit:"890a139214b4de1f01543d15003b5bda71aae9c7", GitTreeState:"clean", BuildDate:"2023-05-17T14:08:49Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}
Step 4: Add secondary raw disks to your nodes
Each node must have at least one raw block device attached for use by heketi. I’ve added two virtual disks of 50GB each to my Kubernetes nodes.
[worker01 ~]$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdc 253:32 0 50G 0 disk
vdd 253:48 0 50G 0 disk
[worker02 ~]$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdc 253:32 0 50G 0 disk
vdd 253:48 0 50G 0 disk
[worker03 ~]$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdc 253:32 0 50G 0 disk
vdd 253:48 0 50G 0 disk
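heketi expects completely raw devices; leftover partition tables or filesystem signatures from earlier use can make device setup fail. If your disks were used before, you can wipe them first (destructive – double-check the device names; /dev/vdc and /dev/vdd match my setup):

for d in /dev/vdc /dev/vdd; do sudo wipefs --all $d; done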
Step 5: Create a topology file
You must provide the GlusterFS cluster topology information which describes the nodes present in the GlusterFS cluster and the block devices attached to them for use by heketi.
Since I’m running all operations from the Kubernetes master node, let’s pull gluster-kubernetes from GitHub.
sudo yum -y install git vim
git clone https://github.com/gluster/gluster-kubernetes.git
Copy and edit the topology information template.
cd gluster-kubernetes/deploy/
cp topology.json.sample topology.json
This is what I have in my configuration.
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker01"
              ],
              "storage": [
                "10.10.1.193"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdc",
            "/dev/vdd"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker02"
              ],
              "storage": [
                "10.10.1.167"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdc",
            "/dev/vdd"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "worker03"
              ],
              "storage": [
                "10.10.1.178"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/vdc",
            "/dev/vdd"
          ]
        }
      ]
    }
  ]
}
When creating your own topology file:
- Make sure the topology file only lists block devices intended for heketi’s use. heketi needs access to whole block devices (e.g. /dev/vdc, /dev/vdd) which it will partition and format.
- The hostnames array is a bit misleading: manage should be a list of hostnames for the node, while storage should be a list of IP addresses on the node for backend storage communications.
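Before deploying, it is worth confirming your edited file is still valid JSON:

python3 -m json.tool topology.json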
Step 6: Run the deployment script
With the topology file created, you are ready to run the gk-deploy script from a machine with administrative access to your Kubernetes cluster. If not running from the master node, copy the Kubernetes configuration file to ~/.kube/config.
Familiarize yourself with the options available.
./gk-deploy -h
Common options:
-g, --deploy-gluster: Deploy GlusterFS pods on the nodes in the topology that contain brick devices
--ssh-user USER: User to use for SSH commands to GlusterFS nodes. Non-root users must have sudo permissions on the nodes. Default is 'root'.
--user-key USER_KEY: Secret string for general heketi users. This is a required argument.
-l LOG_FILE, --log-file LOG_FILE: Save all output to the specified file.
-v, --verbose: Verbose output
Run the command below to start the deployment of GlusterFS/heketi, replacing MyUserStrongKey and MyAdminStrongKey with your own key values.
./gk-deploy -g \
--user-key MyUserStrongKey \
--admin-key MyAdminStrongKey \
-l /tmp/heketi_deployment.log \
-v topology.json
Press the Y key to accept and start the installation.
Do you wish to proceed with deployment?
[Y]es, [N]o? [Default: Y]: Y
If the deployment was successful, you should get a message:
heketi is now running and accessible via http://10.233.108.5:8080
Pods, services and endpoints are created automatically upon successful deployment. GlusterFS and heketi should now be installed and ready to go.
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
heketi 1/1 1 1 75m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
glusterfs-44jvh 1/1 Running 0 110m
glusterfs-j56df 1/1 Running 0 110m
glusterfs-lttb5 1/1 Running 0 110m
heketi-b4b94d59d-bqmpz 1/1 Running 0 76m
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
heketi ClusterIP 10.233.42.58 <none> 8080/TCP 76m
heketi-storage-endpoints ClusterIP 10.233.41.189 <none> 1/TCP 76m
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 127m
$ kubectl get endpoints
NAME ENDPOINTS AGE
heketi 10.233.108.5:8080 76m
heketi-storage-endpoints 10.10.1.167:1,10.10.1.178:1,10.10.1.193:1 77m
kubernetes 10.10.1.119:6443 127m
Step 7: Install heketi-cli to interact with GlusterFS
The heketi-cli is used to interact with GlusterFS deployed on the Kubernetes cluster. Download the latest release and place the binary in your PATH.
wget https://github.com/heketi/heketi/releases/download/v10.4.0/heketi-client-v10.4.0-release-10.linux.amd64.tar.gz
Extract the downloaded archive. Note that this release archive contains the client only; the server ships as a separate archive.
tar xvf heketi-client-v10.4.0-release-10.linux.amd64.tar.gz
Copy heketi-cli to the /usr/local/bin directory.
sudo cp ./heketi-client/bin/heketi-cli /usr/local/bin
You should now be able to check the heketi-cli version as any user logged in to the server.
$ heketi-cli --version
heketi-cli v10.4.0-release-10
You can set the HEKETI_CLI_SERVER environment variable so that heketi-cli picks up the server URL directly.
export HEKETI_CLI_SERVER=$(kubectl get svc/heketi --template 'http://{{.spec.clusterIP}}:{{(index .spec.ports 0).port}}')
Confirm variable value:
$ echo $HEKETI_CLI_SERVER
http://10.233.108.5:8080
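Since heketi speaks plain REST, you can also hit it directly. Its unauthenticated /hello route makes a handy liveness check and should return a short greeting if the service is reachable:

curl $HEKETI_CLI_SERVER/hello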
Query cluster details
$ heketi-cli cluster list --user admin --secret MyAdminStrongKey
Clusters:
Id:88ed1913182f880ab5eb22ca2f904615 [file][block]
$ heketi-cli cluster info 88ed1913182f880ab5eb22ca2f904615
Cluster id: 88ed1913182f880ab5eb22ca2f904615
Nodes:
1efe9a69341b50b00a0b15f6e7d8c797
2d48f05c7d7d8d1e9f4b4963ef8362e3
cf5753b191eca0b67aa48687c08d4e12
Volumes:
e06893fc6e4f5fa23994432a40877889
Block: true
File: true
If you save the heketi admin user and key as environment variables, you don’t need to pass these options.
$ export HEKETI_CLI_USER=admin
$ export HEKETI_CLI_KEY=MyAdminStrongKey
$ heketi-cli cluster list
Clusters:
Id:5c94db92049afc5ec53455d88f55f6bb [file][block]
$ heketi-cli cluster info 5c94db92049afc5ec53455d88f55f6bb
Cluster id: 5c94db92049afc5ec53455d88f55f6bb
Nodes:
3bd2d62ea6b8b8c87ca45037c7080804
a795092bad48ed91be962c6a351cbf1b
e98fd47bb4811f7c8adaeb572ca8823c
Volumes:
119c23455c894c33e968a1047b474af2
Block: true
File: true
$ heketi-cli node list
Id:75b2696a9e142e6900ee9fd2d1eb56b6 Cluster:23800e4b6bdeebaec4f6c45b17cabf55
Id:9ca47f98eaa60f0e734ab628897160fc Cluster:23800e4b6bdeebaec4f6c45b17cabf55
Id:c43023282eef0f10d4109c68bcdf0f9d Cluster:23800e4b6bdeebaec4f6c45b17cabf55
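Before wiring this into Kubernetes, you can sanity-check provisioning end to end by creating and then deleting a small test volume (the ID passed to delete is whatever create returns):

heketi-cli volume create --size=1 --replica=3
heketi-cli volume list
heketi-cli volume delete <VOLUME_ID>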
View topology info:
$ heketi-cli topology info
Cluster Id: 698754cfaf9642b451c4671f96c46a0b
File: true
Block: true
Volumes:
Nodes:
Node Id: 39e8fb3b09ccfe47d1d3f2d8e8b426c8
State: online
Cluster Id: 698754cfaf9642b451c4671f96c46a0b
Zone: 1
Management Hostnames: worker03
Storage Hostnames: 10.10.1.178
Devices:
Node Id: b9c3ac6737d27843ea0ce69a366de48c
State: online
Cluster Id: 698754cfaf9642b451c4671f96c46a0b
Zone: 1
Management Hostnames: worker01
Storage Hostnames: 10.10.1.193
Devices:
Node Id: c94636a003af0ca82e7be6962149869b
State: online
Cluster Id: 698754cfaf9642b451c4671f96c46a0b
Zone: 1
Management Hostnames: worker02
Storage Hostnames: 10.10.1.167
Devices:
Create a StorageClass for dynamic provisioning. The resturl must point to your heketi endpoint; the heketi Service ClusterIP is a more stable target than a pod IP, since it survives pod restarts.
$ vim gluster-storage-class.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://10.233.108.5:8080"
  restuser: "admin"
  restuserkey: "MyAdminStrongKey"
$ kubectl create -f gluster-storage-class.yaml
storageclass.storage.k8s.io/glusterfs-storage created
$ kubectl get storageclass
NAME PROVISIONER AGE
glusterfs-storage kubernetes.io/glusterfs 18s
$ kubectl describe storageclass.storage.k8s.io/glusterfs-storage
Name: glusterfs-storage
IsDefaultClass: No
Annotations: <none>
Provisioner: kubernetes.io/glusterfs
Parameters: resturl=http://10.233.108.5:8080,restuser=admin,restuserkey=MyAdminStrongKey
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
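One caveat: restuserkey embeds the admin key in plain text in the StorageClass. The kubernetes.io/glusterfs provisioner also accepts a Secret reference through the secretNamespace and secretName parameters; a minimal sketch, assuming a secret in the default namespace (the name heketi-admin-secret is my own choice):

kubectl create secret generic heketi-admin-secret \
  --type="kubernetes.io/glusterfs" \
  --from-literal=key=MyAdminStrongKey

You would then drop restuserkey from the StorageClass parameters and set secretNamespace: "default" and secretName: "heketi-admin-secret" instead.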
Create a PVC:
$ cat gluster-pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: glusterpvc01
spec:
  storageClassName: glusterfs-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
$ kubectl create -f gluster-pvc.yaml
persistentvolumeclaim/glusterpvc01 created
Where:
- glusterfs-storage is the name of the StorageClass created earlier, referenced through spec.storageClassName.
- 1Gi is the amount of storage requested.
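Confirm the claim binds to a dynamically provisioned volume (STATUS should show Bound):

kubectl get pvc glusterpvc01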
To learn how to use dynamic provisioning in your deployments, check the Hello World with GlusterFS Dynamic Provisioning guide.
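Before diving into that guide, a quick way to exercise the claim is a throwaway pod that mounts it (a minimal sketch; the pod name, image and mount path are my own choices):

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gluster-test-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: glusterpvc01
EOF

Once the pod is Running, the file it writes lands on a replicated Gluster volume that heketi provisioned on demand.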