This article is part of our Smart Infrastructure monitoring series. We have already covered how to Install Prometheus Server on CentOS 7 and how to Install Grafana and InfluxDB on CentOS 7. We run a Ceph cluster in production and had been looking for good tools to monitor it; luckily, we came across Prometheus and Grafana.
Monitoring a Ceph cluster with Prometheus requires a Prometheus exporter that collects metrics from the Ceph cluster and exposes them for Prometheus to scrape. In this guide, we’ll use the DigitalOcean Ceph exporter.
Pre-requisites:
- Installed Prometheus Server.
- Installed Grafana Server.
- Docker installed on the server that will run the Prometheus Ceph exporter. This server must be able to reach the Ceph cluster.
- Working Ceph Cluster
- Access to the Ceph cluster so you can copy the ceph.conf configuration file and the ceph.<user>.keyring file, which the exporter needs to authenticate to your cluster.
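Before going further, it can help to confirm that both Ceph files are in place on the exporter host. The following is a minimal Python sketch, not part of the exporter project: the function name missing_ceph_files is our own, and the user name client.admin is only an assumed example, so substitute the user your keyring belongs to.

```python
from pathlib import Path
from typing import List


def missing_ceph_files(directory: str, user: str) -> List[str]:
    """Return the expected Ceph client files that are absent from `directory`."""
    # The file names follow the pattern used in this guide:
    # ceph.conf and ceph.<user>.keyring
    expected = ["ceph.conf", f"ceph.{user}.keyring"]
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    # "client.admin" is an assumed user name, used here only for illustration.
    missing = missing_ceph_files("/etc/ceph", "client.admin")
    print("missing files:", missing if missing else "none")
```

An empty result means both files were found in /etc/ceph; anything listed still needs to be copied over from the cluster.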
Follow the steps below for a complete guide on how to set this up.
Step 1: Install Prometheus Server and Grafana
Use the following guides to install Prometheus and Grafana.
- Install Prometheus Server on CentOS 7 and Install Grafana and InfluxDB on CentOS 7.
- Install Prometheus Server and Grafana on Ubuntu
- Install Prometheus Server and Grafana on Debian
Step 2: Install Docker on Prometheus Ceph exporter client
Please note that the Prometheus Ceph exporter client needs access to the Ceph cluster network in order to pull cluster metrics. Install Docker on this server using our official Docker installation guide.
Also, install docker-compose.
Step 3: Build Ceph Exporter Docker image
Once Docker Engine is installed and the service is running, you are ready to build the Docker image from the DigitalOcean Ceph exporter project. Install Git first if you don’t have it already. On CentOS, run:
sudo yum -y install git
If you’re using Ubuntu, run:
sudo apt update && sudo apt -y install git
Then clone the project from GitHub:
git clone https://github.com/digitalocean/ceph_exporter.git
Switch to the ceph_exporter directory and build the Docker image:
cd ceph_exporter
docker build -t ceph_exporter .
This will build an image named ceph_exporter. It may take a while depending on your internet and disk write speeds.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ceph_exporter latest 1e3b0082e6d4 3 minutes ago 379MB
Step 4: Start Prometheus ceph exporter client container
Copy the ceph.conf configuration file and the ceph.<user>.keyring file to the /etc/ceph directory, then start the Docker container using the host’s network stack. You can use plain docker commands, docker-compose or systemd to manage the container. For the docker command line tool, run the command below.
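# --net=host shares the host's network stack, so port 9128 is reachable without -p
# ceph_exporter is the image built in Step 3
docker run -it \
-v /etc/ceph:/etc/ceph \
--net=host \
ceph_exporter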
For docker-compose, create the following file:
$ vim docker-compose.yml
# Example usage of exporter in use
version: '2'
services:
  ceph-exporter:
    image: ceph_exporter
    restart: always
    # Host networking exposes port 9128 directly, so no "ports" mapping is needed
    network_mode: "host"
    volumes:
      - /etc/ceph:/etc/ceph
Then start the container using:
$ docker-compose up -d
For systemd, create a service unit file like the one below:
$ sudo vim /etc/systemd/system/ceph_exporter.service
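[Unit]
Description=Manage Ceph exporter service
# Start only after the Docker daemon is available (assumes the unit is named docker.service)
After=docker.service
Requires=docker.service
[Service]
Restart=always
TimeoutStartSec=0
# Remove any stale container first; the leading "-" tells systemd to ignore failures
ExecStartPre=-/usr/bin/docker kill ceph_exporter
ExecStartPre=-/usr/bin/docker rm ceph_exporter
# --net=host already exposes port 9128, so no -p mapping is needed
ExecStart=/usr/bin/docker run \
  --name ceph_exporter \
  -v /etc/ceph:/etc/ceph \
  --net=host \
  ceph_exporter
ExecStop=-/usr/bin/docker kill ceph_exporter
ExecStop=-/usr/bin/docker rm ceph_exporter
[Install]
WantedBy=multi-user.target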
Reload systemd daemon:
sudo systemctl daemon-reload
Start and enable the service:
sudo systemctl enable ceph_exporter
sudo systemctl start ceph_exporter
Check container status:
sudo systemctl status ceph_exporter
If all went well, the service is reported as active (running).
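A running service does not by itself prove that the exporter can talk to the cluster, so it is worth checking that Ceph metrics are actually being served. Here is a minimal Python sketch, assuming the exporter listens on port 9128 and serves its metrics at the /metrics path; adjust the URL if your setup differs. The function names fetch_metrics and metric_names are our own and not part of any library.

```python
import urllib.request
from typing import List


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the exporter's metrics page as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str, prefix: str = "ceph_") -> List[str]:
    """Return the sorted, unique metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        # A sample line looks like: name{label="value"} 1.0  or  name 1.0
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)
```

On the exporter host you could call metric_names(fetch_metrics("http://127.0.0.1:9128/metrics")). An empty list suggests the exporter is up but is not returning Ceph data, which usually points back to ceph.conf or the keyring.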
Step 5: Open port 9128 on the firewall
I use firewalld since this is a CentOS 7 server. Allow access to port 9128 from your trusted network.
sudo firewall-cmd --permanent \
--add-rich-rule='rule family="ipv4" source address="192.168.10.0/24" port protocol="tcp" port="9128" accept'
sudo firewall-cmd --reload
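The source address in the rich rule is a network in CIDR notation, so only hosts inside that range, including your Prometheus server, will be able to reach port 9128. If you are unsure whether a given host falls inside the range, the short Python sketch below checks it with the standard ipaddress module. The function name is_trusted is our own, and the addresses are only examples.

```python
import ipaddress


def is_trusted(client_ip: str, trusted_network: str = "192.168.10.0/24") -> bool:
    """Return True if `client_ip` belongs to `trusted_network` (CIDR notation)."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(trusted_network)


if __name__ == "__main__":
    print(is_trusted("192.168.10.25"))  # True: inside 192.168.10.0/24
    print(is_trusted("192.168.11.5"))   # False: outside the trusted range
```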
Test access with the nc or telnet command.
$ telnet 127.0.0.1 9128
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
$ nc -v 127.0.0.1 9128
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:9128.
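If neither telnet nor nc is installed, the same connectivity test can be scripted. This is a minimal Python sketch using only the standard library; the function name port_is_open is our own. Run it from the Prometheus server against the exporter host to confirm that the firewall rule works from the outside as well.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

For example, port_is_open("127.0.0.1", 9128) mirrors the local telnet test above, while passing the exporter host's IP address from another machine exercises the firewall rule.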
Step 6: Configure Prometheus scrape target
We need to add a scrape job with a static_configs entry for the ceph exporter container we created. Edit the file /etc/prometheus/prometheus.yml on your Prometheus server so that it looks like the example below.
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'ceph-exporter'
    static_configs:
      - targets: ['ceph-exporter-node-ip:9128']
        labels:
          alias: ceph-exporter
Replace ceph-exporter-node-ip with the IP address of your ceph exporter host. Remember to restart the Prometheus service after making the changes:
sudo systemctl restart prometheus
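After the restart, you can confirm that Prometheus has picked up the new job by asking its HTTP API for the list of targets. The Python sketch below is one way to do that, assuming Prometheus listens on localhost:9090 as in the configuration above. The function names fetch_targets and job_health are our own; the /api/v1/targets endpoint and the activeTargets, scrapeUrl and health fields come from the Prometheus HTTP API.

```python
import json
import urllib.request
from typing import Dict


def fetch_targets(prometheus_url: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the JSON body of Prometheus's /api/v1/targets endpoint."""
    url = f"{prometheus_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def job_health(targets_payload: dict, job: str) -> Dict[str, str]:
    """Map each scrape URL belonging to `job` to its reported health."""
    active = targets_payload.get("data", {}).get("activeTargets", [])
    return {
        target["scrapeUrl"]: target["health"]
        for target in active
        if target.get("labels", {}).get("job") == job
    }
```

Calling job_health(fetch_targets("http://localhost:9090"), "ceph-exporter") should return the exporter's scrape URL with a health of "up". An empty result means the job was not loaded, and "down" usually points to a firewall or address problem.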
Step 7: Add Prometheus Data Source to Grafana
Log in to your Grafana dashboard and add a Prometheus data source. You’ll need to provide the following information:
Name: Name given to this data source
Type: The type of data source, in our case this is Prometheus
URL: IP address and port number of the Prometheus server you’re adding.
Access: Whether Grafana reaches the data source through a proxy or directly. Proxy means requests go through the Grafana server; direct means requests are sent from your browser.
Save the settings by clicking the Save & Test button.
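The same data source can also be created through Grafana's HTTP API, which is handy if you set up more than one Grafana server. The following is a minimal Python sketch rather than a definitive implementation: the function names are our own, and the use of basic authentication with an admin account is an assumption about your Grafana setup. The fields in the payload mirror the form fields described above.

```python
import base64
import json
import urllib.request


def datasource_payload(name: str, prometheus_url: str, access: str = "proxy") -> dict:
    """Build the request body for a Prometheus data source."""
    if access not in ("proxy", "direct"):
        raise ValueError("access must be 'proxy' or 'direct'")
    return {"name": name, "type": "prometheus", "url": prometheus_url, "access": access}


def create_datasource(grafana_url: str, user: str, password: str, payload: dict) -> dict:
    """POST the payload to Grafana's /api/datasources endpoint using basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

A call could look like create_datasource("http://localhost:3000", "admin", "your-password", datasource_payload("Prometheus", "http://localhost:9090")), where the Grafana address and the credentials are placeholders for your own values.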
Step 8: Import Ceph Cluster Grafana Dashboards
The last step is to import the Ceph Cluster Grafana Dashboards. From my research, I found the following Dashboards by Cristian Calin.
- Ceph Cluster Overview: https://grafana.com/dashboards/917
- Ceph Pools Overview: https://grafana.com/dashboards/926
- Ceph OSD Overview: https://grafana.com/dashboards/923
We will use dashboard IDs 917, 926 and 923 when importing dashboards on Grafana.
Click the plus sign (+) > Import to import a dashboard, then enter the ID of the dashboard you wish to import from the list above.
To view imported dashboards, go to Dashboards and select the name of the dashboard you want to view.
For the OSD and Pools dashboards, you need to select the OSD number or pool name to view its usage and status. The SUSE team maintains similar dashboards at https://github.com/SUSE/grafana-dashboards-ceph
Other Prometheus Monitoring guides:
- How to Monitor Redis Server with Prometheus and Grafana in 5 minutes
- How to Monitor Linux Server Performance with Prometheus and Grafana in 5 minutes
- How to Monitor BIND DNS server with Prometheus and Grafana
- Monitoring MySQL / MariaDB with Prometheus in five minutes
- How to Monitor Apache Web Server with Prometheus and Grafana in 5 minutes