Monitoring ZFS with Grafana
Date: 2026-02-08
Host: grafana
Component: Prometheus + Grafana + exporters + ZFS
Scope: ZFS monitoring
ZFS / Zpool Metrics for Prometheus (node_exporter + textfile collector on FreeBSD)
This guide sets up zpool capacity metrics (zpool list) for Prometheus/Grafana using
node_exporter’s textfile collector on FreeBSD.
This complements (not replaces) zfs_exporter, which provides ARC/internal ZFS metrics but not pool capacity.
What I Got Working
Metrics exposed to Prometheus:
zpool_size_bytes{pool="zroot"}
zpool_alloc_bytes{pool="zroot"}
zpool_free_bytes{pool="zroot"}
zpool_capacity_ratio{pool="zroot"}
These map directly to:
zpool list
1. Ensure node_exporter is enabled with textfile collector
/etc/rc.conf
node_exporter_enable="YES"
node_exporter_args="--collector.textfile.directory=/var/db/node_exporter/textfile_collector"
Restart:
service node_exporter restart
Create the directory:
mkdir -p /var/db/node_exporter/textfile_collector
2. Create the Zpool Prometheus script
Save as:
/usr/local/bin/zpool_prom.sh
#!/bin/sh
OUT="/var/db/node_exporter/textfile_collector/zpool.prom.$$"
FINAL="/var/db/node_exporter/textfile_collector/zpool.prom"
# zpool list -Hp columns:
# name size alloc free ckpoint expandsz frag cap dedup health altroot
zpool list -Hp | awk '
BEGIN {
print "# HELP zpool_size_bytes ZFS pool total size in bytes"
print "# TYPE zpool_size_bytes gauge"
print "# HELP zpool_alloc_bytes ZFS pool allocated bytes"
print "# TYPE zpool_alloc_bytes gauge"
print "# HELP zpool_free_bytes ZFS pool free bytes"
print "# TYPE zpool_free_bytes gauge"
print "# HELP zpool_capacity_ratio ZFS pool capacity used as ratio (0-1)"
print "# TYPE zpool_capacity_ratio gauge"
}
{
pool=$1; size=$2; alloc=$3; free=$4; cap=$8;
gsub(/%/,"",cap);
printf "zpool_size_bytes{pool=\"%s\"} %s\n", pool, size;
printf "zpool_alloc_bytes{pool=\"%s\"} %s\n", pool, alloc;
printf "zpool_free_bytes{pool=\"%s\"} %s\n", pool, free;
printf "zpool_capacity_ratio{pool=\"%s\"} %.6f\n", pool, cap/100.0;
}' > "$OUT" && mv "$OUT" "$FINAL"
Make executable:
chmod +x /usr/local/bin/zpool_prom.sh
Test once:
/usr/local/bin/zpool_prom.sh
cat /var/db/node_exporter/textfile_collector/zpool.prom
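If you want to exercise the awk logic without a live pool, you can feed it a fabricated zpool list -Hp row. The numbers below are made up for illustration; note that with -p, CAP is already printed as a bare number, so the gsub is just a safety net.

```shell
#!/bin/sh
# Fabricated `zpool list -Hp` row.
# Columns: name size alloc free ckpoint expandsz frag cap dedup health altroot
metrics=$(printf 'zroot\t1000\t400\t600\t-\t-\t5\t40\t1.00x\tONLINE\t-\n' | awk '
{
    pool=$1; size=$2; alloc=$3; free=$4; cap=$8;
    gsub(/%/,"",cap);
    printf "zpool_size_bytes{pool=\"%s\"} %s\n", pool, size;
    printf "zpool_alloc_bytes{pool=\"%s\"} %s\n", pool, alloc;
    printf "zpool_free_bytes{pool=\"%s\"} %s\n", pool, free;
    printf "zpool_capacity_ratio{pool=\"%s\"} %.6f\n", pool, cap/100.0;
}')
echo "$metrics"
```

With the sample row above, the capacity ratio line should come out as 0.400000.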
3. Schedule the script (cron)
crontab -e
Add:
* * * * * /usr/local/bin/zpool_prom.sh >/dev/null 2>&1
This updates metrics once per minute.
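One failure mode of the textfile approach: if cron silently stops, the last .prom file keeps being scraped and the metrics go stale without any scrape error. A minimal staleness check, sketched below; the 2-minute threshold is an assumption, and the demo runs against a freshly created temp file rather than the real path.

```shell
#!/bin/sh
# Prints "stale" if the given file is missing or has not been rewritten in
# the last 2 minutes (i.e. the cron job stopped firing), "fresh" otherwise.
is_stale() {
    if [ ! -f "$1" ] || [ -n "$(find "$1" -mmin +2 2>/dev/null)" ]; then
        echo stale
    else
        echo fresh
    fi
}
# Demo against a file created just now, so the expected answer is "fresh".
# In production, point it at /var/db/node_exporter/textfile_collector/zpool.prom
tmp=$(mktemp)
status=$(is_stale "$tmp")
echo "$status"
rm -f "$tmp"
```

Alternatively, node_exporter itself exposes node_textfile_mtime_seconds per .prom file, so a PromQL-side check on scrape-file age is also possible.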
4. Verify node_exporter exposes the metrics
curl -s http://127.0.0.1:9100/metrics | egrep '^zpool_(size|alloc|free|capacity)_'
You should see:
zpool_size_bytes{pool="zroot"} ...
zpool_alloc_bytes{pool="zroot"} ...
zpool_free_bytes{pool="zroot"} ...
zpool_capacity_ratio{pool="zroot"} ...
If not, node_exporter is not reading the textfile directory you configured.
5. Prometheus scrape config
Example prometheus.yml:
scrape_configs:
- job_name: node
static_configs:
- targets:
- clemente:9100
Reload Prometheus.
6. Grafana Queries (zpool list equivalent)
Use these PromQL queries:
Pool size
zpool_size_bytes
Allocated
zpool_alloc_bytes
Free
zpool_free_bytes
Capacity %
100 * zpool_capacity_ratio
Units:
- Bytes → IEC (GiB)
- Capacity → Percent (0–100)
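With the capacity ratio in Prometheus, an alerting rule can fire before a pool fills up. A sketch of a Prometheus rule-file entry; the group name, 0.80 threshold, and 15m hold time are assumptions, not part of this setup:

```yaml
groups:
  - name: zfs-pools
    rules:
      - alert: ZpoolNearlyFull
        expr: zpool_capacity_ratio > 0.80
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: 'Pool {{ $labels.pool }} is over 80% full'
```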
Notes
- node_filesystem_* is not equivalent to zpool list: node_filesystem_* shows the dataset/mountpoint view (df/statfs).
- This setup exposes real pool-level capacity.
- zfs_exporter is useful for ARC/cache telemetry, not pool capacity.
Per‑Jail ZFS Usage (Bastille jails)
This section adds per‑jail disk usage using the ZFS datasets that back each jail:
<pool>/bastille/jails/<jail>/root. This reports REFER (actual referenced data),
USED (includes snapshots), and AVAIL per jail.
What You’ll Get (per jail)
zfs_jail_refer_bytes{jail="<jail>",dataset="<dataset>"}
zfs_jail_used_bytes{jail="<jail>",dataset="<dataset>"}
zfs_jail_avail_bytes{jail="<jail>",dataset="<dataset>"}
These map directly to:
zfs list -Hp -o name,used,avail,refer
1) Create the per‑jail exporter script
Save as:
/usr/local/bin/zfs_jails_prom.sh
#!/bin/sh
OUT="/var/db/node_exporter/textfile_collector/zfs_jails.prom.$$"
FINAL="/var/db/node_exporter/textfile_collector/zfs_jails.prom"
# Export per‑jail root datasets:
# <pool>/bastille/jails/<jail>/root
zfs list -Hp -o name,used,avail,refer | awk '
BEGIN {
print "# HELP zfs_jail_used_bytes ZFS dataset USED bytes for the jail root dataset (includes snapshots)"
print "# TYPE zfs_jail_used_bytes gauge"
print "# HELP zfs_jail_avail_bytes ZFS dataset AVAIL bytes for the jail root dataset"
print "# TYPE zfs_jail_avail_bytes gauge"
print "# HELP zfs_jail_refer_bytes ZFS dataset REFER bytes for the jail root dataset (referenced data)"
print "# TYPE zfs_jail_refer_bytes gauge"
}
$1 ~ /\/bastille\/jails\/[^\/]+\/root$/ {
ds=$1; used=$2; avail=$3; refer=$4;
jail=ds;
sub(/^.*\/bastille\/jails\//, "", jail);
sub(/\/root$/, "", jail);
printf "zfs_jail_used_bytes{jail=\"%s\",dataset=\"%s\"} %s\n", jail, ds, used;
printf "zfs_jail_avail_bytes{jail=\"%s\",dataset=\"%s\"} %s\n", jail, ds, avail;
printf "zfs_jail_refer_bytes{jail=\"%s\",dataset=\"%s\"} %s\n", jail, ds, refer;
}' > "$OUT" && mv "$OUT" "$FINAL"
Make executable and test:
chmod +x /usr/local/bin/zfs_jails_prom.sh
/usr/local/bin/zfs_jails_prom.sh
tail -n 20 /var/db/node_exporter/textfile_collector/zfs_jails.prom
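The dataset filter can also be exercised with fabricated zfs list -Hp rows; "web1" is a hypothetical jail name, and the second row is there only to show that non-jail datasets are skipped.

```shell
#!/bin/sh
# Two fabricated `zfs list -Hp -o name,used,avail,refer` rows; only the
# jail root dataset should produce a metric line.
metrics=$(printf 'zroot/bastille/jails/web1/root\t1000\t5000\t900\nzroot/usr/home\t10\t5000\t10\n' | awk '
$1 ~ /\/bastille\/jails\/[^\/]+\/root$/ {
    ds=$1; jail=ds;
    sub(/^.*\/bastille\/jails\//, "", jail);
    sub(/\/root$/, "", jail);
    printf "zfs_jail_used_bytes{jail=\"%s\",dataset=\"%s\"} %s\n", jail, ds, $2;
}')
echo "$metrics"
```

Expected: exactly one line, with jail="web1" and the full dataset path as the dataset label.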
2) Schedule the script (cron)
* * * * * /usr/local/bin/zfs_jails_prom.sh >/dev/null 2>&1
3) Verify metrics are exposed by node_exporter
curl -s http://127.0.0.1:9100/metrics | egrep '^zfs_jail_(used|avail|refer)_bytes' | head
You should see one line per jail/dataset.
4) Grafana / PromQL examples
Actual jail footprint (REFER):
zfs_jail_refer_bytes{job="node"}
Includes snapshots (USED):
zfs_jail_used_bytes{job="node"}
Percent of pool allocated (approx). The jail series carry extra jail/dataset labels, so the division must restrict matching to the shared labels and use group_left for the many-to-one match:
100 * zfs_jail_refer_bytes{job="node"} / on(instance, job) group_left zpool_alloc_bytes{pool="zroot",job="node"}
Top consumers (bar gauge):
topk(10, zfs_jail_refer_bytes{job="node"})
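A related derived query: per-jail snapshot overhead, i.e. bytes held only by snapshots, is USED minus REFER. Both series carry identical jail/dataset labels, so default one-to-one vector matching works:

```
zfs_jail_used_bytes - zfs_jail_refer_bytes
```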
Notes
- This requires one dataset per jail (you have .../bastille/jails/<jail>/root).
- ZFS cannot attribute pool ALLOC/FREE to jails; this reports dataset usage only.
Troubleshooting
No zpool metrics in /metrics:
- Check node_exporter args: ps auxww | grep '[n]ode_exporter'
- Directory mismatch is the most common failure.
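To confirm which directory the running daemon was actually started with, the flag can be pulled out of the process list. The sed extraction is demonstrated below against a sample command line; the real check would pipe ps axww into the same sed.

```shell
#!/bin/sh
# Extract --collector.textfile.directory=... from a node_exporter command line.
sample='/usr/local/bin/node_exporter --collector.textfile.directory=/var/db/node_exporter/textfile_collector'
dir=$(printf '%s\n' "$sample" | sed -n 's/.*--collector\.textfile\.directory=\([^ ]*\).*/\1/p')
echo "$dir"
```

If the printed directory differs from where the cron scripts write their .prom files, that is your mismatch.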
Wrong capacity %:
- Ensure awk uses $8 for CAP from zpool list -Hp.
Done
You now have accurate, pool-level ZFS metrics in Prometheus/Grafana that match:
zpool list