mirror of https://github.com/ceph/ceph-ansible.git
dashboard: Add new prometheus alert
We were asked to update our alerting definitions to include a health check for slow OSD ops.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1951664
Signed-off-by: Boris Ranto <branto@redhat.com>

pull/6615/head
parent fc784fc44c
commit 2491d4e004
@@ -105,3 +105,11 @@ groups:
         annotations:
           summary: "OSD(s) with High PG Count"
           description: "This indicates there are some OSDs with high PG count (275+)."
+      - alert: Slow OSD Ops
+        expr: ceph_healthcheck_slow_ops > 0
+        for: 1m
+        labels:
+          severity: page
+        annotations:
+          summary: "Slow OSD Ops"
+          description: "OSD requests are taking too long to process (osd_op_complaint_time exceeded)"
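One way to check the new rule locally is a promtool unit test. The following is a minimal sketch, not part of this commit: the rule file name `ceph_dashboard.yml` and the `instance="mgr-a"` label are assumptions made for illustration, and the expected labels and annotations are copied from the alert definition above.

```yaml
# Hypothetical promtool test file, e.g. slow_ops_test.yml (name is an assumption).
rule_files:
  - ceph_dashboard.yml   # assumed path to the rules file changed by this commit

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # One sample per minute: 0 at t=0, then 1 from t=1m onward.
      # The instance label is made up for the test.
      - series: 'ceph_healthcheck_slow_ops{instance="mgr-a"}'
        values: '0 1 1 1'
    alert_rule_test:
      # The expression is true from t=1m, so with "for: 1m"
      # the alert should be firing by t=3m.
      - eval_time: 3m
        alertname: Slow OSD Ops
        exp_alerts:
          - exp_labels:
              severity: page
              instance: mgr-a
            exp_annotations:
              summary: "Slow OSD Ops"
              description: "OSD requests are taking too long to process (osd_op_complaint_time exceeded)"
```

Assuming both files sit in the same directory, this would be run with `promtool test rules slow_ops_test.yml`.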