Thanos Query Http Request Query Error Rate High critical
Thanos Query {{$labels.job}} is failing to handle {{$value | humanize}}% of "query" requests.
>>>
(
  sum by (job) (rate(http_requests_total{code=~"5..", job=~".*thanos-query.*", handler="query"}[5m]))
/
  sum by (job) (rate(http_requests_total{job=~".*thanos-query.*", handler="query"}[5m]))
) * 100 > 5
The rule computes, for each job, the percentage of query-handler requests that returned an HTTP 5xx response over the last 5 minutes: (sum by (job) (rate(http_requests_total{code=~"5..", handler="query", job=~".*thanos-query.*"}[5m])) / sum by (job) (rate(http_requests_total{handler="query", job=~".*thanos-query.*"}[5m]))) * 100. It fires when that percentage exceeds 5%. In other words, if more than five percent of the query-handler requests served by any Thanos Query job are failing, the alert fires. Because both sums are grouped by job, the ratio is evaluated per job rather than per individual instance.
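For context, the expression could be wired into a Prometheus rule file roughly as sketched here. This is a minimal sketch, not the catalog's official definition: the group name, the CamelCase alert name (derived from the title), and the `for` duration are assumptions, while the expression, severity, and description text come from the entry itself.

```yaml
# Sketch of a Prometheus alerting rule; values marked "assumption" are not from the source.
groups:
  - name: thanos-query                                   # assumption: hypothetical group name
    rules:
      - alert: ThanosQueryHttpRequestQueryErrorRateHigh  # assumption: derived from the title
        # Percentage of 5xx responses among all "query" handler requests, per job.
        expr: |
          (
            sum by (job) (rate(http_requests_total{code=~"5..", job=~".*thanos-query.*", handler="query"}[5m]))
          /
            sum by (job) (rate(http_requests_total{job=~".*thanos-query.*", handler="query"}[5m]))
          ) * 100 > 5
        for: 5m                                          # assumption: hold time is not stated in the source
        labels:
          severity: critical
        annotations:
          description: 'Thanos Query {{ $labels.job }} is failing to handle {{ $value | humanize }}% of "query" requests.'
```

A file like this can be checked with `promtool check rules <file>` before it is loaded.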