Mixin / Kubelet

Created 5/5/2024
Updated 5/6/2024
Revision 2
Categories
Docker, Web Servers
Grafana Version >=10.4.1
Datasources
Prometheus

Description

This dashboard monitors the health and performance of kubelets across nodes, aggregating core runtime metrics to reveal node-level activity and potential bottlenecks. It covers pod and container lifecycle, volume management, and runtime operation efficiency, using metrics such as kubelet_running_pods, kubelet_runtime_operations_duration_seconds_bucket, and kubelet_pod_start_duration_seconds_count to diagnose latency and error conditions. It also exposes rate and duration metrics for storage operations and the Pod Lifecycle Event Generator (PLEG), which helps identify slow or overly frequent relists and node configuration errors.
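
The latency metrics mentioned above are Prometheus histograms, so a percentile has to be estimated from cumulative bucket counts. In PromQL this is typically done with histogram_quantile over a rate of the _bucket series. The Python sketch below illustrates the linear-interpolation idea behind that estimate. It is a simplified illustration, not the dashboard's actual query; the function name and the sample bucket values are hypothetical.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound_seconds, cumulative_count) pairs,
    where the last upper bound is math.inf (the "+Inf" bucket).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded

    rank = q * total  # how many observations lie at or below the quantile
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                # Empty bucket (only reachable when rank is 0): nothing to interpolate.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical pod start duration buckets: (le, cumulative count)
sample = [(0.5, 10), (1.0, 60), (2.5, 90), (math.inf, 100)]
median = estimate_quantile(0.5, sample)
p95 = estimate_quantile(0.95, sample)
```

With these sample values the median is interpolated inside the 0.5 to 1.0 second bucket, while the 95th percentile falls in the +Inf bucket and is capped at the highest finite bound. This is why coarse bucket boundaries limit how precisely tail latency can be read.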

Source Grafana.com

Used Metrics 25

  • go_goroutines

  • kubelet_cgroup_manager_duration_seconds_bucket

  • kubelet_cgroup_manager_duration_seconds_count

  • kubelet_node_config_error

  • kubelet_node_name

  • kubelet_pleg_relist_duration_seconds_bucket

  • kubelet_pleg_relist_duration_seconds_count

  • kubelet_pleg_relist_interval_seconds_bucket

  • kubelet_pod_start_duration_seconds_bucket

  • kubelet_pod_start_duration_seconds_count

  • kubelet_pod_worker_duration_seconds_bucket

  • kubelet_pod_worker_duration_seconds_count

  • kubelet_running_containers

  • kubelet_running_pods

  • kubelet_runtime_operations_duration_seconds_bucket

  • kubelet_runtime_operations_errors_total

  • kubelet_runtime_operations_total

  • process_cpu_seconds_total

  • process_resident_memory_bytes

  • rest_client_request_duration_seconds_bucket

  • rest_client_requests_total

  • storage_operation_duration_seconds_bucket

  • storage_operation_duration_seconds_count

  • storage_operation_errors_total

  • volume_manager_total_volumes
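
Several of the metrics above are counters (the _total and _count series), which are read as per-second rates rather than raw values. As a rough sketch of how an error ratio could be derived from kubelet_runtime_operations_errors_total and kubelet_runtime_operations_total, the Python below computes a rate from two samples and divides the two rates. The helper names and the sample values are hypothetical, and the counter-reset handling is a simplification of what Prometheus does.

```python
def counter_rate(first, last, window_seconds):
    """Per-second increase of a counter between two samples."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    delta = last - first
    if delta < 0:
        # The counter went backwards, e.g. after a kubelet restart.
        # Simplification: treat the latest value as the increase since the reset.
        delta = last
    return delta / window_seconds


def error_ratio(error_rate, total_rate):
    """Fraction of operations that failed; 0.0 when there was no traffic."""
    if total_rate <= 0:
        return 0.0
    return error_rate / total_rate


# Hypothetical samples taken 300 seconds apart
ops_rate = counter_rate(first=1200, last=1500, window_seconds=300)
err_rate = counter_rate(first=30, last=45, window_seconds=300)
ratio = error_ratio(err_rate, ops_rate)
```

Here one operation per second with 0.05 errors per second gives an error ratio of about 5%. Guarding the zero-traffic case avoids a division by zero on idle nodes.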
