Etcd Cluster Overview
112,293

Created 11/24/2021
Updated 11/24/2021
Revision 1
Grafana Version >=8.2.5
Datasources
Prometheus

Description

This dashboard monitors the health and performance of an etcd cluster, covering cluster state, network traffic, key-value operations, and resource usage. Leadership stability and availability are tracked through etcd_server_has_leader and etcd_server_is_leader, per-node memory through process_resident_memory_bytes, and disk latency through the etcd_disk_wal_fsync_duration_seconds_bucket and etcd_disk_backend_commit_duration_seconds_bucket histograms. Other panels include a stacked view of key counts (Keys (Stacked), based on etcd_debugging_mvcc_keys_total), a traffic breakdown (client vs. peer, in and out), and disk and compaction metrics for diagnosing latency and storage pressure.
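
The panel queries themselves are not shown on this page, so the following is only a sketch of the kind of PromQL such panels typically run against the metrics named above. The helper names, the 5m rate window, and the grouping by the instance label are all assumptions made for illustration, not the dashboard's actual definitions.

```python
# Illustrative only: these PromQL strings are assumptions, not the
# dashboard's real panel queries (which this page does not show).

def quantile_query(bucket_metric: str, q: float = 0.99, window: str = "5m") -> str:
    """Build a PromQL expression estimating a latency quantile per instance."""
    return (
        f"histogram_quantile({q}, "
        f"sum by (instance, le) (rate({bucket_metric}[{window}])))"
    )


def rate_query(counter_metric: str, window: str = "5m") -> str:
    """Build a PromQL expression for the per-instance rate of a counter."""
    return f"sum by (instance) (rate({counter_metric}[{window}]))"


# Hypothetical panel names mapped to example queries.
QUERIES = {
    # Evaluates to 0 if any member currently reports no leader.
    "all_members_have_leader": "min(etcd_server_has_leader)",
    "leader_changes": rate_query("etcd_server_leader_changes_seen_total"),
    "wal_fsync_p99": quantile_query("etcd_disk_wal_fsync_duration_seconds_bucket"),
    "backend_commit_p99": quantile_query(
        "etcd_disk_backend_commit_duration_seconds_bucket"
    ),
    "client_rx_bytes": rate_query("etcd_network_client_grpc_received_bytes_total"),
    "peer_tx_bytes": rate_query("etcd_network_peer_sent_bytes_total"),
    # Fraction of the configured backend quota occupied by the database file.
    "db_quota_used": (
        "etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes"
    ),
}

if __name__ == "__main__":
    for name, expr in QUERIES.items():
        print(f"{name}: {expr}")
```

Any of these strings can be pasted into the Prometheus expression browser or a Grafana panel to check it against a live cluster; the label names may need adjusting to match the local scrape configuration.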

Source Grafana.com

Used Metrics 32

  • etcd_debugging_mvcc_db_compaction_keys_total

  • etcd_debugging_mvcc_delete_total

  • etcd_debugging_mvcc_keys_total

  • etcd_debugging_mvcc_put_total

  • etcd_debugging_snap_save_total_duration_seconds_sum

  • etcd_disk_backend_commit_duration_seconds_bucket

  • etcd_disk_backend_commit_duration_seconds_sum

  • etcd_disk_backend_defrag_duration_seconds_sum

  • etcd_disk_wal_fsync_duration_seconds_bucket

  • etcd_disk_wal_fsync_duration_seconds_sum

  • etcd_mvcc_db_total_size_in_bytes

  • etcd_mvcc_db_total_size_in_use_in_bytes

  • etcd_network_client_grpc_received_bytes_total

  • etcd_network_client_grpc_sent_bytes_total

  • etcd_network_peer_received_bytes_total

  • etcd_network_peer_sent_bytes_total

  • etcd_server_has_leader

  • etcd_server_health_failures

  • etcd_server_heartbeat_send_failures_total

  • etcd_server_id

  • etcd_server_is_leader

  • etcd_server_leader_changes_seen_total

  • etcd_server_proposals_applied_total

  • etcd_server_proposals_committed_total

  • etcd_server_proposals_failed_total

  • etcd_server_proposals_pending

  • etcd_server_quota_backend_bytes

  • etcd_server_slow_apply_total

  • etcd_server_slow_read_indexes_total

  • grpc_server_handled_total

  • grpc_server_started_total

  • process_resident_memory_bytes
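
Two of the metrics listed above end in _bucket, meaning they are Prometheus histograms exposed as cumulative counts per upper bound. As a rough illustration of how a latency percentile is derived from such buckets, the sketch below interpolates linearly within the bucket that contains the target rank. It is a simplified approximation of the idea behind PromQL's histogram_quantile, not a reimplementation of it; the function name and the sample bucket values are made up for the example.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with the last entry being the +Inf bucket. The lower bound of the first
    bucket is assumed to be 0.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the last finite bound; report
                # that bound, since +Inf carries no usable width.
                return prev_bound
            # Linear interpolation inside the bucket containing the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up fsync observations in seconds: (upper bound, cumulative count).
sample = [(0.001, 0), (0.002, 50), (0.004, 90), (0.008, 99), (math.inf, 100)]

if __name__ == "__main__":
    print(estimate_quantile(0.5, sample))
    print(estimate_quantile(0.95, sample))
```

Because the estimate is interpolated, its accuracy depends on how finely the bucket boundaries divide the range where observations actually fall.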
