System Monitoring and Auto-Recovery
1.1 Overview
1.2 Objectives of System Monitoring and Auto-Recovery
1.3 Tools for System Monitoring and Auto-Recovery
1.4 Monitoring Code (Prometheus and Grafana)
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'capsurelabs_app'
    static_configs:
      - targets: ['localhost:8080']
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
rule_files:
  - "alert_rules.yml"
1.5 Auto-Recovery Mechanisms
1.5.1 Kubernetes Liveness and Readiness Probes
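As a rough illustration of this mechanism, the manifest below is a minimal sketch of a Pod with both probe types. The Pod name, image, the `/healthz` and `/ready` paths, and all timing values are assumptions made for the example, not settings taken from this document. Port 8080 is borrowed from the Prometheus scrape target and may not match the real application port.

```yaml
# Sketch only: names, image, endpoint paths, and timings are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: capsurelabs-app                      # hypothetical Pod name
spec:
  containers:
    - name: app
      image: example.com/capsurelabs/app:latest   # placeholder image
      ports:
        - containerPort: 8080
      livenessProbe:                         # kubelet restarts the container when this fails
        httpGet:
          path: /healthz                     # assumed health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
        failureThreshold: 3
      readinessProbe:                        # Pod is removed from Service endpoints while this fails
        httpGet:
          path: /ready                       # assumed readiness endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
```

The two probes do different jobs: a failing liveness probe triggers a container restart, while a failing readiness probe only stops traffic from being routed to the Pod until it recovers.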
1.5.2 Ansible for Automated Recovery
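One way this could look is a small playbook that checks a health endpoint and restarts the service when the check does not return 200. The playbook file name, the `app_servers` inventory group, the `capsurelabs_app` service name, and the `/healthz` URL are all hypothetical. Treat this as a sketch rather than the project's actual recovery procedure.

```yaml
# recover_service.yml (hypothetical playbook name)
- name: Restart the application service if it is not healthy
  hosts: app_servers                         # hypothetical inventory group
  become: true
  tasks:
    - name: Check the health endpoint
      ansible.builtin.uri:
        url: http://localhost:8080/healthz   # assumed health endpoint
        status_code: 200
      register: health
      failed_when: false                     # record the result instead of aborting the play

    - name: Restart the service when the check fails
      ansible.builtin.service:
        name: capsurelabs_app                # hypothetical service name
        state: restarted
      when: health.status | default(0) != 200
```

Such a playbook could be run on a schedule or triggered from an alert webhook. Which of these fits depends on the environment and is not specified here.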
1.5.3 AWS Auto-Recovery for EC2 Instances
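EC2 auto-recovery is commonly set up as a CloudWatch alarm on the `StatusCheckFailed_System` metric with the built-in `recover` alarm action. The CloudFormation fragment below is a sketch under that assumption. The resource name, the `InstanceId` parameter, and the period and evaluation values are illustrative and should be tuned for the actual instance.

```yaml
# CloudFormation sketch; resource name, parameter, and thresholds are assumptions.
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  InstanceId:
    Type: AWS::EC2::Instance::Id             # instance to protect, supplied at deploy time
Resources:
  SystemStatusRecoveryAlarm:                 # hypothetical resource name
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Recover the instance when the system status check fails
      Namespace: AWS/EC2
      MetricName: StatusCheckFailed_System
      Dimensions:
        - Name: InstanceId
          Value: !Ref InstanceId
      Statistic: Minimum
      Period: 60
      EvaluationPeriods: 2                   # illustrative; two consecutive failing minutes
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
        - !Sub arn:aws:automate:${AWS::Region}:ec2:recover
```

The recover action moves the instance to healthy underlying hardware while keeping its instance ID and attached EBS volumes. Not every instance type or configuration supports it, so check the instance against the AWS requirements before relying on this.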
