Monitoring
Monitoring tracks system health through metrics, logs, and traces, and raises alerts when those signals cross defined thresholds.
The Three Pillars
| Pillar | Tool examples | Purpose |
|---|---|---|
| Metrics | Prometheus, Grafana | Numeric trends and dashboards |
| Logs | ELK, Loki, journalctl | Event investigation |
| Traces | Jaeger, Zipkin | Request flow across services |
Key Metrics
- Latency — response time percentiles (p50, p95, p99)
- Traffic — requests per second
- Errors — error rate percentage
- Saturation — CPU, memory, disk usage
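
As a rough illustration of how the first three signals could be derived from raw request records, here is a minimal sketch. The `Request` type, the `summarize` helper, and the choice to count only 5xx statuses as errors are assumptions made for this example and are not taken from any particular monitoring tool. Percentiles use the simple nearest-rank method. Saturation is left out because it usually comes from host-level gauges (CPU, memory, disk) rather than from request records.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    latency_ms: float
    status: int  # HTTP status code


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests, window_seconds):
    """Compute latency, traffic, and error figures for one time window."""
    if not requests:
        raise ValueError("no requests in window")
    latencies = [r.latency_ms for r in requests]
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "requests_per_second": len(requests) / window_seconds,
        "error_rate_pct": 100 * errors / len(requests),
    }
```

Calling `summarize(requests, window_seconds=60)` on the requests seen in the last minute returns a dictionary that could be exported to a dashboard or compared against alert thresholds. In production, systems such as Prometheus typically estimate percentiles from histogram buckets instead of sorting raw samples, so treat this only as a way to see what the numbers mean.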