System Health Monitoring

The Prometheus-based monitoring system tracks key performance metrics to give an overview of system health: GPU and CPU utilization, memory consumption, and the most active processes.
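
As a rough illustration of how such metrics might be read back, the sketch below queries the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and parses the result. The metric names are assumptions: they presume node_exporter for CPU and memory and NVIDIA's DCGM exporter for GPU data, neither of which is confirmed by this page. The helper names (`build_query_url`, `parse_instant_vector`, `query_prometheus`) are hypothetical and not part of the monitoring stack.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical PromQL expressions. The metric names assume node_exporter
# (CPU, memory) and the NVIDIA DCGM exporter (GPU); adjust to your exporters.
EXAMPLE_QUERIES = {
    "cpu_utilization_percent":
        '100 * (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])))',
    "memory_used_bytes":
        "node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes",
    "gpu_utilization_percent": "avg(DCGM_FI_DEV_GPU_UTIL)",
}


def build_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Turn an instant-query response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    samples = []
    for series in payload["data"]["result"]:
        # Each sample is [unix_timestamp, "value-as-string"].
        _timestamp, raw_value = series["value"]
        samples.append((series["metric"], float(raw_value)))
    return samples


def query_prometheus(base_url, promql, timeout=5.0):
    """Run one instant query and return the parsed samples."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as response:
        return parse_instant_vector(json.load(response))
```

Assuming a Prometheus server is reachable at `http://localhost:9090` (the default port), `query_prometheus("http://localhost:9090", EXAMPLE_QUERIES["cpu_utilization_percent"])` would return one `(labels, value)` pair per matching series.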

Installation and Configuration

For the source code and setup instructions for the monitoring stack, see the following GitHub repository: SCAILIUM Monitoring