O2 Dashboards Catalog

Operational catalog of dashboards as of 2026-06-01.

Objective

Make Grafana the main operations center for the VPS without exposing it publicly, without touching OpenClaw or NPM, and without breaking O1.

Grafana folder

  • provisioned folder: O2 Dashboards
  • single datasource: Prometheus
  • datasource uid: prometheus
  • access: http://127.0.0.1:3000 from the VPS itself or through an SSH tunnel
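
Since Grafana listens only on the VPS loopback address, remote access goes through an SSH local port forward. The helper below is a minimal sketch that builds such a command; the function name and the `ops@vps.example` target are hypothetical, and only port 3000 comes from this catalog.

```python
import shlex


def ssh_tunnel_command(ssh_target: str, local_port: int = 3000,
                       remote_port: int = 3000) -> str:
    """Build an `ssh -N -L` command that forwards a local port to
    Grafana on the VPS loopback interface."""
    # -N: do not run a remote command, only forward ports.
    # -L local:host:remote: listen locally, connect to host:remote from the VPS.
    forward = f"{local_port}:127.0.0.1:{remote_port}"
    return shlex.join(["ssh", "-N", "-L", forward, ssh_target])


# Hypothetical SSH target; replace with the real user and host.
print(ssh_tunnel_command("ops@vps.example"))
```

With the tunnel open, http://127.0.0.1:3000 on the workstation reaches Grafana without publishing the port.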

Dashboards created

1. Executive Overview

  • focus: executive view of availability, baseline health, CPU, RAM, and container footprint
  • key panels: OpenClaw Availability 24h, Core Observability Healthy, CPU Now, RAM Now, Containers Tracked, Resource Trend, Service Response Time, Top Container Memory, Top Container CPU
  • variables: Host, Container, Probe Target
  • metrics used: probe_success, probe_duration_seconds, up, node_cpu_seconds_total, node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes, node_filesystem_avail_bytes, node_filesystem_size_bytes, container_last_seen, container_memory_working_set_bytes, container_cpu_usage_seconds_total

2. Infrastructure Health

  • focus: host health and trends for CPU, RAM, load, network, and disk
  • key panels: Host Uptime, CPU Usage, Memory Usage, Root Disk Usage, CPU and Memory Trend, System Load, Host Network Throughput, Root Disk Free Space
  • variables: Host
  • metrics used: node_boot_time_seconds, node_cpu_seconds_total, node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes, node_load1, node_load5, node_load15, node_network_receive_bytes_total, node_network_transmit_bytes_total, node_filesystem_avail_bytes, node_filesystem_size_bytes
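
The catalog lists the metrics but not the panel expressions. As an illustration only, the sketch below builds PromQL expressions commonly used with these node_exporter metrics. The `instance` label for the Host variable, the `/` mountpoint for the root disk, and the 5m rate window are assumptions, not values taken from the provisioned dashboards.

```python
def infra_queries(instance: str, window: str = "5m") -> dict[str, str]:
    """Return illustrative PromQL expressions for one host.

    Assumes the host is selected by the `instance` label and that the
    root filesystem is mounted at "/"; both are hypothetical here.
    """
    sel = f'instance="{instance}"'
    root = f'{sel},mountpoint="/"'
    return {
        # Seconds since the host booted.
        "uptime_seconds": f"time() - node_boot_time_seconds{{{sel}}}",
        # Busy CPU share: 100% minus the average idle rate across cores.
        "cpu_usage_pct": (
            f'100 * (1 - avg(rate(node_cpu_seconds_total{{{sel},mode="idle"}}[{window}])))'
        ),
        # Used memory share, based on MemAvailable rather than MemFree.
        "memory_usage_pct": (
            f"100 * (1 - node_memory_MemAvailable_bytes{{{sel}}}"
            f" / node_memory_MemTotal_bytes{{{sel}}})"
        ),
        # Used share of the root filesystem.
        "root_disk_usage_pct": (
            f"100 * (1 - node_filesystem_avail_bytes{{{root}}}"
            f" / node_filesystem_size_bytes{{{root}}})"
        ),
    }


for name, expr in infra_queries("vps:9100").items():
    print(f"{name}: {expr}")
```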

3. Docker Operations

  • focus: actual container operation, covering CPU, RAM, network, and the writable layer
  • key panels: Containers Seen, CPU Cores In Use, Working Set Memory, Writable Layer Usage, Top CPU Consumers, Top Memory Consumers, Top Network Activity, Top Writable Layer Usage
  • variables: Container
  • metrics used: container_last_seen, container_cpu_usage_seconds_total, container_memory_working_set_bytes, container_fs_usage_bytes, container_network_receive_bytes_total, container_network_transmit_bytes_total
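
container_cpu_usage_seconds_total is a cumulative counter, so a "top consumers" ranking has to work on its rate of change rather than its raw value. The sketch below shows that idea on two hypothetical scrapes; the function name and sample values are made up, and unlike PromQL's rate() it simply skips containers whose counter went backwards instead of compensating for the reset.

```python
def top_cpu_consumers(prev: dict[str, float], curr: dict[str, float],
                      interval_s: float, k: int = 3) -> list[tuple[str, float]]:
    """Rank containers by CPU cores in use between two counter samples."""
    rates = {}
    for name, value in curr.items():
        if name not in prev or value < prev[name]:
            # New container or counter reset: no usable delta in this sketch.
            continue
        # CPU-seconds consumed per wall-clock second equals cores in use.
        rates[name] = (value - prev[name]) / interval_s
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical counters from two scrapes taken 60 seconds apart.
before = {"app": 10.0, "db": 5.0, "restarted": 50.0}
after = {"app": 40.0, "db": 65.0, "restarted": 1.0}
print(top_cpu_consumers(before, after, interval_s=60))
```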

4. Service Availability

  • focus: availability of OpenClaw, local NPM, local Portainer, and scrape health
  • key panels: OpenClaw HTTPS 24h, NPM Local 24h, Portainer Local 24h, Scrape Targets Healthy, Probe Success, HTTP Status Code, Probe Duration, Health by Job
  • variables: Probe Target
  • metrics used: probe_success, probe_http_status_code, probe_duration_seconds, up
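
probe_success is 1 when a probe succeeds and 0 when it fails, so a 24h availability figure is the mean of its samples over that window (in PromQL this is typically written with avg_over_time over a 24h range; whether these panels use exactly that expression is an assumption). A minimal sketch of the arithmetic, with a hypothetical function name and made-up samples:

```python
def availability_pct(samples: list[int]) -> float:
    """Percentage of successful probes, given 0/1 probe_success samples."""
    if not samples:
        raise ValueError("no probe samples in the window")
    return 100.0 * sum(samples) / len(samples)


# Hypothetical window: three successful probes and one failure.
print(availability_pct([1, 1, 0, 1]))  # 75.0
```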

5. Capacity Planning

  • focus: capacity trends, disk forecast, and Prometheus growth
  • key panels: Root Disk Used Now, Projected Disk Used in 7d, Memory Headroom, Prometheus Active Series, 7d Resource Trend, Disk Free Forecast, Prometheus Growth Signals, Operational Growth Baseline
  • variables: Host
  • metrics used: node_filesystem_avail_bytes, node_filesystem_size_bytes, node_memory_MemAvailable_bytes, node_cpu_seconds_total, prometheus_tsdb_head_series, prometheus_tsdb_head_samples_appended_total, container_last_seen, up
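
A 7-day disk projection of this kind is usually a linear extrapolation of recent free-space samples, which is what PromQL's predict_linear() does with a least-squares fit. The catalog does not state the expression behind these panels, so the following is only a sketch of the technique with hypothetical data.

```python
def forecast_linear(points: list[tuple[float, float]], horizon_s: float) -> float:
    """Least-squares linear forecast of a series.

    points: (timestamp_seconds, value) pairs, oldest first.
    Returns the fitted value horizon_s seconds after the last sample.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    # Slope of the best-fit line, in value units per second.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in points) / var_t
    target_t = points[-1][0] + horizon_s
    return mean_v + slope * (target_t - mean_t)


DAY = 86400
# Hypothetical free space in GiB, shrinking by 10 GiB per day.
samples = [(0, 100.0), (DAY, 90.0), (2 * DAY, 80.0)]
print(forecast_linear(samples, horizon_s=7 * DAY))
```

A forecast at or below zero means the disk is projected to fill within the horizon, which is the signal to act on retention or cleanup.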

Usage rules

  • use Executive Overview as the first screen
  • then drill down into Infrastructure Health, Docker Operations, or Service Availability, depending on the finding
  • use Capacity Planning for decisions about retention, cleanup, or growth
  • do not treat live edits to dashboards as the source of truth; the source of truth remains Git

Use alongside O3

  • Infrastructure Health accompanies alerts on disk, RAM, CPU, and load
  • Docker Operations accompanies alerts on containers and cAdvisor
  • Service Availability accompanies alerts on OpenClaw, NPM, Portainer, and probes
  • Capacity Planning accompanies alerts on TSDB growth and disk pressure

Expected validation

  • Prometheus datasource visible through the authenticated local API
  • O2 Dashboards folder visible in Grafana
  • dashboards visible: Executive Overview, Infrastructure Health, Docker Operations, Service Availability, Capacity Planning
  • all JSON files valid and referencing uid=prometheus
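
The last check can be automated against the dashboard files kept in Git. The sketch below parses a dashboard JSON document and reports every datasource reference whose uid is not `prometheus`. It assumes references appear under a `datasource` key, either as an object with a `uid` field or as a plain string; the function names are hypothetical, and dashboards exported from Grafana may carry other built-in references (for annotations, for example) that a real check would need to allow.

```python
import json


def datasource_refs(node):
    """Yield every value stored under a "datasource" key in the JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "datasource":
                yield value
            else:
                yield from datasource_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from datasource_refs(item)


def unexpected_datasources(dashboard_json: str, expected_uid: str = "prometheus") -> list:
    """Return datasource uids that differ from expected_uid.

    Raises json.JSONDecodeError (a ValueError) if the text is not valid JSON.
    """
    dashboard = json.loads(dashboard_json)
    bad = []
    for ref in datasource_refs(dashboard):
        uid = ref.get("uid") if isinstance(ref, dict) else ref
        if uid != expected_uid:
            bad.append(uid)
    return bad


# Hypothetical, heavily trimmed dashboard document.
sample = json.dumps({
    "title": "Executive Overview",
    "panels": [
        {"datasource": {"type": "prometheus", "uid": "prometheus"},
         "targets": [{"datasource": {"type": "prometheus", "uid": "prometheus"}}]},
        {"datasource": {"type": "prometheus", "uid": "other"}},
    ],
})
print(unexpected_datasources(sample))  # ['other']
```

An empty list for every file means the JSON parses and all references point at the single Prometheus datasource.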