Monitor API
Base path: /api/monitoring. The exceptions are the health probes (/health and /health/*) and the Prometheus endpoint (/api/metrics/prometheus).
All endpoints require authentication except the health probes and the Prometheus endpoint. See API Overview for auth details.
System health
Basic health check
GET /health
Public: no authentication required. Returns immediately:

    {
      "status": "ok",
      "version": "0.5.0",
      "uptime_seconds": 86400
    }

Readiness probe
GET /health/ready
Returns 200 when the daemon is ready to serve requests, and 503 during startup or if critical services are down. Public: use for load balancer health checks and Kubernetes readiness probes.
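As a rough illustration of how a deploy script might wait on the readiness probe, the sketch below polls until it sees a 200. The host and port (localhost:18781) are borrowed from the Prometheus scrape config example on this page, and the retry count and delay are arbitrary choices, not documented defaults.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx, so a 503 from the probe lands here.
        return err.code


def wait_until_ready(probe: Callable[[], int], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call probe() until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            if probe() == 200:
                return True
        except OSError:
            # Connection refused or timed out: the daemon may still be starting.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example call (assumed host and port):
# wait_until_ready(lambda: http_status("http://localhost:18781/health/ready"))
```

Passing the probe in as a callable keeps the retry loop independent of the HTTP client, so it can be exercised without a running daemon.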
Liveness probe
GET /health/live
Returns 200 as long as the process is alive. Public: use for container restart policies.
Detailed system health
GET /api/monitoring/health
Returns full component health, including per-component latency:

    {
      "overall": "healthy",
      "uptime": 604800,
      "version": "0.5.0",
      "components": [
        {"id": "api", "name": "API Server", "status": "healthy", "latency": 45, "last_check": "2026-03-02T12:00:00Z"},
        {"id": "db", "name": "Database", "status": "healthy", "latency": 12, "last_check": "2026-03-02T12:00:00Z"},
        {"id": "workers", "name": "Workers", "status": "healthy", "latency": 5, "last_check": "2026-03-02T12:00:00Z"},
        {"id": "memory", "name": "Memory Store", "status": "degraded", "latency": 150, "message": "High latency detected", "last_check": "2026-03-02T12:00:00Z"},
        {"id": "ai", "name": "AI Service", "status": "healthy", "latency": 800, "last_check": "2026-03-02T12:00:00Z"}
      ],
      "last_check": "2026-03-02T12:00:00Z"
    }

Component detail
GET /api/monitoring/health/components/{id}
Returns the status of a single component by ID (e.g., api, db, workers, memory, ai).
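In the sample response, overall is "healthy" even though the memory component is "degraded", so a client may want to inspect components individually. The sketch below is one way to do that; it assumes only the field names shown in the sample, and the helper name is hypothetical.

```python
from typing import Any


def unhealthy_components(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Return every component whose status is anything other than "healthy"."""
    return [c for c in report.get("components", []) if c.get("status") != "healthy"]


# Trimmed copy of the sample health response.
report = {
    "overall": "healthy",
    "components": [
        {"id": "db", "name": "Database", "status": "healthy", "latency": 12},
        {"id": "memory", "name": "Memory Store", "status": "degraded", "latency": 150,
         "message": "High latency detected"},
    ],
}

for component in unhealthy_components(report):
    # "message" is absent on healthy components in the sample, so default it.
    print(component["id"], component["status"], component.get("message", ""))
```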
Trigger health check
POST /api/monitoring/health/check
Forces an immediate health re-evaluation and returns the updated status.
Usage statistics
Token usage summary
GET /api/monitoring/tokens
Returns current token consumption and limits:

    {
      "total_used": 2500000,
      "total_limit": 5000000,
      "used_today": 150000,
      "limit_today": 500000,
      "by_model": {
        "claude-sonnet-4-6": {
          "model": "claude-sonnet-4-6",
          "input_tokens": 80000,
          "output_tokens": 40000,
          "total_tokens": 120000,
          "cost_estimate": 3.60
        }
      },
      "by_project": [
        {"project_id": "proj_001", "project_name": "API Development", "tokens_used": 100000, "percentage": 66.7}
      ],
      "trend": "stable"
    }

Token usage history
GET /api/monitoring/tokens/history
Query parameters: period (day, week, or month)
Returns a time series for charting:

    [
      {"date": "2026-03-01", "input_tokens": 40000, "output_tokens": 20000, "total_tokens": 60000},
      {"date": "2026-03-02", "input_tokens": 45000, "output_tokens": 22000, "total_tokens": 67000}
    ]

Usage overview
GET /api/monitoring/usage
Returns an aggregate usage summary, including agent activity.
Activity summary
GET /api/monitoring/activity-summary
Returns a summary of recent system events grouped by type.
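The token usage summary and history responses earlier in this section lend themselves to simple budget arithmetic. The sketch below computes the fraction of each limit consumed and the total over a history series, using the sample values. The helper names are hypothetical, and treating a zero limit as "no limit" is an assumption rather than documented behavior.

```python
from typing import Any


def usage_fraction(used: int, limit: int) -> float:
    """Fraction of a limit consumed; 0.0 when the limit is zero (assumed to mean unset)."""
    return used / limit if limit else 0.0


def history_total(points: list[dict[str, Any]]) -> int:
    """Sum total_tokens across a token usage history series."""
    return sum(p["total_tokens"] for p in points)


# Values copied from the sample responses.
summary = {"total_used": 2500000, "total_limit": 5000000,
           "used_today": 150000, "limit_today": 500000}
history = [
    {"date": "2026-03-01", "input_tokens": 40000, "output_tokens": 20000, "total_tokens": 60000},
    {"date": "2026-03-02", "input_tokens": 45000, "output_tokens": 22000, "total_tokens": 67000},
]

overall = usage_fraction(summary["total_used"], summary["total_limit"])  # 0.5
today = usage_fraction(summary["used_today"], summary["limit_today"])    # 0.3
print(f"overall {overall:.0%}, today {today:.0%}, history total {history_total(history)}")
```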
Performance
Performance overview
GET /api/monitoring/performance
Returns current performance metrics (response times, throughput, error rates).
Performance history
GET /api/monitoring/performance/history
Query parameters: period, granularity
Returns historical performance data for trending charts.
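A minimal sketch of building a request URL with these query parameters follows. The host and port come from the Prometheus scrape config example on this page. The value "week" mirrors the period values listed for token history, and "hour" is a placeholder, since the accepted granularity values are not listed here. Authentication is omitted; see API Overview.

```python
from urllib.parse import urlencode

BASE_PATH = "/api/monitoring"


def monitoring_url(host: str, endpoint: str, **query: str) -> str:
    """Join host, base path, and endpoint, appending any non-empty query parameters."""
    url = f"{host.rstrip('/')}{BASE_PATH}/{endpoint.lstrip('/')}"
    params = {key: value for key, value in query.items() if value}
    return f"{url}?{urlencode(params)}" if params else url


# "hour" is a placeholder granularity, not a documented value.
print(monitoring_url("http://localhost:18781", "performance/history",
                     period="week", granularity="hour"))
```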
System metrics
GET /api/monitoring/system
Returns system resource usage (CPU, memory, disk).
Metrics (Prometheus)
GET /api/metrics/prometheus
Public: no authentication required. Returns Prometheus text-format metrics:

    # HELP snippbot_tasks_total Total tasks executed
    # TYPE snippbot_tasks_total counter
    snippbot_tasks_total{status="completed"} 4821
    snippbot_tasks_total{status="failed"} 47

    # HELP snippbot_token_usage_total Total tokens consumed
    # TYPE snippbot_token_usage_total counter
    snippbot_token_usage_total{model="claude-sonnet-4-6",type="input"} 12847291

    # HELP snippbot_active_sessions Current active chat sessions
    # TYPE snippbot_active_sessions gauge
    snippbot_active_sessions 3

    # HELP snippbot_scheduler_jobs_active Active scheduled jobs
    # TYPE snippbot_scheduler_jobs_active gauge
    snippbot_scheduler_jobs_active 10

Prometheus scrape config:

    scrape_configs:
      - job_name: snippbot
        static_configs:
          - targets: ['localhost:18781']
        metrics_path: /api/metrics/prometheus

Alerts
List alerts
GET /api/monitoring/alerts

    {
      "alerts": [
        {
          "id": "alert_abc123",
          "name": "High error rate",
          "condition": "task_error_rate > 0.1",
          "threshold": 0.1,
          "period_minutes": 60,
          "delivery": {"type": "webhook", "url": "https://..."},
          "enabled": true,
          "last_triggered": null
        }
      ]
    }

Get an alert
GET /api/monitoring/alerts/{id}

Acknowledge an alert
POST /api/monitoring/alerts/{id}/acknowledge

Resolve an alert
POST /api/monitoring/alerts/{id}/resolve

Snooze an alert
POST /api/monitoring/alerts/{id}/snooze

Dismiss an alert
DELETE /api/monitoring/alerts/{id}

Analytics
GET /api/monitoring/analytics
Returns aggregated analytics across all monitoring dimensions.
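Returning to the per-alert endpoints listed under Alerts: they share one URL prefix and differ only in HTTP method and suffix, so a client could encode them in a small lookup table. The sketch below does that. The action names and helper are hypothetical, and any request body (for example, a snooze duration) is not described on this page, so none is modeled.

```python
from urllib.parse import quote

ALERTS_PATH = "/api/monitoring/alerts"

# Action name -> (HTTP method, path suffix), following the endpoints listed under Alerts.
ALERT_ACTIONS = {
    "get": ("GET", ""),
    "acknowledge": ("POST", "/acknowledge"),
    "resolve": ("POST", "/resolve"),
    "snooze": ("POST", "/snooze"),
    "dismiss": ("DELETE", ""),
}


def alert_request(alert_id: str, action: str) -> tuple[str, str]:
    """Return the (method, path) pair for an action on a single alert."""
    try:
        method, suffix = ALERT_ACTIONS[action]
    except KeyError:
        raise ValueError(f"unknown alert action: {action!r}") from None
    # Percent-encode the ID so unusual characters cannot alter the path.
    return method, f"{ALERTS_PATH}/{quote(alert_id, safe='')}{suffix}"


print(alert_request("alert_abc123", "snooze"))
print(alert_request("alert_abc123", "dismiss"))
```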