Lesson 4 of 6·9 min read

Metrics & Alerting

Raw data alone doesn't help — you need actionable metrics and intelligent alerting that notifies you before a trend becomes a problem. OpenClaw provides a four-tier system for this.

The Four Core Metrics for AI Agents

1. Latency (Response Time)

  • P50 / P95 / P99 — Percentile-based latency per agent
  • Time-to-First-Token — How quickly does the agent respond?
  • End-to-End Latency — Total duration including tool calls and retrieval

2. Token Usage (Consumption)

  • Input Tokens — Prompt size per request
  • Output Tokens — Response length per request
  • Context Window Utilization — How full is the context window?
  • Cost per Interaction — Cost per agent interaction in EUR

3. Error Rate

  • LLM Errors — Rate limits, timeouts, API errors
  • Agent Errors — Wrong tool calls, loop detection, stuck states
  • Guardrail Violations — Outputs caught by the guardrail system
  • Hallucination Rate — Hallucinations detected through fact-checking

4. Quality Score

  • User Satisfaction — Feedback-based (thumbs up/down, NPS)
  • Task Completion Rate — Successfully completed tasks
  • Alignment Score — Conformity with defined guidelines

Configuring Threshold Alerts

# openclaw-alerts.yml
alerts:
  - name: high-latency
    metric: p95_latency
    threshold: 5000ms
    window: 5m
    severity: warning
    channels: [slack, email]

  - name: error-spike
    metric: error_rate
    threshold: 5%
    window: 10m
    severity: critical
    channels: [slack, pagerduty]

  - name: cost-overrun
    metric: daily_cost
    threshold: 500 EUR
    window: 24h
    severity: warning
    channels: [email]

  - name: alignment-drift
    metric: alignment_score
    threshold: "<0.85"
    window: 1h
    severity: critical
    channels: [slack, pagerduty, email]

Notification Channels

ChannelUse CaseLatency
SlackTeam notificationsSeconds
EmailManagement reports, summariesMinutes
PagerDutyCritical alerts, on-callSeconds
WebhookCustom integrationsSeconds
Microsoft TeamsEnterprise environmentsSeconds

Defining Custom Metrics

Use the SDK to track your own metrics:

oc.metric("customer_sentiment", value=0.82, tags={"agent": "support-v2"})
oc.metric("retrieval_relevance", value=0.91, tags={"index": "knowledge-base"})
oc.metric("compliance_check_passed", value=1, tags={"check": "pii-scan"})

Alert Escalation

OpenClaw supports multi-level escalation:

  1. Warning — Slack message to the team (5 minutes)
  2. Critical — PagerDuty + Slack + email (immediately)
  3. Emergency — Auto-shutdown of affected agent + escalation to management

Best Practice: Start with loose thresholds and tighten them gradually. Too many alerts lead to alert fatigue — then even critical warnings get ignored.