When an AI agent in production makes a critical error, every second counts. OpenClaw provides automated incident detection, standardized shutdown procedures, and safe rollback strategies.
OpenClaw detects incidents automatically from several signals and assigns each one a severity level and an automatic action:
| Severity | Type | Example | Auto Action |
|---|---|---|---|
| P0 — Critical | Agent failure | Agent stops responding | Auto-shutdown |
| P0 — Critical | Data leak | PII in public response | Immediate block |
| P1 — High | Alignment crash | Score drops below 0.5 | Auto-pause |
| P1 — High | Cost explosion | 10x normal consumption | Rate limiting |
| P2 — Medium | Quality drop | Error rate above 10% | Alert + investigation |
| P3 — Low | Performance degradation | Latency above 2x normal | Alert |
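The table can be read as an ordered set of checks, most severe first. The following is a minimal sketch of that reading, not OpenClaw's implementation: the `Signals` fields and the `classify` function are hypothetical names, and only the thresholds come from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Signals:
    """Hypothetical per-agent health metrics; field names are illustrative."""
    responding: bool
    pii_in_response: bool
    alignment_score: float   # 0.0 .. 1.0
    cost_ratio: float        # current consumption / normal consumption
    error_rate: float        # fraction of failed interactions, 0.0 .. 1.0
    latency_ratio: float     # current latency / normal latency


def classify(s: Signals) -> Optional[Tuple[str, str]]:
    """Return (severity, auto_action) for the most severe matching row, or None."""
    if not s.responding:
        return ("P0", "auto-shutdown")
    if s.pii_in_response:
        return ("P0", "immediate block")
    if s.alignment_score < 0.5:
        return ("P1", "auto-pause")
    if s.cost_ratio >= 10:
        return ("P1", "rate limiting")
    if s.error_rate > 0.10:
        return ("P2", "alert + investigation")
    if s.latency_ratio > 2:
        return ("P3", "alert")
    return None


# An agent that is slow and error-prone but otherwise healthy lands on P2,
# because the checks run from most to least severe.
degraded = Signals(True, False, 0.92, 1.0, 0.12, 2.5)
print(classify(degraded))
```

Checking the rows in severity order means a single incident triggers only its most drastic action, which seems to match the intent of the table.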
# incident-detection.yml
detection:
  rules:
    - name: mass-hallucination
      condition: hallucination_rate > 15% over 15m
      severity: P1
      auto_action: pause_agent
      description: "Unusually high hallucination rate"
    - name: loop-detection
      condition: same_tool_call > 10 within single_trace
      severity: P1
      auto_action: kill_trace
      description: "Agent in infinite loop"
    - name: unauthorized-data-access
      condition: data_access outside_policy_boundary
      severity: P0
      auto_action: shutdown_agent
      description: "Data access outside policy"
    - name: cascading-failure
      condition: error_count > 3 agents within 5m
      severity: P0
      auto_action: system_wide_pause
      description: "Cascading failures across multiple agents"
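To make two of these conditions concrete, here is a rough Python sketch of how `loop-detection` and `cascading-failure` could be evaluated. It is an illustration under assumptions, not OpenClaw's rule engine: the function names are hypothetical, "same tool call" is assumed to mean the same tool name within one trace, and `error_count > 3 agents within 5m` is assumed to mean more than three distinct agents reporting errors in a five-minute window.

```python
from collections import Counter
from datetime import datetime, timedelta


def detect_loop(tool_calls, limit=10):
    """Return the first tool called more than `limit` times in one trace, else None.

    `tool_calls` is the ordered list of tool names in a single trace.
    """
    counts = Counter()
    for name in tool_calls:
        counts[name] += 1
        if counts[name] > limit:
            return name
    return None


def is_cascading_failure(errors, now, max_agents=3, window=timedelta(minutes=5)):
    """True if more than `max_agents` distinct agents reported errors within `window`.

    `errors` is an iterable of (agent_id, timestamp) pairs.
    """
    cutoff = now - window
    failing = {agent for agent, ts in errors if cutoff <= ts <= now}
    return len(failing) > max_agents


# Eleven calls to the same tool exceed the limit of ten.
print(detect_loop(["order_lookup"] * 11))
```

Counting distinct agents rather than raw errors keeps one noisy agent from triggering a system-wide pause on its own.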
Graceful Shutdown: support-agent-v3
────────────────────────────────────
1. ✅ New requests rejected (redirect to fallback)
2. ✅ Running interactions completed (max 60s)
3. ✅ Open tool calls terminated
4. ✅ State persisted (for later analysis)
5. ✅ Shutdown event logged
6. ✅ Stakeholders notified
Duration: ~45 seconds
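Steps 2 and 3 of the graceful sequence (let running interactions finish, then terminate what is still open) follow a common drain-with-deadline pattern. The sketch below shows that pattern with Python's `asyncio`; it assumes each interaction is represented as an `asyncio.Task`, and `graceful_shutdown` is a hypothetical name rather than an OpenClaw API.

```python
import asyncio


async def graceful_shutdown(in_flight, drain_timeout=60.0):
    """Wait for running interactions, then cancel whatever is still open.

    `in_flight` is a collection of asyncio.Task objects, one per interaction.
    Returns (completed, cancelled) counts.
    """
    if not in_flight:
        return 0, 0
    done, pending = await asyncio.wait(in_flight, timeout=drain_timeout)
    for task in pending:
        task.cancel()
    # Let cancelled tasks unwind so their cleanup handlers can run.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo():
    fast = asyncio.create_task(asyncio.sleep(0.01))  # finishes within the deadline
    slow = asyncio.create_task(asyncio.sleep(10))    # still running at the deadline
    return await graceful_shutdown({fast, slow}, drain_timeout=0.1)


print(asyncio.run(demo()))  # (1, 1)
```

The demo shortens the 60-second limit to 0.1 seconds so it runs quickly. Rejecting new requests, persisting state, logging, and notifications would wrap around this core and are left out here.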
Emergency Shutdown: support-agent-v3
──────────────────────────────────────
1. ✅ Immediate abort of ALL interactions
2. ✅ All API connections severed
3. ✅ Fallback message to all active users
4. ✅ Emergency event logged
5. ✅ P0 alert to on-call + management
Duration: <5 seconds
# Show current prompt version history
openclaw agent prompt-history support-agent-v3
# Roll back to the previous version
openclaw agent rollback support-agent-v3 --to-version v3.0
# Verify rollback
openclaw test run --suite support-agent-regression --quick
OpenClaw stores every configuration state as a snapshot:
| Timestamp | Version | Change | Score |
|---|---|---|---|
| 2026-02-18 14:00 | v3.1.4 | Temperature: 0.7 → 0.3 | 0.91 |
| 2026-02-17 10:00 | v3.1.3 | New tool: order_lookup | 0.93 |
| 2026-02-15 16:00 | v3.1.2 | Prompt update | 0.89 |
| 2026-02-10 09:00 | v3.1.1 | Model: gpt-4o-mini → gpt-4o | 0.94 |
# Roll back to a specific snapshot
openclaw agent rollback support-agent-v3 --to-snapshot 2026-02-17T10:00
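Resolving a timestamp to a snapshot can be pictured as "the newest snapshot taken at or before that time". The sketch below applies that rule to the history in the table. Whether OpenClaw requires an exact match or resolves timestamps this way is not stated here, so treat `snapshot_at` as a hypothetical helper.

```python
from datetime import datetime

# Snapshot history copied from the table above: (timestamp, version, score).
SNAPSHOTS = [
    (datetime(2026, 2, 18, 14, 0), "v3.1.4", 0.91),
    (datetime(2026, 2, 17, 10, 0), "v3.1.3", 0.93),
    (datetime(2026, 2, 15, 16, 0), "v3.1.2", 0.89),
    (datetime(2026, 2, 10, 9, 0), "v3.1.1", 0.94),
]


def snapshot_at(snapshots, when):
    """Return the newest snapshot taken at or before `when`, or None."""
    candidates = [s for s in snapshots if s[0] <= when]
    return max(candidates, key=lambda s: s[0]) if candidates else None


target = snapshot_at(SNAPSHOTS, datetime(2026, 2, 17, 10, 0))
print(target[1])  # v3.1.3
```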
For system-wide issues, OpenClaw can roll back all agents simultaneously:
# Roll back the whole system to the last stable state
openclaw system rollback --to-last-stable
# Roll back and run regression tests automatically
openclaw system rollback --to-last-stable --verify
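Conceptually, a verified system-wide rollback is "restore everything, then test everything". The following sketch shows that control flow only. `rollback_all`, `restore`, and `run_regression` are hypothetical names supplied by the caller, and `billing-agent` is a made-up second agent; nothing here reflects how `--verify` is implemented internally.

```python
def rollback_all(agents, restore, run_regression):
    """Roll back every agent, then verify each one.

    agents:          iterable of agent names
    restore:         callable(agent) that restores the agent's last stable state
    run_regression:  callable(agent) -> bool, True if the regression suite passes
    Returns the list of agents whose verification failed.
    """
    agents = list(agents)
    for agent in agents:
        restore(agent)
    # Verify only after every agent is restored, so tests see a consistent system.
    return [agent for agent in agents if not run_regression(agent)]


# Stub callbacks for illustration; a real setup would call the deployment tooling.
restored = []
failed = rollback_all(
    ["support-agent-v3", "billing-agent"],
    restore=restored.append,
    run_regression=lambda agent: agent != "billing-agent",
)
print(failed)  # ['billing-agent']
```

Returning the failed agents instead of raising on the first failure lets the on-call engineer see the full picture before deciding what to do next.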
After every P0/P1 incident, OpenClaw generates a post-mortem template:
Post-Mortem: PII Leak in Support Agent
═══════════════════════════════════════
Date: 2026-02-18
Severity: P0 — Critical
Duration: 12 minutes (14:23 – 14:35)
Impact: 3 customer interactions affected
Detected by: OpenClaw PII scanner (automatic)
Resolved by: Auto-shutdown + prompt rollback
Timeline:
14:20 Prompt update v3.1.4 deployed
14:23 First trace with PII in output
14:24 OpenClaw PII alert triggered
14:25 Auto-shutdown initiated
14:27 On-call engineer notified
14:30 Root cause identified (prompt regression)
14:33 Rollback to v3.1.3 performed
14:35 Agent back online, tests passed
Root Cause:
Prompt update v3.1.4 accidentally removed the
instruction for PII avoidance in responses.
Action Items:
☐ Introduce prompt review process (four-eyes principle)
☐ Add PII regression test to test suite
☐ Implement pre-deployment check for PII rules
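The timeline also yields the usual response metrics. As a small worked example, the sketch below derives time to detect, time to contain, and time to resolve from the entries above; the helper names are illustrative, and the metric definitions (measured from the first bad trace) are an assumption rather than something the template prescribes.

```python
from datetime import datetime

# Timeline entries copied from the post-mortem above.
TIMELINE = [
    ("14:20", "Prompt update v3.1.4 deployed"),
    ("14:23", "First trace with PII in output"),
    ("14:24", "OpenClaw PII alert triggered"),
    ("14:25", "Auto-shutdown initiated"),
    ("14:27", "On-call engineer notified"),
    ("14:30", "Root cause identified (prompt regression)"),
    ("14:33", "Rollback to v3.1.3 performed"),
    ("14:35", "Agent back online, tests passed"),
]


def minutes_between(start, end):
    """Minutes from `start` to `end`, both 'HH:MM' on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def time_of(timeline, phrase):
    """Timestamp of the first entry whose text contains `phrase`."""
    for stamp, text in timeline:
        if phrase in text:
            return stamp
    raise KeyError(phrase)


first_bad = time_of(TIMELINE, "First trace")
time_to_detect = minutes_between(first_bad, time_of(TIMELINE, "alert triggered"))
time_to_contain = minutes_between(first_bad, time_of(TIMELINE, "Auto-shutdown"))
time_to_resolve = minutes_between(first_bad, time_of(TIMELINE, "back online"))
print(time_to_detect, time_to_contain, time_to_resolve)  # 1 2 12
```

The 12-minute result matches the duration stated in the post-mortem header (14:23 to 14:35).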
Key takeaway: A good incident response plan is written before the incident, not during it. Configure shutdown procedures and rollback strategies today so you are ready to act tomorrow.