Log Analysis & Debugging

When an agent exhibits unexpected behavior, you need to find the root cause fast. OpenClaw provides a Trace Explorer with step-by-step replay, so you can see what the agent received, what it produced, and what it decided at each step.

Trace Explorer

The Trace Explorer is the centerpiece of the debugging workflow. It offers three views: a waterfall view, prompt/response inspection, and step-by-step replay.

Waterfall View

The waterfall view lists each span in chronological order with its start time, duration, and status:

[12:04:01.000] Trace Start: order-processing-agent
[12:04:01.012] ├── intent-classification       12ms   ✅
[12:04:01.024] ├── order-lookup                 89ms   ✅
[12:04:01.113] ├── inventory-check              45ms   ✅
[12:04:01.158] ├── price-calculation            23ms   ✅
[12:04:01.181] ├── llm-response-generation    1,203ms  ⚠️ (slow)
[12:04:02.384] ├── guardrail-check             140ms   ❌ (blocked)
[12:04:02.524] └── fallback-response             8ms   ✅
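To make the waterfall easier to reason about, here is a minimal Python sketch that parses the text shown above and picks out the slowest and the failed spans. It assumes the waterfall is available as plain text in exactly this line shape; the names `Span`, `parse_waterfall`, and `SPAN_LINE` are illustrative and not part of any OpenClaw API.

```python
import re
from dataclasses import dataclass

# Assumed line shape, taken from the sample waterfall:
# [HH:MM:SS.mmm] ├── span-name   1,203ms  STATUS
SPAN_LINE = re.compile(
    r"\[(?P<start>\d{2}:\d{2}:\d{2}\.\d{3})\]\s+[├└]──\s+"
    r"(?P<name>\S+)\s+(?P<ms>[\d,]+)ms\s+(?P<status>\S+)"
)


@dataclass
class Span:
    start: str        # start time as printed in the waterfall
    name: str
    duration_ms: int
    status: str       # raw status glyph from the waterfall


def parse_waterfall(text: str) -> list[Span]:
    """Return one Span per span line; the 'Trace Start' header is skipped."""
    spans = []
    for line in text.splitlines():
        m = SPAN_LINE.search(line)
        if m:
            duration = int(m["ms"].replace(",", ""))
            spans.append(Span(m["start"], m["name"], duration, m["status"]))
    return spans


WATERFALL = """\
[12:04:01.000] Trace Start: order-processing-agent
[12:04:01.012] ├── intent-classification       12ms   ✅
[12:04:01.024] ├── order-lookup                 89ms   ✅
[12:04:01.113] ├── inventory-check              45ms   ✅
[12:04:01.158] ├── price-calculation            23ms   ✅
[12:04:01.181] ├── llm-response-generation    1,203ms  ⚠️ (slow)
[12:04:02.384] ├── guardrail-check             140ms   ❌ (blocked)
[12:04:02.524] └── fallback-response             8ms   ✅
"""

spans = parse_waterfall(WATERFALL)
slowest = max(spans, key=lambda s: s.duration_ms)
failed = [s.name for s in spans if s.status.startswith("❌")]
total_ms = sum(s.duration_ms for s in spans)

print(f"slowest: {slowest.name} ({slowest.duration_ms} ms of {total_ms} ms)")
print(f"failed:  {failed}")
```

For the sample trace, this reports llm-response-generation as the slowest span (1,203 ms of 1,520 ms total) and guardrail-check as the only failed span, which matches what the waterfall shows at a glance.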

Prompt/Response Inspection

For each LLM call, you can inspect:

System Prompt: What instructions did the agent have?
User Input: What did the user send?
Context: What documents/data were in the context?
Raw Response: What did the LLM return?
Parsed Output: How did the agent interpret the response?
Token Count: Input, output, and total tokens, with costs
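The fields above can be pictured as one record per LLM call. The following Python sketch is only an illustration: the class name, the field names, and the per-1,000-token prices are assumptions, not OpenClaw's actual data model or any provider's real pricing.

```python
from dataclasses import dataclass


@dataclass
class LlmCallRecord:
    """Hypothetical record mirroring the inspection fields listed above."""
    system_prompt: str
    user_input: str
    context: list[str]      # documents/data placed in the context
    raw_response: str       # text returned by the LLM
    parsed_output: dict     # how the agent interpreted the response
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def cost(self, input_price_per_1k: float, output_price_per_1k: float) -> float:
        """Cost in whatever currency the prices are given in."""
        return (self.input_tokens / 1000) * input_price_per_1k + (
            self.output_tokens / 1000
        ) * output_price_per_1k


# Example values are made up for illustration.
call = LlmCallRecord(
    system_prompt="You are an order-processing assistant.",
    user_input="Where is my order?",
    context=["order record", "shipping policy"],
    raw_response='{"intent": "order_status"}',
    parsed_output={"intent": "order_status"},
    input_tokens=1200,
    output_tokens=300,
)

print(call.total_tokens)
print(round(call.cost(input_price_per_1k=0.50, output_price_per_1k=1.50), 4))
```

Keeping the raw response next to the parsed output in one record makes it easy to spot cases where the LLM answered correctly but the agent misread the answer.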

Step-by-Step Replay

The replay function lets you trace an agent interaction step by step:

Click a trace in the Explorer
Select "Replay" in the toolbar
Navigate forward/backward through each span
See the agent's state at each point in time (memory, context, decision)
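Conceptually, replay is a cursor that moves over a recorded sequence of agent states. The sketch below shows that idea in Python under stated assumptions: `Snapshot` and `ReplayCursor` are hypothetical names, and the state fields (memory, context, decision) are taken from the list above rather than from a documented OpenClaw schema.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Agent state captured at one span (hypothetical structure)."""
    span: str
    memory: dict
    context: list
    decision: str


class ReplayCursor:
    """Moves forward and backward through snapshots, clamped at both ends."""

    def __init__(self, snapshots):
        if not snapshots:
            raise ValueError("a trace needs at least one snapshot")
        self._snapshots = list(snapshots)
        self._index = 0

    @property
    def current(self) -> Snapshot:
        return self._snapshots[self._index]

    def forward(self) -> Snapshot:
        self._index = min(self._index + 1, len(self._snapshots) - 1)
        return self.current

    def backward(self) -> Snapshot:
        self._index = max(self._index - 1, 0)
        return self.current


# Sample states are invented for illustration.
snapshots = [
    Snapshot("intent-classification", {}, [], "intent=order_status"),
    Snapshot("order-lookup", {"order_id": "A-1"}, ["order record"], "fetch order"),
    Snapshot("guardrail-check", {"order_id": "A-1"}, ["order record"], "block output"),
]

cursor = ReplayCursor(snapshots)
print(cursor.current.span)
print(cursor.forward().span)
print(cursor.backward().span)
```

Clamping at the first and last snapshot, rather than raising an error, mirrors how a forward/backward control in a UI typically behaves.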

Error Root Cause Analysis

OpenClaw categorizes errors automatically:

Error Type	Description	Common Cause
LLM Timeout	No API response within the timeout	Overload, large prompts
Rate Limit	API limit reached	Too many parallel requests
Hallucination	Fact-check failed	Insufficient context
Guardrail Block	Output blocked by policy	Toxic/unsafe content
Tool Failure	External tool call failed	API down, wrong parameters
Loop Detected	Agent in infinite loop	Missing exit condition
Alignment Drift	Score below threshold	Prompt degradation over time
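The table does not say how each category is detected. As one illustration, a "Loop Detected" condition can be approximated by counting identical steps within a trace. This Python sketch is a simple heuristic under that assumption, not a description of OpenClaw's detector; `detect_loop` and the `max_repeats` threshold are hypothetical.

```python
from collections import Counter


def detect_loop(steps, max_repeats=3):
    """Return the first step seen more than max_repeats times, else None.

    Each step must be hashable, e.g. a (tool_name, arguments) tuple.
    """
    counts = Counter()
    for step in steps:
        counts[step] += 1
        if counts[step] > max_repeats:
            return step
    return None


# Invented example: the agent keeps repeating the same lookup.
looping = [("order-lookup", "id=42")] * 5
healthy = [
    ("intent-classification", ""),
    ("order-lookup", "id=42"),
    ("inventory-check", "sku=7"),
]

print(detect_loop(looping))
print(detect_loop(healthy))
```

A repeated identical call is a strong hint of a missing exit condition, the common cause listed in the table. Legitimate retries can trip the same check, so the threshold needs tuning per agent.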

Automatic Correlation

OpenClaw correlates errors automatically:

Temporally: Which errors occur in clusters?
Causally: Which span triggered the error?
Cross-agent: Does the error affect multiple agents?
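Temporal and cross-agent correlation can be sketched as grouping errors that occur close together in time and then listing the agents in each group. The Python example below is a minimal sketch under stated assumptions: the `ErrorEvent` structure, the 30-second gap, the second agent name, and the sample timestamps are all made up for illustration and do not reflect OpenClaw's actual correlation logic.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class ErrorEvent(NamedTuple):
    at: datetime
    error_type: str
    agent: str


def cluster_by_time(events, max_gap=timedelta(seconds=30)):
    """Group events so that consecutive events in a cluster are at most max_gap apart."""
    clusters = []
    for event in sorted(events, key=lambda e: e.at):
        if clusters and event.at - clusters[-1][-1].at <= max_gap:
            clusters[-1].append(event)
        else:
            clusters.append([event])
    return clusters


def agents_affected(cluster):
    """Sorted list of distinct agents that appear in a cluster."""
    return sorted({e.agent for e in cluster})


# Hypothetical events; the date and the billing-agent name are invented.
events = [
    ErrorEvent(datetime(2024, 1, 1, 12, 4, 2), "Guardrail Block", "order-processing-agent"),
    ErrorEvent(datetime(2024, 1, 1, 12, 4, 10), "LLM Timeout", "billing-agent"),
    ErrorEvent(datetime(2024, 1, 1, 12, 4, 25), "LLM Timeout", "order-processing-agent"),
    ErrorEvent(datetime(2024, 1, 1, 12, 20, 0), "Tool Failure", "order-processing-agent"),
]

clusters = cluster_by_time(events)
for cluster in clusters:
    print(len(cluster), agents_affected(cluster))
```

A cluster that spans several agents suggests a shared dependency, such as an overloaded LLM API, rather than a fault in one agent's prompt.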

Debugging Workflow

The recommended debugging process:

Receive alert: OpenClaw reports anomalous behavior
Identify trace: Find the affected traces via filters
Analyze waterfall: Where in the flow does the problem occur?
Inspect prompt: What does the agent see? What does the LLM return?
Determine root cause: Context issue? Prompt issue? Tool issue?
Deploy fix: Adjust the prompt, fix the tool, or update the guardrail
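The "identify trace" step comes down to narrowing a list of traces by a few attributes. The sketch below shows that filtering logic in Python; `TraceSummary`, `filter_traces`, and the filter fields are assumptions chosen for illustration, since the filters the Trace Explorer actually offers are not listed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceSummary:
    """Hypothetical one-line summary of a trace."""
    trace_id: str
    agent: str
    error_type: Optional[str]   # None when the trace completed without error
    duration_ms: int


def filter_traces(traces, agent=None, error_type=None, min_duration_ms=0):
    """Keep traces matching every filter that is set."""
    return [
        t for t in traces
        if (agent is None or t.agent == agent)
        and (error_type is None or t.error_type == error_type)
        and t.duration_ms >= min_duration_ms
    ]


# Invented sample data.
traces = [
    TraceSummary("t-1", "order-processing-agent", "Guardrail Block", 1520),
    TraceSummary("t-2", "order-processing-agent", None, 310),
    TraceSummary("t-3", "billing-agent", "LLM Timeout", 30000),
]

blocked = filter_traces(traces, agent="order-processing-agent", error_type="Guardrail Block")
slow = filter_traces(traces, min_duration_ms=1000)
print([t.trace_id for t in blocked])
print([t.trace_id for t in slow])
```

Starting from the alert's agent and error type and then sorting the matches by duration is usually enough to pick a representative trace for the waterfall step.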

Practical Tip: Use the bookmark function to save interesting traces. Over time, you'll build a library of typical failure patterns that helps new team members during onboarding.