Lesson 4 of 6 · 10 min read

Cost Monitoring & Optimization

AI agents can become expensive quickly, especially when GPT-4-class models run at high volume. OpenClaw provides granular cost tracking and data-driven optimization recommendations.

Token Usage Tracking

Real-Time Cost Calculation

OpenClaw calculates costs per trace based on current model pricing:

Trace: order-processing-agent (tr_x9k2m1)
─────────────────────────────────────────
Model:          gpt-4o (2024-08-06)
Input Tokens:   1,245 × $2.50/1M  = $0.003113
Output Tokens:  389 × $10.00/1M   = $0.003890
Tool Calls:     2 × avg $0.001    = $0.002000
Embedding:      1 × 512 tokens    = $0.000051
─────────────────────────────────────────
Total:                               $0.009054
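
The arithmetic behind this trace can be reproduced with a few lines of Python. This is a minimal sketch, not OpenClaw's implementation: the `trace_cost` helper and the price table are hypothetical, and the embedding price of $0.10 per 1M tokens is an assumption inferred from the 512-token line above.

```python
from decimal import Decimal

ONE_MILLION = Decimal(1_000_000)

# USD per 1M tokens, copied from the example trace.
MODEL_PRICES = {"gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")}}
# Assumption: $0.10 per 1M embedding tokens, inferred from 512 tokens = $0.000051.
EMBEDDING_PRICE = Decimal("0.10")


def trace_cost(model, input_tokens, output_tokens,
               tool_calls=0, avg_tool_cost=Decimal("0"), embedding_tokens=0):
    """Return a per-component cost breakdown in USD for one trace."""
    price = MODEL_PRICES[model]
    parts = {
        "input": input_tokens * price["input"] / ONE_MILLION,
        "output": output_tokens * price["output"] / ONE_MILLION,
        "tools": tool_calls * avg_tool_cost,
        "embedding": embedding_tokens * EMBEDDING_PRICE / ONE_MILLION,
    }
    parts["total"] = sum(parts.values(), Decimal("0"))
    return parts


cost = trace_cost("gpt-4o", 1245, 389, tool_calls=2,
                  avg_tool_cost=Decimal("0.001"), embedding_tokens=512)
print(cost["total"].quantize(Decimal("0.000001")))  # 0.009054
```

`Decimal` is used instead of `float` so that sub-cent amounts add up without binary rounding drift.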

Cost Aggregation

Dimension      Example                          Purpose
─────────────────────────────────────────────────────────────────────
Per agent      Support Agent: EUR 847/month     Agent-level budgets
Per team       Sales Team: EUR 2,100/month      Department allocation
Per project    Onboarding Flow: EUR 340/month   Project ROI
Per customer   Tenant A: EUR 120/month          Tenant billing
Per model      GPT-4o: EUR 3,200/month          Model comparison
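
Conceptually, each aggregation is a group-by over per-trace costs. The sketch below illustrates the idea with hypothetical trace records; the field names (`agent`, `team`, `model`, `cost_eur`) and the `aggregate_costs` helper are assumptions for illustration, not OpenClaw's data model.

```python
from collections import defaultdict

# Hypothetical per-trace cost records; field names are illustrative only.
traces = [
    {"agent": "support-agent", "team": "customer-success", "model": "gpt-4o", "cost_eur": 0.50},
    {"agent": "support-agent", "team": "customer-success", "model": "gpt-4o-mini", "cost_eur": 0.25},
    {"agent": "sales-agent", "team": "sales", "model": "gpt-4o", "cost_eur": 1.25},
]


def aggregate_costs(records, dimension):
    """Sum cost_eur over all records, grouped by the given dimension."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["cost_eur"]
    return dict(totals)


print(aggregate_costs(traces, "agent"))  # {'support-agent': 0.75, 'sales-agent': 1.25}
print(aggregate_costs(traces, "model"))  # {'gpt-4o': 1.75, 'gpt-4o-mini': 0.25}
```

The same records answer every question in the table; only the grouping key changes.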

Cost Allocation per Team/Project

OpenClaw attributes each trace's cost to the cost center set on that trace:

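# `oc` is assumed to be an initialized OpenClaw client; its setup is not shown in this lesson.
with oc.trace("support-agent") as trace:
    # Tag the trace so its cost rolls up to the right team, project, and budget.
    trace.set_cost_center({
        "team": "customer-success",
        "project": "tier-1-support",
        "budget_id": "CS-2026-Q1"
    })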

Budget Management

# budgets.yml
budgets:
  - id: CS-2026-Q1
    name: "Customer Success Q1 2026"
    limit: 5000 EUR
    period: quarterly
    alerts:
      - at: 50%   # EUR 2,500
        channels: [email]
      - at: 80%   # EUR 4,000
        channels: [slack, email]
      - at: 95%   # EUR 4,750
        channels: [slack, pagerduty, email]
      - at: 100%  # EUR 5,000
        action: throttle  # reduce agent throughput
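
To make the threshold semantics concrete, here is a minimal sketch of how such a configuration could be evaluated. The thresholds mirror the YAML above, but the `evaluate_budget` function and its return shape are hypothetical and do not describe OpenClaw's internals.

```python
# Alert thresholds mirror budgets.yml: (fraction of limit, notification channels).
ALERTS = [
    (0.50, ["email"]),
    (0.80, ["slack", "email"]),
    (0.95, ["slack", "pagerduty", "email"]),
]
THROTTLE_AT = 1.00  # at 100% the example config throttles the agent


def evaluate_budget(spent_eur, limit_eur):
    """Return utilization, the channels of the highest alert reached, and the throttle flag."""
    utilization = spent_eur / limit_eur
    reached = [channels for at, channels in ALERTS if utilization >= at]
    return {
        "utilization": utilization,
        "channels": reached[-1] if reached else [],
        "throttle": utilization >= THROTTLE_AT,
    }


status = evaluate_budget(4100, 5000)
print(status["channels"], status["throttle"])  # ['slack', 'email'] False
```

Only the highest threshold reached is reported here; a real system would also need to remember which alerts have already fired to avoid repeating them.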

Budget Alerts

OpenClaw warns proactively:

  • Trend-based — "At current consumption, you'll exceed the budget in 8 days"
  • Anomaly-based — "Token usage 340% above average of last 7 days"
  • Threshold-based — "80% of monthly budget consumed (day 18 of 28)"
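
The three alert types reduce to simple calculations. The following sketch shows one plausible way to compute each figure; the function names and the choice of a plain average as the baseline are assumptions, not a description of how OpenClaw derives its alerts.

```python
from statistics import mean


def days_until_budget_exceeded(spent, limit, recent_daily_spend):
    """Trend-based: project remaining days at the average recent burn rate."""
    burn = mean(recent_daily_spend)
    if burn <= 0:
        return None  # no spend, no projected overrun
    return max(limit - spent, 0) / burn


def percent_above_average(today, history):
    """Anomaly-based: how far today's usage sits above the historical mean, in percent."""
    baseline = mean(history)
    if baseline == 0:
        return None
    return 100 * (today - baseline) / baseline


def utilization_vs_elapsed(spent, limit, day, days_in_period):
    """Threshold-based: budget share consumed versus share of the period elapsed."""
    return spent / limit, day / days_in_period


print(days_until_budget_exceeded(4000, 5000, [125] * 7))  # 8.0
print(percent_above_average(440, [100] * 7))              # 340.0
```

Comparing utilization against elapsed time (for example 80% consumed on day 18 of 28, about 64% of the period) shows whether spending is running ahead of schedule.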

Optimization Recommendations

OpenClaw analyzes usage patterns and provides concrete recommendations:

1. Model Downgrade

Recommendation: Model downgrade for intent-classification
──────────────────────────────────────────────────────────
Current:    gpt-4o ($2.50/$10.00 per 1M tokens)
Recommended: gpt-4o-mini ($0.15/$0.60 per 1M tokens)
Reason:     Accuracy difference only 0.3% for this task
Savings:    ~EUR 420/month (94% reduction for this span)
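
The 94% figure follows directly from the price ratio, since both the input and output prices drop by the same factor. A small sketch to verify it, using the token counts from the earlier example trace as a stand-in (the real token mix of the intent-classification span is not given here):

```python
def cost_per_call(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD for one call, with prices given per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumption: token counts borrowed from the example trace, for illustration only.
current = cost_per_call(1245, 389, 2.50, 10.00)    # gpt-4o
candidate = cost_per_call(1245, 389, 0.15, 0.60)   # gpt-4o-mini
reduction = 1 - candidate / current

print(f"{reduction:.0%}")  # 94%
```

Because 0.15/2.50 and 0.60/10.00 are both 0.06, the reduction is 94% regardless of the input/output token mix.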

2. Prompt Optimization

Recommendation: Prompt compression for support-agent
────────────────────────────────────────────────────
Current:    Average 2,100 input tokens per call
Problem:    System prompt is 1,400 tokens, about half of them redundant
Recommended: Shorten system prompt from 1,400 to 700 tokens (few-shot -> instruction)
Savings:    ~EUR 180/month (33% reduction in input costs)
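
One reading that reproduces the 33% figure: the system prompt shrinks by 700 tokens out of an average of 2,100 input tokens per call. The sketch below checks that arithmetic; the helper name is hypothetical, and it assumes input cost scales linearly with input tokens.

```python
def input_cost_reduction(avg_input_tokens, tokens_removed):
    """Fraction of input cost saved when each call sends tokens_removed fewer tokens."""
    return tokens_removed / avg_input_tokens


# Average call: 2,100 input tokens; the system prompt shrinks by 700 tokens.
reduction = input_cost_reduction(2100, 700)
print(f"{reduction:.0%}")  # 33%
```

Output tokens are unaffected, so the overall saving per call is smaller than the input-side reduction.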

3. Caching

Recommendation: Semantic cache for FAQ queries
──────────────────────────────────────────────
Current:    68% of support queries are recurring
Recommended: Semantic cache with 0.95 similarity threshold
Savings:    ~EUR 560/month (cache hit eliminates LLM call)
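
A semantic cache returns a stored answer when a new query's embedding is close enough to a previously seen one. The sketch below is a minimal in-memory illustration using cosine similarity and the 0.95 threshold from the recommendation; the `SemanticCache` class is hypothetical, and `embed` stands for whatever embedding function you already use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """In-memory cache keyed by embedding similarity (linear scan, illustration only)."""

    def __init__(self, embed, threshold=0.95):
        self.embed = embed          # callable: str -> list[float] (supplied by the caller)
        self.threshold = threshold
        self.entries = []           # list of (embedding, answer) pairs

    def put(self, query, answer):
        self.entries.append((self.embed(query), answer))

    def get(self, query):
        """Return the best cached answer at or above the threshold, else None."""
        vector = self.embed(query)
        best_score, best_answer = 0.0, None
        for cached_vector, answer in self.entries:
            score = cosine_similarity(vector, cached_vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

On a hit, `get` returns the stored answer and the LLM call is skipped; on a miss, the caller queries the model and stores the result with `put`. A production cache would replace the linear scan with a vector index and add expiry for stale answers.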

Cost Dashboard

The cost dashboard shows:

  • Daily Burn Rate — Current daily spend with trend
  • Budget Utilization — Progress bar per budget
  • Cost per Interaction — Average cost per agent interaction
  • Model Cost Comparison — Costs broken down by model
  • Optimization Potential — Estimated savings from recommended actions

Best Practice: Set up cost centers and budgets from the start. Without clear cost allocation, AI spending is hard to attribute and tends to grow unchecked. Teams are often surprised when they first see what individual agents actually cost.