# OpenClaw (formerly Clawdbot) Research Research notes on OpenClaw's architecture, API, and automation patterns for integration with sandbox-agent. ## Overview - **Provider**: Multi-provider (Anthropic, OpenAI, etc. via Pi agent) - **Execution Method**: WebSocket Gateway + HTTP APIs - **Session Persistence**: Session Key (string) + Session ID (UUID) - **SDK**: No official SDK - uses WebSocket/HTTP protocol directly - **Binary**: `clawdbot` (npm global install or local) - **Default Port**: 18789 (WebSocket + HTTP multiplex) ## Architecture OpenClaw is architected differently from other coding agents (Claude Code, Codex, OpenCode, Amp): ``` ┌─────────────────────────────────────┐ │ Gateway Service │ ws://127.0.0.1:18789 │ (long-running daemon) │ http://127.0.0.1:18789 │ │ │ ┌─────────────────────────────┐ │ │ │ Pi Agent (embedded RPC) │ │ │ │ - Tool execution │ │ │ │ - Model routing │ │ │ │ - Session management │ │ │ └─────────────────────────────┘ │ └─────────────────────────────────────┘ │ ├── WebSocket (full control plane) ├── HTTP /v1/chat/completions (OpenAI-compatible) ├── HTTP /v1/responses (OpenResponses-compatible) ├── HTTP /tools/invoke (single tool invocation) └── HTTP /hooks/agent (webhook triggers) ``` **Key Difference**: OpenClaw runs as a **daemon** that owns the agent runtime. Other agents (Claude, Codex, Amp) spawn a subprocess per turn. OpenClaw is more similar to OpenCode's server model but with a persistent gateway. ## Automation Methods (Priority Order) ### 1. WebSocket Gateway Protocol (Recommended) Full-featured bidirectional control with streaming events. #### Connection Handshake ```typescript // Connect to Gateway const ws = new WebSocket("ws://127.0.0.1:18789"); // First frame MUST be connect request ws.send(JSON.stringify({ type: "req", id: "1", method: "connect", params: { minProtocol: 3, maxProtocol: 3, client: { id: "gateway-client", // or custom client id version: "1.0.0", platform: "linux", mode: "backend" }, role: "operator", scopes: ["operator.admin"], caps: [], auth: { token: "YOUR_GATEWAY_TOKEN" } } })); // Expect hello-ok response // { type: "res", id: "1", ok: true, payload: { type: "hello-ok", ... } } ``` #### Agent Request ```typescript // Send agent turn request const runId = crypto.randomUUID(); ws.send(JSON.stringify({ type: "req", id: runId, method: "agent", params: { message: "Your prompt here", idempotencyKey: runId, sessionKey: "agent:main:main", // or custom session key thinking: "low", // optional: low|medium|high deliver: false, // don't send to messaging channel timeout: 300000 // 5 minute timeout } })); ``` #### Response Flow (Two-Stage) ```typescript // Stage 1: Immediate ack // { type: "res", id: "...", ok: true, payload: { runId, status: "accepted", acceptedAt: 1234567890 } } // Stage 2: Streaming events // { type: "event", event: "agent", payload: { runId, seq: 1, stream: "output", data: {...} } } // { type: "event", event: "agent", payload: { runId, seq: 2, stream: "tool", data: {...} } } // ... // Stage 3: Final response (same id as request) // { type: "res", id: "...", ok: true, payload: { runId, status: "ok", summary: "completed", result: {...} } } ``` ### 2. OpenAI-Compatible HTTP API For simple integration with tools expecting OpenAI Chat Completions. **Enable in config:** ```json5 { gateway: { http: { endpoints: { chatCompletions: { enabled: true } } } } } ``` **Request:** ```bash curl -X POST http://127.0.0.1:18789/v1/chat/completions \ -H "Authorization: Bearer YOUR_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "model": "clawdbot:main", "messages": [{"role": "user", "content": "Hello"}], "stream": true }' ``` **Model Format:** - `model: "clawdbot:"` (e.g., `"clawdbot:main"`) - `model: "agent:"` (alias) ### 3. OpenResponses HTTP API For clients that speak OpenResponses (item-based input, function tools). **Enable in config:** ```json5 { gateway: { http: { endpoints: { responses: { enabled: true } } } } } ``` **Request:** ```bash curl -X POST http://127.0.0.1:18789/v1/responses \ -H "Authorization: Bearer YOUR_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "model": "clawdbot:main", "input": "Hello", "stream": true }' ``` ### 4. Webhooks (Fire-and-Forget) For event-driven automation without maintaining a connection. **Enable in config:** ```json5 { hooks: { enabled: true, token: "webhook-secret", path: "/hooks" } } ``` **Request:** ```bash curl -X POST http://127.0.0.1:18789/hooks/agent \ -H "Authorization: Bearer webhook-secret" \ -H "Content-Type: application/json" \ -d '{ "message": "Run this task", "name": "Automation", "sessionKey": "hook:automation:task-123", "deliver": false, "timeoutSeconds": 120 }' ``` **Response:** `202 Accepted` (async run started) ### 5. CLI Subprocess For simple one-off automation (similar to Claude Code pattern). ```bash clawdbot agent --message "Your prompt" --session-key "automation:task" ``` ## Session Management ### Session Key Format ``` agent:: agent:main:main # Main agent, main session agent:main:subagent:abc # Subagent session agent:beta:main # Beta agent, main session hook:email:msg-123 # Webhook-spawned session global # Legacy global session ``` ### Session Operations (WebSocket) ```typescript // List sessions { type: "req", id: "...", method: "sessions.list", params: { limit: 50, activeMinutes: 120 } } // Resolve session info { type: "req", id: "...", method: "sessions.resolve", params: { key: "agent:main:main" } } // Patch session settings { type: "req", id: "...", method: "sessions.patch", params: { key: "agent:main:main", thinkingLevel: "medium", model: "anthropic/claude-sonnet-4-20250514" }} // Reset session (clear history) { type: "req", id: "...", method: "sessions.reset", params: { key: "agent:main:main" } } // Delete session { type: "req", id: "...", method: "sessions.delete", params: { key: "agent:main:main" } } // Compact session (summarize history) { type: "req", id: "...", method: "sessions.compact", params: { key: "agent:main:main" } } ``` ### Session CLI ```bash clawdbot sessions # List sessions clawdbot sessions --active 120 # Active in last 2 hours clawdbot sessions --json # JSON output ``` ## Streaming Events ### Event Format ```typescript interface AgentEvent { runId: string; // Correlates to request seq: number; // Monotonically increasing per run stream: string; // Event category ts: number; // Unix timestamp (ms) data: Record; // Event-specific payload } ``` ### Stream Types | Stream | Description | |--------|-------------| | `output` | Text output chunks | | `tool` | Tool invocation/result | | `thinking` | Extended thinking content | | `status` | Run status changes | | `error` | Error information | ### Event Categories | Event Type | Payload | |------------|---------| | `assistant.delta` | `{ text: "..." }` | | `tool.start` | `{ name: "Read", input: {...} }` | | `tool.result` | `{ name: "Read", result: "..." }` | | `thinking.delta` | `{ text: "..." }` | | `run.complete` | `{ summary: "..." }` | | `run.error` | `{ error: "..." }` | ## Token Usage / Cost Tracking OpenClaw tracks tokens per response and supports cost estimation. ### In-Chat Commands ``` /status # Session model, context usage, last response tokens, estimated cost /usage off|tokens|full # Toggle per-response usage footer /usage cost # Local cost summary from session logs ``` ### Configuration Token costs are configured per model: ```json5 { models: { providers: { anthropic: { models: [{ id: "claude-sonnet-4-20250514", cost: { input: 3.00, // USD per 1M tokens output: 15.00, cacheRead: 0.30, cacheWrite: 3.75 } }] } } } } ``` ### Programmatic Access Token usage is included in agent response payloads: ```typescript // In final response or streaming events { usage: { inputTokens: 1234, outputTokens: 567, cacheReadTokens: 890, cacheWriteTokens: 123 } } ``` ## Authentication ### Gateway Token ```bash # Environment variable CLAWDBOT_GATEWAY_TOKEN=your-secret-token # Or config file { gateway: { auth: { mode: "token", token: "your-secret-token" } } } ``` ### HTTP Requests ``` Authorization: Bearer YOUR_TOKEN ``` ### WebSocket Connect ```typescript { params: { auth: { token: "YOUR_TOKEN" } } } ``` ## Status Sync ### Health Check ```typescript // WebSocket { type: "req", id: "...", method: "health", params: {} } // HTTP curl http://127.0.0.1:18789/health # Basic health clawdbot health --json # Detailed health ``` ### Status Response ```typescript { ok: boolean; linkedChannel?: string; models?: { available: string[] }; agents?: { configured: string[] }; presence?: PresenceEntry[]; uptimeMs?: number; } ``` ### Presence Events The gateway pushes presence updates to all connected clients: ```typescript // Event { type: "event", event: "presence", payload: { entries: [...], stateVersion: {...} } } ``` ## Comparison with Other Agents | Aspect | OpenClaw | Claude Code | Codex | OpenCode | Amp | |--------|----------|-------------|-------|----------|-----| | **Process Model** | Daemon | Subprocess | Server | Server | Subprocess | | **Protocol** | WebSocket + HTTP | CLI JSONL | JSON-RPC stdio | HTTP + SSE | CLI JSONL | | **Session Resume** | Session Key | `--resume` | Thread ID | Session ID | `--continue` | | **Multi-Turn** | Same session key | Same session ID | Same thread | Same session | Same session | | **Streaming** | WS events + SSE | JSONL | Notifications | SSE | JSONL | | **HITL** | No | No (headless) | No (SDK) | Yes (SSE) | No | | **SDK** | None (protocol) | None (CLI) | Yes | Yes | Closed | ### Key Differences 1. **Daemon Model**: OpenClaw runs as a persistent gateway service, not a per-turn subprocess 2. **Multi-Protocol**: Supports WebSocket, OpenAI-compatible HTTP, OpenResponses, and webhooks 3. **Channel Integration**: Built-in WhatsApp/Telegram/Discord/iMessage support 4. **Node System**: Mobile/desktop nodes can connect for camera, canvas, location, etc. 5. **No HITL**: Like Claude/Codex/Amp, permissions are configured upfront, not interactive ## Integration Patterns for sandbox-agent ### Recommended: Persistent WebSocket Connection ```typescript class OpenClawDriver { private ws: WebSocket; private pending = new Map(); async connect(url: string, token: string) { this.ws = new WebSocket(url); await this.handshake(token); this.ws.on("message", (data) => this.handleMessage(JSON.parse(data))); } async runAgent(params: { message: string; sessionKey?: string; thinking?: string; }): Promise { const runId = crypto.randomUUID(); const events: AgentEvent[] = []; return new Promise((resolve, reject) => { this.pending.set(runId, { resolve, reject, events }); this.ws.send(JSON.stringify({ type: "req", id: runId, method: "agent", params: { message: params.message, sessionKey: params.sessionKey ?? "agent:main:main", thinking: params.thinking, deliver: false, idempotencyKey: runId } })); }); } private handleMessage(frame: GatewayFrame) { if (frame.type === "event" && frame.event === "agent") { const pending = this.pending.get(frame.payload.runId); if (pending) pending.events.push(frame.payload); } else if (frame.type === "res") { const pending = this.pending.get(frame.id); if (pending && frame.payload?.status === "ok") { pending.resolve({ result: frame.payload, events: pending.events }); this.pending.delete(frame.id); } else if (pending && frame.payload?.status === "error") { pending.reject(new Error(frame.payload.summary)); this.pending.delete(frame.id); } // Ignore "accepted" acks } } } ``` ### Alternative: HTTP API (Simpler) ```typescript async function runOpenClawPrompt(prompt: string, sessionKey?: string) { const response = await fetch("http://127.0.0.1:18789/v1/chat/completions", { method: "POST", headers: { "Authorization": `Bearer ${process.env.CLAWDBOT_GATEWAY_TOKEN}`, "Content-Type": "application/json", "x-clawdbot-session-key": sessionKey ?? "automation:sandbox" }, body: JSON.stringify({ model: "clawdbot:main", messages: [{ role: "user", content: prompt }], stream: false }) }); return response.json(); } ``` ## Configuration for sandbox-agent Integration Recommended config for automated use: ```json5 { gateway: { port: 18789, auth: { mode: "token", token: "${CLAWDBOT_GATEWAY_TOKEN}" }, http: { endpoints: { chatCompletions: { enabled: true }, responses: { enabled: true } } } }, agents: { defaults: { model: { primary: "anthropic/claude-sonnet-4-20250514" }, thinking: { level: "low" }, workspace: "${HOME}/workspace" } } } ``` ## Notes - OpenClaw is significantly more complex than other agents due to its gateway architecture - The multi-protocol support (WS, OpenAI, OpenResponses, webhooks) provides flexibility - Session management is richer (labels, spawn tracking, model/thinking overrides) - No SDK means direct protocol implementation is required - The daemon model means connection lifecycle management is important (reconnects, etc.) - Agent responses are two-stage: immediate ack + final result (handle both) - Tool policy filtering is configurable per agent/session/group