mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-15 15:03:37 +00:00
553 lines
14 KiB
Markdown
553 lines
14 KiB
Markdown
# OpenClaw (formerly Clawdbot) Research
|
|
|
|
Research notes on OpenClaw's architecture, API, and automation patterns for integration with sandbox-agent.
|
|
|
|
## Overview
|
|
|
|
- **Provider**: Multi-provider (Anthropic, OpenAI, etc. via Pi agent)
|
|
- **Execution Method**: WebSocket Gateway + HTTP APIs
|
|
- **Session Persistence**: Session Key (string) + Session ID (UUID)
|
|
- **SDK**: No official SDK - uses WebSocket/HTTP protocol directly
|
|
- **Binary**: `clawdbot` (npm global install or local)
|
|
- **Default Port**: 18789 (WebSocket + HTTP multiplex)
|
|
|
|
## Architecture
|
|
|
|
OpenClaw is architected differently from other coding agents (Claude Code, Codex, OpenCode, Amp):
|
|
|
|
```
|
|
┌─────────────────────────────────────┐
|
|
│ Gateway Service │ ws://127.0.0.1:18789
|
|
│ (long-running daemon) │ http://127.0.0.1:18789
|
|
│ │
|
|
│ ┌─────────────────────────────┐ │
|
|
│ │ Pi Agent (embedded RPC) │ │
|
|
│ │ - Tool execution │ │
|
|
│ │ - Model routing │ │
|
|
│ │ - Session management │ │
|
|
│ └─────────────────────────────┘ │
|
|
└─────────────────────────────────────┘
|
|
│
|
|
├── WebSocket (full control plane)
|
|
├── HTTP /v1/chat/completions (OpenAI-compatible)
|
|
├── HTTP /v1/responses (OpenResponses-compatible)
|
|
├── HTTP /tools/invoke (single tool invocation)
|
|
└── HTTP /hooks/agent (webhook triggers)
|
|
```
|
|
|
|
**Key Difference**: OpenClaw runs as a **daemon** that owns the agent runtime. Other agents (Claude, Codex, Amp) spawn a subprocess per turn. OpenClaw is more similar to OpenCode's server model but with a persistent gateway.
|
|
|
|
## Automation Methods (Priority Order)
|
|
|
|
### 1. WebSocket Gateway Protocol (Recommended)
|
|
|
|
Full-featured bidirectional control with streaming events.
|
|
|
|
#### Connection Handshake
|
|
|
|
```typescript
|
|
// Connect to Gateway
|
|
const ws = new WebSocket("ws://127.0.0.1:18789");
|
|
|
|
// First frame MUST be connect request
|
|
ws.send(JSON.stringify({
|
|
type: "req",
|
|
id: "1",
|
|
method: "connect",
|
|
params: {
|
|
minProtocol: 3,
|
|
maxProtocol: 3,
|
|
client: {
|
|
id: "gateway-client", // or custom client id
|
|
version: "1.0.0",
|
|
platform: "linux",
|
|
mode: "backend"
|
|
},
|
|
role: "operator",
|
|
scopes: ["operator.admin"],
|
|
caps: [],
|
|
auth: { token: "YOUR_GATEWAY_TOKEN" }
|
|
}
|
|
}));
|
|
|
|
// Expect hello-ok response
|
|
// { type: "res", id: "1", ok: true, payload: { type: "hello-ok", ... } }
|
|
```
|
|
|
|
#### Agent Request
|
|
|
|
```typescript
|
|
// Send agent turn request
|
|
const runId = crypto.randomUUID();
|
|
ws.send(JSON.stringify({
|
|
type: "req",
|
|
id: runId,
|
|
method: "agent",
|
|
params: {
|
|
message: "Your prompt here",
|
|
idempotencyKey: runId,
|
|
sessionKey: "agent:main:main", // or custom session key
|
|
thinking: "low", // optional: low|medium|high
|
|
deliver: false, // don't send to messaging channel
|
|
timeout: 300000 // 5 minute timeout
|
|
}
|
|
}));
|
|
```
|
|
|
|
#### Response Flow (Two-Stage)
|
|
|
|
```typescript
|
|
// Stage 1: Immediate ack
|
|
// { type: "res", id: "...", ok: true, payload: { runId, status: "accepted", acceptedAt: 1234567890 } }
|
|
|
|
// Stage 2: Streaming events
|
|
// { type: "event", event: "agent", payload: { runId, seq: 1, stream: "output", data: {...} } }
|
|
// { type: "event", event: "agent", payload: { runId, seq: 2, stream: "tool", data: {...} } }
|
|
// ...
|
|
|
|
// Stage 3: Final response (same id as request)
|
|
// { type: "res", id: "...", ok: true, payload: { runId, status: "ok", summary: "completed", result: {...} } }
|
|
```
|
|
|
|
### 2. OpenAI-Compatible HTTP API
|
|
|
|
For simple integration with tools expecting OpenAI Chat Completions.
|
|
|
|
**Enable in config:**
|
|
```json5
|
|
{
|
|
gateway: {
|
|
http: {
|
|
endpoints: {
|
|
chatCompletions: { enabled: true }
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
**Request:**
|
|
```bash
|
|
curl -X POST http://127.0.0.1:18789/v1/chat/completions \
|
|
-H "Authorization: Bearer YOUR_TOKEN" \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "clawdbot:main",
|
|
"messages": [{"role": "user", "content": "Hello"}],
|
|
"stream": true
|
|
}'
|
|
```
|
|
|
|
**Model Format:**
|
|
- `model: "clawdbot:<agentId>"` (e.g., `"clawdbot:main"`)
|
|
- `model: "agent:<agentId>"` (alias)
|
|
|
|
### 3. OpenResponses HTTP API
|
|
|
|
For clients that speak OpenResponses (item-based input, function tools).
|
|
|
|
**Enable in config:**
|
|
```json5
|
|
{
|
|
gateway: {
|
|
http: {
|
|
endpoints: {
|
|
responses: { enabled: true }
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
**Request:**
|
|
```bash
|
|
curl -X POST http://127.0.0.1:18789/v1/responses \
|
|
-H "Authorization: Bearer YOUR_TOKEN" \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "clawdbot:main",
|
|
"input": "Hello",
|
|
"stream": true
|
|
}'
|
|
```
|
|
|
|
### 4. Webhooks (Fire-and-Forget)
|
|
|
|
For event-driven automation without maintaining a connection.
|
|
|
|
**Enable in config:**
|
|
```json5
|
|
{
|
|
hooks: {
|
|
enabled: true,
|
|
token: "webhook-secret",
|
|
path: "/hooks"
|
|
}
|
|
}
|
|
```
|
|
|
|
**Request:**
|
|
```bash
|
|
curl -X POST http://127.0.0.1:18789/hooks/agent \
|
|
-H "Authorization: Bearer webhook-secret" \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"message": "Run this task",
|
|
"name": "Automation",
|
|
"sessionKey": "hook:automation:task-123",
|
|
"deliver": false,
|
|
"timeoutSeconds": 120
|
|
}'
|
|
```
|
|
|
|
**Response:** `202 Accepted` (async run started)
|
|
|
|
### 5. CLI Subprocess
|
|
|
|
For simple one-off automation (similar to Claude Code pattern).
|
|
|
|
```bash
|
|
clawdbot agent --message "Your prompt" --session-key "automation:task"
|
|
```
|
|
|
|
## Session Management
|
|
|
|
### Session Key Format
|
|
|
|
```
|
|
agent:<agentId>:<sessionType>
|
|
agent:main:main # Main agent, main session
|
|
agent:main:subagent:abc # Subagent session
|
|
agent:beta:main # Beta agent, main session
|
|
hook:email:msg-123 # Webhook-spawned session
|
|
global # Legacy global session
|
|
```
|
|
|
|
### Session Operations (WebSocket)
|
|
|
|
```typescript
|
|
// List sessions
|
|
{ type: "req", id: "...", method: "sessions.list", params: { limit: 50, activeMinutes: 120 } }
|
|
|
|
// Resolve session info
|
|
{ type: "req", id: "...", method: "sessions.resolve", params: { key: "agent:main:main" } }
|
|
|
|
// Patch session settings
|
|
{ type: "req", id: "...", method: "sessions.patch", params: {
|
|
key: "agent:main:main",
|
|
thinkingLevel: "medium",
|
|
model: "anthropic/claude-sonnet-4-20250514"
|
|
}}
|
|
|
|
// Reset session (clear history)
|
|
{ type: "req", id: "...", method: "sessions.reset", params: { key: "agent:main:main" } }
|
|
|
|
// Delete session
|
|
{ type: "req", id: "...", method: "sessions.delete", params: { key: "agent:main:main" } }
|
|
|
|
// Compact session (summarize history)
|
|
{ type: "req", id: "...", method: "sessions.compact", params: { key: "agent:main:main" } }
|
|
```
|
|
|
|
### Session CLI
|
|
|
|
```bash
|
|
clawdbot sessions # List sessions
|
|
clawdbot sessions --active 120 # Active in last 2 hours
|
|
clawdbot sessions --json # JSON output
|
|
```
|
|
|
|
## Streaming Events
|
|
|
|
### Event Format
|
|
|
|
```typescript
|
|
interface AgentEvent {
|
|
runId: string; // Correlates to request
|
|
seq: number; // Monotonically increasing per run
|
|
stream: string; // Event category
|
|
ts: number; // Unix timestamp (ms)
|
|
data: Record<string, unknown>; // Event-specific payload
|
|
}
|
|
```
|
|
|
|
### Stream Types
|
|
|
|
| Stream | Description |
|
|
|--------|-------------|
|
|
| `output` | Text output chunks |
|
|
| `tool` | Tool invocation/result |
|
|
| `thinking` | Extended thinking content |
|
|
| `status` | Run status changes |
|
|
| `error` | Error information |
|
|
|
|
### Event Categories
|
|
|
|
| Event Type | Payload |
|
|
|------------|---------|
|
|
| `assistant.delta` | `{ text: "..." }` |
|
|
| `tool.start` | `{ name: "Read", input: {...} }` |
|
|
| `tool.result` | `{ name: "Read", result: "..." }` |
|
|
| `thinking.delta` | `{ text: "..." }` |
|
|
| `run.complete` | `{ summary: "..." }` |
|
|
| `run.error` | `{ error: "..." }` |
|
|
|
|
## Token Usage / Cost Tracking
|
|
|
|
OpenClaw tracks tokens per response and supports cost estimation.
|
|
|
|
### In-Chat Commands
|
|
|
|
```
|
|
/status # Session model, context usage, last response tokens, estimated cost
|
|
/usage off|tokens|full # Toggle per-response usage footer
|
|
/usage cost # Local cost summary from session logs
|
|
```
|
|
|
|
### Configuration
|
|
|
|
Token costs are configured per model:
|
|
```json5
|
|
{
|
|
models: {
|
|
providers: {
|
|
anthropic: {
|
|
models: [{
|
|
id: "claude-sonnet-4-20250514",
|
|
cost: {
|
|
input: 3.00, // USD per 1M tokens
|
|
output: 15.00,
|
|
cacheRead: 0.30,
|
|
cacheWrite: 3.75
|
|
}
|
|
}]
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### Programmatic Access
|
|
|
|
Token usage is included in agent response payloads:
|
|
```typescript
|
|
// In final response or streaming events
|
|
{
|
|
usage: {
|
|
inputTokens: 1234,
|
|
outputTokens: 567,
|
|
cacheReadTokens: 890,
|
|
cacheWriteTokens: 123
|
|
}
|
|
}
|
|
```
|
|
|
|
## Authentication
|
|
|
|
### Gateway Token
|
|
|
|
```bash
|
|
# Environment variable
|
|
CLAWDBOT_GATEWAY_TOKEN=your-secret-token
|
|
|
|
# Or config file
|
|
{
|
|
gateway: {
|
|
auth: {
|
|
mode: "token",
|
|
token: "your-secret-token"
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### HTTP Requests
|
|
|
|
```
|
|
Authorization: Bearer YOUR_TOKEN
|
|
```
|
|
|
|
### WebSocket Connect
|
|
|
|
```typescript
|
|
{
|
|
params: {
|
|
auth: { token: "YOUR_TOKEN" }
|
|
}
|
|
}
|
|
```
|
|
|
|
## Status Sync
|
|
|
|
### Health Check
|
|
|
|
```typescript
|
|
// WebSocket
|
|
{ type: "req", id: "...", method: "health", params: {} }
|
|
|
|
// HTTP
|
|
curl http://127.0.0.1:18789/health # Basic health
|
|
clawdbot health --json # Detailed health
|
|
```
|
|
|
|
### Status Response
|
|
|
|
```typescript
|
|
{
|
|
ok: boolean;
|
|
linkedChannel?: string;
|
|
models?: { available: string[] };
|
|
agents?: { configured: string[] };
|
|
presence?: PresenceEntry[];
|
|
uptimeMs?: number;
|
|
}
|
|
```
|
|
|
|
### Presence Events
|
|
|
|
The gateway pushes presence updates to all connected clients:
|
|
```typescript
|
|
// Event
|
|
{ type: "event", event: "presence", payload: { entries: [...], stateVersion: {...} } }
|
|
```
|
|
|
|
## Comparison with Other Agents
|
|
|
|
| Aspect | OpenClaw | Claude Code | Codex | OpenCode | Amp |
|
|
|--------|----------|-------------|-------|----------|-----|
|
|
| **Process Model** | Daemon | Subprocess | Server | Server | Subprocess |
|
|
| **Protocol** | WebSocket + HTTP | CLI JSONL | JSON-RPC stdio | HTTP + SSE | CLI JSONL |
|
|
| **Session Resume** | Session Key | `--resume` | Thread ID | Session ID | `--continue` |
|
|
| **Multi-Turn** | Same session key | Same session ID | Same thread | Same session | Same session |
|
|
| **Streaming** | WS events + SSE | JSONL | Notifications | SSE | JSONL |
|
|
| **HITL** | No | No (headless) | No (SDK) | Yes (SSE) | No |
|
|
| **SDK** | None (protocol) | None (CLI) | Yes | Yes | Closed |
|
|
|
|
### Key Differences
|
|
|
|
1. **Daemon Model**: OpenClaw runs as a persistent gateway service, not a per-turn subprocess
|
|
2. **Multi-Protocol**: Supports WebSocket, OpenAI-compatible HTTP, OpenResponses, and webhooks
|
|
3. **Channel Integration**: Built-in WhatsApp/Telegram/Discord/iMessage support
|
|
4. **Node System**: Mobile/desktop nodes can connect for camera, canvas, location, etc.
|
|
5. **No HITL**: Like Claude/Codex/Amp, permissions are configured upfront, not interactive
|
|
|
|
## Integration Patterns for sandbox-agent
|
|
|
|
### Recommended: Persistent WebSocket Connection
|
|
|
|
```typescript
|
|
class OpenClawDriver {
|
|
private ws: WebSocket;
|
|
private pending = new Map<string, { resolve, reject }>();
|
|
|
|
async connect(url: string, token: string) {
|
|
this.ws = new WebSocket(url);
|
|
await this.handshake(token);
|
|
this.ws.on("message", (data) => this.handleMessage(JSON.parse(data)));
|
|
}
|
|
|
|
async runAgent(params: {
|
|
message: string;
|
|
sessionKey?: string;
|
|
thinking?: string;
|
|
}): Promise<AgentResult> {
|
|
const runId = crypto.randomUUID();
|
|
const events: AgentEvent[] = [];
|
|
|
|
return new Promise((resolve, reject) => {
|
|
this.pending.set(runId, { resolve, reject, events });
|
|
this.ws.send(JSON.stringify({
|
|
type: "req",
|
|
id: runId,
|
|
method: "agent",
|
|
params: {
|
|
message: params.message,
|
|
sessionKey: params.sessionKey ?? "agent:main:main",
|
|
thinking: params.thinking,
|
|
deliver: false,
|
|
idempotencyKey: runId
|
|
}
|
|
}));
|
|
});
|
|
}
|
|
|
|
private handleMessage(frame: GatewayFrame) {
|
|
if (frame.type === "event" && frame.event === "agent") {
|
|
const pending = this.pending.get(frame.payload.runId);
|
|
if (pending) pending.events.push(frame.payload);
|
|
} else if (frame.type === "res") {
|
|
const pending = this.pending.get(frame.id);
|
|
if (pending && frame.payload?.status === "ok") {
|
|
pending.resolve({ result: frame.payload, events: pending.events });
|
|
this.pending.delete(frame.id);
|
|
} else if (pending && frame.payload?.status === "error") {
|
|
pending.reject(new Error(frame.payload.summary));
|
|
this.pending.delete(frame.id);
|
|
}
|
|
// Ignore "accepted" acks
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### Alternative: HTTP API (Simpler)
|
|
|
|
```typescript
|
|
async function runOpenClawPrompt(prompt: string, sessionKey?: string) {
|
|
const response = await fetch("http://127.0.0.1:18789/v1/chat/completions", {
|
|
method: "POST",
|
|
headers: {
|
|
"Authorization": `Bearer ${process.env.CLAWDBOT_GATEWAY_TOKEN}`,
|
|
"Content-Type": "application/json",
|
|
"x-clawdbot-session-key": sessionKey ?? "automation:sandbox"
|
|
},
|
|
body: JSON.stringify({
|
|
model: "clawdbot:main",
|
|
messages: [{ role: "user", content: prompt }],
|
|
stream: false
|
|
})
|
|
});
|
|
return response.json();
|
|
}
|
|
```
|
|
|
|
## Configuration for sandbox-agent Integration
|
|
|
|
Recommended config for automated use:
|
|
|
|
```json5
|
|
{
|
|
gateway: {
|
|
port: 18789,
|
|
auth: {
|
|
mode: "token",
|
|
token: "${CLAWDBOT_GATEWAY_TOKEN}"
|
|
},
|
|
http: {
|
|
endpoints: {
|
|
chatCompletions: { enabled: true },
|
|
responses: { enabled: true }
|
|
}
|
|
}
|
|
},
|
|
agents: {
|
|
defaults: {
|
|
model: {
|
|
primary: "anthropic/claude-sonnet-4-20250514"
|
|
},
|
|
thinking: { level: "low" },
|
|
workspace: "${HOME}/workspace"
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
## Notes
|
|
|
|
- OpenClaw is significantly more complex than other agents due to its gateway architecture
|
|
- The multi-protocol support (WS, OpenAI, OpenResponses, webhooks) provides flexibility
|
|
- Session management is richer (labels, spawn tracking, model/thinking overrides)
|
|
- No SDK means direct protocol implementation is required
|
|
- The daemon model means connection lifecycle management is important (reconnects, etc.)
|
|
- Agent responses are two-stage: immediate ack + final result (handle both)
|
|
- Tool policy filtering is configurable per agent/session/group
|