Merge branch 'main' into feat/support-pi

Nathan Flurry 2026-02-10 22:27:03 -08:00
commit 4c6c5983c0
156 changed files with 16196 additions and 2338 deletions

278
docs/agent-sessions.mdx Normal file

@ -0,0 +1,278 @@
---
title: "Agent Sessions"
description: "Create sessions and send messages to agents."
sidebarTitle: "Sessions"
icon: "comments"
---
Sessions are the unit of interaction with an agent. You create one session per task, then send messages and stream events.
## Session Options
`POST /v1/sessions/{sessionId}` accepts the following fields:
- `agent` (required): `claude`, `codex`, `opencode`, `amp`, or `mock`
- `agentMode`: agent mode string (for example, `build`, `plan`)
- `permissionMode`: permission mode string (`default`, `plan`, `bypass`, or `acceptEdits`)
- `model`: model override (agent-specific)
- `variant`: model variant (agent-specific)
- `agentVersion`: agent version override
- `mcp`: MCP server config map (see `MCP`)
- `skills`: skill path config (see `Skills`)
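The fields above can be summarized as a TypeScript shape. This is a minimal sketch: `SessionOptions`, `AgentId`, and `checkSessionOptions` are hypothetical names for illustration and are not confirmed exports of `sandbox-agent`. The `mcp` and `skills` values are left loosely typed because their structure is documented separately.

```ts
// Hypothetical shape mirroring the documented fields; not an SDK export.
type AgentId = "claude" | "codex" | "opencode" | "amp" | "mock";

interface SessionOptions {
  agent: AgentId; // required
  agentMode?: string; // for example "build" or "plan"
  permissionMode?: string; // for example "default", "plan", "bypass"
  model?: string; // agent-specific
  variant?: string; // agent-specific
  agentVersion?: string;
  mcp?: Record<string, unknown>; // see the MCP docs
  skills?: unknown; // see the Skills docs
}

const KNOWN_AGENTS: readonly AgentId[] = ["claude", "codex", "opencode", "amp", "mock"];

// Returns a list of problems found in a request body (empty when it looks valid).
// Only the required `agent` field is checked in this sketch.
function checkSessionOptions(body: { agent?: unknown }): string[] {
  const problems: string[] = [];
  const agent = body.agent;
  if (typeof agent !== "string") {
    problems.push("agent is required");
  } else if (!(KNOWN_AGENTS as readonly string[]).includes(agent)) {
    problems.push(`unknown agent: ${agent}`);
  }
  return problems;
}

const example: SessionOptions = {
  agent: "codex",
  agentMode: "build",
  permissionMode: "default",
};
console.log(checkSessionOptions(example));
```

Checking the body client-side is optional; the server remains the source of truth for which agents and modes it accepts.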
## Create A Session
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("build-session", {
agent: "codex",
agentMode: "build",
permissionMode: "default",
model: "gpt-4.1",
variant: "reasoning",
agentVersion: "latest",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "codex",
"agentMode": "build",
"permissionMode": "default",
"model": "gpt-4.1",
"variant": "reasoning",
"agentVersion": "latest"
}'
```
</CodeGroup>
## Send A Message
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("build-session", {
message: "Summarize the repository structure.",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Summarize the repository structure."}'
```
</CodeGroup>
## Stream A Turn
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const response = await client.postMessageStream("build-session", {
message: "Explain the main entrypoints.",
});
const reader = response.body?.getReader();
if (reader) {
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value, { stream: true }));
}
}
```
```bash cURL
curl -N -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages/stream" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Explain the main entrypoints."}'
```
</CodeGroup>
## Fetch Events
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const events = await client.getEvents("build-session", {
offset: 0,
limit: 50,
includeRaw: false,
});
console.log(events.events);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events?offset=0&limit=50" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
`GET /v1/sessions/{sessionId}/get-messages` is an alias for `events`.
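When polling with `offset` and `limit`, the client has to work out the next `offset` itself. The sketch below assumes that each event carries a numeric `sequence` and that passing the last seen `sequence` as `offset` returns only later events. `SequencedEvent` and `nextOffset` are hypothetical names, not part of the SDK.

```ts
// Assumed minimal event shape: only the field this sketch relies on.
interface SequencedEvent {
  sequence: number;
}

// Compute the offset for the next poll: the highest sequence seen so far.
// An empty page leaves the offset unchanged.
function nextOffset(current: number, page: readonly SequencedEvent[]): number {
  let offset = current;
  for (const event of page) {
    if (event.sequence > offset) offset = event.sequence;
  }
  return offset;
}
```

A polling loop would call `getEvents` with the current offset, handle the returned events, and then store `nextOffset(offset, events)` before the next request.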
## Stream Events (SSE)
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
for await (const event of client.streamEvents("build-session", { offset: 0 })) {
console.log(event.type, event.data);
}
```
```bash cURL
curl -N -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events/sse?offset=0" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
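Clients that read the SSE endpoint without the SDK need to split the byte stream into frames themselves. The following is a minimal sketch of standard Server-Sent Events framing (blank-line separated frames, `data:` and `event:` fields, `:` comment lines). `SseFrame` and `parseSseFrames` are hypothetical names, and the sketch makes no assumption about what the server puts in each `data` payload.

```ts
interface SseFrame {
  event: string | null; // value of the "event:" field, if any
  data: string; // "data:" lines joined with "\n"
}

// Split a text buffer into complete SSE frames.
// `rest` holds any trailing partial frame; prepend it to the next chunk.
function parseSseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? ""; // last block may be incomplete
  const frames: SseFrame[] = [];
  for (const block of blocks) {
    const dataLines: string[] = [];
    let event: string | null = null;
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // comment or keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "data") dataLines.push(value);
      else if (field === "event") event = value;
    }
    if (dataLines.length > 0) frames.push({ event, data: dataLines.join("\n") });
  }
  return { frames, rest };
}
```

Feed each decoded chunk as `rest + chunk`, process `frames`, and carry `rest` forward. The SDK's `streamEvents` iterator already handles this, so the sketch only matters for raw HTTP clients.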
## List Sessions
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const sessions = await client.listSessions();
console.log(sessions.sessions);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Reply To A Question
When the agent asks a question, reply with `answers`, an array of arrays. Each inner array holds the selected option(s) for one response, so a multi-select answer lists several values.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.replyQuestion("build-session", "question-1", {
answers: [["yes"]],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"answers":[["yes"]]}'
```
</CodeGroup>
## Reject A Question
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.rejectQuestion("build-session", "question-1");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reject" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Reply To A Permission Request
Use `once`, `always`, or `reject`.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.replyPermission("build-session", "permission-1", {
reply: "once",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/permissions/permission-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"reply":"once"}'
```
</CodeGroup>
## Terminate A Session
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.terminateSession("build-session");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/terminate" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>

87
docs/attachments.mdx Normal file

@ -0,0 +1,87 @@
---
title: "Attachments"
description: "Upload files into the sandbox and attach them to prompts."
sidebarTitle: "Attachments"
icon: "paperclip"
---
Use the filesystem API to upload files, then reference them as attachments when sending prompts.
<Steps>
<Step title="Upload a file">
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const buffer = await fs.promises.readFile("./data.csv");
const upload = await client.writeFsFile(
{ path: "./uploads/data.csv", sessionId: "my-session" },
buffer,
);
console.log(upload.path);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./data.csv
```
</CodeGroup>
The response returns the absolute path that you should use for attachments.
</Step>
<Step title="Attach the file in a prompt">
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("my-session", {
message: "Please analyze the attached CSV.",
attachments: [
{
path: "/home/sandbox/uploads/data.csv",
mime: "text/csv",
filename: "data.csv",
},
],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"message": "Please analyze the attached CSV.",
"attachments": [
{
"path": "/home/sandbox/uploads/data.csv",
"mime": "text/csv",
"filename": "data.csv"
}
]
}'
```
</CodeGroup>
</Step>
</Steps>
## Notes
- Use absolute paths from the upload response to avoid ambiguity.
- If `mime` is omitted, the server defaults to `application/octet-stream`.
- OpenCode receives file parts directly; other agents will see the attachment paths appended to the prompt.
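The notes above can be folded into a small helper that builds an attachment entry from the absolute path returned by the upload. This is a sketch under stated assumptions: `Attachment` and `toAttachment` are hypothetical names, and the field names (`path`, `mime`, `filename`) are taken from the request bodies shown earlier.

```ts
interface Attachment {
  path: string;
  mime?: string;
  filename?: string;
}

// Build an attachment entry from an absolute sandbox path.
// `mime` is omitted when unknown so the server can apply its default.
function toAttachment(absolutePath: string, mime?: string): Attachment {
  if (!absolutePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absolutePath}`);
  }
  const filename = absolutePath.split("/").pop() ?? "";
  const attachment: Attachment = { path: absolutePath, filename };
  if (mime !== undefined) attachment.mime = mime;
  return attachment;
}
```

For example, `toAttachment(upload.path, "text/csv")` could be passed in the `attachments` array of `postMessage`.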


@ -29,7 +29,7 @@ const sessionId = `session-${crypto.randomUUID()}`;
await client.createSession(sessionId, {
agent: "claude",
agentMode: "code", // Optional: agent-specific mode
-permissionMode: "default", // Optional: "default" | "plan" | "bypass"
+permissionMode: "default", // Optional: "default" | "plan" | "bypass" | "acceptEdits" (Claude: accept edits; Codex: auto-approve file changes; others: default)
model: "claude-sonnet-4", // Optional: model override
});
```
@ -70,7 +70,7 @@ Use `offset` to track the last seen `sequence` number and resume from where you
### Bare minimum
-Handle these three events to render a basic chat:
+Handle item lifecycle plus turn lifecycle to render a basic chat:
```ts
type ItemState = {
@ -79,9 +79,20 @@ type ItemState = {
};
const items = new Map<string, ItemState>();
let turnInProgress = false;
function handleEvent(event: UniversalEvent) {
switch (event.type) {
case "turn.started": {
turnInProgress = true;
break;
}
case "turn.ended": {
turnInProgress = false;
break;
}
case "item.started": {
const { item } = event.data as ItemEventData;
items.set(item.item_id, { item, deltas: [] });
@ -110,12 +121,14 @@ function handleEvent(event: UniversalEvent) {
}
```
-When rendering, show a loading indicator while `item.status === "in_progress"`:
+When rendering:
+- Use `turnInProgress` for turn-level UI state (disable send button, show global "Agent is responding", etc.).
+- Use `item.status === "in_progress"` for per-item streaming state.
```ts
function renderItem(state: ItemState) {
const { item, deltas } = state;
-const isLoading = item.status === "in_progress";
+const isItemLoading = item.status === "in_progress";
// For streaming text, combine item content with accumulated deltas
const text = item.content
@ -126,7 +139,8 @@ function renderItem(state: ItemState) {
return {
content: streamedText,
-isLoading,
+isItemLoading,
+isTurnLoading: turnInProgress,
role: item.role,
kind: item.kind,
};


@ -2,7 +2,6 @@
title: "CLI Reference"
description: "Complete CLI reference for sandbox-agent."
sidebarTitle: "CLI"
icon: "terminal"
---
## Server
@ -71,7 +70,6 @@ sandbox-agent opencode [OPTIONS]
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `--session-title <TITLE>` | - | Title for the OpenCode session |
| `--opencode-bin <PATH>` | - | Override `opencode` binary path |
```bash
sandbox-agent opencode --token "$TOKEN"
@ -79,7 +77,7 @@ sandbox-agent opencode --token "$TOKEN"
The daemon logs to a per-host log file under the sandbox-agent data directory (for example, `~/.local/share/sandbox-agent/daemon/daemon-127-0-0-1-2468.log`).
-Requires the `opencode` binary to be installed (or set `OPENCODE_BIN` / `--opencode-bin`). If it is not found on `PATH`, sandbox-agent installs it automatically.
+Existing installs are reused and missing binaries are installed automatically.
---
@ -247,10 +245,12 @@ sandbox-agent api sessions create <SESSION_ID> [OPTIONS]
|--------|-------------|
| `-a, --agent <AGENT>` | Agent identifier (required) |
| `-g, --agent-mode <MODE>` | Agent mode |
-| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`) |
+| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`, `acceptEdits`) |
| `-m, --model <MODEL>` | Model override |
| `-v, --variant <VARIANT>` | Model variant |
| `-A, --agent-version <VERSION>` | Agent version |
| `--mcp-config <PATH>` | JSON file with MCP server config (see `mcp` docs) |
| `--skill <PATH>` | Skill directory or `SKILL.md` path (repeatable) |
```bash
sandbox-agent api sessions create my-session \
@ -259,6 +259,8 @@ sandbox-agent api sessions create my-session \
--permission-mode default
```
`acceptEdits` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
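The per-agent handling of `acceptEdits` described above can be written as a small lookup. This is an illustrative sketch only: `AcceptEditsBehavior` and its string labels are hypothetical and do not correspond to values used by the daemon.

```ts
// Hypothetical labels describing how each agent treats `acceptEdits`.
type AcceptEditsBehavior =
  | "pass-through" // forwarded to the agent unchanged
  | "auto-approve-file-changes" // file changes approved automatically
  | "treated-as-default"; // behaves like `default`

function acceptEditsBehavior(agent: string): AcceptEditsBehavior {
  switch (agent) {
    case "claude":
      return "pass-through";
    case "codex":
      return "auto-approve-file-changes";
    default:
      return "treated-as-default";
  }
}
```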
#### Send Message
```bash
@ -380,6 +382,132 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once
---
### Filesystem
#### List Entries
```bash
sandbox-agent api fs entries [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--path <PATH>` | Directory path (default: `.`) |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs entries --path ./workspace
```
#### Read File
`api fs read` writes raw bytes to stdout.
```bash
sandbox-agent api fs read <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs read ./notes.txt > ./notes.local.txt
```
#### Write File
```bash
sandbox-agent api fs write <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--content <TEXT>` | Write UTF-8 content |
| `--from-file <PATH>` | Read content from a local file |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs write ./hello.txt --content "hello"
sandbox-agent api fs write ./image.bin --from-file ./image.bin
```
#### Delete Entry
```bash
sandbox-agent api fs delete <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--recursive` | Delete directories recursively |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs delete ./old.log
```
#### Create Directory
```bash
sandbox-agent api fs mkdir <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs mkdir ./cache
```
#### Move/Rename
```bash
sandbox-agent api fs move <FROM> <TO> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--overwrite` | Overwrite destination if it exists |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs move ./a.txt ./b.txt --overwrite
```
#### Stat
```bash
sandbox-agent api fs stat <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs stat ./notes.txt
```
#### Upload Batch (tar)
```bash
sandbox-agent api fs upload-batch --tar <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--tar <PATH>` | Tar archive to extract |
| `--path <PATH>` | Destination directory |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs upload-batch --tar ./skills.tar --path ./skills
```
---
## CLI to HTTP Mapping
| CLI Command | HTTP Endpoint |
@ -398,3 +526,11 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once
| `api sessions reply-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reply` |
| `api sessions reject-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reject` |
| `api sessions reply-permission` | `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply` |
| `api fs entries` | `GET /v1/fs/entries` |
| `api fs read` | `GET /v1/fs/file` |
| `api fs write` | `PUT /v1/fs/file` |
| `api fs delete` | `DELETE /v1/fs/entry` |
| `api fs mkdir` | `POST /v1/fs/mkdir` |
| `api fs move` | `POST /v1/fs/move` |
| `api fs stat` | `GET /v1/fs/stat` |
| `api fs upload-batch` | `POST /v1/fs/upload-batch` |
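For scripts that translate CLI invocations into HTTP calls, the filesystem rows of the table can be kept as a lookup. This is a sketch: `FS_ROUTES` and `routeFor` are hypothetical names, and the methods and paths are copied from the table rows above.

```ts
interface Route {
  method: "GET" | "PUT" | "POST" | "DELETE";
  path: string;
}

// Filesystem commands and their HTTP endpoints, as listed in the table.
const FS_ROUTES: Record<string, Route> = {
  "api fs entries": { method: "GET", path: "/v1/fs/entries" },
  "api fs read": { method: "GET", path: "/v1/fs/file" },
  "api fs write": { method: "PUT", path: "/v1/fs/file" },
  "api fs delete": { method: "DELETE", path: "/v1/fs/entry" },
  "api fs mkdir": { method: "POST", path: "/v1/fs/mkdir" },
  "api fs move": { method: "POST", path: "/v1/fs/move" },
  "api fs stat": { method: "GET", path: "/v1/fs/stat" },
  "api fs upload-batch": { method: "POST", path: "/v1/fs/upload-batch" },
};

// Look up the endpoint for a CLI command; null when the command is not an fs command.
function routeFor(command: string): Route | null {
  return FS_ROUTES[command] ?? null;
}
```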


@ -44,9 +44,11 @@ Events / Message Flow
+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+
| session.started | none | method=thread/started | type=session.created | none | none |
| session.ended | SDKMessage.type=result | no explicit session end (turn/completed) | no explicit session end (session.deleted)| type=done | none (daemon synthetic) |
| turn.started | synthetic on message send | method=turn/started | type=session.status (busy) | synthetic on message send | none (daemon synthetic) |
| turn.ended | synthetic after result | method=turn/completed | type=session.idle | synthetic on done | none (daemon synthetic) |
| message (user) | SDKMessage.type=user | item/completed (ThreadItem.type=userMessage)| message.updated (Message.role=user) | type=message | none (daemon synthetic) |
| message (assistant) | SDKMessage.type=assistant | item/completed (ThreadItem.type=agentMessage)| message.updated (Message.role=assistant)| type=message | message_start/message_end |
-| message.delta | stream_event (partial) or synthetic | method=item/agentMessage/delta | type=message.part.updated (delta) | synthetic | message_update (text_delta/thinking_delta) |
+| message.delta | stream_event (partial) or synthetic | method=item/agentMessage/delta | type=message.part.updated (text-part delta) | synthetic | message_update (text_delta/thinking_delta) |
| tool call | type=tool_use | method=item/mcpToolCall/progress | message.part.updated (part.type=tool) | type=tool_call | tool_execution_start |
| tool result | user.message.content.tool_result | item/completed (tool result ThreadItem variants) | message.part.updated (part.type=tool, state=completed) | type=tool_result | tool_execution_end |
| permission.requested | control_request.can_use_tool | none | type=permission.asked | none | none |
@ -56,6 +58,10 @@ Events / Message Flow
| error | SDKResultMessage.error | method=error | type=session.error (or message error) | type=error | hook_error (status item) |
+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+
Permission status normalization:
- `permission.requested` uses `status=requested`.
- `permission.resolved` uses `status=accept`, `accept_for_session`, or `reject`.
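One plausible way to connect client replies to these normalized statuses is a direct mapping. The sketch below assumes the reply values are `once`, `always`, and `reject`, and that `once` corresponds to `accept` and `always` to `accept_for_session`. That pairing is an assumption made for illustration and is not stated in this document.

```ts
type PermissionReply = "once" | "always" | "reject";
type ResolvedStatus = "accept" | "accept_for_session" | "reject";

// Assumed pairing between reply values and normalized statuses.
const REPLY_TO_STATUS: Record<PermissionReply, ResolvedStatus> = {
  once: "accept",
  always: "accept_for_session",
  reject: "reject",
};

function resolvedStatus(reply: PermissionReply): ResolvedStatus {
  return REPLY_TO_STATUS[reply];
}
```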
Synthetics
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
@ -63,6 +69,8 @@ Synthetics
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
| session.started | When agent emits no explicit start | session.started event | Mark source=daemon |
| session.ended | When agent emits no explicit end | session.ended event | Mark source=daemon; reason may be inferred |
| turn.started | When agent emits no explicit turn start | turn.started event | Mark source=daemon |
| turn.ended | When agent emits no explicit turn end | turn.ended event | Mark source=daemon |
| item_id (Claude) | Claude provides no item IDs | item_id | Maintain provider_item_id map when possible |
| user message (Claude) | Claude emits only assistant output | item.completed | Mark source=daemon; preserve raw input in event metadata |
| question events (Claude) | AskUserQuestion tool usage | question.requested/resolved | Derived from tool_use blocks (source=agent) |
@ -71,7 +79,7 @@ Synthetics
| message.delta (Claude) | No native deltas emitted | item.delta | Synthetic delta with full message content; source=daemon |
| message.delta (Amp) | No native deltas | item.delta | Synthetic delta with full message content; source=daemon |
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| message.delta (OpenCode) | part delta before message | item.delta | If part arrives first, create item.started stub then delta |
+| message.delta (OpenCode) | text part delta before message | item.delta | If part arrives first, create item.started stub then delta |
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
Delta handling
@ -82,10 +90,11 @@ Delta handling
- Pi emits message_update deltas and cumulative tool_execution_update partialResult values (we diff to produce deltas).
Policy:
-- Always emit item.delta across all providers.
+- Emit item.delta for streamable text content across providers.
- For providers without native deltas, emit a single synthetic delta containing the full content prior to item.completed.
- For Claude when partial streaming is enabled, forward native deltas and skip the synthetic full-content delta.
- For providers with native deltas, forward as-is; also emit item.completed when final content is known.
- For OpenCode reasoning part deltas, emit typed reasoning item updates (item.started/item.completed with content.type=reasoning) instead of item.delta.
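The note on cumulative `partialResult` values says they are diffed to produce deltas. A minimal sketch of that diffing step follows. `cumulativeToDelta` is a hypothetical name, and the fallback for non-prefix updates is an assumption, not the daemon's documented behavior.

```ts
// Turn a cumulative value into the incremental text added since the previous value.
function cumulativeToDelta(previous: string, current: string): string {
  if (current.startsWith(previous)) {
    return current.slice(previous.length);
  }
  // Assumed fallback: the provider rewrote earlier output, so resend everything.
  return current;
}
```

A caller would keep the last cumulative value per tool call, emit the returned string as an `item.delta` when it is non-empty, and then store `current` as the new baseline.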
Message normalization notes

144
docs/credentials.mdx Normal file
View file

@ -0,0 +1,144 @@
---
title: "Credentials"
description: "How sandbox-agent discovers and uses provider credentials."
icon: "key"
---
Sandbox-agent automatically discovers API credentials from environment variables and agent config files. Credentials are used to authenticate with AI providers (Anthropic, OpenAI) when spawning agents.
## Credential sources
Credentials are extracted in priority order. The first valid credential found for each provider is used.
### Environment variables (highest priority)
**API keys** (checked first):
| Variable | Provider |
|----------|----------|
| `ANTHROPIC_API_KEY` | Anthropic |
| `CLAUDE_API_KEY` | Anthropic (fallback) |
| `OPENAI_API_KEY` | OpenAI |
| `CODEX_API_KEY` | OpenAI (fallback) |
**OAuth tokens** (checked if no API key found):
| Variable | Provider |
|----------|----------|
| `CLAUDE_CODE_OAUTH_TOKEN` | Anthropic (OAuth) |
| `ANTHROPIC_AUTH_TOKEN` | Anthropic (OAuth fallback) |
OAuth tokens from environment variables are only used when `include_oauth` is enabled (the default).
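The environment-variable order described above can be sketched for the Anthropic provider as follows. This is an illustration, not the daemon's implementation: `Credential` and `resolveAnthropicFromEnv` are hypothetical names, `includeOauth` stands in for the `include_oauth` setting, and treating an empty string as unset is an assumption.

```ts
type Env = Record<string, string | undefined>;

interface Credential {
  kind: "api_key" | "oauth";
  source: string; // name of the variable the value came from
  value: string;
}

// Check API key variables first, then OAuth token variables if allowed.
function resolveAnthropicFromEnv(env: Env, includeOauth = true): Credential | null {
  for (const name of ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"]) {
    const value = env[name];
    if (value) return { kind: "api_key", source: name, value };
  }
  if (includeOauth) {
    for (const name of ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_AUTH_TOKEN"]) {
      const value = env[name];
      if (value) return { kind: "oauth", source: name, value };
    }
  }
  return null; // fall through to agent config files
}
```

The OpenAI provider would follow the same pattern with `OPENAI_API_KEY` and `CODEX_API_KEY`.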
### Agent config files
If no environment variable is set, sandbox-agent checks agent-specific config files:
| Agent | Config path | Provider |
|-------|-------------|----------|
| Amp | `~/.amp/config.json` | Anthropic |
| Claude Code | `~/.claude.json`, `~/.claude/.credentials.json` | Anthropic |
| Codex | `~/.codex/auth.json` | OpenAI |
| OpenCode | `~/.local/share/opencode/auth.json` | Both |
OAuth tokens are supported for Claude Code, Codex, and OpenCode. Expired tokens are automatically skipped.
## Provider requirements by agent
| Agent | Required provider |
|-------|-------------------|
| Claude Code | Anthropic |
| Amp | Anthropic |
| Codex | OpenAI |
| OpenCode | Anthropic or OpenAI |
| Mock | None |
## Error handling behavior
Sandbox-agent uses a **best-effort, fail-forward** approach to credentials:
### Extraction failures are silent
If a config file is missing, unreadable, or malformed, extraction continues to the next source. No errors are thrown. Missing credentials simply mean the provider is marked as unavailable.
```
~/.claude.json missing → try ~/.claude/.credentials.json
~/.claude/.credentials.json missing → try OpenCode config
All sources exhausted → anthropic = None (not an error)
```
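The fall-through behavior shown above amounts to trying each source in order and treating any failure as "not found". A minimal sketch, with hypothetical names (`Source`, `firstAvailable`) and reader functions supplied by the caller:

```ts
interface Source<T> {
  name: string;
  read: () => T | null; // may throw on missing or malformed input
}

// Return the first source that yields a value; swallow read errors.
function firstAvailable<T>(sources: readonly Source<T>[]): { name: string; value: T } | null {
  for (const source of sources) {
    try {
      const value = source.read();
      if (value !== null) return { name: source.name, value };
    } catch {
      // Missing, unreadable, or malformed source: skip it silently.
    }
  }
  return null; // all sources exhausted; not an error
}
```

A `null` result corresponds to the provider being marked unavailable rather than to a failure.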
### Agents spawn without credential validation
When you send a message to a session, sandbox-agent does **not** pre-validate credentials. The agent process is spawned with whatever credentials were found (or none), and the agent's native error surfaces if authentication fails.
This design:
- Lets you test agent error handling behavior
- Avoids duplicating provider-specific auth validation
- Ensures sandbox-agent faithfully proxies agent behavior
For example, sending a message to Claude Code without Anthropic credentials will spawn the agent, which will then emit its own "ANTHROPIC_API_KEY not set" error through the event stream.
## Checking credential status
### API endpoint
The `GET /v1/agents` endpoint includes a `credentialsAvailable` field for each agent:
```json
{
"agents": [
{
"id": "claude",
"installed": true,
"credentialsAvailable": true,
...
},
{
"id": "codex",
"installed": true,
"credentialsAvailable": false,
...
}
]
}
```
### TypeScript SDK
```typescript
const { agents } = await client.listAgents();
for (const agent of agents) {
console.log(`${agent.id}: ${agent.credentialsAvailable ? 'authenticated' : 'no credentials'}`);
}
```
### OpenCode compatibility
The `/opencode/provider` endpoint returns a `connected` array listing providers with valid credentials:
```json
{
"all": [...],
"connected": ["claude", "mock"]
}
```
## Passing credentials explicitly
You can override auto-discovered credentials by setting environment variables before starting sandbox-agent:
```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
sandbox-agent daemon start
```
Or when using the SDK in embedded mode:
```typescript
const client = await SandboxAgentClient.spawn({
env: {
ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
},
});
```

245
docs/custom-tools.mdx Normal file

@ -0,0 +1,245 @@
---
title: "Custom Tools"
description: "Give agents custom tools inside the sandbox using MCP servers or skills."
sidebarTitle: "Custom Tools"
icon: "wrench"
---
There are two ways to give agents custom tools that run inside the sandbox:
| | MCP Server | Skill |
|---|---|---|
| **How it works** | Sandbox Agent spawns your MCP server process and routes tool calls to it via stdio | A markdown file that instructs the agent to run your script with `node` (or any command) |
| **Tool discovery** | Agent sees tools automatically via MCP protocol | Agent reads instructions from the skill file |
| **Best for** | Structured tools with typed inputs/outputs | Lightweight scripts with natural-language instructions |
| **Requires** | `@modelcontextprotocol/sdk` dependency | Just a markdown file and a script |
Both approaches execute code inside the sandbox, so your tools have full access to the sandbox filesystem, network, and installed system tools.
## Option A: Tools via MCP
<Steps>
<Step title="Write your MCP server">
Create an MCP server that exposes tools using `@modelcontextprotocol/sdk` with `StdioServerTransport`. This server will run inside the sandbox.
```ts src/mcp-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
const server = new McpServer({
name: "rand",
version: "1.0.0",
});
server.tool(
"random_number",
"Generate a random integer between min and max (inclusive)",
{
min: z.number().describe("Minimum value"),
max: z.number().describe("Maximum value"),
},
async ({ min, max }) => ({
content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
}),
);
const transport = new StdioServerTransport();
await server.connect(transport);
```
This is a simple example. Your MCP server runs inside the sandbox, so you can execute any code you'd like: query databases, call internal APIs, run shell commands, or interact with any service available in the container.
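The tool's core expression, `Math.floor(Math.random() * (max - min + 1)) + min`, is easier to verify when the random source is injectable. The sketch below is a hypothetical refactoring for testing, not part of the example server. The argument validation is an added assumption about what inputs should be rejected.

```ts
// Inclusive random integer in [min, max]. `rng` must return a value in [0, 1).
function randomInt(min: number, max: number, rng: () => number = Math.random): number {
  if (!Number.isInteger(min) || !Number.isInteger(max) || min > max) {
    throw new RangeError("min and max must be integers with min <= max");
  }
  // (max - min + 1) is the count of possible values, so both ends are reachable.
  return Math.floor(rng() * (max - min + 1)) + min;
}
```

With a fixed `rng` the bounds can be checked directly: a source returning `0` yields `min`, and a source returning a value just below `1` yields `max`.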
</Step>
<Step title="Package the MCP server">
Bundle the server into a single JS file so it can be uploaded and executed without a `node_modules` folder.
```bash
npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs
```
This creates `dist/mcp-server.cjs` ready to upload.
</Step>
<Step title="Create sandbox and upload MCP server">
Start your sandbox, then write the bundled file into it.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const content = await fs.promises.readFile("./dist/mcp-server.cjs");
await client.writeFsFile(
{ path: "/opt/mcp/custom-tools/mcp-server.cjs" },
content,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/mcp-server.cjs
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point an MCP server config at the bundled JS file. When the session starts, Sandbox Agent spawns the MCP server process and routes tool calls to it.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
mcp: {
customTools: {
type: "local",
command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
},
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"customTools": {
"type": "local",
"command": ["node", "/opt/mcp/custom-tools/mcp-server.cjs"]
}
}
}'
```
</CodeGroup>
</Step>
</Steps>
## Option B: Tools via Skills
Skills are markdown files that instruct the agent how to use a script. Upload the script and a skill file, then point the session at the skill directory.
<Steps>
<Step title="Write your script">
Write a script that the agent will execute. This runs inside the sandbox just like an MCP server, but the agent invokes it directly via its shell tool.
```ts src/random-number.ts
const min = Number(process.argv[2]);
const max = Number(process.argv[3]);
if (Number.isNaN(min) || Number.isNaN(max)) {
console.error("Usage: random-number <min> <max>");
process.exit(1);
}
console.log(Math.floor(Math.random() * (max - min + 1)) + min);
```
</Step>
<Step title="Write a skill file">
Create a `SKILL.md` that tells the agent what the script does and how to run it. The frontmatter `name` and `description` fields are required. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
````md SKILL.md
---
name: random-number
description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
---
To generate a random number, run:
```bash
node /opt/skills/random-number/random-number.cjs <min> <max>
```
This prints a single random integer between min and max (inclusive).
````
</Step>
<Step title="Package the script">
Bundle the script just like an MCP server so it has no dependencies at runtime.
```bash
npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs
```
</Step>
<Step title="Create sandbox and upload files">
Upload both the bundled script and the skill file.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const script = await fs.promises.readFile("./dist/random-number.cjs");
await client.writeFsFile(
{ path: "/opt/skills/random-number/random-number.cjs" },
script,
);
const skill = await fs.promises.readFile("./SKILL.md");
await client.writeFsFile(
{ path: "/opt/skills/random-number/SKILL.md" },
skill,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/random-number.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/random-number.cjs
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/SKILL.md" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./SKILL.md
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point the session at the skill directory. The agent reads `SKILL.md` and learns how to use your script.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
skills: {
sources: [
{ type: "local", source: "/opt/skills/random-number" },
],
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "local", "source": "/opt/skills/random-number" }
]
}
}'
```
</CodeGroup>
</Step>
</Steps>
## Notes
- The sandbox image must include a Node.js runtime that can execute the bundled files.

214
docs/deploy/computesdk.mdx Normal file

@ -0,0 +1,214 @@
---
title: "ComputeSDK"
description: "Deploy the daemon using ComputeSDK's provider-agnostic sandbox API."
---
[ComputeSDK](https://computesdk.com) provides a unified interface for managing sandboxes across multiple providers. Write your integration once, then switch providers by changing environment variables.
## Prerequisites
- `COMPUTESDK_API_KEY` from [console.computesdk.com](https://console.computesdk.com)
- Provider API key (one of: `E2B_API_KEY`, `DAYTONA_API_KEY`, `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN`, `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`, `BLAXEL_API_KEY`, `CSB_API_KEY`)
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
## TypeScript Example
```typescript
import {
compute,
detectProvider,
getMissingEnvVars,
getProviderConfigFromEnv,
isProviderAuthComplete,
isValidProvider,
PROVIDER_NAMES,
type ExplicitComputeConfig,
type ProviderName,
} from "computesdk";
import { SandboxAgent } from "sandbox-agent";
const PORT = 3000;
const REQUEST_TIMEOUT_MS =
Number.parseInt(process.env.COMPUTESDK_TIMEOUT_MS || "", 10) || 120_000;
/**
* Detects and validates the provider to use.
* Priority: COMPUTESDK_PROVIDER env var > auto-detection from API keys
*/
function resolveProvider(): ProviderName {
const providerOverride = process.env.COMPUTESDK_PROVIDER;
if (providerOverride) {
if (!isValidProvider(providerOverride)) {
throw new Error(
`Unsupported provider "${providerOverride}". Supported: ${PROVIDER_NAMES.join(", ")}`
);
}
if (!isProviderAuthComplete(providerOverride)) {
const missing = getMissingEnvVars(providerOverride);
throw new Error(
`Missing credentials for "${providerOverride}". Set: ${missing.join(", ")}`
);
}
return providerOverride as ProviderName;
}
const detected = detectProvider();
if (!detected) {
throw new Error(
`No provider credentials found. Set one of: ${PROVIDER_NAMES.map((p) => getMissingEnvVars(p).join(", ")).join(" | ")}`
);
}
return detected as ProviderName;
}
function configureComputeSDK(): void {
const provider = resolveProvider();
const config: ExplicitComputeConfig = {
provider,
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
requestTimeoutMs: REQUEST_TIMEOUT_MS,
};
// Add provider-specific config from environment
const providerConfig = getProviderConfigFromEnv(provider);
if (Object.keys(providerConfig).length > 0) {
(config as any)[provider] = providerConfig;
}
compute.setConfig(config);
}
configureComputeSDK();
// Build environment variables to pass to sandbox
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
// Create sandbox
const sandbox = await compute.sandbox.create({
envs: Object.keys(envs).length > 0 ? envs : undefined,
});
// Helper to run commands with error handling
const run = async (cmd: string, options?: { background?: boolean }) => {
const result = await sandbox.runCommand(cmd, options);
if (typeof result?.exitCode === "number" && result.exitCode !== 0) {
throw new Error(`Command failed: ${cmd} (exit ${result.exitCode})\n${result.stderr || ""}`);
}
return result;
};
// Install sandbox-agent
await run("curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh");
// Install agents conditionally based on available API keys
if (envs.ANTHROPIC_API_KEY) {
await run("sandbox-agent install-agent claude");
}
if (envs.OPENAI_API_KEY) {
await run("sandbox-agent install-agent codex");
}
// Start the server in the background
await run(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`, { background: true });
// Get the public URL for the sandbox
const baseUrl = await sandbox.getUrl({ port: PORT });
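// Wait for server to be ready; fail loudly instead of continuing after the deadline
const deadline = Date.now() + REQUEST_TIMEOUT_MS;
let ready = false;
while (!ready && Date.now() < deadline) {
try {
const response = await fetch(`${baseUrl}/v1/health`);
if (response.ok) {
const data = await response.json();
ready = data?.status === "ok";
}
} catch {
// Server not ready yet
}
if (!ready) await new Promise((r) => setTimeout(r, 500));
}
if (!ready) {
// Clean up the sandbox before surfacing the timeout
await sandbox.destroy();
throw new Error(`sandbox-agent did not become healthy within ${REQUEST_TIMEOUT_MS}ms`);
}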
// Connect to the server
const client = await SandboxAgent.connect({ baseUrl });
// Detect which agent to use based on available API keys
const agent = envs.ANTHROPIC_API_KEY ? "claude" : "codex";
// Create a session and start coding
await client.createSession("my-session", { agent });
await client.postMessage("my-session", {
message: "Summarize this repository",
});
for await (const event of client.streamEvents("my-session")) {
console.log(event.type, event.data);
}
// Cleanup
await sandbox.destroy();
```
## Supported Providers
ComputeSDK auto-detects your provider from environment variables:
| Provider | Environment Variables |
|----------|----------------------|
| E2B | `E2B_API_KEY` |
| Daytona | `DAYTONA_API_KEY` |
| Vercel | `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN` |
| Modal | `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET` |
| Blaxel | `BLAXEL_API_KEY` |
| CodeSandbox | `CSB_API_KEY` |
## Notes
- **Provider resolution order**: `COMPUTESDK_PROVIDER` env var takes priority, otherwise auto-detection from API keys.
- **Conditional agent installation**: Only agents with available API keys are installed, reducing setup time.
- **Command error handling**: The example validates exit codes and throws on failures for easier debugging.
- `sandbox.runCommand(..., { background: true })` keeps the server running while your app continues.
- `sandbox.getUrl({ port })` returns a public URL for the sandbox port.
- Always destroy the sandbox when you are done to avoid leaking resources.
- If sandbox creation times out, set `COMPUTESDK_TIMEOUT_MS` to a higher value (default: 120000ms).
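The note about always destroying the sandbox can be sketched as a `try`/`finally` wrapper, so cleanup still runs when the session code throws or the event stream fails. `withCleanup` and `Destroyable` are hypothetical names introduced here for illustration and are not part of ComputeSDK; the sketch only assumes the sandbox object exposes an async `destroy()` as in the example above.

```typescript
/** Anything with an async `destroy()`; the sandbox in the example above fits this shape. */
interface Destroyable {
  destroy(): Promise<unknown>;
}

/**
 * Hypothetical helper (not a ComputeSDK API): runs `work` and always
 * destroys the resource afterwards, even if `work` throws.
 */
async function withCleanup<R extends Destroyable, T>(
  resource: R,
  work: (resource: R) => Promise<T>,
): Promise<T> {
  try {
    return await work(resource);
  } finally {
    // Runs on success and on failure, so the sandbox is never leaked.
    await resource.destroy();
  }
}
```

With this in place, the install, connect, and streaming steps would go inside the callback passed to `withCleanup(sandbox, ...)` rather than being followed by a bare `sandbox.destroy()` call.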
## Explicit Provider Selection
To force a specific provider instead of auto-detection, set the `COMPUTESDK_PROVIDER` environment variable:
```bash
export COMPUTESDK_PROVIDER=e2b
```
Or configure programmatically using `getProviderConfigFromEnv()`:
```typescript
import { compute, getProviderConfigFromEnv, type ExplicitComputeConfig } from "computesdk";
const config: ExplicitComputeConfig = {
provider: "e2b",
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
requestTimeoutMs: 120_000,
};
// Automatically populate provider-specific config from environment
const providerConfig = getProviderConfigFromEnv("e2b");
if (Object.keys(providerConfig).length > 0) {
(config as any).e2b = providerConfig;
}
compute.setConfig(config);
```
## Direct Mode (No ComputeSDK API Key)
To bypass the ComputeSDK gateway and use provider SDKs directly, see the provider-specific examples:
- [E2B](/deploy/e2b)
- [Daytona](/deploy/daytona)
- [Vercel](/deploy/vercel)


@ -1,27 +0,0 @@
---
title: "Deploy"
sidebarTitle: "Overview"
description: "Choose where to run the sandbox-agent server."
icon: "server"
---
<CardGroup cols={2}>
<Card title="Local" icon="laptop" href="/deploy/local">
Run locally for development. The SDK can auto-spawn the server.
</Card>
<Card title="E2B" icon="cube" href="/deploy/e2b">
Deploy inside an E2B sandbox with network access.
</Card>
<Card title="Vercel" icon="triangle" href="/deploy/vercel">
Deploy inside a Vercel Sandbox with port forwarding.
</Card>
<Card title="Cloudflare" icon="cloud" href="/deploy/cloudflare">
Deploy inside a Cloudflare Sandbox with port exposure.
</Card>
<Card title="Daytona" icon="cloud" href="/deploy/daytona">
Run in a Daytona workspace with port forwarding.
</Card>
<Card title="Docker" icon="docker" href="/deploy/docker">
Build and run in a container (development only).
</Card>
</CardGroup>


@ -25,65 +25,98 @@
},
"navbar": {
"links": [
{
"label": "Gigacode",
"icon": "terminal",
"href": "https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode"
},
{
"label": "Discord",
"icon": "discord",
"href": "https://discord.gg/auCecybynK"
},
{
"label": "GitHub",
"icon": "github",
"type": "github",
"href": "https://github.com/rivet-dev/sandbox-agent"
}
]
},
"navigation": {
"pages": [
"tabs": [
{
"group": "Getting started",
"tab": "Documentation",
"pages": [
"quickstart",
"building-chat-ui",
"manage-sessions",
"opencode-compatibility"
]
},
{
"group": "Deploy",
"pages": [
"deploy/index",
"deploy/local",
"deploy/e2b",
"deploy/daytona",
"deploy/vercel",
"deploy/cloudflare",
"deploy/docker"
]
},
{
"group": "SDKs",
"pages": ["sdks/typescript", "sdks/python"]
},
{
"group": "Reference",
"pages": [
"cli",
"inspector",
"session-transcript-schema",
"gigacode",
{
"group": "AI",
"pages": ["ai/skill", "ai/llms-txt"]
"group": "Getting started",
"pages": [
"quickstart",
"building-chat-ui",
"manage-sessions",
{
"group": "Deploy",
"icon": "server",
"pages": [
"deploy/local",
"deploy/computesdk",
"deploy/e2b",
"deploy/daytona",
"deploy/vercel",
"deploy/cloudflare",
"deploy/docker"
]
}
]
},
{
"group": "Advanced",
"pages": ["daemon", "cors", "telemetry"]
"group": "SDKs",
"pages": ["sdks/typescript", "sdks/python"]
},
{
"group": "Agent Features",
"pages": [
"agent-sessions",
"attachments",
"skills-config",
"mcp-config",
"custom-tools"
]
},
{
"group": "Features",
"pages": ["file-system"]
},
{
"group": "Reference",
"pages": [
"cli",
"inspector",
"session-transcript-schema",
"opencode-compatibility",
{
"group": "More",
"pages": [
"credentials",
"daemon",
"cors",
"telemetry",
{
"group": "AI",
"pages": ["ai/skill", "ai/llms-txt"]
}
]
}
]
}
]
},
{
"group": "HTTP API Reference",
"openapi": "openapi.json"
"tab": "HTTP API",
"pages": [
{
"group": "HTTP Reference",
"openapi": "openapi.json"
}
]
}
]
}

184
docs/file-system.mdx Normal file

@ -0,0 +1,184 @@
---
title: "File System"
description: "Read, write, and manage files inside the sandbox."
sidebarTitle: "File System"
icon: "folder"
---
The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload batches of files via tar archives.
## Path Resolution
- Absolute paths are used as-is.
- Relative paths use the session working directory when `sessionId` is provided.
- Without `sessionId`, relative paths resolve against the server home directory.
- Relative paths cannot contain `..` or absolute prefixes; requests that attempt to escape the root are rejected.
The session working directory is the server process's current working directory at the moment the session is created.
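The resolution rules above can be sketched as a small pure function. This is an illustration of the documented rules only, not the server's implementation; `resolveFsPath`, `ResolveOptions`, and the error message are hypothetical names chosen for the sketch, and POSIX-style paths are assumed.

```typescript
import path from "node:path";

interface ResolveOptions {
  /** Working directory of the session, present when a sessionId was supplied. */
  sessionCwd?: string;
  /** Server home directory, used when no session is given. */
  homeDir: string;
}

/** Hypothetical sketch of the documented path rules. */
function resolveFsPath(input: string, opts: ResolveOptions): string {
  // Absolute paths are used as-is.
  if (path.posix.isAbsolute(input)) return input;

  // Relative paths may not contain `..` segments.
  if (input.split("/").includes("..")) {
    throw new Error(`path escapes root: ${input}`);
  }

  // Relative paths resolve against the session cwd, else the home directory.
  const root = opts.sessionCwd ?? opts.homeDir;
  return path.posix.join(root, input);
}
```

For example, under these assumptions `./notes.txt` with a session working directory of `/workspace` resolves to `/workspace/notes.txt`, while the same path without a session resolves under the home directory.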
## List Entries
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const entries = await client.listFsEntries({
path: "./workspace",
sessionId: "my-session",
});
console.log(entries);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Read And Write Files
`PUT /v1/fs/file` writes raw bytes. `GET /v1/fs/file` returns raw bytes.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.writeFsFile({ path: "./notes.txt", sessionId: "my-session" }, "hello");
const bytes = await client.readFsFile({
path: "./notes.txt",
sessionId: "my-session",
});
const text = new TextDecoder().decode(bytes);
console.log(text);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary "hello"
curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--output ./notes.txt
```
</CodeGroup>
## Create Directories
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.mkdirFs({
path: "./data",
sessionId: "my-session",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Move, Delete, And Stat
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.moveFs(
{ from: "./notes.txt", to: "./notes-old.txt", overwrite: true },
{ sessionId: "my-session" },
);
const stat = await client.statFs({
path: "./notes-old.txt",
sessionId: "my-session",
});
await client.deleteFsEntry({
path: "./notes-old.txt",
sessionId: "my-session",
});
console.log(stat);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/move?sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"from":"./notes.txt","to":"./notes-old.txt","overwrite":true}'
curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Batch Upload (Tar)
Batch upload accepts `application/x-tar` only and extracts into the destination directory. The response returns absolute paths for extracted files, capped at 1024 entries.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
import path from "node:path";
import * as tar from "tar";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const archivePath = path.join(process.cwd(), "skills.tar");
await tar.c({
cwd: "./skills",
file: archivePath,
}, ["."]);
const tarBuffer = await fs.promises.readFile(archivePath);
const result = await client.uploadFsBatch(tarBuffer, {
path: "./skills",
sessionId: "my-session",
});
console.log(result);
```
```bash cURL
tar -cf skills.tar -C ./skills .
curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/x-tar" \
--data-binary @skills.tar
```
</CodeGroup>


@ -1,7 +1,6 @@
---
title: "Inspector"
description: "Debug and inspect agent sessions with the Inspector UI."
icon: "magnifying-glass"
---
The Inspector is a web-based GUI for debugging and inspecting Sandbox Agent sessions. Use it to view events, send messages, and troubleshoot agent behavior in real time.

122
docs/mcp-config.mdx Normal file

@ -0,0 +1,122 @@
---
title: "MCP"
description: "Configure MCP servers for agent sessions."
sidebarTitle: "MCP"
icon: "plug"
---
MCP (Model Context Protocol) servers extend agents with tools. To have Sandbox Agent load MCP servers automatically when a session starts, pass an `mcp` map in the create-session request.
## Session Config
The `mcp` field is a map of server name to config. Use `type: "local"` for stdio servers and `type: "remote"` for HTTP/SSE servers:
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-mcp", {
agent: "claude",
mcp: {
filesystem: {
type: "local",
command: "my-mcp-server",
args: ["--root", "."],
},
github: {
type: "remote",
url: "https://example.com/mcp",
headers: {
Authorization: "Bearer ${GITHUB_TOKEN}",
},
},
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-mcp" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"filesystem": {
"type": "local",
"command": "my-mcp-server",
"args": ["--root", "."]
},
"github": {
"type": "remote",
"url": "https://example.com/mcp",
"headers": {
"Authorization": "Bearer ${GITHUB_TOKEN}"
}
}
}
}'
```
</CodeGroup>
## Config Fields
### Local Server
Stdio servers that run inside the sandbox.
| Field | Description |
|---|---|
| `type` | `local` |
| `command` | string or array (`["node", "server.js"]`) |
| `args` | array of string arguments |
| `env` | environment variables map |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `cwd` | working directory for the MCP process |
```json
{
"type": "local",
"command": ["node", "./mcp/server.js"],
"args": ["--root", "."],
"env": { "LOG_LEVEL": "debug" },
"cwd": "/workspace"
}
```
### Remote Server
HTTP/SSE servers accessed over the network.
| Field | Description |
|---|---|
| `type` | `remote` |
| `url` | MCP server URL |
| `headers` | static headers map |
| `bearerTokenEnvVar` | env var name to inject into `Authorization: Bearer ...` |
| `envHeaders` | map of header name to env var name |
| `oauth` | object with `clientId`, `clientSecret`, `scope`, or `false` to disable |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `transport` | `http` or `sse` |
```json
{
"type": "remote",
"url": "https://example.com/mcp",
"headers": { "x-client": "sandbox-agent" },
"bearerTokenEnvVar": "MCP_TOKEN",
"transport": "sse"
}
```
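The two field tables above can be read as a discriminated union keyed on `type`. The sketch below is one way to write that down for client-side checking; the type names (`LocalMcpServer`, `RemoteMcpServer`, `McpConfigMap`) and the helper `remoteServerNames` are hypothetical, and which fields are optional is an assumption, since the tables do not say.

```typescript
/** Stdio server that runs inside the sandbox (fields from the Local Server table). */
interface LocalMcpServer {
  type: "local";
  command: string | string[];
  args?: string[];
  env?: Record<string, string>;
  enabled?: boolean;
  timeoutMs?: number;
  cwd?: string;
}

/** HTTP/SSE server reached over the network (fields from the Remote Server table). */
interface RemoteMcpServer {
  type: "remote";
  url: string;
  headers?: Record<string, string>;
  bearerTokenEnvVar?: string;
  envHeaders?: Record<string, string>;
  oauth?: { clientId?: string; clientSecret?: string; scope?: string } | false;
  enabled?: boolean;
  timeoutMs?: number;
  transport?: "http" | "sse";
}

type McpServerConfig = LocalMcpServer | RemoteMcpServer;
/** Map of server name to config, as passed in the `mcp` field. */
type McpConfigMap = Record<string, McpServerConfig>;

/** Hypothetical helper: names of the servers that connect over the network. */
function remoteServerNames(mcp: McpConfigMap): string[] {
  return Object.entries(mcp)
    .filter(([, config]) => config.type === "remote")
    .map(([name]) => name);
}

// Example map mirroring the JSON snippets above.
const exampleMcp: McpConfigMap = {
  filesystem: { type: "local", command: ["node", "./mcp/server.js"], args: ["--root", "."] },
  github: { type: "remote", url: "https://example.com/mcp", transport: "sse" },
};
```

Typing the map this way lets the compiler reject, for instance, a `url` on a local server before the request is sent.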
## Custom MCP Servers
To bundle and upload your own MCP server into the sandbox, see [Custom Tools](/custom-tools).

File diff suppressed because it is too large.


@ -1,7 +1,6 @@
---
title: "OpenCode SDK & UI Support"
title: "OpenCode Compatibility"
description: "Connect OpenCode clients, SDKs, and web UI to Sandbox Agent."
icon: "rectangle-terminal"
---
<Warning>
@ -60,10 +59,11 @@ The OpenCode web UI can connect to Sandbox Agent for a full browser-based experi
</Step>
<Step title="Clone and Start the OpenCode Web App">
```bash
git clone https://github.com/opencode-ai/opencode
git clone https://github.com/anomalyco/opencode
cd opencode/packages/app
export VITE_OPENCODE_SERVER_HOST=127.0.0.1
export VITE_OPENCODE_SERVER_PORT=2468
bun install
bun run dev -- --host 127.0.0.1 --port 5173
```
</Step>
@ -113,6 +113,7 @@ for await (const event of events.stream) {
- **CORS**: When using the web UI from a different origin, configure `--cors-allow-origin`
- **Provider Selection**: Use the provider/model selector in the UI to choose which backing agent to use (claude, codex, opencode, amp)
- **Models & Variants**: Providers are grouped by backing agent (e.g. Claude Code, Codex, Amp). OpenCode models are grouped by `OpenCode (<provider>)` to preserve their native provider grouping. Each model keeps its real model ID, and variants are exposed when available (Codex/OpenCode/Amp).
- **Optional Native Proxy for TUI/Config Endpoints**: Set `OPENCODE_COMPAT_PROXY_URL` (for example `http://127.0.0.1:4096`) to proxy select OpenCode-native endpoints to a real OpenCode server. This currently applies to `/command`, `/config`, `/global/config`, and `/tui/*`. If not set, sandbox-agent uses its built-in compatibility handlers.
## Endpoint Coverage
@ -134,10 +135,15 @@ See the full endpoint compatibility table below. Most endpoints are functional f
| `GET /question` | ✓ | List pending questions |
| `POST /question/{id}/reply` | ✓ | Answer agent questions |
| `GET /provider` | ✓ | Returns provider metadata |
| `GET /command` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `GET /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `/tui/*` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /agent` | | Returns agent list |
| `GET /config` | | Returns config |
| *other endpoints* | | Return empty/stub responses |
✓ Functional &nbsp;&nbsp; Stubbed
✓ Functional &nbsp;&nbsp; ↔ Proxied (optional) &nbsp;&nbsp; Stubbed
</Accordion>


@ -1,7 +1,6 @@
---
title: "Session Transcript Schema"
description: "Universal event schema for session transcripts across all agents."
icon: "brackets-curly"
---
Each coding agent outputs events in its own native format. The sandbox-agent converts these into a universal event schema, giving you a consistent session transcript regardless of which agent you use.
@ -27,7 +26,7 @@ This table shows which agent feature coverage appears in the universal event str
| Reasoning/Thinking | - | ✓ | - | - | ✓ |
| Command Execution | - | ✓ | - | - | |
| File Changes | - | ✓ | - | - | |
| MCP Tools | - | ✓ | - | - | |
| MCP Tools | ✓ | ✓ | ✓ | ✓ | |
| Streaming Deltas | ✓ | ✓ | ✓ | - | ✓ |
| Variants | | ✓ | ✓ | ✓ | ✓ |
@ -125,6 +124,13 @@ Every event from the API is wrapped in a `UniversalEvent` envelope.
| `session.started` | Session has started | `{ metadata?: any }` |
| `session.ended` | Session has ended | `{ reason, terminated_by, message?, exit_code? }` |
### Turn Lifecycle
| Type | Description | Data |
|------|-------------|------|
| `turn.started` | Turn has started | `{ phase: "started", turn_id?, metadata? }` |
| `turn.ended` | Turn has ended | `{ phase: "ended", turn_id?, metadata? }` |
**SessionEndedData**
| Field | Type | Values |
@ -159,7 +165,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
| Type | Description | Data |
|------|-------------|------|
| `permission.requested` | Permission request pending | `{ permission_id, action, status, metadata? }` |
| `permission.resolved` | Permission granted or denied | `{ permission_id, action, status, metadata? }` |
| `permission.resolved` | Permission decision recorded | `{ permission_id, action, status, metadata? }` |
| `question.requested` | Question pending user input | `{ question_id, prompt, options, status }` |
| `question.resolved` | Question answered or rejected | `{ question_id, prompt, options, status, response? }` |
@ -169,7 +175,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
|-------|------|-------------|
| `permission_id` | string | Identifier for the permission request |
| `action` | string | What the agent wants to do |
| `status` | string | `requested`, `approved`, `denied` |
| `status` | string | `requested`, `accept`, `accept_for_session`, `reject` |
| `metadata` | any? | Additional context |
**QuestionEventData**
@ -366,6 +372,8 @@ The daemon emits synthetic events (`synthetic: true`, `source: "daemon"`) to pro
|-----------|------|
| `session.started` | Agent doesn't emit explicit session start |
| `session.ended` | Agent doesn't emit explicit session end |
| `turn.started` | Agent doesn't emit explicit turn start |
| `turn.ended` | Agent doesn't emit explicit turn end |
| `item.started` | Agent doesn't emit item start events |
| `item.delta` | Agent doesn't stream deltas natively |
| `question.*` | Claude Code plan mode (from ExitPlanMode tool) |

87
docs/skills-config.mdx Normal file

@ -0,0 +1,87 @@
---
title: "Skills"
description: "Auto-load skills into agent sessions."
sidebarTitle: "Skills"
icon: "sparkles"
---
Skills are local instruction bundles stored in `SKILL.md` files. Sandbox Agent can fetch, discover, and link skill directories into agent-specific skill paths at session start using the `skills.sources` field. The format is fully compatible with [skills.sh](https://skills.sh).
## Session Config
Pass `skills.sources` when creating a session to load skills from GitHub repos, local paths, or git URLs.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-skills", {
agent: "claude",
skills: {
sources: [
{ type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
{ type: "local", source: "/workspace/my-custom-skill" },
],
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-skills" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "github", "source": "rivet-dev/skills", "skills": ["sandbox-agent"] },
{ "type": "local", "source": "/workspace/my-custom-skill" }
]
}
}'
```
</CodeGroup>
Each skill directory must contain `SKILL.md`. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
## Skill Sources
Each entry in `skills.sources` describes where to find skills. Three source types are supported:
| Type | `source` value | Example |
|------|---------------|---------|
| `github` | `owner/repo` | `"rivet-dev/skills"` |
| `local` | Filesystem path | `"/workspace/my-skill"` |
| `git` | Git clone URL | `"https://git.example.com/skills.git"` |
### Optional fields
- **`skills`** — Array of skill directory names to include. When omitted, all discovered skills are installed.
- **`ref`** — Branch, tag, or commit to check out (default: HEAD). Applies to `github` and `git` types.
- **`subpath`** — Subdirectory within the repo to search for skills.
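The source table and optional fields above can be summarized as a single type, with a small client-side lint for the two constraints the text states (`github` sources use `owner/repo`, and `ref` applies only to `github` and `git`). `SkillSource` and `lintSkillSource` are hypothetical names for this sketch; how the server itself reacts to such entries is not documented here.

```typescript
/** One entry in `skills.sources`, following the table and optional fields above. */
interface SkillSource {
  type: "github" | "local" | "git";
  source: string;
  /** Skill directory names to include; all discovered skills when omitted. */
  skills?: string[];
  /** Branch, tag, or commit; only meaningful for `github` and `git`. */
  ref?: string;
  /** Subdirectory within the repo to search for skills. */
  subpath?: string;
}

/** Hypothetical client-side lint; returns human-readable problems, empty when none are found. */
function lintSkillSource(entry: SkillSource): string[] {
  const problems: string[] = [];
  if (entry.type === "github" && !/^[^/\s]+\/[^/\s]+$/.test(entry.source)) {
    problems.push(`github source should look like owner/repo, got "${entry.source}"`);
  }
  if (entry.type === "local" && entry.ref !== undefined) {
    problems.push("ref applies to github and git sources, not local paths");
  }
  return problems;
}
```

Running the lint over each entry before calling `createSession` gives earlier feedback than waiting for a failed session start.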
## Custom Skills
To write, upload, and configure your own skills inside the sandbox, see [Custom Tools](/custom-tools).
## Advanced
### Discovery logic
After resolving a source to a local directory (cloning if needed), Sandbox Agent discovers skills by:
1. Checking if the directory itself contains `SKILL.md`.
2. Scanning `skills/` subdirectory for child directories containing `SKILL.md`.
3. Scanning immediate children of the directory for `SKILL.md`.
Discovered skills are symlinked into project-local skill roots (`.claude/skills/<name>`, `.agents/skills/<name>`, `.opencode/skill/<name>`).
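The three discovery steps above can be sketched as follows. This is an illustration, not Sandbox Agent's code: `discoverSkills`, `FsView`, and `nodeFsView` are hypothetical names, and the sketch assumes the results of all three steps are combined and de-duplicated, which the text does not specify (the real logic may stop at the first match).

```typescript
import fs from "node:fs";
import path from "node:path";

/** Minimal filesystem view, so the sketch can be exercised without touching disk. */
interface FsView {
  isFile(p: string): boolean;
  childDirs(p: string): string[];
}

/** Default view backed by node:fs; missing paths are treated as empty. */
const nodeFsView: FsView = {
  isFile: (p) => {
    try {
      return fs.statSync(p).isFile();
    } catch {
      return false;
    }
  },
  childDirs: (p) => {
    try {
      return fs
        .readdirSync(p, { withFileTypes: true })
        .filter((entry) => entry.isDirectory())
        .map((entry) => path.join(p, entry.name));
    } catch {
      return [];
    }
  },
};

/** Returns directories under `root` that contain a SKILL.md, per the three steps. */
function discoverSkills(root: string, view: FsView = nodeFsView): string[] {
  const hasSkill = (dir: string) => view.isFile(path.join(dir, "SKILL.md"));
  const found = new Set<string>();

  // 1. The directory itself.
  if (hasSkill(root)) found.add(root);
  // 2. Children of the `skills/` subdirectory.
  for (const dir of view.childDirs(path.join(root, "skills"))) {
    if (hasSkill(dir)) found.add(dir);
  }
  // 3. Immediate children of the directory.
  for (const dir of view.childDirs(root)) {
    if (hasSkill(dir)) found.add(dir);
  }
  return [...found];
}
```

Calling `discoverSkills("/workspace/my-custom-skill")` with the default view would list the skill directories found on disk under these assumptions.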
### Caching
GitHub sources are downloaded as zip archives and git sources are cloned. Both are stored in `~/.sandbox-agent/skills-cache/` and updated on subsequent session creations. GitHub sources do not require `git` to be installed.