Merge branch 'main' into feat/support-pi

Nathan Flurry 2026-02-10 22:27:03 -08:00
commit 4c6c5983c0
156 changed files with 16196 additions and 2338 deletions

.claude/commands/release.md

@ -0,0 +1,165 @@
# Release Agent
You are a release agent for the Gigacode project (sandbox-agent). Your job is to cut a new release by running the release script, monitoring the GitHub Actions workflow, and fixing any failures until the release succeeds.
## Step 1: Gather Release Information
Ask the user what type of release they want to cut:
- **patch** - Bug fixes (e.g., 0.1.8 -> 0.1.9)
- **minor** - New features (e.g., 0.1.8 -> 0.2.0)
- **major** - Breaking changes (e.g., 0.1.8 -> 1.0.0)
- **rc** - Release candidate (e.g., 0.2.0-rc.1)
For **rc** releases, also ask:
1. What base version the RC is for (e.g., 0.2.0). If the user doesn't specify, determine it by bumping the minor version from the current version.
2. What RC number (e.g., 1, 2, 3). If the user doesn't specify, check existing git tags to auto-determine the next RC number:
```bash
git tag -l "v<base_version>-rc.*" | sort -V
```
If no prior RC tags exist for that base version, use `rc.1`. Otherwise, increment the highest existing RC number.
The final RC version string is `<base_version>-rc.<number>` (e.g., `0.2.0-rc.1`).
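The RC numbering rule above can be sketched as a small helper. This is a minimal illustration, not part of the release script: `nextRcVersion` and `defaultRcBase` are hypothetical names, and `tags` is assumed to be the output of the `git tag -l` command above split into lines.

```ts
// Hypothetical helper (not part of scripts/release): picks the next RC version
// for a base version, given existing tags such as "v0.2.0-rc.1".
function nextRcVersion(baseVersion: string, tags: string[]): string {
  const prefix = `v${baseVersion}-rc.`;
  let highest = 0;
  for (const tag of tags) {
    const trimmed = tag.trim();
    if (!trimmed.startsWith(prefix)) continue;
    const suffix = trimmed.slice(prefix.length);
    // Ignore tags whose RC suffix is not a plain number.
    if (!/^\d+$/.test(suffix)) continue;
    highest = Math.max(highest, Number(suffix));
  }
  // With no prior RC tags, highest stays 0 and the result is rc.1.
  return `${baseVersion}-rc.${highest + 1}`;
}

// Hypothetical helper: default base version when the user gives none,
// obtained by bumping the minor version of the current version.
function defaultRcBase(currentVersion: string): string {
  const [major, minor] = currentVersion.split("-")[0].split(".").map(Number);
  return `${major}.${minor + 1}.0`;
}
```

Comparing the numeric suffix rather than sorting strings keeps `rc.10` ahead of `rc.9`, which is the same reason the shell command uses `sort -V`.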
## Step 2: Confirm Release Details
Before proceeding, display the release details to the user and ask for explicit confirmation:
- Current version (read from `Cargo.toml` workspace.package.version)
- New version
- Current branch
- Whether it will be tagged as "latest" (RC releases are never tagged as latest)
Do NOT proceed without user confirmation.
## Step 3: Run the Release Script (Setup Local)
The release script handles version bumping, local checks, committing, pushing, and triggering the workflow.
For **major**, **minor**, or **patch** releases:
```bash
echo "yes" | ./scripts/release/main.ts --<type> --phase setup-local
```
For **rc** releases (using explicit version):
```bash
echo "yes" | ./scripts/release/main.ts --version <version> --phase setup-local
```
Where `<type>` is `major`, `minor`, or `patch`, and `<version>` is the full RC version string like `0.2.0-rc.1`.
The `--phase setup-local` option runs these steps in order:
1. Confirms release details (interactive prompt - piping "yes" handles this)
2. Updates version in all files (Cargo.toml, package.json files)
3. Runs local checks (cargo check, cargo fmt, pnpm typecheck)
4. Git commits with message `chore(release): update version to X.Y.Z`
5. Git pushes
6. Triggers the GitHub Actions workflow
If local checks fail at step 3, fix the issues in the codebase, then re-run using `--only-steps` to avoid re-running already-completed steps:
```bash
echo "yes" | ./scripts/release/main.ts --version <version> --only-steps run-local-checks,git-commit,git-push,trigger-workflow
```
## Step 4: Monitor the GitHub Actions Workflow
After the workflow is triggered, wait 5 seconds for it to register, then begin polling.
### Find the workflow run
```bash
gh run list --workflow=release.yaml --limit=1 --json databaseId,status,conclusion,createdAt,url
```
Verify the run was created recently (within the last 2 minutes) to confirm you are monitoring the correct run. Save the `databaseId` as the run ID.
### Poll for completion
Poll every 15 seconds using:
```bash
gh run view <run-id> --json status,conclusion
```
Report progress to the user periodically (every ~60 seconds or when status changes). The status values are:
- `queued` / `in_progress` / `waiting` - Still running, keep polling
- `completed` - Done, check `conclusion`
When `status` is `completed`, check `conclusion`:
- `success` - Release succeeded! Proceed to Step 6.
- `failure` - Proceed to Step 5.
- `cancelled` - Inform the user and stop.
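The status handling above can be sketched as a pure decision function plus a polling loop. This is an illustration under assumptions: `nextAction`, `pollRun`, `fetchRun`, and `sleep` are hypothetical names, `fetchRun` is assumed to wrap `gh run view <run-id> --json status,conclusion`, and stopping on any conclusion not listed above is a choice made for this sketch.

```ts
type NextAction = "keep-polling" | "report-success" | "handle-failure" | "stop";

// Maps the `status` and `conclusion` fields of a run to the next step.
function nextAction(status: string, conclusion: string | null): NextAction {
  if (status !== "completed") return "keep-polling"; // queued, in_progress, waiting
  switch (conclusion) {
    case "success":
      return "report-success"; // Step 6
    case "failure":
      return "handle-failure"; // Step 5
    default:
      // `cancelled`, plus (assumption) any conclusion the steps above do not cover.
      return "stop";
  }
}

// Polls until the run reaches a terminal action. `fetchRun` and `sleep` are injected
// so the loop can be exercised without calling the real `gh` CLI.
async function pollRun(
  fetchRun: () => Promise<{ status: string; conclusion: string | null }>,
  sleep: (ms: number) => Promise<void>,
  intervalMs = 15_000,
): Promise<NextAction> {
  for (;;) {
    const run = await fetchRun();
    const action = nextAction(run.status, run.conclusion);
    if (action !== "keep-polling") return action;
    await sleep(intervalMs);
  }
}
```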
## Step 5: Handle Workflow Failures
If the workflow fails:
### 5a. Get failure logs
```bash
gh run view <run-id> --log-failed
```
### 5b. Analyze the error
Read the failure logs carefully. Common failure categories:
- **Build failures** (cargo build, TypeScript compilation) - Fix the code
- **Formatting issues** (cargo fmt) - Run `cargo fmt` and commit
- **Test failures** - Fix the failing tests
- **Publishing failures** (crates.io, npm) - These may be transient; check if retry will help
- **Docker build failures** - Check Dockerfile or build script issues
- **Infrastructure/transient failures** (network timeouts, rate limits) - Just re-trigger without code changes
### 5c. Fix and re-push
If a code fix is needed:
1. Make the fix in the codebase
2. Amend the release commit (since the release version commit is the most recent):
```bash
git add -A
git commit --amend --no-edit
git push --force-with-lease
```
IMPORTANT: Use `--force-with-lease` (not `--force`) for safety. Amend the commit rather than creating a new one so the release stays as a single version-bump commit.
3. Re-trigger the workflow:
```bash
gh workflow run .github/workflows/release.yaml \
-f version=<version> \
-f latest=<true|false> \
--ref <branch>
```
Where `<branch>` is the current branch (usually `main`). Set `latest` to `false` for RC releases, `true` for stable releases that are newer than the current latest tag.
4. Return to Step 4 to monitor the new run.
If no code fix is needed (transient failure), skip straight to re-triggering the workflow (step 3 above).
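The rule for the `latest` input can be sketched as follows. This is a hedged illustration only: `parseStable` and `latestFlag` are hypothetical helpers, the release script may decide this differently, and treating a missing stable `latest` tag as "tag this release as latest" is an assumption.

```ts
// Parses a stable X.Y.Z version (optional leading "v"). Returns null for
// pre-release versions such as 0.2.0-rc.1 and for malformed input.
function parseStable(version: string): [number, number, number] | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Returns the value to pass as `-f latest=<true|false>`.
function latestFlag(version: string, currentLatest: string): boolean {
  const next = parseStable(version);
  if (!next) return false; // RC releases are never tagged as latest
  const latest = parseStable(currentLatest);
  if (!latest) return true; // assumption: no stable latest tag exists yet
  for (let i = 0; i < 3; i++) {
    if (next[i] !== latest[i]) return next[i] > latest[i];
  }
  return false; // identical version is not newer
}
```

The comparison is numeric per component, so `0.1.10` is correctly treated as newer than `0.1.8`.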
### 5d. Retry limit
If the workflow has failed **5 times**, stop and report all errors to the user. Ask whether they want to continue retrying or abort the release. Do not retry infinitely.
## Step 6: Report Success
When the workflow completes successfully:
1. Print the GitHub Actions run URL
2. Print the new version number
3. Suggest running post-release testing: "Run `/project:post-release-testing` to verify the release works correctly."
## Important Notes
- The product name is "Gigacode" (capital G, lowercase c). The CLI binary is `gigacode` (lowercase).
- Do not include co-authors in any commit messages.
- Use conventional commits style (e.g., `chore(release): update version to X.Y.Z`).
- Keep commit messages to a single line.
- The release script requires `tsx` to run (it's a TypeScript file with a shebang).
- Always work on the current branch. Releases are typically cut from `main`.

@ -4,7 +4,7 @@ dist/
build/
# Dependencies
node_modules/
**/node_modules/
# Cache
.cache/

@ -20,17 +20,25 @@ jobs:
- name: Sync to skills repo
env:
SKILLS_REPO_TOKEN: ${{ secrets.RIVET_GITHUB_PAT }}
GH_TOKEN: ${{ secrets.RIVET_GITHUB_PAT }}
run: |
if [ -z "$SKILLS_REPO_TOKEN" ]; then
echo "SKILLS_REPO_TOKEN is not set" >&2
if [ -z "$GH_TOKEN" ]; then
echo "::error::RIVET_GITHUB_PAT secret is not set"
exit 1
fi
# Validate token before proceeding
if ! gh auth status 2>/dev/null; then
echo "::error::RIVET_GITHUB_PAT is invalid or expired. Rotate the token at https://github.com/settings/tokens"
exit 1
fi
git config --global user.name "github-actions[bot]"
git config --global user.email "github-actions[bot]@users.noreply.github.com"
git clone "https://x-access-token:${SKILLS_REPO_TOKEN}@github.com/rivet-dev/skills.git" /tmp/rivet-skills
# Clone public repo, configure auth via gh credential helper
gh auth setup-git
git clone https://github.com/rivet-dev/skills.git /tmp/rivet-skills
mkdir -p /tmp/rivet-skills/skills/sandbox-agent
rm -rf /tmp/rivet-skills/skills/sandbox-agent/*

.gitignore

@ -40,5 +40,13 @@ npm-debug.log*
Cargo.lock
**/*.rs.bk
# Agent runtime directories
.agents/
.claude/
.opencode/
# Example temp files
.tmp-upload/
# CLI binaries (downloaded during npm publish)
sdks/cli/platforms/*/bin/

.mcp.json

@ -0,0 +1,10 @@
{
"mcpServers": {
"everything": {
"args": [
"@modelcontextprotocol/server-everything"
],
"command": "npx"
}
}
}

@ -47,6 +47,16 @@ Universal schema guidance:
- On parse failures, emit an `agent.unparsed` event (source=daemon, synthetic=true) and treat it as a test failure. Preserve raw payloads when `include_raw=true`.
- Track subagent support in `docs/conversion.md`. For now, normalize subagent activity into normal message/tool flow, but revisit explicit subagent modeling later.
- Keep the FAQ in `README.md` and `frontend/packages/website/src/components/FAQ.tsx` in sync. When adding or modifying FAQ entries, update both files.
- Update `research/wip-agent-support.md` as agent support changes are implemented.
### OpenAPI / utoipa requirements
Every `#[utoipa::path(...)]` handler function must have a doc comment where:
- The **first line** becomes the OpenAPI `summary` (short human-readable title, e.g. `"List Agents"`). This is used as the sidebar label and page heading in the docs site.
- The **remaining lines** become the OpenAPI `description` (one-sentence explanation of what the endpoint does).
- Every `responses(...)` entry must have a `description` (no empty descriptions).
When adding or modifying endpoints, regenerate `docs/openapi.json` and verify titles render correctly in the docs site.
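To make the summary/description split concrete, here is a small sketch of the rule in TypeScript. It only mirrors the convention described above for illustration: `splitDocComment` and `docCommentProblems` are hypothetical helpers, not utoipa APIs, and the real extraction happens inside the Rust macro.

```ts
// Splits a Rust `///` doc comment into the parts that become the OpenAPI
// `summary` (first non-empty line) and `description` (remaining lines).
function splitDocComment(doc: string): { summary: string; description: string } {
  const lines = doc
    .split("\n")
    .map((line) => line.replace(/^\s*\/\/\/ ?/, "").trim())
    .filter((line) => line.length > 0);
  const [summary = "", ...rest] = lines;
  return { summary, description: rest.join(" ") };
}

// Lists the ways a handler's doc comment violates the requirements above.
function docCommentProblems(doc: string): string[] {
  const { summary, description } = splitDocComment(doc);
  const problems: string[] = [];
  if (summary === "") problems.push("missing summary");
  if (description === "") problems.push("missing description");
  return problems;
}
```

For example, a comment with only `/// List Agents` would be flagged as missing a description.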
### CLI ⇄ HTTP endpoint map (keep in sync)
@ -64,11 +74,45 @@ Universal schema guidance:
- `sandbox-agent api sessions reply-question` ⇄ `POST /v1/sessions/{sessionId}/questions/{questionId}/reply`
- `sandbox-agent api sessions reject-question` ⇄ `POST /v1/sessions/{sessionId}/questions/{questionId}/reject`
- `sandbox-agent api sessions reply-permission` ⇄ `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply`
- `sandbox-agent api fs entries` ⇄ `GET /v1/fs/entries`
- `sandbox-agent api fs read` ⇄ `GET /v1/fs/file`
- `sandbox-agent api fs write` ⇄ `PUT /v1/fs/file`
- `sandbox-agent api fs delete` ⇄ `DELETE /v1/fs/entry`
- `sandbox-agent api fs mkdir` ⇄ `POST /v1/fs/mkdir`
- `sandbox-agent api fs move` ⇄ `POST /v1/fs/move`
- `sandbox-agent api fs stat` ⇄ `GET /v1/fs/stat`
- `sandbox-agent api fs upload-batch` ⇄ `POST /v1/fs/upload-batch`
## OpenCode CLI (Experimental)
## OpenCode Compatibility Layer
`sandbox-agent opencode` starts a sandbox-agent server and attaches an OpenCode session (uses `/opencode`).
### Session ownership
Sessions are stored **only** in sandbox-agent's v1 `SessionManager` — they are never sent to or stored in the native OpenCode server. The OpenCode TUI reads sessions via `GET /session` which the compat layer serves from the v1 store. The native OpenCode process has no knowledge of sessions.
### Proxy elimination strategy
The `/opencode` compat layer (`opencode_compat.rs`) historically proxied many endpoints to the native OpenCode server via `proxy_native_opencode()`. The goal is to **eliminate proxying** by implementing each endpoint natively using the v1 `SessionManager` as the single source of truth.
**Already de-proxied** (use v1 SessionManager directly):
- `GET /session` → `oc_session_list` reads from `SessionManager::list_sessions()`
- `GET /session/{id}` → `oc_session_get` reads from `SessionManager::get_session_info()`
- `GET /session/status` → `oc_session_status` derives busy/idle from v1 session `ended` flag
- `POST /tui/open-sessions` — returns `true` directly (TUI fetches sessions from `GET /session`)
- `POST /tui/select-session` — emits `tui.session.select` event via the OpenCode event broadcaster
**Still proxied** (none of these reference session IDs or the session list — all are session-agnostic):
- `GET /command` — command list
- `GET /config`, `PATCH /config` — project config read/write
- `GET /global/config`, `PATCH /global/config` — global config read/write
- `GET /tui/control/next`, `POST /tui/control/response` — TUI control loop
- `POST /tui/append-prompt`, `/tui/submit-prompt`, `/tui/clear-prompt` — prompt management
- `POST /tui/open-help`, `/tui/open-themes`, `/tui/open-models` — TUI navigation
- `POST /tui/execute-command`, `/tui/show-toast`, `/tui/publish` — TUI actions
When converting a proxied endpoint: add needed fields to `SessionState`/`SessionInfo` in `router.rs`, implement the logic natively in `opencode_compat.rs`, and use `session_info_to_opencode_value()` to format responses.
## Post-Release Testing
After cutting a release, verify the release works correctly. Run `/project:post-release-testing` to execute the testing agent.

@ -3,21 +3,21 @@ resolver = "2"
members = ["server/packages/*", "gigacode"]
[workspace.package]
version = "0.1.7"
version = "0.1.12-rc.1"
edition = "2021"
authors = [ "Rivet Gaming, LLC <developer@rivet.gg>" ]
license = "Apache-2.0"
repository = "https://github.com/rivet-dev/sandbox-agent"
description = "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, Amp, and Pi."
description = "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, Cursor, Amp, and Pi."
[workspace.dependencies]
# Internal crates
sandbox-agent = { version = "0.1.7", path = "server/packages/sandbox-agent" }
sandbox-agent-error = { version = "0.1.7", path = "server/packages/error" }
sandbox-agent-agent-management = { version = "0.1.7", path = "server/packages/agent-management" }
sandbox-agent-agent-credentials = { version = "0.1.7", path = "server/packages/agent-credentials" }
sandbox-agent-universal-agent-schema = { version = "0.1.7", path = "server/packages/universal-agent-schema" }
sandbox-agent-extracted-agent-schemas = { version = "0.1.7", path = "server/packages/extracted-agent-schemas" }
sandbox-agent = { version = "0.1.12-rc.1", path = "server/packages/sandbox-agent" }
sandbox-agent-error = { version = "0.1.12-rc.1", path = "server/packages/error" }
sandbox-agent-agent-management = { version = "0.1.12-rc.1", path = "server/packages/agent-management" }
sandbox-agent-agent-credentials = { version = "0.1.12-rc.1", path = "server/packages/agent-credentials" }
sandbox-agent-universal-agent-schema = { version = "0.1.12-rc.1", path = "server/packages/universal-agent-schema" }
sandbox-agent-extracted-agent-schemas = { version = "0.1.12-rc.1", path = "server/packages/extracted-agent-schemas" }
# Serialization
serde = { version = "1.0", features = ["derive"] }
@ -69,6 +69,7 @@ url = "2.5"
regress = "0.10"
include_dir = "0.7"
base64 = "0.22"
toml_edit = "0.22"
# Code generation (build deps)
typify = "0.4"

@ -5,7 +5,7 @@
<h3 align="center">Run Coding Agents in Sandboxes. Control Them Over HTTP.</h3>
<p align="center">
A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Amp, or Pi — streaming events, handling permissions, managing sessions.
A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Cursor, Amp, or Pi — streaming events, handling permissions, managing sessions.
</p>
<p align="center">
@ -24,13 +24,13 @@ Sandbox Agent solves three problems:
1. **Coding agents need sandboxes** — You can't let AI execute arbitrary code on your production servers. Coding agents need isolated environments, but existing SDKs assume local execution. Sandbox Agent is a server that runs inside the sandbox and exposes HTTP/SSE.
2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Cursor, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
3. **Sessions are ephemeral** — Agent transcripts live in the sandbox. When the process ends, you lose everything. Sandbox Agent streams events in a universal schema to your storage. Persist to Postgres, ClickHouse, or [Rivet](https://rivet.dev). Replay later, audit everything.
## Features
- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Amp, and Pi with full feature coverage
- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Cursor, Amp, and Pi with full feature coverage
- **Streaming Events**: Real-time SSE stream of everything the agent does — tool calls, permission requests, file edits, and more
- **Universal Session Schema**: [Standardized schema](https://sandboxagent.dev/docs/session-transcript-schema) that normalizes all agent event formats for storage and replay
- **Human-in-the-Loop**: Approve or deny tool executions and answer agent questions remotely over HTTP
@ -131,6 +131,8 @@ for await (const event of client.streamEvents("demo", { offset: 0 })) {
}
```
`permissionMode: "acceptEdits"` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
[SDK documentation](https://sandboxagent.dev/docs/sdks/typescript) — [Building a Chat UI](https://sandboxagent.dev/docs/building-chat-ui) — [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions)
### HTTP Server
@ -232,7 +234,7 @@ No, they're complementary. AI SDK is for building chat interfaces and calling LL
<details>
<summary><strong>Which coding agents are supported?</strong></summary>
Claude Code, Codex, OpenCode, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
Claude Code, Codex, OpenCode, Cursor, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
</details>
<details>

docs/agent-sessions.mdx

@ -0,0 +1,278 @@
---
title: "Agent Sessions"
description: "Create sessions and send messages to agents."
sidebarTitle: "Sessions"
icon: "comments"
---
Sessions are the unit of interaction with an agent. You create one session per task, then send messages and stream events.
## Session Options
`POST /v1/sessions/{sessionId}` accepts the following fields:
- `agent` (required): `claude`, `codex`, `opencode`, `amp`, or `mock`
- `agentMode`: agent mode string (for example, `build`, `plan`)
- `permissionMode`: permission mode string (`default`, `plan`, `bypass`, etc.)
- `model`: model override (agent-specific)
- `variant`: model variant (agent-specific)
- `agentVersion`: agent version override
- `mcp`: MCP server config map (see `MCP`)
- `skills`: skill path config (see `Skills`)
## Create A Session
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("build-session", {
agent: "codex",
agentMode: "build",
permissionMode: "default",
model: "gpt-4.1",
variant: "reasoning",
agentVersion: "latest",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "codex",
"agentMode": "build",
"permissionMode": "default",
"model": "gpt-4.1",
"variant": "reasoning",
"agentVersion": "latest"
}'
```
</CodeGroup>
## Send A Message
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("build-session", {
message: "Summarize the repository structure.",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Summarize the repository structure."}'
```
</CodeGroup>
## Stream A Turn
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const response = await client.postMessageStream("build-session", {
message: "Explain the main entrypoints.",
});
const reader = response.body?.getReader();
if (reader) {
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value, { stream: true }));
}
}
```
```bash cURL
curl -N -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages/stream" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Explain the main entrypoints."}'
```
</CodeGroup>
## Fetch Events
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const events = await client.getEvents("build-session", {
offset: 0,
limit: 50,
includeRaw: false,
});
console.log(events.events);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events?offset=0&limit=50" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
`GET /v1/sessions/{sessionId}/get-messages` is an alias for `events`.
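To read more events than one request returns, the `offset` parameter can be advanced page by page. The sketch below is an assumption-laden illustration: it assumes each event carries a numeric `sequence` field and that `offset` means "last seen sequence"; `nextOffset` and `fetchAllEvents` are hypothetical helpers, not SDK methods.

```ts
// Assumed event shape: only the `sequence` field matters for paging here.
interface SequencedEvent {
  sequence: number;
}

// Returns the offset to send on the next request: the highest sequence seen so far.
function nextOffset(currentOffset: number, events: SequencedEvent[]): number {
  let offset = currentOffset;
  for (const event of events) {
    if (event.sequence > offset) offset = event.sequence;
  }
  return offset;
}

// Collects events by calling `fetchPage` until an empty page comes back.
async function fetchAllEvents<E extends SequencedEvent>(
  fetchPage: (offset: number) => Promise<{ events: E[] }>,
): Promise<E[]> {
  const all: E[] = [];
  let offset = 0;
  for (;;) {
    const page = await fetchPage(offset);
    if (page.events.length === 0) return all;
    all.push(...page.events);
    const advanced = nextOffset(offset, page.events);
    // Stop if the offset did not advance, to avoid looping forever.
    if (advanced === offset) return all;
    offset = advanced;
  }
}
```

With the client from the example above, a call might look like `fetchAllEvents((offset) => client.getEvents("build-session", { offset, limit: 50 }))`, provided the returned events expose `sequence` as assumed.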
## Stream Events (SSE)
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
for await (const event of client.streamEvents("build-session", { offset: 0 })) {
console.log(event.type, event.data);
}
```
```bash cURL
curl -N -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events/sse?offset=0" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## List Sessions
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const sessions = await client.listSessions();
console.log(sessions.sessions);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Reply To A Question
When the agent asks a question, reply with an array of answers. Each inner array is one multi-select response.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.replyQuestion("build-session", "question-1", {
answers: [["yes"]],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"answers":[["yes"]]}'
```
</CodeGroup>
## Reject A Question
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.rejectQuestion("build-session", "question-1");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reject" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Reply To A Permission Request
Set `reply` to `once`, `always`, or `reject`.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.replyPermission("build-session", "permission-1", {
reply: "once",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/permissions/permission-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"reply":"once"}'
```
</CodeGroup>
## Terminate A Session
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.terminateSession("build-session");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/terminate" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>

docs/attachments.mdx

@ -0,0 +1,87 @@
---
title: "Attachments"
description: "Upload files into the sandbox and attach them to prompts."
sidebarTitle: "Attachments"
icon: "paperclip"
---
Use the filesystem API to upload files, then reference them as attachments when sending prompts.
<Steps>
<Step title="Upload a file">
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const buffer = await fs.promises.readFile("./data.csv");
const upload = await client.writeFsFile(
{ path: "./uploads/data.csv", sessionId: "my-session" },
buffer,
);
console.log(upload.path);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./data.csv
```
</CodeGroup>
The response returns the absolute path that you should use for attachments.
</Step>
<Step title="Attach the file in a prompt">
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("my-session", {
message: "Please analyze the attached CSV.",
attachments: [
{
path: "/home/sandbox/uploads/data.csv",
mime: "text/csv",
filename: "data.csv",
},
],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"message": "Please analyze the attached CSV.",
"attachments": [
{
"path": "/home/sandbox/uploads/data.csv",
"mime": "text/csv",
"filename": "data.csv"
}
]
}'
```
</CodeGroup>
</Step>
</Steps>
## Notes
- Use absolute paths from the upload response to avoid ambiguity.
- If `mime` is omitted, the server defaults to `application/octet-stream`.
- OpenCode receives file parts directly; other agents will see the attachment paths appended to the prompt.
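The notes above can be folded into a small helper that builds an attachment entry from the path returned by the upload. This is a sketch under assumptions: `attachmentFromUpload` is a hypothetical helper, the `Attachment` shape is inferred from the example request, and rejecting relative paths is a choice made here, not documented server behavior.

```ts
// Shape inferred from the request example above; the SDK's own type may differ.
interface Attachment {
  path: string;
  mime?: string;
  filename?: string;
}

// Builds an attachment from the absolute path in the upload response.
// When `mime` is omitted, the field is left out so the server default applies.
function attachmentFromUpload(uploadPath: string, mime?: string): Attachment {
  if (!uploadPath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${uploadPath}`);
  }
  // Use the last path segment as the display filename.
  const filename = uploadPath.slice(uploadPath.lastIndexOf("/") + 1);
  return mime === undefined
    ? { path: uploadPath, filename }
    : { path: uploadPath, mime, filename };
}
```

The result can be passed in the `attachments` array of `postMessage`, as in the example above.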

@ -29,7 +29,7 @@ const sessionId = `session-${crypto.randomUUID()}`;
await client.createSession(sessionId, {
agent: "claude",
agentMode: "code", // Optional: agent-specific mode
permissionMode: "default", // Optional: "default" | "plan" | "bypass"
permissionMode: "default", // Optional: "default" | "plan" | "bypass" | "acceptEdits" (Claude: accept edits; Codex: auto-approve file changes; others: default)
model: "claude-sonnet-4", // Optional: model override
});
```
@ -70,7 +70,7 @@ Use `offset` to track the last seen `sequence` number and resume from where you
### Bare minimum
Handle these three events to render a basic chat:
Handle item lifecycle plus turn lifecycle to render a basic chat:
```ts
type ItemState = {
@ -79,9 +79,20 @@ type ItemState = {
};
const items = new Map<string, ItemState>();
let turnInProgress = false;
function handleEvent(event: UniversalEvent) {
switch (event.type) {
case "turn.started": {
turnInProgress = true;
break;
}
case "turn.ended": {
turnInProgress = false;
break;
}
case "item.started": {
const { item } = event.data as ItemEventData;
items.set(item.item_id, { item, deltas: [] });
@ -110,12 +121,14 @@ function handleEvent(event: UniversalEvent) {
}
```
When rendering, show a loading indicator while `item.status === "in_progress"`:
When rendering:
- Use `turnInProgress` for turn-level UI state (disable send button, show global "Agent is responding", etc.).
- Use `item.status === "in_progress"` for per-item streaming state.
```ts
function renderItem(state: ItemState) {
const { item, deltas } = state;
const isLoading = item.status === "in_progress";
const isItemLoading = item.status === "in_progress";
// For streaming text, combine item content with accumulated deltas
const text = item.content
@ -126,7 +139,8 @@ function renderItem(state: ItemState) {
return {
content: streamedText,
isLoading,
isItemLoading,
isTurnLoading: turnInProgress,
role: item.role,
kind: item.kind,
};

@ -2,7 +2,6 @@
title: "CLI Reference"
description: "Complete CLI reference for sandbox-agent."
sidebarTitle: "CLI"
icon: "terminal"
---
## Server
@ -71,7 +70,6 @@ sandbox-agent opencode [OPTIONS]
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `--session-title <TITLE>` | - | Title for the OpenCode session |
| `--opencode-bin <PATH>` | - | Override `opencode` binary path |
```bash
sandbox-agent opencode --token "$TOKEN"
@ -79,7 +77,7 @@ sandbox-agent opencode --token "$TOKEN"
The daemon logs to a per-host log file under the sandbox-agent data directory (for example, `~/.local/share/sandbox-agent/daemon/daemon-127-0-0-1-2468.log`).
Requires the `opencode` binary to be installed (or set `OPENCODE_BIN` / `--opencode-bin`). If it is not found on `PATH`, sandbox-agent installs it automatically.
Existing installs are reused and missing binaries are installed automatically.
---
@ -247,10 +245,12 @@ sandbox-agent api sessions create <SESSION_ID> [OPTIONS]
|--------|-------------|
| `-a, --agent <AGENT>` | Agent identifier (required) |
| `-g, --agent-mode <MODE>` | Agent mode |
| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`) |
| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`, `acceptEdits`) |
| `-m, --model <MODEL>` | Model override |
| `-v, --variant <VARIANT>` | Model variant |
| `-A, --agent-version <VERSION>` | Agent version |
| `--mcp-config <PATH>` | JSON file with MCP server config (see `mcp` docs) |
| `--skill <PATH>` | Skill directory or `SKILL.md` path (repeatable) |
```bash
sandbox-agent api sessions create my-session \
@ -259,6 +259,8 @@ sandbox-agent api sessions create my-session \
--permission-mode default
```
`acceptEdits` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
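The per-agent behavior of `acceptEdits` can be summarized as a lookup. This is an illustrative sketch only: `acceptEditsBehavior` and its return labels are hypothetical and do not reflect how the server implements the mapping.

```ts
type AcceptEditsBehavior =
  | "pass-through" // forwarded to the agent unchanged
  | "auto-approve-file-changes" // file changes approved automatically
  | "treated-as-default"; // handled like permission mode `default`

// Describes what `--permission-mode acceptEdits` means for a given agent.
function acceptEditsBehavior(agent: string): AcceptEditsBehavior {
  switch (agent) {
    case "claude":
      return "pass-through";
    case "codex":
      return "auto-approve-file-changes";
    default:
      return "treated-as-default";
  }
}
```

A client could use such a lookup to warn users that `acceptEdits` has no special effect for agents other than Claude and Codex.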
#### Send Message
```bash
@ -380,6 +382,132 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once
---
### Filesystem
#### List Entries
```bash
sandbox-agent api fs entries [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--path <PATH>` | Directory path (default: `.`) |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs entries --path ./workspace
```
#### Read File
`api fs read` writes raw bytes to stdout.
```bash
sandbox-agent api fs read <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs read ./notes.txt > ./notes.txt
```
#### Write File
```bash
sandbox-agent api fs write <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--content <TEXT>` | Write UTF-8 content |
| `--from-file <PATH>` | Read content from a local file |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs write ./hello.txt --content "hello"
sandbox-agent api fs write ./image.bin --from-file ./image.bin
```
#### Delete Entry
```bash
sandbox-agent api fs delete <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--recursive` | Delete directories recursively |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs delete ./old.log
```
#### Create Directory
```bash
sandbox-agent api fs mkdir <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs mkdir ./cache
```
#### Move/Rename
```bash
sandbox-agent api fs move <FROM> <TO> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--overwrite` | Overwrite destination if it exists |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs move ./a.txt ./b.txt --overwrite
```
#### Stat
```bash
sandbox-agent api fs stat <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs stat ./notes.txt
```
#### Upload Batch (tar)
```bash
sandbox-agent api fs upload-batch --tar <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--tar <PATH>` | Tar archive to extract |
| `--path <PATH>` | Destination directory |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs upload-batch --tar ./skills.tar --path ./skills
```
---
## CLI to HTTP Mapping
| CLI Command | HTTP Endpoint |
@ -398,3 +526,11 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once
| `api sessions reply-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reply` |
| `api sessions reject-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reject` |
| `api sessions reply-permission` | `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply` |
| `api fs entries` | `GET /v1/fs/entries` |
| `api fs read` | `GET /v1/fs/file` |
| `api fs write` | `PUT /v1/fs/file` |
| `api fs delete` | `DELETE /v1/fs/entry` |
| `api fs mkdir` | `POST /v1/fs/mkdir` |
| `api fs move` | `POST /v1/fs/move` |
| `api fs stat` | `GET /v1/fs/stat` |
| `api fs upload-batch` | `POST /v1/fs/upload-batch` |
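
The filesystem rows of this mapping can also be written as a lookup table. The sketch below is illustrative only: `FS_ROUTES` and `toRequest` are hypothetical names, not part of the CLI or SDK, and query parameters are assumed to be passed in the URL query string as in the cURL examples in the file system docs.

```typescript
// Hypothetical lookup table built from the fs rows of the mapping above.
const FS_ROUTES = {
  "api fs entries": { method: "GET", path: "/v1/fs/entries" },
  "api fs read": { method: "GET", path: "/v1/fs/file" },
  "api fs write": { method: "PUT", path: "/v1/fs/file" },
  "api fs delete": { method: "DELETE", path: "/v1/fs/entry" },
  "api fs mkdir": { method: "POST", path: "/v1/fs/mkdir" },
  "api fs move": { method: "POST", path: "/v1/fs/move" },
  "api fs stat": { method: "GET", path: "/v1/fs/stat" },
  "api fs upload-batch": { method: "POST", path: "/v1/fs/upload-batch" },
} as const;

type FsCommand = keyof typeof FS_ROUTES;

// Builds the HTTP method and URL for a CLI fs command (illustrative helper).
function toRequest(
  baseUrl: string,
  command: FsCommand,
  query: Record<string, string> = {},
): { method: string; url: string } {
  const route = FS_ROUTES[command];
  const qs = Object.keys(query)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(query[key])}`)
    .join("&");
  return { method: route.method, url: `${baseUrl}${route.path}${qs ? `?${qs}` : ""}` };
}

// Example: the request that corresponds to `api fs stat`.
const statRequest = toRequest("http://127.0.0.1:2468", "api fs stat", { path: "notes.txt" });
```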


@ -44,9 +44,11 @@ Events / Message Flow
+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+
| session.started | none | method=thread/started | type=session.created | none | none |
| session.ended | SDKMessage.type=result | no explicit session end (turn/completed) | no explicit session end (session.deleted)| type=done | none (daemon synthetic) |
| turn.started | synthetic on message send | method=turn/started | type=session.status (busy) | synthetic on message send | none (daemon synthetic) |
| turn.ended | synthetic after result | method=turn/completed | type=session.idle | synthetic on done | none (daemon synthetic) |
| message (user) | SDKMessage.type=user | item/completed (ThreadItem.type=userMessage)| message.updated (Message.role=user) | type=message | none (daemon synthetic) |
| message (assistant) | SDKMessage.type=assistant | item/completed (ThreadItem.type=agentMessage)| message.updated (Message.role=assistant)| type=message | message_start/message_end |
| message.delta | stream_event (partial) or synthetic | method=item/agentMessage/delta | type=message.part.updated (delta) | synthetic | message_update (text_delta/thinking_delta) |
| message.delta | stream_event (partial) or synthetic | method=item/agentMessage/delta | type=message.part.updated (text-part delta) | synthetic | message_update (text_delta/thinking_delta) |
| tool call | type=tool_use | method=item/mcpToolCall/progress | message.part.updated (part.type=tool) | type=tool_call | tool_execution_start |
| tool result | user.message.content.tool_result | item/completed (tool result ThreadItem variants) | message.part.updated (part.type=tool, state=completed) | type=tool_result | tool_execution_end |
| permission.requested | control_request.can_use_tool | none | type=permission.asked | none | none |
@ -56,6 +58,10 @@ Events / Message Flow
| error | SDKResultMessage.error | method=error | type=session.error (or message error) | type=error | hook_error (status item) |
+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+
Permission status normalization:
- `permission.requested` uses `status=requested`.
- `permission.resolved` uses `status=accept`, `accept_for_session`, or `reject`.
Synthetics
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
@ -63,6 +69,8 @@ Synthetics
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
| session.started | When agent emits no explicit start | session.started event | Mark source=daemon |
| session.ended | When agent emits no explicit end | session.ended event | Mark source=daemon; reason may be inferred |
| turn.started | When agent emits no explicit turn start | turn.started event | Mark source=daemon |
| turn.ended | When agent emits no explicit turn end | turn.ended event | Mark source=daemon |
| item_id (Claude) | Claude provides no item IDs | item_id | Maintain provider_item_id map when possible |
| user message (Claude) | Claude emits only assistant output | item.completed | Mark source=daemon; preserve raw input in event metadata |
| question events (Claude) | AskUserQuestion tool usage | question.requested/resolved | Derived from tool_use blocks (source=agent) |
@ -71,7 +79,7 @@ Synthetics
| message.delta (Claude) | No native deltas emitted | item.delta | Synthetic delta with full message content; source=daemon |
| message.delta (Amp) | No native deltas | item.delta | Synthetic delta with full message content; source=daemon |
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
| message.delta (OpenCode) | part delta before message | item.delta | If part arrives first, create item.started stub then delta |
| message.delta (OpenCode) | text part delta before message | item.delta | If part arrives first, create item.started stub then delta |
+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
Delta handling
@ -82,10 +90,11 @@ Delta handling
- Pi emits message_update deltas and cumulative tool_execution_update partialResult values (we diff to produce deltas).
Policy:
- Always emit item.delta across all providers.
- Emit item.delta for streamable text content across providers.
- For providers without native deltas, emit a single synthetic delta containing the full content prior to item.completed.
- For Claude when partial streaming is enabled, forward native deltas and skip the synthetic full-content delta.
- For providers with native deltas, forward as-is; also emit item.completed when final content is known.
- For OpenCode reasoning part deltas, emit typed reasoning item updates (item.started/item.completed with content.type=reasoning) instead of item.delta.
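
The diffing of Pi's cumulative `tool_execution_update` partialResult values mentioned above can be sketched roughly as follows. This is a minimal illustration, not the daemon's code: `DeltaTracker` and its method are hypothetical names, and the sketch assumes each cumulative value normally extends the previous one as a string prefix.

```typescript
// Hypothetical helper: converts cumulative snapshots into incremental deltas.
class DeltaTracker {
  private lastSeen = new Map<string, string>();

  // Returns only the newly appended text for `itemId`. If the new snapshot
  // does not extend the previous one (for example after a reset), the whole
  // snapshot is returned instead.
  next(itemId: string, cumulative: string): string {
    const previous = this.lastSeen.get(itemId) ?? "";
    this.lastSeen.set(itemId, cumulative);
    if (cumulative.startsWith(previous)) {
      return cumulative.slice(previous.length);
    }
    return cumulative;
  }
}

// Three cumulative snapshots for one tool call become three deltas.
const tracker = new DeltaTracker();
const deltas = ["Run", "Running", "Running tests"].map((snapshot) =>
  tracker.next("tool-1", snapshot),
);
// deltas is ["Run", "ning", " tests"]
```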
Message normalization notes

144
docs/credentials.mdx Normal file

@ -0,0 +1,144 @@
---
title: "Credentials"
description: "How sandbox-agent discovers and uses provider credentials."
icon: "key"
---
Sandbox-agent automatically discovers API credentials from environment variables and agent config files. Credentials are used to authenticate with AI providers (Anthropic, OpenAI) when spawning agents.
## Credential sources
Credentials are extracted in priority order. The first valid credential found for each provider is used.
### Environment variables (highest priority)
**API keys** (checked first):
| Variable | Provider |
|----------|----------|
| `ANTHROPIC_API_KEY` | Anthropic |
| `CLAUDE_API_KEY` | Anthropic (fallback) |
| `OPENAI_API_KEY` | OpenAI |
| `CODEX_API_KEY` | OpenAI (fallback) |
**OAuth tokens** (checked if no API key found):
| Variable | Provider |
|----------|----------|
| `CLAUDE_CODE_OAUTH_TOKEN` | Anthropic (OAuth) |
| `ANTHROPIC_AUTH_TOKEN` | Anthropic (OAuth fallback) |
OAuth tokens from environment variables are only used when `include_oauth` is enabled (the default).
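
The environment-variable order described above can be sketched as a small lookup. This is an illustration of the documented priority only, not sandbox-agent's implementation: `fromEnv`, `API_KEY_VARS`, and `OAUTH_VARS` are hypothetical names, and the `includeOauth` parameter stands in for the `include_oauth` setting.

```typescript
type Provider = "anthropic" | "openai";

interface EnvCredential {
  provider: Provider;
  source: string; // name of the variable the value came from
  kind: "api_key" | "oauth";
  value: string;
}

// Variable names in the priority order given by the tables above.
const API_KEY_VARS: Record<Provider, string[]> = {
  anthropic: ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
  openai: ["OPENAI_API_KEY", "CODEX_API_KEY"],
};
const OAUTH_VARS: Record<Provider, string[]> = {
  anthropic: ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_AUTH_TOKEN"],
  openai: [],
};

// Hypothetical helper: API keys are checked first, then OAuth tokens.
function fromEnv(
  env: Record<string, string | undefined>,
  provider: Provider,
  includeOauth = true,
): EnvCredential | null {
  for (const name of API_KEY_VARS[provider]) {
    const value = env[name];
    if (value) return { provider, source: name, kind: "api_key", value };
  }
  if (includeOauth) {
    for (const name of OAUTH_VARS[provider]) {
      const value = env[name];
      if (value) return { provider, source: name, kind: "oauth", value };
    }
  }
  return null; // nothing in the environment; config files would be checked next
}
```

In a Node.js process this could be called as `fromEnv(process.env, "anthropic")`.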
### Agent config files
If no environment variable is set, sandbox-agent checks agent-specific config files:
| Agent | Config path | Provider |
|-------|-------------|----------|
| Amp | `~/.amp/config.json` | Anthropic |
| Claude Code | `~/.claude.json`, `~/.claude/.credentials.json` | Anthropic |
| Codex | `~/.codex/auth.json` | OpenAI |
| OpenCode | `~/.local/share/opencode/auth.json` | Anthropic and OpenAI |
OAuth tokens are supported for Claude Code, Codex, and OpenCode. Expired tokens are automatically skipped.
## Provider requirements by agent
| Agent | Required provider |
|-------|-------------------|
| Claude Code | Anthropic |
| Amp | Anthropic |
| Codex | OpenAI |
| OpenCode | Anthropic or OpenAI |
| Mock | None |
## Error handling behavior
Sandbox-agent uses a **best-effort, fail-forward** approach to credentials:
### Extraction failures are silent
If a config file is missing, unreadable, or malformed, extraction continues to the next source. No errors are thrown. Missing credentials simply mean the provider is marked as unavailable.
```
~/.claude.json missing → try ~/.claude/.credentials.json
~/.claude/.credentials.json missing → try OpenCode config
All sources exhausted → anthropic = None (not an error)
```
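
The fall-through shown above amounts to trying each source in turn and ignoring failures. A minimal sketch, assuming each source can be modeled as a function that returns a value, returns `null`, or throws; `CredentialSource` and `firstAvailable` are hypothetical names, not sandbox-agent APIs:

```typescript
interface CredentialSource {
  name: string;
  // May throw (unreadable or malformed file) or return null (nothing found).
  read: () => string | null;
}

// Returns the first credential found, or null when every source is exhausted.
function firstAvailable(
  sources: CredentialSource[],
): { source: string; value: string } | null {
  for (const source of sources) {
    try {
      const value = source.read();
      if (value) return { source: source.name, value };
    } catch {
      // Extraction failures are silent: continue with the next source.
    }
  }
  return null; // the provider is treated as unavailable, which is not an error
}
```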
### Agents spawn without credential validation
When you send a message to a session, sandbox-agent does **not** pre-validate credentials. The agent process is spawned with whatever credentials were found (or none), and the agent's native error surfaces if authentication fails.
This design:
- Lets you test agent error handling behavior
- Avoids duplicating provider-specific auth validation
- Ensures sandbox-agent faithfully proxies agent behavior
For example, sending a message to Claude Code without Anthropic credentials will spawn the agent, which will then emit its own "ANTHROPIC_API_KEY not set" error through the event stream.
## Checking credential status
### API endpoint
The `GET /v1/agents` endpoint includes a `credentialsAvailable` field for each agent:
```json
{
  "agents": [
    {
      "id": "claude",
      "installed": true,
      "credentialsAvailable": true,
      ...
    },
    {
      "id": "codex",
      "installed": true,
      "credentialsAvailable": false,
      ...
    }
  ]
}
```
### TypeScript SDK
```typescript
const { agents } = await client.listAgents();
for (const agent of agents) {
  console.log(`${agent.id}: ${agent.credentialsAvailable ? 'authenticated' : 'no credentials'}`);
}
```
### OpenCode compatibility
The `/opencode/provider` endpoint returns a `connected` array listing providers with valid credentials:
```json
{
"all": [...],
"connected": ["claude", "mock"]
}
```
## Passing credentials explicitly
You can override auto-discovered credentials by setting environment variables before starting sandbox-agent:
```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
sandbox-agent daemon start
```
Or when using the SDK in embedded mode:
```typescript
const client = await SandboxAgentClient.spawn({
  env: {
    ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
  },
});
```

245
docs/custom-tools.mdx Normal file

@ -0,0 +1,245 @@
---
title: "Custom Tools"
description: "Give agents custom tools inside the sandbox using MCP servers or skills."
sidebarTitle: "Custom Tools"
icon: "wrench"
---
There are two ways to give agents custom tools that run inside the sandbox:
| | MCP Server | Skill |
|---|---|---|
| **How it works** | Sandbox Agent spawns your MCP server process and routes tool calls to it via stdio | A markdown file that instructs the agent to run your script with `node` (or any command) |
| **Tool discovery** | Agent sees tools automatically via MCP protocol | Agent reads instructions from the skill file |
| **Best for** | Structured tools with typed inputs/outputs | Lightweight scripts with natural-language instructions |
| **Requires** | `@modelcontextprotocol/sdk` dependency | Just a markdown file and a script |
Both approaches execute code inside the sandbox, so your tools have full access to the sandbox filesystem, network, and installed system tools.
## Option A: Tools via MCP
<Steps>
<Step title="Write your MCP server">
Create an MCP server that exposes tools using `@modelcontextprotocol/sdk` with `StdioServerTransport`. This server will run inside the sandbox.
```ts src/mcp-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
const server = new McpServer({
name: "rand",
version: "1.0.0",
});
server.tool(
"random_number",
"Generate a random integer between min and max (inclusive)",
{
min: z.number().describe("Minimum value"),
max: z.number().describe("Maximum value"),
},
async ({ min, max }) => ({
content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
}),
);
const transport = new StdioServerTransport();
await server.connect(transport);
```
This is a simple example. Your MCP server runs inside the sandbox, so you can execute any code you'd like: query databases, call internal APIs, run shell commands, or interact with any service available in the container.
</Step>
<Step title="Package the MCP server">
Bundle the server into a single JS file so it can be uploaded and executed without a `node_modules` folder.
```bash
npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs
```
This creates `dist/mcp-server.cjs` ready to upload.
</Step>
<Step title="Create sandbox and upload MCP server">
Start your sandbox, then write the bundled file into it.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const content = await fs.promises.readFile("./dist/mcp-server.cjs");
await client.writeFsFile(
{ path: "/opt/mcp/custom-tools/mcp-server.cjs" },
content,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/mcp-server.cjs
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point an MCP server config at the bundled JS file. When the session starts, Sandbox Agent spawns the MCP server process and routes tool calls to it.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
mcp: {
customTools: {
type: "local",
command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
},
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"customTools": {
"type": "local",
"command": ["node", "/opt/mcp/custom-tools/mcp-server.cjs"]
}
}
}'
```
</CodeGroup>
</Step>
</Steps>
## Option B: Tools via Skills
Skills are markdown files that instruct the agent how to use a script. Upload the script and a skill file, then point the session at the skill directory.
<Steps>
<Step title="Write your script">
Write a script that the agent will execute. This runs inside the sandbox just like an MCP server, but the agent invokes it directly via its shell tool.
```ts src/random-number.ts
const min = Number(process.argv[2]);
const max = Number(process.argv[3]);

if (Number.isNaN(min) || Number.isNaN(max)) {
  console.error("Usage: random-number <min> <max>");
  process.exit(1);
}

console.log(Math.floor(Math.random() * (max - min + 1)) + min);
```
</Step>
<Step title="Write a skill file">
Create a `SKILL.md` that tells the agent what the script does and how to run it. The frontmatter `name` and `description` fields are required. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
````md SKILL.md
---
name: random-number
description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
---

To generate a random number, run:

```bash
node /opt/skills/random-number/random-number.cjs <min> <max>
```

This prints a single random integer between min and max (inclusive).
````
</Step>
<Step title="Package the script">
Bundle the script just like an MCP server so it has no dependencies at runtime.
```bash
npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs
```
</Step>
<Step title="Create sandbox and upload files">
Upload both the bundled script and the skill file.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const script = await fs.promises.readFile("./dist/random-number.cjs");
await client.writeFsFile(
{ path: "/opt/skills/random-number/random-number.cjs" },
script,
);
const skill = await fs.promises.readFile("./SKILL.md");
await client.writeFsFile(
{ path: "/opt/skills/random-number/SKILL.md" },
skill,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/random-number.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/random-number.cjs
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/SKILL.md" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./SKILL.md
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point the session at the skill directory. The agent reads `SKILL.md` and learns how to use your script.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
skills: {
sources: [
{ type: "local", source: "/opt/skills/random-number" },
],
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "local", "source": "/opt/skills/random-number" }
]
}
}'
```
</CodeGroup>
</Step>
</Steps>
## Notes
- The sandbox image must include a Node.js runtime that can execute the bundled files.

214
docs/deploy/computesdk.mdx Normal file

@ -0,0 +1,214 @@
---
title: "ComputeSDK"
description: "Deploy the daemon using ComputeSDK's provider-agnostic sandbox API."
---
[ComputeSDK](https://computesdk.com) provides a unified interface for managing sandboxes across multiple providers. Write your deployment code once, then switch providers by changing environment variables.
## Prerequisites
- `COMPUTESDK_API_KEY` from [console.computesdk.com](https://console.computesdk.com)
- Provider API key (one of: `E2B_API_KEY`, `DAYTONA_API_KEY`, `VERCEL_TOKEN`, `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`, `BLAXEL_API_KEY`, `CSB_API_KEY`)
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
## TypeScript Example
```typescript
import {
compute,
detectProvider,
getMissingEnvVars,
getProviderConfigFromEnv,
isProviderAuthComplete,
isValidProvider,
PROVIDER_NAMES,
type ExplicitComputeConfig,
type ProviderName,
} from "computesdk";
import { SandboxAgent } from "sandbox-agent";
const PORT = 3000;
const REQUEST_TIMEOUT_MS =
Number.parseInt(process.env.COMPUTESDK_TIMEOUT_MS || "", 10) || 120_000;
/**
* Detects and validates the provider to use.
* Priority: COMPUTESDK_PROVIDER env var > auto-detection from API keys
*/
function resolveProvider(): ProviderName {
const providerOverride = process.env.COMPUTESDK_PROVIDER;
if (providerOverride) {
if (!isValidProvider(providerOverride)) {
throw new Error(
`Unsupported provider "${providerOverride}". Supported: ${PROVIDER_NAMES.join(", ")}`
);
}
if (!isProviderAuthComplete(providerOverride)) {
const missing = getMissingEnvVars(providerOverride);
throw new Error(
`Missing credentials for "${providerOverride}". Set: ${missing.join(", ")}`
);
}
return providerOverride as ProviderName;
}
const detected = detectProvider();
if (!detected) {
throw new Error(
`No provider credentials found. Set one of: ${PROVIDER_NAMES.map((p) => getMissingEnvVars(p).join(", ")).join(" | ")}`
);
}
return detected as ProviderName;
}
function configureComputeSDK(): void {
const provider = resolveProvider();
const config: ExplicitComputeConfig = {
provider,
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
requestTimeoutMs: REQUEST_TIMEOUT_MS,
};
// Add provider-specific config from environment
const providerConfig = getProviderConfigFromEnv(provider);
if (Object.keys(providerConfig).length > 0) {
(config as any)[provider] = providerConfig;
}
compute.setConfig(config);
}
configureComputeSDK();
// Build environment variables to pass to sandbox
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
// Create sandbox
const sandbox = await compute.sandbox.create({
envs: Object.keys(envs).length > 0 ? envs : undefined,
});
// Helper to run commands with error handling
const run = async (cmd: string, options?: { background?: boolean }) => {
const result = await sandbox.runCommand(cmd, options);
if (typeof result?.exitCode === "number" && result.exitCode !== 0) {
throw new Error(`Command failed: ${cmd} (exit ${result.exitCode})\n${result.stderr || ""}`);
}
return result;
};
// Install sandbox-agent
await run("curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh");
// Install agents conditionally based on available API keys
if (envs.ANTHROPIC_API_KEY) {
await run("sandbox-agent install-agent claude");
}
if (envs.OPENAI_API_KEY) {
await run("sandbox-agent install-agent codex");
}
// Start the server in the background
await run(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`, { background: true });
// Get the public URL for the sandbox
const baseUrl = await sandbox.getUrl({ port: PORT });
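// Wait for server to be ready; fail loudly if the deadline passes
const deadline = Date.now() + REQUEST_TIMEOUT_MS;
let ready = false;
while (!ready && Date.now() < deadline) {
try {
const response = await fetch(`${baseUrl}/v1/health`);
if (response.ok) {
const data = await response.json();
ready = data?.status === "ok";
}
} catch {
// Server not ready yet
}
if (!ready) await new Promise((r) => setTimeout(r, 500));
}
if (!ready) {
throw new Error(`sandbox-agent did not become healthy within ${REQUEST_TIMEOUT_MS}ms`);
}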
// Connect to the server
const client = await SandboxAgent.connect({ baseUrl });
// Detect which agent to use based on available API keys
const agent = envs.ANTHROPIC_API_KEY ? "claude" : "codex";
// Create a session and start coding
await client.createSession("my-session", { agent });
await client.postMessage("my-session", {
message: "Summarize this repository",
});
for await (const event of client.streamEvents("my-session")) {
console.log(event.type, event.data);
}
// Cleanup
await sandbox.destroy();
```
## Supported Providers
ComputeSDK auto-detects your provider from environment variables:
| Provider | Environment Variables |
|----------|----------------------|
| E2B | `E2B_API_KEY` |
| Daytona | `DAYTONA_API_KEY` |
| Vercel | `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN` |
| Modal | `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET` |
| Blaxel | `BLAXEL_API_KEY` |
| CodeSandbox | `CSB_API_KEY` |
## Notes
- **Provider resolution order**: `COMPUTESDK_PROVIDER` env var takes priority, otherwise auto-detection from API keys.
- **Conditional agent installation**: Only agents with available API keys are installed, reducing setup time.
- **Command error handling**: The example validates exit codes and throws on failures for easier debugging.
- `sandbox.runCommand(..., { background: true })` keeps the server running while your app continues.
- `sandbox.getUrl({ port })` returns a public URL for the sandbox port.
- Always destroy the sandbox when you are done to avoid leaking resources.
- If sandbox creation times out, set `COMPUTESDK_TIMEOUT_MS` to a higher value (default: 120000ms).
## Explicit Provider Selection
To force a specific provider instead of auto-detection, set the `COMPUTESDK_PROVIDER` environment variable:
```bash
export COMPUTESDK_PROVIDER=e2b
```
Or configure programmatically using `getProviderConfigFromEnv()`:
```typescript
import { compute, getProviderConfigFromEnv, type ExplicitComputeConfig } from "computesdk";
const config: ExplicitComputeConfig = {
provider: "e2b",
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
requestTimeoutMs: 120_000,
};
// Automatically populate provider-specific config from environment
const providerConfig = getProviderConfigFromEnv("e2b");
if (Object.keys(providerConfig).length > 0) {
(config as any).e2b = providerConfig;
}
compute.setConfig(config);
```
## Direct Mode (No ComputeSDK API Key)
To bypass the ComputeSDK gateway and use provider SDKs directly, see the provider-specific examples:
- [E2B](/deploy/e2b)
- [Daytona](/deploy/daytona)
- [Vercel](/deploy/vercel)


@ -1,27 +0,0 @@
---
title: "Deploy"
sidebarTitle: "Overview"
description: "Choose where to run the sandbox-agent server."
icon: "server"
---
<CardGroup cols={2}>
<Card title="Local" icon="laptop" href="/deploy/local">
Run locally for development. The SDK can auto-spawn the server.
</Card>
<Card title="E2B" icon="cube" href="/deploy/e2b">
Deploy inside an E2B sandbox with network access.
</Card>
<Card title="Vercel" icon="triangle" href="/deploy/vercel">
Deploy inside a Vercel Sandbox with port forwarding.
</Card>
<Card title="Cloudflare" icon="cloud" href="/deploy/cloudflare">
Deploy inside a Cloudflare Sandbox with port exposure.
</Card>
<Card title="Daytona" icon="cloud" href="/deploy/daytona">
Run in a Daytona workspace with port forwarding.
</Card>
<Card title="Docker" icon="docker" href="/deploy/docker">
Build and run in a container (development only).
</Card>
</CardGroup>


@ -25,19 +25,26 @@
},
"navbar": {
"links": [
{
"label": "Gigacode",
"icon": "terminal",
"href": "https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode"
},
{
"label": "Discord",
"icon": "discord",
"href": "https://discord.gg/auCecybynK"
},
{
"label": "GitHub",
"icon": "github",
"type": "github",
"href": "https://github.com/rivet-dev/sandbox-agent"
}
]
},
"navigation": {
"tabs": [
{
"tab": "Documentation",
"pages": [
{
"group": "Getting started",
@ -45,46 +52,72 @@
"quickstart",
"building-chat-ui",
"manage-sessions",
"opencode-compatibility"
]
},
{
"group": "Deploy",
"icon": "server",
"pages": [
"deploy/index",
"deploy/local",
"deploy/computesdk",
"deploy/e2b",
"deploy/daytona",
"deploy/vercel",
"deploy/cloudflare",
"deploy/docker"
]
}
]
},
{
"group": "SDKs",
"pages": ["sdks/typescript", "sdks/python"]
},
{
"group": "Agent Features",
"pages": [
"agent-sessions",
"attachments",
"skills-config",
"mcp-config",
"custom-tools"
]
},
{
"group": "Features",
"pages": ["file-system"]
},
{
"group": "Reference",
"pages": [
"cli",
"inspector",
"session-transcript-schema",
"gigacode",
"opencode-compatibility",
{
"group": "More",
"pages": [
"credentials",
"daemon",
"cors",
"telemetry",
{
"group": "AI",
"pages": ["ai/skill", "ai/llms-txt"]
},
{
"group": "Advanced",
"pages": ["daemon", "cors", "telemetry"]
}
]
}
]
}
]
},
{
"group": "HTTP API Reference",
"tab": "HTTP API",
"pages": [
{
"group": "HTTP Reference",
"openapi": "openapi.json"
}
]
}
]
}
}

184
docs/file-system.mdx Normal file

@ -0,0 +1,184 @@
---
title: "File System"
description: "Read, write, and manage files inside the sandbox."
sidebarTitle: "File System"
icon: "folder"
---
The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload batches of files via tar archives.
## Path Resolution
- Absolute paths are used as-is.
- Relative paths resolve against the session working directory when `sessionId` is provided.
- Without `sessionId`, relative paths resolve against the server home directory.
- Relative paths cannot contain `..` or absolute prefixes; requests that attempt to escape the root are rejected.
The session working directory is the server process's current working directory at the moment the session is created.
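
The rules above can be sketched as a small resolver. This is a hedged illustration, not the server's implementation: `resolveFsPath`, `sessionCwd`, and `serverHome` are hypothetical names, and the sketch only handles POSIX-style paths.

```typescript
// Hypothetical resolver that follows the documented path rules.
function resolveFsPath(
  requested: string,
  roots: { sessionCwd?: string; serverHome: string },
): string {
  // Absolute paths are used as-is.
  if (requested.startsWith("/")) return requested;

  // Drop empty and "." segments, then reject any attempt to escape the root.
  const segments = requested.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.includes("..")) {
    throw new Error(`path escapes root: ${requested}`);
  }

  // With a session, resolve against its working directory; otherwise use
  // the server home directory.
  const root = (roots.sessionCwd ?? roots.serverHome).replace(/\/+$/, "");
  return [root, ...segments].join("/") || "/";
}

// Relative path with a session working directory of /workspace:
const resolved = resolveFsPath("./notes.txt", {
  sessionCwd: "/workspace",
  serverHome: "/home/sandbox",
});
// resolved is "/workspace/notes.txt"
```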
## List Entries
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const entries = await client.listFsEntries({
path: "./workspace",
sessionId: "my-session",
});
console.log(entries);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Read And Write Files
`PUT /v1/fs/file` writes raw bytes. `GET /v1/fs/file` returns raw bytes.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.writeFsFile({ path: "./notes.txt", sessionId: "my-session" }, "hello");
const bytes = await client.readFsFile({
path: "./notes.txt",
sessionId: "my-session",
});
const text = new TextDecoder().decode(bytes);
console.log(text);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary "hello"
curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--output ./notes.txt
```
</CodeGroup>
## Create Directories
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.mkdirFs({
path: "./data",
sessionId: "my-session",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Move, Delete, And Stat
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.moveFs(
{ from: "./notes.txt", to: "./notes-old.txt", overwrite: true },
{ sessionId: "my-session" },
);
const stat = await client.statFs({
path: "./notes-old.txt",
sessionId: "my-session",
});
await client.deleteFsEntry({
path: "./notes-old.txt",
sessionId: "my-session",
});
console.log(stat);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/move?sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"from":"./notes.txt","to":"./notes-old.txt","overwrite":true}'
curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Batch Upload (Tar)
Batch upload accepts `application/x-tar` only and extracts into the destination directory. The response returns absolute paths for extracted files, capped at 1024 entries.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
import path from "node:path";
import * as tar from "tar";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const archivePath = path.join(process.cwd(), "skills.tar");
await tar.c({
cwd: "./skills",
file: archivePath,
}, ["."]);
const tarBuffer = await fs.promises.readFile(archivePath);
const result = await client.uploadFsBatch(tarBuffer, {
path: "./skills",
sessionId: "my-session",
});
console.log(result);
```
```bash cURL
tar -cf skills.tar -C ./skills .
curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/x-tar" \
--data-binary @skills.tar
```
</CodeGroup>


@ -1,7 +1,6 @@
---
title: "Inspector"
description: "Debug and inspect agent sessions with the Inspector UI."
icon: "magnifying-glass"
---
The Inspector is a web-based GUI for debugging and inspecting Sandbox Agent sessions. Use it to view events, send messages, and troubleshoot agent behavior in real time.

122
docs/mcp-config.mdx Normal file

@ -0,0 +1,122 @@
---
title: "MCP"
description: "Configure MCP servers for agent sessions."
sidebarTitle: "MCP"
icon: "plug"
---
MCP (Model Context Protocol) servers extend agents with tools. To have Sandbox Agent auto-load MCP servers when a session starts, pass an `mcp` map in the create-session request.
## Session Config
The `mcp` field is a map of server name to config. Use `type: "local"` for stdio servers and `type: "remote"` for HTTP/SSE servers:
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-mcp", {
agent: "claude",
mcp: {
filesystem: {
type: "local",
command: "my-mcp-server",
args: ["--root", "."],
},
github: {
type: "remote",
url: "https://example.com/mcp",
headers: {
Authorization: "Bearer ${GITHUB_TOKEN}",
},
},
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-mcp" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"filesystem": {
"type": "local",
"command": "my-mcp-server",
"args": ["--root", "."]
},
"github": {
"type": "remote",
"url": "https://example.com/mcp",
"headers": {
"Authorization": "Bearer ${GITHUB_TOKEN}"
}
}
}
}'
```
</CodeGroup>
## Config Fields
### Local Server
Stdio servers that run inside the sandbox.
| Field | Description |
|---|---|
| `type` | `local` |
| `command` | string or array (`["node", "server.js"]`) |
| `args` | array of string arguments |
| `env` | environment variables map |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `cwd` | working directory for the MCP process |
```json
{
"type": "local",
"command": ["node", "./mcp/server.js"],
"args": ["--root", "."],
"env": { "LOG_LEVEL": "debug" },
"cwd": "/workspace"
}
```
### Remote Server
HTTP/SSE servers accessed over the network.
| Field | Description |
|---|---|
| `type` | `remote` |
| `url` | MCP server URL |
| `headers` | static headers map |
| `bearerTokenEnvVar` | env var name to inject into `Authorization: Bearer ...` |
| `envHeaders` | map of header name to env var name |
| `oauth` | object with `clientId`, `clientSecret`, `scope` fields, or `false` to disable OAuth |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `transport` | `http` or `sse` |
```json
{
"type": "remote",
"url": "https://example.com/mcp",
"headers": { "x-client": "sandbox-agent" },
"bearerTokenEnvVar": "MCP_TOKEN",
"transport": "sse"
}
```
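The two field tables above can be read as a discriminated union on `type`. The following is a minimal TypeScript sketch, not the SDK's actual type definitions: the names `LocalMcpServer`, `RemoteMcpServer`, `McpServerConfig`, and `describeMcpServer` are hypothetical, and marking every field other than `type`, `command`, and `url` as optional is an assumption.

```ts
// Hypothetical types mirroring the field tables; the real SDK types may differ.
type LocalMcpServer = {
  type: "local";
  command: string | string[];
  args?: string[];
  env?: Record<string, string>;
  enabled?: boolean;
  timeoutMs?: number;
  cwd?: string;
};

type RemoteMcpServer = {
  type: "remote";
  url: string;
  headers?: Record<string, string>;
  bearerTokenEnvVar?: string;
  envHeaders?: Record<string, string>;
  oauth?: { clientId?: string; clientSecret?: string; scope?: string } | false;
  enabled?: boolean;
  timeoutMs?: number;
  transport?: "http" | "sse";
};

type McpServerConfig = LocalMcpServer | RemoteMcpServer;

// Narrowing on `type` lets the compiler check which fields are available.
function describeMcpServer(name: string, config: McpServerConfig): string {
  if (config.type === "local") {
    // `command` may be a single string or an argv-style array.
    const argv = Array.isArray(config.command) ? config.command : [config.command];
    return `${name}: local stdio server (${argv.concat(config.args ?? []).join(" ")})`;
  }
  const transport = config.transport ? `${config.transport} ` : "";
  return `${name}: remote ${transport}server at ${config.url}`;
}
```

Modeling the config this way means a typo such as `url` on a `local` entry is caught at compile time rather than at session start.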
## Custom MCP Servers
To bundle and upload your own MCP server into the sandbox, see [Custom Tools](/custom-tools).

File diff suppressed because it is too large


@ -1,7 +1,6 @@
---
title: "OpenCode SDK & UI Support"
title: "OpenCode Compatibility"
description: "Connect OpenCode clients, SDKs, and web UI to Sandbox Agent."
icon: "rectangle-terminal"
---
<Warning>
@ -60,10 +59,11 @@ The OpenCode web UI can connect to Sandbox Agent for a full browser-based experi
</Step>
<Step title="Clone and Start the OpenCode Web App">
```bash
git clone https://github.com/opencode-ai/opencode
git clone https://github.com/anomalyco/opencode
cd opencode/packages/app
export VITE_OPENCODE_SERVER_HOST=127.0.0.1
export VITE_OPENCODE_SERVER_PORT=2468
bun install
bun run dev -- --host 127.0.0.1 --port 5173
```
</Step>
@ -113,6 +113,7 @@ for await (const event of events.stream) {
- **CORS**: When using the web UI from a different origin, configure `--cors-allow-origin`
- **Provider Selection**: Use the provider/model selector in the UI to choose which backing agent to use (claude, codex, opencode, amp)
- **Models & Variants**: Providers are grouped by backing agent (e.g. Claude Code, Codex, Amp). OpenCode models are grouped by `OpenCode (<provider>)` to preserve their native provider grouping. Each model keeps its real model ID, and variants are exposed when available (Codex/OpenCode/Amp).
- **Optional Native Proxy for TUI/Config Endpoints**: Set `OPENCODE_COMPAT_PROXY_URL` (for example `http://127.0.0.1:4096`) to proxy select OpenCode-native endpoints to a real OpenCode server. This currently applies to `/command`, `/config`, `/global/config`, and `/tui/*`. If not set, sandbox-agent uses its built-in compatibility handlers.
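The proxy rule in the last note can be sketched as a small routing decision. This is an illustration only, not sandbox-agent's implementation: `resolveProxyTarget` is a hypothetical helper, and treating `/tui/*` as a simple `/tui/` prefix match is an assumption.

```ts
// Paths the note lists as proxied when OPENCODE_COMPAT_PROXY_URL is set.
const PROXIED_EXACT_PATHS = ["/command", "/config", "/global/config"];
const PROXIED_PREFIX = "/tui/";

// Hypothetical helper: returns the upstream URL for a request path, or null
// when the built-in compatibility handler should serve the request.
function resolveProxyTarget(path: string, proxyUrl: string | undefined): string | null {
  if (!proxyUrl) return null; // proxy not configured
  const proxied = PROXIED_EXACT_PATHS.includes(path) || path.startsWith(PROXIED_PREFIX);
  if (!proxied) return null;
  // Strip trailing slashes so the joined URL has exactly one separator.
  return proxyUrl.replace(/\/+$/, "") + path;
}
```

With `proxyUrl` set to `http://127.0.0.1:4096`, a request for `/config` would resolve to the native server, while `/provider` would still be handled locally.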
## Endpoint Coverage
@ -134,10 +135,15 @@ See the full endpoint compatibility table below. Most endpoints are functional f
| `GET /question` | ✓ | List pending questions |
| `POST /question/{id}/reply` | ✓ | Answer agent questions |
| `GET /provider` | ✓ | Returns provider metadata |
| `GET /command` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `GET /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `/tui/*` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /agent` | | Returns agent list |
| `GET /config` | | Returns config |
| *other endpoints* | | Return empty/stub responses |
✓ Functional &nbsp;&nbsp; Stubbed
✓ Functional &nbsp;&nbsp; ↔ Proxied (optional) &nbsp;&nbsp; Stubbed
</Accordion>


@ -1,7 +1,6 @@
---
title: "Session Transcript Schema"
description: "Universal event schema for session transcripts across all agents."
icon: "brackets-curly"
---
Each coding agent outputs events in its own native format. The sandbox-agent converts these into a universal event schema, giving you a consistent session transcript regardless of which agent you use.
@ -27,7 +26,7 @@ This table shows which agent feature coverage appears in the universal event str
| Reasoning/Thinking | - | ✓ | - | - | ✓ |
| Command Execution | - | ✓ | - | - | |
| File Changes | - | ✓ | - | - | |
| MCP Tools | - | ✓ | - | - | |
| MCP Tools | ✓ | ✓ | ✓ | ✓ | |
| Streaming Deltas | ✓ | ✓ | ✓ | - | ✓ |
| Variants | | ✓ | ✓ | ✓ | ✓ |
@ -125,6 +124,13 @@ Every event from the API is wrapped in a `UniversalEvent` envelope.
| `session.started` | Session has started | `{ metadata?: any }` |
| `session.ended` | Session has ended | `{ reason, terminated_by, message?, exit_code? }` |
### Turn Lifecycle
| Type | Description | Data |
|------|-------------|------|
| `turn.started` | Turn has started | `{ phase: "started", turn_id?, metadata? }` |
| `turn.ended` | Turn has ended | `{ phase: "ended", turn_id?, metadata? }` |
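As a rough illustration of how the turn events can be consumed, the sketch below splits a flat event list into turns. `MinimalEvent` is a hypothetical, stripped-down stand-in for the `UniversalEvent` envelope, and because `turn_id` is optional the grouping relies on event order alone, which is an assumption about how a client might choose to handle it.

```ts
// Hypothetical minimal shape; the real envelope carries more fields.
type MinimalEvent = { type: string; data?: { turn_id?: string } };

// Groups events into turns delimited by turn.started / turn.ended.
function groupByTurn(events: MinimalEvent[]): MinimalEvent[][] {
  const turns: MinimalEvent[][] = [];
  let current: MinimalEvent[] | null = null;
  for (const event of events) {
    if (event.type === "turn.started") {
      current = [event];
      turns.push(current);
    } else if (current) {
      current.push(event);
      if (event.type === "turn.ended") current = null;
    }
    // Events outside any turn (for example session.started) are skipped.
  }
  return turns;
}
```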
**SessionEndedData**
| Field | Type | Values |
@ -159,7 +165,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
| Type | Description | Data |
|------|-------------|------|
| `permission.requested` | Permission request pending | `{ permission_id, action, status, metadata? }` |
| `permission.resolved` | Permission granted or denied | `{ permission_id, action, status, metadata? }` |
| `permission.resolved` | Permission decision recorded | `{ permission_id, action, status, metadata? }` |
| `question.requested` | Question pending user input | `{ question_id, prompt, options, status }` |
| `question.resolved` | Question answered or rejected | `{ question_id, prompt, options, status, response? }` |
@ -169,7 +175,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
|-------|------|-------------|
| `permission_id` | string | Identifier for the permission request |
| `action` | string | What the agent wants to do |
| `status` | string | `requested`, `approved`, `denied` |
| `status` | string | `requested`, `accept`, `accept_for_session`, `reject` |
| `metadata` | any? | Additional context |
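A client can branch on the `status` values listed above. The helper names below are hypothetical and the sketch assumes the four values in the table are exhaustive.

```ts
// Status values taken from the table above.
type PermissionStatus = "requested" | "accept" | "accept_for_session" | "reject";

// True once the user has allowed the action, either once or for the session.
function isPermissionGranted(status: PermissionStatus): boolean {
  return status === "accept" || status === "accept_for_session";
}

// True while the request is still waiting for a decision.
function isPermissionPending(status: PermissionStatus): boolean {
  return status === "requested";
}
```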
**QuestionEventData**
@ -366,6 +372,8 @@ The daemon emits synthetic events (`synthetic: true`, `source: "daemon"`) to pro
|-----------|------|
| `session.started` | Agent doesn't emit explicit session start |
| `session.ended` | Agent doesn't emit explicit session end |
| `turn.started` | Agent doesn't emit explicit turn start |
| `turn.ended` | Agent doesn't emit explicit turn end |
| `item.started` | Agent doesn't emit item start events |
| `item.delta` | Agent doesn't stream deltas natively |
| `question.*` | Claude Code plan mode (from ExitPlanMode tool) |
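To separate daemon-generated events from those emitted by the agent itself, a client can check the two marker fields. This is a minimal sketch: `EventLike` is a hypothetical reduced event shape, and it assumes `synthetic` and `source` appear as top-level fields on each event.

```ts
// Hypothetical reduced event shape with only the fields used here.
type EventLike = { type: string; synthetic?: boolean; source?: string };

// Matches the markers described above: synthetic: true, source: "daemon".
function isDaemonSynthetic(event: EventLike): boolean {
  return event.synthetic === true && event.source === "daemon";
}

// Keeps only events that were not synthesized by the daemon.
function nativeEventsOnly(events: EventLike[]): EventLike[] {
  return events.filter((event) => !isDaemonSynthetic(event));
}
```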

87
docs/skills-config.mdx Normal file

@ -0,0 +1,87 @@
---
title: "Skills"
description: "Auto-load skills into agent sessions."
sidebarTitle: "Skills"
icon: "sparkles"
---
Skills are local instruction bundles stored in `SKILL.md` files. Sandbox Agent can fetch, discover, and link skill directories into agent-specific skill paths at session start using the `skills.sources` field. The format is fully compatible with [skills.sh](https://skills.sh).
## Session Config
Pass `skills.sources` when creating a session to load skills from GitHub repos, local paths, or git URLs.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-skills", {
agent: "claude",
skills: {
sources: [
{ type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
{ type: "local", source: "/workspace/my-custom-skill" },
],
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-skills" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "github", "source": "rivet-dev/skills", "skills": ["sandbox-agent"] },
{ "type": "local", "source": "/workspace/my-custom-skill" }
]
}
}'
```
</CodeGroup>
Each skill directory must contain `SKILL.md`. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
## Skill Sources
Each entry in `skills.sources` describes where to find skills. Three source types are supported:
| Type | `source` value | Example |
|------|---------------|---------|
| `github` | `owner/repo` | `"rivet-dev/skills"` |
| `local` | Filesystem path | `"/workspace/my-skill"` |
| `git` | Git clone URL | `"https://git.example.com/skills.git"` |
### Optional fields
- **`skills`** — Array of skill directory names to include. When omitted, all discovered skills are installed.
- **`ref`** — Branch, tag, or commit to check out (default: HEAD). Applies to `github` and `git` types.
- **`subpath`** — Subdirectory within the repo to search for skills.
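Putting the source types and optional fields together, a source entry might be typed and sanity-checked as follows. This is a sketch under assumptions: `SkillSource` and `checkSkillSource` are hypothetical names, not SDK exports, and the `ref` and `subpath` values in the example are purely illustrative.

```ts
// Hypothetical type combining the source table and the optional fields.
type SkillSource = {
  type: "github" | "local" | "git";
  source: string;
  skills?: string[];
  ref?: string;
  subpath?: string;
};

// Returns a list of problems; an empty list means the entry looks plausible.
function checkSkillSource(entry: SkillSource): string[] {
  const problems: string[] = [];
  if (entry.type === "github" && !/^[^/\s]+\/[^/\s]+$/.test(entry.source)) {
    problems.push("github sources use the owner/repo form");
  }
  if (entry.type === "local" && entry.ref !== undefined) {
    problems.push("ref applies only to github and git sources");
  }
  return problems;
}

// Illustrative entries; the ref and subpath values are made up.
const exampleSources: SkillSource[] = [
  { type: "github", source: "rivet-dev/skills", ref: "main", subpath: "skills", skills: ["sandbox-agent"] },
  { type: "git", source: "https://git.example.com/skills.git", ref: "v1.0.0" },
  { type: "local", source: "/workspace/my-custom-skill" },
];
```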
## Custom Skills
To write, upload, and configure your own skills inside the sandbox, see [Custom Tools](/custom-tools).
## Advanced
### Discovery logic
After resolving a source to a local directory (cloning if needed), Sandbox Agent discovers skills by:
1. Checking if the directory itself contains `SKILL.md`.
2. Scanning the `skills/` subdirectory for child directories containing `SKILL.md`.
3. Scanning immediate children of the directory for `SKILL.md`.
Discovered skills are symlinked into project-local skill roots (`.claude/skills/<name>`, `.agents/skills/<name>`, `.opencode/skill/<name>`).
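The three discovery steps can be approximated with a few filesystem checks. The sketch below is not Sandbox Agent's implementation: `discoverSkills` and its helpers are hypothetical, and it assumes the steps are cumulative (results from all three are merged) rather than stopping at the first match.

```ts
import * as fs from "node:fs";
import * as path from "node:path";

// True when the directory directly contains a SKILL.md file.
function hasSkillFile(dir: string): boolean {
  return fs.existsSync(path.join(dir, "SKILL.md"));
}

// Immediate child directories of `dir` that contain SKILL.md.
function childSkillDirs(dir: string): string[] {
  if (!fs.existsSync(dir)) return [];
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(dir, entry.name))
    .filter(hasSkillFile);
}

// Applies the three steps in order and returns the de-duplicated, sorted result.
function discoverSkills(root: string): string[] {
  const found = new Set<string>();
  if (hasSkillFile(root)) found.add(root); // step 1: the directory itself
  for (const dir of childSkillDirs(path.join(root, "skills"))) found.add(dir); // step 2: skills/*
  for (const dir of childSkillDirs(root)) found.add(dir); // step 3: immediate children
  return Array.from(found).sort();
}
```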
### Caching
GitHub sources are downloaded as zip archives and git sources are cloned; both are cached under `~/.sandbox-agent/skills-cache/` and updated on subsequent session creations. GitHub sources do not require `git` to be installed.

17
examples/CLAUDE.md Normal file

@ -0,0 +1,17 @@
# Examples Instructions
## Docker Isolation
- Docker examples must behave like standalone sandboxes.
- Do not bind mount host files or host directories into Docker example containers.
- If an example needs tools, skills, or MCP servers, install them inside the container during setup.
## Testing Examples
Examples can be tested by starting them in the background and communicating directly with the sandbox-agent API:
1. Start the example: `SANDBOX_AGENT_DEV=1 pnpm start &`
2. Note the base URL and session ID from the output.
3. Send messages: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/messages -H "Content-Type: application/json" -d '{"message":"..."}'`
4. Poll events: `curl http://127.0.0.1:<port>/v1/sessions/<sessionId>/events`
5. Approve permissions: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/permissions/<permissionId>/reply -H "Content-Type: application/json" -d '{"reply":"once"}'`


@ -1,7 +1,7 @@
{
"$schema": "node_modules/wrangler/config-schema.json",
"name": "sandbox-agent-cloudflare",
"main": "src/cloudflare.ts",
"main": "src/index.ts",
"compatibility_date": "2025-01-01",
"compatibility_flags": ["nodejs_compat"],
"assets": {


@ -0,0 +1,19 @@
{
"name": "@sandbox-agent/example-computesdk",
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/computesdk.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"computesdk": "latest"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest",
"vitest": "^3.0.0"
}
}


@ -0,0 +1,156 @@
import {
compute,
detectProvider,
getMissingEnvVars,
getProviderConfigFromEnv,
isProviderAuthComplete,
isValidProvider,
PROVIDER_NAMES,
type ExplicitComputeConfig,
type ProviderName,
} from "computesdk";
import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
import { fileURLToPath } from "node:url";
import { resolve } from "node:path";
const PORT = 3000;
const REQUEST_TIMEOUT_MS =
Number.parseInt(process.env.COMPUTESDK_TIMEOUT_MS || "", 10) || 120_000;
/**
* Detects and validates the provider to use.
* Priority: COMPUTESDK_PROVIDER env var > auto-detection from API keys
*/
function resolveProvider(): ProviderName {
const providerOverride = process.env.COMPUTESDK_PROVIDER;
if (providerOverride) {
if (!isValidProvider(providerOverride)) {
throw new Error(
`Unsupported ComputeSDK provider "${providerOverride}". Supported providers: ${PROVIDER_NAMES.join(", ")}`
);
}
if (!isProviderAuthComplete(providerOverride)) {
const missing = getMissingEnvVars(providerOverride);
throw new Error(
`Missing credentials for provider "${providerOverride}". Set: ${missing.join(", ")}`
);
}
console.log(`Using ComputeSDK provider: ${providerOverride} (explicit)`);
return providerOverride as ProviderName;
}
const detected = detectProvider();
if (!detected) {
throw new Error(
`No provider credentials found. Set one of: ${PROVIDER_NAMES.map((p) => getMissingEnvVars(p).join(", ")).join(" | ")}`
);
}
console.log(`Using ComputeSDK provider: ${detected} (auto-detected)`);
return detected as ProviderName;
}
function configureComputeSDK(): void {
const provider = resolveProvider();
const config: ExplicitComputeConfig = {
provider,
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
requestTimeoutMs: REQUEST_TIMEOUT_MS,
};
const providerConfig = getProviderConfigFromEnv(provider);
if (Object.keys(providerConfig).length > 0) {
const configWithProvider =
config as ExplicitComputeConfig & Record<ProviderName, Record<string, string>>;
configWithProvider[provider] = providerConfig;
}
compute.setConfig(config);
}
configureComputeSDK();
const buildEnv = (): Record<string, string> => {
const env: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
return env;
};
export async function setupComputeSdkSandboxAgent(): Promise<{
baseUrl: string;
cleanup: () => Promise<void>;
}> {
const env = buildEnv();
console.log("Creating ComputeSDK sandbox...");
const sandbox = await compute.sandbox.create({
envs: Object.keys(env).length > 0 ? env : undefined,
});
const run = async (cmd: string, options?: { background?: boolean }) => {
const result = await sandbox.runCommand(cmd, options);
if (typeof result?.exitCode === "number" && result.exitCode !== 0) {
throw new Error(`Command failed: ${cmd} (exit ${result.exitCode})\n${result.stderr || ""}`);
}
return result;
};
console.log("Installing sandbox-agent...");
await run("curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh");
if (env.ANTHROPIC_API_KEY) {
console.log("Installing Claude agent...");
await run("sandbox-agent install-agent claude");
}
if (env.OPENAI_API_KEY) {
console.log("Installing Codex agent...");
await run("sandbox-agent install-agent codex");
}
console.log("Starting server...");
await run(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`, { background: true });
const baseUrl = await sandbox.getUrl({ port: PORT });
console.log("Waiting for server...");
await waitForHealth({ baseUrl });
const cleanup = async () => {
try {
await sandbox.destroy();
} catch (error) {
console.warn("Cleanup failed:", error instanceof Error ? error.message : error);
}
};
return { baseUrl, cleanup };
}
export async function runComputeSdkExample(): Promise<void> {
const { baseUrl, cleanup } = await setupComputeSdkSandboxAgent();
const handleExit = async () => {
await cleanup();
process.exit(0);
};
process.once("SIGINT", handleExit);
process.once("SIGTERM", handleExit);
try {
await runPrompt(baseUrl);
} finally {
// Destroy the sandbox even if the prompt loop throws.
await cleanup();
}
}
const isDirectRun = Boolean(
process.argv[1] && resolve(process.argv[1]) === fileURLToPath(import.meta.url)
);
if (isDirectRun) {
runComputeSdkExample().catch((error) => {
console.error(error instanceof Error ? error.message : error);
process.exit(1);
});
}


@ -0,0 +1,39 @@
import { describe, it, expect } from "vitest";
import { buildHeaders } from "@sandbox-agent/example-shared";
import { setupComputeSdkSandboxAgent } from "../src/computesdk.ts";
const hasModal = Boolean(process.env.MODAL_TOKEN_ID && process.env.MODAL_TOKEN_SECRET);
const hasVercel = Boolean(process.env.VERCEL_TOKEN || process.env.VERCEL_OIDC_TOKEN);
const hasProviderKey = Boolean(
process.env.BLAXEL_API_KEY ||
process.env.CSB_API_KEY ||
process.env.DAYTONA_API_KEY ||
process.env.E2B_API_KEY ||
hasModal ||
hasVercel
);
const shouldRun = Boolean(process.env.COMPUTESDK_API_KEY) && hasProviderKey;
const timeoutMs = Number.parseInt(process.env.SANDBOX_TEST_TIMEOUT_MS || "", 10) || 300_000;
const testFn = shouldRun ? it : it.skip;
describe("computesdk example", () => {
testFn(
"starts sandbox-agent and responds to /v1/health",
async () => {
const { baseUrl, cleanup } = await setupComputeSdkSandboxAgent();
try {
const response = await fetch(`${baseUrl}/v1/health`, {
headers: buildHeaders({}),
});
expect(response.ok).toBe(true);
const data = await response.json();
expect(data.status).toBe("ok");
} finally {
await cleanup();
}
},
timeoutMs
);
});


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}


@ -3,13 +3,14 @@
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/daytona.ts",
"start": "tsx src/index.ts",
"start:snapshot": "tsx src/daytona-with-snapshot.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@daytonaio/sdk": "latest",
"@sandbox-agent/example-shared": "workspace:*"
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/node": "latest",


@ -1,5 +1,6 @@
import { Daytona, Image } from "@daytonaio/sdk";
import { runPrompt } from "@sandbox-agent/example-shared";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";
const daytona = new Daytona();
@ -24,12 +25,21 @@ await sandbox.process.executeCommand(
const baseUrl = (await sandbox.getSignedPreviewUrl(3000, 4 * 60 * 60)).url;
console.log("Waiting for server...");
await waitForHealth({ baseUrl });
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
clearInterval(keepAlive);
await sandbox.delete(60);
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
await runPrompt(baseUrl);
await cleanup();


@ -1,5 +1,6 @@
import { Daytona } from "@daytonaio/sdk";
import { runPrompt } from "@sandbox-agent/example-shared";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";
const daytona = new Daytona();
@ -25,12 +26,21 @@ await sandbox.process.executeCommand(
const baseUrl = (await sandbox.getSignedPreviewUrl(3000, 4 * 60 * 60)).url;
console.log("Waiting for server...");
await waitForHealth({ baseUrl });
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
clearInterval(keepAlive);
await sandbox.delete(60);
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
await runPrompt(baseUrl);
await cleanup();


@ -3,12 +3,13 @@
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/docker.ts",
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"dockerode": "latest"
"dockerode": "latest",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/dockerode": "latest",


@ -1,5 +1,6 @@
import Docker from "dockerode";
import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";
const IMAGE = "alpine:latest";
const PORT = 3000;
@ -44,13 +45,19 @@ await container.start();
const baseUrl = `http://127.0.0.1:${PORT}`;
await waitForHealth({ baseUrl });
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
clearInterval(keepAlive);
try { await container.stop({ t: 5 }); } catch {}
try { await container.remove({ force: true }); } catch {}
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
await runPrompt(baseUrl);
await cleanup();


@ -3,7 +3,7 @@
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/e2b.ts",
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {


@ -1,5 +1,6 @@
import { Sandbox } from "@e2b/code-interpreter";
import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
@ -29,12 +30,18 @@ const baseUrl = `https://${sandbox.getHost(3000)}`;
console.log("Waiting for server...");
await waitForHealth({ baseUrl });
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
clearInterval(keepAlive);
await sandbox.kill();
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
await runPrompt(baseUrl);
await cleanup();


@ -0,0 +1,19 @@
{
"name": "@sandbox-agent/example-file-system",
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*",
"tar": "^7"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest"
}
}


@ -0,0 +1,57 @@
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
import * as tar from "tar";
import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
console.log("Starting sandbox...");
const { baseUrl, cleanup } = await startDockerSandbox({ port: 3003 });
console.log("Creating sample files...");
const tmpDir = path.resolve(__dirname, "../.tmp-upload");
const projectDir = path.join(tmpDir, "my-project");
fs.mkdirSync(path.join(projectDir, "src"), { recursive: true });
fs.writeFileSync(path.join(projectDir, "README.md"), "# My Project\n\nUploaded via batch tar.\n");
fs.writeFileSync(path.join(projectDir, "src", "index.ts"), 'console.log("hello from uploaded project");\n');
fs.writeFileSync(path.join(projectDir, "package.json"), JSON.stringify({ name: "my-project", version: "1.0.0" }, null, 2) + "\n");
console.log(" Created 3 files in my-project/");
console.log("Uploading files via batch tar...");
const client = await SandboxAgent.connect({ baseUrl });
const tarPath = path.join(tmpDir, "upload.tar");
await tar.create(
{ file: tarPath, cwd: tmpDir },
["my-project"],
);
const tarBuffer = await fs.promises.readFile(tarPath);
const uploadResult = await client.uploadFsBatch(tarBuffer, { path: "/opt" });
console.log(` Uploaded ${uploadResult.paths.length} files: ${uploadResult.paths.join(", ")}`);
// Cleanup temp files
fs.rmSync(tmpDir, { recursive: true, force: true });
console.log("Verifying uploaded files...");
const entries = await client.listFsEntries({ path: "/opt/my-project" });
console.log(` Found ${entries.length} entries in /opt/my-project`);
for (const entry of entries) {
console.log(` ${entry.entryType === "directory" ? "d" : "-"} ${entry.name}`);
}
const readmeBytes = await client.readFsFile({ path: "/opt/my-project/README.md" });
const readmeText = new TextDecoder().decode(readmeBytes);
console.log(` README.md content: ${readmeText.trim()}`);
console.log("Creating session...");
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(' Try: "read the README in /opt/my-project"');
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}


@ -0,0 +1,22 @@
{
"name": "@sandbox-agent/example-mcp-custom-tool",
"private": true,
"type": "module",
"scripts": {
"build:mcp": "esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs",
"start": "pnpm build:mcp && tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@modelcontextprotocol/sdk": "latest",
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*",
"zod": "latest"
},
"devDependencies": {
"@types/node": "latest",
"esbuild": "latest",
"tsx": "latest",
"typescript": "latest"
}
}


@ -0,0 +1,49 @@
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
// Verify the bundled MCP server exists (built by `pnpm build:mcp`).
const serverFile = path.resolve(__dirname, "../dist/mcp-server.cjs");
if (!fs.existsSync(serverFile)) {
console.error("Error: dist/mcp-server.cjs not found. Run `pnpm build:mcp` first.");
process.exit(1);
}
// Start a Docker container running sandbox-agent.
console.log("Starting sandbox...");
const { baseUrl, cleanup } = await startDockerSandbox({ port: 3004 });
// Upload the bundled MCP server into the sandbox filesystem.
console.log("Uploading MCP server bundle...");
const client = await SandboxAgent.connect({ baseUrl });
const bundle = await fs.promises.readFile(serverFile);
const written = await client.writeFsFile(
{ path: "/opt/mcp/custom-tools/mcp-server.cjs" },
bundle,
);
console.log(` Written: ${written.path} (${written.bytesWritten} bytes)`);
// Create a session with the uploaded MCP server as a local command.
console.log("Creating session with custom MCP tool...");
const sessionId = generateSessionId();
await client.createSession(sessionId, {
agent: detectAgent(),
mcp: {
customTools: {
type: "local",
command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
},
},
});
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(' Try: "generate a random number between 1 and 100"');
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });


@ -0,0 +1,24 @@
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
async function main() {
const server = new McpServer({ name: "rand", version: "1.0.0" });
server.tool(
"random_number",
"Generate a random integer between min and max (inclusive)",
{
min: z.number().describe("Minimum value"),
max: z.number().describe("Maximum value"),
},
async ({ min, max }) => ({
content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
}),
);
const transport = new StdioServerTransport();
await server.connect(transport);
}
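// Log startup failures to stderr and exit non-zero instead of leaving an unhandled rejection.
main().catch((error) => {
console.error(error);
process.exit(1);
});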


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}

18
examples/mcp/package.json Normal file

@ -0,0 +1,18 @@
{
"name": "@sandbox-agent/example-mcp",
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest"
}
}

31
examples/mcp/src/index.ts Normal file

@ -0,0 +1,31 @@
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
console.log("Starting sandbox...");
const { baseUrl, cleanup } = await startDockerSandbox({
port: 3002,
setupCommands: [
"npm install -g --silent @modelcontextprotocol/server-everything@2026.1.26",
],
});
console.log("Creating session with everything MCP server...");
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, {
agent: detectAgent(),
mcp: {
everything: {
type: "local",
command: ["mcp-server-everything"],
timeoutMs: 10000,
},
},
});
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(' Try: "generate a random number between 1 and 100"');
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}


@ -0,0 +1,5 @@
FROM node:22-bookworm-slim
RUN apt-get update -qq && apt-get install -y -qq --no-install-recommends ca-certificates > /dev/null 2>&1 && \
rm -rf /var/lib/apt/lists/* && \
npm install -g --silent @sandbox-agent/cli@latest && \
sandbox-agent install-agent claude


@ -0,0 +1,58 @@
FROM node:22-bookworm-slim AS frontend
RUN corepack enable && corepack prepare pnpm@latest --activate
WORKDIR /build
# Copy workspace root config
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
# Copy packages needed for the inspector build chain:
# inspector -> sandbox-agent SDK -> cli-shared
COPY sdks/typescript/ sdks/typescript/
COPY sdks/cli-shared/ sdks/cli-shared/
COPY frontend/packages/inspector/ frontend/packages/inspector/
COPY docs/openapi.json docs/
# Create stub package.json for workspace packages referenced in pnpm-workspace.yaml
# but not needed for the inspector build (avoids install errors).
RUN set -e; for dir in \
sdks/cli sdks/gigacode \
resources/agent-schemas resources/vercel-ai-sdk-schemas \
scripts/release scripts/sandbox-testing \
examples/shared examples/docker examples/e2b examples/vercel \
examples/daytona examples/cloudflare examples/file-system \
examples/mcp examples/mcp-custom-tool \
examples/skills examples/skills-custom-tool \
frontend/packages/website; do \
mkdir -p "$dir"; \
printf '{"name":"@stub/%s","private":true,"version":"0.0.0"}\n' "$(basename "$dir")" > "$dir/package.json"; \
done; \
for parent in sdks/cli/platforms sdks/gigacode/platforms; do \
for plat in darwin-arm64 darwin-x64 linux-arm64 linux-x64 win32-x64; do \
mkdir -p "$parent/$plat"; \
printf '{"name":"@stub/%s-%s","private":true,"version":"0.0.0"}\n' "$(basename "$parent")" "$plat" > "$parent/$plat/package.json"; \
done; \
done
RUN pnpm install --no-frozen-lockfile
ENV SKIP_OPENAPI_GEN=1
RUN pnpm --filter sandbox-agent build && \
pnpm --filter @sandbox-agent/inspector build
FROM rust:1.88.0-bookworm AS builder
WORKDIR /build
COPY Cargo.toml Cargo.lock ./
COPY server/ ./server/
COPY gigacode/ ./gigacode/
COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/
COPY --from=frontend /build/frontend/packages/inspector/dist/ ./frontend/packages/inspector/dist/
RUN --mount=type=cache,target=/usr/local/cargo/registry \
--mount=type=cache,target=/usr/local/cargo/git \
--mount=type=cache,target=/build/target \
cargo build -p sandbox-agent --release && \
cp target/release/sandbox-agent /sandbox-agent
FROM node:22-bookworm-slim
RUN apt-get update -qq && apt-get install -y -qq --no-install-recommends ca-certificates > /dev/null 2>&1 && \
rm -rf /var/lib/apt/lists/*
COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
RUN sandbox-agent install-agent claude


@ -3,15 +3,18 @@
"private": true,
"type": "module",
"exports": {
".": "./src/sandbox-agent-client.ts"
".": "./src/sandbox-agent-client.ts",
"./docker": "./src/docker.ts"
},
"scripts": {
"typecheck": "tsc --noEmit"
},
"dependencies": {
"dockerode": "latest",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/dockerode": "latest",
"@types/node": "latest",
"typescript": "latest"
}


@ -0,0 +1,301 @@
import Docker from "dockerode";
import { execFileSync } from "node:child_process";
import fs from "node:fs";
import path from "node:path";
import { PassThrough } from "node:stream";
import { fileURLToPath } from "node:url";
import { waitForHealth } from "./sandbox-agent-client.ts";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const EXAMPLE_IMAGE = "sandbox-agent-examples:latest";
const EXAMPLE_IMAGE_DEV = "sandbox-agent-examples-dev:latest";
const DOCKERFILE_DIR = path.resolve(__dirname, "..");
const REPO_ROOT = path.resolve(DOCKERFILE_DIR, "../..");
export interface DockerSandboxOptions {
/** Container port used by sandbox-agent inside Docker. */
port: number;
/** Optional fixed host port mapping. If omitted, Docker assigns a free host port automatically. */
hostPort?: number;
/** Additional shell commands to run before starting sandbox-agent. */
setupCommands?: string[];
/** Docker image to use. Defaults to the pre-built sandbox-agent-examples image. */
image?: string;
}
export interface DockerSandbox {
baseUrl: string;
cleanup: () => Promise<void>;
}
const DIRECT_CREDENTIAL_KEYS = [
"ANTHROPIC_API_KEY",
"CLAUDE_API_KEY",
"CLAUDE_CODE_OAUTH_TOKEN",
"ANTHROPIC_AUTH_TOKEN",
"OPENAI_API_KEY",
"CODEX_API_KEY",
"CEREBRAS_API_KEY",
"OPENCODE_API_KEY",
] as const;
/** Trim whitespace and remove one pair of matching surrounding quotes, if present. */
function stripShellQuotes(value: string): string {
const trimmed = value.trim();
if (trimmed.length >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
return trimmed.slice(1, -1);
}
if (trimmed.length >= 2 && trimmed.startsWith("'") && trimmed.endsWith("'")) {
return trimmed.slice(1, -1);
}
return trimmed;
}
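/** Parse `KEY=value` lines (optionally prefixed with `export `) into a map, skipping blank lines, malformed lines, and empty values. */
function parseExtractedCredentials(output: string): Record<string, string> {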
const parsed: Record<string, string> = {};
for (const rawLine of output.split("\n")) {
const line = rawLine.trim();
if (!line) continue;
const cleanLine = line.startsWith("export ") ? line.slice(7) : line;
const match = cleanLine.match(/^([A-Z0-9_]+)=(.*)$/);
if (!match) continue;
const [, key, rawValue] = match;
const value = stripShellQuotes(rawValue);
if (!value) continue;
parsed[key] = value;
}
return parsed;
}
interface ClaudeCredentialFile {
hostPath: string;
containerPath: string;
base64Content: string;
}
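/** Read Claude credential files from the host home directory and base64-encode them for transfer into the container. */
function readClaudeCredentialFiles(): ClaudeCredentialFile[] {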
const homeDir = process.env.HOME || "";
if (!homeDir) return [];
const candidates: Array<{ hostPath: string; containerPath: string }> = [
{
hostPath: path.join(homeDir, ".claude", ".credentials.json"),
containerPath: "/root/.claude/.credentials.json",
},
{
hostPath: path.join(homeDir, ".claude-oauth-credentials.json"),
containerPath: "/root/.claude-oauth-credentials.json",
},
];
const files: ClaudeCredentialFile[] = [];
for (const candidate of candidates) {
if (!fs.existsSync(candidate.hostPath)) continue;
try {
const raw = fs.readFileSync(candidate.hostPath, "utf8");
files.push({
hostPath: candidate.hostPath,
containerPath: candidate.containerPath,
base64Content: Buffer.from(raw, "utf8").toString("base64"),
});
} catch {
// Ignore unreadable credential file candidates.
}
}
return files;
}
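/** Merge credentials from `sandbox-agent credentials extract-env` with direct env vars; direct env vars take precedence. */
function collectCredentialEnv(): Record<string, string> {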
const merged: Record<string, string> = {};
let extracted: Record<string, string> = {};
try {
const output = execFileSync(
"sandbox-agent",
["credentials", "extract-env"],
{ encoding: "utf8", stdio: ["ignore", "pipe", "pipe"] },
);
extracted = parseExtractedCredentials(output);
} catch {
// Fall back to direct env vars if extraction is unavailable.
}
for (const [key, value] of Object.entries(extracted)) {
if (value) merged[key] = value;
}
for (const key of DIRECT_CREDENTIAL_KEYS) {
const direct = process.env[key];
if (direct) merged[key] = direct;
}
return merged;
}
/** Wrap a value in single quotes for `sh`, escaping each embedded single quote as '"'"'. */
function shellSingleQuotedLiteral(value: string): string {
return `'${value.replace(/'/g, `'\"'\"'`)}'`;
}
/** Remove ANSI escape sequences so captured container logs print as plain text. */
function stripAnsi(value: string): string {
return value.replace(
/[\u001B\u009B][[\]()#;?]*(?:(?:[a-zA-Z\d]*(?:;[a-zA-Z\d]*)*)?\u0007|(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-ORZcf-nqry=><])/g,
"",
);
}
async function ensureExampleImage(_docker: Docker): Promise<string> {
const dev = !!process.env.SANDBOX_AGENT_DEV;
const imageName = dev ? EXAMPLE_IMAGE_DEV : EXAMPLE_IMAGE;
if (dev) {
console.log(" Building sandbox image from source (may take a while, only runs once)...");
try {
execFileSync("docker", [
"build", "-t", imageName,
"-f", path.join(DOCKERFILE_DIR, "Dockerfile.dev"),
REPO_ROOT,
], {
stdio: ["ignore", "ignore", "pipe"],
});
} catch (err: unknown) {
const stderr = err instanceof Error && "stderr" in err ? String((err as { stderr: unknown }).stderr) : "";
throw new Error(`Failed to build sandbox image: ${stderr}`);
}
} else {
console.log(" Building sandbox image (may take a while, only runs once)...");
try {
execFileSync("docker", ["build", "-t", imageName, DOCKERFILE_DIR], {
stdio: ["ignore", "ignore", "pipe"],
});
} catch (err: unknown) {
const stderr = err instanceof Error && "stderr" in err ? String((err as { stderr: unknown }).stderr) : "";
throw new Error(`Failed to build sandbox image: ${stderr}`);
}
}
return imageName;
}
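/**
* Start a Docker container running sandbox-agent and wait for it to be healthy.
* Registers SIGINT/SIGTERM handlers that stop the container and exit the process.
* The returned `cleanup` is the same function, so calling it also exits with code 0.
*/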
export async function startDockerSandbox(opts: DockerSandboxOptions): Promise<DockerSandbox> {
const { port, hostPort } = opts;
const useCustomImage = !!opts.image;
let image = opts.image ?? EXAMPLE_IMAGE;
// TODO: Replace setupCommands shell bootstrapping with native sandbox-agent exec API once available.
const setupCommands = [...(opts.setupCommands ?? [])];
const credentialEnv = collectCredentialEnv();
const claudeCredentialFiles = readClaudeCredentialFiles();
const bootstrapEnv: Record<string, string> = {};
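// When Claude credential files exist on the host, drop the Claude/Anthropic env credentials and copy the files into the container instead.
if (claudeCredentialFiles.length > 0) {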
delete credentialEnv.ANTHROPIC_API_KEY;
delete credentialEnv.CLAUDE_API_KEY;
delete credentialEnv.CLAUDE_CODE_OAUTH_TOKEN;
delete credentialEnv.ANTHROPIC_AUTH_TOKEN;
const credentialBootstrapCommands = claudeCredentialFiles.flatMap((file, index) => {
const envKey = `SANDBOX_AGENT_CLAUDE_CREDENTIAL_${index}_B64`;
bootstrapEnv[envKey] = file.base64Content;
return [
`mkdir -p ${shellSingleQuotedLiteral(path.posix.dirname(file.containerPath))}`,
`printf %s "$${envKey}" | base64 -d > ${shellSingleQuotedLiteral(file.containerPath)}`,
];
});
setupCommands.unshift(...credentialBootstrapCommands);
}
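// Mirror the remaining credentials into this process's environment without overwriting existing values.
for (const [key, value] of Object.entries(credentialEnv)) {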
if (!process.env[key]) process.env[key] = value;
}
const docker = new Docker({ socketPath: "/var/run/docker.sock" });
if (useCustomImage) {
try {
await docker.getImage(image).inspect();
} catch {
console.log(` Pulling ${image}...`);
await new Promise<void>((resolve, reject) => {
docker.pull(image, (err: Error | null, stream: NodeJS.ReadableStream) => {
if (err) return reject(err);
docker.modem.followProgress(stream, (err: Error | null) => (err ? reject(err) : resolve()));
});
});
}
} else {
image = await ensureExampleImage(docker);
}
const bootCommands = [
...setupCommands,
`sandbox-agent server --no-token --host 0.0.0.0 --port ${port}`,
];
const container = await docker.createContainer({
Image: image,
WorkingDir: "/root",
Cmd: ["sh", "-c", bootCommands.join(" && ")],
Env: [
...Object.entries(credentialEnv).map(([key, value]) => `${key}=${value}`),
...Object.entries(bootstrapEnv).map(([key, value]) => `${key}=${value}`),
],
ExposedPorts: { [`${port}/tcp`]: {} },
HostConfig: {
AutoRemove: true,
PortBindings: { [`${port}/tcp`]: [{ HostPort: hostPort ? `${hostPort}` : "0" }] },
},
});
await container.start();
const logChunks: string[] = [];
const startupLogs = await container.logs({
follow: true,
stdout: true,
stderr: true,
since: 0,
}) as NodeJS.ReadableStream;
const stdoutStream = new PassThrough();
const stderrStream = new PassThrough();
stdoutStream.on("data", (chunk) => {
logChunks.push(stripAnsi(String(chunk)));
});
stderrStream.on("data", (chunk) => {
logChunks.push(stripAnsi(String(chunk)));
});
docker.modem.demuxStream(startupLogs, stdoutStream, stderrStream);
const stopStartupLogs = () => {
const stream = startupLogs as NodeJS.ReadableStream & { destroy?: () => void };
try { stream.destroy?.(); } catch {}
};
const inspect = await container.inspect();
const mappedPorts = inspect.NetworkSettings?.Ports?.[`${port}/tcp`];
const mappedHostPort = mappedPorts?.[0]?.HostPort;
if (!mappedHostPort) {
throw new Error(`Failed to resolve mapped host port for container port ${port}`);
}
const baseUrl = `http://127.0.0.1:${mappedHostPort}`;
try {
await waitForHealth({ baseUrl });
} catch (err) {
stopStartupLogs();
console.error(" Container logs:");
for (const chunk of logChunks) {
process.stderr.write(` ${chunk}`);
}
throw err;
}
stopStartupLogs();
console.log(` Ready (${baseUrl})`);
const cleanup = async () => {
stopStartupLogs();
try { await container.stop({ t: 5 }); } catch {}
try { await container.remove({ force: true }); } catch {}
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
return { baseUrl, cleanup };
}
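
To make the two string helpers in this file easier to follow, here is a small self-contained sketch. It re-implements `parseExtractedCredentials` and `shellSingleQuotedLiteral` in simplified form under different names (the file does not export them) and runs them on made-up input. The key values are placeholders, and the `KEY=value` / `export KEY=value` line format is inferred from what the parser accepts, not from documented output of `sandbox-agent credentials extract-env`.

```typescript
// Simplified copies of two helpers from docker.ts, for illustration only.

function stripQuotes(value: string): string {
  const t = value.trim();
  const quoted =
    t.length >= 2 &&
    ((t.startsWith('"') && t.endsWith('"')) || (t.startsWith("'") && t.endsWith("'")));
  return quoted ? t.slice(1, -1) : t;
}

function parseEnvLines(output: string): Record<string, string> {
  const parsed: Record<string, string> = {};
  for (const rawLine of output.split("\n")) {
    const line = rawLine.trim().replace(/^export /, "");
    const match = line.match(/^([A-Z0-9_]+)=(.*)$/);
    if (!match) continue; // skip blank or malformed lines
    const value = stripQuotes(match[2]);
    if (value) parsed[match[1]] = value; // skip empty values
  }
  return parsed;
}

// Close the quote, emit a double-quoted ', then reopen the quote.
function shellSingleQuote(value: string): string {
  return `'${value.replace(/'/g, `'"'"'`)}'`;
}

const sample = [
  'export ANTHROPIC_API_KEY="placeholder-a"',
  "OPENAI_API_KEY='placeholder-b'",
  "EMPTY_KEY=",
  "not a key value line",
].join("\n");

const env = parseEnvLines(sample);
console.log(env);
console.log(shellSingleQuote("/root/it's here"));
```

Only the two well-formed lines survive parsing. The quoting helper matters because container paths are interpolated into an `sh -c` command string, where an unescaped single quote would end the literal early.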


@ -3,11 +3,7 @@
* Provides minimal helpers for connecting to and interacting with sandbox-agent servers.
*/
import { createInterface } from "node:readline/promises";
import { randomUUID } from "node:crypto";
import { setTimeout as delay } from "node:timers/promises";
import { SandboxAgent } from "sandbox-agent";
import type { PermissionEventData, QuestionEventData } from "sandbox-agent";
function normalizeBaseUrl(baseUrl: string): string {
return baseUrl.replace(/\/+$/, "");
@ -27,10 +23,12 @@ export function buildInspectorUrl({
baseUrl,
token,
headers,
sessionId,
}: {
baseUrl: string;
token?: string;
headers?: Record<string, string>;
sessionId?: string;
}): string {
const normalized = normalizeBaseUrl(ensureUrl(baseUrl));
const params = new URLSearchParams();
@ -41,7 +39,8 @@ export function buildInspectorUrl({
params.set("headers", JSON.stringify(headers));
}
const queryString = params.toString();
return `${normalized}/ui/${queryString ? `?${queryString}` : ""}`;
const sessionPath = sessionId ? `sessions/${sessionId}` : "";
return `${normalized}/ui/${sessionPath}${queryString ? `?${queryString}` : ""}`;
}
export function logInspectorUrl({
@ -110,125 +109,39 @@ export async function waitForHealth({
throw (lastError ?? new Error("Timed out waiting for /v1/health")) as Error;
}
function detectAgent(): string {
export function generateSessionId(): string {
const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
let id = "session-";
for (let i = 0; i < 8; i++) {
id += chars[Math.floor(Math.random() * chars.length)];
}
return id;
}
export function detectAgent(): string {
if (process.env.SANDBOX_AGENT) return process.env.SANDBOX_AGENT;
if (process.env.ANTHROPIC_API_KEY) return "claude";
if (process.env.OPENAI_API_KEY) return "codex";
const hasClaude = Boolean(
process.env.ANTHROPIC_API_KEY ||
process.env.CLAUDE_API_KEY ||
process.env.CLAUDE_CODE_OAUTH_TOKEN ||
process.env.ANTHROPIC_AUTH_TOKEN,
);
const openAiLikeKey = process.env.OPENAI_API_KEY || process.env.CODEX_API_KEY || "";
const hasCodexApiKey = openAiLikeKey.startsWith("sk-");
if (hasCodexApiKey && hasClaude) {
console.log("Both Claude and Codex API keys detected; defaulting to codex. Set SANDBOX_AGENT to override.");
return "codex";
}
if (!hasCodexApiKey && openAiLikeKey) {
console.log("OpenAI/Codex credential is not an API key (expected sk-...), skipping codex auto-select.");
}
if (hasCodexApiKey) return "codex";
if (hasClaude) {
if (openAiLikeKey && !hasCodexApiKey) {
console.log("Using claude by default.");
}
return "claude";
}
return "claude";
}
export async function runPrompt(baseUrl: string): Promise<void> {
console.log(`UI: ${buildInspectorUrl({ baseUrl })}`);
const client = await SandboxAgent.connect({ baseUrl });
const agent = detectAgent();
console.log(`Using agent: ${agent}`);
const sessionId = randomUUID();
await client.createSession(sessionId, { agent });
console.log(`Session ${sessionId}. Press Ctrl+C to quit.`);
const rl = createInterface({ input: process.stdin, output: process.stdout });
let isThinking = false;
let hasStartedOutput = false;
let turnResolve: (() => void) | null = null;
let sessionEnded = false;
const processEvents = async () => {
for await (const event of client.streamEvents(sessionId)) {
if (event.type === "item.started") {
const item = (event.data as any)?.item;
if (item?.role === "assistant") {
isThinking = true;
hasStartedOutput = false;
process.stdout.write("Thinking...");
}
}
if (event.type === "item.delta" && isThinking) {
const delta = (event.data as any)?.delta;
if (delta) {
if (!hasStartedOutput) {
process.stdout.write("\r\x1b[K");
hasStartedOutput = true;
}
const text = typeof delta === "string" ? delta : delta.type === "text" ? delta.text || "" : "";
if (text) process.stdout.write(text);
}
}
if (event.type === "item.completed") {
const item = (event.data as any)?.item;
if (item?.role === "assistant") {
isThinking = false;
process.stdout.write("\n");
turnResolve?.();
turnResolve = null;
}
}
if (event.type === "permission.requested") {
const data = event.data as PermissionEventData;
if (isThinking && !hasStartedOutput) {
process.stdout.write("\r\x1b[K");
}
console.log(`[Auto-approved] ${data.action}`);
await client.replyPermission(sessionId, data.permission_id, { reply: "once" });
}
if (event.type === "question.requested") {
const data = event.data as QuestionEventData;
if (isThinking && !hasStartedOutput) {
process.stdout.write("\r\x1b[K");
}
console.log(`[Question rejected] ${data.prompt}`);
await client.rejectQuestion(sessionId, data.question_id);
}
if (event.type === "error") {
const data = event.data as any;
console.error(`\nError: ${data?.message || JSON.stringify(data)}`);
}
if (event.type === "session.ended") {
const data = event.data as any;
const reason = data?.reason || "unknown";
if (reason === "error") {
console.error(`\nAgent exited with error: ${data?.message || ""}`);
if (data?.exit_code !== undefined) {
console.error(` Exit code: ${data.exit_code}`);
}
} else {
console.log(`Agent session ${reason}`);
}
sessionEnded = true;
turnResolve?.();
turnResolve = null;
}
}
};
processEvents().catch((err) => {
if (!sessionEnded) {
console.error("Event stream error:", err instanceof Error ? err.message : err);
}
});
while (true) {
const line = await rl.question("> ");
if (!line.trim()) continue;
const turnComplete = new Promise<void>((resolve) => {
turnResolve = resolve;
});
try {
await client.postMessage(sessionId, { message: line.trim() });
await turnComplete;
} catch (error) {
console.error(error instanceof Error ? error.message : error);
turnResolve = null;
}
}
}
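
The selection order in the new `detectAgent()` can be hard to read in this view because the added and removed lines are interleaved without markers. The sketch below is one reading of the added lines, written as a pure function so it can be exercised without touching `process.env`. `pickAgent` and the `Env` type are hypothetical names, and the logging from the original is omitted. In this reading, Claude credentials affect only the log messages: the result depends on `SANDBOX_AGENT` and on whether the OpenAI or Codex value starts with `sk-`.

```typescript
// Hypothetical pure variant of detectAgent(): takes the environment as an
// argument instead of reading process.env, and does not log.
type Env = Record<string, string | undefined>;

function pickAgent(env: Env): string {
  // 1. An explicit override always wins.
  if (env.SANDBOX_AGENT) return env.SANDBOX_AGENT;

  // 2. Only values that look like API keys ("sk-...") select codex,
  //    even when Claude credentials are also present.
  const openAiLikeKey = env.OPENAI_API_KEY || env.CODEX_API_KEY || "";
  if (openAiLikeKey.startsWith("sk-")) return "codex";

  // 3. Everything else, including no credentials at all, falls back to claude.
  return "claude";
}

console.log(pickAgent({ OPENAI_API_KEY: "sk-placeholder", ANTHROPIC_API_KEY: "placeholder" }));
console.log(pickAgent({ OPENAI_API_KEY: "oauth-token-placeholder" }));
```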


@ -0,0 +1,12 @@
---
name: random-number
description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
---
To generate a random number, run:
```bash
node /opt/skills/random-number/random-number.cjs <min> <max>
```
This prints a single random integer between min and max (inclusive).


@ -0,0 +1,20 @@
{
"name": "@sandbox-agent/example-skills-custom-tool",
"private": true,
"type": "module",
"scripts": {
"build:script": "esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs",
"start": "pnpm build:script && tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/node": "latest",
"esbuild": "latest",
"tsx": "latest",
"typescript": "latest"
}
}


@ -0,0 +1,53 @@
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
// Verify the bundled script exists (built by `pnpm build:script`).
const scriptFile = path.resolve(__dirname, "../dist/random-number.cjs");
if (!fs.existsSync(scriptFile)) {
console.error("Error: dist/random-number.cjs not found. Run `pnpm build:script` first.");
process.exit(1);
}
// Start a Docker container running sandbox-agent.
console.log("Starting sandbox...");
const { baseUrl, cleanup } = await startDockerSandbox({ port: 3005 });
// Upload the bundled script and SKILL.md into the sandbox filesystem.
console.log("Uploading script and skill file...");
const client = await SandboxAgent.connect({ baseUrl });
const script = await fs.promises.readFile(scriptFile);
const scriptResult = await client.writeFsFile(
{ path: "/opt/skills/random-number/random-number.cjs" },
script,
);
console.log(` Script: ${scriptResult.path} (${scriptResult.bytesWritten} bytes)`);
const skillMd = await fs.promises.readFile(path.resolve(__dirname, "../SKILL.md"));
const skillResult = await client.writeFsFile(
{ path: "/opt/skills/random-number/SKILL.md" },
skillMd,
);
console.log(` Skill: ${skillResult.path} (${skillResult.bytesWritten} bytes)`);
// Create a session with the uploaded skill as a local source.
console.log("Creating session with custom skill...");
const sessionId = generateSessionId();
await client.createSession(sessionId, {
agent: detectAgent(),
skills: {
sources: [{ type: "local", source: "/opt/skills/random-number" }],
},
});
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(' Try: "generate a random number between 1 and 100"');
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });


@ -0,0 +1,9 @@
const min = Number(process.argv[2]);
const max = Number(process.argv[3]);
if (Number.isNaN(min) || Number.isNaN(max)) {
console.error("Usage: random-number <min> <max>");
process.exit(1);
}
console.log(Math.floor(Math.random() * (max - min + 1)) + min);
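
The script above rejects arguments that convert to `NaN`, but `Number("")` is `0`, so an empty string passes the check, and so do swapped or fractional bounds. One possible stricter parser is sketched below. `parseBounds` is a hypothetical name, and rejecting swapped bounds is an assumption of this sketch, not behavior of the original script.

```typescript
// Hypothetical stricter argument parser for a `random-number <min> <max>` CLI.
type Bounds = { min: number; max: number };

function parseBounds(args: string[]): Bounds | { error: string } {
  const [rawMin, rawMax] = args;
  if (rawMin === undefined || rawMax === undefined) {
    return { error: "Usage: random-number <min> <max>" };
  }
  // Require an optional sign followed by digits, so "", " ", and "1e3" are rejected.
  const integer = /^[+-]?\d+$/;
  if (!integer.test(rawMin) || !integer.test(rawMax)) {
    return { error: "min and max must be integers" };
  }
  const min = Number(rawMin);
  const max = Number(rawMax);
  if (min > max) {
    return { error: "min must not be greater than max" };
  }
  return { min, max };
}

console.log(parseBounds(["1", "100"]));
console.log(parseBounds(["", "100"]));
```

In the real script this would be called with `process.argv.slice(2)`, printing the error and exiting with a non-zero status when an `error` field comes back.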


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}


@ -0,0 +1,18 @@
{
"name": "@sandbox-agent/example-skills",
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest"
}
}


@ -0,0 +1,26 @@
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
console.log("Starting sandbox...");
const { baseUrl, cleanup } = await startDockerSandbox({
port: 3001,
});
console.log("Creating session with skill source...");
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, {
agent: detectAgent(),
skills: {
sources: [
{ type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
],
},
});
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(' Try: "How do I start sandbox-agent?"');
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });


@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM"],
"module": "ESNext",
"moduleResolution": "Bundler",
"allowImportingTsExtensions": true,
"noEmit": true,
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "**/*.test.ts"]
}


@ -3,7 +3,7 @@
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/vercel.ts",
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {


@ -1,5 +1,6 @@
import { Sandbox } from "@vercel/sandbox";
import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
@ -40,12 +41,18 @@ const baseUrl = sandbox.domain(3000);
console.log("Waiting for server...");
await waitForHealth({ baseUrl });
const client = await SandboxAgent.connect({ baseUrl });
const sessionId = generateSessionId();
await client.createSession(sessionId, { agent: detectAgent() });
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");
const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
clearInterval(keepAlive);
await sandbox.stop();
process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);
await runPrompt(baseUrl);
await cleanup();


@ -1,15 +1,20 @@
FROM node:22-alpine AS build
WORKDIR /app
RUN npm install -g pnpm
RUN npm install -g pnpm@9
# Copy package files for all workspaces
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/typescript/package.json ./sdks/typescript/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
# Install dependencies
RUN pnpm install --filter @sandbox-agent/inspector...
# Copy cli-shared source and build it
COPY sdks/cli-shared ./sdks/cli-shared
RUN cd sdks/cli-shared && pnpm exec tsup
# Copy SDK source (with pre-generated types)
COPY sdks/typescript ./sdks/typescript


@ -336,6 +336,12 @@
color: var(--danger);
}
.banner.config-note {
background: rgba(255, 159, 10, 0.12);
border-left: 3px solid var(--warning);
color: var(--warning);
}
.banner.success {
background: rgba(48, 209, 88, 0.1);
border-left: 3px solid var(--success);
@ -471,11 +477,12 @@
position: relative;
}
.sidebar-add-menu {
.sidebar-add-menu,
.session-create-menu {
position: absolute;
top: 36px;
left: 0;
min-width: 200px;
min-width: 220px;
background: var(--surface);
border: 1px solid var(--border-2);
border-radius: 8px;
@ -487,6 +494,405 @@
z-index: 60;
}
.session-create-header {
display: flex;
align-items: center;
gap: 8px;
padding: 6px 6px 4px;
margin-bottom: 4px;
}
.session-create-back {
width: 24px;
height: 24px;
background: transparent;
border: 1px solid var(--border-2);
border-radius: 4px;
color: var(--muted);
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
transition: all var(--transition);
flex-shrink: 0;
}
.session-create-back:hover {
border-color: var(--accent);
color: var(--accent);
}
.session-create-agent-name {
font-size: 12px;
font-weight: 600;
color: var(--text);
}
.session-create-form {
display: flex;
flex-direction: column;
gap: 0;
padding: 4px 2px;
}
.session-create-form .setup-field {
display: flex;
flex-direction: row;
align-items: center;
gap: 8px;
height: 28px;
}
.session-create-form .setup-label {
width: 72px;
flex-shrink: 0;
text-align: right;
}
.session-create-form .setup-select,
.session-create-form .setup-input {
flex: 1;
min-width: 0;
}
.session-create-section {
overflow: hidden;
}
.session-create-section-toggle {
display: flex;
align-items: center;
gap: 8px;
width: 100%;
height: 28px;
padding: 0;
background: transparent;
border: none;
color: var(--text-secondary);
font-size: 11px;
cursor: pointer;
transition: color var(--transition);
}
.session-create-section-toggle:hover {
color: var(--text);
}
.session-create-section-toggle .setup-label {
width: 72px;
flex-shrink: 0;
text-align: right;
}
.session-create-section-count {
font-size: 11px;
font-weight: 400;
color: var(--muted);
}
.session-create-section-arrow {
margin-left: auto;
color: var(--muted-2);
flex-shrink: 0;
}
.session-create-section-body {
margin: 4px 0 6px;
padding: 8px;
border: 1px solid var(--border-2);
border-radius: 4px;
background: var(--surface-2);
}
.session-create-textarea {
width: 100%;
background: var(--surface-2);
border: 1px solid var(--border-2);
border-radius: 4px;
padding: 6px 8px;
font-size: 10px;
color: var(--text);
outline: none;
resize: vertical;
min-height: 60px;
font-family: ui-monospace, SFMono-Regular, 'SF Mono', Consolas, monospace;
transition: border-color var(--transition);
}
.session-create-textarea:focus {
border-color: var(--accent);
}
.session-create-textarea::placeholder {
color: var(--muted-2);
}
.session-create-inline-error {
font-size: 10px;
color: var(--danger);
margin-top: 4px;
line-height: 1.4;
}
.session-create-skill-list {
display: flex;
flex-direction: column;
gap: 2px;
margin-bottom: 4px;
}
.session-create-skill-item {
display: flex;
align-items: center;
gap: 4px;
padding: 3px 4px 3px 8px;
background: var(--surface-2);
border: 1px solid var(--border-2);
border-radius: 4px;
}
.session-create-skill-path {
flex: 1;
min-width: 0;
font-size: 10px;
color: var(--text-secondary);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
.session-create-skill-remove {
width: 18px;
height: 18px;
background: transparent;
border: none;
border-radius: 3px;
color: var(--muted);
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
flex-shrink: 0;
transition: all var(--transition);
}
.session-create-skill-remove:hover {
color: var(--danger);
background: rgba(255, 59, 48, 0.12);
}
.session-create-skill-add-row {
display: flex;
}
.session-create-skill-input {
width: 100%;
background: var(--surface-2);
border: 1px solid var(--accent);
border-radius: 4px;
padding: 4px 8px;
font-size: 10px;
color: var(--text);
outline: none;
font-family: ui-monospace, SFMono-Regular, 'SF Mono', Consolas, monospace;
}
.session-create-skill-input::placeholder {
color: var(--muted-2);
}
.session-create-skill-type-badge {
display: inline-flex;
align-items: center;
padding: 1px 5px;
border-radius: 3px;
font-size: 9px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.3px;
background: rgba(255, 79, 0, 0.15);
color: var(--accent);
flex-shrink: 0;
}
.session-create-skill-type-row {
display: flex;
gap: 4px;
}
.session-create-skill-type-select {
width: 80px;
flex-shrink: 0;
background: var(--surface-2);
border: 1px solid var(--accent);
border-radius: 4px;
padding: 4px 6px;
font-size: 10px;
color: var(--text);
outline: none;
cursor: pointer;
}
.session-create-mcp-list {
display: flex;
flex-direction: column;
gap: 2px;
margin-bottom: 4px;
}
.session-create-mcp-item {
display: flex;
align-items: center;
gap: 4px;
padding: 3px 4px 3px 8px;
background: var(--surface-2);
border: 1px solid var(--border-2);
border-radius: 4px;
}
.session-create-mcp-info {
flex: 1;
min-width: 0;
display: flex;
align-items: center;
gap: 6px;
}
.session-create-mcp-name {
font-size: 11px;
font-weight: 600;
color: var(--text);
white-space: nowrap;
}
.session-create-mcp-type {
font-size: 9px;
font-weight: 500;
text-transform: uppercase;
letter-spacing: 0.3px;
color: var(--muted);
background: var(--surface);
padding: 1px 4px;
border-radius: 3px;
white-space: nowrap;
}
.session-create-mcp-summary {
font-size: 10px;
color: var(--muted);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
min-width: 0;
}
.session-create-mcp-actions {
display: flex;
align-items: center;
gap: 2px;
flex-shrink: 0;
}
.session-create-mcp-edit {
display: flex;
flex-direction: column;
gap: 4px;
}
.session-create-mcp-name-input {
width: 100%;
background: var(--surface-2);
border: 1px solid var(--accent);
border-radius: 4px;
padding: 4px 8px;
font-size: 11px;
color: var(--text);
outline: none;
}
.session-create-mcp-name-input:disabled {
opacity: 0.55;
cursor: not-allowed;
}
.session-create-mcp-name-input::placeholder {
color: var(--muted-2);
}
.session-create-mcp-edit-actions {
display: flex;
gap: 4px;
}
.session-create-mcp-save,
.session-create-mcp-cancel {
flex: 1;
padding: 4px 8px;
border-radius: 4px;
border: none;
font-size: 10px;
font-weight: 600;
cursor: pointer;
transition: background var(--transition);
}
.session-create-mcp-save {
background: var(--accent);
color: #fff;
}
.session-create-mcp-save:hover {
background: var(--accent-hover);
}
.session-create-mcp-cancel {
background: var(--border-2);
color: var(--text-secondary);
}
.session-create-mcp-cancel:hover {
background: var(--muted-2);
}
.session-create-add-btn {
display: flex;
align-items: center;
gap: 4px;
width: 100%;
padding: 4px 8px;
background: transparent;
border: 1px dashed var(--border-2);
border-radius: 4px;
color: var(--muted);
font-size: 10px;
cursor: pointer;
transition: all var(--transition);
}
.session-create-add-btn:hover {
border-color: var(--accent);
color: var(--accent);
}
.session-create-actions {
padding: 4px 2px 2px;
margin-top: 4px;
}
.session-create-actions .button.primary {
width: 100%;
padding: 8px 12px;
font-size: 12px;
}
/* Empty state variant of session-create-menu */
.empty-state-menu-wrapper .session-create-menu {
top: 100%;
left: 50%;
transform: translateX(-50%);
margin-top: 8px;
}
.sidebar-add-option {
background: transparent;
border: 1px solid transparent;
@ -515,12 +921,40 @@
.agent-option-left {
display: flex;
flex-direction: column;
align-items: flex-start;
gap: 2px;
min-width: 0;
}
.agent-option-name {
white-space: nowrap;
min-width: 0;
}
.agent-option-version {
font-size: 10px;
color: var(--muted);
white-space: nowrap;
}
.sidebar-add-option:hover .agent-option-version {
color: rgba(255, 255, 255, 0.6);
}
.agent-option-badges {
display: flex;
align-items: center;
gap: 6px;
flex-shrink: 0;
}
.agent-option-arrow {
color: var(--muted-2);
transition: color var(--transition);
}
.sidebar-add-option:hover .agent-option-arrow {
color: rgba(255, 255, 255, 0.6);
}
.agent-badge {
@ -535,9 +969,6 @@
flex-shrink: 0;
}
.agent-badge.version {
color: var(--muted);
}
.sidebar-add-status {
padding: 6px 8px;
@ -1043,6 +1474,36 @@
height: 16px;
}
/* Session Config Bar */
.session-config-bar {
display: flex;
align-items: flex-start;
gap: 20px;
padding: 10px 16px 12px;
border-top: 1px solid var(--border);
flex-shrink: 0;
flex-wrap: wrap;
}
.session-config-field {
display: flex;
flex-direction: column;
gap: 2px;
}
.session-config-label {
font-size: 10px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.5px;
color: var(--muted);
}
.session-config-value {
font-size: 12px;
color: #8e8e93;
}
/* Setup Row */
.setup-row {
display: flex;
@ -1207,6 +1668,29 @@
color: #fff;
}
.setup-config-actions {
display: flex;
gap: 6px;
flex-wrap: wrap;
}
.setup-config-btn {
border: 1px solid var(--border-2);
border-radius: 4px;
background: var(--surface);
color: var(--text-secondary);
}
.setup-config-btn:hover {
border-color: var(--accent);
color: var(--accent);
}
.setup-config-btn.error {
color: var(--danger);
border-color: rgba(255, 59, 48, 0.4);
}
.setup-version {
font-size: 10px;
color: var(--muted);
@ -1311,6 +1795,15 @@
margin-bottom: 0;
}
.config-textarea {
min-height: 130px;
}
.config-inline-error {
margin-top: 8px;
margin-bottom: 0;
}
.card-header {
display: flex;
align-items: center;
@ -1319,6 +1812,16 @@
margin-bottom: 8px;
}
.card-header-pills {
display: flex;
align-items: center;
gap: 6px;
}
.spinner-icon {
animation: spin 0.8s linear infinite;
}
.card-title {
font-size: 13px;
font-weight: 600;

@ -3,11 +3,13 @@ import {
SandboxAgentError,
SandboxAgent,
type AgentInfo,
type CreateSessionRequest,
type AgentModelInfo,
type AgentModeInfo,
type PermissionEventData,
type QuestionEventData,
type SessionInfo,
type SkillSource,
type UniversalEvent,
type UniversalItem
} from "sandbox-agent";
@ -32,6 +34,41 @@ type ItemDeltaEventData = {
delta: string;
};
export type McpServerEntry = {
name: string;
configJson: string;
error: string | null;
};
type ParsedMcpConfig = {
value: NonNullable<CreateSessionRequest["mcp"]>;
count: number;
error: string | null;
};
const buildMcpConfig = (entries: McpServerEntry[]): ParsedMcpConfig => {
if (entries.length === 0) {
return { value: {}, count: 0, error: null };
}
const firstError = entries.find((e) => e.error);
if (firstError) {
return { value: {}, count: entries.length, error: `${firstError.name}: ${firstError.error}` };
}
const value: NonNullable<CreateSessionRequest["mcp"]> = {};
for (const entry of entries) {
try {
value[entry.name] = JSON.parse(entry.configJson);
} catch {
return { value: {}, count: entries.length, error: `${entry.name}: Invalid JSON` };
}
}
return { value, count: entries.length, error: null };
};
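A minimal sketch of how `buildMcpConfig` behaves on valid and invalid input. The types `ServerEntry`, `McpMap`, and `Parsed` are local stand-ins introduced here for illustration (the real ones come from `sandbox-agent` and this file), and the function is renamed `buildMcpConfigSketch` so it is not mistaken for the original.

```typescript
// Hypothetical stand-ins for McpServerEntry and CreateSessionRequest["mcp"].
type ServerEntry = { name: string; configJson: string; error: string | null };
type McpMap = Record<string, unknown>;
type Parsed = { value: McpMap; count: number; error: string | null };

// Same control flow as buildMcpConfig: empty list, then stored errors, then JSON parsing.
const buildMcpConfigSketch = (entries: ServerEntry[]): Parsed => {
  if (entries.length === 0) return { value: {}, count: 0, error: null };
  const firstError = entries.find((e) => e.error);
  if (firstError) {
    return { value: {}, count: entries.length, error: `${firstError.name}: ${firstError.error}` };
  }
  const value: McpMap = {};
  for (const entry of entries) {
    try {
      value[entry.name] = JSON.parse(entry.configJson);
    } catch {
      // The first unparsable entry aborts the whole build and reports its name.
      return { value: {}, count: entries.length, error: `${entry.name}: Invalid JSON` };
    }
  }
  return { value, count: entries.length, error: null };
};

const okResult = buildMcpConfigSketch([
  { name: "everything", configJson: '{"type":"local","command":"npx"}', error: null }
]);
const badResult = buildMcpConfigSketch([
  { name: "broken", configJson: "{not json", error: null }
]);
```

Note that `count` reports the number of entries even when `error` is set, so a caller that only checks `count > 0` would still need to check `error` first, as `createNewSession` does.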
const buildSkillsConfig = (sources: SkillSource[]): NonNullable<CreateSessionRequest["skills"]> => {
return { sources };
};
const buildStubItem = (itemId: string, nativeItemId?: string | null): UniversalItem => {
return {
item_id: itemId,
@ -53,6 +90,23 @@ const getCurrentOriginEndpoint = () => {
return window.location.origin;
};
const getSessionIdFromPath = (): string => {
const basePath = import.meta.env.BASE_URL;
const path = window.location.pathname;
const relative = path.startsWith(basePath) ? path.slice(basePath.length) : path;
const match = relative.match(/^sessions\/(.+)/);
return match ? match[1] : "";
};
const updateSessionPath = (id: string) => {
const basePath = import.meta.env.BASE_URL;
const params = window.location.search;
const newPath = id ? `${basePath}sessions/${id}${params}` : `${basePath}${params}`;
if (window.location.pathname + window.location.search !== newPath) {
window.history.replaceState(null, "", newPath);
}
};
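The two path helpers can be checked without a browser by passing the base path and pathname in explicitly. This is a sketch only: `parseSessionId` and `buildSessionPath` are hypothetical names, and it assumes the base path ends with a trailing slash, as the template strings in `updateSessionPath` imply.

```typescript
// Pure variant of getSessionIdFromPath: strips the base path, then matches "sessions/<id>".
const parseSessionId = (basePath: string, pathname: string): string => {
  const relative = pathname.startsWith(basePath) ? pathname.slice(basePath.length) : pathname;
  const match = relative.match(/^sessions\/(.+)/);
  return match ? match[1] : "";
};

// Pure variant of the path construction in updateSessionPath; `search` keeps the query string.
const buildSessionPath = (basePath: string, id: string, search: string): string =>
  id ? `${basePath}sessions/${id}${search}` : `${basePath}${search}`;

const roundTripPath = buildSessionPath("/ui/", "session-abc12345", "?token=x");
const roundTripId = parseSessionId("/ui/", "/ui/sessions/session-abc12345");
```

Because the pattern is `(.+)`, anything after `sessions/` is captured, including further slashes, so an id is whatever remains of the path.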
const getInitialConnection = () => {
if (typeof window === "undefined") {
return { endpoint: "http://127.0.0.1:2468", token: "", headers: {} as Record<string, string>, hasUrlParam: false };
@ -103,11 +157,7 @@ export default function App() {
const [modelsErrorByAgent, setModelsErrorByAgent] = useState<Record<string, string | null>>({});
const [agentId, setAgentId] = useState("claude");
const [agentMode, setAgentMode] = useState("");
const [permissionMode, setPermissionMode] = useState("default");
const [model, setModel] = useState("");
const [variant, setVariant] = useState("");
const [sessionId, setSessionId] = useState("");
const [sessionId, setSessionId] = useState(getSessionIdFromPath());
const [sessionError, setSessionError] = useState<string | null>(null);
const [message, setMessage] = useState("");
@ -115,6 +165,8 @@ export default function App() {
const [offset, setOffset] = useState(0);
const offsetRef = useRef(0);
const [eventsLoading, setEventsLoading] = useState(false);
const [mcpServers, setMcpServers] = useState<McpServerEntry[]>([]);
const [skillSources, setSkillSources] = useState<SkillSource[]>([]);
const [polling, setPolling] = useState(false);
const pollTimerRef = useRef<number | null>(null);
@ -377,50 +429,52 @@ export default function App() {
stopSse();
stopTurnStream();
setSessionId(session.sessionId);
updateSessionPath(session.sessionId);
setAgentId(session.agent);
setAgentMode(session.agentMode);
setPermissionMode(session.permissionMode);
setModel(session.model ?? "");
setVariant(session.variant ?? "");
setEvents([]);
setOffset(0);
offsetRef.current = 0;
setSessionError(null);
};
const createNewSession = async (nextAgentId?: string) => {
const createNewSession = async (
nextAgentId: string,
config: { model: string; agentMode: string; permissionMode: string; variant: string }
) => {
stopPolling();
stopSse();
stopTurnStream();
const selectedAgent = nextAgentId ?? agentId;
if (nextAgentId) {
setAgentId(nextAgentId);
if (parsedMcpConfig.error) {
setSessionError(parsedMcpConfig.error);
return;
}
const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
let id = "session-";
for (let i = 0; i < 8; i++) {
id += chars[Math.floor(Math.random() * chars.length)];
}
setSessionId(id);
setEvents([]);
setOffset(0);
offsetRef.current = 0;
setSessionError(null);
try {
const body: {
agent: string;
agentMode?: string;
permissionMode?: string;
model?: string;
variant?: string;
} = { agent: selectedAgent };
if (agentMode) body.agentMode = agentMode;
if (permissionMode) body.permissionMode = permissionMode;
if (model) body.model = model;
if (variant) body.variant = variant;
const body: CreateSessionRequest = { agent: nextAgentId };
if (config.agentMode) body.agentMode = config.agentMode;
if (config.permissionMode) body.permissionMode = config.permissionMode;
if (config.model) body.model = config.model;
if (config.variant) body.variant = config.variant;
if (parsedMcpConfig.count > 0) {
body.mcp = parsedMcpConfig.value;
}
if (parsedSkillsConfig.sources.length > 0) {
body.skills = parsedSkillsConfig;
}
await getClient().createSession(id, body);
setSessionId(id);
updateSessionPath(id);
setEvents([]);
setOffset(0);
offsetRef.current = 0;
await fetchSessions();
} catch (error) {
setSessionError(getErrorMessage(error, "Unable to create session"));
@ -762,6 +816,30 @@ export default function App() {
});
break;
}
case "turn.started": {
entries.push({
id: event.event_id,
kind: "meta",
time: event.time,
meta: {
title: "Turn started",
severity: "info"
}
});
break;
}
case "turn.ended": {
entries.push({
id: event.event_id,
kind: "meta",
time: event.time,
meta: {
title: "Turn ended",
severity: "info"
}
});
break;
}
default:
break;
}
@ -852,38 +930,10 @@ export default function App() {
messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
}, [transcriptEntries]);
useEffect(() => {
if (connected && agentId && !modesByAgent[agentId]) {
loadModes(agentId);
}
}, [connected, agentId]);
useEffect(() => {
if (connected && agentId && !modelsByAgent[agentId]) {
loadModels(agentId);
}
}, [connected, agentId]);
useEffect(() => {
const modes = modesByAgent[agentId];
if (modes && modes.length > 0 && !agentMode) {
setAgentMode(modes[0].id);
}
}, [modesByAgent, agentId]);
const currentAgent = agents.find((agent) => agent.id === agentId);
const activeModes = modesByAgent[agentId] ?? [];
const modesLoading = modesLoadingByAgent[agentId] ?? false;
const modesError = modesErrorByAgent[agentId] ?? null;
const modelOptions = modelsByAgent[agentId] ?? [];
const modelsLoading = modelsLoadingByAgent[agentId] ?? false;
const modelsError = modelsErrorByAgent[agentId] ?? null;
const defaultModel = defaultModelByAgent[agentId] ?? "";
const selectedModelId = model || defaultModel;
const selectedModel = modelOptions.find((entry) => entry.id === selectedModelId);
const variantOptions = selectedModel?.variants ?? [];
const defaultVariant = selectedModel?.defaultVariant ?? "";
const supportsVariants = Boolean(currentAgent?.capabilities?.variants);
const currentSessionInfo = sessions.find((s) => s.sessionId === sessionId);
const parsedMcpConfig = useMemo(() => buildMcpConfig(mcpServers), [mcpServers]);
const parsedSkillsConfig = useMemo(() => buildSkillsConfig(skillSources), [skillSources]);
const agentDisplayNames: Record<string, string> = {
claude: "Claude Code",
codex: "Codex",
@ -894,6 +944,15 @@ export default function App() {
};
const agentLabel = agentDisplayNames[agentId] ?? agentId;
const handleSelectAgent = useCallback((targetAgentId: string) => {
if (connected && !modesByAgent[targetAgentId]) {
loadModes(targetAgentId);
}
if (connected && !modelsByAgent[targetAgentId]) {
loadModels(targetAgentId);
}
}, [connected, modesByAgent, modelsByAgent]);
const handleKeyDown = (event: React.KeyboardEvent<HTMLTextAreaElement>) => {
if (event.key === "Enter" && !event.shiftKey) {
event.preventDefault();
@ -957,17 +1016,28 @@ export default function App() {
onSelectSession={selectSession}
onRefresh={fetchSessions}
onCreateSession={createNewSession}
onSelectAgent={handleSelectAgent}
agents={agents.length ? agents : defaultAgents.map((id) => ({ id, installed: false, capabilities: {} }) as AgentInfo)}
agentsLoading={agentsLoading}
agentsError={agentsError}
sessionsLoading={sessionsLoading}
sessionsError={sessionsError}
modesByAgent={modesByAgent}
modelsByAgent={modelsByAgent}
defaultModelByAgent={defaultModelByAgent}
modesLoadingByAgent={modesLoadingByAgent}
modelsLoadingByAgent={modelsLoadingByAgent}
modesErrorByAgent={modesErrorByAgent}
modelsErrorByAgent={modelsErrorByAgent}
mcpServers={mcpServers}
onMcpServersChange={setMcpServers}
mcpConfigError={parsedMcpConfig.error}
skillSources={skillSources}
onSkillSourcesChange={setSkillSources}
/>
<ChatPanel
sessionId={sessionId}
polling={polling}
turnStreaming={turnStreaming}
transcriptEntries={transcriptEntries}
sessionError={sessionError}
message={message}
@ -975,36 +1045,19 @@ export default function App() {
onSendMessage={sendMessage}
onKeyDown={handleKeyDown}
onCreateSession={createNewSession}
onSelectAgent={handleSelectAgent}
agents={agents.length ? agents : defaultAgents.map((id) => ({ id, installed: false, capabilities: {} }) as AgentInfo)}
agentsLoading={agentsLoading}
agentsError={agentsError}
messagesEndRef={messagesEndRef}
agentId={agentId}
agentLabel={agentLabel}
agentMode={agentMode}
permissionMode={permissionMode}
model={model}
variant={variant}
modelOptions={modelOptions}
defaultModel={defaultModel}
modelsLoading={modelsLoading}
modelsError={modelsError}
variantOptions={variantOptions}
defaultVariant={defaultVariant}
supportsVariants={supportsVariants}
streamMode={streamMode}
activeModes={activeModes}
currentAgentVersion={currentAgent?.version ?? null}
modesLoading={modesLoading}
modesError={modesError}
onAgentModeChange={setAgentMode}
onPermissionModeChange={setPermissionMode}
onModelChange={setModel}
onVariantChange={setVariant}
onStreamModeChange={setStreamMode}
onToggleStream={toggleStream}
sessionModel={currentSessionInfo?.model ?? null}
sessionVariant={currentSessionInfo?.variant ?? null}
sessionPermissionMode={currentSessionInfo?.permissionMode ?? null}
sessionMcpServerCount={currentSessionInfo?.mcp ? Object.keys(currentSessionInfo.mcp).length : 0}
sessionSkillSourceCount={currentSessionInfo?.skills?.sources?.length ?? 0}
onEndSession={endSession}
hasSession={Boolean(sessionId)}
eventError={eventError}
questionRequests={questionRequests}
permissionRequests={permissionRequests}
@ -1013,6 +1066,18 @@ export default function App() {
onAnswerQuestion={answerQuestion}
onRejectQuestion={rejectQuestion}
onReplyPermission={replyPermission}
modesByAgent={modesByAgent}
modelsByAgent={modelsByAgent}
defaultModelByAgent={defaultModelByAgent}
modesLoadingByAgent={modesLoadingByAgent}
modelsLoadingByAgent={modelsLoadingByAgent}
modesErrorByAgent={modesErrorByAgent}
modelsErrorByAgent={modelsErrorByAgent}
mcpServers={mcpServers}
onMcpServersChange={setMcpServers}
mcpConfigError={parsedMcpConfig.error}
skillSources={skillSources}
onSkillSourcesChange={setSkillSources}
/>
<DebugPanel

@ -0,0 +1,750 @@
import { ArrowLeft, ArrowRight, ChevronDown, ChevronRight, Pencil, Plus, X } from "lucide-react";
import { useEffect, useRef, useState } from "react";
import type { AgentInfo, AgentModelInfo, AgentModeInfo, SkillSource } from "sandbox-agent";
import type { McpServerEntry } from "../App";
export type SessionConfig = {
model: string;
agentMode: string;
permissionMode: string;
variant: string;
};
const agentLabels: Record<string, string> = {
claude: "Claude Code",
codex: "Codex",
opencode: "OpenCode",
amp: "Amp",
mock: "Mock"
};
const validateServerJson = (json: string): string | null => {
const trimmed = json.trim();
if (!trimmed) return "Config is required";
try {
const parsed = JSON.parse(trimmed);
if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
return "Must be a JSON object";
}
if (!parsed.type) return 'Missing "type" field';
if (parsed.type !== "local" && parsed.type !== "remote") {
return 'Type must be "local" or "remote"';
}
if (parsed.type === "local" && !parsed.command) return 'Local server requires "command"';
if (parsed.type === "remote" && !parsed.url) return 'Remote server requires "url"';
return null;
} catch {
return "Invalid JSON";
}
};
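To make the validation order concrete, here is a self-contained sketch that repeats the checks of `validateServerJson` under the hypothetical name `validateServerJsonSketch` and runs it on a few inputs. The example URL is a placeholder, not a real endpoint.

```typescript
// Mirrors validateServerJson: empty check, JSON parse, object check, then type-specific fields.
const validateServerJsonSketch = (json: string): string | null => {
  const trimmed = json.trim();
  if (!trimmed) return "Config is required";
  try {
    const parsed = JSON.parse(trimmed);
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
      return "Must be a JSON object";
    }
    if (!parsed.type) return 'Missing "type" field';
    if (parsed.type !== "local" && parsed.type !== "remote") {
      return 'Type must be "local" or "remote"';
    }
    if (parsed.type === "local" && !parsed.command) return 'Local server requires "command"';
    if (parsed.type === "remote" && !parsed.url) return 'Remote server requires "url"';
    return null;
  } catch {
    return "Invalid JSON";
  }
};

// A null result means the config passed every check.
const validationResults = {
  empty: validateServerJsonSketch("   "),
  array: validateServerJsonSketch("[]"),
  localNoCommand: validateServerJsonSketch('{"type":"local"}'),
  remoteOk: validateServerJsonSketch('{"type":"remote","url":"https://example.com/mcp"}')
};
```

The checks only confirm that the required fields are present and truthy; they do not verify that `command` is runnable or that `url` is well formed.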
const getServerType = (configJson: string): string | null => {
try {
const parsed = JSON.parse(configJson);
return parsed?.type ?? null;
} catch {
return null;
}
};
const getServerSummary = (configJson: string): string => {
try {
const parsed = JSON.parse(configJson);
if (parsed?.type === "local") {
const cmd = Array.isArray(parsed.command) ? parsed.command.join(" ") : parsed.command;
return cmd ?? "local";
}
if (parsed?.type === "remote") {
return parsed.url ?? "remote";
}
return parsed?.type ?? "";
} catch {
return "";
}
};
const skillSourceSummary = (source: SkillSource): string => {
let summary = source.source;
if (source.skills && source.skills.length > 0) {
summary += ` [${source.skills.join(", ")}]`;
}
return summary;
};
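The summary format pairs naturally with the comma-separated filter parsing used when a skill source is committed. The sketch below combines the two; `SkillSourceSketch`, `parseSkillFilter`, and `summarizeSkillSource` are hypothetical names, and the type is a reduced stand-in for `SkillSource` from `sandbox-agent`.

```typescript
// Reduced stand-in for SkillSource; only the fields used by the summary are modeled.
type SkillSourceSketch = { type: string; source: string; skills?: string[] };

// Splits on commas, trims each piece, and drops empty pieces.
const parseSkillFilter = (raw: string): string[] =>
  raw.split(",").map((s) => s.trim()).filter(Boolean);

// Same format as skillSourceSummary: source, then the optional skill list in brackets.
const summarizeSkillSource = (source: SkillSourceSketch): string => {
  let summary = source.source;
  if (source.skills && source.skills.length > 0) {
    summary += ` [${source.skills.join(", ")}]`;
  }
  return summary;
};

const demoSkills = parseSkillFilter(" sandbox-agent, ,docs ");
const demoEntry: SkillSourceSketch = { type: "github", source: "rivet-dev/skills" };
if (demoSkills.length > 0) demoEntry.skills = demoSkills;
const demoSummary = summarizeSkillSource(demoEntry);
```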
const SessionCreateMenu = ({
agents,
agentsLoading,
agentsError,
modesByAgent,
modelsByAgent,
defaultModelByAgent,
modesLoadingByAgent,
modelsLoadingByAgent,
modesErrorByAgent,
modelsErrorByAgent,
mcpServers,
onMcpServersChange,
mcpConfigError,
skillSources,
onSkillSourcesChange,
onSelectAgent,
onCreateSession,
open,
onClose
}: {
agents: AgentInfo[];
agentsLoading: boolean;
agentsError: string | null;
modesByAgent: Record<string, AgentModeInfo[]>;
modelsByAgent: Record<string, AgentModelInfo[]>;
defaultModelByAgent: Record<string, string>;
modesLoadingByAgent: Record<string, boolean>;
modelsLoadingByAgent: Record<string, boolean>;
modesErrorByAgent: Record<string, string | null>;
modelsErrorByAgent: Record<string, string | null>;
mcpServers: McpServerEntry[];
onMcpServersChange: (servers: McpServerEntry[]) => void;
mcpConfigError: string | null;
skillSources: SkillSource[];
onSkillSourcesChange: (sources: SkillSource[]) => void;
onSelectAgent: (agentId: string) => void;
onCreateSession: (agentId: string, config: SessionConfig) => void;
open: boolean;
onClose: () => void;
}) => {
const [phase, setPhase] = useState<"agent" | "config">("agent");
const [selectedAgent, setSelectedAgent] = useState("");
const [agentMode, setAgentMode] = useState("");
const [permissionMode, setPermissionMode] = useState("default");
const [model, setModel] = useState("");
const [variant, setVariant] = useState("");
const [mcpExpanded, setMcpExpanded] = useState(false);
const [skillsExpanded, setSkillsExpanded] = useState(false);
// Skill add/edit state
const [addingSkill, setAddingSkill] = useState(false);
const [editingSkillIndex, setEditingSkillIndex] = useState<number | null>(null);
const [skillType, setSkillType] = useState<"github" | "local" | "git">("github");
const [skillSource, setSkillSource] = useState("");
const [skillFilter, setSkillFilter] = useState("");
const [skillRef, setSkillRef] = useState("");
const [skillSubpath, setSkillSubpath] = useState("");
const [skillLocalError, setSkillLocalError] = useState<string | null>(null);
const skillSourceRef = useRef<HTMLInputElement>(null);
// MCP add/edit state
const [addingMcp, setAddingMcp] = useState(false);
const [editingMcpIndex, setEditingMcpIndex] = useState<number | null>(null);
const [mcpName, setMcpName] = useState("");
const [mcpJson, setMcpJson] = useState("");
const [mcpLocalError, setMcpLocalError] = useState<string | null>(null);
const mcpNameRef = useRef<HTMLInputElement>(null);
const mcpJsonRef = useRef<HTMLTextAreaElement>(null);
const cancelSkillEdit = () => {
setAddingSkill(false);
setEditingSkillIndex(null);
setSkillType("github");
setSkillSource("");
setSkillFilter("");
setSkillRef("");
setSkillSubpath("");
setSkillLocalError(null);
};
// Reset state when menu closes
useEffect(() => {
if (!open) {
setPhase("agent");
setSelectedAgent("");
setAgentMode("");
setPermissionMode("default");
setModel("");
setVariant("");
setMcpExpanded(false);
setSkillsExpanded(false);
cancelSkillEdit();
setAddingMcp(false);
setEditingMcpIndex(null);
setMcpName("");
setMcpJson("");
setMcpLocalError(null);
}
}, [open]);
// Auto-select first mode when modes load for selected agent
useEffect(() => {
if (!selectedAgent) return;
const modes = modesByAgent[selectedAgent];
if (modes && modes.length > 0 && !agentMode) {
setAgentMode(modes[0].id);
}
}, [modesByAgent, selectedAgent, agentMode]);
// Focus skill source input when adding or editing
useEffect(() => {
if ((addingSkill || editingSkillIndex !== null) && skillSourceRef.current) {
skillSourceRef.current.focus();
}
}, [addingSkill, editingSkillIndex]);
// Focus MCP name input when adding
useEffect(() => {
if (addingMcp && mcpNameRef.current) {
mcpNameRef.current.focus();
}
}, [addingMcp]);
// Focus MCP json textarea when editing
useEffect(() => {
if (editingMcpIndex !== null && mcpJsonRef.current) {
mcpJsonRef.current.focus();
}
}, [editingMcpIndex]);
if (!open) return null;
const handleAgentClick = (agentId: string) => {
setSelectedAgent(agentId);
setPhase("config");
onSelectAgent(agentId);
};
const handleBack = () => {
setPhase("agent");
setSelectedAgent("");
setAgentMode("");
setPermissionMode("default");
setModel("");
setVariant("");
};
const handleCreate = () => {
if (mcpConfigError) return;
onCreateSession(selectedAgent, { model, agentMode, permissionMode, variant });
onClose();
};
// Skill source helpers
const startAddSkill = () => {
setAddingSkill(true);
setEditingSkillIndex(null);
setSkillType("github");
setSkillSource("rivet-dev/skills");
setSkillFilter("sandbox-agent");
setSkillRef("");
setSkillSubpath("");
setSkillLocalError(null);
};
const startEditSkill = (index: number) => {
const entry = skillSources[index];
setEditingSkillIndex(index);
setAddingSkill(false);
setSkillType(entry.type as "github" | "local" | "git");
setSkillSource(entry.source);
setSkillFilter(entry.skills?.join(", ") ?? "");
setSkillRef(entry.ref ?? "");
setSkillSubpath(entry.subpath ?? "");
setSkillLocalError(null);
};
const commitSkill = () => {
const src = skillSource.trim();
if (!src) {
setSkillLocalError("Source is required");
return;
}
const entry: SkillSource = {
type: skillType,
source: src,
};
const filterList = skillFilter.trim()
? skillFilter.split(",").map((s) => s.trim()).filter(Boolean)
: undefined;
if (filterList && filterList.length > 0) entry.skills = filterList;
if (skillRef.trim()) entry.ref = skillRef.trim();
if (skillSubpath.trim()) entry.subpath = skillSubpath.trim();
if (editingSkillIndex !== null) {
const updated = [...skillSources];
updated[editingSkillIndex] = entry;
onSkillSourcesChange(updated);
} else {
onSkillSourcesChange([...skillSources, entry]);
}
cancelSkillEdit();
};
const removeSkill = (index: number) => {
onSkillSourcesChange(skillSources.filter((_, i) => i !== index));
if (editingSkillIndex === index) {
cancelSkillEdit();
}
};
const isEditingSkill = addingSkill || editingSkillIndex !== null;
const startAddMcp = () => {
setAddingMcp(true);
setEditingMcpIndex(null);
setMcpName("everything");
setMcpJson('{\n "type": "local",\n "command": "npx",\n "args": ["@modelcontextprotocol/server-everything"]\n}');
setMcpLocalError(null);
};
const startEditMcp = (index: number) => {
const entry = mcpServers[index];
setEditingMcpIndex(index);
setAddingMcp(false);
setMcpName(entry.name);
setMcpJson(entry.configJson);
setMcpLocalError(entry.error);
};
const cancelMcpEdit = () => {
setAddingMcp(false);
setEditingMcpIndex(null);
setMcpName("");
setMcpJson("");
setMcpLocalError(null);
};
const commitMcp = () => {
const name = mcpName.trim();
if (!name) {
setMcpLocalError("Server name is required");
return;
}
const error = validateServerJson(mcpJson);
if (error) {
setMcpLocalError(error);
return;
}
// Check for duplicate names (except when editing the same entry)
const duplicate = mcpServers.findIndex((e) => e.name === name);
if (duplicate !== -1 && duplicate !== editingMcpIndex) {
setMcpLocalError(`Server "${name}" already exists`);
return;
}
const entry: McpServerEntry = { name, configJson: mcpJson.trim(), error: null };
if (editingMcpIndex !== null) {
const updated = [...mcpServers];
updated[editingMcpIndex] = entry;
onMcpServersChange(updated);
} else {
onMcpServersChange([...mcpServers, entry]);
}
cancelMcpEdit();
};
const removeMcp = (index: number) => {
onMcpServersChange(mcpServers.filter((_, i) => i !== index));
if (editingMcpIndex === index) {
cancelMcpEdit();
}
};
const isEditingMcp = addingMcp || editingMcpIndex !== null;
if (phase === "agent") {
return (
<div className="session-create-menu">
{agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
{agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
{!agentsLoading && !agentsError && agents.length === 0 && (
<div className="sidebar-add-status">No agents available.</div>
)}
{!agentsLoading && !agentsError &&
agents.map((agent) => (
<button
key={agent.id}
className="sidebar-add-option"
onClick={() => handleAgentClick(agent.id)}
>
<div className="agent-option-left">
<span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
{agent.version && <span className="agent-option-version">{agent.version}</span>}
</div>
<div className="agent-option-badges">
{agent.installed && <span className="agent-badge installed">Installed</span>}
<ArrowRight size={12} className="agent-option-arrow" />
</div>
</button>
))}
</div>
);
}
// Phase 2: config form
const activeModes = modesByAgent[selectedAgent] ?? [];
const modesLoading = modesLoadingByAgent[selectedAgent] ?? false;
const modesError = modesErrorByAgent[selectedAgent] ?? null;
const modelOptions = modelsByAgent[selectedAgent] ?? [];
const modelsLoading = modelsLoadingByAgent[selectedAgent] ?? false;
const modelsError = modelsErrorByAgent[selectedAgent] ?? null;
const defaultModel = defaultModelByAgent[selectedAgent] ?? "";
const selectedModelId = model || defaultModel;
const selectedModelObj = modelOptions.find((entry) => entry.id === selectedModelId);
const variantOptions = selectedModelObj?.variants ?? [];
const showModelSelect = modelsLoading || Boolean(modelsError) || modelOptions.length > 0;
const hasModelOptions = modelOptions.length > 0;
const modelCustom =
model && hasModelOptions && !modelOptions.some((entry) => entry.id === model);
const supportsVariants =
modelsLoading ||
Boolean(modelsError) ||
modelOptions.some((entry) => (entry.variants?.length ?? 0) > 0);
const showVariantSelect =
supportsVariants && (modelsLoading || Boolean(modelsError) || variantOptions.length > 0);
const hasVariantOptions = variantOptions.length > 0;
const variantCustom = variant && hasVariantOptions && !variantOptions.includes(variant);
const agentLabel = agentLabels[selectedAgent] ?? selectedAgent;
return (
<div className="session-create-menu">
<div className="session-create-header">
<button className="session-create-back" onClick={handleBack} title="Back to agents">
<ArrowLeft size={14} />
</button>
<span className="session-create-agent-name">{agentLabel}</span>
</div>
<div className="session-create-form">
<div className="setup-field">
<span className="setup-label">Model</span>
{showModelSelect ? (
<select
className="setup-select"
value={model}
onChange={(e) => { setModel(e.target.value); setVariant(""); }}
title="Model"
disabled={modelsLoading || Boolean(modelsError)}
>
{modelsLoading ? (
<option value="">Loading models...</option>
) : modelsError ? (
<option value="">{modelsError}</option>
) : (
<>
<option value="">
{defaultModel ? `Default (${defaultModel})` : "Default"}
</option>
{modelCustom && <option value={model}>{model} (custom)</option>}
{modelOptions.map((entry) => (
<option key={entry.id} value={entry.id}>
{entry.name ?? entry.id}
</option>
))}
</>
)}
</select>
) : (
<input
className="setup-input"
value={model}
onChange={(e) => setModel(e.target.value)}
placeholder="Model"
title="Model"
/>
)}
</div>
<div className="setup-field">
<span className="setup-label">Mode</span>
<select
className="setup-select"
value={agentMode}
onChange={(e) => setAgentMode(e.target.value)}
title="Mode"
disabled={modesLoading || Boolean(modesError)}
>
{modesLoading ? (
<option value="">Loading modes...</option>
) : modesError ? (
<option value="">{modesError}</option>
) : activeModes.length > 0 ? (
activeModes.map((m) => (
<option key={m.id} value={m.id}>
{m.name || m.id}
</option>
))
) : (
<option value="">Mode</option>
)}
</select>
</div>
<div className="setup-field">
<span className="setup-label">Permission</span>
<select
className="setup-select"
value={permissionMode}
onChange={(e) => setPermissionMode(e.target.value)}
title="Permission Mode"
>
<option value="default">Default</option>
<option value="plan">Plan</option>
<option value="bypass">Bypass</option>
</select>
</div>
{supportsVariants && (
<div className="setup-field">
<span className="setup-label">Variant</span>
{showVariantSelect ? (
<select
className="setup-select"
value={variant}
onChange={(e) => setVariant(e.target.value)}
title="Variant"
disabled={modelsLoading || Boolean(modelsError)}
>
{modelsLoading ? (
<option value="">Loading variants...</option>
) : modelsError ? (
<option value="">{modelsError}</option>
) : (
<>
<option value="">Default</option>
{variantCustom && <option value={variant}>{variant} (custom)</option>}
{variantOptions.map((entry) => (
<option key={entry} value={entry}>
{entry}
</option>
))}
</>
)}
</select>
) : (
<input
className="setup-input"
value={variant}
onChange={(e) => setVariant(e.target.value)}
placeholder="Variant"
title="Variant"
/>
)}
</div>
)}
{/* MCP Servers - collapsible */}
<div className="session-create-section">
<button
type="button"
className="session-create-section-toggle"
onClick={() => setMcpExpanded(!mcpExpanded)}
>
<span className="setup-label">MCP</span>
<span className="session-create-section-count">{mcpServers.length} server{mcpServers.length !== 1 ? "s" : ""}</span>
{mcpExpanded ? <ChevronDown size={12} className="session-create-section-arrow" /> : <ChevronRight size={12} className="session-create-section-arrow" />}
</button>
{mcpExpanded && (
<div className="session-create-section-body">
{mcpServers.length > 0 && !isEditingMcp && (
<div className="session-create-mcp-list">
{mcpServers.map((entry, index) => (
<div key={entry.name} className="session-create-mcp-item">
<div className="session-create-mcp-info">
<span className="session-create-mcp-name">{entry.name}</span>
{getServerType(entry.configJson) && (
<span className="session-create-mcp-type">{getServerType(entry.configJson)}</span>
)}
<span className="session-create-mcp-summary mono">{getServerSummary(entry.configJson)}</span>
</div>
<div className="session-create-mcp-actions">
<button
type="button"
className="session-create-skill-remove"
onClick={() => startEditMcp(index)}
title="Edit server"
>
<Pencil size={10} />
</button>
<button
type="button"
className="session-create-skill-remove"
onClick={() => removeMcp(index)}
title="Remove server"
>
<X size={12} />
</button>
</div>
</div>
))}
</div>
)}
{isEditingMcp ? (
<div className="session-create-mcp-edit">
<input
ref={mcpNameRef}
className="session-create-mcp-name-input"
value={mcpName}
onChange={(e) => { setMcpName(e.target.value); setMcpLocalError(null); }}
placeholder="server-name"
disabled={editingMcpIndex !== null}
/>
<textarea
ref={mcpJsonRef}
className="session-create-textarea mono"
value={mcpJson}
onChange={(e) => { setMcpJson(e.target.value); setMcpLocalError(null); }}
placeholder='{"type":"local","command":"node","args":["./server.js"]}'
rows={4}
/>
{mcpLocalError && (
<div className="session-create-inline-error">{mcpLocalError}</div>
)}
<div className="session-create-mcp-edit-actions">
<button type="button" className="session-create-mcp-save" onClick={commitMcp}>
{editingMcpIndex !== null ? "Save" : "Add"}
</button>
<button type="button" className="session-create-mcp-cancel" onClick={cancelMcpEdit}>
Cancel
</button>
</div>
</div>
) : (
<button
type="button"
className="session-create-add-btn"
onClick={startAddMcp}
>
<Plus size={12} />
Add server
</button>
)}
{mcpConfigError && !isEditingMcp && (
<div className="session-create-inline-error">{mcpConfigError}</div>
)}
</div>
)}
</div>
{/* Skills - collapsible with source-based list */}
<div className="session-create-section">
<button
type="button"
className="session-create-section-toggle"
onClick={() => setSkillsExpanded(!skillsExpanded)}
>
<span className="setup-label">Skills</span>
<span className="session-create-section-count">{skillSources.length} source{skillSources.length !== 1 ? "s" : ""}</span>
{skillsExpanded ? <ChevronDown size={12} className="session-create-section-arrow" /> : <ChevronRight size={12} className="session-create-section-arrow" />}
</button>
{skillsExpanded && (
<div className="session-create-section-body">
{skillSources.length > 0 && !isEditingSkill && (
<div className="session-create-skill-list">
{skillSources.map((entry, index) => (
<div key={`${entry.type}-${entry.source}-${index}`} className="session-create-skill-item">
<span className="session-create-skill-type-badge">{entry.type}</span>
<span className="session-create-skill-path mono">{skillSourceSummary(entry)}</span>
<div className="session-create-mcp-actions">
<button
type="button"
className="session-create-skill-remove"
onClick={() => startEditSkill(index)}
title="Edit source"
>
<Pencil size={10} />
</button>
<button
type="button"
className="session-create-skill-remove"
onClick={() => removeSkill(index)}
title="Remove source"
>
<X size={12} />
</button>
</div>
</div>
))}
</div>
)}
{isEditingSkill ? (
<div className="session-create-mcp-edit">
<div className="session-create-skill-type-row">
<select
className="session-create-skill-type-select"
value={skillType}
onChange={(e) => { setSkillType(e.target.value as "github" | "local" | "git"); setSkillLocalError(null); }}
>
<option value="github">github</option>
<option value="local">local</option>
<option value="git">git</option>
</select>
<input
ref={skillSourceRef}
className="session-create-skill-input mono"
value={skillSource}
onChange={(e) => { setSkillSource(e.target.value); setSkillLocalError(null); }}
placeholder={skillType === "github" ? "owner/repo" : skillType === "local" ? "/path/to/skill" : "https://git.example.com/repo.git"}
/>
</div>
<input
className="session-create-skill-input mono"
value={skillFilter}
onChange={(e) => setSkillFilter(e.target.value)}
placeholder="Filter skills (comma-separated, optional)"
/>
{skillType !== "local" && (
<div className="session-create-skill-type-row">
<input
className="session-create-skill-input mono"
value={skillRef}
onChange={(e) => setSkillRef(e.target.value)}
placeholder="Branch/tag (optional)"
/>
<input
className="session-create-skill-input mono"
value={skillSubpath}
onChange={(e) => setSkillSubpath(e.target.value)}
placeholder="Subpath (optional)"
/>
</div>
)}
{skillLocalError && (
<div className="session-create-inline-error">{skillLocalError}</div>
)}
<div className="session-create-mcp-edit-actions">
<button type="button" className="session-create-mcp-save" onClick={commitSkill}>
{editingSkillIndex !== null ? "Save" : "Add"}
</button>
<button type="button" className="session-create-mcp-cancel" onClick={cancelSkillEdit}>
Cancel
</button>
</div>
</div>
) : (
<button
type="button"
className="session-create-add-btn"
onClick={startAddSkill}
>
<Plus size={12} />
Add source
</button>
)}
</div>
)}
</div>
</div>
<div className="session-create-actions">
<button
className="button primary"
onClick={handleCreate}
disabled={Boolean(mcpConfigError)}
>
Create Session
</button>
</div>
</div>
);
};
export default SessionCreateMenu;

@ -1,6 +1,17 @@
import { Plus, RefreshCw } from "lucide-react";
import { useEffect, useRef, useState } from "react";
import type { AgentInfo, SessionInfo } from "sandbox-agent";
import type { AgentInfo, AgentModelInfo, AgentModeInfo, SessionInfo, SkillSource } from "sandbox-agent";
import type { McpServerEntry } from "../App";
import SessionCreateMenu, { type SessionConfig } from "./SessionCreateMenu";
const agentLabels: Record<string, string> = {
claude: "Claude Code",
codex: "Codex",
opencode: "OpenCode",
amp: "Amp",
pi: "Pi",
mock: "Mock"
};
const SessionSidebar = ({
sessions,
@ -8,22 +19,48 @@ const SessionSidebar = ({
onSelectSession,
onRefresh,
onCreateSession,
onSelectAgent,
agents,
agentsLoading,
agentsError,
sessionsLoading,
sessionsError
sessionsError,
modesByAgent,
modelsByAgent,
defaultModelByAgent,
modesLoadingByAgent,
modelsLoadingByAgent,
modesErrorByAgent,
modelsErrorByAgent,
mcpServers,
onMcpServersChange,
mcpConfigError,
skillSources,
onSkillSourcesChange
}: {
sessions: SessionInfo[];
selectedSessionId: string;
onSelectSession: (session: SessionInfo) => void;
onRefresh: () => void;
onCreateSession: (agentId: string) => void;
onCreateSession: (agentId: string, config: SessionConfig) => void;
onSelectAgent: (agentId: string) => void;
agents: AgentInfo[];
agentsLoading: boolean;
agentsError: string | null;
sessionsLoading: boolean;
sessionsError: string | null;
modesByAgent: Record<string, AgentModeInfo[]>;
modelsByAgent: Record<string, AgentModelInfo[]>;
defaultModelByAgent: Record<string, string>;
modesLoadingByAgent: Record<string, boolean>;
modelsLoadingByAgent: Record<string, boolean>;
modesErrorByAgent: Record<string, string | null>;
modelsErrorByAgent: Record<string, string | null>;
mcpServers: McpServerEntry[];
onMcpServersChange: (servers: McpServerEntry[]) => void;
mcpConfigError: string | null;
skillSources: SkillSource[];
onSkillSourcesChange: (sources: SkillSource[]) => void;
}) => {
const [showMenu, setShowMenu] = useState(false);
const menuRef = useRef<HTMLDivElement | null>(null);
@ -40,15 +77,6 @@ const SessionSidebar = ({
return () => document.removeEventListener("mousedown", handler);
}, [showMenu]);
const agentLabels: Record<string, string> = {
claude: "Claude Code",
codex: "Codex",
opencode: "OpenCode",
amp: "Amp",
pi: "Pi",
mock: "Mock"
};
return (
<div className="session-sidebar">
<div className="sidebar-header">
@ -65,32 +93,27 @@ const SessionSidebar = ({
>
<Plus size={14} />
</button>
{showMenu && (
<div className="sidebar-add-menu">
{agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
{agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
{!agentsLoading && !agentsError && agents.length === 0 && (
<div className="sidebar-add-status">No agents available.</div>
)}
{!agentsLoading && !agentsError &&
agents.map((agent) => (
<button
key={agent.id}
className="sidebar-add-option"
onClick={() => {
onCreateSession(agent.id);
setShowMenu(false);
}}
>
<div className="agent-option-left">
<span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
{agent.version && <span className="agent-badge version">v{agent.version}</span>}
</div>
{agent.installed && <span className="agent-badge installed">Installed</span>}
</button>
))}
</div>
)}
<SessionCreateMenu
agents={agents}
agentsLoading={agentsLoading}
agentsError={agentsError}
modesByAgent={modesByAgent}
modelsByAgent={modelsByAgent}
defaultModelByAgent={defaultModelByAgent}
modesLoadingByAgent={modesLoadingByAgent}
modelsLoadingByAgent={modelsLoadingByAgent}
modesErrorByAgent={modesErrorByAgent}
modelsErrorByAgent={modelsErrorByAgent}
mcpServers={mcpServers}
onMcpServersChange={onMcpServersChange}
mcpConfigError={mcpConfigError}
skillSources={skillSources}
onSkillSourcesChange={onSkillSourcesChange}
onSelectAgent={onSelectAgent}
onCreateSession={onCreateSession}
open={showMenu}
onClose={() => setShowMenu(false)}
/>
</div>
</div>
</div>

@ -1,16 +1,15 @@
import { MessageSquare, PauseCircle, PlayCircle, Plus, Square, Terminal } from "lucide-react";
import { MessageSquare, Plus, Square, Terminal } from "lucide-react";
import { useEffect, useRef, useState } from "react";
import type { AgentInfo, AgentModelInfo, AgentModeInfo, PermissionEventData, QuestionEventData } from "sandbox-agent";
import type { AgentInfo, AgentModelInfo, AgentModeInfo, PermissionEventData, QuestionEventData, SkillSource } from "sandbox-agent";
import type { McpServerEntry } from "../../App";
import ApprovalsTab from "../debug/ApprovalsTab";
import SessionCreateMenu, { type SessionConfig } from "../SessionCreateMenu";
import ChatInput from "./ChatInput";
import ChatMessages from "./ChatMessages";
import ChatSetup from "./ChatSetup";
import type { TimelineEntry } from "./types";
const ChatPanel = ({
sessionId,
polling,
turnStreaming,
transcriptEntries,
sessionError,
message,
@ -18,35 +17,18 @@ const ChatPanel = ({
onSendMessage,
onKeyDown,
onCreateSession,
onSelectAgent,
agents,
agentsLoading,
agentsError,
messagesEndRef,
agentId,
agentLabel,
agentMode,
permissionMode,
model,
variant,
modelOptions,
defaultModel,
modelsLoading,
modelsError,
variantOptions,
defaultVariant,
supportsVariants,
streamMode,
activeModes,
currentAgentVersion,
hasSession,
modesLoading,
modesError,
onAgentModeChange,
onPermissionModeChange,
onModelChange,
onVariantChange,
onStreamModeChange,
onToggleStream,
sessionModel,
sessionVariant,
sessionPermissionMode,
sessionMcpServerCount,
sessionSkillSourceCount,
onEndSession,
eventError,
questionRequests,
@ -55,47 +37,40 @@ const ChatPanel = ({
onSelectQuestionOption,
onAnswerQuestion,
onRejectQuestion,
onReplyPermission
onReplyPermission,
modesByAgent,
modelsByAgent,
defaultModelByAgent,
modesLoadingByAgent,
modelsLoadingByAgent,
modesErrorByAgent,
modelsErrorByAgent,
mcpServers,
onMcpServersChange,
mcpConfigError,
skillSources,
onSkillSourcesChange
}: {
sessionId: string;
polling: boolean;
turnStreaming: boolean;
transcriptEntries: TimelineEntry[];
sessionError: string | null;
message: string;
onMessageChange: (value: string) => void;
onSendMessage: () => void;
onKeyDown: (event: React.KeyboardEvent<HTMLTextAreaElement>) => void;
onCreateSession: (agentId: string) => void;
onCreateSession: (agentId: string, config: SessionConfig) => void;
onSelectAgent: (agentId: string) => void;
agents: AgentInfo[];
agentsLoading: boolean;
agentsError: string | null;
messagesEndRef: React.RefObject<HTMLDivElement>;
agentId: string;
agentLabel: string;
agentMode: string;
permissionMode: string;
model: string;
variant: string;
modelOptions: AgentModelInfo[];
defaultModel: string;
modelsLoading: boolean;
modelsError: string | null;
variantOptions: string[];
defaultVariant: string;
supportsVariants: boolean;
streamMode: "poll" | "sse" | "turn";
activeModes: AgentModeInfo[];
currentAgentVersion?: string | null;
hasSession: boolean;
modesLoading: boolean;
modesError: string | null;
onAgentModeChange: (value: string) => void;
onPermissionModeChange: (value: string) => void;
onModelChange: (value: string) => void;
onVariantChange: (value: string) => void;
onStreamModeChange: (value: "poll" | "sse" | "turn") => void;
onToggleStream: () => void;
sessionModel?: string | null;
sessionVariant?: string | null;
sessionPermissionMode?: string | null;
sessionMcpServerCount: number;
sessionSkillSourceCount: number;
onEndSession: () => void;
eventError: string | null;
questionRequests: QuestionEventData[];
@ -105,6 +80,18 @@ const ChatPanel = ({
onAnswerQuestion: (request: QuestionEventData) => void;
onRejectQuestion: (requestId: string) => void;
onReplyPermission: (requestId: string, reply: "once" | "always" | "reject") => void;
modesByAgent: Record<string, AgentModeInfo[]>;
modelsByAgent: Record<string, AgentModelInfo[]>;
defaultModelByAgent: Record<string, string>;
modesLoadingByAgent: Record<string, boolean>;
modelsLoadingByAgent: Record<string, boolean>;
modesErrorByAgent: Record<string, string | null>;
modelsErrorByAgent: Record<string, string | null>;
mcpServers: McpServerEntry[];
onMcpServersChange: (servers: McpServerEntry[]) => void;
mcpConfigError: string | null;
skillSources: SkillSource[];
onSkillSourcesChange: (sources: SkillSource[]) => void;
}) => {
const [showAgentMenu, setShowAgentMenu] = useState(false);
const menuRef = useRef<HTMLDivElement | null>(null);
@ -121,19 +108,7 @@ const ChatPanel = ({
return () => document.removeEventListener("mousedown", handler);
}, [showAgentMenu]);
const agentLabels: Record<string, string> = {
claude: "Claude Code",
codex: "Codex",
opencode: "OpenCode",
amp: "Amp",
pi: "Pi",
mock: "Mock"
};
const hasApprovals = questionRequests.length > 0 || permissionRequests.length > 0;
const isTurnMode = streamMode === "turn";
const isStreaming = isTurnMode ? turnStreaming : polling;
const turnLabel = turnStreaming ? "Streaming" : "On Send";
return (
<div className="chat-panel">
@ -142,12 +117,6 @@ const ChatPanel = ({
<MessageSquare className="button-icon" />
<span className="panel-title">{sessionId ? "Session" : "No Session"}</span>
{sessionId && <span className="session-id-display">{sessionId}</span>}
{sessionId && (
<span className="session-agent-display">
{agentLabel}
{currentAgentVersion && <span className="session-agent-version">v{currentAgentVersion}</span>}
</span>
)}
</div>
<div className="panel-header-right">
{sessionId && (
@ -161,42 +130,6 @@ const ChatPanel = ({
End
</button>
)}
<div className="setup-stream">
<select
className="setup-select-small"
value={streamMode}
onChange={(e) => onStreamModeChange(e.target.value as "poll" | "sse" | "turn")}
title="Stream Mode"
disabled={!sessionId}
>
<option value="poll">Poll</option>
<option value="sse">SSE</option>
<option value="turn">Turn</option>
</select>
<button
className={`setup-stream-btn ${isStreaming ? "active" : ""}`}
onClick={onToggleStream}
title={isTurnMode ? "Turn streaming starts on send" : polling ? "Stop streaming" : "Start streaming"}
disabled={!sessionId || isTurnMode}
>
{isTurnMode ? (
<>
<PlayCircle size={14} />
<span>{turnLabel}</span>
</>
) : polling ? (
<>
<PauseCircle size={14} />
<span>Pause</span>
</>
) : (
<>
<PlayCircle size={14} />
<span>Resume</span>
</>
)}
</button>
</div>
</div>
</div>
@ -214,32 +147,27 @@ const ChatPanel = ({
<Plus className="button-icon" />
Create Session
</button>
{showAgentMenu && (
<div className="empty-state-menu">
{agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
{agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
{!agentsLoading && !agentsError && agents.length === 0 && (
<div className="sidebar-add-status">No agents available.</div>
)}
{!agentsLoading && !agentsError &&
agents.map((agent) => (
<button
key={agent.id}
className="sidebar-add-option"
onClick={() => {
onCreateSession(agent.id);
setShowAgentMenu(false);
}}
>
<div className="agent-option-left">
<span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
{agent.version && <span className="agent-badge version">v{agent.version}</span>}
</div>
{agent.installed && <span className="agent-badge installed">Installed</span>}
</button>
))}
</div>
)}
<SessionCreateMenu
agents={agents}
agentsLoading={agentsLoading}
agentsError={agentsError}
modesByAgent={modesByAgent}
modelsByAgent={modelsByAgent}
defaultModelByAgent={defaultModelByAgent}
modesLoadingByAgent={modesLoadingByAgent}
modelsLoadingByAgent={modelsLoadingByAgent}
modesErrorByAgent={modesErrorByAgent}
modelsErrorByAgent={modelsErrorByAgent}
mcpServers={mcpServers}
onMcpServersChange={onMcpServersChange}
mcpConfigError={mcpConfigError}
skillSources={skillSources}
onSkillSourcesChange={onSkillSourcesChange}
onSelectAgent={onSelectAgent}
onCreateSession={onCreateSession}
open={showAgentMenu}
onClose={() => setShowAgentMenu(false)}
/>
</div>
</div>
) : transcriptEntries.length === 0 && !sessionError ? (
@ -247,7 +175,7 @@ const ChatPanel = ({
<Terminal className="empty-state-icon" />
<div className="empty-state-title">Ready to Chat</div>
<p className="empty-state-text">Send a message to start a conversation with the agent.</p>
{agentId === "mock" && (
{agentLabel === "Mock" && (
<div className="mock-agent-hint">
The mock agent simulates agent responses for testing the inspector UI without requiring API credentials. Send <code>help</code> for available commands.
</div>
@ -284,30 +212,37 @@ const ChatPanel = ({
onSendMessage={onSendMessage}
onKeyDown={onKeyDown}
placeholder={sessionId ? "Send a message..." : "Select or create a session first"}
disabled={!sessionId || turnStreaming}
disabled={!sessionId}
/>
<ChatSetup
agentMode={agentMode}
permissionMode={permissionMode}
model={model}
variant={variant}
modelOptions={modelOptions}
defaultModel={defaultModel}
modelsLoading={modelsLoading}
modelsError={modelsError}
variantOptions={variantOptions}
defaultVariant={defaultVariant}
supportsVariants={supportsVariants}
activeModes={activeModes}
modesLoading={modesLoading}
modesError={modesError}
onAgentModeChange={onAgentModeChange}
onPermissionModeChange={onPermissionModeChange}
onModelChange={onModelChange}
onVariantChange={onVariantChange}
hasSession={hasSession}
/>
{sessionId && (
<div className="session-config-bar">
<div className="session-config-field">
<span className="session-config-label">Agent</span>
<span className="session-config-value">{agentLabel}</span>
</div>
<div className="session-config-field">
<span className="session-config-label">Model</span>
<span className="session-config-value">{sessionModel || "-"}</span>
</div>
<div className="session-config-field">
<span className="session-config-label">Variant</span>
<span className="session-config-value">{sessionVariant || "-"}</span>
</div>
<div className="session-config-field">
<span className="session-config-label">Permission</span>
<span className="session-config-value">{sessionPermissionMode || "-"}</span>
</div>
<div className="session-config-field">
<span className="session-config-label">MCP Servers</span>
<span className="session-config-value">{sessionMcpServerCount}</span>
</div>
<div className="session-config-field">
<span className="session-config-label">Skills</span>
<span className="session-config-value">{sessionSkillSourceCount}</span>
</div>
</div>
)}
</div>
);
};

@ -1,178 +0,0 @@
import type { AgentModelInfo, AgentModeInfo } from "sandbox-agent";
const ChatSetup = ({
agentMode,
permissionMode,
model,
variant,
modelOptions,
defaultModel,
modelsLoading,
modelsError,
variantOptions,
defaultVariant,
supportsVariants,
activeModes,
hasSession,
modesLoading,
modesError,
onAgentModeChange,
onPermissionModeChange,
onModelChange,
onVariantChange
}: {
agentMode: string;
permissionMode: string;
model: string;
variant: string;
modelOptions: AgentModelInfo[];
defaultModel: string;
modelsLoading: boolean;
modelsError: string | null;
variantOptions: string[];
defaultVariant: string;
supportsVariants: boolean;
activeModes: AgentModeInfo[];
hasSession: boolean;
modesLoading: boolean;
modesError: string | null;
onAgentModeChange: (value: string) => void;
onPermissionModeChange: (value: string) => void;
onModelChange: (value: string) => void;
onVariantChange: (value: string) => void;
}) => {
const hasModelOptions = modelOptions.length > 0;
const showModelSelect = hasModelOptions && !modelsError;
const hasVariantOptions = variantOptions.length > 0;
const showVariantSelect = supportsVariants && hasVariantOptions && !modelsError;
const modelCustom =
model && hasModelOptions && !modelOptions.some((entry) => entry.id === model);
const variantCustom =
variant && hasVariantOptions && !variantOptions.includes(variant);
return (
<div className="setup-row">
<div className="setup-field">
<span className="setup-label">Mode</span>
<select
className="setup-select"
value={agentMode}
onChange={(e) => onAgentModeChange(e.target.value)}
title="Mode"
disabled={!hasSession || modesLoading || Boolean(modesError)}
>
{modesLoading ? (
<option value="">Loading modes...</option>
) : modesError ? (
<option value="">{modesError}</option>
) : activeModes.length > 0 ? (
activeModes.map((mode) => (
<option key={mode.id} value={mode.id}>
{mode.name || mode.id}
</option>
))
) : (
<option value="">Mode</option>
)}
</select>
</div>
<div className="setup-field">
<span className="setup-label">Permission</span>
<select
className="setup-select"
value={permissionMode}
onChange={(e) => onPermissionModeChange(e.target.value)}
title="Permission Mode"
disabled={!hasSession}
>
<option value="default">Default</option>
<option value="plan">Plan</option>
<option value="bypass">Bypass</option>
</select>
</div>
<div className="setup-field">
<span className="setup-label">Model</span>
{showModelSelect ? (
<select
className="setup-select"
value={model}
onChange={(e) => onModelChange(e.target.value)}
title="Model"
disabled={!hasSession || modelsLoading || Boolean(modelsError)}
>
{modelsLoading ? (
<option value="">Loading models...</option>
) : modelsError ? (
<option value="">{modelsError}</option>
) : (
<>
<option value="">
{defaultModel ? `Default (${defaultModel})` : "Default"}
</option>
{modelCustom && <option value={model}>{model} (custom)</option>}
{modelOptions.map((entry) => (
<option key={entry.id} value={entry.id}>
{entry.name ?? entry.id}
</option>
))}
</>
)}
</select>
) : (
<input
className="setup-input"
value={model}
onChange={(e) => onModelChange(e.target.value)}
placeholder="Model"
title="Model"
disabled={!hasSession}
/>
)}
</div>
<div className="setup-field">
<span className="setup-label">Variant</span>
{showVariantSelect ? (
<select
className="setup-select"
value={variant}
onChange={(e) => onVariantChange(e.target.value)}
title="Variant"
disabled={!hasSession || !supportsVariants || modelsLoading || Boolean(modelsError)}
>
{modelsLoading ? (
<option value="">Loading variants...</option>
) : modelsError ? (
<option value="">{modelsError}</option>
) : (
<>
<option value="">
{defaultVariant ? `Default (${defaultVariant})` : "Default"}
</option>
{variantCustom && <option value={variant}>{variant} (custom)</option>}
{variantOptions.map((entry) => (
<option key={entry} value={entry}>
{entry}
</option>
))}
</>
)}
</select>
) : (
<input
className="setup-input"
value={variant}
onChange={(e) => onVariantChange(e.target.value)}
placeholder={supportsVariants ? "Variant" : "Variants unsupported"}
title="Variant"
disabled={!hasSession || !supportsVariants}
/>
)}
</div>
</div>
);
};
export default ChatSetup;

@ -1,4 +1,5 @@
import { Download, RefreshCw } from "lucide-react";
import { Download, Loader2, RefreshCw } from "lucide-react";
import { useState } from "react";
import type { AgentInfo, AgentModeInfo } from "sandbox-agent";
import FeatureCoverageBadges from "../agents/FeatureCoverageBadges";
import { emptyFeatureCoverage } from "../../types/agents";
@ -16,10 +17,21 @@ const AgentsTab = ({
defaultAgents: string[];
modesByAgent: Record<string, AgentModeInfo[]>;
onRefresh: () => void;
onInstall: (agentId: string, reinstall: boolean) => void;
onInstall: (agentId: string, reinstall: boolean) => Promise<void>;
loading: boolean;
error: string | null;
}) => {
const [installingAgent, setInstallingAgent] = useState<string | null>(null);
const handleInstall = async (agentId: string, reinstall: boolean) => {
setInstallingAgent(agentId);
try {
await onInstall(agentId, reinstall);
} finally {
setInstallingAgent(null);
}
};
return (
<>
<div className="inline-row" style={{ marginBottom: 16 }}>
@ -39,19 +51,27 @@ const AgentsTab = ({
: defaultAgents.map((id) => ({
id,
installed: false,
credentialsAvailable: false,
version: undefined,
path: undefined,
capabilities: emptyFeatureCoverage
}))).map((agent) => (
}))).map((agent) => {
const isInstalling = installingAgent === agent.id;
return (
<div key={agent.id} className="card">
<div className="card-header">
<span className="card-title">{agent.id}</span>
<div className="card-header-pills">
<span className={`pill ${agent.installed ? "success" : "danger"}`}>
{agent.installed ? "Installed" : "Missing"}
</span>
<span className={`pill ${agent.credentialsAvailable ? "success" : "warning"}`}>
{agent.credentialsAvailable ? "Authenticated" : "No Credentials"}
</span>
</div>
</div>
<div className="card-meta">
{agent.version ? `v${agent.version}` : "Version unknown"}
{agent.version ?? "Version unknown"}
{agent.path && <span className="mono muted" style={{ marginLeft: 8 }}>{agent.path}</span>}
</div>
<div className="card-meta" style={{ marginTop: 8 }}>
@ -66,15 +86,22 @@ const AgentsTab = ({
</div>
)}
<div className="card-actions">
<button className="button secondary small" onClick={() => onInstall(agent.id, false)}>
<Download className="button-icon" /> Install
</button>
<button className="button ghost small" onClick={() => onInstall(agent.id, true)}>
Reinstall
<button
className="button secondary small"
onClick={() => handleInstall(agent.id, agent.installed)}
disabled={isInstalling}
>
{isInstalling ? (
<Loader2 className="button-icon spinner-icon" />
) : (
<Download className="button-icon" />
)}
{isInstalling ? "Installing..." : agent.installed ? "Reinstall" : "Install"}
</button>
</div>
</div>
))}
);
})}
</>
);
};

@ -40,7 +40,7 @@ const DebugPanel = ({
defaultAgents: string[];
modesByAgent: Record<string, AgentModeInfo[]>;
onRefreshAgents: () => void;
onInstallAgent: (agentId: string, reinstall: boolean) => void;
onInstallAgent: (agentId: string, reinstall: boolean) => Promise<void>;
agentsLoading: boolean;
agentsError: string | null;
}) => {

@ -30,6 +30,10 @@ export const getEventIcon = (type: string) => {
return PlayCircle;
case "session.ended":
return PauseCircle;
case "turn.started":
return PlayCircle;
case "turn.ended":
return PauseCircle;
case "item.started":
return MessageSquare;
case "item.delta":

@ -1,6 +1,6 @@
FROM node:22-alpine AS build
WORKDIR /app
RUN npm install -g pnpm
RUN npm install -g pnpm@9
# Copy website package
COPY frontend/packages/website/package.json ./

@ -17,9 +17,19 @@ fn run() -> Result<(), CliError> {
no_token: cli.no_token,
gigacode: true,
};
let command = cli
.command
.unwrap_or_else(|| Command::Opencode(OpencodeArgs::default()));
let yolo = cli.yolo;
let command = match cli.command {
Some(Command::Opencode(mut args)) => {
args.yolo = args.yolo || yolo;
Command::Opencode(args)
}
Some(other) => other,
None => {
let mut args = OpencodeArgs::default();
args.yolo = yolo;
Command::Opencode(args)
}
};
if let Err(err) = init_logging(&command) {
eprintln!("failed to init logging: {err}");
return Err(err);

@ -27,8 +27,12 @@ release-build-all:
# =============================================================================
[group('dev')]
dev:
pnpm dev -F @sandbox-agent/inspector
dev-daemon:
SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p sandbox-agent -- daemon start --upgrade
[group('dev')]
dev: dev-daemon
pnpm dev -F @sandbox-agent/inspector -- --host 0.0.0.0
[group('dev')]
build:
@ -50,17 +54,27 @@ fmt:
[group('dev')]
install-fast-sa:
cargo build --release -p sandbox-agent
SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build --release -p sandbox-agent
rm -f ~/.cargo/bin/sandbox-agent
cp target/release/sandbox-agent ~/.cargo/bin/sandbox-agent
[group('dev')]
install-fast-gigacode:
cargo build --release -p gigacode
install-gigacode:
SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build --release -p gigacode
rm -f ~/.cargo/bin/gigacode
cp target/release/gigacode ~/.cargo/bin/gigacode
[group('dev')]
run-sa *ARGS:
SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p sandbox-agent -- {{ ARGS }}
[group('dev')]
run-gigacode *ARGS:
SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p gigacode -- {{ ARGS }}
[group('dev')]
dev-docs:
cd docs && pnpm dlx mintlify dev
cd docs && pnpm dlx mintlify dev --host 0.0.0.0
install:
pnpm install
@ -77,4 +91,3 @@ install-release:
pnpm build --filter @sandbox-agent/inspector...
cargo install --path server/packages/sandbox-agent
cargo install --path gigacode

1250
pnpm-lock.yaml generated

File diff suppressed because it is too large

@ -415,6 +415,31 @@ if let Some(model) = options.model.as_deref() {
3. **Wait for Amp API** — Amp may add model/mode discovery in a future release
4. **Scrape ampcode.com** — Check if the web UI exposes available modes/models
## Command Execution & Process Management
### Agent Tool Execution
Amp executes commands via the `Bash` tool, similar to Claude Code. Execution is synchronous and blocks the agent turn. Permission rules can pre-authorize specific commands:
```typescript
{ tool: "Bash", matches: { command: "git *" }, action: "allow" }
```
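To make the rule shape above concrete, here is a minimal sketch of how a glob-style `matches.command` pattern might be evaluated. The matching semantics (`*` matches any run of characters, the first matching rule wins, unmatched commands fall back to `ask`) are assumptions for illustration, not Amp's documented behavior, and `PermissionRule`, `globToRegExp`, and `decide` are hypothetical names.

```typescript
// Hypothetical rule shape mirroring the example above; not Amp's actual types.
type Action = "allow" | "reject" | "ask";

interface PermissionRule {
  tool: string;
  matches: { command: string };
  action: Action;
}

// Assumption: `*` matches any run of characters; everything else is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Assumption: the first matching rule wins; unmatched commands fall back to "ask".
function decide(rules: PermissionRule[], tool: string, command: string): Action {
  for (const rule of rules) {
    if (rule.tool === tool && globToRegExp(rule.matches.command).test(command)) {
      return rule.action;
    }
  }
  return "ask";
}

const rules: PermissionRule[] = [
  { tool: "Bash", matches: { command: "git *" }, action: "allow" },
];
```

Under these assumptions, `decide(rules, "Bash", "git status")` is pre-authorized, while any other command still goes through the normal approval path.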
### No User-Initiated Command Injection
Amp does not expose any mechanism for external clients to inject command results into the agent's context. There is no `!` prefix equivalent and no command injection API.
### Comparison
| Capability | Supported? | Notes |
|-----------|-----------|-------|
| Agent runs commands | Yes (`Bash` tool) | Synchronous, blocks agent turn |
| User runs commands → agent sees output | No | |
| External API for command injection | No | |
| Command source tracking | No | |
| Background process management | No | Shell `&` only |
| PTY / interactive terminal | No | |
## Notes
- Amp is similar to Claude Code (same streaming format)

@ -279,6 +279,44 @@ x-api-key: <ANTHROPIC_API_KEY>
anthropic-version: 2023-06-01
```
## Command Execution & Process Management
### Agent Tool Execution
The agent executes commands via the `Bash` tool. This is synchronous - the agent blocks until the command exits. Tool schema:
```json
{
"command": "string",
"timeout": "number",
"workingDirectory": "string"
}
```
There is no background process support. If the agent needs a long-running process (e.g., dev server), it uses shell backgrounding (`&`) within a single `Bash` tool call.
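For clients that want to inspect `Bash` tool calls, the schema above could be mirrored as a typed guard. This is a sketch under assumptions: the source only lists the three field names and types, so treating `command` as required and the other two as optional is a guess, and `BashToolInput` and `parseBashToolInput` are hypothetical names.

```typescript
// Hypothetical typed view of the tool schema; optionality is an assumption.
interface BashToolInput {
  command: string;
  timeout?: number;
  workingDirectory?: string;
}

// Returns a typed input, or null when the payload does not fit the assumed shape.
function parseBashToolInput(raw: unknown): BashToolInput | null {
  if (typeof raw !== "object" || raw === null) return null;
  const { command, timeout, workingDirectory } = raw as Record<string, unknown>;
  if (typeof command !== "string" || command.length === 0) return null;
  if (timeout !== undefined && typeof timeout !== "number") return null;
  if (workingDirectory !== undefined && typeof workingDirectory !== "string") return null;
  return { command, timeout, workingDirectory };
}
```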
### User-Initiated Command Execution (`!` prefix)
Claude Code's TUI supports `!command` syntax where the user types `!npm test` to run a command directly. The output is injected into the conversation as a user message so the agent can see it on the next turn.
**This is a client-side TUI feature only.** It is not exposed in the API schema or streaming protocol. The CLI runs the command locally and stuffs the output into the next user message. There is no protocol-level concept of "user ran a command" vs "agent ran a command."
### No External Command Injection API
External clients (SDKs, frontends) cannot programmatically inject command results into Claude's conversation context. The only way to provide command output to the agent is:
- Include it in the user prompt text
- Use the `!` prefix in the interactive TUI
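Since prompt text is the only programmatic route, an external client would have to run the command itself and format the result into the next user message. The sketch below shows one way to do that formatting. The message layout, the truncation limit, and the names `CommandResult` and `formatCommandForPrompt` are all assumptions. This is not the format Claude Code's TUI produces.

```typescript
// Hypothetical result of a command the client ran on its own.
interface CommandResult {
  command: string;
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Builds plain prompt text from a command result; empty streams are omitted.
function formatCommandForPrompt(result: CommandResult, maxChars = 4000): string {
  const clip = (text: string): string =>
    text.length > maxChars ? `${text.slice(0, maxChars)}\n[truncated]` : text;
  const sections = [
    `I ran \`${result.command}\` (exit code ${result.exitCode}).`,
    result.stdout ? `stdout:\n${clip(result.stdout)}` : "",
    result.stderr ? `stderr:\n${clip(result.stderr)}` : "",
  ];
  return sections.filter((section) => section.length > 0).join("\n\n");
}
```

The agent sees this as ordinary user text. Nothing in the protocol marks it as command output.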
### Comparison
| Capability | Supported? | Notes |
|-----------|-----------|-------|
| Agent runs commands | Yes (`Bash` tool) | Synchronous, blocks agent turn |
| User runs commands → agent sees output | Yes (`!cmd` in TUI) | Client-side only, not in protocol |
| External API for command injection | No | |
| Background process management | No | Shell `&` only |
| PTY / interactive terminal | No | |
## Notes
- Claude CLI manages its own OAuth refresh internally

@ -347,6 +347,68 @@ Requires a running Codex app-server process. Send the JSON-RPC request to the ap
- Requires an active app-server process (cannot query models without starting one)
- No standalone CLI command like `codex models`
## Command Execution & Process Management
### Agent Tool Execution
Codex executes commands via `LocalShellAction`. The agent proposes a command, and external clients approve/deny via JSON-RPC (`item/commandExecution/requestApproval`).
### Command Source Tracking (`ExecCommandSource`)
Codex is the only agent that explicitly tracks **who initiated a command** at the protocol level:
```json
{
"ExecCommandSource": {
"enum": ["agent", "user_shell", "unified_exec_startup", "unified_exec_interaction"]
}
}
```
| Source | Meaning |
|--------|---------|
| `agent` | Agent decided to run this command via tool call |
| `user_shell` | User ran a command in a shell (equivalent to Claude Code's `!` prefix) |
| `unified_exec_startup` | Startup script ran this command |
| `unified_exec_interaction` | Interactive execution |
This means user-initiated shell commands are **first-class protocol events** in Codex, not a client-side hack like Claude Code's `!` prefix.
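A client consuming these events could mirror the enum as a string-literal union. The sketch below takes the four values from the schema above; the helper names `isExecCommandSource` and `describeSource` and the label strings are hypothetical.

```typescript
// Values copied from the ExecCommandSource enum above.
const EXEC_COMMAND_SOURCES = [
  "agent",
  "user_shell",
  "unified_exec_startup",
  "unified_exec_interaction",
] as const;

type ExecCommandSource = (typeof EXEC_COMMAND_SOURCES)[number];

// Runtime guard for values arriving over JSON-RPC.
function isExecCommandSource(value: unknown): value is ExecCommandSource {
  return (
    typeof value === "string" &&
    (EXEC_COMMAND_SOURCES as readonly string[]).indexOf(value) !== -1
  );
}

// Hypothetical UI labels; the wording is illustrative only.
function describeSource(source: ExecCommandSource): string {
  switch (source) {
    case "agent":
      return "Run by the agent";
    case "user_shell":
      return "Run by the user";
    case "unified_exec_startup":
      return "Run at startup";
    case "unified_exec_interaction":
      return "Interactive execution";
  }
}
```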
### Command Execution Events
Codex emits structured events for command execution:
- `exec_command_begin` - Command started (includes `source`, `command`, `cwd`, `turn_id`)
- `exec_command_output_delta` - Streaming output chunk (includes `stream: stdout|stderr`)
- `exec_command_end` - Command completed (includes `exit_code`, `source`)
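As a rough illustration of how a client might consume these three events, the sketch below folds an event list into one record per command. The `call_id` correlation key and the `chunk` field are assumptions made for the example (the notes above do not name them), and `command` is assumed to be a plain string; check the Codex schema before relying on any of these.

```typescript
// Source values copied from the ExecCommandSource enum above.
type ExecCommandSource =
  | "agent"
  | "user_shell"
  | "unified_exec_startup"
  | "unified_exec_interaction";

// `call_id` and `chunk` are hypothetical field names used only for this sketch.
type ExecEvent =
  | { type: "exec_command_begin"; call_id: string; source: ExecCommandSource; command: string; cwd: string; turn_id: string }
  | { type: "exec_command_output_delta"; call_id: string; stream: "stdout" | "stderr"; chunk: string }
  | { type: "exec_command_end"; call_id: string; exit_code: number; source: ExecCommandSource };

interface CommandRecord {
  source: ExecCommandSource;
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null; // null while the command is still running
}

// Fold a stream of events into one record per command.
function foldExecEvents(events: ExecEvent[]): Map<string, CommandRecord> {
  const records = new Map<string, CommandRecord>();
  for (const ev of events) {
    if (ev.type === "exec_command_begin") {
      records.set(ev.call_id, { source: ev.source, command: ev.command, stdout: "", stderr: "", exitCode: null });
      continue;
    }
    const rec = records.get(ev.call_id);
    if (rec === undefined) continue; // delta or end without a begin: ignored here
    if (ev.type === "exec_command_output_delta") {
      if (ev.stream === "stdout") rec.stdout += ev.chunk;
      else rec.stderr += ev.chunk;
    } else {
      rec.exitCode = ev.exit_code;
    }
  }
  return records;
}
```

Because `source` travels with the begin event, a UI can label a command as user-initiated (`user_shell`) without guessing from context.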
### Parsed Command Analysis (`CommandAction`)
Codex provides semantic analysis of what a command does:
```json
{
"commandActions": [
{ "type": "read", "path": "/src/main.ts" },
{ "type": "write", "path": "/src/utils.ts" },
{ "type": "install", "package": "lodash" }
]
}
```
Action types: `read`, `write`, `listFiles`, `search`, `install`, `remove`, `other`.
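One way a consumer could use this analysis is to decide whether a command needs closer review. The sketch below models the action list as a discriminated union and flags commands that change state. Only the `read`, `write`, and `install` payloads appear in the example above; the payloads of the other action types are left open, and `mutates` is a hypothetical helper, not a Codex API.

```typescript
type CommandAction =
  | { type: "read"; path: string }
  | { type: "write"; path: string }
  | { type: "install"; package: string }
  // Payload fields for these types are not shown in the notes, so they are left open.
  | { type: "listFiles" | "search" | "remove" | "other"; [key: string]: unknown };

// True if any parsed action changes files or installed packages.
function mutates(actions: CommandAction[]): boolean {
  return actions.some(
    (a) => a.type === "write" || a.type === "install" || a.type === "remove",
  );
}
```

A client could, for example, auto-approve commands where `mutates` is false and prompt the user otherwise.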
### Comparison
| Capability | Supported? | Notes |
|-----------|-----------|-------|
| Agent runs commands | Yes (`LocalShellAction`) | With approval workflow |
| User runs commands → agent sees output | Yes (`user_shell` source) | First-class protocol event |
| External API for command injection | Yes (JSON-RPC approval) | Can approve/deny before execution |
| Command source tracking | Yes (`ExecCommandSource` enum) | Distinguishes agent vs user vs startup |
| Background process management | No | |
| PTY / interactive terminal | No | |
## Notes
- SDK is dynamically imported to reduce bundle size


@ -585,6 +585,60 @@ const response = await client.provider.list();
When an OpenCode server is running, call `GET /provider` on its HTTP port. Returns full model metadata including capabilities, costs, context limits, and modalities.
## Command Execution & Process Management
### Agent Tool Execution
The agent executes commands via internal tools (not exposed in the HTTP API). The agent's tool calls are synchronous within its turn. Tool parts have states: `pending`, `running`, `completed`, `error`.
### PTY System (`/pty/*`) - User-Facing Terminals
Separate from the agent's command execution. PTYs are server-scoped interactive terminals for the user:
- `POST /pty` - Create PTY (command, args, cwd, title, env)
- `GET /pty` - List all PTYs
- `GET /pty/{ptyID}` - Get PTY info
- `PUT /pty/{ptyID}` - Update PTY (title, resize via `size: {rows, cols}`)
- `DELETE /pty/{ptyID}` - Kill and remove PTY
- `GET /pty/{ptyID}/connect` - WebSocket for bidirectional I/O
PTY events (globally broadcast via SSE): `pty.created`, `pty.updated`, `pty.exited`, `pty.deleted`.
The agent does NOT use the PTY system. PTYs are for the user's interactive terminal panel, independent of any AI session.
### Session Commands (`/session/{id}/command`, `/session/{id}/shell`) - Context Injection
External clients can inject command results into an AI session's conversation context:
- `POST /session/{sessionID}/command` - Executes a command and records the result as an `AssistantMessage` in the session. Required fields: `command`, `arguments`. The output becomes part of the AI's context for subsequent turns.
- `POST /session/{sessionID}/shell` - Similar but wraps in `sh -c`. Required fields: `command`, `agent`.
- `GET /command` - Lists available command definitions (metadata, not execution).
Session commands emit `command.executed` events with `sessionID` + `messageID`.
**Key distinction**: These endpoints execute commands directly (not via the AI), then inject the output into the session as if the AI produced it. The AI doesn't actively run the command - it just finds the output in its conversation history on the next turn.
### Three Separate Execution Mechanisms
| Mechanism | Who uses it | Scoped to | AI sees output? |
|-----------|-------------|-----------|----------------|
| Agent tools (internal) | AI agent | Session turn | Yes (immediate) |
| PTY (`/pty/*`) | User/frontend | Server (global) | No |
| Session commands (`/session/{id}/*`) | Frontend/SDK client | Session | Yes (next turn) |
The agent has no tool to interact with PTYs and cannot access the session command endpoints. When the agent needs to run a background process, it uses its internal bash-equivalent tool with shell backgrounding (`&`).
### Comparison
| Capability | Supported? | Notes |
|-----------|-----------|-------|
| Agent runs commands | Yes (internal tools) | Synchronous, blocks agent turn |
| User runs commands → agent sees output | Yes (`/session/{id}/command`) | HTTP API, first-class |
| External API for command injection | Yes | Session-scoped endpoints |
| Command source tracking | Implicit | Endpoint implies source (no enum) |
| Background process management | No | Shell `&` only for agent |
| PTY / interactive terminal | Yes (`/pty/*`) | Server-scoped, WebSocket I/O |
## Notes
- OpenCode is the most feature-rich runtime (streaming, questions, permissions)


@ -0,0 +1,374 @@
# Research: Process & Terminal System Design
Research on PTY/terminal and process management APIs across sandbox platforms, with design recommendations for sandbox-agent.
## Competitive Landscape
### Transport Comparison
| Platform | PTY Transport | Command Transport | Unified? |
|----------|--------------|-------------------|----------|
| **OpenCode** | WebSocket (`/pty/{id}/connect`) | REST (session-scoped, AI-mediated) | No |
| **E2B** | gRPC server-stream (output) + unary RPC (input) | Same gRPC service | Yes |
| **Daytona** | WebSocket | REST | No |
| **Kubernetes** | WebSocket (channel byte mux) | Same WebSocket | Yes |
| **Docker** | HTTP connection hijack | Same connection | Yes |
| **Fly.io** | SSH over WireGuard | REST (sync, 60s max) | No |
| **Vercel Sandboxes** | No PTY API | REST SDK (async generator for logs) | N/A |
| **Gitpod** | gRPC (Listen=output, Write=input) | Same gRPC service | Yes |
### Resize Mechanism
| Platform | How | Notes |
|----------|-----|-------|
| **OpenCode** | `PUT /pty/{id}` with `size: {rows, cols}` | Separate REST call |
| **E2B** | Separate `Update` RPC | Separate gRPC call |
| **Daytona** | Separate HTTP POST | Sends SIGWINCH |
| **Kubernetes** | In-band WebSocket message (channel byte 4) | `{"Width": N, "Height": N}` |
| **Docker** | `POST /exec/{id}/resize?h=N&w=N` | Separate REST call |
| **Gitpod** | Separate `SetSize` RPC | Separate gRPC call |
**Consensus**: Almost all platforms use a separate call for resize. Only Kubernetes does it in-band. Since resize is a control signal (not data), a separate mechanism is cleaner.
### I/O Multiplexing
I/O multiplexing is how platforms distinguish between stdout, stderr, and PTY data on a shared connection.
| Platform | Method | Detail |
|----------|--------|--------|
| **Docker** | 8-byte binary header per frame | Byte 0 = stream type (0=stdin, 1=stdout, 2=stderr). When TTY=true, no mux (raw stream). |
| **Kubernetes** | 1-byte channel prefix per WebSocket message | 0=stdin, 1=stdout, 2=stderr, 3=error, 4=resize, 255=close |
| **E2B** | gRPC `oneof` in protobuf | `DataEvent.output` is `oneof { bytes stdout, bytes stderr, bytes pty }` |
| **OpenCode** | None | PTY is a unified stream. Commands capture stdout/stderr separately in response. |
| **Daytona** | None | PTY is unified. Commands return structured `{stdout, stderr}`. |
**Key insight**: When a process runs with a PTY allocated, its stdout and stderr both write to the same PTY device, so they reach the client as a single interleaved stream. Multiplexing only matters for non-PTY command execution. OpenCode and Daytona handle this by keeping PTY (unified stream) and commands (structured response) as separate APIs.
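To make the Docker row concrete, here is a minimal decoder sketch for the non-TTY framing. The table above only states that byte 0 carries the stream type; the detail that bytes 4-7 hold the payload length as a big-endian unsigned 32-bit integer comes from the Docker Engine API attach documentation, not from these notes. `decodeDockerFrames` is a name chosen for this example.

```typescript
type DockerStream = "stdin" | "stdout" | "stderr";

interface DockerFrame {
  stream: DockerStream;
  payload: Uint8Array;
}

const STREAMS: DockerStream[] = ["stdin", "stdout", "stderr"];

// Splits a buffer into complete frames and returns the undecoded remainder,
// so a caller can prepend `rest` to the next chunk it receives.
function decodeDockerFrames(buf: Uint8Array): { frames: DockerFrame[]; rest: Uint8Array } {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const frames: DockerFrame[] = [];
  let offset = 0;
  while (buf.byteLength - offset >= 8) {
    const streamType = view.getUint8(offset);
    const stream = STREAMS[streamType];
    if (stream === undefined) throw new Error(`unknown stream type ${streamType}`);
    const size = view.getUint32(offset + 4, false); // big-endian payload length
    if (buf.byteLength - offset - 8 < size) break; // frame not fully received yet
    frames.push({ stream, payload: buf.subarray(offset + 8, offset + 8 + size) });
    offset += 8 + size;
  }
  return { frames, rest: buf.subarray(offset) };
}
```

None of this is needed when the API keeps PTY output (one raw stream) apart from command results (structured `stdout`/`stderr`).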
### Reconnection
| Platform | Method | Replays missed output? |
|----------|--------|----------------------|
| **E2B** | `Connect` RPC by PID or tag | No - only new events from reconnect point |
| **Daytona** | New WebSocket to same PTY session | No |
| **Kubernetes** | Not supported (connection = session) | N/A |
| **Docker** | Not supported (connection = session) | N/A |
| **OpenCode** | `GET /pty/{id}/connect` (WebSocket) | Unknown (not documented) |
### Process Identification
| Platform | ID Type | Notes |
|----------|---------|-------|
| **OpenCode** | String (`pty_N`) | Pattern `^pty.*` |
| **E2B** | PID (uint32) or tag (string) | Dual selector |
| **Daytona** | Session ID / PID | |
| **Docker** | Exec ID (string, server-generated) | |
| **Kubernetes** | Connection-scoped | No ID - the WebSocket IS the process |
| **Gitpod** | Alias (string) | Human-readable |
### Scoping
| Platform | PTY Scope | Command Scope |
|----------|-----------|---------------|
| **OpenCode** | Server-wide (global) | Session-specific (AI-mediated) |
| **E2B** | Sandbox-wide | Sandbox-wide |
| **Daytona** | Sandbox-wide | Sandbox-wide |
| **Docker** | Container-scoped | Container-scoped |
| **Kubernetes** | Pod-scoped | Pod-scoped |
## Key Questions & Analysis
### Q: Should PTY transport be WebSocket?
**Yes.** WebSocket is the right choice for PTY I/O:
- Bidirectional: client sends keystrokes, server sends terminal output
- Low latency: no HTTP request overhead per keystroke
- Persistent connection: terminal sessions are long-lived
- Industry consensus: OpenCode, Daytona, and Kubernetes all use WebSocket for PTY
### Q: Should command transport be WebSocket or REST?
**REST is sufficient for commands. WebSocket is not needed.**
The distinction comes down to the nature of each operation:
- **PTY**: Long-lived, bidirectional, interactive. User types, terminal responds. Needs WebSocket.
- **Commands**: Request-response. Client says "run `ls -la`", server runs it, returns stdout/stderr/exit_code. This is a natural REST operation.
The "full duplex" question: commands don't need full duplex because:
1. Input is sent once at invocation (the command string)
2. Output is collected and returned when the process exits
3. There's no ongoing interactive input during execution
For **streaming output** of long-running commands (e.g., `npm install`), there are two clean options:
1. **SSE**: Server-Sent Events for output streaming (output-only, which is all you need)
2. **PTY**: If the user needs to interact with the process (send ctrl+c, provide stdin), they should use a PTY instead
This matches how OpenCode separates the two: commands are REST, PTYs are WebSocket.
**Recommendation**: Keep commands as REST. If a command needs streaming output or interactive input, the user should create a PTY instead. This avoids building a second WebSocket protocol for a use case that PTYs already cover.
### Q: Should resize be WebSocket in-band or separate POST?
**Separate endpoint (PUT or POST).**
Reasons:
- Resize is a control signal, not data. Mixing it into the data stream requires a framing protocol to distinguish resize messages from terminal input.
- OpenCode already defines `PUT /pty/{id}` with `size: {rows, cols}` - this is the existing spec.
- E2B, Daytona, Docker, and Gitpod all use separate calls.
- Only Kubernetes does in-band (because their channel-byte protocol already has a mux layer).
- A separate endpoint is simpler to implement, test, and debug.
**Recommendation**: Use `PUT /pty/{id}` with `size` field (matching OpenCode spec). Alternatively, a dedicated `POST /pty/{id}/resize` if we want to keep update and resize semantically separate.
### Q: What is I/O multiplexing?
I/O multiplexing is the mechanism for distinguishing between different data streams (stdout, stderr, stdin, control signals) on a single connection.
**When it matters**: Non-PTY command execution where stdout and stderr need to be kept separate.
**When it doesn't matter**: PTY sessions. When a PTY is allocated, the child's stdout and stderr typically both refer to the same PTY slave device, so all output arrives on one stream (read from the PTY master fd). This is why terminals show stdout and stderr interleaved - the PTY doesn't distinguish them.
**For sandbox-agent**: Since PTYs are unified streams and commands use REST (separate stdout/stderr in the JSON response), we don't need a multiplexing protocol. The API design naturally separates the two cases.
### Q: How should reconnect work?
**Reconnect is an application-level concept, not just HTTP/WebSocket reconnection.**
The distinction:
- **HTTP/WebSocket reconnect**: The transport-level connection drops and is re-established. This is handled by the client library automatically (retry logic, exponential backoff). The server doesn't need to know.
- **Process reconnect**: The client disconnects from a running process but the process keeps running. Later, the client (or a different client) connects to the same process and starts receiving output again.
**E2B's model**: Disconnecting a stream (via AbortController) leaves the process running. `Connect` RPC by PID or tag re-establishes the output stream. Missed output during disconnection is lost. This works because:
1. Processes are long-lived (servers, shells)
2. For terminals, the screen state can be recovered by the shell/application redrawing
3. For commands, if you care about all output, don't disconnect
**Recommendation for sandbox-agent**: Reconnect should be supported at the application level:
1. `GET /pty/{id}/connect` (WebSocket) can be called multiple times for the same PTY
2. If the WebSocket drops, the PTY process keeps running
3. Client reconnects by opening a new WebSocket to the same endpoint
4. No output replay (too complex, rarely needed - terminal apps redraw on reconnect via SIGWINCH)
5. This is essentially what OpenCode's `/pty/{id}/connect` endpoint already implies
This naturally leads to the **persistent process system** concept (see below).
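The client side of this recommendation could look roughly like the sketch below: derive the WebSocket URL for a PTY, and retry the connection with exponential backoff while the PTY keeps running on the server. All names, the base delay, and the cap are assumptions for illustration; `open` is injected so the sketch does not depend on a particular WebSocket library.

```typescript
// Delay before reconnect attempt `attempt` (0-based): doubles each time, up to a cap.
function backoffMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Builds the WebSocket URL for a PTY from an http(s) base URL.
function ptyConnectUrl(baseUrl: string, ptyId: string): string {
  const wsBase = baseUrl.replace(/^http/, "ws").replace(/\/+$/, "");
  return `${wsBase}/pty/${encodeURIComponent(ptyId)}/connect`;
}

// Retries `open` until it succeeds or attempts run out. Output produced while
// disconnected is not replayed; the caller just resumes from the live stream.
async function connectWithRetry<T>(
  open: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await open();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffMs(attempt));
    }
  }
  throw lastError;
}
```

A caller would pass an `open` function that resolves once a new WebSocket to `ptyConnectUrl(...)` is established, and typically send a resize afterwards so the terminal application redraws.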
### Q: How are PTY events different from PTY transport?
Two completely separate channels serving different purposes:
**PTY Events** (via SSE on `/event` or `/sessions/{id}/events/sse`):
- Lifecycle notifications: `pty.created`, `pty.updated`, `pty.exited`, `pty.deleted`
- Lightweight JSON metadata (PTY id, status, exit code)
- Broadcast to all subscribers
- Used by UIs to update PTY lists, show status indicators, handle cleanup
**PTY Transport** (via WebSocket on `/pty/{id}/connect`):
- Raw terminal I/O: binary input/output bytes
- High-frequency, high-bandwidth
- Point-to-point (one client connected to one PTY)
- Used by terminal emulators (xterm.js) to render the terminal
**Analogy**: Events are like email notifications ("a new terminal was opened"). Transport is like the phone call (the actual terminal session).
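On the events side, a UI usually only needs to keep a small list in sync. The sketch below applies lifecycle events to an in-memory PTY list. The event payload shape (`id`, `exitCode`) is an assumption based on the "PTY id, status, exit code" description above, not a documented schema.

```typescript
interface PtyInfo {
  id: string;
  status: "running" | "exited";
  exitCode: number | null;
}

// Hypothetical payload shapes for the four lifecycle events.
type PtyEvent =
  | { type: "pty.created"; id: string }
  | { type: "pty.updated"; id: string }
  | { type: "pty.exited"; id: string; exitCode: number }
  | { type: "pty.deleted"; id: string };

// Returns a new list state; the input map is left untouched.
function applyPtyEvent(list: ReadonlyMap<string, PtyInfo>, ev: PtyEvent): Map<string, PtyInfo> {
  const next = new Map(list);
  switch (ev.type) {
    case "pty.created":
      next.set(ev.id, { id: ev.id, status: "running", exitCode: null });
      break;
    case "pty.exited": {
      const current = next.get(ev.id);
      if (current !== undefined) next.set(ev.id, { ...current, status: "exited", exitCode: ev.exitCode });
      break;
    }
    case "pty.deleted":
      next.delete(ev.id);
      break;
    case "pty.updated":
      break; // metadata-only change; nothing tracked in this sketch
  }
  return next;
}
```

No terminal bytes pass through this path; they flow only over the per-PTY WebSocket.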
### Q: How are PTY and commands different in OpenCode?
They serve fundamentally different purposes:
**PTY (`/pty/*`)** - Direct execution environment:
- Server-scoped (not tied to any AI session)
- Creates a real terminal process
- User interacts directly via WebSocket
- Not part of the AI conversation
- Think: "the terminal panel in VS Code"
**Commands (`/session/{sessionID}/command`, `/session/{sessionID}/shell`)** - AI-mediated execution:
- Session-scoped (tied to an AI session)
- The command is sent **to the AI assistant** for execution
- Creates an `AssistantMessage` in the session's conversation history
- Output becomes part of the AI's context
- Think: "asking Claude to run a command as a tool call"
**Why commands are session-specific**: Because they're AI operations, not direct execution. When you call `POST /session/{id}/command`, the server:
1. Creates an assistant message in the session
2. Runs the command
3. Captures output as message parts
4. Emits `message.part.updated` events
5. The AI can see this output in subsequent turns
This is how the AI "uses terminal tools" - the command infrastructure provides the bridge between the AI session and system execution.
### Q: Should scoping be system-wide?
**Yes, for both PTY and commands.**
Current OpenCode behavior:
- PTYs: Already server-wide (global)
- Commands: Session-scoped (for AI context injection)
**For sandbox-agent**, since we're the orchestration layer (not the AI):
- **PTYs**: System-wide. Any client should be able to list, connect to, or manage any PTY.
- **Commands/processes**: System-wide. Process execution is a system primitive, not an AI primitive. If a caller wants to associate a process with a session, they can do so at their layer.
The session-scoping of commands in OpenCode is an OpenCode-specific concern (AI context injection). Sandbox-agent should provide the lower-level primitive (system-wide process execution) and let the OpenCode compat layer handle the session association.
## Persistent Process System
### The Concept
A persistent process system means:
1. **Spawn** a process (PTY or command) via API
2. Process runs independently of any client connection
3. **Connect/disconnect** to the process I/O at will
4. Process continues running through disconnections
5. **Query** process status, list running processes
6. **Kill/signal** processes explicitly
This is distinct from the typical "connection = process lifetime" model (Kubernetes, Docker exec) where closing the connection kills the process.
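The core of this model is that connections are just subscribers. A minimal in-memory sketch, assuming a hypothetical `ManagedProcess` wrapper around whatever actually owns the child process (spawning is not modeled here):

```typescript
type OutputListener = (chunk: string) => void;

class ManagedProcess {
  readonly id: string;
  status: "running" | "exited" = "running";
  exitCode: number | null = null;
  private listeners = new Set<OutputListener>();

  constructor(id: string) {
    this.id = id;
  }

  // Connect: returns a disconnect function. Disconnecting never changes `status`.
  connect(listener: OutputListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Called by the owner of the real child process when output arrives.
  emitOutput(chunk: string): void {
    this.listeners.forEach((listener) => listener(chunk));
  }

  // Only an actual exit (or an explicit kill) ends the process.
  markExited(code: number): void {
    this.status = "exited";
    this.exitCode = code;
  }
}
```

In this sketch, output emitted while no client is connected is simply dropped, which matches the no-replay behaviour described for E2B.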
### How E2B Does It
E2B's `Process` service is the best reference implementation:
```
Start(cmd, pty?) → stream of events (output)
Connect(pid/tag) → stream of events (reconnect)
SendInput(pid, data) → ok
Update(pid, size) → ok (resize)
SendSignal(pid, signal) → ok
List() → running processes
```
Key design choices:
- **Unified service**: PTY and command are the same service, differentiated by the `pty` field in `StartRequest`
- **Process outlives connection**: Disconnecting the output stream (aborting the `Start`/`Connect` RPC) does NOT kill the process
- **Explicit termination**: Must call `SendSignal(SIGKILL)` to stop a process
- **Tag-based selection**: Processes can be tagged at creation for later lookup without knowing the PID
### Recommendation for Sandbox-Agent
Sandbox-agent should implement a **persistent process manager** that:
1. **Is system-wide** (not session-scoped)
2. **Supports both PTY and non-PTY modes**
3. **Decouples process lifetime from connection lifetime**
4. **Exposes via both REST (lifecycle) and WebSocket (I/O)**
#### Proposed API Surface
**Process Lifecycle (REST)**:
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/processes` | Create/spawn a process (PTY or command) |
| `GET` | `/v1/processes` | List all processes |
| `GET` | `/v1/processes/{id}` | Get process info (status, pid, exit code) |
| `DELETE` | `/v1/processes/{id}` | Kill process (SIGTERM, then SIGKILL) |
| `POST` | `/v1/processes/{id}/signal` | Send signal (SIGTERM, SIGKILL, SIGINT, etc.) |
| `POST` | `/v1/processes/{id}/resize` | Resize PTY (rows, cols) |
| `POST` | `/v1/processes/{id}/input` | Send stdin/pty input (REST fallback) |
**Process I/O (WebSocket)**:
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/processes/{id}/connect` | WebSocket for bidirectional I/O |
**Process Events (SSE)**:
| Event | Description |
|-------|-------------|
| `process.created` | Process spawned |
| `process.updated` | Process metadata changed |
| `process.exited` | Process terminated (includes exit code) |
| `process.deleted` | Process record removed |
#### Create Request
```jsonc
{
"command": "bash",
"args": ["-i", "-l"],
"cwd": "/workspace",
"env": {"TERM": "xterm-256color"},
"pty": { // Optional - if present, allocate PTY
"rows": 24,
"cols": 80
},
"tag": "main-terminal", // Optional - for lookup by name
"label": "Terminal 1" // Optional - display name
}
```
#### Process Object
```jsonc
{
"id": "proc_abc123",
"tag": "main-terminal",
"label": "Terminal 1",
"command": "bash",
"args": ["-i", "-l"],
"cwd": "/workspace",
"pid": 12345,
"pty": true,
"status": "running", // "running" | "exited"
"exit_code": null, // Set when exited
"created_at": "2025-01-15T...",
"exited_at": null
}
```
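The two shapes above could be typed as in the sketch below, with a small validator for the create request. This is illustrative only: which fields are optional beyond `pty`, `tag`, and `label` is an assumption, and `validateCreateRequest` is a hypothetical helper rather than part of the proposal.

```typescript
interface CreateProcessRequest {
  command: string;
  args?: string[];
  cwd?: string;
  env?: Record<string, string>;
  pty?: { rows: number; cols: number }; // presence means "allocate a PTY"
  tag?: string;
  label?: string;
}

interface ProcessInfo {
  id: string;
  tag: string | null;
  label: string | null;
  command: string;
  args: string[];
  cwd: string;
  pid: number;
  pty: boolean;
  status: "running" | "exited";
  exit_code: number | null;
  created_at: string;
  exited_at: string | null;
}

// Returns a list of human-readable problems; an empty list means the request is acceptable.
function validateCreateRequest(req: CreateProcessRequest): string[] {
  const errors: string[] = [];
  if (req.command.trim() === "") errors.push("command must not be empty");
  if (req.pty !== undefined) {
    const { rows, cols } = req.pty;
    if (!Number.isInteger(rows) || rows < 1) errors.push("pty.rows must be a positive integer");
    if (!Number.isInteger(cols) || cols < 1) errors.push("pty.cols must be a positive integer");
  }
  return errors;
}
```

Note that the request carries `pty` as an object (the initial size) while the process object reports it as a boolean.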
#### OpenCode Compatibility Layer
The OpenCode compat layer maps to this system:
| OpenCode Endpoint | Maps To |
|-------------------|---------|
| `POST /pty` | `POST /v1/processes` (with `pty` field) |
| `GET /pty` | `GET /v1/processes?pty=true` |
| `GET /pty/{id}` | `GET /v1/processes/{id}` |
| `PUT /pty/{id}` | `POST /v1/processes/{id}/resize` + metadata update |
| `DELETE /pty/{id}` | `DELETE /v1/processes/{id}` |
| `GET /pty/{id}/connect` | `GET /v1/processes/{id}/connect` |
| `POST /session/{id}/command` | Create process + capture output into session |
| `POST /session/{id}/shell` | Create process (shell mode) + capture output into session |
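The PTY rows of this table are mechanical enough to express as a lookup. The sketch below is one possible shape for that mapping, with `mapOpenCodePtyRoute` as a hypothetical name. It covers only the `/pty` routes: the `PUT` case returns just the resize call (the metadata update has no endpoint in the proposed API yet), and the session command routes need real logic rather than a path rewrite, so they return `null`.

```typescript
interface Route {
  method: string;
  path: string;
}

function mapOpenCodePtyRoute(method: string, path: string): Route | null {
  const match = /^\/pty(?:\/([^/]+))?(\/connect)?$/.exec(path);
  if (match === null) return null;
  const id = match[1] as string | undefined;
  const isConnect = match[2] !== undefined;
  const kind = id === undefined ? "collection" : isConnect ? "connect" : "item";
  switch (`${method.toUpperCase()} ${kind}`) {
    case "POST collection":
      return { method: "POST", path: "/v1/processes" }; // body must include the `pty` field
    case "GET collection":
      return { method: "GET", path: "/v1/processes?pty=true" };
    case "GET item":
      return { method: "GET", path: `/v1/processes/${id}` };
    case "PUT item":
      return { method: "POST", path: `/v1/processes/${id}/resize` };
    case "DELETE item":
      return { method: "DELETE", path: `/v1/processes/${id}` };
    case "GET connect":
      return { method: "GET", path: `/v1/processes/${id}/connect` };
    default:
      return null;
  }
}
```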
### Open Questions
1. **Output buffering for reconnect**: Should we buffer recent output (e.g., last 64KB) so reconnecting clients get some history? E2B doesn't do this, but it would improve UX for flaky connections.
2. **Process limits**: Should there be a max number of concurrent processes? E2B doesn't expose one, but sandbox environments have limited resources.
3. **Auto-cleanup**: Should processes be auto-cleaned after exiting? Options:
- Keep forever until explicitly deleted
- Auto-delete after N seconds/minutes
- Keep metadata but release resources
4. **Input via REST vs WebSocket-only**: The REST `POST /v1/processes/{id}/input` endpoint is useful for one-shot input (e.g., "send ctrl+c") without establishing a WebSocket. E2B has both `SendInput` (unary) and `StreamInput` (streaming) for this reason.
5. **Multiple WebSocket connections to same process**: Should we allow multiple clients to connect to the same process simultaneously? (Pair programming, monitoring). E2B supports this via multiple `Connect` calls.
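If question 1 is answered with "yes", the buffer itself is small. The sketch below is one possible byte-capped tail buffer, not a decided design; `OutputBuffer` and its methods are hypothetical names, and the 64KB figure would simply be passed as `maxBytes`.

```typescript
class OutputBuffer {
  private chunks: Uint8Array[] = [];
  private size = 0;
  private maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Appends a chunk, then evicts the oldest bytes until the cap is respected.
  push(chunk: Uint8Array): void {
    // Copy so later reuse of the caller's buffer cannot corrupt history;
    // an oversized chunk keeps only its tail.
    const start = Math.max(0, chunk.byteLength - this.maxBytes);
    const copy = chunk.slice(start);
    this.chunks.push(copy);
    this.size += copy.byteLength;
    while (this.size > this.maxBytes) {
      const head = this.chunks[0];
      if (head === undefined) break;
      const excess = this.size - this.maxBytes;
      if (head.byteLength <= excess) {
        this.chunks.shift();
        this.size -= head.byteLength;
      } else {
        this.chunks[0] = head.subarray(excess);
        this.size -= excess;
      }
    }
  }

  // Returns the retained tail as one contiguous array, e.g. to send on reconnect.
  snapshot(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    this.chunks.forEach((c) => {
      out.set(c, offset);
      offset += c.byteLength;
    });
    return out;
  }
}
```

A raw byte tail can cut an escape sequence or a multi-byte character in half, so a real implementation would likely still ask the terminal application to redraw after replaying it.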
## User-Initiated Command Injection ("Run command, give AI context")
A common pattern across agents: the user (or frontend) runs a command and the output is injected into the AI's conversation context. This is distinct from the agent running a command via its own tools.
| Agent | Feature | Mechanism | Protocol-level? |
|-------|---------|-----------|----------------|
| **Claude Code** | `!command` prefix in TUI | CLI runs command locally, injects output as user message | No - client-side hack, not in API schema |
| **Codex** | `user_shell` source | `ExecCommandSource` enum distinguishes `agent` vs `user_shell` vs `unified_exec_*` | Yes - first-class protocol event |
| **OpenCode** | `/session/{id}/command` | HTTP endpoint runs command, records result as `AssistantMessage` | Yes - HTTP API |
| **Amp** | N/A | Not supported | N/A |
**Design implication for sandbox-agent**: The process system should support an optional `session_id` field when creating a process. If provided, the process output is associated with that session so the agent can see it. If not provided, the process runs independently (like a PTY). This unifies:
- User interactive terminals (no session association)
- User-initiated commands for AI context (session association)
- Agent-initiated background processes (session association)
## Sources
- [E2B Process Proto](https://github.com/e2b-dev/E2B) - `process.proto` gRPC service definition
- [E2B JS SDK](https://github.com/e2b-dev/E2B/tree/main/packages/js-sdk) - `commands/pty.ts`, `commands/index.ts`
- [Daytona SDK](https://www.daytona.io/docs/en/typescript-sdk/process/) - REST + WebSocket PTY API
- [Kubernetes RemoteCommand](https://github.com/kubernetes/apimachinery/blob/master/pkg/util/remotecommand/constants.go) - WebSocket subprotocol
- [Docker Engine API](https://docker-docs.uclv.cu/engine/api/v1.21/) - Exec API with stream multiplexing
- [Fly.io Machines API](https://fly.io/docs/machines/api/) - REST exec with 60s limit
- [Gitpod terminal.proto](https://codeberg.org/kanishka-reading-list/gitpod/src/branch/main/components/supervisor-api/terminal.proto) - gRPC terminal service
- [OpenCode OpenAPI Spec](https://github.com/opencode-ai/opencode) - PTY and session command endpoints


@ -0,0 +1,442 @@
# Universal Agent Configuration Support
Work-in-progress research on configuration features across agents and what can be made universal.
---
## TODO: Features Needed for Full Coverage
### Currently Implemented (in `CreateSessionRequest`)
- [x] `agent` - Agent selection (claude, codex, opencode, amp)
- [x] `agentMode` - Agent mode (plan, build, default)
- [x] `permissionMode` - Permission mode (default, plan, bypass)
- [x] `model` - Model selection
- [x] `variant` - Reasoning variant
- [x] `agentVersion` - Agent version selection
- [x] `mcp` - MCP server configuration (Claude/Codex/OpenCode/Amp)
- [x] `skills` - Skill path configuration (link or copy into agent skill roots)
### Tier 1: Universal Features (High Priority)
- [ ] `projectInstructions` - Inject CLAUDE.md / AGENTS.md content
- Write to appropriate file before agent spawn
- All agents support this natively
- [ ] `workingDirectory` - Set working directory for session
- Currently captures server `cwd` on session creation; not yet user-configurable
- [x] `mcp` - MCP server configuration
- Claude: Writes `.mcp.json` entries under `mcpServers`
- Codex: Updates `.codex/config.toml` with `mcp_servers`
- Amp: Calls `amp mcp add` for each server
- OpenCode: Uses `/mcp` API
- [x] `skills` - Skill path configuration
- Claude: Link to `./.claude/skills/<name>/`
- Codex: Link to `./.agents/skills/<name>/`
- OpenCode: Link to `./.opencode/skill/<name>/` + config `skills.paths`
- Amp: Link to Claude/Codex-style directories
- [ ] `credentials` - Pass credentials via API (not just env vars)
- Currently extracted from host env
- Need API-level credential injection
### Filesystem API (Implemented)
- [x] `/v1/fs` - Read/write/list/move/delete/stat files and upload batches
- Batch upload is tar-only (`application/x-tar`) with path output capped at 1024
- Relative paths resolve from session working dir when `sessionId` is provided
- CLI `sandbox-agent api fs ...` covers all filesystem endpoints
### Message Attachments (Implemented)
- [x] `MessageRequest.attachments` - Attach uploaded files when sending prompts
- OpenCode receives file parts; other agents get attachment paths appended to the prompt
### Tier 2: Partial Support (Medium Priority)
- [ ] `appendSystemPrompt` - High-priority system prompt additions
- Claude: `--append-system-prompt` flag
- Codex: `developer_instructions` config
- OpenCode: Custom agent definition
- Amp: Not supported (fallback to projectInstructions)
- [ ] `resumeSession` / native session resume
- Claude: `--resume SESSION_ID`
- Codex: Thread persistence (automatic)
- OpenCode: `-c/--continue`
- Amp: `--continue SESSION_ID`
### Tier 3: Agent-Specific Pass-through (Low Priority)
- [ ] `agentSpecific.claude` - Raw Claude options
- [ ] `agentSpecific.codex` - Raw Codex options (e.g., `replaceSystemPrompt`)
- [ ] `agentSpecific.opencode` - Raw OpenCode options (e.g., `customAgent`)
- [ ] `agentSpecific.amp` - Raw Amp options (e.g., `permissionRules`)
### Event/Feature Coverage Gaps (from compatibility matrix)
| Feature | Claude | Codex | OpenCode | Amp | Status |
|---------|--------|-------|----------|-----|--------|
| Tool Calls | —* | ✓ | ✓ | ✓ | Claude coming soon |
| Tool Results | —* | ✓ | ✓ | ✓ | Claude coming soon |
| Questions (HITL) | —* | — | ✓ | — | Only OpenCode |
| Permissions (HITL) | —* | — | ✓ | — | Only OpenCode |
| Images | — | ✓ | ✓ | — | 2/4 agents |
| File Attachments | — | ✓ | ✓ | — | 2/4 agents |
| Session Lifecycle | — | ✓ | ✓ | — | 2/4 agents |
| Reasoning/Thinking | — | ✓ | — | — | Codex only |
| Command Execution | — | ✓ | — | — | Codex only |
| File Changes | — | ✓ | — | — | Codex only |
| MCP Tools | ✓ | ✓ | ✓ | ✓ | Supported via session MCP config injection |
| Streaming Deltas | — | ✓ | ✓ | — | 2/4 agents |
\* Claude support for these features is marked as "coming soon" in the compatibility matrix.
### Implementation Order (Suggested)
1. **mcp** - Done (session config injection + agent config writers)
2. **skills** - Done (session config injection + skill directory linking)
3. **projectInstructions** - Highest value, all agents support
4. **appendSystemPrompt** - High-priority instructions
5. **workingDirectory** - Basic session configuration
6. **resumeSession** - Session continuity
7. **credentials** - API-level auth injection
8. **agentSpecific** - Escape hatch for edge cases
---
## Legend
- ✅ Native support
- 🔄 Can be adapted/emulated
- ❌ Not supported
- ⚠️ Supported with caveats
---
## 1. Instructions & System Prompt
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Project instructions file** | ✅ `CLAUDE.md` | ✅ `AGENTS.md` | 🔄 Config-based | ⚠️ Limited | ✅ Yes - write to agent's file |
| **Append to system prompt** | ✅ `--append-system-prompt` | ✅ `developer_instructions` | 🔄 Custom agent | ❌ | ⚠️ Partial - 3/4 agents |
| **Replace system prompt** | ❌ | ✅ `model_instructions_file` | 🔄 Custom agent | ❌ | ❌ No - Codex only |
| **Hierarchical discovery** | ✅ cwd → root | ✅ root → cwd | ❌ | ❌ | ❌ No - Claude/Codex only |
### Priority Comparison
| Agent | Priority Order (highest → lowest) |
|-------|-----------------------------------|
| Claude | `--append-system-prompt` > base prompt > `CLAUDE.md` |
| Codex | `AGENTS.md` > `developer_instructions` > base prompt |
| OpenCode | Custom agent prompt > base prompt |
| Amp | Server-controlled (opaque) |
### Key Differences
**Claude**: System prompt additions have highest priority. `CLAUDE.md` is injected as first user message (below system prompt).
**Codex**: Project instructions (`AGENTS.md`) have highest priority and can override system prompt. This is the inverse of Claude's model.
---
## 2. Permission Modes
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Read-only** | ✅ `plan` | ✅ `read-only` | 🔄 Rulesets | 🔄 Rules | ✅ Yes |
| **Write workspace** | ✅ `acceptEdits` | ✅ `workspace-write` | 🔄 Rulesets | 🔄 Rules | ✅ Yes |
| **Full bypass** | ✅ `--dangerously-skip-permissions` | ✅ `danger-full-access` | 🔄 Allow-all ruleset | ✅ `--dangerously-skip-permissions` | ✅ Yes |
| **Per-tool rules** | ❌ | ❌ | ✅ | ✅ | ❌ No - OpenCode/Amp only |
### Universal Mapping
```typescript
type PermissionMode = "readonly" | "write" | "bypass";
// Maps to:
// Claude: plan | acceptEdits | --dangerously-skip-permissions
// Codex: read-only | workspace-write | danger-full-access
// OpenCode: restrictive ruleset | permissive ruleset | allow-all
// Amp: reject rules | allow rules | dangerouslyAllowAll
```
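The mapping in the comment above can be written as a lookup table. This is a sketch only: the strings are copied from that comment, the OpenCode and Amp entries are descriptions of rulesets rather than literal flag values, and `nativePermission` is a hypothetical helper.

```typescript
type PermissionMode = "readonly" | "write" | "bypass";
type Agent = "claude" | "codex" | "opencode" | "amp";

// Values taken from the mapping comment; OpenCode/Amp entries describe rulesets.
const NATIVE_PERMISSION: Record<Agent, Record<PermissionMode, string>> = {
  claude: { readonly: "plan", write: "acceptEdits", bypass: "--dangerously-skip-permissions" },
  codex: { readonly: "read-only", write: "workspace-write", bypass: "danger-full-access" },
  opencode: { readonly: "restrictive ruleset", write: "permissive ruleset", bypass: "allow-all" },
  amp: { readonly: "reject rules", write: "allow rules", bypass: "dangerouslyAllowAll" },
};

function nativePermission(agent: Agent, mode: PermissionMode): string {
  return NATIVE_PERMISSION[agent][mode];
}
```

Typing the table as `Record<Agent, Record<PermissionMode, string>>` means adding a new agent or mode fails to compile until every combination is filled in.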
---
## 3. Agent Modes
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Plan mode** | ✅ `--permission-mode plan` | 🔄 Prompt prefix | ✅ `--agent plan` | 🔄 Mode selection | ✅ Yes |
| **Build/execute mode** | ✅ Default | ✅ Default | ✅ `--agent build` | ✅ Default | ✅ Yes |
| **Chat mode** | ❌ | 🔄 Prompt prefix | ❌ | ❌ | ❌ No - Codex only |
| **Custom agents** | ❌ | ❌ | ✅ Config-defined | ❌ | ❌ No - OpenCode only |
---
## 4. Model & Variant Selection
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Model selection** | ✅ `--model` | ✅ `-m/--model` | ✅ `-m provider/model` | ⚠️ `--mode` (abstracted) | ⚠️ Partial |
| **Model discovery API** | ✅ Anthropic API | ✅ `model/list` RPC | ✅ `GET /provider` | ❌ Server-side | ⚠️ Partial - 3/4 |
| **Reasoning variants** | ❌ | ✅ `model_reasoning_effort` | ✅ `--variant` | ✅ Deep mode levels | ⚠️ Partial |
---
## 5. MCP & Tools
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **MCP servers** | ✅ `mcpServers` in settings | ✅ `mcp_servers` in config | ✅ `/mcp` API | ✅ `--toolbox` | ✅ Yes - inject config |
| **Tool restrictions** | ❌ | ❌ | ✅ Per-tool permissions | ✅ Permission rules | ⚠️ Partial |
### MCP Config Mapping
| Agent | Local Server | Remote Server |
|-------|--------------|---------------|
| Claude | `.mcp.json` or `.claude/settings.json` → `mcpServers` | Same, with `url` |
| Codex | `.codex/config.toml` → `mcp_servers` | Same schema |
| OpenCode | `/mcp` API with `McpLocalConfig` | `McpRemoteConfig` with `url`, `headers` |
| Amp | `amp mcp add` CLI | Supports remote with headers |
Local MCP servers can be bundled (for example with `tsup`) and uploaded via the filesystem API, then referenced in the session `mcp` config to auto-start and serve custom tools.
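As an illustration of the Claude row, the sketch below renders a universal server list into `.mcp.json` text under the `mcpServers` key. The `McpServer` input type and `toClaudeMcpJson` are hypothetical, and the entry fields (`command`, `args`, `env`, `url`) are assumptions about the file format; remote entries may also need a transport `type` field, so check the Claude Code documentation before relying on this.

```typescript
// Hypothetical universal shape for a session-level MCP server entry.
type McpServer =
  | { kind: "local"; command: string; args?: string[]; env?: Record<string, string> }
  | { kind: "remote"; url: string };

// Renders the servers as `.mcp.json` text with entries under `mcpServers`.
function toClaudeMcpJson(servers: Record<string, McpServer>): string {
  const mcpServers: Record<string, unknown> = {};
  for (const name of Object.keys(servers)) {
    const server = servers[name];
    mcpServers[name] =
      server.kind === "local"
        ? { command: server.command, args: server.args ?? [], env: server.env ?? {} }
        : { url: server.url };
  }
  return JSON.stringify({ mcpServers }, null, 2);
}
```

The same input could feed a Codex writer that emits `mcp_servers` entries in `.codex/config.toml`, which is what makes the session-level `mcp` field universal.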
---
## 6. Skills & Extensions
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Skills/plugins** | ✅ `.claude/skills/` | ✅ `.agents/skills/` | ✅ `.opencode/skill/` | 🔄 Claude-style | ✅ Yes - link dirs |
| **Slash commands** | ✅ `.claude/commands/` | ✅ Custom prompts (deprecated) | ❌ | ❌ | ⚠️ Partial |
### Skill Path Mapping
| Agent | Project Skills | User Skills |
|-------|----------------|-------------|
| Claude | `.claude/skills/<name>/SKILL.md` | `~/.claude/skills/<name>/SKILL.md` |
| Codex | `.agents/skills/` | `~/.agents/skills/` |
| OpenCode | `.opencode/skill/`, `.claude/skills/`, `.agents/skills/` | `~/.config/opencode/skill/` |
| Amp | Uses Claude/Codex directories | — |
---
## 7. Session Management
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Resume session** | ✅ `--resume` | ✅ Thread persistence | ✅ `-c/--continue` | ✅ `--continue` | ✅ Yes |
| **Session ID** | ✅ `session_id` | ✅ `thread_id` | ✅ `sessionID` | ✅ `session_id` | ✅ Yes |
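
The per-agent field names in the table above could be normalized with a small lookup. This is a minimal sketch, assuming events arrive as already-parsed JSON objects; `AgentId`, `SESSION_ID_FIELD`, and `extractSessionId` are hypothetical names for illustration, not part of sandbox-agent.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

// Field that carries the session identifier for each agent, per the table above.
const SESSION_ID_FIELD: Record<AgentId, string> = {
  claude: "session_id",
  codex: "thread_id",
  opencode: "sessionID",
  amp: "session_id",
};

// Hypothetical helper: returns the session ID from a parsed event,
// or undefined when the expected field is missing or not a string.
function extractSessionId(
  agent: AgentId,
  event: Record<string, unknown>,
): string | undefined {
  const value = event[SESSION_ID_FIELD[agent]];
  return typeof value === "string" ? value : undefined;
}
```

Keeping the mapping in one table means a new agent only needs one extra entry rather than a new code path.
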
---
## 8. Human-in-the-Loop
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **Permission requests** | ✅ Events | ⚠️ Upfront only | ✅ SSE events | ❌ Pre-configured | ⚠️ Partial |
| **Questions** | ⚠️ Limited in headless | ❌ | ✅ Full support | ❌ | ❌ No - OpenCode best |
---
## 9. Credentials
| Feature | Claude | Codex | OpenCode | Amp | Universal? |
|---------|--------|-------|----------|-----|------------|
| **API key env var** | ✅ `ANTHROPIC_API_KEY` | ✅ `OPENAI_API_KEY` | ✅ Both | ✅ `ANTHROPIC_API_KEY` | ✅ Yes |
| **OAuth tokens** | ✅ | ✅ | ✅ | ✅ | ✅ Yes |
| **Config file auth** | ✅ `~/.claude.json` | ✅ `~/.codex/auth.json` | ✅ `~/.local/share/opencode/auth.json` | ✅ `~/.amp/config.json` | ✅ Yes - extract per agent |
---
## Configuration Files Per Agent
### Claude Code
| File/Location | Purpose |
|---------------|---------|
| `CLAUDE.md` | Project instructions (hierarchical, cwd → root) |
| `~/.claude/CLAUDE.md` | Global user instructions |
| `~/.claude/settings.json` | User settings (permissions, MCP servers, env) |
| `.claude/settings.json` | Project-level settings |
| `.claude/settings.local.json` | Local overrides (gitignored) |
| `~/.claude/commands/` | Custom slash commands (user-level) |
| `.claude/commands/` | Project-level slash commands |
| `~/.claude/skills/` | Installed skills |
| `~/.claude/keybindings.json` | Custom keyboard shortcuts |
| `~/.claude/projects/<hash>/memory/MEMORY.md` | Auto-memory per project |
| `~/.claude.json` | Authentication/credentials |
| `~/.claude.json.api` | API key storage |
### OpenAI Codex
| File/Location | Purpose |
|---------------|---------|
| `AGENTS.md` | Project instructions (hierarchical, root → cwd) |
| `AGENTS.override.md` | Override file (takes precedence) |
| `~/.codex/AGENTS.md` | Global user instructions |
| `~/.codex/AGENTS.override.md` | Global override |
| `~/.codex/config.toml` | User configuration |
| `.codex/config.toml` | Project-level configuration |
| `~/.codex/auth.json` | Authentication/credentials |
Key config.toml options:
- `model` - Default model
- `developer_instructions` - Appended to system prompt
- `model_instructions_file` - Replace entire system prompt
- `project_doc_max_bytes` - Max AGENTS.md size (default 32KB)
- `project_doc_fallback_filenames` - Alternative instruction files
- `mcp_servers` - MCP server configuration
### OpenCode
| File/Location | Purpose |
|---------------|---------|
| `~/.local/share/opencode/auth.json` | Authentication |
| `~/.config/opencode/config.toml` | User configuration |
| `.opencode/config.toml` | Project configuration |
### Amp
| File/Location | Purpose |
|---------------|---------|
| `~/.amp/config.json` | Main configuration |
| `~/.config/amp/settings.json` | Additional settings |
| `.amp/rules.json` | Project permission rules |
---
## Summary: Universalization Tiers
### Tier 1: Fully Universal (implement now)
| Feature | API | Notes |
|---------|-----|-------|
| Project instructions | `projectInstructions: string` | Write to CLAUDE.md / AGENTS.md |
| Permission mode | `permissionMode: "readonly" \| "write" \| "bypass"` | Map to agent-specific flags |
| Agent mode | `agentMode: "plan" \| "build"` | Map to agent-specific mechanisms |
| Model selection | `model: string` | Pass through to agent |
| Resume session | `sessionId: string` | Map to agent's resume flag |
| Credentials | `credentials: { apiKey?, oauthToken? }` | Inject via env vars |
| MCP servers | `mcp: McpConfig` | Write to agent's config (docs drafted) |
| Skills | `skills: { paths: string[] }` | Link to agent's skill dirs (docs drafted) |
### Tier 2: Partial Support (with fallbacks)
| Feature | API | Notes |
|---------|-----|-------|
| Append system prompt | `appendSystemPrompt: string` | Falls back to projectInstructions for Amp |
| Reasoning variant | `variant: string` | Ignored for Claude |
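
The two fallbacks in the Tier 2 table could be applied before a config is handed to an agent. The sketch below assumes the fallback for Amp is a plain concatenation into `projectInstructions`; `Tier2Input` and `applyTier2Fallbacks` are hypothetical names, and the joining rule is an illustration rather than documented behavior.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

interface Tier2Input {
  agent: AgentId;
  projectInstructions?: string;
  appendSystemPrompt?: string;
  variant?: string;
}

// Hypothetical helper: returns a copy of the input with Tier 2 fallbacks applied.
function applyTier2Fallbacks(input: Tier2Input): Tier2Input {
  const out: Tier2Input = { ...input };
  if (out.agent === "amp" && out.appendSystemPrompt !== undefined) {
    // Amp: fold the appended prompt into project instructions (assumed rule).
    out.projectInstructions = [out.projectInstructions, out.appendSystemPrompt]
      .filter((s): s is string => s !== undefined && s.length > 0)
      .join("\n\n");
    delete out.appendSystemPrompt;
  }
  if (out.agent === "claude") {
    // Claude: the reasoning variant is ignored, so drop it.
    delete out.variant;
  }
  return out;
}
```
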
### Tier 3: Agent-Specific (pass-through)
| Feature | Notes |
|---------|-------|
| Replace system prompt | Codex only (`model_instructions_file`) |
| Per-tool permissions | OpenCode/Amp only |
| Custom agents | OpenCode only |
| Hierarchical file discovery | Let agents handle natively |
---
## Recommended Universal API
```typescript
interface UniversalSessionConfig {
// Tier 1 - Universal
agent: "claude" | "codex" | "opencode" | "amp";
model?: string;
permissionMode?: "readonly" | "write" | "bypass";
agentMode?: "plan" | "build";
projectInstructions?: string;
sessionId?: string; // For resume
workingDirectory?: string;
credentials?: {
apiKey?: string;
oauthToken?: string;
};
// MCP servers (docs drafted in docs/mcp.mdx)
mcp?: Record<string, McpServerConfig>;
// Skills (docs drafted in docs/skills.mdx)
skills?: {
paths: string[];
};
// Tier 2 - Partial (with fallbacks)
appendSystemPrompt?: string;
variant?: string;
// Tier 3 - Pass-through
agentSpecific?: {
claude?: { /* raw Claude options */ };
codex?: { replaceSystemPrompt?: string; /* etc */ };
opencode?: { customAgent?: AgentDef; /* etc */ };
amp?: { permissionRules?: Rule[]; /* etc */ };
};
}
interface McpServerConfig {
type: "local" | "remote";
// Local
command?: string;
args?: string[];
env?: Record<string, string>;
timeoutMs?: number;
// Remote
url?: string;
headers?: Record<string, string>;
}
```
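
Because every field of `McpServerConfig` other than `type` is optional, a caller may want a check that the fields match the declared type. The sketch below repeats the interface so it is self-contained; `validateMcpServer` is a hypothetical helper, and the rule that local entries need `command` and remote entries need `url` is an assumption drawn from the field grouping, not a documented requirement.

```typescript
interface McpServerConfig {
  type: "local" | "remote";
  // Local
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  timeoutMs?: number;
  // Remote
  url?: string;
  headers?: Record<string, string>;
}

// Hypothetical validator: returns a list of problems, empty when the entry looks usable.
function validateMcpServer(name: string, cfg: McpServerConfig): string[] {
  const problems: string[] = [];
  if (cfg.type === "local" && !cfg.command) {
    problems.push(`${name}: local server needs "command"`);
  }
  if (cfg.type === "remote" && !cfg.url) {
    problems.push(`${name}: remote server needs "url"`);
  }
  return problems;
}
```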
---
## Implementation Notes
### Priority Inversion Warning
Claude and Codex resolve conflicts between project instructions and system prompt additions in opposite orders:
- **Claude**: `--append-system-prompt` > base prompt > `CLAUDE.md`
- **Codex**: `AGENTS.md` > `developer_instructions` > base prompt
This means:
- In Claude, system prompt additions override project files
- In Codex, project files override system prompt additions
When a session sets both `appendSystemPrompt` and `projectInstructions`, the same config can therefore behave differently depending on the agent. Either document this behavior clearly or normalize by using only one of the two mechanisms.
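
The two orderings listed in this section can be written down as data, which makes the inversion easy to check. This is a sketch only: `Source`, `PRECEDENCE`, and `winner` are hypothetical names, and treating Codex `developer_instructions` as the counterpart of `appendSystemPrompt` is an assumption based on its description as "appended to system prompt".

```typescript
type Source = "appendSystemPrompt" | "basePrompt" | "projectInstructions";

// Highest priority first, following the two bullet points in this section.
const PRECEDENCE: Record<"claude" | "codex", Source[]> = {
  claude: ["appendSystemPrompt", "basePrompt", "projectInstructions"],
  codex: ["projectInstructions", "appendSystemPrompt", "basePrompt"],
};

// Hypothetical helper: which of two sources wins on conflict for a given agent.
function winner(agent: "claude" | "codex", a: Source, b: Source): Source {
  const order = PRECEDENCE[agent];
  return order.indexOf(a) <= order.indexOf(b) ? a : b;
}
```

With this table, `winner("claude", "appendSystemPrompt", "projectInstructions")` and the same call for `"codex"` return different sources, which is the inversion described here.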
### File Injection Strategy
For `projectInstructions`, sandbox-agent should:
1. Create a temp directory or use the session working directory
2. Write instructions to the appropriate file:
- Claude: `.claude/CLAUDE.md` or `CLAUDE.md` in cwd
- Codex: `.codex/AGENTS.md` or `AGENTS.md` in cwd
- OpenCode: Config file or environment
- Amp: Limited - may only influence via context
3. Start agent in that directory
4. Agent discovers and loads instructions automatically
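
Step 2 above could be sketched as a lookup from agent to the file written in the working directory. This assumes the simpler of the two listed locations (`CLAUDE.md` or `AGENTS.md` directly in cwd) and POSIX-style paths; `instructionPath` is a hypothetical helper, and OpenCode and Amp return `null` because the list gives no concrete file for them.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

// File each agent discovers in the working directory, per the list above.
// null means this sketch has no file-based target for that agent.
const INSTRUCTION_FILE: Record<AgentId, string | null> = {
  claude: "CLAUDE.md",
  codex: "AGENTS.md",
  opencode: null,
  amp: null,
};

// Hypothetical helper: where to write projectInstructions for a session.
function instructionPath(agent: AgentId, cwd: string): string | null {
  const file = INSTRUCTION_FILE[agent];
  if (file === null) return null;
  // Strip trailing slashes so the join never produces "//".
  return `${cwd.replace(/\/+$/, "")}/${file}`;
}
```
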
### MCP Server Injection
For `mcp`, sandbox-agent should:
1. Write MCP config to agent's settings file:
- Claude: `.mcp.json` or `.claude/settings.json` (`mcpServers` key)
- Codex: `.codex/config.toml` (`mcp_servers` key)
- OpenCode: Call `/mcp` API
- Amp: Run `amp mcp add` or pass via `--toolbox`
2. Ensure MCP server binaries are available in PATH
3. Handle cleanup on session end
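
For the Claude case in step 1, the file content could be generated from the session `mcp` map roughly as follows. This is a sketch under stated assumptions: `toClaudeMcpJson` is a hypothetical helper, the per-server shape (`command`/`args`/`env` for local, `url`/`headers` for remote) is inferred from the mapping table in this document, and remote entries may need an additional transport field that this sketch does not set.

```typescript
interface McpServerConfig {
  type: "local" | "remote";
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  timeoutMs?: number;
  url?: string;
  headers?: Record<string, string>;
}

// Hypothetical helper: serializes the session MCP map under an `mcpServers` key.
function toClaudeMcpJson(servers: Record<string, McpServerConfig>): string {
  const mcpServers: Record<string, unknown> = {};
  for (const [name, cfg] of Object.entries(servers)) {
    mcpServers[name] =
      cfg.type === "local"
        ? { command: cfg.command, args: cfg.args ?? [], env: cfg.env ?? {} }
        : { url: cfg.url, headers: cfg.headers ?? {} };
  }
  return JSON.stringify({ mcpServers }, null, 2);
}
```

The returned string would be written to `.mcp.json` in the session directory and removed again in the cleanup step.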
### Skill Linking
For `skills.paths`, sandbox-agent should:
1. For each skill path, symlink or copy to agent's skill directory:
- Claude: `.claude/skills/<name>/`
- Codex: `.agents/skills/<name>/`
- OpenCode: Update `skills.paths` in config
2. Skill directory must contain `SKILL.md`
3. Handle cleanup on session end
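
The link targets in step 1 could be computed as below. This is a minimal sketch assuming POSIX-style paths and that the skill name is the last path segment; `skillLinkTarget` is a hypothetical helper. OpenCode and Amp return `null` because the steps above handle OpenCode through config and list no directory for Amp.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

// Project-level skill roots, per the list above.
const SKILL_ROOT: Partial<Record<AgentId, string>> = {
  claude: ".claude/skills",
  codex: ".agents/skills",
};

// Hypothetical helper: relative path where a skill directory should be linked.
function skillLinkTarget(agent: AgentId, skillPath: string): string | null {
  const root = SKILL_ROOT[agent];
  if (root === undefined) return null;
  // Use the last path segment as the skill name, ignoring trailing slashes.
  const name = skillPath.replace(/\/+$/, "").split("/").pop();
  return name ? `${root}/${name}` : null;
}
```

A caller would still need to confirm that the source directory contains `SKILL.md` before creating the link.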

View file

@ -9,6 +9,10 @@
"type": {
"type": "string",
"enum": [
"system",
"user",
"assistant",
"result",
"message",
"tool_call",
"tool_result",
@ -27,6 +31,45 @@
},
"error": {
"type": "string"
},
"subtype": {
"type": "string"
},
"cwd": {
"type": "string"
},
"session_id": {
"type": "string"
},
"tools": {
"type": "array",
"items": {
"type": "string"
}
},
"mcp_servers": {
"type": "array",
"items": {
"type": "object"
}
},
"message": {
"type": "object"
},
"parent_tool_use_id": {
"type": "string"
},
"duration_ms": {
"type": "number"
},
"is_error": {
"type": "boolean"
},
"num_turns": {
"type": "number"
},
"result": {
"type": "string"
}
},
"required": [

View file

@ -204,12 +204,27 @@ function createFallbackSchema(): NormalizedSchema {
properties: {
type: {
type: "string",
enum: ["message", "tool_call", "tool_result", "error", "done"],
enum: ["system", "user", "assistant", "result", "message", "tool_call", "tool_result", "error", "done"],
},
// Common fields
id: { type: "string" },
content: { type: "string" },
tool_call: { $ref: "#/definitions/ToolCall" },
error: { type: "string" },
// System message fields
subtype: { type: "string" },
cwd: { type: "string" },
session_id: { type: "string" },
tools: { type: "array", items: { type: "string" } },
mcp_servers: { type: "array", items: { type: "object" } },
// User/Assistant message fields
message: { type: "object" },
parent_tool_use_id: { type: "string" },
// Result fields
duration_ms: { type: "number" },
is_error: { type: "boolean" },
num_turns: { type: "number" },
result: { type: "string" },
},
required: ["type"],
},

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-shared",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "Shared helpers for sandbox-agent CLI and SDK",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "CLI for sandbox-agent - run AI coding agents in sandboxes",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-darwin-arm64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "sandbox-agent CLI binary for macOS ARM64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-darwin-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "sandbox-agent CLI binary for macOS x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-linux-arm64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "sandbox-agent CLI binary for Linux arm64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-linux-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "sandbox-agent CLI binary for Linux x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/cli-win32-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "sandbox-agent CLI binary for Windows x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "Gigacode CLI (sandbox-agent with OpenCode attach by default)",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode-darwin-arm64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "gigacode CLI binary for macOS arm64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode-darwin-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "gigacode CLI binary for macOS x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode-linux-arm64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "gigacode CLI binary for Linux arm64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode-linux-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "gigacode CLI binary for Linux x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,6 +1,6 @@
{
"name": "@sandbox-agent/gigacode-win32-x64",
"version": "0.1.7",
"version": "0.1.12-rc.1",
"description": "gigacode CLI binary for Windows x64",
"license": "Apache-2.0",
"repository": {

View file

@ -1,7 +1,7 @@
{
"name": "sandbox-agent",
"version": "0.1.7",
"description": "Universal API for automatic coding agents in sandboxes. Supprots Claude Code, Codex, OpenCode, and Amp.",
"version": "0.1.12-rc.1",
"description": "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, and Amp.",
"license": "Apache-2.0",
"repository": {
"type": "git",
@ -39,6 +39,6 @@
"vitest": "^3.0.0"
},
"optionalDependencies": {
"@sandbox-agent/cli": "0.1.0"
"@sandbox-agent/cli": "workspace:*"
}
}

Some files were not shown because too many files have changed in this diff.