Merge branch 'main' into feat/support-pi

2026-04-15 07:04:48 +00:00 · 2026-02-10 22:27:03 -08:00 · 4c6c5983c0
commit 4c6c5983c0
parent 8b068eb1ae 8f93d51883
156 changed files with 16196 additions and 2338 deletions
--- a/.claude/commands/release.md
+++ b/.claude/commands/release.md
@ -0,0 +1,165 @@
+# Release Agent
+
+You are a release agent for the Gigacode project (sandbox-agent). Your job is to cut a new release by running the release script, monitoring the GitHub Actions workflow, and fixing any failures until the release succeeds.
+
+## Step 1: Gather Release Information
+
+Ask the user what type of release they want to cut:
+
+- **patch** - Bug fixes (e.g., 0.1.8 -> 0.1.9)
+- **minor** - New features (e.g., 0.1.8 -> 0.2.0)
+- **major** - Breaking changes (e.g., 0.1.8 -> 1.0.0)
+- **rc** - Release candidate (e.g., 0.2.0-rc.1)
+
+For **rc** releases, also ask:
+1. What base version the RC is for (e.g., 0.2.0). If the user doesn't specify, determine it by bumping the minor version from the current version.
+2. What RC number (e.g., 1, 2, 3). If the user doesn't specify, check existing git tags to auto-determine the next RC number:
+
+```bash
+git tag -l "v<base_version>-rc.*" | sort -V
+```
+
+If no prior RC tags exist for that base version, use `rc.1`. Otherwise, increment the highest existing RC number.
+
+The final RC version string is `<base_version>-rc.<number>` (e.g., `0.2.0-rc.1`).
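The RC-numbering rule above can be sketched as a small pure function. This is only an illustration: `nextRcVersion` and the in-memory tag list are hypothetical and not part of the release script, and the sketch assumes tags follow the `v<base_version>-rc.<n>` pattern used by the `git tag -l` command.

```typescript
// Hypothetical helper (not part of scripts/release): compute the next RC
// version string from a list of existing tag names.
// Assumes tags look like "v0.2.0-rc.3"; anything else is ignored.
function nextRcVersion(baseVersion: string, tags: string[]): string {
  const prefix = `v${baseVersion}-rc.`;
  let highest = 0;
  for (const tag of tags) {
    if (!tag.startsWith(prefix)) continue;
    const suffix = tag.slice(prefix.length);
    if (!/^\d+$/.test(suffix)) continue; // skip malformed suffixes
    highest = Math.max(highest, Number(suffix));
  }
  // No prior RC tags -> rc.1; otherwise increment the highest number found.
  return `${baseVersion}-rc.${highest + 1}`;
}

console.log(nextRcVersion("0.2.0", ["v0.2.0-rc.1", "v0.2.0-rc.2", "v0.1.9"]));
```

Comparing the numeric suffix, rather than the tag string, keeps `rc.10` ordered after `rc.9`, which is the same reason the shell command uses `sort -V`.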
+
+## Step 2: Confirm Release Details
+
+Before proceeding, display the release details to the user and ask for explicit confirmation:
+
+- Current version (read from `Cargo.toml` workspace.package.version)
+- New version
+- Current branch
+- Whether it will be tagged as "latest" (RC releases are never tagged as latest)
+
+Do NOT proceed without user confirmation.
+
+## Step 3: Run the Release Script (Setup Local)
+
+The release script handles version bumping, local checks, committing, pushing, and triggering the workflow.
+
+For **major**, **minor**, or **patch** releases:
+
+```bash
+echo "yes" | ./scripts/release/main.ts --<type> --phase setup-local
+```
+
+For **rc** releases (using explicit version):
+
+```bash
+echo "yes" | ./scripts/release/main.ts --version <version> --phase setup-local
+```
+
+Where `<type>` is `major`, `minor`, or `patch`, and `<version>` is the full RC version string like `0.2.0-rc.1`.
+
+The `--phase setup-local` runs these steps in order:
+1. Confirms release details (interactive prompt - piping "yes" handles this)
+2. Updates version in all files (Cargo.toml, package.json files)
+3. Runs local checks (cargo check, cargo fmt, pnpm typecheck)
+4. Git commits with message `chore(release): update version to X.Y.Z`
+5. Git pushes
+6. Triggers the GitHub Actions workflow
+
+If local checks fail at step 3, fix the issues in the codebase, then re-run using `--only-steps` to avoid re-running already-completed steps:
+
+```bash
+echo "yes" | ./scripts/release/main.ts --version <version> --only-steps run-local-checks,git-commit,git-push,trigger-workflow
+```
+
+## Step 4: Monitor the GitHub Actions Workflow
+
+After the workflow is triggered, wait 5 seconds for it to register, then begin polling.
+
+### Find the workflow run
+
+```bash
+gh run list --workflow=release.yaml --limit=1 --json databaseId,status,conclusion,createdAt,url
+```
+
+Verify the run was created recently (within the last 2 minutes) to confirm you are monitoring the correct run. Save the `databaseId` as the run ID.
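One way to express the "created within the last 2 minutes" check is sketched below. It assumes `createdAt` is the ISO 8601 timestamp string returned in the JSON output of `gh run list`; the helper name `isRecentRun` and the tolerance for clock skew are illustrative choices, not something the release tooling defines.

```typescript
// Hypothetical check: was the run created within `windowMs` of `now`?
// `createdAt` is assumed to be an ISO 8601 string such as "2026-02-10T22:27:03Z".
function isRecentRun(
  createdAt: string,
  now: Date,
  windowMs: number = 2 * 60 * 1000,
): boolean {
  const created = Date.parse(createdAt);
  if (Number.isNaN(created)) return false; // unparseable timestamp: treat as not recent
  // Math.abs tolerates a run that appears slightly "in the future"
  // because of clock skew between the local machine and GitHub.
  return Math.abs(now.getTime() - created) <= windowMs;
}

console.log(isRecentRun("2026-02-10T22:27:03Z", new Date("2026-02-10T22:28:00Z")));
```

If the check fails, the most recent run probably belongs to an earlier trigger, so it is safer to wait and list runs again than to start polling it.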
+
+### Poll for completion
+
+Poll every 15 seconds using:
+
+```bash
+gh run view <run-id> --json status,conclusion
+```
+
+Report progress to the user periodically (every ~60 seconds or when status changes). The status values are:
+- `queued` / `in_progress` / `waiting` - Still running, keep polling
+- `completed` - Done, check `conclusion`
+
+When `status` is `completed`, check `conclusion`:
+- `success` - Release succeeded! Proceed to Step 6.
+- `failure` - Proceed to Step 5.
+- `cancelled` - Inform the user and stop.
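The status and conclusion handling above amounts to a small decision table. The sketch below is illustrative only: the `NextAction` labels and the `nextAction` function are hypothetical names, and treating any conclusion other than the three listed as "stop" is an assumption rather than documented behavior.

```typescript
// Hypothetical labels for what the agent should do after each poll.
type NextAction = "keep-polling" | "report-success" | "handle-failure" | "stop";

// Map the `status` / `conclusion` fields from `gh run view --json status,conclusion`
// onto the steps described in this document.
function nextAction(status: string, conclusion: string | null): NextAction {
  // queued / in_progress / waiting: the run has not finished yet.
  if (status !== "completed") return "keep-polling";
  switch (conclusion) {
    case "success":
      return "report-success"; // Step 6
    case "failure":
      return "handle-failure"; // Step 5
    case "cancelled":
      return "stop"; // inform the user and stop
    default:
      // Assumption: any other conclusion is surfaced to the user instead of retried.
      return "stop";
  }
}

console.log(nextAction("in_progress", null), nextAction("completed", "failure"));
```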
+
+## Step 5: Handle Workflow Failures
+
+If the workflow fails:
+
+### 5a. Get failure logs
+
+```bash
+gh run view <run-id> --log-failed
+```
+
+### 5b. Analyze the error
+
+Read the failure logs carefully. Common failure categories:
+- **Build failures** (cargo build, TypeScript compilation) - Fix the code
+- **Formatting issues** (cargo fmt) - Run `cargo fmt` and commit
+- **Test failures** - Fix the failing tests
+- **Publishing failures** (crates.io, npm) - These may be transient; check if retry will help
+- **Docker build failures** - Check Dockerfile or build script issues
+- **Infrastructure/transient failures** (network timeouts, rate limits) - Just re-trigger without code changes
+
+### 5c. Fix and re-push
+
+If a code fix is needed:
+1. Make the fix in the codebase
+2. Amend the release commit (since the release version commit is the most recent):
+
+```bash
+git add -A
+git commit --amend --no-edit
+git push --force-with-lease
+```
+
+IMPORTANT: Use `--force-with-lease` (not `--force`) for safety. Amend the commit rather than creating a new one so the release stays as a single version-bump commit.
+
+3. Re-trigger the workflow:
+
+```bash
+gh workflow run .github/workflows/release.yaml \
+  -f version=<version> \
+  -f latest=<true|false> \
+  --ref <branch>
+```
+
+Where `<branch>` is the current branch (usually `main`). Set `latest` to `false` for RC releases, `true` for stable releases that are newer than the current latest tag.
+
+4. Return to Step 4 to monitor the new run.
+
+If no code fix is needed (transient failure), skip straight to re-triggering the workflow (step 3 above).
+
+### 5d. Retry limit
+
+If the workflow has failed **5 times**, stop and report all errors to the user. Ask whether they want to continue retrying or abort the release. Do not retry infinitely.
+
+## Step 6: Report Success
+
+When the workflow completes successfully:
+1. Print the GitHub Actions run URL
+2. Print the new version number
+3. Suggest running post-release testing: "Run `/project:post-release-testing` to verify the release works correctly."
+
+## Important Notes
+
+- The product name is "Gigacode" (capital G, lowercase c). The CLI binary is `gigacode` (lowercase).
+- Do not include co-authors in any commit messages.
+- Use conventional commits style (e.g., `chore(release): update version to X.Y.Z`).
+- Keep commit messages to a single line.
+- The release script requires `tsx` to run (it's a TypeScript file with a shebang).
+- Always work on the current branch. Releases are typically cut from `main`.
--- a/.dockerignore
+++ b/.dockerignore
@ -4,7 +4,7 @@ dist/
 build/

 # Dependencies
-node_modules/
+**/node_modules/

 # Cache
 .cache/
--- a/.github/workflows/skill-generator.yml
+++ b/.github/workflows/skill-generator.yml
@ -20,17 +20,25 @@ jobs:

      - name: Sync to skills repo
        env:
-          SKILLS_REPO_TOKEN: ${{ secrets.RIVET_GITHUB_PAT }}
+          GH_TOKEN: ${{ secrets.RIVET_GITHUB_PAT }}
        run: |
-          if [ -z "$SKILLS_REPO_TOKEN" ]; then
-            echo "SKILLS_REPO_TOKEN is not set" >&2
+          if [ -z "$GH_TOKEN" ]; then
+            echo "::error::RIVET_GITHUB_PAT secret is not set"
+            exit 1
+          fi
+
+          # Validate token before proceeding
+          if ! gh auth status 2>/dev/null; then
+            echo "::error::RIVET_GITHUB_PAT is invalid or expired. Rotate the token at https://github.com/settings/tokens"
            exit 1
          fi

          git config --global user.name "github-actions[bot]"
          git config --global user.email "github-actions[bot]@users.noreply.github.com"

-          git clone "https://x-access-token:${SKILLS_REPO_TOKEN}@github.com/rivet-dev/skills.git" /tmp/rivet-skills
+          # Clone public repo, configure auth via gh credential helper
+          gh auth setup-git
+          git clone https://github.com/rivet-dev/skills.git /tmp/rivet-skills

          mkdir -p /tmp/rivet-skills/skills/sandbox-agent
          rm -rf /tmp/rivet-skills/skills/sandbox-agent/*
--- a/.gitignore
+++ b/.gitignore
@ -40,5 +40,13 @@ npm-debug.log*
 Cargo.lock
 **/*.rs.bk

+# Agent runtime directories
+.agents/
+.claude/
+.opencode/
+
+# Example temp files
+.tmp-upload/
+
 # CLI binaries (downloaded during npm publish)
 sdks/cli/platforms/*/bin/
--- a/.mcp.json
+++ b/.mcp.json
@ -0,0 +1,10 @@
+{
+  "mcpServers": {
+    "everything": {
+      "args": [
+        "@modelcontextprotocol/server-everything"
+      ],
+      "command": "npx"
+    }
+  }
+}
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -47,6 +47,16 @@ Universal schema guidance:
 - On parse failures, emit an `agent.unparsed` event (source=daemon, synthetic=true) and treat it as a test failure. Preserve raw payloads when `include_raw=true`.
 - Track subagent support in `docs/conversion.md`. For now, normalize subagent activity into normal message/tool flow, but revisit explicit subagent modeling later.
 - Keep the FAQ in `README.md` and `frontend/packages/website/src/components/FAQ.tsx` in sync. When adding or modifying FAQ entries, update both files.
+- Update `research/wip-agent-support.md` as agent support changes are implemented.
+
+### OpenAPI / utoipa requirements
+
+Every `#[utoipa::path(...)]` handler function must have a doc comment where:
+- The **first line** becomes the OpenAPI `summary` (short human-readable title, e.g. `"List Agents"`). This is used as the sidebar label and page heading in the docs site.
+- The **remaining lines** become the OpenAPI `description` (one-sentence explanation of what the endpoint does).
+- Every `responses(...)` entry must have a `description` (no empty descriptions).
+
+When adding or modifying endpoints, regenerate `docs/openapi.json` and verify titles render correctly in the docs site.

 ### CLI ⇄ HTTP endpoint map (keep in sync)

@ -64,11 +74,45 @@ Universal schema guidance:
 - `sandbox-agent api sessions reply-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reply`
 - `sandbox-agent api sessions reject-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reject`
 - `sandbox-agent api sessions reply-permission` ↔ `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply`
+- `sandbox-agent api fs entries` ↔ `GET /v1/fs/entries`
+- `sandbox-agent api fs read` ↔ `GET /v1/fs/file`
+- `sandbox-agent api fs write` ↔ `PUT /v1/fs/file`
+- `sandbox-agent api fs delete` ↔ `DELETE /v1/fs/entry`
+- `sandbox-agent api fs mkdir` ↔ `POST /v1/fs/mkdir`
+- `sandbox-agent api fs move` ↔ `POST /v1/fs/move`
+- `sandbox-agent api fs stat` ↔ `GET /v1/fs/stat`
+- `sandbox-agent api fs upload-batch` ↔ `POST /v1/fs/upload-batch`
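The filesystem entries in the map above can also be written as a lookup table, which may make it easier to check that the CLI and HTTP sides stay in sync. This is a sketch under stated assumptions: the `fsEndpoints` table and `describeFsCommand` helper are hypothetical and do not exist in the codebase; only the subcommand names, methods, and paths come from the list above.

```typescript
type HttpMethod = "GET" | "PUT" | "POST" | "DELETE";

interface Endpoint {
  method: HttpMethod;
  path: string;
}

// Hypothetical table mirroring the `sandbox-agent api fs ...` entries listed above.
const fsEndpoints = new Map<string, Endpoint>([
  ["entries", { method: "GET", path: "/v1/fs/entries" }],
  ["read", { method: "GET", path: "/v1/fs/file" }],
  ["write", { method: "PUT", path: "/v1/fs/file" }],
  ["delete", { method: "DELETE", path: "/v1/fs/entry" }],
  ["mkdir", { method: "POST", path: "/v1/fs/mkdir" }],
  ["move", { method: "POST", path: "/v1/fs/move" }],
  ["stat", { method: "GET", path: "/v1/fs/stat" }],
  ["upload-batch", { method: "POST", path: "/v1/fs/upload-batch" }],
]);

// Render one map entry in the same shape as the documentation bullets.
function describeFsCommand(subcommand: string): string {
  const endpoint = fsEndpoints.get(subcommand);
  if (!endpoint) throw new Error(`unknown fs subcommand: ${subcommand}`);
  return `sandbox-agent api fs ${subcommand} -> ${endpoint.method} ${endpoint.path}`;
}

console.log(describeFsCommand("write"));
```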

-## OpenCode CLI (Experimental)
+## OpenCode Compatibility Layer

 `sandbox-agent opencode` starts a sandbox-agent server and attaches an OpenCode session (uses `/opencode`).

+### Session ownership
+
+Sessions are stored **only** in sandbox-agent's v1 `SessionManager` — they are never sent to or stored in the native OpenCode server. The OpenCode TUI reads sessions via `GET /session` which the compat layer serves from the v1 store. The native OpenCode process has no knowledge of sessions.
+
+### Proxy elimination strategy
+
+The `/opencode` compat layer (`opencode_compat.rs`) historically proxied many endpoints to the native OpenCode server via `proxy_native_opencode()`. The goal is to **eliminate proxying** by implementing each endpoint natively using the v1 `SessionManager` as the single source of truth.
+
+**Already de-proxied** (use v1 SessionManager directly):
+- `GET /session` — `oc_session_list` reads from `SessionManager::list_sessions()`
+- `GET /session/{id}` — `oc_session_get` reads from `SessionManager::get_session_info()`
+- `GET /session/status` — `oc_session_status` derives busy/idle from v1 session `ended` flag
+- `POST /tui/open-sessions` — returns `true` directly (TUI fetches sessions from `GET /session`)
+- `POST /tui/select-session` — emits `tui.session.select` event via the OpenCode event broadcaster
+
+**Still proxied** (none of these reference session IDs or the session list — all are session-agnostic):
+- `GET /command` — command list
+- `GET /config`, `PATCH /config` — project config read/write
+- `GET /global/config`, `PATCH /global/config` — global config read/write
+- `GET /tui/control/next`, `POST /tui/control/response` — TUI control loop
+- `POST /tui/append-prompt`, `/tui/submit-prompt`, `/tui/clear-prompt` — prompt management
+- `POST /tui/open-help`, `/tui/open-themes`, `/tui/open-models` — TUI navigation
+- `POST /tui/execute-command`, `/tui/show-toast`, `/tui/publish` — TUI actions
+
+When converting a proxied endpoint: add needed fields to `SessionState`/`SessionInfo` in `router.rs`, implement the logic natively in `opencode_compat.rs`, and use `session_info_to_opencode_value()` to format responses.
+
 ## Post-Release Testing

 After cutting a release, verify the release works correctly. Run `/project:post-release-testing` to execute the testing agent.
--- a/Cargo.toml
+++ b/Cargo.toml
@ -3,21 +3,21 @@ resolver = "2"
 members = ["server/packages/*", "gigacode"]

 [workspace.package]
-version = "0.1.7"
+version = "0.1.12-rc.1"
 edition = "2021"
 authors = [ "Rivet Gaming, LLC <developer@rivet.gg>" ]
 license = "Apache-2.0"
 repository = "https://github.com/rivet-dev/sandbox-agent"
-description = "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, Amp, and Pi."
+description = "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, Cursor, Amp, and Pi."

 [workspace.dependencies]
 # Internal crates
-sandbox-agent = { version = "0.1.7", path = "server/packages/sandbox-agent" }
-sandbox-agent-error = { version = "0.1.7", path = "server/packages/error" }
-sandbox-agent-agent-management = { version = "0.1.7", path = "server/packages/agent-management" }
-sandbox-agent-agent-credentials = { version = "0.1.7", path = "server/packages/agent-credentials" }
-sandbox-agent-universal-agent-schema = { version = "0.1.7", path = "server/packages/universal-agent-schema" }
-sandbox-agent-extracted-agent-schemas = { version = "0.1.7", path = "server/packages/extracted-agent-schemas" }
+sandbox-agent = { version = "0.1.12-rc.1", path = "server/packages/sandbox-agent" }
+sandbox-agent-error = { version = "0.1.12-rc.1", path = "server/packages/error" }
+sandbox-agent-agent-management = { version = "0.1.12-rc.1", path = "server/packages/agent-management" }
+sandbox-agent-agent-credentials = { version = "0.1.12-rc.1", path = "server/packages/agent-credentials" }
+sandbox-agent-universal-agent-schema = { version = "0.1.12-rc.1", path = "server/packages/universal-agent-schema" }
+sandbox-agent-extracted-agent-schemas = { version = "0.1.12-rc.1", path = "server/packages/extracted-agent-schemas" }

 # Serialization
 serde = { version = "1.0", features = ["derive"] }
@ -69,6 +69,7 @@ url = "2.5"
 regress = "0.10"
 include_dir = "0.7"
 base64 = "0.22"
+toml_edit = "0.22"

 # Code generation (build deps)
 typify = "0.4"
--- a/README.md
+++ b/README.md
@ -5,7 +5,7 @@
 <h3 align="center">Run Coding Agents in Sandboxes. Control Them Over HTTP.</h3>

 <p align="center">
-  A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Amp, or Pi — streaming events, handling permissions, managing sessions.
+  A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Cursor, Amp, or Pi — streaming events, handling permissions, managing sessions.
 </p>

 <p align="center">
@ -24,13 +24,13 @@ Sandbox Agent solves three problems:

 1. **Coding agents need sandboxes** — You can't let AI execute arbitrary code on your production servers. Coding agents need isolated environments, but existing SDKs assume local execution. Sandbox Agent is a server that runs inside the sandbox and exposes HTTP/SSE.

-2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
+2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Cursor, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.

 3. **Sessions are ephemeral** — Agent transcripts live in the sandbox. When the process ends, you lose everything. Sandbox Agent streams events in a universal schema to your storage. Persist to Postgres, ClickHouse, or [Rivet](https://rivet.dev). Replay later, audit everything.

 ## Features

- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Amp, and Pi with full feature coverage
+- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Cursor, Amp, and Pi with full feature coverage
 - **Streaming Events**: Real-time SSE stream of everything the agent does — tool calls, permission requests, file edits, and more
 - **Universal Session Schema**: [Standardized schema](https://sandboxagent.dev/docs/session-transcript-schema) that normalizes all agent event formats for storage and replay
 - **Human-in-the-Loop**: Approve or deny tool executions and answer agent questions remotely over HTTP
@ -131,6 +131,8 @@ for await (const event of client.streamEvents("demo", { offset: 0 })) {
 }
 ```

+`permissionMode: "acceptEdits"` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
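The per-agent handling of `acceptEdits` described above can be pictured as a three-way switch. This is a minimal sketch, not the server's implementation: the `AcceptEditsBehavior` labels and the `acceptEditsBehavior` function are hypothetical, and the agent identifiers `claude` and `codex` are assumed to match the values accepted by the session API.

```typescript
// Hypothetical labels for the three behaviors described in the sentence above.
type AcceptEditsBehavior =
  | "pass-through"              // Claude: the mode is forwarded as-is
  | "auto-approve-file-changes" // Codex: file changes are approved automatically
  | "treat-as-default";         // every other agent: same as `default`

function acceptEditsBehavior(agent: string): AcceptEditsBehavior {
  switch (agent) {
    case "claude":
      return "pass-through";
    case "codex":
      return "auto-approve-file-changes";
    default:
      return "treat-as-default";
  }
}

console.log(acceptEditsBehavior("claude"), acceptEditsBehavior("opencode"));
```

In practice this means `acceptEdits` is safe to send for any agent, but it only changes behavior for Claude and Codex.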
+
 [SDK documentation](https://sandboxagent.dev/docs/sdks/typescript) — [Building a Chat UI](https://sandboxagent.dev/docs/building-chat-ui) — [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions)

 ### HTTP Server
@ -232,7 +234,7 @@ No, they're complementary. AI SDK is for building chat interfaces and calling LL
 <details>
 <summary><strong>Which coding agents are supported?</strong></summary>

-Claude Code, Codex, OpenCode, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
+Claude Code, Codex, OpenCode, Cursor, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
 </details>

 <details>
--- a/docs/agent-sessions.mdx
+++ b/docs/agent-sessions.mdx
@ -0,0 +1,278 @@
+---
+title: "Agent Sessions"
+description: "Create sessions and send messages to agents."
+sidebarTitle: "Sessions"
+icon: "comments"
+---
+
+Sessions are the unit of interaction with an agent. You create one session per task, then send messages and stream events.
+
+## Session Options
+
+`POST /v1/sessions/{sessionId}` accepts the following fields:
+
+- `agent` (required): `claude`, `codex`, `opencode`, `amp`, or `mock`
+- `agentMode`: agent mode string (for example, `build`, `plan`)
+- `permissionMode`: permission mode string (`default`, `plan`, `bypass`, `acceptEdits`)
+- `model`: model override (agent-specific)
+- `variant`: model variant (agent-specific)
+- `agentVersion`: agent version override
+- `mcp`: MCP server config map (see `MCP`)
+- `skills`: skill path config (see `Skills`)
+
+## Create A Session
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.createSession("build-session", {
+  agent: "codex",
+  agentMode: "build",
+  permissionMode: "default",
+  model: "gpt-4.1",
+  variant: "reasoning",
+  agentVersion: "latest",
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "agent": "codex",
+    "agentMode": "build",
+    "permissionMode": "default",
+    "model": "gpt-4.1",
+    "variant": "reasoning",
+    "agentVersion": "latest"
+  }'
+```
+</CodeGroup>
+
+## Send A Message
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.postMessage("build-session", {
+  message: "Summarize the repository structure.",
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"message":"Summarize the repository structure."}'
+```
+</CodeGroup>
+
+## Stream A Turn
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+const response = await client.postMessageStream("build-session", {
+  message: "Explain the main entrypoints.",
+});
+
+const reader = response.body?.getReader();
+if (reader) {
+  const decoder = new TextDecoder();
+  while (true) {
+    const { done, value } = await reader.read();
+    if (done) break;
+    console.log(decoder.decode(value, { stream: true }));
+  }
+}
+```
+
+```bash cURL
+curl -N -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages/stream" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"message":"Explain the main entrypoints."}'
+```
+</CodeGroup>
+
+## Fetch Events
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+const events = await client.getEvents("build-session", {
+  offset: 0,
+  limit: 50,
+  includeRaw: false,
+});
+
+console.log(events.events);
+```
+
+```bash cURL
+curl -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events?offset=0&limit=50" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+`GET /v1/sessions/{sessionId}/get-messages` is an alias for `events`.
+
+## Stream Events (SSE)
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+for await (const event of client.streamEvents("build-session", { offset: 0 })) {
+  console.log(event.type, event.data);
+}
+```
+
+```bash cURL
+curl -N -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events/sse?offset=0" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## List Sessions
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+const sessions = await client.listSessions();
+console.log(sessions.sessions);
+```
+
+```bash cURL
+curl -X GET "http://127.0.0.1:2468/v1/sessions" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## Reply To A Question
+
+When the agent asks a question, reply with an array of answers. Each inner array is one multi-select response.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.replyQuestion("build-session", "question-1", {
+  answers: [["yes"]],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reply" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"answers":[["yes"]]}'
+```
+</CodeGroup>
+
+## Reject A Question
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.rejectQuestion("build-session", "question-1");
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reject" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## Reply To A Permission Request
+
+Use `once`, `always`, or `reject`.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.replyPermission("build-session", "permission-1", {
+  reply: "once",
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/permissions/permission-1/reply" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"reply":"once"}'
+```
+</CodeGroup>
+
+## Terminate A Session
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.terminateSession("build-session");
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/terminate" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
--- a/docs/attachments.mdx
+++ b/docs/attachments.mdx
@ -0,0 +1,87 @@
+---
+title: "Attachments"
+description: "Upload files into the sandbox and attach them to prompts."
+sidebarTitle: "Attachments"
+icon: "paperclip"
+---
+
+Use the filesystem API to upload files, then reference them as attachments when sending prompts.
+
+<Steps>
+  <Step title="Upload a file">
+    <CodeGroup>
+    ```ts TypeScript
+    import { SandboxAgent } from "sandbox-agent";
+    import fs from "node:fs";
+
+    const client = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+      token: process.env.SANDBOX_TOKEN,
+    });
+
+    const buffer = await fs.promises.readFile("./data.csv");
+
+    const upload = await client.writeFsFile(
+      { path: "./uploads/data.csv", sessionId: "my-session" },
+      buffer,
+    );
+
+    console.log(upload.path);
+    ```
+
+    ```bash cURL
+    curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv&sessionId=my-session" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      --data-binary @./data.csv
+    ```
+    </CodeGroup>
+
+    The response returns the absolute path that you should use for attachments.
+  </Step>
+
+  <Step title="Attach the file in a prompt">
+    <CodeGroup>
+    ```ts TypeScript
+    import { SandboxAgent } from "sandbox-agent";
+
+    const client = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+      token: process.env.SANDBOX_TOKEN,
+    });
+
+    await client.postMessage("my-session", {
+      message: "Please analyze the attached CSV.",
+      attachments: [
+        {
+          path: "/home/sandbox/uploads/data.csv",
+          mime: "text/csv",
+          filename: "data.csv",
+        },
+      ],
+    });
+    ```
+
+    ```bash cURL
+    curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      -H "Content-Type: application/json" \
+      -d '{
+        "message": "Please analyze the attached CSV.",
+        "attachments": [
+          {
+            "path": "/home/sandbox/uploads/data.csv",
+            "mime": "text/csv",
+            "filename": "data.csv"
+          }
+        ]
+      }'
+    ```
+    </CodeGroup>
+  </Step>
+</Steps>
+
+## Notes
+
+- Use absolute paths from the upload response to avoid ambiguity.
+- If `mime` is omitted, the server defaults to `application/octet-stream`.
+- OpenCode receives file parts directly; other agents will see the attachment paths appended to the prompt.
--- a/docs/building-chat-ui.mdx
+++ b/docs/building-chat-ui.mdx
@ -29,7 +29,7 @@ const sessionId = `session-${crypto.randomUUID()}`;
 await client.createSession(sessionId, {
  agent: "claude",
  agentMode: "code",        // Optional: agent-specific mode
-  permissionMode: "default", // Optional: "default" | "plan" | "bypass"
+  permissionMode: "default", // Optional: "default" | "plan" | "bypass" | "acceptEdits" (Claude: accept edits; Codex: auto-approve file changes; others: default)
  model: "claude-sonnet-4", // Optional: model override
 });
 ```
@ -70,7 +70,7 @@ Use `offset` to track the last seen `sequence` number and resume from where you

 ### Bare minimum

-Handle these three events to render a basic chat:
+Handle item lifecycle plus turn lifecycle to render a basic chat:

 ```ts
 type ItemState = {
@ -79,9 +79,20 @@ type ItemState = {
 };

 const items = new Map<string, ItemState>();
+let turnInProgress = false;

 function handleEvent(event: UniversalEvent) {
  switch (event.type) {
+    case "turn.started": {
+      turnInProgress = true;
+      break;
+    }
+
+    case "turn.ended": {
+      turnInProgress = false;
+      break;
+    }
+
    case "item.started": {
      const { item } = event.data as ItemEventData;
      items.set(item.item_id, { item, deltas: [] });
@ -110,12 +121,14 @@ function handleEvent(event: UniversalEvent) {
 }
 ```

-When rendering, show a loading indicator while `item.status === "in_progress"`:
+When rendering:
+- Use `turnInProgress` for turn-level UI state (disable send button, show global "Agent is responding", etc.).
+- Use `item.status === "in_progress"` for per-item streaming state.

 ```ts
 function renderItem(state: ItemState) {
  const { item, deltas } = state;
-  const isLoading = item.status === "in_progress";
+  const isItemLoading = item.status === "in_progress";

  // For streaming text, combine item content with accumulated deltas
  const text = item.content
@ -126,7 +139,8 @@ function renderItem(state: ItemState) {

  return {
    content: streamedText,
-    isLoading,
+    isItemLoading,
+    isTurnLoading: turnInProgress,
    role: item.role,
    kind: item.kind,
  };
--- a/docs/cli.mdx
+++ b/docs/cli.mdx
@ -2,7 +2,6 @@
 title: "CLI Reference"
 description: "Complete CLI reference for sandbox-agent."
 sidebarTitle: "CLI"
-icon: "terminal"
 ---

 ## Server
@ -71,7 +70,6 @@ sandbox-agent opencode [OPTIONS]
 | `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
 | `-p, --port <PORT>` | `2468` | Port to bind to |
 | `--session-title <TITLE>` | - | Title for the OpenCode session |
-| `--opencode-bin <PATH>` | - | Override `opencode` binary path |

 ```bash
 sandbox-agent opencode --token "$TOKEN"
@ -79,7 +77,7 @@ sandbox-agent opencode --token "$TOKEN"

 The daemon logs to a per-host log file under the sandbox-agent data directory (for example, `~/.local/share/sandbox-agent/daemon/daemon-127-0-0-1-2468.log`).

-Requires the `opencode` binary to be installed (or set `OPENCODE_BIN` / `--opencode-bin`). If it is not found on `PATH`, sandbox-agent installs it automatically.
+Existing installs are reused and missing binaries are installed automatically.

 ---

@ -247,10 +245,12 @@ sandbox-agent api sessions create <SESSION_ID> [OPTIONS]
 |--------|-------------|
 | `-a, --agent <AGENT>` | Agent identifier (required) |
 | `-g, --agent-mode <MODE>` | Agent mode |
-| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`) |
+| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`, `acceptEdits`) |
 | `-m, --model <MODEL>` | Model override |
 | `-v, --variant <VARIANT>` | Model variant |
 | `-A, --agent-version <VERSION>` | Agent version |
+| `--mcp-config <PATH>` | JSON file with MCP server config (see `mcp` docs) |
+| `--skill <PATH>` | Skill directory or `SKILL.md` path (repeatable) |

 ```bash
 sandbox-agent api sessions create my-session \
@ -259,6 +259,8 @@ sandbox-agent api sessions create my-session \
  --permission-mode default
 ```

+`acceptEdits` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
+
 #### Send Message

 ```bash
@ -380,6 +382,132 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once

 ---

+### Filesystem
+
+#### List Entries
+
+```bash
+sandbox-agent api fs entries [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--path <PATH>` | Directory path (default: `.`) |
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs entries --path ./workspace
+```
+
+#### Read File
+
+`api fs read` writes raw bytes to stdout.
+
+```bash
+sandbox-agent api fs read <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs read ./notes.txt > ./notes.local.txt
+```
+
+#### Write File
+
+```bash
+sandbox-agent api fs write <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--content <TEXT>` | Write UTF-8 content |
+| `--from-file <PATH>` | Read content from a local file |
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs write ./hello.txt --content "hello"
+sandbox-agent api fs write ./image.bin --from-file ./image.bin
+```
+
+#### Delete Entry
+
+```bash
+sandbox-agent api fs delete <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--recursive` | Delete directories recursively |
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs delete ./old.log
+```
+
+#### Create Directory
+
+```bash
+sandbox-agent api fs mkdir <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs mkdir ./cache
+```
+
+#### Move/Rename
+
+```bash
+sandbox-agent api fs move <FROM> <TO> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--overwrite` | Overwrite destination if it exists |
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs move ./a.txt ./b.txt --overwrite
+```
+
+#### Stat
+
+```bash
+sandbox-agent api fs stat <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs stat ./notes.txt
+```
+
+#### Upload Batch (tar)
+
+```bash
+sandbox-agent api fs upload-batch --tar <PATH> [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--tar <PATH>` | Tar archive to extract |
+| `--path <PATH>` | Destination directory |
+| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
+
+```bash
+sandbox-agent api fs upload-batch --tar ./skills.tar --path ./skills
+```
+
+---
+
 ## CLI to HTTP Mapping

 | CLI Command | HTTP Endpoint |
@ -398,3 +526,11 @@ sandbox-agent api sessions reply-permission my-session perm1 --reply once
 | `api sessions reply-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reply` |
 | `api sessions reject-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reject` |
 | `api sessions reply-permission` | `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply` |
+| `api fs entries` | `GET /v1/fs/entries` |
+| `api fs read` | `GET /v1/fs/file` |
+| `api fs write` | `PUT /v1/fs/file` |
+| `api fs delete` | `DELETE /v1/fs/entry` |
+| `api fs mkdir` | `POST /v1/fs/mkdir` |
+| `api fs move` | `POST /v1/fs/move` |
+| `api fs stat` | `GET /v1/fs/stat` |
+| `api fs upload-batch` | `POST /v1/fs/upload-batch` |
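For readers calling the HTTP API directly, the filesystem rows of the mapping table can be expressed as a small lookup. This is a sketch under assumptions: `FS_ROUTES` and `fsRequest` are hypothetical names, and the only query parameter confirmed elsewhere in these docs is `path` on `PUT /v1/fs/file`; other parameter names would need to be checked against the OpenAPI spec.

```typescript
type FsCommand =
  | "entries" | "read" | "write" | "delete"
  | "mkdir" | "move" | "stat" | "upload-batch";

// Method and path pairs copied from the CLI to HTTP mapping table.
const FS_ROUTES: Record<FsCommand, { method: string; path: string }> = {
  entries: { method: "GET", path: "/v1/fs/entries" },
  read: { method: "GET", path: "/v1/fs/file" },
  write: { method: "PUT", path: "/v1/fs/file" },
  delete: { method: "DELETE", path: "/v1/fs/entry" },
  mkdir: { method: "POST", path: "/v1/fs/mkdir" },
  move: { method: "POST", path: "/v1/fs/move" },
  stat: { method: "GET", path: "/v1/fs/stat" },
  "upload-batch": { method: "POST", path: "/v1/fs/upload-batch" },
};

// Hypothetical helper: builds the method and URL for an `api fs` subcommand.
function fsRequest(
  baseUrl: string,
  command: FsCommand,
  query: Record<string, string> = {},
): { method: string; url: string } {
  const route = FS_ROUTES[command];
  const url = new URL(route.path, baseUrl);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, value); // values are percent-encoded by URL
  }
  return { method: route.method, url: url.toString() };
}
```

The result can be passed to `fetch(url, { method, headers, body })` together with the bearer token used in the cURL examples.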
--- a/docs/conversion.mdx
+++ b/docs/conversion.mdx
@ -44,9 +44,11 @@ Events / Message Flow
 +------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+
 | session.started        | none                         | method=thread/started                      | type=session.created                    | none                             | none                       |
 | session.ended          | SDKMessage.type=result       | no explicit session end (turn/completed)   | no explicit session end (session.deleted)| type=done                        | none (daemon synthetic)    |
+| turn.started           | synthetic on message send    | method=turn/started                        | type=session.status (busy)              | synthetic on message send        | none (daemon synthetic)    |
+| turn.ended             | synthetic after result       | method=turn/completed                      | type=session.idle                       | synthetic on done                | none (daemon synthetic)    |
 | message (user)         | SDKMessage.type=user         | item/completed (ThreadItem.type=userMessage)| message.updated (Message.role=user)    | type=message                     | none (daemon synthetic)    |
 | message (assistant)    | SDKMessage.type=assistant    | item/completed (ThreadItem.type=agentMessage)| message.updated (Message.role=assistant)| type=message                  | message_start/message_end  |
-| message.delta          | stream_event (partial) or synthetic | method=item/agentMessage/delta      | type=message.part.updated (delta)       | synthetic                        | message_update (text_delta/thinking_delta) |
+| message.delta          | stream_event (partial) or synthetic | method=item/agentMessage/delta      | type=message.part.updated (text-part delta) | synthetic                    | message_update (text_delta/thinking_delta) |
 | tool call              | type=tool_use               | method=item/mcpToolCall/progress           | message.part.updated (part.type=tool)   | type=tool_call                   | tool_execution_start       |
 | tool result            | user.message.content.tool_result | item/completed (tool result ThreadItem variants) | message.part.updated (part.type=tool, state=completed) | type=tool_result     | tool_execution_end        |
 | permission.requested   | control_request.can_use_tool | none                                      | type=permission.asked                   | none                             | none                       |
@ -56,6 +58,10 @@ Events / Message Flow
 | error                  | SDKResultMessage.error       | method=error                               | type=session.error (or message error)   | type=error                        | hook_error (status item)   |
 +------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+----------------------------+

+Permission status normalization:
+- `permission.requested` uses `status=requested`.
+- `permission.resolved` uses `status=accept`, `accept_for_session`, or `reject`.
+
 Synthetics

 +------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
@ -63,6 +69,8 @@ Synthetics
 +------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
 | session.started              | When agent emits no explicit start | session.started event | Mark source=daemon                                            |
 | session.ended                | When agent emits no explicit end   | session.ended event   | Mark source=daemon; reason may be inferred                    |
+| turn.started                 | When agent emits no explicit turn start | turn.started event | Mark source=daemon                                            |
+| turn.ended                   | When agent emits no explicit turn end   | turn.ended event   | Mark source=daemon                                            |
 | item_id (Claude)             | Claude provides no item IDs        | item_id               | Maintain provider_item_id map when possible                   |
 | user message (Claude)        | Claude emits only assistant output | item.completed        | Mark source=daemon; preserve raw input in event metadata       |
 | question events (Claude)     | AskUserQuestion tool usage         | question.requested/resolved | Derived from tool_use blocks (source=agent)                   |
@ -71,7 +79,7 @@ Synthetics
 | message.delta (Claude)       | No native deltas emitted        | item.delta             | Synthetic delta with full message content; source=daemon       |
 | message.delta (Amp)          | No native deltas                | item.delta             | Synthetic delta with full message content; source=daemon       |
 +------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| message.delta (OpenCode)     | part delta before message       | item.delta             | If part arrives first, create item.started stub then delta     |
+| message.delta (OpenCode)     | text part delta before message  | item.delta             | If part arrives first, create item.started stub then delta     |
 +------------------------------+------------------------+--------------------------+--------------------------------------------------------------+

 Delta handling
@ -82,10 +90,11 @@ Delta handling
 - Pi emits message_update deltas and cumulative tool_execution_update partialResult values (we diff to produce deltas).

 Policy:
- Always emit item.delta across all providers.
+- Emit item.delta for streamable text content across providers.
 - For providers without native deltas, emit a single synthetic delta containing the full content prior to item.completed.
 - For Claude when partial streaming is enabled, forward native deltas and skip the synthetic full-content delta.
 - For providers with native deltas, forward as-is; also emit item.completed when final content is known.
+- For OpenCode reasoning part deltas, emit typed reasoning item updates (item.started/item.completed with content.type=reasoning) instead of item.delta.
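The delta policy for streamable text can be sketched as a pure function. This is a simplified illustration, not the daemon's converter: the `NormalizedEvent` shape and the `emitTextItem` helper are assumptions made for the example, and reasoning parts, tool calls, and `item.started` are deliberately left out.

```typescript
// Hypothetical, simplified event shape for this sketch.
interface NormalizedEvent {
  type: "item.delta" | "item.completed";
  itemId: string;
  text: string;
  source?: "agent" | "daemon"; // only set on deltas in this sketch
}

// nativeDeltas holds the chunks streamed by the provider; it is empty for
// providers that emit no native deltas.
function emitTextItem(
  itemId: string,
  finalText: string,
  nativeDeltas: string[],
): NormalizedEvent[] {
  const events: NormalizedEvent[] = [];
  if (nativeDeltas.length > 0) {
    // Native deltas are forwarded as-is; no synthetic delta is added.
    for (const text of nativeDeltas) {
      events.push({ type: "item.delta", itemId, text, source: "agent" });
    }
  } else {
    // No native deltas: emit one synthetic delta carrying the full content.
    events.push({ type: "item.delta", itemId, text: finalText, source: "daemon" });
  }
  // item.completed follows once the final content is known.
  events.push({ type: "item.completed", itemId, text: finalText });
  return events;
}
```

Either branch gives consumers at least one `item.delta` before `item.completed`, so a chat UI can use the same rendering path for every provider.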

 Message normalization notes

--- a/docs/credentials.mdx
+++ b/docs/credentials.mdx
@ -0,0 +1,144 @@
+---
+title: "Credentials"
+description: "How sandbox-agent discovers and uses provider credentials."
+icon: "key"
+---
+
+Sandbox-agent automatically discovers API credentials from environment variables and agent config files. Credentials are used to authenticate with AI providers (Anthropic, OpenAI) when spawning agents.
+
+## Credential sources
+
+Credentials are extracted in priority order. The first valid credential found for each provider is used.
+
+### Environment variables (highest priority)
+
+**API keys** (checked first):
+
+| Variable | Provider |
+|----------|----------|
+| `ANTHROPIC_API_KEY` | Anthropic |
+| `CLAUDE_API_KEY` | Anthropic (fallback) |
+| `OPENAI_API_KEY` | OpenAI |
+| `CODEX_API_KEY` | OpenAI (fallback) |
+
+**OAuth tokens** (checked if no API key found):
+
+| Variable | Provider |
+|----------|----------|
+| `CLAUDE_CODE_OAUTH_TOKEN` | Anthropic (OAuth) |
+| `ANTHROPIC_AUTH_TOKEN` | Anthropic (OAuth fallback) |
+
+OAuth tokens from environment variables are only used when `include_oauth` is enabled (the default).
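The environment-variable lookup order for Anthropic described above can be sketched as follows. This is a minimal illustration, not the actual extraction code: the `anthropicCredentialFromEnv` helper and the `Credential` shape are hypothetical, and treating an empty string as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result shape for this sketch.
interface Credential {
  kind: "api_key" | "oauth";
  variable: string;
  value: string;
}

// Variable names and their order come from the tables above.
const ANTHROPIC_API_KEY_VARS = ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"];
const ANTHROPIC_OAUTH_VARS = ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_AUTH_TOKEN"];

// Returns the first non-empty variable from `names`, if any.
function firstSet(env: Env, names: string[]): { variable: string; value: string } | undefined {
  for (const variable of names) {
    const value = env[variable];
    if (value) return { variable, value }; // assumption: empty string counts as unset
  }
  return undefined;
}

// API keys are checked first; OAuth tokens only when includeOauth is true.
function anthropicCredentialFromEnv(env: Env, includeOauth = true): Credential | undefined {
  const apiKey = firstSet(env, ANTHROPIC_API_KEY_VARS);
  if (apiKey) return { kind: "api_key", ...apiKey };
  if (!includeOauth) return undefined;
  const oauth = firstSet(env, ANTHROPIC_OAUTH_VARS);
  return oauth ? { kind: "oauth", ...oauth } : undefined;
}
```

When this returns `undefined`, lookup would continue with the agent config files described in the next section.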
+
+### Agent config files
+
+If no environment variable is set, sandbox-agent checks agent-specific config files:
+
+| Agent | Config path | Provider |
+|-------|-------------|----------|
+| Amp | `~/.amp/config.json` | Anthropic |
+| Claude Code | `~/.claude.json`, `~/.claude/.credentials.json` | Anthropic |
+| Codex | `~/.codex/auth.json` | OpenAI |
+| OpenCode | `~/.local/share/opencode/auth.json` | Both |
+
+OAuth tokens are supported for Claude Code, Codex, and OpenCode. Expired tokens are automatically skipped.
+
+## Provider requirements by agent
+
+| Agent | Required provider |
+|-------|-------------------|
+| Claude Code | Anthropic |
+| Amp | Anthropic |
+| Codex | OpenAI |
+| OpenCode | Anthropic or OpenAI |
+| Mock | None |
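One way to read the table above is as a lookup from agent to acceptable providers. The sketch below is illustrative only: `REQUIRED_PROVIDERS` and `hasUsableCredentials` are hypothetical names, the agent ids `amp`, `opencode`, and `mock` are assumed spellings, and the real `credentialsAvailable` field may be computed differently.

```typescript
type Provider = "anthropic" | "openai";

// Providers that satisfy each agent, per the table above.
// An empty list means the agent needs no credentials (Mock).
const REQUIRED_PROVIDERS: Record<string, Provider[]> = {
  claude: ["anthropic"],
  amp: ["anthropic"],
  codex: ["openai"],
  opencode: ["anthropic", "openai"], // either provider is enough
  mock: [],
};

// Hypothetical check: true when at least one acceptable provider was discovered.
function hasUsableCredentials(agent: string, available: Set<Provider>): boolean {
  const required = REQUIRED_PROVIDERS[agent];
  if (required === undefined) return false; // unknown agent id
  if (required.length === 0) return true;
  return required.some((provider) => available.has(provider));
}
```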
+
+## Error handling behavior
+
+Sandbox-agent uses a **best-effort, fail-forward** approach to credentials:
+
+### Extraction failures are silent
+
+If a config file is missing, unreadable, or malformed, extraction continues to the next source. No errors are thrown. Missing credentials simply mean the provider is marked as unavailable.
+
+```
+~/.claude.json missing     → try ~/.claude/.credentials.json
+~/.claude/.credentials.json missing → try OpenCode config
+All sources exhausted      → anthropic = None (not an error)
+```
+
+### Agents spawn without credential validation
+
+When you send a message to a session, sandbox-agent does **not** pre-validate credentials. The agent process is spawned with whatever credentials were found (or none), and the agent's native error surfaces if authentication fails.
+
+This design:
+- Lets you test agent error handling behavior
+- Avoids duplicating provider-specific auth validation
+- Ensures sandbox-agent faithfully proxies agent behavior
+
+For example, sending a message to Claude Code without Anthropic credentials will spawn the agent, which will then emit its own "ANTHROPIC_API_KEY not set" error through the event stream.
+
+## Checking credential status
+
+### API endpoint
+
+The `GET /v1/agents` endpoint includes a `credentialsAvailable` field for each agent:
+
+```json
+{
+  "agents": [
+    {
+      "id": "claude",
+      "installed": true,
+      "credentialsAvailable": true,
+      ...
+    },
+    {
+      "id": "codex",
+      "installed": true,
+      "credentialsAvailable": false,
+      ...
+    }
+  ]
+}
+```
+
+### TypeScript SDK
+
+```typescript
+const { agents } = await client.listAgents();
+for (const agent of agents) {
+  console.log(`${agent.id}: ${agent.credentialsAvailable ? 'authenticated' : 'no credentials'}`);
+}
+```
+
+### OpenCode compatibility
+
+The `/opencode/provider` endpoint returns a `connected` array listing providers with valid credentials:
+
+```json
+{
+  "all": [...],
+  "connected": ["claude", "mock"]
+}
+```
+
+## Passing credentials explicitly
+
+You can override auto-discovered credentials by setting environment variables before starting sandbox-agent:
+
+```bash
+export ANTHROPIC_API_KEY=sk-ant-...
+export OPENAI_API_KEY=sk-...
+sandbox-agent daemon start
+```
+
+Or when using the SDK in embedded mode:
+
+```typescript
+const client = await SandboxAgentClient.spawn({
+  env: {
+    ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
+  },
+});
+```
--- a/docs/custom-tools.mdx
+++ b/docs/custom-tools.mdx
@ -0,0 +1,245 @@
+---
+title: "Custom Tools"
+description: "Give agents custom tools inside the sandbox using MCP servers or skills."
+sidebarTitle: "Custom Tools"
+icon: "wrench"
+---
+
+There are two ways to give agents custom tools that run inside the sandbox:
+
+| | MCP Server | Skill |
+|---|---|---|
+| **How it works** | Sandbox Agent spawns your MCP server process and routes tool calls to it via stdio | A markdown file that instructs the agent to run your script with `node` (or any command) |
+| **Tool discovery** | Agent sees tools automatically via MCP protocol | Agent reads instructions from the skill file |
+| **Best for** | Structured tools with typed inputs/outputs | Lightweight scripts with natural-language instructions |
+| **Requires** | `@modelcontextprotocol/sdk` dependency | Just a markdown file and a script |
+
+Both approaches execute code inside the sandbox, so your tools have full access to the sandbox filesystem, network, and installed system tools.
+
+## Option A: Tools via MCP
+
+<Steps>
+  <Step title="Write your MCP server">
+    Create an MCP server that exposes tools using `@modelcontextprotocol/sdk` with `StdioServerTransport`. This server will run inside the sandbox.
+
+    ```ts src/mcp-server.ts
+    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+    import { z } from "zod";
+
+    const server = new McpServer({
+      name: "rand",
+      version: "1.0.0",
+    });
+
+    server.tool(
+      "random_number",
+      "Generate a random integer between min and max (inclusive)",
+      {
+        min: z.number().describe("Minimum value"),
+        max: z.number().describe("Maximum value"),
+      },
+      async ({ min, max }) => ({
+        content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
+      }),
+    );
+
+    const transport = new StdioServerTransport();
+    await server.connect(transport);
+    ```
+
+    This is a simple example. Your MCP server runs inside the sandbox, so you can execute any code you'd like: query databases, call internal APIs, run shell commands, or interact with any service available in the container.
+  </Step>
+
+  <Step title="Package the MCP server">
+    Bundle into a single JS file so it can be uploaded and executed without a `node_modules` folder.
+
+    ```bash
+    npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs
+    ```
+
+    This creates `dist/mcp-server.cjs` ready to upload.
+  </Step>
+
+  <Step title="Create sandbox and upload MCP server">
+    Start your sandbox, then write the bundled file into it.
+
+    <CodeGroup>
+    ```ts TypeScript
+    import { SandboxAgent } from "sandbox-agent";
+    import fs from "node:fs";
+
+    const client = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+      token: process.env.SANDBOX_TOKEN,
+    });
+
+    const content = await fs.promises.readFile("./dist/mcp-server.cjs");
+    await client.writeFsFile(
+      { path: "/opt/mcp/custom-tools/mcp-server.cjs" },
+      content,
+    );
+    ```
+
+    ```bash cURL
+    curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      --data-binary @./dist/mcp-server.cjs
+    ```
+    </CodeGroup>
+  </Step>
+
+  <Step title="Create a session">
+    Point an MCP server config at the bundled JS file. When the session starts, Sandbox Agent spawns the MCP server process and routes tool calls to it.
+
+    <CodeGroup>
+    ```ts TypeScript
+    await client.createSession("custom-tools", {
+      agent: "claude",
+      mcp: {
+        customTools: {
+          type: "local",
+          command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
+        },
+      },
+    });
+    ```
+
+    ```bash cURL
+    curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      -H "Content-Type: application/json" \
+      -d '{
+        "agent": "claude",
+        "mcp": {
+          "customTools": {
+            "type": "local",
+            "command": ["node", "/opt/mcp/custom-tools/mcp-server.cjs"]
+          }
+        }
+      }'
+    ```
+    </CodeGroup>
+  </Step>
+</Steps>
+
+## Option B: Tools via Skills
+
+Skills are markdown files that instruct the agent how to use a script. Upload the script and a skill file, then point the session at the skill directory.
+
+<Steps>
+  <Step title="Write your script">
+    Write a script that the agent will execute. This runs inside the sandbox just like an MCP server, but the agent invokes it directly via its shell tool.
+
+    ```ts src/random-number.ts
+    const min = Number(process.argv[2]);
+    const max = Number(process.argv[3]);
+
+    if (Number.isNaN(min) || Number.isNaN(max)) {
+      console.error("Usage: random-number <min> <max>");
+      process.exit(1);
+    }
+
+    console.log(Math.floor(Math.random() * (max - min + 1)) + min);
+    ```
+  </Step>
+
+  <Step title="Write a skill file">
+    Create a `SKILL.md` that tells the agent what the script does and how to run it. The frontmatter `name` and `description` fields are required. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
+
+    ````md SKILL.md
+    ---
+    name: random-number
+    description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
+    ---
+
+    To generate a random number, run:
+
+    ```bash
+    node /opt/skills/random-number/random-number.cjs <min> <max>
+    ```
+
+    This prints a single random integer between min and max (inclusive).
+    ````
+  </Step>
+
+  <Step title="Package the script">
+    Bundle the script just like an MCP server so it has no dependencies at runtime.
+
+    ```bash
+    npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs
+    ```
+  </Step>
+
+  <Step title="Create sandbox and upload files">
+    Upload both the bundled script and the skill file.
+
+    <CodeGroup>
+    ```ts TypeScript
+    import { SandboxAgent } from "sandbox-agent";
+    import fs from "node:fs";
+
+    const client = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+      token: process.env.SANDBOX_TOKEN,
+    });
+
+    const script = await fs.promises.readFile("./dist/random-number.cjs");
+    await client.writeFsFile(
+      { path: "/opt/skills/random-number/random-number.cjs" },
+      script,
+    );
+
+    const skill = await fs.promises.readFile("./SKILL.md");
+    await client.writeFsFile(
+      { path: "/opt/skills/random-number/SKILL.md" },
+      skill,
+    );
+    ```
+
+    ```bash cURL
+    curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/random-number.cjs" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      --data-binary @./dist/random-number.cjs
+
+    curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/SKILL.md" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      --data-binary @./SKILL.md
+    ```
+    </CodeGroup>
+  </Step>
+
+  <Step title="Create a session">
+    Point the session at the skill directory. The agent reads `SKILL.md` and learns how to use your script.
+
+    <CodeGroup>
+    ```ts TypeScript
+    await client.createSession("custom-tools", {
+      agent: "claude",
+      skills: {
+        sources: [
+          { type: "local", source: "/opt/skills/random-number" },
+        ],
+      },
+    });
+    ```
+
+    ```bash cURL
+    curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
+      -H "Authorization: Bearer $SANDBOX_TOKEN" \
+      -H "Content-Type: application/json" \
+      -d '{
+        "agent": "claude",
+        "skills": {
+          "sources": [
+            { "type": "local", "source": "/opt/skills/random-number" }
+          ]
+        }
+      }'
+    ```
+    </CodeGroup>
+  </Step>
+</Steps>
+
+## Notes
+
+- The sandbox image must include a Node.js runtime that can execute the bundled files.
--- a/docs/deploy/computesdk.mdx
+++ b/docs/deploy/computesdk.mdx
@ -0,0 +1,214 @@
+---
+title: "ComputeSDK"
+description: "Deploy the daemon using ComputeSDK's provider-agnostic sandbox API."
+---
+
+[ComputeSDK](https://computesdk.com) provides a unified interface for managing sandboxes across multiple providers. Write once, deploy anywhere: switch providers by changing environment variables.
+
+## Prerequisites
+
+- `COMPUTESDK_API_KEY` from [console.computesdk.com](https://console.computesdk.com)
+- Provider API key (one of: `E2B_API_KEY`, `DAYTONA_API_KEY`, `VERCEL_TOKEN`, `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`, `BLAXEL_API_KEY`, `CSB_API_KEY`)
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
+
+## TypeScript Example
+
+```typescript
+import {
+  compute,
+  detectProvider,
+  getMissingEnvVars,
+  getProviderConfigFromEnv,
+  isProviderAuthComplete,
+  isValidProvider,
+  PROVIDER_NAMES,
+  type ExplicitComputeConfig,
+  type ProviderName,
+} from "computesdk";
+import { SandboxAgent } from "sandbox-agent";
+
+const PORT = 3000;
+const REQUEST_TIMEOUT_MS =
+  Number.parseInt(process.env.COMPUTESDK_TIMEOUT_MS || "", 10) || 120_000;
+
+/**
+ * Detects and validates the provider to use.
+ * Priority: COMPUTESDK_PROVIDER env var > auto-detection from API keys
+ */
+function resolveProvider(): ProviderName {
+  const providerOverride = process.env.COMPUTESDK_PROVIDER;
+
+  if (providerOverride) {
+    if (!isValidProvider(providerOverride)) {
+      throw new Error(
+        `Unsupported provider "${providerOverride}". Supported: ${PROVIDER_NAMES.join(", ")}`
+      );
+    }
+    if (!isProviderAuthComplete(providerOverride)) {
+      const missing = getMissingEnvVars(providerOverride);
+      throw new Error(
+        `Missing credentials for "${providerOverride}". Set: ${missing.join(", ")}`
+      );
+    }
+    return providerOverride as ProviderName;
+  }
+
+  const detected = detectProvider();
+  if (!detected) {
+    throw new Error(
+      `No provider credentials found. Set one of: ${PROVIDER_NAMES.map((p) => getMissingEnvVars(p).join(", ")).join(" | ")}`
+    );
+  }
+  return detected as ProviderName;
+}
+
+function configureComputeSDK(): void {
+  const provider = resolveProvider();
+
+  const config: ExplicitComputeConfig = {
+    provider,
+    computesdkApiKey: process.env.COMPUTESDK_API_KEY,
+    requestTimeoutMs: REQUEST_TIMEOUT_MS,
+  };
+
+  // Add provider-specific config from environment
+  const providerConfig = getProviderConfigFromEnv(provider);
+  if (Object.keys(providerConfig).length > 0) {
+    (config as any)[provider] = providerConfig;
+  }
+
+  compute.setConfig(config);
+}
+
+configureComputeSDK();
+
+// Build environment variables to pass to sandbox
+const envs: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+
+// Create sandbox
+const sandbox = await compute.sandbox.create({
+  envs: Object.keys(envs).length > 0 ? envs : undefined,
+});
+
+// Helper to run commands with error handling
+const run = async (cmd: string, options?: { background?: boolean }) => {
+  const result = await sandbox.runCommand(cmd, options);
+  if (typeof result?.exitCode === "number" && result.exitCode !== 0) {
+    throw new Error(`Command failed: ${cmd} (exit ${result.exitCode})\n${result.stderr || ""}`);
+  }
+  return result;
+};
+
+// Install sandbox-agent
+await run("curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh");
+
+// Install agents conditionally based on available API keys
+if (envs.ANTHROPIC_API_KEY) {
+  await run("sandbox-agent install-agent claude");
+}
+if (envs.OPENAI_API_KEY) {
+  await run("sandbox-agent install-agent codex");
+}
+
+// Start the server in the background
+await run(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`, { background: true });
+
+// Get the public URL for the sandbox
+const baseUrl = await sandbox.getUrl({ port: PORT });
+
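+// Wait for server to be ready; fail loudly if it never becomes healthy
+const deadline = Date.now() + REQUEST_TIMEOUT_MS;
+let ready = false;
+while (Date.now() < deadline) {
+  try {
+    const response = await fetch(`${baseUrl}/v1/health`);
+    if (response.ok) {
+      const data = await response.json();
+      if (data?.status === "ok") {
+        ready = true;
+        break;
+      }
+    }
+  } catch {
+    // Server not ready yet
+  }
+  await new Promise((r) => setTimeout(r, 500));
+}
+if (!ready) {
+  throw new Error(`sandbox-agent did not become healthy within ${REQUEST_TIMEOUT_MS}ms`);
+}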
+
+// Connect to the server
+const client = await SandboxAgent.connect({ baseUrl });
+
+// Detect which agent to use based on available API keys
+const agent = envs.ANTHROPIC_API_KEY ? "claude" : "codex";
+
+// Create a session and start coding
+await client.createSession("my-session", { agent });
+
+await client.postMessage("my-session", {
+  message: "Summarize this repository",
+});
+
+try {
+  for await (const event of client.streamEvents("my-session")) {
+    console.log(event.type, event.data);
+  }
+} finally {
+  // Cleanup: destroy the sandbox even if streaming throws
+  await sandbox.destroy();
+}
+```
+
+## Supported Providers
+
+ComputeSDK auto-detects your provider from environment variables:
+
+| Provider | Environment Variables |
+|----------|----------------------|
+| E2B | `E2B_API_KEY` |
+| Daytona | `DAYTONA_API_KEY` |
+| Vercel | `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN` |
+| Modal | `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET` |
+| Blaxel | `BLAXEL_API_KEY` |
+| CodeSandbox | `CSB_API_KEY` |
+
+## Notes
+
+- **Provider resolution order**: `COMPUTESDK_PROVIDER` env var takes priority, otherwise auto-detection from API keys.
+- **Conditional agent installation**: Only agents with available API keys are installed, reducing setup time.
+- **Command error handling**: The example validates exit codes and throws on failures for easier debugging.
+- `sandbox.runCommand(..., { background: true })` keeps the server running while your app continues.
+- `sandbox.getUrl({ port })` returns a public URL for the sandbox port.
+- Always destroy the sandbox when you are done to avoid leaking resources.
+- If sandbox creation times out, set `COMPUTESDK_TIMEOUT_MS` to a higher value (default: 120000ms).
+
+## Explicit Provider Selection
+
+To force a specific provider instead of auto-detection, set the `COMPUTESDK_PROVIDER` environment variable:
+
+```bash
+export COMPUTESDK_PROVIDER=e2b
+```
+
+Or configure programmatically using `getProviderConfigFromEnv()`:
+
+```typescript
+import { compute, getProviderConfigFromEnv, type ExplicitComputeConfig } from "computesdk";
+
+const config: ExplicitComputeConfig = {
+  provider: "e2b",
+  computesdkApiKey: process.env.COMPUTESDK_API_KEY,
+  requestTimeoutMs: 120_000,
+};
+
+// Automatically populate provider-specific config from environment
+const providerConfig = getProviderConfigFromEnv("e2b");
+if (Object.keys(providerConfig).length > 0) {
+  (config as any).e2b = providerConfig;
+}
+
+compute.setConfig(config);
+```
+
+## Direct Mode (No ComputeSDK API Key)
+
+To bypass the ComputeSDK gateway and use provider SDKs directly, see the provider-specific examples:
+
+- [E2B](/deploy/e2b)
+- [Daytona](/deploy/daytona)
+- [Vercel](/deploy/vercel)
--- a/docs/deploy/index.mdx
+++ b/docs/deploy/index.mdx
@ -1,27 +0,0 @@
---
-title: "Deploy"
-sidebarTitle: "Overview"
-description: "Choose where to run the sandbox-agent server."
-icon: "server"
---
-
-<CardGroup cols={2}>
-  <Card title="Local" icon="laptop" href="/deploy/local">
-    Run locally for development. The SDK can auto-spawn the server.
-  </Card>
-  <Card title="E2B" icon="cube" href="/deploy/e2b">
-    Deploy inside an E2B sandbox with network access.
-  </Card>
-  <Card title="Vercel" icon="triangle" href="/deploy/vercel">
-    Deploy inside a Vercel Sandbox with port forwarding.
-  </Card>
-  <Card title="Cloudflare" icon="cloud" href="/deploy/cloudflare">
-    Deploy inside a Cloudflare Sandbox with port exposure.
-  </Card>
-  <Card title="Daytona" icon="cloud" href="/deploy/daytona">
-    Run in a Daytona workspace with port forwarding.
-  </Card>
-  <Card title="Docker" icon="docker" href="/deploy/docker">
-    Build and run in a container (development only).
-  </Card>
-</CardGroup>
--- a/docs/docs.json
+++ b/docs/docs.json
@ -25,19 +25,26 @@
 	},
 	"navbar": {
 		"links": [
+			{
+				"label": "Gigacode",
+				"icon": "terminal",
+				"href": "https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode"
+			},
 			{
 				"label": "Discord",
 				"icon": "discord",
 				"href": "https://discord.gg/auCecybynK"
 			},
 			{
-				"label": "GitHub",
-				"icon": "github",
+				"type": "github",
 				"href": "https://github.com/rivet-dev/sandbox-agent"
 			}
 		]
 	},
 	"navigation": {
+		"tabs": [
+			{
+				"tab": "Documentation",
 				"pages": [
 					{
 						"group": "Getting started",
@ -45,46 +52,72 @@
 							"quickstart",
 							"building-chat-ui",
 							"manage-sessions",
-					"opencode-compatibility"
-				]
-			},
 							{
 								"group": "Deploy",
+								"icon": "server",
 								"pages": [
-					"deploy/index",
 									"deploy/local",
+									"deploy/computesdk",
 									"deploy/e2b",
 									"deploy/daytona",
 									"deploy/vercel",
 									"deploy/cloudflare",
 									"deploy/docker"
 								]
+							}
+						]
 					},
 					{
 						"group": "SDKs",
 						"pages": ["sdks/typescript", "sdks/python"]
 					},
+					{
+						"group": "Agent Features",
+						"pages": [
+							"agent-sessions",
+							"attachments",
+							"skills-config",
+							"mcp-config",
+							"custom-tools"
+						]
+					},
+					{
+						"group": "Features",
+						"pages": ["file-system"]
+					},
 					{
 						"group": "Reference",
 						"pages": [
 							"cli",
 							"inspector",
 							"session-transcript-schema",
-					"gigacode",
+							"opencode-compatibility",
+							{
+								"group": "More",
+								"pages": [
+									"credentials",
+									"daemon",
+									"cors",
+									"telemetry",
 									{
 										"group": "AI",
 										"pages": ["ai/skill", "ai/llms-txt"]
-					},
-					{
-						"group": "Advanced",
-						"pages": ["daemon", "cors", "telemetry"]
+									}
+								]
+							}
+						]
 					}
 				]
 			},
 			{
-				"group": "HTTP API Reference",
+				"tab": "HTTP API",
+				"pages": [
+					{
+						"group": "HTTP Reference",
 						"openapi": "openapi.json"
 					}
 				]
 			}
+		]
+	}
 }
--- a/docs/file-system.mdx
+++ b/docs/file-system.mdx
@ -0,0 +1,184 @@
+---
+title: "File System"
+description: "Read, write, and manage files inside the sandbox."
+sidebarTitle: "File System"
+icon: "folder"
+---
+
+The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload batches of files via tar archives.
+
+## Path Resolution
+
+- Absolute paths are used as-is.
+- Relative paths resolve against the session working directory when `sessionId` is provided.
+- Without `sessionId`, relative paths resolve against the server home directory.
+- Relative paths cannot contain `..` or absolute prefixes; requests that attempt to escape the root are rejected.
+
+The session working directory is the server process's current working directory at the moment the session is created.
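The path resolution rules above can be modelled as a small pure function. This is a minimal sketch, not the server's implementation: `resolveSandboxPath`, `sessionCwd`, and `homeDir` are hypothetical names, and POSIX-style `/` separators are assumed.

```typescript
// Illustrative model of the documented rules; not the server's actual code.
// `resolveSandboxPath`, `sessionCwd`, and `homeDir` are hypothetical names.
interface ResolveContext {
  /** Working directory of the session, present when `sessionId` was provided. */
  sessionCwd?: string;
  /** Server home directory, used when no session is given. */
  homeDir: string;
}

function resolveSandboxPath(requested: string, ctx: ResolveContext): string {
  // Absolute paths are used as-is.
  if (requested.startsWith("/")) return requested;

  // Drop empty and "." segments, then reject any ".." so the path cannot escape the root.
  const segments = requested.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.some((s) => s === "..")) {
    throw new Error(`path escapes root: ${requested}`);
  }

  // Relative paths resolve against the session cwd, otherwise the home directory.
  const base = ctx.sessionCwd !== undefined ? ctx.sessionCwd : ctx.homeDir;
  const root = base.replace(/\/+$/, "");
  return [root, ...segments].join("/") || "/";
}

console.log(resolveSandboxPath("./notes.txt", { sessionCwd: "/work/app", homeDir: "/home/agent" }));
console.log(resolveSandboxPath("data/a.txt", { homeDir: "/home/agent" }));
```

Rejecting `..` outright, rather than normalizing it, mirrors the rule as written: a relative path that tries to climb out of the root is refused instead of being silently rewritten.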
+
+## List Entries
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+const entries = await client.listFsEntries({
+  path: "./workspace",
+  sessionId: "my-session",
+});
+
+console.log(entries);
+```
+
+```bash cURL
+curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## Read And Write Files
+
+`PUT /v1/fs/file` writes raw bytes. `GET /v1/fs/file` returns raw bytes.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.writeFsFile({ path: "./notes.txt", sessionId: "my-session" }, "hello");
+
+const bytes = await client.readFsFile({
+  path: "./notes.txt",
+  sessionId: "my-session",
+});
+
+const text = new TextDecoder().decode(bytes);
+console.log(text);
+```
+
+```bash cURL
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  --data-binary "hello"
+
+curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  --output ./notes.txt
+```
+</CodeGroup>
+
+## Create Directories
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.mkdirFs({
+  path: "./data",
+  sessionId: "my-session",
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## Move, Delete, And Stat
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.moveFs(
+  { from: "./notes.txt", to: "./notes-old.txt", overwrite: true },
+  { sessionId: "my-session" },
+);
+
+const stat = await client.statFs({
+  path: "./notes-old.txt",
+  sessionId: "my-session",
+});
+
+await client.deleteFsEntry({
+  path: "./notes-old.txt",
+  sessionId: "my-session",
+});
+
+console.log(stat);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/move?sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"from":"./notes.txt","to":"./notes-old.txt","overwrite":true}'
+
+curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+
+curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN"
+```
+</CodeGroup>
+
+## Batch Upload (Tar)
+
+Batch upload accepts `application/x-tar` only and extracts into the destination directory. The response returns absolute paths for extracted files, capped at 1024 entries.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+import fs from "node:fs";
+import path from "node:path";
+import * as tar from "tar";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+const archivePath = path.join(process.cwd(), "skills.tar");
+await tar.c({
+  cwd: "./skills",
+  file: archivePath,
+}, ["."]);
+
+const tarBuffer = await fs.promises.readFile(archivePath);
+const result = await client.uploadFsBatch(tarBuffer, {
+  path: "./skills",
+  sessionId: "my-session",
+});
+
+console.log(result);
+```
+
+```bash cURL
+tar -cf skills.tar -C ./skills .
+
+curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills&sessionId=my-session" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/x-tar" \
+  --data-binary @skills.tar
+```
+</CodeGroup>
--- a/docs/inspector.mdx
+++ b/docs/inspector.mdx
@ -1,7 +1,6 @@
 ---
 title: "Inspector"
 description: "Debug and inspect agent sessions with the Inspector UI."
-icon: "magnifying-glass"
 ---

 The Inspector is a web-based GUI for debugging and inspecting Sandbox Agent sessions. Use it to view events, send messages, and troubleshoot agent behavior in real-time.
--- a/docs/mcp-config.mdx
+++ b/docs/mcp-config.mdx
@ -0,0 +1,122 @@
+---
+title: "MCP"
+description: "Configure MCP servers for agent sessions."
+sidebarTitle: "MCP"
+icon: "plug"
+---
+
+MCP (Model Context Protocol) servers extend agents with tools. Sandbox Agent can auto-load MCP servers when a session starts by passing an `mcp` map in the create-session request.
+
+## Session Config
+
+The `mcp` field is a map of server name to config. Use `type: "local"` for stdio servers and `type: "remote"` for HTTP/SSE servers:
+
+<CodeGroup>
+
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.createSession("claude-mcp", {
+  agent: "claude",
+  mcp: {
+    filesystem: {
+      type: "local",
+      command: "my-mcp-server",
+      args: ["--root", "."],
+    },
+    github: {
+      type: "remote",
+      url: "https://example.com/mcp",
+      headers: {
+        Authorization: "Bearer ${GITHUB_TOKEN}",
+      },
+    },
+  },
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-mcp" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "agent": "claude",
+    "mcp": {
+      "filesystem": {
+        "type": "local",
+        "command": "my-mcp-server",
+        "args": ["--root", "."]
+      },
+      "github": {
+        "type": "remote",
+        "url": "https://example.com/mcp",
+        "headers": {
+          "Authorization": "Bearer ${GITHUB_TOKEN}"
+        }
+      }
+    }
+  }'
+```
+
+</CodeGroup>
+
+## Config Fields
+
+### Local Server
+
+Stdio servers that run inside the sandbox.
+
+| Field | Description |
+|---|---|
+| `type` | `local` |
+| `command` | string or array (`["node", "server.js"]`) |
+| `args` | array of string arguments |
+| `env` | environment variables map |
+| `enabled` | enable or disable the server |
+| `timeoutMs` | tool timeout override |
+| `cwd` | working directory for the MCP process |
+
+```json
+{
+  "type": "local",
+  "command": ["node", "./mcp/server.js"],
+  "args": ["--root", "."],
+  "env": { "LOG_LEVEL": "debug" },
+  "cwd": "/workspace"
+}
+```
+
+### Remote Server
+
+HTTP/SSE servers accessed over the network.
+
+| Field | Description |
+|---|---|
+| `type` | `remote` |
+| `url` | MCP server URL |
+| `headers` | static headers map |
+| `bearerTokenEnvVar` | name of the env var whose value is injected into `Authorization: Bearer ...` |
+| `envHeaders` | map of header name to env var name |
+| `oauth` | object with `clientId`, `clientSecret`, `scope`, or `false` to disable |
+| `enabled` | enable or disable the server |
+| `timeoutMs` | tool timeout override |
+| `transport` | `http` or `sse` |
+
+```json
+{
+  "type": "remote",
+  "url": "https://example.com/mcp",
+  "headers": { "x-client": "sandbox-agent" },
+  "bearerTokenEnvVar": "MCP_TOKEN",
+  "transport": "sse"
+}
+```
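The two field tables above map naturally onto a discriminated union keyed on `type`. The sketch below is an assumption-laden model, not the SDK's exported types: the type names and the `describeMcpServer` helper are hypothetical, and which fields are optional is a guess, since the tables do not say.

```typescript
// Types modelled on the field tables; names and optionality are assumptions.
type McpLocalServer = {
  type: "local";
  command: string | string[];
  args?: string[];
  env?: Record<string, string>;
  enabled?: boolean;
  timeoutMs?: number;
  cwd?: string;
};

type McpRemoteServer = {
  type: "remote";
  url: string;
  headers?: Record<string, string>;
  bearerTokenEnvVar?: string;
  envHeaders?: Record<string, string>;
  oauth?: { clientId?: string; clientSecret?: string; scope?: string } | false;
  enabled?: boolean;
  timeoutMs?: number;
  transport?: "http" | "sse";
};

type McpServerConfig = McpLocalServer | McpRemoteServer;

// Hypothetical helper: one-line summary of a server entry.
function describeMcpServer(name: string, config: McpServerConfig): string {
  if (config.type === "local") {
    // `command` may be a single string or an array; normalize to an array.
    const command = Array.isArray(config.command) ? config.command : [config.command];
    return `${name}: local ${command.join(" ")}`;
  }
  // Narrowed to the remote variant here, so `url` is available.
  return `${name}: remote ${config.url}`;
}

const servers: Record<string, McpServerConfig> = {
  filesystem: { type: "local", command: "my-mcp-server", args: ["--root", "."] },
  github: { type: "remote", url: "https://example.com/mcp", bearerTokenEnvVar: "MCP_TOKEN", transport: "sse" },
};

for (const name of Object.keys(servers)) {
  console.log(describeMcpServer(name, servers[name]));
}
```

Keying the union on `type` lets the compiler reject mixed entries, such as a `local` server with a `url`, before the request is ever sent.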
+
+## Custom MCP Servers
+
+To bundle and upload your own MCP server into the sandbox, see [Custom Tools](/custom-tools).
--- a/docs/openapi.json
+++ b/docs/openapi.json
--- a/docs/opencode-compatibility.mdx
+++ b/docs/opencode-compatibility.mdx
@ -1,7 +1,6 @@
 ---
-title: "OpenCode SDK & UI Support"
+title: "OpenCode Compatibility"
 description: "Connect OpenCode clients, SDKs, and web UI to Sandbox Agent."
-icon: "rectangle-terminal"
 ---

 <Warning>
@ -60,10 +59,11 @@ The OpenCode web UI can connect to Sandbox Agent for a full browser-based experi
  </Step>
  <Step title="Clone and Start the OpenCode Web App">
    ```bash
-    git clone https://github.com/opencode-ai/opencode
+    git clone https://github.com/anomalyco/opencode
    cd opencode/packages/app
    export VITE_OPENCODE_SERVER_HOST=127.0.0.1
    export VITE_OPENCODE_SERVER_PORT=2468
+    bun install
    bun run dev -- --host 127.0.0.1 --port 5173
    ```
  </Step>
@ -113,6 +113,7 @@ for await (const event of events.stream) {
 - **CORS**: When using the web UI from a different origin, configure `--cors-allow-origin`
 - **Provider Selection**: Use the provider/model selector in the UI to choose which backing agent to use (claude, codex, opencode, amp)
 - **Models & Variants**: Providers are grouped by backing agent (e.g. Claude Code, Codex, Amp). OpenCode models are grouped by `OpenCode (<provider>)` to preserve their native provider grouping. Each model keeps its real model ID, and variants are exposed when available (Codex/OpenCode/Amp).
+- **Optional Native Proxy for TUI/Config Endpoints**: Set `OPENCODE_COMPAT_PROXY_URL` (for example `http://127.0.0.1:4096`) to proxy select OpenCode-native endpoints to a real OpenCode server. This currently applies to `/command`, `/config`, `/global/config`, and `/tui/*`. If not set, sandbox-agent uses its built-in compatibility handlers.

 ## Endpoint Coverage

@ -134,10 +135,15 @@ See the full endpoint compatibility table below. Most endpoints are functional f
 | `GET /question` | ✓ | List pending questions |
 | `POST /question/{id}/reply` | ✓ | Answer agent questions |
 | `GET /provider` | ✓ | Returns provider metadata |
+| `GET /command` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
+| `GET /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
+| `PATCH /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
+| `GET /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
+| `PATCH /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
+| `/tui/*` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
 | `GET /agent` | − | Returns agent list |
-| `GET /config` | − | Returns config |
 | *other endpoints* | − | Return empty/stub responses |

-✓ Functional &nbsp;&nbsp; − Stubbed
+✓ Functional &nbsp;&nbsp; ↔ Proxied (optional) &nbsp;&nbsp; − Stubbed

 </Accordion>
--- a/docs/session-transcript-schema.mdx
+++ b/docs/session-transcript-schema.mdx
@ -1,7 +1,6 @@
 ---
 title: "Session Transcript Schema"
 description: "Universal event schema for session transcripts across all agents."
-icon: "brackets-curly"
 ---

 Each coding agent outputs events in its own native format. The sandbox-agent converts these into a universal event schema, giving you a consistent session transcript regardless of which agent you use.
@ -27,7 +26,7 @@ This table shows which agent feature coverage appears in the universal event str
 | Reasoning/Thinking |   -    |   ✓   |      -       |      -       |      ✓       |
 | Command Execution  |   -    |   ✓   |      -       |      -       |              |
 | File Changes       |   -    |   ✓   |      -       |      -       |              |
-| MCP Tools          |   -    |   ✓   |      -       |      -       |              |
+| MCP Tools          |   ✓    |   ✓   |      ✓       |      ✓       |              |
 | Streaming Deltas   |   ✓    |   ✓   |      ✓       |      -       |      ✓       |
 | Variants           |        |   ✓   |      ✓       |      ✓       |      ✓       |

@ -125,6 +124,13 @@ Every event from the API is wrapped in a `UniversalEvent` envelope.
 | `session.started` | Session has started | `{ metadata?: any }` |
 | `session.ended` | Session has ended | `{ reason, terminated_by, message?, exit_code? }` |

+### Turn Lifecycle
+
+| Type | Description | Data |
+|------|-------------|------|
+| `turn.started` | Turn has started | `{ phase: "started", turn_id?, metadata? }` |
+| `turn.ended` | Turn has ended | `{ phase: "ended", turn_id?, metadata? }` |
+
 **SessionEndedData**

 | Field | Type | Values |
@ -159,7 +165,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
 | Type | Description | Data |
 |------|-------------|------|
 | `permission.requested` | Permission request pending | `{ permission_id, action, status, metadata? }` |
-| `permission.resolved` | Permission granted or denied | `{ permission_id, action, status, metadata? }` |
+| `permission.resolved` | Permission decision recorded | `{ permission_id, action, status, metadata? }` |
 | `question.requested` | Question pending user input | `{ question_id, prompt, options, status }` |
 | `question.resolved` | Question answered or rejected | `{ question_id, prompt, options, status, response? }` |

@ -169,7 +175,7 @@ Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more)
 |-------|------|-------------|
 | `permission_id` | string | Identifier for the permission request |
 | `action` | string | What the agent wants to do |
-| `status` | string | `requested`, `approved`, `denied` |
+| `status` | string | `requested`, `accept`, `accept_for_session`, `reject` |
 | `metadata` | any? | Additional context |

 **QuestionEventData**
@ -366,6 +372,8 @@ The daemon emits synthetic events (`synthetic: true`, `source: "daemon"`) to pro
 |-----------|------|
 | `session.started` | Agent doesn't emit explicit session start |
 | `session.ended` | Agent doesn't emit explicit session end |
+| `turn.started` | Agent doesn't emit explicit turn start |
+| `turn.ended` | Agent doesn't emit explicit turn end |
 | `item.started` | Agent doesn't emit item start events |
 | `item.delta` | Agent doesn't stream deltas natively |
 | `question.*` | Claude Code plan mode (from ExitPlanMode tool) |
--- a/docs/skills-config.mdx
+++ b/docs/skills-config.mdx
@ -0,0 +1,87 @@
+---
+title: "Skills"
+description: "Auto-load skills into agent sessions."
+sidebarTitle: "Skills"
+icon: "sparkles"
+---
+
+Skills are local instruction bundles stored in `SKILL.md` files. Sandbox Agent can fetch, discover, and link skill directories into agent-specific skill paths at session start using the `skills.sources` field. The format is fully compatible with [skills.sh](https://skills.sh).
+
+## Session Config
+
+Pass `skills.sources` when creating a session to load skills from GitHub repos, local paths, or git URLs.
+
+<CodeGroup>
+
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const client = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  token: process.env.SANDBOX_TOKEN,
+});
+
+await client.createSession("claude-skills", {
+  agent: "claude",
+  skills: {
+    sources: [
+      { type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
+      { type: "local", source: "/workspace/my-custom-skill" },
+    ],
+  },
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-skills" \
+  -H "Authorization: Bearer $SANDBOX_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "agent": "claude",
+    "skills": {
+      "sources": [
+        { "type": "github", "source": "rivet-dev/skills", "skills": ["sandbox-agent"] },
+        { "type": "local", "source": "/workspace/my-custom-skill" }
+      ]
+    }
+  }'
+```
+
+</CodeGroup>
+
+Each skill directory must contain `SKILL.md`. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
+
+## Skill Sources
+
+Each entry in `skills.sources` describes where to find skills. Three source types are supported:
+
+| Type | `source` value | Example |
+|------|---------------|---------|
+| `github` | `owner/repo` | `"rivet-dev/skills"` |
+| `local` | Filesystem path | `"/workspace/my-skill"` |
+| `git` | Git clone URL | `"https://git.example.com/skills.git"` |
+
+### Optional fields
+
+- **`skills`** — Array of skill directory names to include. When omitted, all discovered skills are installed.
+- **`ref`** — Branch, tag, or commit to check out (default: HEAD). Applies to `github` and `git` types.
+- **`subpath`** — Subdirectory within the repo to search for skills.
+
+## Custom Skills
+
+To write, upload, and configure your own skills inside the sandbox, see [Custom Tools](/custom-tools).
+
+## Advanced
+
+### Discovery logic
+
+After resolving a source to a local directory (cloning if needed), Sandbox Agent discovers skills by:
+1. Checking if the directory itself contains `SKILL.md`.
+2. Scanning the `skills/` subdirectory for child directories containing `SKILL.md`.
+3. Scanning immediate children of the directory for `SKILL.md`.
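The three discovery checks above can be modelled over a flat list of file paths relative to the resolved source directory. This is a minimal sketch under stated assumptions: `discoverSkills` is a hypothetical name, the real implementation presumably walks the filesystem, and results from all three checks are combined here because the text does not say whether the first match wins.

```typescript
// Illustrative model of the discovery rules; not Sandbox Agent's actual code.
// Input: file paths relative to the resolved source directory, using "/" separators.
// Output: sorted skill directories, with "." standing for the source directory itself.
function discoverSkills(files: string[]): string[] {
  const found = new Set<string>();
  for (const file of files) {
    const parts = file.split("/");
    if (parts[parts.length - 1] !== "SKILL.md") continue;

    if (parts.length === 1) {
      // 1. The directory itself contains SKILL.md.
      found.add(".");
    } else if (parts.length === 3 && parts[0] === "skills") {
      // 2. A child directory of skills/ contains SKILL.md.
      found.add(`skills/${parts[1]}`);
    } else if (parts.length === 2) {
      // 3. An immediate child directory contains SKILL.md.
      found.add(parts[0]);
    }
    // Deeper nesting is not covered by any of the three rules and is ignored.
  }
  return Array.from(found).sort();
}

console.log(
  discoverSkills([
    "SKILL.md",
    "skills/sandbox-agent/SKILL.md",
    "my-skill/SKILL.md",
    "docs/deep/nested/SKILL.md",
    "README.md",
  ]),
);
```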
+
+Discovered skills are symlinked into project-local skill roots (`.claude/skills/<name>`, `.agents/skills/<name>`, `.opencode/skill/<name>`).
+
+### Caching
+
+GitHub sources are downloaded as zip archives, and git sources are cloned; both are cached under `~/.sandbox-agent/skills-cache/` and updated on subsequent session creations. GitHub sources do not require `git` to be installed.
--- a/examples/CLAUDE.md
+++ b/examples/CLAUDE.md
@ -0,0 +1,17 @@
+# Examples Instructions
+
+## Docker Isolation
+
+- Docker examples must behave like standalone sandboxes.
+- Do not bind mount host files or host directories into Docker example containers.
+- If an example needs tools, skills, or MCP servers, install them inside the container during setup.
+
+## Testing Examples
+
+Examples can be tested by starting them in the background and communicating directly with the sandbox-agent API:
+
+1. Start the example: `SANDBOX_AGENT_DEV=1 pnpm start &`
+2. Note the base URL and session ID from the output.
+3. Send messages: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/messages -H "Content-Type: application/json" -d '{"message":"..."}'`
+4. Poll events: `curl http://127.0.0.1:<port>/v1/sessions/<sessionId>/events`
+5. Approve permissions: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/permissions/<permissionId>/reply -H "Content-Type: application/json" -d '{"reply":"once"}'`
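For scripted testing, the curl steps above can be wrapped in small request builders. This is a sketch only: the helper names are hypothetical, the endpoint paths and bodies are copied from the curl commands, and nothing is sent until the result is passed to `fetch`. The port `3000` in the usage lines is an example value standing in for `<port>`.

```typescript
// Hypothetical helpers mirroring steps 3-5; they only build requests.
interface PreparedRequest {
  url: string;
  init: { method: string; headers?: Record<string, string>; body?: string };
}

const JSON_HEADERS = { "Content-Type": "application/json" };

function sessionUrl(baseUrl: string, sessionId: string, suffix: string): string {
  // Trim trailing slashes so the joined URL has exactly one separator.
  return `${baseUrl.replace(/\/+$/, "")}/v1/sessions/${encodeURIComponent(sessionId)}/${suffix}`;
}

// Step 3: send a message to the session.
function sendMessage(baseUrl: string, sessionId: string, message: string): PreparedRequest {
  return {
    url: sessionUrl(baseUrl, sessionId, "messages"),
    init: { method: "POST", headers: JSON_HEADERS, body: JSON.stringify({ message }) },
  };
}

// Step 4: poll the session's events.
function pollEvents(baseUrl: string, sessionId: string): PreparedRequest {
  return { url: sessionUrl(baseUrl, sessionId, "events"), init: { method: "GET" } };
}

// Step 5: reply to a pending permission request (the steps above use "once").
function replyPermission(baseUrl: string, sessionId: string, permissionId: string, reply: string): PreparedRequest {
  return {
    url: sessionUrl(baseUrl, sessionId, `permissions/${encodeURIComponent(permissionId)}/reply`),
    init: { method: "POST", headers: JSON_HEADERS, body: JSON.stringify({ reply }) },
  };
}

const req = sendMessage("http://127.0.0.1:3000", "my-session", "hello");
console.log(req.url, req.init.body);
// To actually send it: await fetch(req.url, req.init);
```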
--- a/examples/cloudflare/src/cloudflare.ts
+++ b/examples/cloudflare/src/cloudflare.ts
--- a/examples/cloudflare/wrangler.jsonc
+++ b/examples/cloudflare/wrangler.jsonc
@ -1,7 +1,7 @@
 {
  "$schema": "node_modules/wrangler/config-schema.json",
  "name": "sandbox-agent-cloudflare",
-  "main": "src/cloudflare.ts",
+  "main": "src/index.ts",
  "compatibility_date": "2025-01-01",
  "compatibility_flags": ["nodejs_compat"],
  "assets": {
--- a/examples/computesdk/package.json
+++ b/examples/computesdk/package.json
@ -0,0 +1,19 @@
+{
+  "name": "@sandbox-agent/example-computesdk",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "start": "tsx src/computesdk.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sandbox-agent/example-shared": "workspace:*",
+    "computesdk": "latest"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "tsx": "latest",
+    "typescript": "latest",
+    "vitest": "^3.0.0"
+  }
+}
--- a/examples/computesdk/src/computesdk.ts
+++ b/examples/computesdk/src/computesdk.ts
@ -0,0 +1,156 @@
+import {
+  compute,
+  detectProvider,
+  getMissingEnvVars,
+  getProviderConfigFromEnv,
+  isProviderAuthComplete,
+  isValidProvider,
+  PROVIDER_NAMES,
+  type ExplicitComputeConfig,
+  type ProviderName,
+} from "computesdk";
+import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
+import { fileURLToPath } from "node:url";
+import { resolve } from "node:path";
+
+const PORT = 3000;
+const REQUEST_TIMEOUT_MS =
+  Number.parseInt(process.env.COMPUTESDK_TIMEOUT_MS || "", 10) || 120_000;
+
+/**
+ * Detects and validates the provider to use.
+ * Priority: COMPUTESDK_PROVIDER env var > auto-detection from API keys
+ */
+function resolveProvider(): ProviderName {
+  const providerOverride = process.env.COMPUTESDK_PROVIDER;
+  
+  if (providerOverride) {
+    if (!isValidProvider(providerOverride)) {
+      throw new Error(
+        `Unsupported ComputeSDK provider "${providerOverride}". Supported providers: ${PROVIDER_NAMES.join(", ")}`
+      );
+    }
+    if (!isProviderAuthComplete(providerOverride)) {
+      const missing = getMissingEnvVars(providerOverride);
+      throw new Error(
+        `Missing credentials for provider "${providerOverride}". Set: ${missing.join(", ")}`
+      );
+    }
+    console.log(`Using ComputeSDK provider: ${providerOverride} (explicit)`);
+    return providerOverride as ProviderName;
+  }
+  
+  const detected = detectProvider();
+  if (!detected) {
+    throw new Error(
+      `No provider credentials found. Set one of: ${PROVIDER_NAMES.map((p) => getMissingEnvVars(p).join(", ")).join(" | ")}`
+    );
+  }
+  console.log(`Using ComputeSDK provider: ${detected} (auto-detected)`);
+  return detected as ProviderName;
+}
+
+function configureComputeSDK(): void {
+  const provider = resolveProvider();
+  
+  const config: ExplicitComputeConfig = {
+    provider,
+    computesdkApiKey: process.env.COMPUTESDK_API_KEY,
+    requestTimeoutMs: REQUEST_TIMEOUT_MS,
+  };
+  
+  const providerConfig = getProviderConfigFromEnv(provider);
+  if (Object.keys(providerConfig).length > 0) {
+    const configWithProvider =
+      config as ExplicitComputeConfig & Record<ProviderName, Record<string, string>>;
+    configWithProvider[provider] = providerConfig;
+  }
+  
+  compute.setConfig(config);
+}
+
+configureComputeSDK();
+
+const buildEnv = (): Record<string, string> => {
+  const env: Record<string, string> = {};
+  if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+  if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+  return env;
+};
+
+export async function setupComputeSdkSandboxAgent(): Promise<{
+  baseUrl: string;
+  cleanup: () => Promise<void>;
+}> {
+  const env = buildEnv();
+
+  console.log("Creating ComputeSDK sandbox...");
+  const sandbox = await compute.sandbox.create({
+    envs: Object.keys(env).length > 0 ? env : undefined,
+  });
+
+  const run = async (cmd: string, options?: { background?: boolean }) => {
+    const result = await sandbox.runCommand(cmd, options);
+    if (typeof result?.exitCode === "number" && result.exitCode !== 0) {
+      throw new Error(`Command failed: ${cmd} (exit ${result.exitCode})\n${result.stderr || ""}`);
+    }
+    return result;
+  };
+
+  console.log("Installing sandbox-agent...");
+  await run("curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh");
+
+  if (env.ANTHROPIC_API_KEY) {
+    console.log("Installing Claude agent...");
+    await run("sandbox-agent install-agent claude");
+  }
+
+  if (env.OPENAI_API_KEY) {
+    console.log("Installing Codex agent...");
+    await run("sandbox-agent install-agent codex");
+  }
+
+  console.log("Starting server...");
+  await run(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`, { background: true });
+
+  const baseUrl = await sandbox.getUrl({ port: PORT });
+
+  console.log("Waiting for server...");
+  await waitForHealth({ baseUrl });
+
+  const cleanup = async () => {
+    try {
+      await sandbox.destroy();
+    } catch (error) {
+      console.warn("Cleanup failed:", error instanceof Error ? error.message : error);
+    }
+  };
+
+  return { baseUrl, cleanup };
+}
+
+export async function runComputeSdkExample(): Promise<void> {
+  const { baseUrl, cleanup } = await setupComputeSdkSandboxAgent();
+
+  const handleExit = async () => {
+    await cleanup();
+    process.exit(0);
+  };
+
+  process.once("SIGINT", handleExit);
+  process.once("SIGTERM", handleExit);
+
+  await runPrompt(baseUrl);
+  await cleanup();
+}
+
+const isDirectRun = Boolean(
+  process.argv[1] && resolve(process.argv[1]) === fileURLToPath(import.meta.url)
+);
+
+if (isDirectRun) {
+  runComputeSdkExample().catch((error) => {
+    console.error(error instanceof Error ? error.message : error);
+    process.exit(1);
+  });
+}
--- a/examples/computesdk/tests/computesdk.test.ts
+++ b/examples/computesdk/tests/computesdk.test.ts
@ -0,0 +1,39 @@
+import { describe, it, expect } from "vitest";
+import { buildHeaders } from "@sandbox-agent/example-shared";
+import { setupComputeSdkSandboxAgent } from "../src/computesdk.ts";
+
+const hasModal = Boolean(process.env.MODAL_TOKEN_ID && process.env.MODAL_TOKEN_SECRET);
+const hasVercel = Boolean(process.env.VERCEL_TOKEN || process.env.VERCEL_OIDC_TOKEN);
+const hasProviderKey = Boolean(
+  process.env.BLAXEL_API_KEY ||
+    process.env.CSB_API_KEY ||
+    process.env.DAYTONA_API_KEY ||
+    process.env.E2B_API_KEY ||
+    hasModal ||
+    hasVercel
+);
+
+const shouldRun = Boolean(process.env.COMPUTESDK_API_KEY) && hasProviderKey;
+const timeoutMs = Number.parseInt(process.env.SANDBOX_TEST_TIMEOUT_MS || "", 10) || 300_000;
+
+const testFn = shouldRun ? it : it.skip;
+
+describe("computesdk example", () => {
+  testFn(
+    "starts sandbox-agent and responds to /v1/health",
+    async () => {
+      const { baseUrl, cleanup } = await setupComputeSdkSandboxAgent();
+      try {
+        const response = await fetch(`${baseUrl}/v1/health`, {
+          headers: buildHeaders({}),
+        });
+        expect(response.ok).toBe(true);
+        const data = await response.json();
+        expect(data.status).toBe("ok");
+      } finally {
+        await cleanup();
+      }
+    },
+    timeoutMs
+  );
+});
--- a/examples/computesdk/tsconfig.json
+++ b/examples/computesdk/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/daytona/package.json
+++ b/examples/daytona/package.json
@ -3,13 +3,14 @@
  "private": true,
  "type": "module",
  "scripts": {
-    "start": "tsx src/daytona.ts",
+    "start": "tsx src/index.ts",
    "start:snapshot": "tsx src/daytona-with-snapshot.ts",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
    "@daytonaio/sdk": "latest",
-    "@sandbox-agent/example-shared": "workspace:*"
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*"
  },
  "devDependencies": {
    "@types/node": "latest",
--- a/examples/daytona/src/daytona-with-snapshot.ts
+++ b/examples/daytona/src/daytona-with-snapshot.ts
@ -1,5 +1,6 @@
 import { Daytona, Image } from "@daytonaio/sdk";
-import { runPrompt } from "@sandbox-agent/example-shared";
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";

 const daytona = new Daytona();

@ -24,12 +25,21 @@ await sandbox.process.executeCommand(

 const baseUrl = (await sandbox.getSignedPreviewUrl(3000, 4 * 60 * 60)).url;

+console.log("Waiting for server...");
+await waitForHealth({ baseUrl });
+
+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
 const cleanup = async () => {
+	clearInterval(keepAlive);
 	await sandbox.delete(60);
 	process.exit(0);
 };
 process.once("SIGINT", cleanup);
 process.once("SIGTERM", cleanup);
-
-await runPrompt(baseUrl);
-await cleanup();
--- a/examples/daytona/src/daytona.ts
+++ b/examples/daytona/src/daytona.ts
@ -1,5 +1,6 @@
 import { Daytona } from "@daytonaio/sdk";
-import { runPrompt } from "@sandbox-agent/example-shared";
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";

 const daytona = new Daytona();

@ -25,12 +26,21 @@ await sandbox.process.executeCommand(

 const baseUrl = (await sandbox.getSignedPreviewUrl(3000, 4 * 60 * 60)).url;

+console.log("Waiting for server...");
+await waitForHealth({ baseUrl });
+
+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
 const cleanup = async () => {
+	clearInterval(keepAlive);
 	await sandbox.delete(60);
 	process.exit(0);
 };
 process.once("SIGINT", cleanup);
 process.once("SIGTERM", cleanup);
-
-await runPrompt(baseUrl);
-await cleanup();
--- a/examples/docker/package.json
+++ b/examples/docker/package.json
@ -3,12 +3,13 @@
  "private": true,
  "type": "module",
  "scripts": {
-    "start": "tsx src/docker.ts",
+    "start": "tsx src/index.ts",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
    "@sandbox-agent/example-shared": "workspace:*",
-    "dockerode": "latest"
+    "dockerode": "latest",
+    "sandbox-agent": "workspace:*"
  },
  "devDependencies": {
    "@types/dockerode": "latest",
--- a/examples/docker/src/docker.ts
+++ b/examples/docker/src/docker.ts
@ -1,5 +1,6 @@
 import Docker from "dockerode";
-import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";

 const IMAGE = "alpine:latest";
 const PORT = 3000;
@ -44,13 +45,19 @@ await container.start();
 const baseUrl = `http://127.0.0.1:${PORT}`;
 await waitForHealth({ baseUrl });

+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
 const cleanup = async () => {
+  clearInterval(keepAlive);
  try { await container.stop({ t: 5 }); } catch {}
  try { await container.remove({ force: true }); } catch {}
  process.exit(0);
 };
 process.once("SIGINT", cleanup);
 process.once("SIGTERM", cleanup);
-
-await runPrompt(baseUrl);
-await cleanup();
--- a/examples/e2b/package.json
+++ b/examples/e2b/package.json
@ -3,7 +3,7 @@
  "private": true,
  "type": "module",
  "scripts": {
-    "start": "tsx src/e2b.ts",
+    "start": "tsx src/index.ts",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
--- a/examples/e2b/src/index.ts
+++ b/examples/e2b/src/index.ts
@ -1,5 +1,6 @@
 import { Sandbox } from "@e2b/code-interpreter";
-import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";

 const envs: Record<string, string> = {};
 if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
@ -29,12 +30,18 @@ const baseUrl = `https://${sandbox.getHost(3000)}`;
 console.log("Waiting for server...");
 await waitForHealth({ baseUrl });

+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
 const cleanup = async () => {
+  clearInterval(keepAlive);
  await sandbox.kill();
  process.exit(0);
 };
 process.once("SIGINT", cleanup);
 process.once("SIGTERM", cleanup);
-
-await runPrompt(baseUrl);
-await cleanup();
--- a/examples/file-system/package.json
+++ b/examples/file-system/package.json
@ -0,0 +1,19 @@
+{
+  "name": "@sandbox-agent/example-file-system",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "start": "tsx src/index.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*",
+    "tar": "^7"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "tsx": "latest",
+    "typescript": "latest"
+  }
+}
--- a/examples/file-system/src/index.ts
+++ b/examples/file-system/src/index.ts
@ -0,0 +1,57 @@
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
+import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
+import * as tar from "tar";
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+console.log("Starting sandbox...");
+const { baseUrl, cleanup } = await startDockerSandbox({ port: 3003 });
+
+console.log("Creating sample files...");
+const tmpDir = path.resolve(__dirname, "../.tmp-upload");
+const projectDir = path.join(tmpDir, "my-project");
+fs.mkdirSync(path.join(projectDir, "src"), { recursive: true });
+fs.writeFileSync(path.join(projectDir, "README.md"), "# My Project\n\nUploaded via batch tar.\n");
+fs.writeFileSync(path.join(projectDir, "src", "index.ts"), 'console.log("hello from uploaded project");\n');
+fs.writeFileSync(path.join(projectDir, "package.json"), JSON.stringify({ name: "my-project", version: "1.0.0" }, null, 2) + "\n");
+console.log("  Created 3 files in my-project/");
+
+console.log("Uploading files via batch tar...");
+const client = await SandboxAgent.connect({ baseUrl });
+
+const tarPath = path.join(tmpDir, "upload.tar");
+await tar.create(
+  { file: tarPath, cwd: tmpDir },
+  ["my-project"],
+);
+const tarBuffer = await fs.promises.readFile(tarPath);
+const uploadResult = await client.uploadFsBatch(tarBuffer, { path: "/opt" });
+console.log(`  Uploaded ${uploadResult.paths.length} files: ${uploadResult.paths.join(", ")}`);
+
+// Clean up the local temp files; the upload already lives in the sandbox.
+fs.rmSync(tmpDir, { recursive: true, force: true });
+
+console.log("Verifying uploaded files...");
+const entries = await client.listFsEntries({ path: "/opt/my-project" });
+console.log(`  Found ${entries.length} entries in /opt/my-project`);
+for (const entry of entries) {
+  console.log(`    ${entry.entryType === "directory" ? "d" : "-"} ${entry.name}`);
+}
+
+const readmeBytes = await client.readFsFile({ path: "/opt/my-project/README.md" });
+const readmeText = new TextDecoder().decode(readmeBytes);
+console.log(`  README.md content: ${readmeText.trim()}`);
+
+console.log("Creating session...");
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log('  Try: "read the README in /opt/my-project"');
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
+process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });
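Review note: `startDockerSandbox` registers its own one-shot SIGINT handler (see `examples/shared/src/docker.ts` in this commit), so on Ctrl+C the same `cleanup` can be reached from two handlers. The `try`/`catch` blocks inside `cleanup` make the second run harmless, but an idempotency guard makes that intent explicit. A minimal sketch, assuming plain TypeScript only; `once` is a hypothetical helper and not part of `@sandbox-agent/example-shared`:

```typescript
// Hypothetical helper: wraps a teardown so repeated calls share a single run.
type Teardown = () => Promise<void> | void;

function once(fn: Teardown): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (!pending) {
      // Promise.resolve().then(fn) also turns a synchronous throw into a rejection.
      pending = Promise.resolve().then(fn);
    }
    return pending;
  };
}

// Usage sketch: both calls return the same promise, so the body is scheduled once.
let runs = 0;
const guardedCleanup = once(async () => {
  runs += 1;
});
const first = guardedCleanup();
const second = guardedCleanup();
```

With a guard like this, each signal handler could call `guardedCleanup()` without caring which one fired first.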
--- a/examples/file-system/tsconfig.json
+++ b/examples/file-system/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/mcp-custom-tool/package.json
+++ b/examples/mcp-custom-tool/package.json
@ -0,0 +1,22 @@
+{
+  "name": "@sandbox-agent/example-mcp-custom-tool",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "build:mcp": "esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs",
+    "start": "pnpm build:mcp && tsx src/index.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@modelcontextprotocol/sdk": "latest",
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*",
+    "zod": "latest"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "esbuild": "latest",
+    "tsx": "latest",
+    "typescript": "latest"
+  }
+}
--- a/examples/mcp-custom-tool/src/index.ts
+++ b/examples/mcp-custom-tool/src/index.ts
@ -0,0 +1,49 @@
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
+import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+// Verify the bundled MCP server exists (built by `pnpm build:mcp`).
+const serverFile = path.resolve(__dirname, "../dist/mcp-server.cjs");
+if (!fs.existsSync(serverFile)) {
+  console.error("Error: dist/mcp-server.cjs not found. Run `pnpm build:mcp` first.");
+  process.exit(1);
+}
+
+// Start a Docker container running sandbox-agent.
+console.log("Starting sandbox...");
+const { baseUrl, cleanup } = await startDockerSandbox({ port: 3004 });
+
+// Upload the bundled MCP server into the sandbox filesystem.
+console.log("Uploading MCP server bundle...");
+const client = await SandboxAgent.connect({ baseUrl });
+
+const bundle = await fs.promises.readFile(serverFile);
+const written = await client.writeFsFile(
+  { path: "/opt/mcp/custom-tools/mcp-server.cjs" },
+  bundle,
+);
+console.log(`  Written: ${written.path} (${written.bytesWritten} bytes)`);
+
+// Create a session with the uploaded MCP server as a local command.
+console.log("Creating session with custom MCP tool...");
+const sessionId = generateSessionId();
+await client.createSession(sessionId, {
+  agent: detectAgent(),
+  mcp: {
+    customTools: {
+      type: "local",
+      command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
+    },
+  },
+});
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log('  Try: "generate a random number between 1 and 100"');
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
+process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });
--- a/examples/mcp-custom-tool/src/mcp-server.ts
+++ b/examples/mcp-custom-tool/src/mcp-server.ts
@ -0,0 +1,24 @@
+import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+import { z } from "zod";
+
+async function main() {
+  const server = new McpServer({ name: "rand", version: "1.0.0" });
+
+  server.tool(
+    "random_number",
+    "Generate a random integer between min and max (inclusive)",
+    {
+      min: z.number().int().describe("Minimum value (integer)"),
+      max: z.number().int().describe("Maximum value (integer)"),
+    },
+    async ({ min, max }) => ({
+      content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
+    }),
+  );
+
+  const transport = new StdioServerTransport();
+  await server.connect(transport);
+}
+
+main().catch((err) => { console.error(err); process.exit(1); });
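Review note: the handler's expression `Math.floor(Math.random() * (max - min + 1)) + min` covers both bounds only when `min` and `max` are integers with `min <= max`. A small sketch that checks the bounds deterministically; `randomInt` and its injectable `rng` parameter are hypothetical names used for illustration and do not appear in the commit:

```typescript
// Hypothetical helper mirroring the expression used in the tool handler.
// `rng` is injectable so the two extremes can be checked without randomness.
function randomInt(min: number, max: number, rng: () => number = Math.random): number {
  return Math.floor(rng() * (max - min + 1)) + min;
}

// rng at its minimum (0) yields the lower bound: floor(0 * 100) + 1 = 1
const lowest = randomInt(1, 100, () => 0);

// rng just below 1 yields the upper bound: floor(0.999999 * 100) + 1 = 100
const highest = randomInt(1, 100, () => 0.999999);
```

The `+ 1` inside the multiplication is what makes `max` reachable; without it the result would top out at `max - 1`.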
--- a/examples/mcp-custom-tool/tsconfig.json
+++ b/examples/mcp-custom-tool/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/mcp/package.json
+++ b/examples/mcp/package.json
@ -0,0 +1,18 @@
+{
+  "name": "@sandbox-agent/example-mcp",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "start": "tsx src/index.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "tsx": "latest",
+    "typescript": "latest"
+  }
+}
--- a/examples/mcp/src/index.ts
+++ b/examples/mcp/src/index.ts
@ -0,0 +1,31 @@
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
+import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
+
+console.log("Starting sandbox...");
+const { baseUrl, cleanup } = await startDockerSandbox({
+  port: 3002,
+  setupCommands: [
+    "npm install -g --silent @modelcontextprotocol/server-everything@2026.1.26",
+  ],
+});
+
+console.log("Creating session with everything MCP server...");
+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, {
+  agent: detectAgent(),
+  mcp: {
+    everything: {
+      type: "local",
+      command: ["mcp-server-everything"],
+      timeoutMs: 10000,
+    },
+  },
+});
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log('  Try: "generate a random number between 1 and 100"');
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
+process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });
--- a/examples/mcp/tsconfig.json
+++ b/examples/mcp/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/shared/Dockerfile
+++ b/examples/shared/Dockerfile
@ -0,0 +1,5 @@
+FROM node:22-bookworm-slim
+RUN apt-get update -qq && apt-get install -y -qq --no-install-recommends ca-certificates > /dev/null 2>&1 && \
+    rm -rf /var/lib/apt/lists/* && \
+    npm install -g --silent @sandbox-agent/cli@latest && \
+    sandbox-agent install-agent claude
--- a/examples/shared/Dockerfile.dev
+++ b/examples/shared/Dockerfile.dev
@ -0,0 +1,58 @@
+FROM node:22-bookworm-slim AS frontend
+RUN corepack enable && corepack prepare pnpm@latest --activate
+WORKDIR /build
+
+# Copy workspace root config
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+
+# Copy packages needed for the inspector build chain:
+#   inspector -> sandbox-agent SDK -> cli-shared
+COPY sdks/typescript/ sdks/typescript/
+COPY sdks/cli-shared/ sdks/cli-shared/
+COPY frontend/packages/inspector/ frontend/packages/inspector/
+COPY docs/openapi.json docs/
+
+# Create stub package.json for workspace packages referenced in pnpm-workspace.yaml
+# but not needed for the inspector build (avoids install errors).
+RUN set -e; for dir in \
+      sdks/cli sdks/gigacode \
+      resources/agent-schemas resources/vercel-ai-sdk-schemas \
+      scripts/release scripts/sandbox-testing \
+      examples/shared examples/docker examples/e2b examples/vercel \
+      examples/daytona examples/cloudflare examples/file-system \
+      examples/mcp examples/mcp-custom-tool \
+      examples/skills examples/skills-custom-tool \
+      frontend/packages/website; do \
+      mkdir -p "$dir"; \
+      printf '{"name":"@stub/%s","private":true,"version":"0.0.0"}\n' "$(basename "$dir")" > "$dir/package.json"; \
+    done; \
+    for parent in sdks/cli/platforms sdks/gigacode/platforms; do \
+      for plat in darwin-arm64 darwin-x64 linux-arm64 linux-x64 win32-x64; do \
+        mkdir -p "$parent/$plat"; \
+        printf '{"name":"@stub/%s-%s","private":true,"version":"0.0.0"}\n' "$(basename "$parent")" "$plat" > "$parent/$plat/package.json"; \
+      done; \
+    done
+
+RUN pnpm install --no-frozen-lockfile
+ENV SKIP_OPENAPI_GEN=1
+RUN pnpm --filter sandbox-agent build && \
+    pnpm --filter @sandbox-agent/inspector build
+
+FROM rust:1.88.0-bookworm AS builder
+WORKDIR /build
+COPY Cargo.toml Cargo.lock ./
+COPY server/ ./server/
+COPY gigacode/ ./gigacode/
+COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/
+COPY --from=frontend /build/frontend/packages/inspector/dist/ ./frontend/packages/inspector/dist/
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/usr/local/cargo/git \
+    --mount=type=cache,target=/build/target \
+    cargo build -p sandbox-agent --release && \
+    cp target/release/sandbox-agent /sandbox-agent
+
+FROM node:22-bookworm-slim
+RUN apt-get update -qq && apt-get install -y -qq --no-install-recommends ca-certificates > /dev/null 2>&1 && \
+    rm -rf /var/lib/apt/lists/*
+COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
+RUN sandbox-agent install-agent claude
--- a/examples/shared/package.json
+++ b/examples/shared/package.json
@ -3,15 +3,18 @@
  "private": true,
  "type": "module",
  "exports": {
-    ".": "./src/sandbox-agent-client.ts"
+    ".": "./src/sandbox-agent-client.ts",
+    "./docker": "./src/docker.ts"
  },
  "scripts": {
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
+    "dockerode": "latest",
    "sandbox-agent": "workspace:*"
  },
  "devDependencies": {
+    "@types/dockerode": "latest",
    "@types/node": "latest",
    "typescript": "latest"
  }
--- a/examples/shared/src/docker.ts
+++ b/examples/shared/src/docker.ts
@ -0,0 +1,301 @@
+import Docker from "dockerode";
+import { execFileSync } from "node:child_process";
+import fs from "node:fs";
+import path from "node:path";
+import { PassThrough } from "node:stream";
+import { fileURLToPath } from "node:url";
+import { waitForHealth } from "./sandbox-agent-client.ts";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const EXAMPLE_IMAGE = "sandbox-agent-examples:latest";
+const EXAMPLE_IMAGE_DEV = "sandbox-agent-examples-dev:latest";
+const DOCKERFILE_DIR = path.resolve(__dirname, "..");
+const REPO_ROOT = path.resolve(DOCKERFILE_DIR, "../..");
+
+export interface DockerSandboxOptions {
+  /** Container port used by sandbox-agent inside Docker. */
+  port: number;
+  /** Optional fixed host port mapping. If omitted, Docker assigns a free host port automatically. */
+  hostPort?: number;
+  /** Additional shell commands to run before starting sandbox-agent. */
+  setupCommands?: string[];
+  /** Docker image to use. Defaults to the pre-built sandbox-agent-examples image. */
+  image?: string;
+}
+
+export interface DockerSandbox {
+  baseUrl: string;
+  cleanup: () => Promise<void>;
+}
+
+const DIRECT_CREDENTIAL_KEYS = [
+  "ANTHROPIC_API_KEY",
+  "CLAUDE_API_KEY",
+  "CLAUDE_CODE_OAUTH_TOKEN",
+  "ANTHROPIC_AUTH_TOKEN",
+  "OPENAI_API_KEY",
+  "CODEX_API_KEY",
+  "CEREBRAS_API_KEY",
+  "OPENCODE_API_KEY",
+] as const;
+
+function stripShellQuotes(value: string): string {
+  const trimmed = value.trim();
+  if (trimmed.length >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
+    return trimmed.slice(1, -1);
+  }
+  if (trimmed.length >= 2 && trimmed.startsWith("'") && trimmed.endsWith("'")) {
+    return trimmed.slice(1, -1);
+  }
+  return trimmed;
+}
+
+function parseExtractedCredentials(output: string): Record<string, string> {
+  const parsed: Record<string, string> = {};
+  for (const rawLine of output.split("\n")) {
+    const line = rawLine.trim();
+    if (!line) continue;
+    const cleanLine = line.startsWith("export ") ? line.slice(7) : line;
+    const match = cleanLine.match(/^([A-Z0-9_]+)=(.*)$/);
+    if (!match) continue;
+    const [, key, rawValue] = match;
+    const value = stripShellQuotes(rawValue);
+    if (!value) continue;
+    parsed[key] = value;
+  }
+  return parsed;
+}
+
+interface ClaudeCredentialFile {
+  hostPath: string;
+  containerPath: string;
+  base64Content: string;
+}
+
+function readClaudeCredentialFiles(): ClaudeCredentialFile[] {
+  const homeDir = process.env.HOME || "";
+  if (!homeDir) return [];
+
+  const candidates: Array<{ hostPath: string; containerPath: string }> = [
+    {
+      hostPath: path.join(homeDir, ".claude", ".credentials.json"),
+      containerPath: "/root/.claude/.credentials.json",
+    },
+    {
+      hostPath: path.join(homeDir, ".claude-oauth-credentials.json"),
+      containerPath: "/root/.claude-oauth-credentials.json",
+    },
+  ];
+
+  const files: ClaudeCredentialFile[] = [];
+  for (const candidate of candidates) {
+    if (!fs.existsSync(candidate.hostPath)) continue;
+    try {
+      const raw = fs.readFileSync(candidate.hostPath, "utf8");
+      files.push({
+        hostPath: candidate.hostPath,
+        containerPath: candidate.containerPath,
+        base64Content: Buffer.from(raw, "utf8").toString("base64"),
+      });
+    } catch {
+      // Ignore unreadable credential file candidates.
+    }
+  }
+  return files;
+}
+
+function collectCredentialEnv(): Record<string, string> {
+  const merged: Record<string, string> = {};
+  let extracted: Record<string, string> = {};
+  try {
+    const output = execFileSync(
+      "sandbox-agent",
+      ["credentials", "extract-env"],
+      { encoding: "utf8", stdio: ["ignore", "pipe", "pipe"] },
+    );
+    extracted = parseExtractedCredentials(output);
+  } catch {
+    // Fall back to direct env vars if extraction is unavailable.
+  }
+
+  for (const [key, value] of Object.entries(extracted)) {
+    if (value) merged[key] = value;
+  }
+  for (const key of DIRECT_CREDENTIAL_KEYS) {
+    const direct = process.env[key];
+    if (direct) merged[key] = direct;
+  }
+  return merged;
+}
+
+function shellSingleQuotedLiteral(value: string): string {
+  return `'${value.replace(/'/g, `'\"'\"'`)}'`;
+}
+
+function stripAnsi(value: string): string {
+  return value.replace(
+    /[\u001B\u009B][[\]()#;?]*(?:(?:[a-zA-Z\d]*(?:;[a-zA-Z\d]*)*)?\u0007|(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-ORZcf-nqry=><])/g,
+    "",
+  );
+}
+
+async function ensureExampleImage(_docker: Docker): Promise<string> {
+  const dev = !!process.env.SANDBOX_AGENT_DEV;
+  const imageName = dev ? EXAMPLE_IMAGE_DEV : EXAMPLE_IMAGE;
+
+  if (dev) {
+    console.log("  Building sandbox image from source (the first build may take a while; later runs reuse the Docker layer cache)...");
+    try {
+      execFileSync("docker", [
+        "build", "-t", imageName,
+        "-f", path.join(DOCKERFILE_DIR, "Dockerfile.dev"),
+        REPO_ROOT,
+      ], {
+        stdio: ["ignore", "ignore", "pipe"],
+      });
+    } catch (err: unknown) {
+      const stderr = err instanceof Error && "stderr" in err ? String((err as { stderr: unknown }).stderr) : "";
+      throw new Error(`Failed to build sandbox image: ${stderr}`);
+    }
+  } else {
+    console.log("  Building sandbox image (the first build may take a while; later runs reuse the Docker layer cache)...");
+    try {
+      execFileSync("docker", ["build", "-t", imageName, DOCKERFILE_DIR], {
+        stdio: ["ignore", "ignore", "pipe"],
+      });
+    } catch (err: unknown) {
+      const stderr = err instanceof Error && "stderr" in err ? String((err as { stderr: unknown }).stderr) : "";
+      throw new Error(`Failed to build sandbox image: ${stderr}`);
+    }
+  }
+
+  return imageName;
+}
+
+/**
+ * Start a Docker container running sandbox-agent and wait for it to be healthy.
+ * Registers one-shot SIGINT/SIGTERM handlers that stop and remove the container, then exit the process.
+ */
+export async function startDockerSandbox(opts: DockerSandboxOptions): Promise<DockerSandbox> {
+  const { port, hostPort } = opts;
+  const useCustomImage = !!opts.image;
+  let image = opts.image ?? EXAMPLE_IMAGE;
+  // TODO: Replace setupCommands shell bootstrapping with native sandbox-agent exec API once available.
+  const setupCommands = [...(opts.setupCommands ?? [])];
+  const credentialEnv = collectCredentialEnv();
+  const claudeCredentialFiles = readClaudeCredentialFiles();
+  const bootstrapEnv: Record<string, string> = {};
+
+  if (claudeCredentialFiles.length > 0) {
+    delete credentialEnv.ANTHROPIC_API_KEY;
+    delete credentialEnv.CLAUDE_API_KEY;
+    delete credentialEnv.CLAUDE_CODE_OAUTH_TOKEN;
+    delete credentialEnv.ANTHROPIC_AUTH_TOKEN;
+
+    const credentialBootstrapCommands = claudeCredentialFiles.flatMap((file, index) => {
+      const envKey = `SANDBOX_AGENT_CLAUDE_CREDENTIAL_${index}_B64`;
+      bootstrapEnv[envKey] = file.base64Content;
+      return [
+        `mkdir -p ${shellSingleQuotedLiteral(path.posix.dirname(file.containerPath))}`,
+        `printf %s "$${envKey}" | base64 -d > ${shellSingleQuotedLiteral(file.containerPath)}`,
+      ];
+    });
+    setupCommands.unshift(...credentialBootstrapCommands);
+  }
+
+  for (const [key, value] of Object.entries(credentialEnv)) {
+    if (!process.env[key]) process.env[key] = value;
+  }
+
+  const docker = new Docker({ socketPath: "/var/run/docker.sock" });
+
+  if (useCustomImage) {
+    try {
+      await docker.getImage(image).inspect();
+    } catch {
+      console.log(`  Pulling ${image}...`);
+      await new Promise<void>((resolve, reject) => {
+        docker.pull(image, (err: Error | null, stream: NodeJS.ReadableStream) => {
+          if (err) return reject(err);
+          docker.modem.followProgress(stream, (err: Error | null) => (err ? reject(err) : resolve()));
+        });
+      });
+    }
+  } else {
+    image = await ensureExampleImage(docker);
+  }
+
+  const bootCommands = [
+    ...setupCommands,
+    `sandbox-agent server --no-token --host 0.0.0.0 --port ${port}`,
+  ];
+
+  const container = await docker.createContainer({
+    Image: image,
+    WorkingDir: "/root",
+    Cmd: ["sh", "-c", bootCommands.join(" && ")],
+    Env: [
+      ...Object.entries(credentialEnv).map(([key, value]) => `${key}=${value}`),
+      ...Object.entries(bootstrapEnv).map(([key, value]) => `${key}=${value}`),
+    ],
+    ExposedPorts: { [`${port}/tcp`]: {} },
+    HostConfig: {
+      AutoRemove: true,
+      PortBindings: { [`${port}/tcp`]: [{ HostPort: hostPort ? `${hostPort}` : "0" }] },
+    },
+  });
+  await container.start();
+
+  const logChunks: string[] = [];
+  const startupLogs = await container.logs({
+    follow: true,
+    stdout: true,
+    stderr: true,
+    since: 0,
+  }) as NodeJS.ReadableStream;
+  const stdoutStream = new PassThrough();
+  const stderrStream = new PassThrough();
+  stdoutStream.on("data", (chunk) => {
+    logChunks.push(stripAnsi(String(chunk)));
+  });
+  stderrStream.on("data", (chunk) => {
+    logChunks.push(stripAnsi(String(chunk)));
+  });
+  docker.modem.demuxStream(startupLogs, stdoutStream, stderrStream);
+  const stopStartupLogs = () => {
+    const stream = startupLogs as NodeJS.ReadableStream & { destroy?: () => void };
+    try { stream.destroy?.(); } catch {}
+  };
+
+  const inspect = await container.inspect();
+  const mappedPorts = inspect.NetworkSettings?.Ports?.[`${port}/tcp`];
+  const mappedHostPort = mappedPorts?.[0]?.HostPort;
+  if (!mappedHostPort) {
+    throw new Error(`Failed to resolve mapped host port for container port ${port}`);
+  }
+  const baseUrl = `http://127.0.0.1:${mappedHostPort}`;
+
+  try {
+    await waitForHealth({ baseUrl });
+  } catch (err) {
+    stopStartupLogs();
+    console.error("  Container logs:");
+    for (const chunk of logChunks) {
+      process.stderr.write(`    ${chunk}`);
+    }
+    throw err;
+  }
+  stopStartupLogs();
+  console.log(`  Ready (${baseUrl})`);
+
+  const cleanup = async () => {
+    stopStartupLogs();
+    try { await container.stop({ t: 5 }); } catch {}
+    try { await container.remove({ force: true }); } catch {}
+    process.exit(0);
+  };
+  process.once("SIGINT", cleanup);
+  process.once("SIGTERM", cleanup);
+
+  return { baseUrl, cleanup };
+}
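Review note: `shellSingleQuotedLiteral` is what keeps the credential paths safe inside the `sh -c` bootstrap string. The worked example below copies the function from this file and shows what it produces for a plain path and for a value containing a single quote; the sample inputs are illustrative only:

```typescript
// Copied from examples/shared/src/docker.ts in this commit.
// Closes the quoted string, emits a double-quoted single quote, then reopens it.
function shellSingleQuotedLiteral(value: string): string {
  return `'${value.replace(/'/g, `'\"'\"'`)}'`;
}

// A path without quotes is simply wrapped in single quotes.
const plain = shellSingleQuotedLiteral("/root/.claude/.credentials.json");
// plain is: '/root/.claude/.credentials.json'

// An embedded single quote becomes the five-character sequence '"'"'
const tricky = shellSingleQuotedLiteral("it's");
// tricky is: 'it'"'"'s'
```

A POSIX shell reads `'it'"'"'s'` as three adjacent quoted segments and concatenates them back into `it's`.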
--- a/examples/shared/src/sandbox-agent-client.ts
+++ b/examples/shared/src/sandbox-agent-client.ts
@ -3,11 +3,7 @@
 * Provides minimal helpers for connecting to and interacting with sandbox-agent servers.
 */

-import { createInterface } from "node:readline/promises";
-import { randomUUID } from "node:crypto";
 import { setTimeout as delay } from "node:timers/promises";
-import { SandboxAgent } from "sandbox-agent";
-import type { PermissionEventData, QuestionEventData } from "sandbox-agent";

 function normalizeBaseUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
@ -27,10 +23,12 @@ export function buildInspectorUrl({
  baseUrl,
  token,
  headers,
+  sessionId,
 }: {
  baseUrl: string;
  token?: string;
  headers?: Record<string, string>;
+  sessionId?: string;
 }): string {
  const normalized = normalizeBaseUrl(ensureUrl(baseUrl));
  const params = new URLSearchParams();
@ -41,7 +39,8 @@ export function buildInspectorUrl({
    params.set("headers", JSON.stringify(headers));
  }
  const queryString = params.toString();
-  return `${normalized}/ui/${queryString ? `?${queryString}` : ""}`;
+  const sessionPath = sessionId ? `sessions/${sessionId}` : "";
+  return `${normalized}/ui/${sessionPath}${queryString ? `?${queryString}` : ""}`;
 }

 export function logInspectorUrl({
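Review note: with the new `sessionId` option the inspector link points at `/ui/sessions/<id>` instead of `/ui/`. A simplified sketch of the resulting URL shape, assuming the base URL is already absolute; `sketchInspectorUrl` is a hypothetical stand-in that omits the real function's `ensureUrl` and `token` handling, and its `headers` check is a simplification:

```typescript
// Hypothetical, trimmed-down stand-in for buildInspectorUrl (illustration only).
function sketchInspectorUrl(
  baseUrl: string,
  sessionId?: string,
  headers?: Record<string, string>,
): string {
  const normalized = baseUrl.replace(/\/+$/, ""); // same trailing-slash trim as normalizeBaseUrl
  const params = new URLSearchParams();
  if (headers) params.set("headers", JSON.stringify(headers));
  const queryString = params.toString();
  const sessionPath = sessionId ? `sessions/${sessionId}` : "";
  return `${normalized}/ui/${sessionPath}${queryString ? `?${queryString}` : ""}`;
}

// Without a session id the link is unchanged from the previous behaviour.
const rootUi = sketchInspectorUrl("http://127.0.0.1:3000/");
// rootUi is: http://127.0.0.1:3000/ui/

// With a session id the link opens that session directly.
const sessionUi = sketchInspectorUrl("http://127.0.0.1:3000", "session-abc12345");
// sessionUi is: http://127.0.0.1:3000/ui/sessions/session-abc12345
```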
@ -110,125 +109,39 @@ export async function waitForHealth({
  throw (lastError ?? new Error("Timed out waiting for /v1/health")) as Error;
 }

-function detectAgent(): string {
+export function generateSessionId(): string {
+  const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
+  let id = "session-";
+  for (let i = 0; i < 8; i++) {
+    id += chars[Math.floor(Math.random() * chars.length)];
+  }
+  return id;
+}
+
+export function detectAgent(): string {
  if (process.env.SANDBOX_AGENT) return process.env.SANDBOX_AGENT;
-  if (process.env.ANTHROPIC_API_KEY) return "claude";
-  if (process.env.OPENAI_API_KEY) return "codex";
+  const hasClaude = Boolean(
+    process.env.ANTHROPIC_API_KEY ||
+    process.env.CLAUDE_API_KEY ||
+    process.env.CLAUDE_CODE_OAUTH_TOKEN ||
+    process.env.ANTHROPIC_AUTH_TOKEN,
+  );
+  const openAiLikeKey = process.env.OPENAI_API_KEY || process.env.CODEX_API_KEY || "";
+  const hasCodexApiKey = openAiLikeKey.startsWith("sk-");
+  if (hasCodexApiKey && hasClaude) {
+    console.log("Both Claude and Codex API keys detected; defaulting to codex. Set SANDBOX_AGENT to override.");
+    return "codex";
+  }
+  if (!hasCodexApiKey && openAiLikeKey) {
+    console.log("OpenAI/Codex credential is not an API key (expected sk-...), skipping codex auto-select.");
+  }
+  if (hasCodexApiKey) return "codex";
+  if (hasClaude) {
+    if (openAiLikeKey && !hasCodexApiKey) {
+      console.log("Using claude by default.");
+    }
+    return "claude";
+  }
  return "claude";
 }

-export async function runPrompt(baseUrl: string): Promise<void> {
-  console.log(`UI: ${buildInspectorUrl({ baseUrl })}`);
-
-  const client = await SandboxAgent.connect({ baseUrl });
-
-  const agent = detectAgent();
-  console.log(`Using agent: ${agent}`);
-  const sessionId = randomUUID();
-  await client.createSession(sessionId, { agent });
-  console.log(`Session ${sessionId}. Press Ctrl+C to quit.`);
-
-  const rl = createInterface({ input: process.stdin, output: process.stdout });
-
-  let isThinking = false;
-  let hasStartedOutput = false;
-  let turnResolve: (() => void) | null = null;
-  let sessionEnded = false;
-
-  const processEvents = async () => {
-    for await (const event of client.streamEvents(sessionId)) {
-      if (event.type === "item.started") {
-        const item = (event.data as any)?.item;
-        if (item?.role === "assistant") {
-          isThinking = true;
-          hasStartedOutput = false;
-          process.stdout.write("Thinking...");
-        }
-      }
-
-      if (event.type === "item.delta" && isThinking) {
-        const delta = (event.data as any)?.delta;
-        if (delta) {
-          if (!hasStartedOutput) {
-            process.stdout.write("\r\x1b[K");
-            hasStartedOutput = true;
-          }
-          const text = typeof delta === "string" ? delta : delta.type === "text" ? delta.text || "" : "";
-          if (text) process.stdout.write(text);
-        }
-      }
-
-      if (event.type === "item.completed") {
-        const item = (event.data as any)?.item;
-        if (item?.role === "assistant") {
-          isThinking = false;
-          process.stdout.write("\n");
-          turnResolve?.();
-          turnResolve = null;
-        }
-      }
-
-      if (event.type === "permission.requested") {
-        const data = event.data as PermissionEventData;
-        if (isThinking && !hasStartedOutput) {
-          process.stdout.write("\r\x1b[K");
-        }
-        console.log(`[Auto-approved] ${data.action}`);
-        await client.replyPermission(sessionId, data.permission_id, { reply: "once" });
-      }
-
-      if (event.type === "question.requested") {
-        const data = event.data as QuestionEventData;
-        if (isThinking && !hasStartedOutput) {
-          process.stdout.write("\r\x1b[K");
-        }
-        console.log(`[Question rejected] ${data.prompt}`);
-        await client.rejectQuestion(sessionId, data.question_id);
-      }
-
-      if (event.type === "error") {
-        const data = event.data as any;
-        console.error(`\nError: ${data?.message || JSON.stringify(data)}`);
-      }
-
-      if (event.type === "session.ended") {
-        const data = event.data as any;
-        const reason = data?.reason || "unknown";
-        if (reason === "error") {
-          console.error(`\nAgent exited with error: ${data?.message || ""}`);
-          if (data?.exit_code !== undefined) {
-            console.error(`  Exit code: ${data.exit_code}`);
-          }
-        } else {
-          console.log(`Agent session ${reason}`);
-        }
-        sessionEnded = true;
-        turnResolve?.();
-        turnResolve = null;
-      }
-    }
-  };
-
-  processEvents().catch((err) => {
-    if (!sessionEnded) {
-      console.error("Event stream error:", err instanceof Error ? err.message : err);
-    }
-  });
-
-  while (true) {
-    const line = await rl.question("> ");
-    if (!line.trim()) continue;
-
-    const turnComplete = new Promise<void>((resolve) => {
-      turnResolve = resolve;
-    });
-
-    try {
-      await client.postMessage(sessionId, { message: line.trim() });
-      await turnComplete;
-    } catch (error) {
-      console.error(error instanceof Error ? error.message : error);
-      turnResolve = null;
-    }
-  }
-}
--- a/examples/skills-custom-tool/SKILL.md
+++ b/examples/skills-custom-tool/SKILL.md
@ -0,0 +1,12 @@
+---
+name: random-number
+description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
+---
+
+To generate a random number, run:
+
+```bash
+node /opt/skills/random-number/random-number.cjs <min> <max>
+```
+
+This prints a single random integer between min and max (inclusive).
--- a/examples/skills-custom-tool/package.json
+++ b/examples/skills-custom-tool/package.json
@ -0,0 +1,20 @@
+{
+  "name": "@sandbox-agent/example-skills-custom-tool",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "build:script": "esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs",
+    "start": "pnpm build:script && tsx src/index.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "esbuild": "latest",
+    "tsx": "latest",
+    "typescript": "latest"
+  }
+}
--- a/examples/skills-custom-tool/src/index.ts
+++ b/examples/skills-custom-tool/src/index.ts
@ -0,0 +1,53 @@
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
+import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+// Verify the bundled script exists (built by `pnpm build:script`).
+const scriptFile = path.resolve(__dirname, "../dist/random-number.cjs");
+if (!fs.existsSync(scriptFile)) {
+  console.error("Error: dist/random-number.cjs not found. Run `pnpm build:script` first.");
+  process.exit(1);
+}
+
+// Start a Docker container running sandbox-agent.
+console.log("Starting sandbox...");
+const { baseUrl, cleanup } = await startDockerSandbox({ port: 3005 });
+
+// Upload the bundled script and SKILL.md into the sandbox filesystem.
+console.log("Uploading script and skill file...");
+const client = await SandboxAgent.connect({ baseUrl });
+
+const script = await fs.promises.readFile(scriptFile);
+const scriptResult = await client.writeFsFile(
+  { path: "/opt/skills/random-number/random-number.cjs" },
+  script,
+);
+console.log(`  Script: ${scriptResult.path} (${scriptResult.bytesWritten} bytes)`);
+
+const skillMd = await fs.promises.readFile(path.resolve(__dirname, "../SKILL.md"));
+const skillResult = await client.writeFsFile(
+  { path: "/opt/skills/random-number/SKILL.md" },
+  skillMd,
+);
+console.log(`  Skill:  ${skillResult.path} (${skillResult.bytesWritten} bytes)`);
+
+// Create a session with the uploaded skill as a local source.
+console.log("Creating session with custom skill...");
+const sessionId = generateSessionId();
+await client.createSession(sessionId, {
+  agent: detectAgent(),
+  skills: {
+    sources: [{ type: "local", source: "/opt/skills/random-number" }],
+  },
+});
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log('  Try: "generate a random number between 1 and 100"');
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
+process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });
--- a/examples/skills-custom-tool/src/random-number.ts
+++ b/examples/skills-custom-tool/src/random-number.ts
@ -0,0 +1,9 @@
+const min = Number(process.argv[2]);
+const max = Number(process.argv[3]);
+
+if (!Number.isInteger(min) || !Number.isInteger(max) || min > max) {
+  console.error("Usage: random-number <min> <max> (integers, min <= max)");
+  process.exit(1);
+}
+
+console.log(Math.floor(Math.random() * (max - min + 1)) + min);
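Review note: the inclusive-range logic in `random-number.ts` can also be written as pure functions, which makes it testable without spawning a process. The sketch below is an illustration, not part of this commit. `parseRange` and `randomInt` are hypothetical names, and the injectable `rng` parameter is an assumption added only so the bounds can be checked deterministically.

```typescript
// Hypothetical helpers (not part of this commit): the same inclusive-range
// logic, split into pure functions so it can be tested in isolation.

type Range = { min: number; max: number };

// Convert one CLI-style argument to an integer, or null if it is not one.
// The empty-string check matters because Number("") is 0.
function toInt(arg: string): number | null {
  if (arg.trim() === "") return null;
  const n = Number(arg);
  return Number.isInteger(n) ? n : null;
}

// Parse <min> <max>; returns null for missing or non-integer arguments,
// or when min > max.
function parseRange(args: readonly string[]): Range | null {
  if (args.length < 2) return null;
  const min = toInt(args[0]);
  const max = toInt(args[1]);
  if (min === null || max === null || min > max) return null;
  return { min, max };
}

// Random integer in [min, max], both ends inclusive.
// `rng` must return a value in [0, 1); it defaults to Math.random.
function randomInt(range: Range, rng: () => number = Math.random): number {
  return Math.floor(rng() * (range.max - range.min + 1)) + range.min;
}
```

A CLI entry point could call `parseRange(process.argv.slice(2))`, print the usage message on `null`, and otherwise print `randomInt(range)`.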
--- a/examples/skills-custom-tool/tsconfig.json
+++ b/examples/skills-custom-tool/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/skills/package.json
+++ b/examples/skills/package.json
@ -0,0 +1,18 @@
+{
+  "name": "@sandbox-agent/example-skills",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "start": "tsx src/index.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sandbox-agent/example-shared": "workspace:*",
+    "sandbox-agent": "workspace:*"
+  },
+  "devDependencies": {
+    "@types/node": "latest",
+    "tsx": "latest",
+    "typescript": "latest"
+  }
+}
--- a/examples/skills/src/index.ts
+++ b/examples/skills/src/index.ts
@ -0,0 +1,26 @@
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId } from "@sandbox-agent/example-shared";
+import { startDockerSandbox } from "@sandbox-agent/example-shared/docker";
+
+console.log("Starting sandbox...");
+const { baseUrl, cleanup } = await startDockerSandbox({
+  port: 3001,
+});
+
+console.log("Creating session with skill source...");
+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, {
+  agent: detectAgent(),
+  skills: {
+    sources: [
+      { type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
+    ],
+  },
+});
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log('  Try: "How do I start sandbox-agent?"');
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
+process.on("SIGINT", () => { clearInterval(keepAlive); cleanup().then(() => process.exit(0)); });
--- a/examples/skills/tsconfig.json
+++ b/examples/skills/tsconfig.json
@ -0,0 +1,16 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "lib": ["ES2022", "DOM"],
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "strict": true,
+    "skipLibCheck": true,
+    "resolveJsonModule": true
+  },
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "**/*.test.ts"]
+}
--- a/examples/vercel/package.json
+++ b/examples/vercel/package.json
@ -3,7 +3,7 @@
  "private": true,
  "type": "module",
  "scripts": {
-    "start": "tsx src/vercel.ts",
+    "start": "tsx src/index.ts",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
--- a/examples/vercel/src/vercel.ts
+++ b/examples/vercel/src/vercel.ts
@ -1,5 +1,6 @@
 import { Sandbox } from "@vercel/sandbox";
-import { runPrompt, waitForHealth } from "@sandbox-agent/example-shared";
+import { SandboxAgent } from "sandbox-agent";
+import { detectAgent, buildInspectorUrl, generateSessionId, waitForHealth } from "@sandbox-agent/example-shared";

 const envs: Record<string, string> = {};
 if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
@ -40,12 +41,18 @@ const baseUrl = sandbox.domain(3000);
 console.log("Waiting for server...");
 await waitForHealth({ baseUrl });

+const client = await SandboxAgent.connect({ baseUrl });
+const sessionId = generateSessionId();
+await client.createSession(sessionId, { agent: detectAgent() });
+
+console.log(`  UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
+console.log("  Press Ctrl+C to stop.");
+
+const keepAlive = setInterval(() => {}, 60_000);
 const cleanup = async () => {
+  clearInterval(keepAlive);
  await sandbox.stop();
  process.exit(0);
 };
 process.once("SIGINT", cleanup);
 process.once("SIGTERM", cleanup);
-
-await runPrompt(baseUrl);
-await cleanup();
--- a/frontend/packages/inspector/Dockerfile
+++ b/frontend/packages/inspector/Dockerfile
@ -1,15 +1,20 @@
 FROM node:22-alpine AS build
 WORKDIR /app
-RUN npm install -g pnpm
+RUN npm install -g pnpm@9

 # Copy package files for all workspaces
 COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
 COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
 COPY sdks/typescript/package.json ./sdks/typescript/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/

 # Install dependencies
 RUN pnpm install --filter @sandbox-agent/inspector...

+# Copy cli-shared source and build it
+COPY sdks/cli-shared ./sdks/cli-shared
+RUN cd sdks/cli-shared && pnpm exec tsup
+
 # Copy SDK source (with pre-generated types)
 COPY sdks/typescript ./sdks/typescript

--- a/frontend/packages/inspector/index.html
+++ b/frontend/packages/inspector/index.html
@ -336,6 +336,12 @@
        color: var(--danger);
      }

+      .banner.config-note {
+        background: rgba(255, 159, 10, 0.12);
+        border-left: 3px solid var(--warning);
+        color: var(--warning);
+      }
+
      .banner.success {
        background: rgba(48, 209, 88, 0.1);
        border-left: 3px solid var(--success);
@ -471,11 +477,12 @@
        position: relative;
      }

-      .sidebar-add-menu {
+      .sidebar-add-menu,
+      .session-create-menu {
        position: absolute;
        top: 36px;
        left: 0;
-        min-width: 200px;
+        min-width: 220px;
        background: var(--surface);
        border: 1px solid var(--border-2);
        border-radius: 8px;
@ -487,6 +494,405 @@
        z-index: 60;
      }

+      .session-create-header {
+        display: flex;
+        align-items: center;
+        gap: 8px;
+        padding: 6px 6px 4px;
+        margin-bottom: 4px;
+      }
+
+      .session-create-back {
+        width: 24px;
+        height: 24px;
+        background: transparent;
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+        color: var(--muted);
+        cursor: pointer;
+        display: flex;
+        align-items: center;
+        justify-content: center;
+        transition: all var(--transition);
+        flex-shrink: 0;
+      }
+
+      .session-create-back:hover {
+        border-color: var(--accent);
+        color: var(--accent);
+      }
+
+      .session-create-agent-name {
+        font-size: 12px;
+        font-weight: 600;
+        color: var(--text);
+      }
+
+      .session-create-form {
+        display: flex;
+        flex-direction: column;
+        gap: 0;
+        padding: 4px 2px;
+      }
+
+      .session-create-form .setup-field {
+        display: flex;
+        flex-direction: row;
+        align-items: center;
+        gap: 8px;
+        height: 28px;
+      }
+
+      .session-create-form .setup-label {
+        width: 72px;
+        flex-shrink: 0;
+        text-align: right;
+      }
+
+      .session-create-form .setup-select,
+      .session-create-form .setup-input {
+        flex: 1;
+        min-width: 0;
+      }
+
+      .session-create-section {
+        overflow: hidden;
+      }
+
+      .session-create-section-toggle {
+        display: flex;
+        align-items: center;
+        gap: 8px;
+        width: 100%;
+        height: 28px;
+        padding: 0;
+        background: transparent;
+        border: none;
+        color: var(--text-secondary);
+        font-size: 11px;
+        cursor: pointer;
+        transition: color var(--transition);
+      }
+
+      .session-create-section-toggle:hover {
+        color: var(--text);
+      }
+
+      .session-create-section-toggle .setup-label {
+        width: 72px;
+        flex-shrink: 0;
+        text-align: right;
+      }
+
+      .session-create-section-count {
+        font-size: 11px;
+        font-weight: 400;
+        color: var(--muted);
+      }
+
+      .session-create-section-arrow {
+        margin-left: auto;
+        color: var(--muted-2);
+        flex-shrink: 0;
+      }
+
+      .session-create-section-body {
+        margin: 4px 0 6px;
+        padding: 8px;
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+        background: var(--surface-2);
+      }
+
+      .session-create-textarea {
+        width: 100%;
+        background: var(--surface-2);
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+        padding: 6px 8px;
+        font-size: 10px;
+        color: var(--text);
+        outline: none;
+        resize: vertical;
+        min-height: 60px;
+        font-family: ui-monospace, SFMono-Regular, 'SF Mono', Consolas, monospace;
+        transition: border-color var(--transition);
+      }
+
+      .session-create-textarea:focus {
+        border-color: var(--accent);
+      }
+
+      .session-create-textarea::placeholder {
+        color: var(--muted-2);
+      }
+
+      .session-create-inline-error {
+        font-size: 10px;
+        color: var(--danger);
+        margin-top: 4px;
+        line-height: 1.4;
+      }
+
+      .session-create-skill-list {
+        display: flex;
+        flex-direction: column;
+        gap: 2px;
+        margin-bottom: 4px;
+      }
+
+      .session-create-skill-item {
+        display: flex;
+        align-items: center;
+        gap: 4px;
+        padding: 3px 4px 3px 8px;
+        background: var(--surface-2);
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+      }
+
+      .session-create-skill-path {
+        flex: 1;
+        min-width: 0;
+        font-size: 10px;
+        color: var(--text-secondary);
+        overflow: hidden;
+        text-overflow: ellipsis;
+        white-space: nowrap;
+      }
+
+      .session-create-skill-remove {
+        width: 18px;
+        height: 18px;
+        background: transparent;
+        border: none;
+        border-radius: 3px;
+        color: var(--muted);
+        cursor: pointer;
+        display: flex;
+        align-items: center;
+        justify-content: center;
+        flex-shrink: 0;
+        transition: all var(--transition);
+      }
+
+      .session-create-skill-remove:hover {
+        color: var(--danger);
+        background: rgba(255, 59, 48, 0.12);
+      }
+
+      .session-create-skill-add-row {
+        display: flex;
+      }
+
+      .session-create-skill-input {
+        width: 100%;
+        background: var(--surface-2);
+        border: 1px solid var(--accent);
+        border-radius: 4px;
+        padding: 4px 8px;
+        font-size: 10px;
+        color: var(--text);
+        outline: none;
+        font-family: ui-monospace, SFMono-Regular, 'SF Mono', Consolas, monospace;
+      }
+
+      .session-create-skill-input::placeholder {
+        color: var(--muted-2);
+      }
+
+      .session-create-skill-type-badge {
+        display: inline-flex;
+        align-items: center;
+        padding: 1px 5px;
+        border-radius: 3px;
+        font-size: 9px;
+        font-weight: 600;
+        text-transform: uppercase;
+        letter-spacing: 0.3px;
+        background: rgba(255, 79, 0, 0.15);
+        color: var(--accent);
+        flex-shrink: 0;
+      }
+
+      .session-create-skill-type-row {
+        display: flex;
+        gap: 4px;
+      }
+
+      .session-create-skill-type-select {
+        width: 80px;
+        flex-shrink: 0;
+        background: var(--surface-2);
+        border: 1px solid var(--accent);
+        border-radius: 4px;
+        padding: 4px 6px;
+        font-size: 10px;
+        color: var(--text);
+        outline: none;
+        cursor: pointer;
+      }
+
+      .session-create-mcp-list {
+        display: flex;
+        flex-direction: column;
+        gap: 2px;
+        margin-bottom: 4px;
+      }
+
+      .session-create-mcp-item {
+        display: flex;
+        align-items: center;
+        gap: 4px;
+        padding: 3px 4px 3px 8px;
+        background: var(--surface-2);
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+      }
+
+      .session-create-mcp-info {
+        flex: 1;
+        min-width: 0;
+        display: flex;
+        align-items: center;
+        gap: 6px;
+      }
+
+      .session-create-mcp-name {
+        font-size: 11px;
+        font-weight: 600;
+        color: var(--text);
+        white-space: nowrap;
+      }
+
+      .session-create-mcp-type {
+        font-size: 9px;
+        font-weight: 500;
+        text-transform: uppercase;
+        letter-spacing: 0.3px;
+        color: var(--muted);
+        background: var(--surface);
+        padding: 1px 4px;
+        border-radius: 3px;
+        white-space: nowrap;
+      }
+
+      .session-create-mcp-summary {
+        font-size: 10px;
+        color: var(--muted);
+        overflow: hidden;
+        text-overflow: ellipsis;
+        white-space: nowrap;
+        min-width: 0;
+      }
+
+      .session-create-mcp-actions {
+        display: flex;
+        align-items: center;
+        gap: 2px;
+        flex-shrink: 0;
+      }
+
+      .session-create-mcp-edit {
+        display: flex;
+        flex-direction: column;
+        gap: 4px;
+      }
+
+      .session-create-mcp-name-input {
+        width: 100%;
+        background: var(--surface-2);
+        border: 1px solid var(--accent);
+        border-radius: 4px;
+        padding: 4px 8px;
+        font-size: 11px;
+        color: var(--text);
+        outline: none;
+      }
+
+      .session-create-mcp-name-input:disabled {
+        opacity: 0.55;
+        cursor: not-allowed;
+      }
+
+      .session-create-mcp-name-input::placeholder {
+        color: var(--muted-2);
+      }
+
+      .session-create-mcp-edit-actions {
+        display: flex;
+        gap: 4px;
+      }
+
+      .session-create-mcp-save,
+      .session-create-mcp-cancel {
+        flex: 1;
+        padding: 4px 8px;
+        border-radius: 4px;
+        border: none;
+        font-size: 10px;
+        font-weight: 600;
+        cursor: pointer;
+        transition: background var(--transition);
+      }
+
+      .session-create-mcp-save {
+        background: var(--accent);
+        color: #fff;
+      }
+
+      .session-create-mcp-save:hover {
+        background: var(--accent-hover);
+      }
+
+      .session-create-mcp-cancel {
+        background: var(--border-2);
+        color: var(--text-secondary);
+      }
+
+      .session-create-mcp-cancel:hover {
+        background: var(--muted-2);
+      }
+
+      .session-create-add-btn {
+        display: flex;
+        align-items: center;
+        gap: 4px;
+        width: 100%;
+        padding: 4px 8px;
+        background: transparent;
+        border: 1px dashed var(--border-2);
+        border-radius: 4px;
+        color: var(--muted);
+        font-size: 10px;
+        cursor: pointer;
+        transition: all var(--transition);
+      }
+
+      .session-create-add-btn:hover {
+        border-color: var(--accent);
+        color: var(--accent);
+      }
+
+      .session-create-actions {
+        padding: 4px 2px 2px;
+        margin-top: 4px;
+      }
+
+      .session-create-actions .button.primary {
+        width: 100%;
+        padding: 8px 12px;
+        font-size: 12px;
+      }
+
+      /* Empty state variant of session-create-menu */
+      .empty-state-menu-wrapper .session-create-menu {
+        top: 100%;
+        left: 50%;
+        transform: translateX(-50%);
+        margin-top: 8px;
+      }
+
      .sidebar-add-option {
        background: transparent;
        border: 1px solid transparent;
@ -515,12 +921,40 @@
      .agent-option-left {
        display: flex;
        flex-direction: column;
+        align-items: flex-start;
        gap: 2px;
        min-width: 0;
      }

      .agent-option-name {
        white-space: nowrap;
+        min-width: 0;
+      }
+
+      .agent-option-version {
+        font-size: 10px;
+        color: var(--muted);
+        white-space: nowrap;
+      }
+
+      .sidebar-add-option:hover .agent-option-version {
+        color: rgba(255, 255, 255, 0.6);
+      }
+
+      .agent-option-badges {
+        display: flex;
+        align-items: center;
+        gap: 6px;
+        flex-shrink: 0;
+      }
+
+      .agent-option-arrow {
+        color: var(--muted-2);
+        transition: color var(--transition);
+      }
+
+      .sidebar-add-option:hover .agent-option-arrow {
+        color: rgba(255, 255, 255, 0.6);
      }

      .agent-badge {
@ -535,9 +969,6 @@
        flex-shrink: 0;
      }

-      .agent-badge.version {
-        color: var(--muted);
-      }

      .sidebar-add-status {
        padding: 6px 8px;
@ -1043,6 +1474,36 @@
        height: 16px;
      }

+      /* Session Config Bar */
+      .session-config-bar {
+        display: flex;
+        align-items: flex-start;
+        gap: 20px;
+        padding: 10px 16px 12px;
+        border-top: 1px solid var(--border);
+        flex-shrink: 0;
+        flex-wrap: wrap;
+      }
+
+      .session-config-field {
+        display: flex;
+        flex-direction: column;
+        gap: 2px;
+      }
+
+      .session-config-label {
+        font-size: 10px;
+        font-weight: 600;
+        text-transform: uppercase;
+        letter-spacing: 0.5px;
+        color: var(--muted);
+      }
+
+      .session-config-value {
+        font-size: 12px;
+        color: #8e8e93;
+      }
+
      /* Setup Row */
      .setup-row {
        display: flex;
@ -1207,6 +1668,29 @@
        color: #fff;
      }

+      .setup-config-actions {
+        display: flex;
+        gap: 6px;
+        flex-wrap: wrap;
+      }
+
+      .setup-config-btn {
+        border: 1px solid var(--border-2);
+        border-radius: 4px;
+        background: var(--surface);
+        color: var(--text-secondary);
+      }
+
+      .setup-config-btn:hover {
+        border-color: var(--accent);
+        color: var(--accent);
+      }
+
+      .setup-config-btn.error {
+        color: var(--danger);
+        border-color: rgba(255, 59, 48, 0.4);
+      }
+
      .setup-version {
        font-size: 10px;
        color: var(--muted);
@ -1311,6 +1795,15 @@
        margin-bottom: 0;
      }

+      .config-textarea {
+        min-height: 130px;
+      }
+
+      .config-inline-error {
+        margin-top: 8px;
+        margin-bottom: 0;
+      }
+
      .card-header {
        display: flex;
        align-items: center;
@ -1319,6 +1812,16 @@
        margin-bottom: 8px;
      }

+      .card-header-pills {
+        display: flex;
+        align-items: center;
+        gap: 6px;
+      }
+
+      .spinner-icon {
+        animation: spin 0.8s linear infinite;
+      }
+
      .card-title {
        font-size: 13px;
        font-weight: 600;
--- a/frontend/packages/inspector/src/App.tsx
+++ b/frontend/packages/inspector/src/App.tsx
@ -3,11 +3,13 @@ import {
  SandboxAgentError,
  SandboxAgent,
  type AgentInfo,
+  type CreateSessionRequest,
  type AgentModelInfo,
  type AgentModeInfo,
  type PermissionEventData,
  type QuestionEventData,
  type SessionInfo,
+  type SkillSource,
  type UniversalEvent,
  type UniversalItem
 } from "sandbox-agent";
@ -32,6 +34,41 @@ type ItemDeltaEventData = {
  delta: string;
 };

+export type McpServerEntry = {
+  name: string;
+  configJson: string;
+  error: string | null;
+};
+
+type ParsedMcpConfig = {
+  value: NonNullable<CreateSessionRequest["mcp"]>;
+  count: number;
+  error: string | null;
+};
+
+const buildMcpConfig = (entries: McpServerEntry[]): ParsedMcpConfig => {
+  if (entries.length === 0) {
+    return { value: {}, count: 0, error: null };
+  }
+  const firstError = entries.find((e) => e.error);
+  if (firstError) {
+    return { value: {}, count: entries.length, error: `${firstError.name}: ${firstError.error}` };
+  }
+  const value: NonNullable<CreateSessionRequest["mcp"]> = {};
+  for (const entry of entries) {
+    try {
+      value[entry.name] = JSON.parse(entry.configJson);
+    } catch {
+      return { value: {}, count: entries.length, error: `${entry.name}: Invalid JSON` };
+    }
+  }
+  return { value, count: entries.length, error: null };
+};
+
+const buildSkillsConfig = (sources: SkillSource[]): NonNullable<CreateSessionRequest["skills"]> => {
+  return { sources };
+};
+
 const buildStubItem = (itemId: string, nativeItemId?: string | null): UniversalItem => {
  return {
    item_id: itemId,
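Review note: `buildMcpConfig` accepts any value that `JSON.parse` returns, so an entry such as `42` or `[]` passes through as a server config. Whether the server rejects such input is not visible in this diff. A minimal sketch of a stricter per-entry check follows, assuming each server config is meant to be a JSON object; `parseMcpEntry` and `McpEntryResult` are hypothetical names, not part of the SDK or this commit.

```typescript
// Hypothetical validation helper (not part of this commit).
// Assumption: each MCP server config should be a plain JSON object.

type McpEntryResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

function parseMcpEntry(name: string, configJson: string): McpEntryResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(configJson);
  } catch {
    // Same message format as buildMcpConfig in the diff.
    return { ok: false, error: `${name}: Invalid JSON` };
  }
  // Reject primitives, null, and arrays: they parse as JSON but are not objects.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: `${name}: Expected a JSON object` };
  }
  return { ok: true, value: parsed as Record<string, unknown> };
}
```

The loop in `buildMcpConfig` could call such a helper per entry and return the first error, keeping the existing `{ value, count, error }` shape.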
@ -53,6 +90,23 @@ const getCurrentOriginEndpoint = () => {
  return window.location.origin;
 };

+const getSessionIdFromPath = (): string => {
+  const basePath = import.meta.env.BASE_URL;
+  const path = window.location.pathname;
+  const relative = path.startsWith(basePath) ? path.slice(basePath.length) : path;
+  const match = relative.match(/^sessions\/(.+)/);
+  return match ? match[1] : "";
+};
+
+const updateSessionPath = (id: string) => {
+  const basePath = import.meta.env.BASE_URL;
+  const params = window.location.search;
+  const newPath = id ? `${basePath}sessions/${id}${params}` : `${basePath}${params}`;
+  if (window.location.pathname + window.location.search !== newPath) {
+    window.history.replaceState(null, "", newPath);
+  }
+};
+
 const getInitialConnection = () => {
  if (typeof window === "undefined") {
    return { endpoint: "http://127.0.0.1:2468", token: "", headers: {} as Record<string, string>, hasUrlParam: false };
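Review note: `getSessionIdFromPath` and `updateSessionPath` read `import.meta.env.BASE_URL` and `window.location` directly, and the `(.+)` capture also picks up a trailing slash or further path segments. The sketch below shows pure variants that take those inputs as parameters and percent-encode the session id. The names `sessionIdFromPath` and `sessionPath` are hypothetical, and the encoding step is an assumption: generated ids in this diff use only `[a-z0-9-]`, so it only matters for ids that arrive from elsewhere.

```typescript
// Hypothetical pure variants of the path helpers (illustrative names).
// Inputs are passed in, so the functions run outside a browser.

function sessionIdFromPath(basePath: string, pathname: string): string {
  const relative = pathname.startsWith(basePath) ? pathname.slice(basePath.length) : pathname;
  // Capture a single path segment so a trailing slash is not included.
  const match = relative.match(/^sessions\/([^/]+)/);
  if (!match) return "";
  try {
    return decodeURIComponent(match[1]);
  } catch {
    // decodeURIComponent throws URIError on malformed percent-encoding.
    return "";
  }
}

function sessionPath(basePath: string, id: string, search: string = ""): string {
  return id
    ? `${basePath}sessions/${encodeURIComponent(id)}${search}`
    : `${basePath}${search}`;
}
```

In the component these would be called with `import.meta.env.BASE_URL`, `window.location.pathname`, and `window.location.search`, with `history.replaceState` left where it is.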
@ -103,11 +157,7 @@ export default function App() {
  const [modelsErrorByAgent, setModelsErrorByAgent] = useState<Record<string, string | null>>({});

  const [agentId, setAgentId] = useState("claude");
-  const [agentMode, setAgentMode] = useState("");
-  const [permissionMode, setPermissionMode] = useState("default");
-  const [model, setModel] = useState("");
-  const [variant, setVariant] = useState("");
-  const [sessionId, setSessionId] = useState("");
+  const [sessionId, setSessionId] = useState(getSessionIdFromPath());
  const [sessionError, setSessionError] = useState<string | null>(null);

  const [message, setMessage] = useState("");
@ -115,6 +165,8 @@ export default function App() {
  const [offset, setOffset] = useState(0);
  const offsetRef = useRef(0);
  const [eventsLoading, setEventsLoading] = useState(false);
+  const [mcpServers, setMcpServers] = useState<McpServerEntry[]>([]);
+  const [skillSources, setSkillSources] = useState<SkillSource[]>([]);

  const [polling, setPolling] = useState(false);
  const pollTimerRef = useRef<number | null>(null);
@ -377,50 +429,52 @@ export default function App() {
    stopSse();
    stopTurnStream();
    setSessionId(session.sessionId);
+    updateSessionPath(session.sessionId);
    setAgentId(session.agent);
-    setAgentMode(session.agentMode);
-    setPermissionMode(session.permissionMode);
-    setModel(session.model ?? "");
-    setVariant(session.variant ?? "");
    setEvents([]);
    setOffset(0);
    offsetRef.current = 0;
    setSessionError(null);
  };

-  const createNewSession = async (nextAgentId?: string) => {
+  const createNewSession = async (
+    nextAgentId: string,
+    config: { model: string; agentMode: string; permissionMode: string; variant: string }
+  ) => {
    stopPolling();
    stopSse();
    stopTurnStream();
-    const selectedAgent = nextAgentId ?? agentId;
-    if (nextAgentId) {
    setAgentId(nextAgentId);
+    if (parsedMcpConfig.error) {
+      setSessionError(parsedMcpConfig.error);
+      return;
    }
    const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
    let id = "session-";
    for (let i = 0; i < 8; i++) {
      id += chars[Math.floor(Math.random() * chars.length)];
    }
-    setSessionId(id);
-    setEvents([]);
-    setOffset(0);
-    offsetRef.current = 0;
    setSessionError(null);

    try {
-      const body: {
-        agent: string;
-        agentMode?: string;
-        permissionMode?: string;
-        model?: string;
-        variant?: string;
-      } = { agent: selectedAgent };
-      if (agentMode) body.agentMode = agentMode;
-      if (permissionMode) body.permissionMode = permissionMode;
-      if (model) body.model = model;
-      if (variant) body.variant = variant;
+      const body: CreateSessionRequest = { agent: nextAgentId };
+      if (config.agentMode) body.agentMode = config.agentMode;
+      if (config.permissionMode) body.permissionMode = config.permissionMode;
+      if (config.model) body.model = config.model;
+      if (config.variant) body.variant = config.variant;
+      if (parsedMcpConfig.count > 0) {
+        body.mcp = parsedMcpConfig.value;
+      }
+      if (parsedSkillsConfig.sources.length > 0) {
+        body.skills = parsedSkillsConfig;
+      }

      await getClient().createSession(id, body);
+      setSessionId(id);
+      updateSessionPath(id);
+      setEvents([]);
+      setOffset(0);
+      offsetRef.current = 0;
      await fetchSessions();
    } catch (error) {
      setSessionError(getErrorMessage(error, "Unable to create session"));
@ -762,6 +816,30 @@ export default function App() {
          });
          break;
        }
+        case "turn.started": {
+          entries.push({
+            id: event.event_id,
+            kind: "meta",
+            time: event.time,
+            meta: {
+              title: "Turn started",
+              severity: "info"
+            }
+          });
+          break;
+        }
+        case "turn.ended": {
+          entries.push({
+            id: event.event_id,
+            kind: "meta",
+            time: event.time,
+            meta: {
+              title: "Turn ended",
+              severity: "info"
+            }
+          });
+          break;
+        }
        default:
          break;
      }
@ -852,38 +930,10 @@ export default function App() {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [transcriptEntries]);

-  useEffect(() => {
-    if (connected && agentId && !modesByAgent[agentId]) {
-      loadModes(agentId);
-    }
-  }, [connected, agentId]);
-
-  useEffect(() => {
-    if (connected && agentId && !modelsByAgent[agentId]) {
-      loadModels(agentId);
-    }
-  }, [connected, agentId]);
-
-  useEffect(() => {
-    const modes = modesByAgent[agentId];
-    if (modes && modes.length > 0 && !agentMode) {
-      setAgentMode(modes[0].id);
-    }
-  }, [modesByAgent, agentId]);
-
  const currentAgent = agents.find((agent) => agent.id === agentId);
-  const activeModes = modesByAgent[agentId] ?? [];
-  const modesLoading = modesLoadingByAgent[agentId] ?? false;
-  const modesError = modesErrorByAgent[agentId] ?? null;
-  const modelOptions = modelsByAgent[agentId] ?? [];
-  const modelsLoading = modelsLoadingByAgent[agentId] ?? false;
-  const modelsError = modelsErrorByAgent[agentId] ?? null;
-  const defaultModel = defaultModelByAgent[agentId] ?? "";
-  const selectedModelId = model || defaultModel;
-  const selectedModel = modelOptions.find((entry) => entry.id === selectedModelId);
-  const variantOptions = selectedModel?.variants ?? [];
-  const defaultVariant = selectedModel?.defaultVariant ?? "";
-  const supportsVariants = Boolean(currentAgent?.capabilities?.variants);
+  const currentSessionInfo = sessions.find((s) => s.sessionId === sessionId);
+  const parsedMcpConfig = useMemo(() => buildMcpConfig(mcpServers), [mcpServers]);
+  const parsedSkillsConfig = useMemo(() => buildSkillsConfig(skillSources), [skillSources]);
  const agentDisplayNames: Record<string, string> = {
    claude: "Claude Code",
    codex: "Codex",
@ -894,6 +944,15 @@ export default function App() {
  };
  const agentLabel = agentDisplayNames[agentId] ?? agentId;

+  const handleSelectAgent = useCallback((targetAgentId: string) => {
+    if (connected && !modesByAgent[targetAgentId]) {
+      loadModes(targetAgentId);
+    }
+    if (connected && !modelsByAgent[targetAgentId]) {
+      loadModels(targetAgentId);
+    }
+  }, [connected, modesByAgent, modelsByAgent]);
+
  const handleKeyDown = (event: React.KeyboardEvent<HTMLTextAreaElement>) => {
    if (event.key === "Enter" && !event.shiftKey) {
      event.preventDefault();
@ -957,17 +1016,28 @@ export default function App() {
          onSelectSession={selectSession}
          onRefresh={fetchSessions}
          onCreateSession={createNewSession}
+          onSelectAgent={handleSelectAgent}
          agents={agents.length ? agents : defaultAgents.map((id) => ({ id, installed: false, capabilities: {} }) as AgentInfo)}
          agentsLoading={agentsLoading}
          agentsError={agentsError}
          sessionsLoading={sessionsLoading}
          sessionsError={sessionsError}
+          modesByAgent={modesByAgent}
+          modelsByAgent={modelsByAgent}
+          defaultModelByAgent={defaultModelByAgent}
+          modesLoadingByAgent={modesLoadingByAgent}
+          modelsLoadingByAgent={modelsLoadingByAgent}
+          modesErrorByAgent={modesErrorByAgent}
+          modelsErrorByAgent={modelsErrorByAgent}
+          mcpServers={mcpServers}
+          onMcpServersChange={setMcpServers}
+          mcpConfigError={parsedMcpConfig.error}
+          skillSources={skillSources}
+          onSkillSourcesChange={setSkillSources}
        />

        <ChatPanel
          sessionId={sessionId}
-          polling={polling}
-          turnStreaming={turnStreaming}
          transcriptEntries={transcriptEntries}
          sessionError={sessionError}
          message={message}
@ -975,36 +1045,19 @@ export default function App() {
          onSendMessage={sendMessage}
          onKeyDown={handleKeyDown}
          onCreateSession={createNewSession}
+          onSelectAgent={handleSelectAgent}
          agents={agents.length ? agents : defaultAgents.map((id) => ({ id, installed: false, capabilities: {} }) as AgentInfo)}
          agentsLoading={agentsLoading}
          agentsError={agentsError}
          messagesEndRef={messagesEndRef}
-          agentId={agentId}
          agentLabel={agentLabel}
-          agentMode={agentMode}
-          permissionMode={permissionMode}
-          model={model}
-          variant={variant}
-          modelOptions={modelOptions}
-          defaultModel={defaultModel}
-          modelsLoading={modelsLoading}
-          modelsError={modelsError}
-          variantOptions={variantOptions}
-          defaultVariant={defaultVariant}
-          supportsVariants={supportsVariants}
-          streamMode={streamMode}
-          activeModes={activeModes}
          currentAgentVersion={currentAgent?.version ?? null}
-          modesLoading={modesLoading}
-          modesError={modesError}
-          onAgentModeChange={setAgentMode}
-          onPermissionModeChange={setPermissionMode}
-          onModelChange={setModel}
-          onVariantChange={setVariant}
-          onStreamModeChange={setStreamMode}
-          onToggleStream={toggleStream}
+          sessionModel={currentSessionInfo?.model ?? null}
+          sessionVariant={currentSessionInfo?.variant ?? null}
+          sessionPermissionMode={currentSessionInfo?.permissionMode ?? null}
+          sessionMcpServerCount={currentSessionInfo?.mcp ? Object.keys(currentSessionInfo.mcp).length : 0}
+          sessionSkillSourceCount={currentSessionInfo?.skills?.sources?.length ?? 0}
          onEndSession={endSession}
-          hasSession={Boolean(sessionId)}
          eventError={eventError}
          questionRequests={questionRequests}
          permissionRequests={permissionRequests}
@ -1013,6 +1066,18 @@ export default function App() {
          onAnswerQuestion={answerQuestion}
          onRejectQuestion={rejectQuestion}
          onReplyPermission={replyPermission}
+          modesByAgent={modesByAgent}
+          modelsByAgent={modelsByAgent}
+          defaultModelByAgent={defaultModelByAgent}
+          modesLoadingByAgent={modesLoadingByAgent}
+          modelsLoadingByAgent={modelsLoadingByAgent}
+          modesErrorByAgent={modesErrorByAgent}
+          modelsErrorByAgent={modelsErrorByAgent}
+          mcpServers={mcpServers}
+          onMcpServersChange={setMcpServers}
+          mcpConfigError={parsedMcpConfig.error}
+          skillSources={skillSources}
+          onSkillSourcesChange={setSkillSources}
        />

        <DebugPanel
--- a/frontend/packages/inspector/src/components/SessionCreateMenu.tsx
+++ b/frontend/packages/inspector/src/components/SessionCreateMenu.tsx
@ -0,0 +1,751 @@
+import { ArrowLeft, ArrowRight, ChevronDown, ChevronRight, Pencil, Plus, X } from "lucide-react";
+import { useEffect, useRef, useState } from "react";
+import type { AgentInfo, AgentModelInfo, AgentModeInfo, SkillSource } from "sandbox-agent";
+import type { McpServerEntry } from "../App";
+
+export type SessionConfig = {
+  model: string;
+  agentMode: string;
+  permissionMode: string;
+  variant: string;
+};
+
+const agentLabels: Record<string, string> = {
+  claude: "Claude Code",
+  codex: "Codex",
+  opencode: "OpenCode",
+  amp: "Amp",
+  pi: "Pi",
+  mock: "Mock"
+};
+
+const validateServerJson = (json: string): string | null => {
+  const trimmed = json.trim();
+  if (!trimmed) return "Config is required";
+  try {
+    const parsed = JSON.parse(trimmed);
+    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
+      return "Must be a JSON object";
+    }
+    if (!parsed.type) return 'Missing "type" field';
+    if (parsed.type !== "local" && parsed.type !== "remote") {
+      return 'Type must be "local" or "remote"';
+    }
+    if (parsed.type === "local" && !parsed.command) return 'Local server requires "command"';
+    if (parsed.type === "remote" && !parsed.url) return 'Remote server requires "url"';
+    return null;
+  } catch {
+    return "Invalid JSON";
+  }
+};
+
+const getServerType = (configJson: string): string | null => {
+  try {
+    const parsed = JSON.parse(configJson);
+    return parsed?.type ?? null;
+  } catch {
+    return null;
+  }
+};
+
+const getServerSummary = (configJson: string): string => {
+  try {
+    const parsed = JSON.parse(configJson);
+    if (parsed?.type === "local") {
+      const cmd = Array.isArray(parsed.command) ? parsed.command.join(" ") : parsed.command;
+      return cmd ?? "local";
+    }
+    if (parsed?.type === "remote") {
+      return parsed.url ?? "remote";
+    }
+    return parsed?.type ?? "";
+  } catch {
+    return "";
+  }
+};
+
+const skillSourceSummary = (source: SkillSource): string => {
+  let summary = source.source;
+  if (source.skills && source.skills.length > 0) {
+    summary += ` [${source.skills.join(", ")}]`;
+  }
+  return summary;
+};
+
+const SessionCreateMenu = ({
+  agents,
+  agentsLoading,
+  agentsError,
+  modesByAgent,
+  modelsByAgent,
+  defaultModelByAgent,
+  modesLoadingByAgent,
+  modelsLoadingByAgent,
+  modesErrorByAgent,
+  modelsErrorByAgent,
+  mcpServers,
+  onMcpServersChange,
+  mcpConfigError,
+  skillSources,
+  onSkillSourcesChange,
+  onSelectAgent,
+  onCreateSession,
+  open,
+  onClose
+}: {
+  agents: AgentInfo[];
+  agentsLoading: boolean;
+  agentsError: string | null;
+  modesByAgent: Record<string, AgentModeInfo[]>;
+  modelsByAgent: Record<string, AgentModelInfo[]>;
+  defaultModelByAgent: Record<string, string>;
+  modesLoadingByAgent: Record<string, boolean>;
+  modelsLoadingByAgent: Record<string, boolean>;
+  modesErrorByAgent: Record<string, string | null>;
+  modelsErrorByAgent: Record<string, string | null>;
+  mcpServers: McpServerEntry[];
+  onMcpServersChange: (servers: McpServerEntry[]) => void;
+  mcpConfigError: string | null;
+  skillSources: SkillSource[];
+  onSkillSourcesChange: (sources: SkillSource[]) => void;
+  onSelectAgent: (agentId: string) => void;
+  onCreateSession: (agentId: string, config: SessionConfig) => void;
+  open: boolean;
+  onClose: () => void;
+}) => {
+  const [phase, setPhase] = useState<"agent" | "config">("agent");
+  const [selectedAgent, setSelectedAgent] = useState("");
+  const [agentMode, setAgentMode] = useState("");
+  const [permissionMode, setPermissionMode] = useState("default");
+  const [model, setModel] = useState("");
+  const [variant, setVariant] = useState("");
+
+  const [mcpExpanded, setMcpExpanded] = useState(false);
+  const [skillsExpanded, setSkillsExpanded] = useState(false);
+
+  // Skill add/edit state
+  const [addingSkill, setAddingSkill] = useState(false);
+  const [editingSkillIndex, setEditingSkillIndex] = useState<number | null>(null);
+  const [skillType, setSkillType] = useState<"github" | "local" | "git">("github");
+  const [skillSource, setSkillSource] = useState("");
+  const [skillFilter, setSkillFilter] = useState("");
+  const [skillRef, setSkillRef] = useState("");
+  const [skillSubpath, setSkillSubpath] = useState("");
+  const [skillLocalError, setSkillLocalError] = useState<string | null>(null);
+  const skillSourceRef = useRef<HTMLInputElement>(null);
+
+  // MCP add/edit state
+  const [addingMcp, setAddingMcp] = useState(false);
+  const [editingMcpIndex, setEditingMcpIndex] = useState<number | null>(null);
+  const [mcpName, setMcpName] = useState("");
+  const [mcpJson, setMcpJson] = useState("");
+  const [mcpLocalError, setMcpLocalError] = useState<string | null>(null);
+  const mcpNameRef = useRef<HTMLInputElement>(null);
+  const mcpJsonRef = useRef<HTMLTextAreaElement>(null);
+
+  const cancelSkillEdit = () => {
+    setAddingSkill(false);
+    setEditingSkillIndex(null);
+    setSkillType("github");
+    setSkillSource("");
+    setSkillFilter("");
+    setSkillRef("");
+    setSkillSubpath("");
+    setSkillLocalError(null);
+  };
+
+  // Reset state when menu closes
+  useEffect(() => {
+    if (!open) {
+      setPhase("agent");
+      setSelectedAgent("");
+      setAgentMode("");
+      setPermissionMode("default");
+      setModel("");
+      setVariant("");
+      setMcpExpanded(false);
+      setSkillsExpanded(false);
+      cancelSkillEdit();
+      setAddingMcp(false);
+      setEditingMcpIndex(null);
+      setMcpName("");
+      setMcpJson("");
+      setMcpLocalError(null);
+    }
+  }, [open]);
+
+  // Auto-select first mode when modes load for selected agent
+  useEffect(() => {
+    if (!selectedAgent) return;
+    const modes = modesByAgent[selectedAgent];
+    if (modes && modes.length > 0 && !agentMode) {
+      setAgentMode(modes[0].id);
+    }
+  }, [modesByAgent, selectedAgent, agentMode]);
+
+  // Focus skill source input when adding or editing a skill source
+  useEffect(() => {
+    if ((addingSkill || editingSkillIndex !== null) && skillSourceRef.current) {
+      skillSourceRef.current.focus();
+    }
+  }, [addingSkill, editingSkillIndex]);
+
+  // Focus MCP name input when adding
+  useEffect(() => {
+    if (addingMcp && mcpNameRef.current) {
+      mcpNameRef.current.focus();
+    }
+  }, [addingMcp]);
+
+  // Focus MCP json textarea when editing
+  useEffect(() => {
+    if (editingMcpIndex !== null && mcpJsonRef.current) {
+      mcpJsonRef.current.focus();
+    }
+  }, [editingMcpIndex]);
+
+  if (!open) return null;
+
+  const handleAgentClick = (agentId: string) => {
+    setSelectedAgent(agentId);
+    setPhase("config");
+    onSelectAgent(agentId);
+  };
+
+  const handleBack = () => {
+    setPhase("agent");
+    setSelectedAgent("");
+    setAgentMode("");
+    setPermissionMode("default");
+    setModel("");
+    setVariant("");
+  };
+
+  const handleCreate = () => {
+    if (mcpConfigError) return;
+    onCreateSession(selectedAgent, { model, agentMode, permissionMode, variant });
+    onClose();
+  };
+
+  // Skill source helpers
+  const startAddSkill = () => {
+    setAddingSkill(true);
+    setEditingSkillIndex(null);
+    setSkillType("github");
+    setSkillSource("rivet-dev/skills");
+    setSkillFilter("sandbox-agent");
+    setSkillRef("");
+    setSkillSubpath("");
+    setSkillLocalError(null);
+  };
+
+  const startEditSkill = (index: number) => {
+    const entry = skillSources[index];
+    setEditingSkillIndex(index);
+    setAddingSkill(false);
+    setSkillType(entry.type as "github" | "local" | "git");
+    setSkillSource(entry.source);
+    setSkillFilter(entry.skills?.join(", ") ?? "");
+    setSkillRef(entry.ref ?? "");
+    setSkillSubpath(entry.subpath ?? "");
+    setSkillLocalError(null);
+  };
+
+  const commitSkill = () => {
+    const src = skillSource.trim();
+    if (!src) {
+      setSkillLocalError("Source is required");
+      return;
+    }
+    const entry: SkillSource = {
+      type: skillType,
+      source: src,
+    };
+    const filterList = skillFilter.trim()
+      ? skillFilter.split(",").map((s) => s.trim()).filter(Boolean)
+      : undefined;
+    if (filterList && filterList.length > 0) entry.skills = filterList;
+    if (skillType !== "local" && skillRef.trim()) entry.ref = skillRef.trim();
+    if (skillType !== "local" && skillSubpath.trim()) entry.subpath = skillSubpath.trim();
+
+    if (editingSkillIndex !== null) {
+      const updated = [...skillSources];
+      updated[editingSkillIndex] = entry;
+      onSkillSourcesChange(updated);
+    } else {
+      onSkillSourcesChange([...skillSources, entry]);
+    }
+    cancelSkillEdit();
+  };
+
+  const removeSkill = (index: number) => {
+    onSkillSourcesChange(skillSources.filter((_, i) => i !== index));
+    if (editingSkillIndex === index) {
+      cancelSkillEdit();
+    }
+  };
+
+  const isEditingSkill = addingSkill || editingSkillIndex !== null;
+
+  const startAddMcp = () => {
+    setAddingMcp(true);
+    setEditingMcpIndex(null);
+    setMcpName("everything");
+    setMcpJson('{\n  "type": "local",\n  "command": "npx",\n  "args": ["@modelcontextprotocol/server-everything"]\n}');
+    setMcpLocalError(null);
+  };
+
+  const startEditMcp = (index: number) => {
+    const entry = mcpServers[index];
+    setEditingMcpIndex(index);
+    setAddingMcp(false);
+    setMcpName(entry.name);
+    setMcpJson(entry.configJson);
+    setMcpLocalError(entry.error);
+  };
+
+  const cancelMcpEdit = () => {
+    setAddingMcp(false);
+    setEditingMcpIndex(null);
+    setMcpName("");
+    setMcpJson("");
+    setMcpLocalError(null);
+  };
+
+  const commitMcp = () => {
+    const name = mcpName.trim();
+    if (!name) {
+      setMcpLocalError("Server name is required");
+      return;
+    }
+    const error = validateServerJson(mcpJson);
+    if (error) {
+      setMcpLocalError(error);
+      return;
+    }
+    // Check for duplicate names (except when editing the same entry)
+    const duplicate = mcpServers.findIndex((e) => e.name === name);
+    if (duplicate !== -1 && duplicate !== editingMcpIndex) {
+      setMcpLocalError(`Server "${name}" already exists`);
+      return;
+    }
+
+    const entry: McpServerEntry = { name, configJson: mcpJson.trim(), error: null };
+
+    if (editingMcpIndex !== null) {
+      const updated = [...mcpServers];
+      updated[editingMcpIndex] = entry;
+      onMcpServersChange(updated);
+    } else {
+      onMcpServersChange([...mcpServers, entry]);
+    }
+    cancelMcpEdit();
+  };
+
+  const removeMcp = (index: number) => {
+    onMcpServersChange(mcpServers.filter((_, i) => i !== index));
+    if (editingMcpIndex === index) {
+      cancelMcpEdit();
+    }
+  };
+
+  const isEditingMcp = addingMcp || editingMcpIndex !== null;
+
+  if (phase === "agent") {
+    return (
+      <div className="session-create-menu">
+        {agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
+        {agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
+        {!agentsLoading && !agentsError && agents.length === 0 && (
+          <div className="sidebar-add-status">No agents available.</div>
+        )}
+        {!agentsLoading && !agentsError &&
+          agents.map((agent) => (
+            <button
+              key={agent.id}
+              className="sidebar-add-option"
+              onClick={() => handleAgentClick(agent.id)}
+            >
+              <div className="agent-option-left">
+                <span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
+                {agent.version && <span className="agent-option-version">{agent.version}</span>}
+              </div>
+              <div className="agent-option-badges">
+                {agent.installed && <span className="agent-badge installed">Installed</span>}
+                <ArrowRight size={12} className="agent-option-arrow" />
+              </div>
+            </button>
+          ))}
+      </div>
+    );
+  }
+
+  // Phase 2: config form
+  const activeModes = modesByAgent[selectedAgent] ?? [];
+  const modesLoading = modesLoadingByAgent[selectedAgent] ?? false;
+  const modesError = modesErrorByAgent[selectedAgent] ?? null;
+  const modelOptions = modelsByAgent[selectedAgent] ?? [];
+  const modelsLoading = modelsLoadingByAgent[selectedAgent] ?? false;
+  const modelsError = modelsErrorByAgent[selectedAgent] ?? null;
+  const defaultModel = defaultModelByAgent[selectedAgent] ?? "";
+  const selectedModelId = model || defaultModel;
+  const selectedModelObj = modelOptions.find((entry) => entry.id === selectedModelId);
+  const variantOptions = selectedModelObj?.variants ?? [];
+  const showModelSelect = modelsLoading || Boolean(modelsError) || modelOptions.length > 0;
+  const hasModelOptions = modelOptions.length > 0;
+  const modelCustom =
+    model && hasModelOptions && !modelOptions.some((entry) => entry.id === model);
+  const supportsVariants =
+    modelsLoading ||
+    Boolean(modelsError) ||
+    modelOptions.some((entry) => (entry.variants?.length ?? 0) > 0);
+  const showVariantSelect =
+    supportsVariants && (modelsLoading || Boolean(modelsError) || variantOptions.length > 0);
+  const hasVariantOptions = variantOptions.length > 0;
+  const variantCustom = variant && hasVariantOptions && !variantOptions.includes(variant);
+  const agentLabel = agentLabels[selectedAgent] ?? selectedAgent;
+
+  return (
+    <div className="session-create-menu">
+      <div className="session-create-header">
+        <button className="session-create-back" onClick={handleBack} title="Back to agents">
+          <ArrowLeft size={14} />
+        </button>
+        <span className="session-create-agent-name">{agentLabel}</span>
+      </div>
+
+      <div className="session-create-form">
+        <div className="setup-field">
+          <span className="setup-label">Model</span>
+          {showModelSelect ? (
+            <select
+              className="setup-select"
+              value={model}
+              onChange={(e) => { setModel(e.target.value); setVariant(""); }}
+              title="Model"
+              disabled={modelsLoading || Boolean(modelsError)}
+            >
+              {modelsLoading ? (
+                <option value="">Loading models...</option>
+              ) : modelsError ? (
+                <option value="">{modelsError}</option>
+              ) : (
+                <>
+                  <option value="">
+                    {defaultModel ? `Default (${defaultModel})` : "Default"}
+                  </option>
+                  {modelCustom && <option value={model}>{model} (custom)</option>}
+                  {modelOptions.map((entry) => (
+                    <option key={entry.id} value={entry.id}>
+                      {entry.name ?? entry.id}
+                    </option>
+                  ))}
+                </>
+              )}
+            </select>
+          ) : (
+            <input
+              className="setup-input"
+              value={model}
+              onChange={(e) => setModel(e.target.value)}
+              placeholder="Model"
+              title="Model"
+            />
+          )}
+        </div>
+
+        <div className="setup-field">
+          <span className="setup-label">Mode</span>
+          <select
+            className="setup-select"
+            value={agentMode}
+            onChange={(e) => setAgentMode(e.target.value)}
+            title="Mode"
+            disabled={modesLoading || Boolean(modesError)}
+          >
+            {modesLoading ? (
+              <option value="">Loading modes...</option>
+            ) : modesError ? (
+              <option value="">{modesError}</option>
+            ) : activeModes.length > 0 ? (
+              activeModes.map((m) => (
+                <option key={m.id} value={m.id}>
+                  {m.name || m.id}
+                </option>
+              ))
+            ) : (
+              <option value="">Mode</option>
+            )}
+          </select>
+        </div>
+
+        <div className="setup-field">
+          <span className="setup-label">Permission</span>
+          <select
+            className="setup-select"
+            value={permissionMode}
+            onChange={(e) => setPermissionMode(e.target.value)}
+            title="Permission Mode"
+          >
+            <option value="default">Default</option>
+            <option value="plan">Plan</option>
+            <option value="bypass">Bypass</option>
+          </select>
+        </div>
+
+        {supportsVariants && (
+          <div className="setup-field">
+            <span className="setup-label">Variant</span>
+            {showVariantSelect ? (
+              <select
+                className="setup-select"
+                value={variant}
+                onChange={(e) => setVariant(e.target.value)}
+                title="Variant"
+                disabled={modelsLoading || Boolean(modelsError)}
+              >
+                {modelsLoading ? (
+                  <option value="">Loading variants...</option>
+                ) : modelsError ? (
+                  <option value="">{modelsError}</option>
+                ) : (
+                  <>
+                    <option value="">Default</option>
+                    {variantCustom && <option value={variant}>{variant} (custom)</option>}
+                    {variantOptions.map((entry) => (
+                      <option key={entry} value={entry}>
+                        {entry}
+                      </option>
+                    ))}
+                  </>
+                )}
+              </select>
+            ) : (
+              <input
+                className="setup-input"
+                value={variant}
+                onChange={(e) => setVariant(e.target.value)}
+                placeholder="Variant"
+                title="Variant"
+              />
+            )}
+          </div>
+        )}
+
+        {/* MCP Servers - collapsible */}
+        <div className="session-create-section">
+          <button
+            type="button"
+            className="session-create-section-toggle"
+            onClick={() => setMcpExpanded(!mcpExpanded)}
+          >
+            <span className="setup-label">MCP</span>
+            <span className="session-create-section-count">{mcpServers.length} server{mcpServers.length !== 1 ? "s" : ""}</span>
+            {mcpExpanded ? <ChevronDown size={12} className="session-create-section-arrow" /> : <ChevronRight size={12} className="session-create-section-arrow" />}
+          </button>
+          {mcpExpanded && (
+            <div className="session-create-section-body">
+              {mcpServers.length > 0 && !isEditingMcp && (
+                <div className="session-create-mcp-list">
+                  {mcpServers.map((entry, index) => (
+                    <div key={entry.name} className="session-create-mcp-item">
+                      <div className="session-create-mcp-info">
+                        <span className="session-create-mcp-name">{entry.name}</span>
+                        {getServerType(entry.configJson) && (
+                          <span className="session-create-mcp-type">{getServerType(entry.configJson)}</span>
+                        )}
+                        <span className="session-create-mcp-summary mono">{getServerSummary(entry.configJson)}</span>
+                      </div>
+                      <div className="session-create-mcp-actions">
+                        <button
+                          type="button"
+                          className="session-create-skill-remove"
+                          onClick={() => startEditMcp(index)}
+                          title="Edit server"
+                        >
+                          <Pencil size={10} />
+                        </button>
+                        <button
+                          type="button"
+                          className="session-create-skill-remove"
+                          onClick={() => removeMcp(index)}
+                          title="Remove server"
+                        >
+                          <X size={12} />
+                        </button>
+                      </div>
+                    </div>
+                  ))}
+                </div>
+              )}
+              {isEditingMcp ? (
+                <div className="session-create-mcp-edit">
+                  <input
+                    ref={mcpNameRef}
+                    className="session-create-mcp-name-input"
+                    value={mcpName}
+                    onChange={(e) => { setMcpName(e.target.value); setMcpLocalError(null); }}
+                    placeholder="server-name"
+                    disabled={editingMcpIndex !== null}
+                  />
+                  <textarea
+                    ref={mcpJsonRef}
+                    className="session-create-textarea mono"
+                    value={mcpJson}
+                    onChange={(e) => { setMcpJson(e.target.value); setMcpLocalError(null); }}
+                    placeholder='{"type":"local","command":"node","args":["./server.js"]}'
+                    rows={4}
+                  />
+                  {mcpLocalError && (
+                    <div className="session-create-inline-error">{mcpLocalError}</div>
+                  )}
+                  <div className="session-create-mcp-edit-actions">
+                    <button type="button" className="session-create-mcp-save" onClick={commitMcp}>
+                      {editingMcpIndex !== null ? "Save" : "Add"}
+                    </button>
+                    <button type="button" className="session-create-mcp-cancel" onClick={cancelMcpEdit}>
+                      Cancel
+                    </button>
+                  </div>
+                </div>
+              ) : (
+                <button
+                  type="button"
+                  className="session-create-add-btn"
+                  onClick={startAddMcp}
+                >
+                  <Plus size={12} />
+                  Add server
+                </button>
+              )}
+              {mcpConfigError && !isEditingMcp && (
+                <div className="session-create-inline-error">{mcpConfigError}</div>
+              )}
+            </div>
+          )}
+        </div>
+
+        {/* Skills - collapsible with source-based list */}
+        <div className="session-create-section">
+          <button
+            type="button"
+            className="session-create-section-toggle"
+            onClick={() => setSkillsExpanded(!skillsExpanded)}
+          >
+            <span className="setup-label">Skills</span>
+            <span className="session-create-section-count">{skillSources.length} source{skillSources.length !== 1 ? "s" : ""}</span>
+            {skillsExpanded ? <ChevronDown size={12} className="session-create-section-arrow" /> : <ChevronRight size={12} className="session-create-section-arrow" />}
+          </button>
+          {skillsExpanded && (
+            <div className="session-create-section-body">
+              {skillSources.length > 0 && !isEditingSkill && (
+                <div className="session-create-skill-list">
+                  {skillSources.map((entry, index) => (
+                    <div key={`${entry.type}-${entry.source}-${index}`} className="session-create-skill-item">
+                      <span className="session-create-skill-type-badge">{entry.type}</span>
+                      <span className="session-create-skill-path mono">{skillSourceSummary(entry)}</span>
+                      <div className="session-create-mcp-actions">
+                        <button
+                          type="button"
+                          className="session-create-skill-remove"
+                          onClick={() => startEditSkill(index)}
+                          title="Edit source"
+                        >
+                          <Pencil size={10} />
+                        </button>
+                        <button
+                          type="button"
+                          className="session-create-skill-remove"
+                          onClick={() => removeSkill(index)}
+                          title="Remove source"
+                        >
+                          <X size={12} />
+                        </button>
+                      </div>
+                    </div>
+                  ))}
+                </div>
+              )}
+              {isEditingSkill ? (
+                <div className="session-create-mcp-edit">
+                  <div className="session-create-skill-type-row">
+                    <select
+                      className="session-create-skill-type-select"
+                      value={skillType}
+                      onChange={(e) => { setSkillType(e.target.value as "github" | "local" | "git"); setSkillLocalError(null); }}
+                    >
+                      <option value="github">github</option>
+                      <option value="local">local</option>
+                      <option value="git">git</option>
+                    </select>
+                    <input
+                      ref={skillSourceRef}
+                      className="session-create-skill-input mono"
+                      value={skillSource}
+                      onChange={(e) => { setSkillSource(e.target.value); setSkillLocalError(null); }}
+                      placeholder={skillType === "github" ? "owner/repo" : skillType === "local" ? "/path/to/skill" : "https://git.example.com/repo.git"}
+                    />
+                  </div>
+                  <input
+                    className="session-create-skill-input mono"
+                    value={skillFilter}
+                    onChange={(e) => setSkillFilter(e.target.value)}
+                    placeholder="Filter skills (comma-separated, optional)"
+                  />
+                  {skillType !== "local" && (
+                    <div className="session-create-skill-type-row">
+                      <input
+                        className="session-create-skill-input mono"
+                        value={skillRef}
+                        onChange={(e) => setSkillRef(e.target.value)}
+                        placeholder="Branch/tag (optional)"
+                      />
+                      <input
+                        className="session-create-skill-input mono"
+                        value={skillSubpath}
+                        onChange={(e) => setSkillSubpath(e.target.value)}
+                        placeholder="Subpath (optional)"
+                      />
+                    </div>
+                  )}
+                  {skillLocalError && (
+                    <div className="session-create-inline-error">{skillLocalError}</div>
+                  )}
+                  <div className="session-create-mcp-edit-actions">
+                    <button type="button" className="session-create-mcp-save" onClick={commitSkill}>
+                      {editingSkillIndex !== null ? "Save" : "Add"}
+                    </button>
+                    <button type="button" className="session-create-mcp-cancel" onClick={cancelSkillEdit}>
+                      Cancel
+                    </button>
+                  </div>
+                </div>
+              ) : (
+                <button
+                  type="button"
+                  className="session-create-add-btn"
+                  onClick={startAddSkill}
+                >
+                  <Plus size={12} />
+                  Add source
+                </button>
+              )}
+            </div>
+          )}
+        </div>
+      </div>
+
+      <div className="session-create-actions">
+        <button
+          className="button primary"
+          onClick={handleCreate}
+          disabled={Boolean(mcpConfigError)}
+        >
+          Create Session
+        </button>
+      </div>
+    </div>
+  );
+};
+
+export default SessionCreateMenu;
--- a/frontend/packages/inspector/src/components/SessionSidebar.tsx
+++ b/frontend/packages/inspector/src/components/SessionSidebar.tsx
@ -1,6 +1,17 @@
 import { Plus, RefreshCw } from "lucide-react";
 import { useEffect, useRef, useState } from "react";
-import type { AgentInfo, SessionInfo } from "sandbox-agent";
+import type { AgentInfo, AgentModelInfo, AgentModeInfo, SessionInfo, SkillSource } from "sandbox-agent";
+import type { McpServerEntry } from "../App";
+import SessionCreateMenu, { type SessionConfig } from "./SessionCreateMenu";
+
+const agentLabels: Record<string, string> = {
+  claude: "Claude Code",
+  codex: "Codex",
+  opencode: "OpenCode",
+  amp: "Amp",
+  pi: "Pi",
+  mock: "Mock"
+};

 const SessionSidebar = ({
  sessions,
@ -8,22 +19,48 @@ const SessionSidebar = ({
  onSelectSession,
  onRefresh,
  onCreateSession,
+  onSelectAgent,
  agents,
  agentsLoading,
  agentsError,
  sessionsLoading,
-  sessionsError
+  sessionsError,
+  modesByAgent,
+  modelsByAgent,
+  defaultModelByAgent,
+  modesLoadingByAgent,
+  modelsLoadingByAgent,
+  modesErrorByAgent,
+  modelsErrorByAgent,
+  mcpServers,
+  onMcpServersChange,
+  mcpConfigError,
+  skillSources,
+  onSkillSourcesChange
 }: {
  sessions: SessionInfo[];
  selectedSessionId: string;
  onSelectSession: (session: SessionInfo) => void;
  onRefresh: () => void;
-  onCreateSession: (agentId: string) => void;
+  onCreateSession: (agentId: string, config: SessionConfig) => void;
+  onSelectAgent: (agentId: string) => void;
  agents: AgentInfo[];
  agentsLoading: boolean;
  agentsError: string | null;
  sessionsLoading: boolean;
  sessionsError: string | null;
+  modesByAgent: Record<string, AgentModeInfo[]>;
+  modelsByAgent: Record<string, AgentModelInfo[]>;
+  defaultModelByAgent: Record<string, string>;
+  modesLoadingByAgent: Record<string, boolean>;
+  modelsLoadingByAgent: Record<string, boolean>;
+  modesErrorByAgent: Record<string, string | null>;
+  modelsErrorByAgent: Record<string, string | null>;
+  mcpServers: McpServerEntry[];
+  onMcpServersChange: (servers: McpServerEntry[]) => void;
+  mcpConfigError: string | null;
+  skillSources: SkillSource[];
+  onSkillSourcesChange: (sources: SkillSource[]) => void;
 }) => {
  const [showMenu, setShowMenu] = useState(false);
  const menuRef = useRef<HTMLDivElement | null>(null);
@ -40,15 +77,6 @@ const SessionSidebar = ({
    return () => document.removeEventListener("mousedown", handler);
  }, [showMenu]);

-  const agentLabels: Record<string, string> = {
-    claude: "Claude Code",
-    codex: "Codex",
-    opencode: "OpenCode",
-    amp: "Amp",
-    pi: "Pi",
-    mock: "Mock"
-  };
-
  return (
    <div className="session-sidebar">
      <div className="sidebar-header">
@ -65,32 +93,27 @@ const SessionSidebar = ({
            >
              <Plus size={14} />
            </button>
-            {showMenu && (
-              <div className="sidebar-add-menu">
-                {agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
-                {agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
-                {!agentsLoading && !agentsError && agents.length === 0 && (
-                  <div className="sidebar-add-status">No agents available.</div>
-                )}
-                {!agentsLoading && !agentsError &&
-                  agents.map((agent) => (
-                    <button
-                      key={agent.id}
-                      className="sidebar-add-option"
-                      onClick={() => {
-                        onCreateSession(agent.id);
-                        setShowMenu(false);
-                      }}
-                    >
-                      <div className="agent-option-left">
-                        <span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
-                        {agent.version && <span className="agent-badge version">v{agent.version}</span>}
-                      </div>
-                      {agent.installed && <span className="agent-badge installed">Installed</span>}
-                    </button>
-                  ))}
-              </div>
-            )}
+            <SessionCreateMenu
+              agents={agents}
+              agentsLoading={agentsLoading}
+              agentsError={agentsError}
+              modesByAgent={modesByAgent}
+              modelsByAgent={modelsByAgent}
+              defaultModelByAgent={defaultModelByAgent}
+              modesLoadingByAgent={modesLoadingByAgent}
+              modelsLoadingByAgent={modelsLoadingByAgent}
+              modesErrorByAgent={modesErrorByAgent}
+              modelsErrorByAgent={modelsErrorByAgent}
+              mcpServers={mcpServers}
+              onMcpServersChange={onMcpServersChange}
+              mcpConfigError={mcpConfigError}
+              skillSources={skillSources}
+              onSkillSourcesChange={onSkillSourcesChange}
+              onSelectAgent={onSelectAgent}
+              onCreateSession={onCreateSession}
+              open={showMenu}
+              onClose={() => setShowMenu(false)}
+            />
          </div>
        </div>
      </div>
--- a/frontend/packages/inspector/src/components/chat/ChatPanel.tsx
+++ b/frontend/packages/inspector/src/components/chat/ChatPanel.tsx
@ -1,16 +1,15 @@
-import { MessageSquare, PauseCircle, PlayCircle, Plus, Square, Terminal } from "lucide-react";
+import { MessageSquare, Plus, Square, Terminal } from "lucide-react";
 import { useEffect, useRef, useState } from "react";
-import type { AgentInfo, AgentModelInfo, AgentModeInfo, PermissionEventData, QuestionEventData } from "sandbox-agent";
+import type { AgentInfo, AgentModelInfo, AgentModeInfo, PermissionEventData, QuestionEventData, SkillSource } from "sandbox-agent";
+import type { McpServerEntry } from "../../App";
 import ApprovalsTab from "../debug/ApprovalsTab";
+import SessionCreateMenu, { type SessionConfig } from "../SessionCreateMenu";
 import ChatInput from "./ChatInput";
 import ChatMessages from "./ChatMessages";
-import ChatSetup from "./ChatSetup";
 import type { TimelineEntry } from "./types";

 const ChatPanel = ({
  sessionId,
-  polling,
-  turnStreaming,
  transcriptEntries,
  sessionError,
  message,
@ -18,35 +17,18 @@ const ChatPanel = ({
  onSendMessage,
  onKeyDown,
  onCreateSession,
+  onSelectAgent,
  agents,
  agentsLoading,
  agentsError,
  messagesEndRef,
-  agentId,
  agentLabel,
-  agentMode,
-  permissionMode,
-  model,
-  variant,
-  modelOptions,
-  defaultModel,
-  modelsLoading,
-  modelsError,
-  variantOptions,
-  defaultVariant,
-  supportsVariants,
-  streamMode,
-  activeModes,
  currentAgentVersion,
-  hasSession,
-  modesLoading,
-  modesError,
-  onAgentModeChange,
-  onPermissionModeChange,
-  onModelChange,
-  onVariantChange,
-  onStreamModeChange,
-  onToggleStream,
+  sessionModel,
+  sessionVariant,
+  sessionPermissionMode,
+  sessionMcpServerCount,
+  sessionSkillSourceCount,
  onEndSession,
  eventError,
  questionRequests,
@ -55,47 +37,40 @@ const ChatPanel = ({
  onSelectQuestionOption,
  onAnswerQuestion,
  onRejectQuestion,
-  onReplyPermission
+  onReplyPermission,
+  modesByAgent,
+  modelsByAgent,
+  defaultModelByAgent,
+  modesLoadingByAgent,
+  modelsLoadingByAgent,
+  modesErrorByAgent,
+  modelsErrorByAgent,
+  mcpServers,
+  onMcpServersChange,
+  mcpConfigError,
+  skillSources,
+  onSkillSourcesChange
 }: {
  sessionId: string;
-  polling: boolean;
-  turnStreaming: boolean;
  transcriptEntries: TimelineEntry[];
  sessionError: string | null;
  message: string;
  onMessageChange: (value: string) => void;
  onSendMessage: () => void;
  onKeyDown: (event: React.KeyboardEvent<HTMLTextAreaElement>) => void;
-  onCreateSession: (agentId: string) => void;
+  onCreateSession: (agentId: string, config: SessionConfig) => void;
+  onSelectAgent: (agentId: string) => void;
  agents: AgentInfo[];
  agentsLoading: boolean;
  agentsError: string | null;
  messagesEndRef: React.RefObject<HTMLDivElement>;
-  agentId: string;
  agentLabel: string;
-  agentMode: string;
-  permissionMode: string;
-  model: string;
-  variant: string;
-  modelOptions: AgentModelInfo[];
-  defaultModel: string;
-  modelsLoading: boolean;
-  modelsError: string | null;
-  variantOptions: string[];
-  defaultVariant: string;
-  supportsVariants: boolean;
-  streamMode: "poll" | "sse" | "turn";
-  activeModes: AgentModeInfo[];
  currentAgentVersion?: string | null;
-  hasSession: boolean;
-  modesLoading: boolean;
-  modesError: string | null;
-  onAgentModeChange: (value: string) => void;
-  onPermissionModeChange: (value: string) => void;
-  onModelChange: (value: string) => void;
-  onVariantChange: (value: string) => void;
-  onStreamModeChange: (value: "poll" | "sse" | "turn") => void;
-  onToggleStream: () => void;
+  sessionModel?: string | null;
+  sessionVariant?: string | null;
+  sessionPermissionMode?: string | null;
+  sessionMcpServerCount: number;
+  sessionSkillSourceCount: number;
  onEndSession: () => void;
  eventError: string | null;
  questionRequests: QuestionEventData[];
@ -105,6 +80,18 @@ const ChatPanel = ({
  onAnswerQuestion: (request: QuestionEventData) => void;
  onRejectQuestion: (requestId: string) => void;
  onReplyPermission: (requestId: string, reply: "once" | "always" | "reject") => void;
+  modesByAgent: Record<string, AgentModeInfo[]>;
+  modelsByAgent: Record<string, AgentModelInfo[]>;
+  defaultModelByAgent: Record<string, string>;
+  modesLoadingByAgent: Record<string, boolean>;
+  modelsLoadingByAgent: Record<string, boolean>;
+  modesErrorByAgent: Record<string, string | null>;
+  modelsErrorByAgent: Record<string, string | null>;
+  mcpServers: McpServerEntry[];
+  onMcpServersChange: (servers: McpServerEntry[]) => void;
+  mcpConfigError: string | null;
+  skillSources: SkillSource[];
+  onSkillSourcesChange: (sources: SkillSource[]) => void;
 }) => {
  const [showAgentMenu, setShowAgentMenu] = useState(false);
  const menuRef = useRef<HTMLDivElement | null>(null);
@ -121,19 +108,7 @@ const ChatPanel = ({
    return () => document.removeEventListener("mousedown", handler);
  }, [showAgentMenu]);

-  const agentLabels: Record<string, string> = {
-    claude: "Claude Code",
-    codex: "Codex",
-    opencode: "OpenCode",
-    amp: "Amp",
-    pi: "Pi",
-    mock: "Mock"
-  };
-
  const hasApprovals = questionRequests.length > 0 || permissionRequests.length > 0;
-  const isTurnMode = streamMode === "turn";
-  const isStreaming = isTurnMode ? turnStreaming : polling;
-  const turnLabel = turnStreaming ? "Streaming" : "On Send";

  return (
    <div className="chat-panel">
@ -142,12 +117,6 @@ const ChatPanel = ({
          <MessageSquare className="button-icon" />
          <span className="panel-title">{sessionId ? "Session" : "No Session"}</span>
          {sessionId && <span className="session-id-display">{sessionId}</span>}
-          {sessionId && (
-            <span className="session-agent-display">
-              {agentLabel}
-              {currentAgentVersion && <span className="session-agent-version">v{currentAgentVersion}</span>}
-            </span>
-          )}
        </div>
        <div className="panel-header-right">
          {sessionId && (
@ -161,42 +130,6 @@ const ChatPanel = ({
              End
            </button>
          )}
-          <div className="setup-stream">
-            <select
-              className="setup-select-small"
-              value={streamMode}
-              onChange={(e) => onStreamModeChange(e.target.value as "poll" | "sse" | "turn")}
-              title="Stream Mode"
-              disabled={!sessionId}
-            >
-              <option value="poll">Poll</option>
-              <option value="sse">SSE</option>
-              <option value="turn">Turn</option>
-            </select>
-            <button
-              className={`setup-stream-btn ${isStreaming ? "active" : ""}`}
-              onClick={onToggleStream}
-              title={isTurnMode ? "Turn streaming starts on send" : polling ? "Stop streaming" : "Start streaming"}
-              disabled={!sessionId || isTurnMode}
-            >
-              {isTurnMode ? (
-                <>
-                  <PlayCircle size={14} />
-                  <span>{turnLabel}</span>
-                </>
-              ) : polling ? (
-                <>
-                  <PauseCircle size={14} />
-                  <span>Pause</span>
-                </>
-              ) : (
-                <>
-                  <PlayCircle size={14} />
-                  <span>Resume</span>
-                </>
-              )}
-            </button>
-          </div>
        </div>
      </div>

@ -214,32 +147,27 @@ const ChatPanel = ({
                <Plus className="button-icon" />
                Create Session
              </button>
-              {showAgentMenu && (
-                <div className="empty-state-menu">
-                  {agentsLoading && <div className="sidebar-add-status">Loading agents...</div>}
-                  {agentsError && <div className="sidebar-add-status error">{agentsError}</div>}
-                  {!agentsLoading && !agentsError && agents.length === 0 && (
-                    <div className="sidebar-add-status">No agents available.</div>
-                  )}
-                  {!agentsLoading && !agentsError &&
-                    agents.map((agent) => (
-                      <button
-                        key={agent.id}
-                        className="sidebar-add-option"
-                        onClick={() => {
-                          onCreateSession(agent.id);
-                          setShowAgentMenu(false);
-                        }}
-                      >
-                        <div className="agent-option-left">
-                          <span className="agent-option-name">{agentLabels[agent.id] ?? agent.id}</span>
-                          {agent.version && <span className="agent-badge version">v{agent.version}</span>}
-                        </div>
-                        {agent.installed && <span className="agent-badge installed">Installed</span>}
-                      </button>
-                    ))}
-                </div>
-              )}
+              <SessionCreateMenu
+                agents={agents}
+                agentsLoading={agentsLoading}
+                agentsError={agentsError}
+                modesByAgent={modesByAgent}
+                modelsByAgent={modelsByAgent}
+                defaultModelByAgent={defaultModelByAgent}
+                modesLoadingByAgent={modesLoadingByAgent}
+                modelsLoadingByAgent={modelsLoadingByAgent}
+                modesErrorByAgent={modesErrorByAgent}
+                modelsErrorByAgent={modelsErrorByAgent}
+                mcpServers={mcpServers}
+                onMcpServersChange={onMcpServersChange}
+                mcpConfigError={mcpConfigError}
+                skillSources={skillSources}
+                onSkillSourcesChange={onSkillSourcesChange}
+                onSelectAgent={onSelectAgent}
+                onCreateSession={onCreateSession}
+                open={showAgentMenu}
+                onClose={() => setShowAgentMenu(false)}
+              />
            </div>
          </div>
        ) : transcriptEntries.length === 0 && !sessionError ? (
@ -247,7 +175,7 @@ const ChatPanel = ({
            <Terminal className="empty-state-icon" />
            <div className="empty-state-title">Ready to Chat</div>
            <p className="empty-state-text">Send a message to start a conversation with the agent.</p>
-            {agentId === "mock" && (
+            {agentLabel === "Mock" && (
              <div className="mock-agent-hint">
                The mock agent simulates agent responses for testing the inspector UI without requiring API credentials. Send <code>help</code> for available commands.
              </div>
@ -284,30 +212,37 @@ const ChatPanel = ({
        onSendMessage={onSendMessage}
        onKeyDown={onKeyDown}
        placeholder={sessionId ? "Send a message..." : "Select or create a session first"}
-        disabled={!sessionId || turnStreaming}
+        disabled={!sessionId}
      />

-      <ChatSetup
-        agentMode={agentMode}
-        permissionMode={permissionMode}
-        model={model}
-        variant={variant}
-        modelOptions={modelOptions}
-        defaultModel={defaultModel}
-        modelsLoading={modelsLoading}
-        modelsError={modelsError}
-        variantOptions={variantOptions}
-        defaultVariant={defaultVariant}
-        supportsVariants={supportsVariants}
-        activeModes={activeModes}
-        modesLoading={modesLoading}
-        modesError={modesError}
-        onAgentModeChange={onAgentModeChange}
-        onPermissionModeChange={onPermissionModeChange}
-        onModelChange={onModelChange}
-        onVariantChange={onVariantChange}
-        hasSession={hasSession}
-      />
+      {sessionId && (
+        <div className="session-config-bar">
+          <div className="session-config-field">
+            <span className="session-config-label">Agent</span>
+            <span className="session-config-value">{agentLabel}</span>
+          </div>
+          <div className="session-config-field">
+            <span className="session-config-label">Model</span>
+            <span className="session-config-value">{sessionModel || "-"}</span>
+          </div>
+          <div className="session-config-field">
+            <span className="session-config-label">Variant</span>
+            <span className="session-config-value">{sessionVariant || "-"}</span>
+          </div>
+          <div className="session-config-field">
+            <span className="session-config-label">Permission</span>
+            <span className="session-config-value">{sessionPermissionMode || "-"}</span>
+          </div>
+          <div className="session-config-field">
+            <span className="session-config-label">MCP Servers</span>
+            <span className="session-config-value">{sessionMcpServerCount}</span>
+          </div>
+          <div className="session-config-field">
+            <span className="session-config-label">Skills</span>
+            <span className="session-config-value">{sessionSkillSourceCount}</span>
+          </div>
+        </div>
+      )}
    </div>
  );
 };
--- a/frontend/packages/inspector/src/components/chat/ChatSetup.tsx
+++ b/frontend/packages/inspector/src/components/chat/ChatSetup.tsx
@ -1,178 +0,0 @@
-import type { AgentModelInfo, AgentModeInfo } from "sandbox-agent";
-
-const ChatSetup = ({
-  agentMode,
-  permissionMode,
-  model,
-  variant,
-  modelOptions,
-  defaultModel,
-  modelsLoading,
-  modelsError,
-  variantOptions,
-  defaultVariant,
-  supportsVariants,
-  activeModes,
-  hasSession,
-  modesLoading,
-  modesError,
-  onAgentModeChange,
-  onPermissionModeChange,
-  onModelChange,
-  onVariantChange
-}: {
-  agentMode: string;
-  permissionMode: string;
-  model: string;
-  variant: string;
-  modelOptions: AgentModelInfo[];
-  defaultModel: string;
-  modelsLoading: boolean;
-  modelsError: string | null;
-  variantOptions: string[];
-  defaultVariant: string;
-  supportsVariants: boolean;
-  activeModes: AgentModeInfo[];
-  hasSession: boolean;
-  modesLoading: boolean;
-  modesError: string | null;
-  onAgentModeChange: (value: string) => void;
-  onPermissionModeChange: (value: string) => void;
-  onModelChange: (value: string) => void;
-  onVariantChange: (value: string) => void;
-}) => {
-  const hasModelOptions = modelOptions.length > 0;
-  const showModelSelect = hasModelOptions && !modelsError;
-  const hasVariantOptions = variantOptions.length > 0;
-  const showVariantSelect = supportsVariants && hasVariantOptions && !modelsError;
-  const modelCustom =
-    model && hasModelOptions && !modelOptions.some((entry) => entry.id === model);
-  const variantCustom =
-    variant && hasVariantOptions && !variantOptions.includes(variant);
-
-  return (
-    <div className="setup-row">
-      <div className="setup-field">
-        <span className="setup-label">Mode</span>
-        <select
-          className="setup-select"
-          value={agentMode}
-          onChange={(e) => onAgentModeChange(e.target.value)}
-          title="Mode"
-          disabled={!hasSession || modesLoading || Boolean(modesError)}
-        >
-          {modesLoading ? (
-            <option value="">Loading modes...</option>
-          ) : modesError ? (
-            <option value="">{modesError}</option>
-          ) : activeModes.length > 0 ? (
-            activeModes.map((mode) => (
-              <option key={mode.id} value={mode.id}>
-                {mode.name || mode.id}
-              </option>
-            ))
-          ) : (
-            <option value="">Mode</option>
-          )}
-        </select>
-      </div>
-
-      <div className="setup-field">
-        <span className="setup-label">Permission</span>
-        <select
-          className="setup-select"
-          value={permissionMode}
-          onChange={(e) => onPermissionModeChange(e.target.value)}
-          title="Permission Mode"
-          disabled={!hasSession}
-        >
-          <option value="default">Default</option>
-          <option value="plan">Plan</option>
-          <option value="bypass">Bypass</option>
-        </select>
-      </div>
-
-      <div className="setup-field">
-        <span className="setup-label">Model</span>
-        {showModelSelect ? (
-          <select
-            className="setup-select"
-            value={model}
-            onChange={(e) => onModelChange(e.target.value)}
-            title="Model"
-            disabled={!hasSession || modelsLoading || Boolean(modelsError)}
-          >
-            {modelsLoading ? (
-              <option value="">Loading models...</option>
-            ) : modelsError ? (
-              <option value="">{modelsError}</option>
-            ) : (
-              <>
-                <option value="">
-                  {defaultModel ? `Default (${defaultModel})` : "Default"}
-                </option>
-                {modelCustom && <option value={model}>{model} (custom)</option>}
-                {modelOptions.map((entry) => (
-                  <option key={entry.id} value={entry.id}>
-                    {entry.name ?? entry.id}
-                  </option>
-                ))}
-              </>
-            )}
-          </select>
-        ) : (
-          <input
-            className="setup-input"
-            value={model}
-            onChange={(e) => onModelChange(e.target.value)}
-            placeholder="Model"
-            title="Model"
-            disabled={!hasSession}
-          />
-        )}
-      </div>
-
-      <div className="setup-field">
-        <span className="setup-label">Variant</span>
-        {showVariantSelect ? (
-          <select
-            className="setup-select"
-            value={variant}
-            onChange={(e) => onVariantChange(e.target.value)}
-            title="Variant"
-            disabled={!hasSession || !supportsVariants || modelsLoading || Boolean(modelsError)}
-          >
-            {modelsLoading ? (
-              <option value="">Loading variants...</option>
-            ) : modelsError ? (
-              <option value="">{modelsError}</option>
-            ) : (
-              <>
-                <option value="">
-                  {defaultVariant ? `Default (${defaultVariant})` : "Default"}
-                </option>
-                {variantCustom && <option value={variant}>{variant} (custom)</option>}
-                {variantOptions.map((entry) => (
-                  <option key={entry} value={entry}>
-                    {entry}
-                  </option>
-                ))}
-              </>
-            )}
-          </select>
-        ) : (
-          <input
-            className="setup-input"
-            value={variant}
-            onChange={(e) => onVariantChange(e.target.value)}
-            placeholder={supportsVariants ? "Variant" : "Variants unsupported"}
-            title="Variant"
-            disabled={!hasSession || !supportsVariants}
-          />
-        )}
-      </div>
-    </div>
-  );
-};
-
-export default ChatSetup;
--- a/frontend/packages/inspector/src/components/debug/AgentsTab.tsx
+++ b/frontend/packages/inspector/src/components/debug/AgentsTab.tsx
@ -1,4 +1,5 @@
-import { Download, RefreshCw } from "lucide-react";
+import { Download, Loader2, RefreshCw } from "lucide-react";
+import { useState } from "react";
 import type { AgentInfo, AgentModeInfo } from "sandbox-agent";
 import FeatureCoverageBadges from "../agents/FeatureCoverageBadges";
 import { emptyFeatureCoverage } from "../../types/agents";
@ -16,10 +17,21 @@ const AgentsTab = ({
  defaultAgents: string[];
  modesByAgent: Record<string, AgentModeInfo[]>;
  onRefresh: () => void;
-  onInstall: (agentId: string, reinstall: boolean) => void;
+  onInstall: (agentId: string, reinstall: boolean) => Promise<void>;
  loading: boolean;
  error: string | null;
 }) => {
+  const [installingAgent, setInstallingAgent] = useState<string | null>(null);
+
+  const handleInstall = async (agentId: string, reinstall: boolean) => {
+    setInstallingAgent(agentId);
+    try {
+      await onInstall(agentId, reinstall);
+    } finally {
+      setInstallingAgent(null);
+    }
+  };
+
  return (
    <>
      <div className="inline-row" style={{ marginBottom: 16 }}>
@ -39,19 +51,27 @@ const AgentsTab = ({
        : defaultAgents.map((id) => ({
            id,
            installed: false,
+            credentialsAvailable: false,
            version: undefined,
            path: undefined,
            capabilities: emptyFeatureCoverage
-          }))).map((agent) => (
+          }))).map((agent) => {
+        const isInstalling = installingAgent === agent.id;
+        return (
          <div key={agent.id} className="card">
            <div className="card-header">
              <span className="card-title">{agent.id}</span>
+              <div className="card-header-pills">
                <span className={`pill ${agent.installed ? "success" : "danger"}`}>
                  {agent.installed ? "Installed" : "Missing"}
                </span>
+                <span className={`pill ${agent.credentialsAvailable ? "success" : "warning"}`}>
+                  {agent.credentialsAvailable ? "Authenticated" : "No Credentials"}
+                </span>
+              </div>
            </div>
            <div className="card-meta">
-            {agent.version ? `v${agent.version}` : "Version unknown"}
+              {agent.version ?? "Version unknown"}
              {agent.path && <span className="mono muted" style={{ marginLeft: 8 }}>{agent.path}</span>}
            </div>
            <div className="card-meta" style={{ marginTop: 8 }}>
@ -66,15 +86,22 @@ const AgentsTab = ({
              </div>
            )}
            <div className="card-actions">
-            <button className="button secondary small" onClick={() => onInstall(agent.id, false)}>
-              <Download className="button-icon" /> Install
-            </button>
-            <button className="button ghost small" onClick={() => onInstall(agent.id, true)}>
-              Reinstall
+              <button
+                className="button secondary small"
+                onClick={() => handleInstall(agent.id, agent.installed)}
+                disabled={isInstalling}
+              >
+                {isInstalling ? (
+                  <Loader2 className="button-icon spinner-icon" />
+                ) : (
+                  <Download className="button-icon" />
+                )}
+                {isInstalling ? "Installing..." : agent.installed ? "Reinstall" : "Install"}
              </button>
            </div>
          </div>
-      ))}
+        );
+      })}
    </>
  );
 };
--- a/frontend/packages/inspector/src/components/debug/DebugPanel.tsx
+++ b/frontend/packages/inspector/src/components/debug/DebugPanel.tsx
@ -40,7 +40,7 @@ const DebugPanel = ({
  defaultAgents: string[];
  modesByAgent: Record<string, AgentModeInfo[]>;
  onRefreshAgents: () => void;
-  onInstallAgent: (agentId: string, reinstall: boolean) => void;
+  onInstallAgent: (agentId: string, reinstall: boolean) => Promise<void>;
  agentsLoading: boolean;
  agentsError: string | null;
 }) => {
--- a/frontend/packages/inspector/src/components/debug/eventUtils.ts
+++ b/frontend/packages/inspector/src/components/debug/eventUtils.ts
@ -30,6 +30,10 @@ export const getEventIcon = (type: string) => {
      return PlayCircle;
    case "session.ended":
      return PauseCircle;
+    case "turn.started":
+      return PlayCircle;
+    case "turn.ended":
+      return PauseCircle;
    case "item.started":
      return MessageSquare;
    case "item.delta":
--- a/frontend/packages/website/Dockerfile
+++ b/frontend/packages/website/Dockerfile
@ -1,6 +1,6 @@
 FROM node:22-alpine AS build
 WORKDIR /app
-RUN npm install -g pnpm
+RUN npm install -g pnpm@9

 # Copy website package
 COPY frontend/packages/website/package.json ./
--- a/gigacode/src/main.rs
+++ b/gigacode/src/main.rs
@ -17,9 +17,19 @@ fn run() -> Result<(), CliError> {
        no_token: cli.no_token,
        gigacode: true,
    };
-    let command = cli
-        .command
-        .unwrap_or_else(|| Command::Opencode(OpencodeArgs::default()));
+    let yolo = cli.yolo;
+    let command = match cli.command {
+        Some(Command::Opencode(mut args)) => {
+            args.yolo = args.yolo || yolo;
+            Command::Opencode(args)
+        }
+        Some(other) => other,
+        None => {
+            let mut args = OpencodeArgs::default();
+            args.yolo = yolo;
+            Command::Opencode(args)
+        }
+    };
    if let Err(err) = init_logging(&command) {
        eprintln!("failed to init logging: {err}");
        return Err(err);
--- a/justfile
+++ b/justfile
@ -27,8 +27,12 @@ release-build-all:
 # =============================================================================

 [group('dev')]
-dev:
-	pnpm dev -F @sandbox-agent/inspector
+dev-daemon:
+	SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p sandbox-agent -- daemon start --upgrade
+
+[group('dev')]
+dev: dev-daemon
+	pnpm dev -F @sandbox-agent/inspector -- --host 0.0.0.0

 [group('dev')]
 build:
@ -50,17 +54,27 @@ fmt:

 [group('dev')]
 install-fast-sa:
-	cargo build --release -p sandbox-agent
+	SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build --release -p sandbox-agent
+	rm -f ~/.cargo/bin/sandbox-agent
 	cp target/release/sandbox-agent ~/.cargo/bin/sandbox-agent

 [group('dev')]
-install-fast-gigacode:
-	cargo build --release -p gigacode
+install-gigacode:
+	SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build --release -p gigacode
+	rm -f ~/.cargo/bin/gigacode
 	cp target/release/gigacode ~/.cargo/bin/gigacode

+[group('dev')]
+run-sa *ARGS:
+	SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p sandbox-agent -- {{ ARGS }}
+
+[group('dev')]
+run-gigacode *ARGS:
+	SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo run -p gigacode -- {{ ARGS }}
+
 [group('dev')]
 dev-docs:
-	cd docs && pnpm dlx mintlify dev
+	cd docs && pnpm dlx mintlify dev --host 0.0.0.0

 install:
    pnpm install
@ -77,4 +91,3 @@ install-release:
    pnpm build --filter @sandbox-agent/inspector...
    cargo install --path server/packages/sandbox-agent
    cargo install --path gigacode
-
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
--- a/research/agents/amp.md
+++ b/research/agents/amp.md
@ -415,6 +415,31 @@ if let Some(model) = options.model.as_deref() {
 3. **Wait for Amp API** — Amp may add model/mode discovery in a future release
 4. **Scrape ampcode.com** — Check if the web UI exposes available modes/models

+## Command Execution & Process Management
+
+### Agent Tool Execution
+
+Amp executes commands via the `Bash` tool, similar to Claude Code. Execution is synchronous and blocks the agent turn. Permission rules can pre-authorize specific commands:
+
+```typescript
+{ tool: "Bash", matches: { command: "git *" }, action: "allow" }
+```
+
+### No User-Initiated Command Injection
+
+Amp does not expose any mechanism for external clients to inject command results into the agent's context: there is no `!` prefix equivalent and no command injection API.
+
+### Comparison
+
+| Capability | Supported? | Notes |
+|-----------|-----------|-------|
+| Agent runs commands | Yes (`Bash` tool) | Synchronous, blocks agent turn |
+| User runs commands → agent sees output | No | |
+| External API for command injection | No | |
+| Command source tracking | No | |
+| Background process management | No | Shell `&` only |
+| PTY / interactive terminal | No | |
+
 ## Notes

 - Amp is similar to Claude Code (same streaming format)
--- a/research/agents/claude.md
+++ b/research/agents/claude.md
@ -279,6 +279,44 @@ x-api-key: <ANTHROPIC_API_KEY>
 anthropic-version: 2023-06-01
 ```

+## Command Execution & Process Management
+
+### Agent Tool Execution
+
+The agent executes commands via the `Bash` tool. This is synchronous - the agent blocks until the command exits. Tool schema:
+
+```json
+{
+  "command": "string",
+  "timeout": "number",
+  "workingDirectory": "string"
+}
+```
+
+There is no background process support. If the agent needs a long-running process (e.g., dev server), it uses shell backgrounding (`&`) within a single `Bash` tool call.
+
+### User-Initiated Command Execution (`!` prefix)
+
+Claude Code's TUI supports `!command` syntax where the user types `!npm test` to run a command directly. The output is injected into the conversation as a user message so the agent can see it on the next turn.
+
+**This is a client-side TUI feature only.** It is not exposed in the API schema or streaming protocol. The CLI runs the command locally and stuffs the output into the next user message. There is no protocol-level concept of "user ran a command" vs "agent ran a command."
+
+### No External Command Injection API
+
+External clients (SDKs, frontends) cannot programmatically inject command results into Claude's conversation context. The only way to provide command output to the agent is:
+- Include it in the user prompt text
+- Use the `!` prefix in the interactive TUI
+
+### Comparison
+
+| Capability | Supported? | Notes |
+|-----------|-----------|-------|
+| Agent runs commands | Yes (`Bash` tool) | Synchronous, blocks agent turn |
+| User runs commands → agent sees output | Yes (`!cmd` in TUI) | Client-side only, not in protocol |
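+| External API for command injection | No | |
+| Command source tracking | No | No protocol-level distinction between user-run and agent-run commands |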
+| Background process management | No | Shell `&` only |
+| PTY / interactive terminal | No | |
+
 ## Notes

 - Claude CLI manages its own OAuth refresh internally
--- a/research/agents/codex.md
+++ b/research/agents/codex.md
@ -347,6 +347,68 @@ Requires a running Codex app-server process. Send the JSON-RPC request to the ap
 - Requires an active app-server process (cannot query models without starting one)
 - No standalone CLI command like `codex models`

+## Command Execution & Process Management
+
+### Agent Tool Execution
+
+Codex executes commands via `LocalShellAction`. The agent proposes a command, and external clients approve/deny via JSON-RPC (`item/commandExecution/requestApproval`).
+
+### Command Source Tracking (`ExecCommandSource`)
+
+Codex is the only agent surveyed here that explicitly tracks **who initiated a command** at the protocol level:
+
+```json
+{
+  "ExecCommandSource": {
+    "enum": ["agent", "user_shell", "unified_exec_startup", "unified_exec_interaction"]
+  }
+}
+```
+
+| Source | Meaning |
+|--------|---------|
+| `agent` | Agent decided to run this command via tool call |
+| `user_shell` | User ran a command in a shell (equivalent to Claude Code's `!` prefix) |
+| `unified_exec_startup` | Startup script ran this command |
+| `unified_exec_interaction` | Interactive execution |
+
+This means user-initiated shell commands are **first-class protocol events** in Codex, not a client-side hack like Claude Code's `!` prefix.
+
+### Command Execution Events
+
+Codex emits structured events for command execution:
+
+- `exec_command_begin` - Command started (includes `source`, `command`, `cwd`, `turn_id`)
+- `exec_command_output_delta` - Streaming output chunk (includes `stream: stdout|stderr`)
+- `exec_command_end` - Command completed (includes `exit_code`, `source`)
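As a rough illustration of how a client might consume these events, the TypeScript sketch below folds them into one record per command. The event names and the `source`, `command`, `cwd`, `turn_id`, `stream`, and `exit_code` fields come from the list above; `call_id` (used to correlate events), `chunk`, and the `string` type of `command` are assumptions for illustration, not confirmed parts of the Codex protocol.

```typescript
// Source values as listed in the ExecCommandSource enum above.
type ExecCommandSource =
  | "agent"
  | "user_shell"
  | "unified_exec_startup"
  | "unified_exec_interaction";

// ASSUMPTION: `call_id` and `chunk` are hypothetical field names.
type ExecEvent =
  | { type: "exec_command_begin"; call_id: string; source: ExecCommandSource; command: string; cwd: string; turn_id: string }
  | { type: "exec_command_output_delta"; call_id: string; stream: "stdout" | "stderr"; chunk: string }
  | { type: "exec_command_end"; call_id: string; source: ExecCommandSource; exit_code: number };

interface CommandRecord {
  source: ExecCommandSource;
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null; // null while the command is still running
}

// Fold one event into the per-command records, keyed by call_id.
function applyExecEvent(records: Record<string, CommandRecord>, ev: ExecEvent): void {
  if (ev.type === "exec_command_begin") {
    records[ev.call_id] = { source: ev.source, command: ev.command, stdout: "", stderr: "", exitCode: null };
    return;
  }
  const rec = records[ev.call_id];
  if (rec === undefined) return; // event for an unknown command: ignore it
  if (ev.type === "exec_command_output_delta") {
    rec[ev.stream] += ev.chunk; // append to stdout or stderr
  } else {
    rec.exitCode = ev.exit_code;
  }
}
```

Because `source` travels with the begin event, a UI built on records like these could label user-run and agent-run commands differently without any client-side bookkeeping.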
+
+### Parsed Command Analysis (`CommandAction`)
+
+Codex provides semantic analysis of what a command does:
+
+```json
+{
+  "commandActions": [
+    { "type": "read", "path": "/src/main.ts" },
+    { "type": "write", "path": "/src/utils.ts" },
+    { "type": "install", "package": "lodash" }
+  ]
+}
+```
+
+Action types: `read`, `write`, `listFiles`, `search`, `install`, `remove`, `other`.
+
+### Comparison
+
+| Capability | Supported? | Notes |
+|-----------|-----------|-------|
+| Agent runs commands | Yes (`LocalShellAction`) | With approval workflow |
+| User runs commands → agent sees output | Yes (`user_shell` source) | First-class protocol event |
+| External API for command injection | Yes (JSON-RPC approval) | Can approve/deny before execution |
+| Command source tracking | Yes (`ExecCommandSource` enum) | Distinguishes agent vs user vs startup |
+| Background process management | No | |
+| PTY / interactive terminal | No | |
+
 ## Notes

 - SDK is dynamically imported to reduce bundle size
--- a/research/agents/opencode.md
+++ b/research/agents/opencode.md
@ -585,6 +585,60 @@ const response = await client.provider.list();

 When an OpenCode server is running, call `GET /provider` on its HTTP port. Returns full model metadata including capabilities, costs, context limits, and modalities.

+## Command Execution & Process Management
+
+### Agent Tool Execution
+
+The agent executes commands via internal tools (not exposed in the HTTP API). The agent's tool calls are synchronous within its turn. Tool parts have states: `pending`, `running`, `completed`, `error`.
+
+### PTY System (`/pty/*`) - User-Facing Terminals
+
+Separate from the agent's command execution. PTYs are server-scoped interactive terminals for the user:
+
+- `POST /pty` - Create PTY (command, args, cwd, title, env)
+- `GET /pty` - List all PTYs
+- `GET /pty/{ptyID}` - Get PTY info
+- `PUT /pty/{ptyID}` - Update PTY (title, resize via `size: {rows, cols}`)
+- `DELETE /pty/{ptyID}` - Kill and remove PTY
+- `GET /pty/{ptyID}/connect` - WebSocket for bidirectional I/O
+
+PTY events (globally broadcast via SSE): `pty.created`, `pty.updated`, `pty.exited`, `pty.deleted`.
+
+The agent does NOT use the PTY system. PTYs are for the user's interactive terminal panel, independent of any AI session.
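To make the endpoint list above concrete, here is a small TypeScript sketch that maps PTY operations to HTTP calls. The methods, paths, and the `size: {rows, cols}` body come from the list; the `PtyAction` type and `ptyCall` helper are hypothetical names invented for this example and are not part of any OpenCode SDK.

```typescript
// Illustrative helper only; names here are hypothetical.
type PtyAction =
  | { kind: "create"; command: string; args?: string[]; cwd?: string; title?: string }
  | { kind: "list" }
  | { kind: "get"; ptyID: string }
  | { kind: "resize"; ptyID: string; rows: number; cols: number }
  | { kind: "kill"; ptyID: string }
  | { kind: "connect"; ptyID: string };

interface HttpCall {
  method: "GET" | "POST" | "PUT" | "DELETE";
  path: string;
  body?: unknown;
}

// Translate a PTY operation into the HTTP call described in the endpoint list.
function ptyCall(action: PtyAction): HttpCall {
  switch (action.kind) {
    case "create":
      return {
        method: "POST",
        path: "/pty",
        body: { command: action.command, args: action.args, cwd: action.cwd, title: action.title },
      };
    case "list":
      return { method: "GET", path: "/pty" };
    case "get":
      return { method: "GET", path: `/pty/${encodeURIComponent(action.ptyID)}` };
    case "resize":
      // Resize is a PUT on the PTY itself, carrying a `size` object.
      return {
        method: "PUT",
        path: `/pty/${encodeURIComponent(action.ptyID)}`,
        body: { size: { rows: action.rows, cols: action.cols } },
      };
    case "kill":
      return { method: "DELETE", path: `/pty/${encodeURIComponent(action.ptyID)}` };
    case "connect":
      // The server upgrades this request to a WebSocket; shown as GET to match the list.
      return { method: "GET", path: `/pty/${encodeURIComponent(action.ptyID)}/connect` };
  }
}
```

For example, `ptyCall({ kind: "resize", ptyID: "pty_1", rows: 40, cols: 120 })` describes a `PUT /pty/pty_1` request whose body is `{ size: { rows: 40, cols: 120 } }`.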
+
+### Session Commands (`/session/{id}/command`, `/session/{id}/shell`) - Context Injection
+
+External clients can inject command results into an AI session's conversation context:
+
+- `POST /session/{sessionID}/command` - Executes a command and records the result as an `AssistantMessage` in the session. Required fields: `command`, `arguments`. The output becomes part of the AI's context for subsequent turns.
+- `POST /session/{sessionID}/shell` - Similar but wraps in `sh -c`. Required fields: `command`, `agent`.
+- `GET /command` - Lists available command definitions (metadata, not execution).
+
+Session commands emit `command.executed` events with `sessionID` + `messageID`.
+
+**Key distinction**: These endpoints execute commands directly (not via the AI), then inject the output into the session as if the AI produced it. The AI doesn't actively run the command - it just finds the output in its conversation history on the next turn.
+
+### Three Separate Execution Mechanisms
+
+| Mechanism | Who uses it | Scoped to | AI sees output? |
+|-----------|-------------|-----------|----------------|
+| Agent tools (internal) | AI agent | Session turn | Yes (immediate) |
+| PTY (`/pty/*`) | User/frontend | Server (global) | No |
+| Session commands (`/session/{id}/*`) | Frontend/SDK client | Session | Yes (next turn) |
+
+The agent has no tool to interact with PTYs and cannot access the session command endpoints. When the agent needs to run a background process, it uses its internal bash-equivalent tool with shell backgrounding (`&`).
+
+### Comparison
+
+| Capability | Supported? | Notes |
+|-----------|-----------|-------|
+| Agent runs commands | Yes (internal tools) | Synchronous, blocks agent turn |
+| User runs commands → agent sees output | Yes (`/session/{id}/command`) | HTTP API, first-class |
+| External API for command injection | Yes | Session-scoped endpoints |
+| Command source tracking | Implicit | Endpoint implies source (no enum) |
+| Background process management | No | Shell `&` only for agent |
+| PTY / interactive terminal | Yes (`/pty/*`) | Server-scoped, WebSocket I/O |
+
 ## Notes

 - OpenCode is the most feature-rich runtime (streaming, questions, permissions)
--- a/research/process-terminal-design.md
+++ b/research/process-terminal-design.md
@ -0,0 +1,374 @@
+# Research: Process & Terminal System Design
+
+Research on PTY/terminal and process management APIs across sandbox platforms, with design recommendations for sandbox-agent.
+
+## Competitive Landscape
+
+### Transport Comparison
+
+| Platform | PTY Transport | Command Transport | Unified? |
+|----------|--------------|-------------------|----------|
+| **OpenCode** | WebSocket (`/pty/{id}/connect`) | REST (session-scoped, AI-mediated) | No |
+| **E2B** | gRPC server-stream (output) + unary RPC (input) | Same gRPC service | Yes |
+| **Daytona** | WebSocket | REST | No |
+| **Kubernetes** | WebSocket (channel byte mux) | Same WebSocket | Yes |
+| **Docker** | HTTP connection hijack | Same connection | Yes |
+| **Fly.io** | SSH over WireGuard | REST (sync, 60s max) | No |
+| **Vercel Sandboxes** | No PTY API | REST SDK (async generator for logs) | N/A |
+| **Gitpod** | gRPC (Listen=output, Write=input) | Same gRPC service | Yes |
+
+### Resize Mechanism
+
+| Platform | How | Notes |
+|----------|-----|-------|
+| **OpenCode** | `PUT /pty/{id}` with `size: {rows, cols}` | Separate REST call |
+| **E2B** | Separate `Update` RPC | Separate gRPC call |
+| **Daytona** | Separate HTTP POST | Sends SIGWINCH |
+| **Kubernetes** | In-band WebSocket message (channel byte 4) | `{"Width": N, "Height": N}` |
+| **Docker** | `POST /exec/{id}/resize?h=N&w=N` | Separate REST call |
+| **Gitpod** | Separate `SetSize` RPC | Separate gRPC call |
+
+**Consensus**: Five of the six platforms listed use a separate call for resize; only Kubernetes does it in-band. Since resize is a control signal (not data), a separate mechanism is cleaner.
+
+### I/O Multiplexing
+
+I/O multiplexing is how platforms distinguish between stdout, stderr, and PTY data on a shared connection.
+
+| Platform | Method | Detail |
+|----------|--------|--------|
+| **Docker** | 8-byte binary header per frame | Byte 0 = stream type (0=stdin, 1=stdout, 2=stderr). When TTY=true, no mux (raw stream). |
+| **Kubernetes** | 1-byte channel prefix per WebSocket message | 0=stdin, 1=stdout, 2=stderr, 3=error, 4=resize, 255=close |
+| **E2B** | gRPC `oneof` in protobuf | `DataEvent.output` is `oneof { bytes stdout, bytes stderr, bytes pty }` |
+| **OpenCode** | None | PTY is a unified stream. Commands capture stdout/stderr separately in response. |
+| **Daytona** | None | PTY is unified. Commands return structured `{stdout, stderr}`. |
+
+**Key insight**: When a process runs with a PTY allocated, its stdout and stderr both point at the same PTY device, so the reader sees a single interleaved stream. Multiplexing only matters for non-PTY command execution. OpenCode and Daytona handle this by keeping PTY (unified stream) and commands (structured response) as separate APIs.
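For a sense of what header-based multiplexing costs a client, the sketch below demultiplexes Docker-style frames in TypeScript. It assumes the non-TTY framing from the table (byte 0 is the stream type, bytes 4-7 hold the payload length as a big-endian uint32); `demuxDockerFrames` and the `Frame` type are illustrative names, not an existing API.

```typescript
type StreamName = "stdin" | "stdout" | "stderr";
const STREAM_NAMES: StreamName[] = ["stdin", "stdout", "stderr"];

interface Frame {
  stream: StreamName;
  payload: Uint8Array;
}

// Split a buffer into complete frames; bytes of a trailing partial frame are returned in `rest`.
function demuxDockerFrames(buf: Uint8Array): { frames: Frame[]; rest: Uint8Array } {
  const frames: Frame[] = [];
  let offset = 0;
  while (buf.length - offset >= 8) {
    const header = new DataView(buf.buffer, buf.byteOffset + offset, 8);
    const size = header.getUint32(4, false); // payload length, big-endian
    if (buf.length - offset - 8 < size) break; // incomplete frame: wait for more bytes
    const streamType = header.getUint8(0);
    const stream = STREAM_NAMES[streamType];
    if (stream === undefined) throw new Error(`unknown stream type ${streamType}`);
    frames.push({ stream, payload: buf.subarray(offset + 8, offset + 8 + size) });
    offset += 8 + size;
  }
  return { frames, rest: buf.subarray(offset) };
}
```

A caller would prepend `rest` to the next network chunk before calling the function again. None of this is needed for a PTY stream, which is the point of keeping the two cases separate.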
+
+### Reconnection
+
+| Platform | Method | Replays missed output? |
+|----------|--------|----------------------|
+| **E2B** | `Connect` RPC by PID or tag | No - only new events from reconnect point |
+| **Daytona** | New WebSocket to same PTY session | No |
+| **Kubernetes** | Not supported (connection = session) | N/A |
+| **Docker** | Not supported (connection = session) | N/A |
+| **OpenCode** | `GET /pty/{id}/connect` (WebSocket) | Unknown (not documented) |
+
+### Process Identification
+
+| Platform | ID Type | Notes |
+|----------|---------|-------|
+| **OpenCode** | String (`pty_N`) | Pattern `^pty.*` |
+| **E2B** | PID (uint32) or tag (string) | Dual selector |
+| **Daytona** | Session ID / PID | |
+| **Docker** | Exec ID (string, server-generated) | |
+| **Kubernetes** | Connection-scoped | No ID - the WebSocket IS the process |
+| **Gitpod** | Alias (string) | Human-readable |
+
+### Scoping
+
+| Platform | PTY Scope | Command Scope |
+|----------|-----------|---------------|
+| **OpenCode** | Server-wide (global) | Session-specific (AI-mediated) |
+| **E2B** | Sandbox-wide | Sandbox-wide |
+| **Daytona** | Sandbox-wide | Sandbox-wide |
+| **Docker** | Container-scoped | Container-scoped |
+| **Kubernetes** | Pod-scoped | Pod-scoped |
+
+## Key Questions & Analysis
+
+### Q: Should PTY transport be WebSocket?
+
+**Yes.** WebSocket is the right choice for PTY I/O:
+- Bidirectional: client sends keystrokes, server sends terminal output
+- Low latency: no HTTP request overhead per keystroke
+- Persistent connection: terminal sessions are long-lived
+- Precedent: OpenCode, Daytona, and Kubernetes all use WebSocket for PTY (E2B and Gitpod use gRPC streams instead)
+
+### Q: Should command transport be WebSocket or REST?
+
+**REST is sufficient for commands. WebSocket is not needed.**
+
+The distinction comes down to the nature of each operation:
+
+- **PTY**: Long-lived, bidirectional, interactive. User types, terminal responds. Needs WebSocket.
+- **Commands**: Request-response. Client says "run `ls -la`", server runs it, returns stdout/stderr/exit_code. This is a natural REST operation.
+
+The "full duplex" question: commands don't need full duplex because:
+1. Input is sent once at invocation (the command string)
+2. Output is collected and returned when the process exits
+3. There's no ongoing interactive input during execution
+
+For **streaming output** of long-running commands (e.g., `npm install`), there are two clean options:
+1. **SSE**: Server-Sent Events for output streaming (output-only, which is all you need)
+2. **PTY**: If the user needs to interact with the process (send ctrl+c, provide stdin), they should use a PTY instead
+
+This matches how OpenCode separates the two: commands are REST, PTYs are WebSocket.
+
+**Recommendation**: Keep commands as REST. If a command needs streaming output or interactive input, the user should create a PTY instead. This avoids building a second WebSocket protocol for a use case that PTYs already cover.
+
+### Q: Should resize be WebSocket in-band or separate POST?
+
+**Separate endpoint (PUT or POST).**
+
+Reasons:
+- Resize is a control signal, not data. Mixing it into the data stream requires a framing protocol to distinguish resize messages from terminal input.
+- OpenCode already defines `PUT /pty/{id}` with `size: {rows, cols}` - this is the existing spec.
+- E2B, Daytona, Docker, and Gitpod all use separate calls.
+- Only Kubernetes does in-band (because their channel-byte protocol already has a mux layer).
+- A separate endpoint is simpler to implement, test, and debug.
+
+**Recommendation**: Use `PUT /pty/{id}` with `size` field (matching OpenCode spec). Alternatively, a dedicated `POST /pty/{id}/resize` if we want to keep update and resize semantically separate.
+
+### Q: What is I/O multiplexing?
+
+I/O multiplexing is the mechanism for distinguishing between different data streams (stdout, stderr, stdin, control signals) on a single connection.
+
+**When it matters**: Non-PTY command execution where stdout and stderr need to be kept separate.
+
+**When it doesn't matter**: PTY sessions. When a PTY is allocated, stdout and stderr are both attached to the PTY slave, so everything the process writes comes out of the PTY master fd as a single stream. This is why terminals show stdout and stderr interleaved - the PTY doesn't distinguish them.
+
+**For sandbox-agent**: Since PTYs are unified streams and commands use REST (separate stdout/stderr in the JSON response), we don't need a multiplexing protocol. The API design naturally separates the two cases.
+
+### Q: How should reconnect work?
+
+**Reconnect is an application-level concept, not just HTTP/WebSocket reconnection.**
+
+The distinction:
+
+- **HTTP/WebSocket reconnect**: The transport-level connection drops and is re-established. This is handled by the client library automatically (retry logic, exponential backoff). The server doesn't need to know.
+- **Process reconnect**: The client disconnects from a running process but the process keeps running. Later, the client (or a different client) connects to the same process and starts receiving output again.
+
+**E2B's model**: Disconnecting a stream (via AbortController) leaves the process running. `Connect` RPC by PID or tag re-establishes the output stream. Missed output during disconnection is lost. This works because:
+1. Processes are long-lived (servers, shells)
+2. For terminals, the screen state can be recovered by the shell/application redrawing
+3. For commands, if you care about all output, don't disconnect
+
+**Recommendation for sandbox-agent**: Reconnect should be supported at the application level:
+1. `GET /pty/{id}/connect` (WebSocket) can be called multiple times for the same PTY
+2. If the WebSocket drops, the PTY process keeps running
+3. Client reconnects by opening a new WebSocket to the same endpoint
+4. No output replay (too complex, rarely needed - terminal apps redraw on reconnect via SIGWINCH)
+5. This is essentially what OpenCode's `/pty/{id}/connect` endpoint already implies
+
+This naturally leads to the **persistent process system** concept (see below).
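The reconnect behavior recommended above (process outlives the connection, no replay) can be sketched in a few lines of TypeScript. This is a toy in-memory model under stated assumptions: `DetachablePty` is a hypothetical class, `attach` stands in for a client opening the WebSocket, and `emitOutput` stands in for the server reading from the real PTY.

```typescript
type OutputListener = (chunk: string) => void;

// Stand-in for a PTY whose lifetime is independent of any client connection.
class DetachablePty {
  private listeners: OutputListener[] = [];

  // Called whenever the PTY produces output.
  // Output produced while nobody is attached is dropped (no replay).
  emitOutput(chunk: string): void {
    for (const listener of this.listeners.slice()) listener(chunk);
  }

  // Models a client connecting; the returned function models the socket closing.
  attach(listener: OutputListener): () => void {
    this.listeners.push(listener);
    return () => {
      const i = this.listeners.indexOf(listener);
      if (i !== -1) this.listeners.splice(i, 1);
    };
  }

  attachedCount(): number {
    return this.listeners.length;
  }
}
```

Detaching only removes a listener; nothing in the model kills the process, which mirrors points 2 and 3 of the recommendation. Whether several listeners may attach at once is a separate design question.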
+
+### Q: How are PTY events different from PTY transport?
+
+They are two separate channels serving different purposes:
+
+**PTY Events** (via SSE on `/event` or `/sessions/{id}/events/sse`):
+- Lifecycle notifications: `pty.created`, `pty.updated`, `pty.exited`, `pty.deleted`
+- Lightweight JSON metadata (PTY id, status, exit code)
+- Broadcast to all subscribers
+- Used by UIs to update PTY lists, show status indicators, handle cleanup
+
+**PTY Transport** (via WebSocket on `/pty/{id}/connect`):
+- Raw terminal I/O: binary input/output bytes
+- High-frequency, high-bandwidth
+- Point-to-point (one client connected to one PTY)
+- Used by terminal emulators (xterm.js) to render the terminal
+
+**Analogy**: Events are like email notifications ("a new terminal was opened"). Transport is like the phone call (the actual terminal session).
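To show how lightweight the event channel is compared with the raw byte transport, the TypeScript sketch below serializes a lifecycle event into a Server-Sent Events frame. The `event:`/`data:` line layout is standard SSE; the payload shape (`id`, `exitCode`) and the use of a named `event:` field are assumptions, since the exact wire format is not documented here.

```typescript
type PtyEventType = "pty.created" | "pty.updated" | "pty.exited" | "pty.deleted";

// ASSUMPTION: hypothetical payload shape for illustration.
interface PtyLifecycleEvent {
  type: PtyEventType;
  id: string;
  exitCode?: number;
}

// One SSE frame: an event name line, a data line, then a blank line as terminator.
function toSseFrame(ev: PtyLifecycleEvent): string {
  return `event: ${ev.type}\ndata: ${JSON.stringify(ev)}\n\n`;
}
```

Each frame is a few dozen bytes of JSON metadata, whereas the WebSocket transport carries the terminal's actual input and output bytes.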
+
+### Q: How are PTY and commands different in OpenCode?
+
+They serve fundamentally different purposes:
+
+**PTY (`/pty/*`)** - Direct execution environment:
+- Server-scoped (not tied to any AI session)
+- Creates a real terminal process
+- User interacts directly via WebSocket
+- Not part of the AI conversation
+- Think: "the terminal panel in VS Code"
+
+**Commands (`/session/{sessionID}/command`, `/session/{sessionID}/shell`)** - AI-mediated execution:
+- Session-scoped (tied to an AI session)
+- The command is sent **to the AI assistant** for execution
+- Creates an `AssistantMessage` in the session's conversation history
+- Output becomes part of the AI's context
+- Think: "asking Claude to run a command as a tool call"
+
+**Why commands are session-specific**: Because they're AI operations, not direct execution. When you call `POST /session/{id}/command`, the server:
+1. Creates an assistant message in the session
+2. Runs the command
+3. Captures output as message parts
+4. Emits `message.part.updated` events
+5. The AI can see this output in subsequent turns
+
+This is how the AI "uses terminal tools" - the command infrastructure provides the bridge between the AI session and system execution.
+
+### Q: Should scoping be system-wide?
+
+**Yes, for both PTY and commands.**
+
+Current OpenCode behavior:
+- PTYs: Already server-wide (global)
+- Commands: Session-scoped (for AI context injection)
+
+**For sandbox-agent**, since we're the orchestration layer (not the AI):
+- **PTYs**: System-wide. Any client should be able to list, connect to, or manage any PTY.
+- **Commands/processes**: System-wide. Process execution is a system primitive, not an AI primitive. If a caller wants to associate a process with a session, they can do so at their layer.
+
+The session-scoping of commands in OpenCode is an OpenCode-specific concern (AI context injection). Sandbox-agent should provide the lower-level primitive (system-wide process execution) and let the OpenCode compat layer handle the session association.
+
+## Persistent Process System
+
+### The Concept
+
+A persistent process system means:
+1. **Spawn** a process (PTY or command) via API
+2. Process runs independently of any client connection
+3. **Connect/disconnect** to the process I/O at will
+4. Process continues running through disconnections
+5. **Query** process status, list running processes
+6. **Kill/signal** processes explicitly
+
+This is distinct from the typical "connection = process lifetime" model (Kubernetes, Docker exec) where closing the connection kills the process.
+
+### How E2B Does It
+
+E2B's `Process` service is the best reference implementation:
+
+```
+Start(cmd, pty?) → stream of events (output)
+Connect(pid/tag) → stream of events (reconnect)
+SendInput(pid, data) → ok
+Update(pid, size) → ok (resize)
+SendSignal(pid, signal) → ok
+List() → running processes
+```
+
+Key design choices:
+- **Unified service**: PTY and command are the same service, differentiated by the `pty` field in `StartRequest`
+- **Process outlives connection**: Disconnecting the output stream (aborting the `Start`/`Connect` RPC) does NOT kill the process
+- **Explicit termination**: Must call `SendSignal(SIGKILL)` to stop a process
+- **Tag-based selection**: Processes can be tagged at creation for later lookup without knowing the PID
+
+### Recommendation for Sandbox-Agent
+
+Sandbox-agent should implement a **persistent process manager** that:
+
+1. **Is system-wide** (not session-scoped)
+2. **Supports both PTY and non-PTY modes**
+3. **Decouples process lifetime from connection lifetime**
+4. **Exposes via both REST (lifecycle) and WebSocket (I/O)**
+
+#### Proposed API Surface
+
+**Process Lifecycle (REST)**:
+
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| `POST` | `/v1/processes` | Create/spawn a process (PTY or command) |
+| `GET` | `/v1/processes` | List all processes |
+| `GET` | `/v1/processes/{id}` | Get process info (status, pid, exit code) |
+| `DELETE` | `/v1/processes/{id}` | Kill process (SIGTERM, then SIGKILL) |
+| `POST` | `/v1/processes/{id}/signal` | Send signal (SIGTERM, SIGKILL, SIGINT, etc.) |
+| `POST` | `/v1/processes/{id}/resize` | Resize PTY (rows, cols) |
+| `POST` | `/v1/processes/{id}/input` | Send stdin/pty input (REST fallback) |
+
+**Process I/O (WebSocket)**:
+
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| `GET` | `/v1/processes/{id}/connect` | WebSocket for bidirectional I/O |
+
+**Process Events (SSE)**:
+
+| Event | Description |
+|-------|-------------|
+| `process.created` | Process spawned |
+| `process.updated` | Process metadata changed |
+| `process.exited` | Process terminated (includes exit code) |
+| `process.deleted` | Process record removed |
+
+#### Create Request
+
+```jsonc
+{
+  "command": "bash",
+  "args": ["-i", "-l"],
+  "cwd": "/workspace",
+  "env": {"TERM": "xterm-256color"},
+  "pty": {                         // Optional - if present, allocate PTY
+    "rows": 24,
+    "cols": 80
+  },
+  "tag": "main-terminal",          // Optional - for lookup by name
+  "label": "Terminal 1"            // Optional - display name
+}
+```
+
+#### Process Object
+
+```jsonc
+{
+  "id": "proc_abc123",
+  "tag": "main-terminal",
+  "label": "Terminal 1",
+  "command": "bash",
+  "args": ["-i", "-l"],
+  "cwd": "/workspace",
+  "pid": 12345,
+  "pty": true,
+  "status": "running",             // "running" | "exited"
+  "exit_code": null,               // Set when exited
+  "created_at": "2025-01-15T...",
+  "exited_at": null
+}
+```
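The create request and process object above could be typed roughly as follows. This is a sketch, not a committed schema: the field names mirror the two JSON examples, while `describeSpawned`, `markExited`, the `fallbackCwd` parameter, and the use of `null` for a missing `tag` or `label` are assumptions made for illustration.

```typescript
interface CreateProcessRequest {
  command: string;
  args?: string[];
  cwd?: string;
  env?: Record<string, string>;
  pty?: { rows: number; cols: number }; // present => allocate a PTY
  tag?: string;
  label?: string;
}

interface ProcessInfo {
  id: string;
  tag: string | null;
  label: string | null;
  command: string;
  args: string[];
  cwd: string;
  pid: number;
  pty: boolean;
  status: "running" | "exited";
  exit_code: number | null; // set when exited
  created_at: string;
  exited_at: string | null;
}

// ASSUMPTION: the caller supplies id, pid, and a fallback cwd; how they are chosen is left open.
function describeSpawned(req: CreateProcessRequest, id: string, pid: number, fallbackCwd: string, now: Date): ProcessInfo {
  return {
    id,
    tag: req.tag ?? null,
    label: req.label ?? null,
    command: req.command,
    args: req.args ?? [],
    cwd: req.cwd ?? fallbackCwd,
    pid,
    pty: req.pty !== undefined,
    status: "running",
    exit_code: null,
    created_at: now.toISOString(),
    exited_at: null,
  };
}

// Returns a new object; the running record is left untouched.
function markExited(info: ProcessInfo, exitCode: number, now: Date): ProcessInfo {
  return { ...info, status: "exited", exit_code: exitCode, exited_at: now.toISOString() };
}
```

Note that the request carries PTY dimensions while the process object only records whether a PTY was allocated, matching the two examples above.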
+
+#### OpenCode Compatibility Layer
+
+The OpenCode compat layer maps to this system:
+
+| OpenCode Endpoint | Maps To |
+|-------------------|---------|
+| `POST /pty` | `POST /v1/processes` (with `pty` field) |
+| `GET /pty` | `GET /v1/processes?pty=true` |
+| `GET /pty/{id}` | `GET /v1/processes/{id}` |
+| `PUT /pty/{id}` | `POST /v1/processes/{id}/resize` + metadata update |
+| `DELETE /pty/{id}` | `DELETE /v1/processes/{id}` |
+| `GET /pty/{id}/connect` | `GET /v1/processes/{id}/connect` |
+| `POST /session/{id}/command` | Create process + capture output into session |
+| `POST /session/{id}/shell` | Create process (shell mode) + capture output into session |
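The PTY rows of this mapping are mechanical enough to express as a lookup. The TypeScript sketch below is one possible shape for it, assuming the `/v1/processes` routes proposed earlier; `mapOpenCodePtyRoute` is a hypothetical helper, and the session command rows are deliberately left out because they involve capturing output into a session rather than a simple rewrite.

```typescript
interface V1Route {
  method: string;
  path: string;
}

// Rewrite an OpenCode PTY request to the proposed process API; null means "no direct mapping".
function mapOpenCodePtyRoute(method: string, path: string): V1Route | null {
  if (path === "/pty") {
    if (method === "POST") return { method: "POST", path: "/v1/processes" };
    if (method === "GET") return { method: "GET", path: "/v1/processes?pty=true" };
    return null;
  }
  const m = /^\/pty\/([^/]+)(\/connect)?$/.exec(path);
  if (!m) return null;
  const base = `/v1/processes/${m[1]}`;
  if (m[2]) return method === "GET" ? { method: "GET", path: `${base}/connect` } : null;
  switch (method) {
    case "GET":
      return { method: "GET", path: base };
    case "PUT":
      // Only the resize half is modeled; the metadata update would be a second call.
      return { method: "POST", path: `${base}/resize` };
    case "DELETE":
      return { method: "DELETE", path: base };
    default:
      return null;
  }
}
```

A compat layer would still need to translate request and response bodies (for example, OpenCode's `size` object into the resize payload), which this sketch does not attempt.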
+
+### Open Questions
+
+1. **Output buffering for reconnect**: Should we buffer recent output (e.g., last 64KB) so reconnecting clients get some history? E2B doesn't do this, but it would improve UX for flaky connections.
+
+2. **Process limits**: Should there be a max number of concurrent processes? E2B doesn't expose one, but sandbox environments have limited resources.
+
+3. **Auto-cleanup**: Should processes be auto-cleaned after exiting? Options:
+   - Keep forever until explicitly deleted
+   - Auto-delete after N seconds/minutes
+   - Keep metadata but release resources
+
+4. **Input via REST vs WebSocket-only**: The REST `POST /v1/processes/{id}/input` endpoint is useful for one-shot input (e.g., "send ctrl+c") without establishing a WebSocket. E2B has both `SendInput` (unary) and `StreamInput` (streaming) for this reason.
+
+5. **Multiple WebSocket connections to same process**: Should we allow multiple clients to connect to the same process simultaneously? (Pair programming, monitoring). E2B supports this via multiple `Connect` calls.
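If question 1 were answered with "yes", the buffering could be as simple as a byte-capped ring buffer. The TypeScript sketch below keeps only the most recent `capacity` bytes of output; `OutputRingBuffer` is a hypothetical name, and the 64KB figure from the question would just be the constructor argument (`64 * 1024`).

```typescript
// Keeps the most recent `capacity` bytes of process output for replay on reconnect.
class OutputRingBuffer {
  private chunks: Uint8Array[] = [];
  private size = 0;
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(chunk: Uint8Array): void {
    // A single chunk larger than the capacity keeps only its tail.
    const kept = chunk.length > this.capacity ? chunk.subarray(chunk.length - this.capacity) : chunk;
    this.chunks.push(kept);
    this.size += kept.length;
    // Drop or trim the oldest chunks until the total fits the capacity.
    while (this.size > this.capacity) {
      const head = this.chunks.shift();
      if (head === undefined) break;
      const excess = this.size - this.capacity;
      if (head.length <= excess) {
        this.size -= head.length;
      } else {
        this.chunks.unshift(head.subarray(excess));
        this.size -= excess;
      }
    }
  }

  // Copy of the buffered bytes, oldest first; this is what a reconnecting client would receive.
  snapshot(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const c of this.chunks) {
      out.set(c, offset);
      offset += c.length;
    }
    return out;
  }
}
```

The trade-off is memory per process (bounded by `capacity`) against the reconnect experience; a byte cap can also cut an escape sequence in half, so a terminal client may still want to trigger a redraw after replay.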
+
+## User-Initiated Command Injection ("Run command, give AI context")
+
+A common pattern across agents: the user (or frontend) runs a command and the output is injected into the AI's conversation context. This is distinct from the agent running a command via its own tools.
+
+| Agent | Feature | Mechanism | Protocol-level? |
+|-------|---------|-----------|----------------|
+| **Claude Code** | `!command` prefix in TUI | CLI runs command locally, injects output as user message | No - client-side hack, not in API schema |
+| **Codex** | `user_shell` source | `ExecCommandSource` enum distinguishes `agent` vs `user_shell` vs `unified_exec_*` | Yes - first-class protocol event |
+| **OpenCode** | `/session/{id}/command` | HTTP endpoint runs command, records result as `AssistantMessage` | Yes - HTTP API |
+| **Amp** | N/A | Not supported | N/A |
+
+**Design implication for sandbox-agent**: The process system should support an optional `session_id` field when creating a process. If provided, the process output is associated with that session so the agent can see it. If not provided, the process runs independently (like a PTY). This unifies:
+- User interactive terminals (no session association)
+- User-initiated commands for AI context (session association)
+- Agent-initiated background processes (session association)
+
+## Sources
+
+- [E2B Process Proto](https://github.com/e2b-dev/E2B) - `process.proto` gRPC service definition
+- [E2B JS SDK](https://github.com/e2b-dev/E2B/tree/main/packages/js-sdk) - `commands/pty.ts`, `commands/index.ts`
+- [Daytona SDK](https://www.daytona.io/docs/en/typescript-sdk/process/) - REST + WebSocket PTY API
+- [Kubernetes RemoteCommand](https://github.com/kubernetes/apimachinery/blob/master/pkg/util/remotecommand/constants.go) - WebSocket subprotocol
+- [Docker Engine API](https://docker-docs.uclv.cu/engine/api/v1.21/) - Exec API with stream multiplexing
+- [Fly.io Machines API](https://fly.io/docs/machines/api/) - REST exec with 60s limit
+- [Gitpod terminal.proto](https://codeberg.org/kanishka-reading-list/gitpod/src/branch/main/components/supervisor-api/terminal.proto) - gRPC terminal service
+- [OpenCode OpenAPI Spec](https://github.com/opencode-ai/opencode) - PTY and session command endpoints
--- a/research/wip-agent-support.md
+++ b/research/wip-agent-support.md
@ -0,0 +1,442 @@
+# Universal Agent Configuration Support
+
+Work-in-progress research on configuration features across agents and what can be made universal.
+
+---
+
+## TODO: Features Needed for Full Coverage
+
+### Currently Implemented (in `CreateSessionRequest`)
+
+- [x] `agent` - Agent selection (claude, codex, opencode, amp)
+- [x] `agentMode` - Agent mode (plan, build, default)
+- [x] `permissionMode` - Permission mode (default, plan, bypass)
+- [x] `model` - Model selection
+- [x] `variant` - Reasoning variant
+- [x] `agentVersion` - Agent version selection
+- [x] `mcp` - MCP server configuration (Claude/Codex/OpenCode/Amp)
+- [x] `skills` - Skill path configuration (link or copy into agent skill roots)
+
+### Tier 1: Universal Features (High Priority)
+
+- [ ] `projectInstructions` - Inject CLAUDE.md / AGENTS.md content
+  - Write to appropriate file before agent spawn
+  - All agents support this natively
+- [ ] `workingDirectory` - Set working directory for session
+  - Currently captures server `cwd` on session creation; not yet user-configurable
+- [x] `mcp` - MCP server configuration
+  - Claude: Writes `.mcp.json` entries under `mcpServers`
+  - Codex: Updates `.codex/config.toml` with `mcp_servers`
+  - Amp: Calls `amp mcp add` for each server
+  - OpenCode: Uses `/mcp` API
+- [x] `skills` - Skill path configuration
+  - Claude: Link to `./.claude/skills/<name>/`
+  - Codex: Link to `./.agents/skills/<name>/`
+  - OpenCode: Link to `./.opencode/skill/<name>/` + config `skills.paths`
+  - Amp: Link to Claude/Codex-style directories
+- [ ] `credentials` - Pass credentials via API (not just env vars)
+  - Currently extracted from host env
+  - Need API-level credential injection
+
+### Filesystem API (Implemented)
+
+- [x] `/v1/fs` - Read/write/list/move/delete/stat files and upload batches
+  - Batch upload is tar-only (`application/x-tar`) with path output capped at 1024
+  - Relative paths resolve from session working dir when `sessionId` is provided
+  - CLI `sandbox-agent api fs ...` covers all filesystem endpoints
+
+### Message Attachments (Implemented)
+
+- [x] `MessageRequest.attachments` - Attach uploaded files when sending prompts
+  - OpenCode receives file parts; other agents get attachment paths appended to the prompt
+
+### Tier 2: Partial Support (Medium Priority)
+
+- [ ] `appendSystemPrompt` - High-priority system prompt additions
+  - Claude: `--append-system-prompt` flag
+  - Codex: `developer_instructions` config
+  - OpenCode: Custom agent definition
+  - Amp: Not supported (fallback to projectInstructions)
+- [ ] `resumeSession` / native session resume
+  - Claude: `--resume SESSION_ID`
+  - Codex: Thread persistence (automatic)
+  - OpenCode: `-c/--continue`
+  - Amp: `--continue SESSION_ID`
+
+### Tier 3: Agent-Specific Pass-through (Low Priority)
+
+- [ ] `agentSpecific.claude` - Raw Claude options
+- [ ] `agentSpecific.codex` - Raw Codex options (e.g., `replaceSystemPrompt`)
+- [ ] `agentSpecific.opencode` - Raw OpenCode options (e.g., `customAgent`)
+- [ ] `agentSpecific.amp` - Raw Amp options (e.g., `permissionRules`)
+
+### Event/Feature Coverage Gaps (from compatibility matrix)
+
+| Feature | Claude | Codex | OpenCode | Amp | Status |
+|---------|--------|-------|----------|-----|--------|
+| Tool Calls | —* | ✓ | ✓ | ✓ | Claude coming soon |
+| Tool Results | —* | ✓ | ✓ | ✓ | Claude coming soon |
+| Questions (HITL) | —* | — | ✓ | — | Only OpenCode |
+| Permissions (HITL) | —* | — | ✓ | — | Only OpenCode |
+| Images | — | ✓ | ✓ | — | 2/4 agents |
+| File Attachments | — | ✓ | ✓ | — | 2/4 agents |
+| Session Lifecycle | — | ✓ | ✓ | — | 2/4 agents |
+| Reasoning/Thinking | — | ✓ | — | — | Codex only |
+| Command Execution | — | ✓ | — | — | Codex only |
+| File Changes | — | ✓ | — | — | Codex only |
+| MCP Tools | ✓ | ✓ | ✓ | ✓ | Supported via session MCP config injection |
+| Streaming Deltas | — | ✓ | ✓ | — | 2/4 agents |
+
+\* Claude support for the features marked with an asterisk is planned ("coming soon").
+
+### Implementation Order (Suggested)
+
+1. **mcp** - Done (session config injection + agent config writers)
+2. **skills** - Done (session config injection + skill directory linking)
+3. **projectInstructions** - Highest value, all agents support
+4. **appendSystemPrompt** - High-priority instructions
+5. **workingDirectory** - Basic session configuration
+6. **resumeSession** - Session continuity
+7. **credentials** - API-level auth injection
+8. **agentSpecific** - Escape hatch for edge cases
+
+---
+
+## Legend
+
+- ✅ Native support
+- 🔄 Can be adapted/emulated
+- ❌ Not supported
+- ⚠️ Supported with caveats
+
+---
+
+## 1. Instructions & System Prompt
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Project instructions file** | ✅ `CLAUDE.md` | ✅ `AGENTS.md` | 🔄 Config-based | ⚠️ Limited | ✅ Yes - write to agent's file |
+| **Append to system prompt** | ✅ `--append-system-prompt` | ✅ `developer_instructions` | 🔄 Custom agent | ❌ | ⚠️ Partial - 3/4 agents |
+| **Replace system prompt** | ❌ | ✅ `model_instructions_file` | 🔄 Custom agent | ❌ | ❌ No - Codex only |
+| **Hierarchical discovery** | ✅ cwd → root | ✅ root → cwd | ❌ | ❌ | ❌ No - Claude/Codex only |
+
+### Priority Comparison
+
+| Agent | Priority Order (highest → lowest) |
+|-------|-----------------------------------|
+| Claude | `--append-system-prompt` > base prompt > `CLAUDE.md` |
+| Codex | `AGENTS.md` > `developer_instructions` > base prompt |
+| OpenCode | Custom agent prompt > base prompt |
+| Amp | Server-controlled (opaque) |
+
+### Key Differences
+
+**Claude**: System prompt additions have the highest priority. `CLAUDE.md` is injected as the first user message (below the system prompt).
+
+**Codex**: Project instructions (`AGENTS.md`) have the highest priority and can override the system prompt. This is the inverse of Claude's model.
+
+---
+
+## 2. Permission Modes
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Read-only** | ✅ `plan` | ✅ `read-only` | 🔄 Rulesets | 🔄 Rules | ✅ Yes |
+| **Write workspace** | ✅ `acceptEdits` | ✅ `workspace-write` | 🔄 Rulesets | 🔄 Rules | ✅ Yes |
+| **Full bypass** | ✅ `--dangerously-skip-permissions` | ✅ `danger-full-access` | 🔄 Allow-all ruleset | ✅ `--dangerously-skip-permissions` | ✅ Yes |
+| **Per-tool rules** | ❌ | ❌ | ✅ | ✅ | ❌ No - OpenCode/Amp only |
+
+### Universal Mapping
+
+```typescript
+type PermissionMode = "readonly" | "write" | "bypass";
+
+// Maps to:
+// Claude: plan | acceptEdits | --dangerously-skip-permissions
+// Codex: read-only | workspace-write | danger-full-access
+// OpenCode: restrictive ruleset | permissive ruleset | allow-all
+// Amp: reject rules | allow rules | dangerouslyAllowAll
+```
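The mapping comment above can be turned into a lookup table. The following is a minimal sketch, not the project's implementation: `AgentId`, `PERMISSION_MAP`, and `mapPermissionMode` are hypothetical names, and the strings are the labels from the comment, not verified flags or config keys.

```typescript
type PermissionMode = "readonly" | "write" | "bypass";
type AgentId = "claude" | "codex" | "opencode" | "amp";

// Labels copied from the mapping comment; how each label is applied
// (CLI flag, config key, ruleset) differs per agent and is not modeled here.
const PERMISSION_MAP: Record<AgentId, Record<PermissionMode, string>> = {
  claude: { readonly: "plan", write: "acceptEdits", bypass: "--dangerously-skip-permissions" },
  codex: { readonly: "read-only", write: "workspace-write", bypass: "danger-full-access" },
  opencode: { readonly: "restrictive ruleset", write: "permissive ruleset", bypass: "allow-all" },
  amp: { readonly: "reject rules", write: "allow rules", bypass: "dangerouslyAllowAll" },
};

// Hypothetical helper: resolve the universal mode to the agent-specific label.
function mapPermissionMode(agent: AgentId, mode: PermissionMode): string {
  return PERMISSION_MAP[agent][mode];
}
```

Typing the table as a nested `Record` means the compiler flags a missing entry whenever a new agent or mode is added.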
+
+---
+
+## 3. Agent Modes
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Plan mode** | ✅ `--permission-mode plan` | 🔄 Prompt prefix | ✅ `--agent plan` | 🔄 Mode selection | ✅ Yes |
+| **Build/execute mode** | ✅ Default | ✅ Default | ✅ `--agent build` | ✅ Default | ✅ Yes |
+| **Chat mode** | ❌ | 🔄 Prompt prefix | ❌ | ❌ | ❌ No - Codex only |
+| **Custom agents** | ❌ | ❌ | ✅ Config-defined | ❌ | ❌ No - OpenCode only |
+
+---
+
+## 4. Model & Variant Selection
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Model selection** | ✅ `--model` | ✅ `-m/--model` | ✅ `-m provider/model` | ⚠️ `--mode` (abstracted) | ⚠️ Partial |
+| **Model discovery API** | ✅ Anthropic API | ✅ `model/list` RPC | ✅ `GET /provider` | ❌ Server-side | ⚠️ Partial - 3/4 |
+| **Reasoning variants** | ❌ | ✅ `model_reasoning_effort` | ✅ `--variant` | ✅ Deep mode levels | ⚠️ Partial |
+
+---
+
+## 5. MCP & Tools
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **MCP servers** | ✅ `mcpServers` in settings | ✅ `mcp_servers` in config | ✅ `/mcp` API | ✅ `--toolbox` | ✅ Yes - inject config |
+| **Tool restrictions** | ❌ | ❌ | ✅ Per-tool permissions | ✅ Permission rules | ⚠️ Partial |
+
+### MCP Config Mapping
+
+| Agent | Local Server | Remote Server |
+|-------|--------------|---------------|
+| Claude | `.mcp.json` or `.claude/settings.json` → `mcpServers` | Same, with `url` |
+| Codex | `.codex/config.toml` → `mcp_servers` | Same schema |
+| OpenCode | `/mcp` API with `McpLocalConfig` | `McpRemoteConfig` with `url`, `headers` |
+| Amp | `amp mcp add` CLI | Supports remote with headers |
+
+Local MCP servers can be bundled (for example with `tsup`) and uploaded via the filesystem API, then referenced in the session `mcp` config to auto-start and serve custom tools.
+
+---
+
+## 6. Skills & Extensions
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Skills/plugins** | ✅ `.claude/skills/` | ✅ `.agents/skills/` | ✅ `.opencode/skill/` | 🔄 Claude-style | ✅ Yes - link dirs |
+| **Slash commands** | ✅ `.claude/commands/` | ✅ Custom prompts (deprecated) | ❌ | ❌ | ⚠️ Partial |
+
+### Skill Path Mapping
+
+| Agent | Project Skills | User Skills |
+|-------|----------------|-------------|
+| Claude | `.claude/skills/<name>/SKILL.md` | `~/.claude/skills/<name>/SKILL.md` |
+| Codex | `.agents/skills/` | `~/.agents/skills/` |
+| OpenCode | `.opencode/skill/`, `.claude/skills/`, `.agents/skills/` | `~/.config/opencode/skill/` |
+| Amp | Uses Claude/Codex directories | — |
+
+---
+
+## 7. Session Management
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Resume session** | ✅ `--resume` | ✅ Thread persistence | ✅ `-c/--continue` | ✅ `--continue` | ✅ Yes |
+| **Session ID** | ✅ `session_id` | ✅ `thread_id` | ✅ `sessionID` | ✅ `session_id` | ✅ Yes |
+
+---
+
+## 8. Human-in-the-Loop
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **Permission requests** | ✅ Events | ⚠️ Upfront only | ✅ SSE events | ❌ Pre-configured | ⚠️ Partial |
+| **Questions** | ⚠️ Limited in headless | ❌ | ✅ Full support | ❌ | ❌ No - OpenCode best |
+
+---
+
+## 9. Credentials
+
+| Feature | Claude | Codex | OpenCode | Amp | Universal? |
+|---------|--------|-------|----------|-----|------------|
+| **API key env var** | ✅ `ANTHROPIC_API_KEY` | ✅ `OPENAI_API_KEY` | ✅ Both | ✅ `ANTHROPIC_API_KEY` | ✅ Yes |
+| **OAuth tokens** | ✅ | ✅ | ✅ | ✅ | ✅ Yes |
+| **Config file auth** | ✅ `~/.claude.json` | ✅ `~/.codex/auth.json` | ✅ `~/.local/share/opencode/auth.json` | ✅ `~/.amp/config.json` | ✅ Yes - extract per agent |
+
+---
+
+## Configuration Files Per Agent
+
+### Claude Code
+
+| File/Location | Purpose |
+|---------------|---------|
+| `CLAUDE.md` | Project instructions (hierarchical, cwd → root) |
+| `~/.claude/CLAUDE.md` | Global user instructions |
+| `~/.claude/settings.json` | User settings (permissions, MCP servers, env) |
+| `.claude/settings.json` | Project-level settings |
+| `.claude/settings.local.json` | Local overrides (gitignored) |
+| `~/.claude/commands/` | Custom slash commands (user-level) |
+| `.claude/commands/` | Project-level slash commands |
+| `~/.claude/skills/` | Installed skills |
+| `~/.claude/keybindings.json` | Custom keyboard shortcuts |
+| `~/.claude/projects/<hash>/memory/MEMORY.md` | Auto-memory per project |
+| `~/.claude.json` | Authentication/credentials |
+| `~/.claude.json.api` | API key storage |
+
+### OpenAI Codex
+
+| File/Location | Purpose |
+|---------------|---------|
+| `AGENTS.md` | Project instructions (hierarchical, root → cwd) |
+| `AGENTS.override.md` | Override file (takes precedence) |
+| `~/.codex/AGENTS.md` | Global user instructions |
+| `~/.codex/AGENTS.override.md` | Global override |
+| `~/.codex/config.toml` | User configuration |
+| `.codex/config.toml` | Project-level configuration |
+| `~/.codex/auth.json` | Authentication/credentials |
+
+Key config.toml options:
+- `model` - Default model
+- `developer_instructions` - Appended to system prompt
+- `model_instructions_file` - Replace entire system prompt
+- `project_doc_max_bytes` - Max AGENTS.md size (default 32KB)
+- `project_doc_fallback_filenames` - Alternative instruction files
+- `mcp_servers` - MCP server configuration
+
+### OpenCode
+
+| File/Location | Purpose |
+|---------------|---------|
+| `~/.local/share/opencode/auth.json` | Authentication |
+| `~/.config/opencode/config.toml` | User configuration |
+| `.opencode/config.toml` | Project configuration |
+
+### Amp
+
+| File/Location | Purpose |
+|---------------|---------|
+| `~/.amp/config.json` | Main configuration |
+| `~/.config/amp/settings.json` | Additional settings |
+| `.amp/rules.json` | Project permission rules |
+
+---
+
+## Summary: Universalization Tiers
+
+### Tier 1: Fully Universal (implement now)
+
+| Feature | API | Notes |
+|---------|-----|-------|
+| Project instructions | `projectInstructions: string` | Write to CLAUDE.md / AGENTS.md |
+| Permission mode | `permissionMode: "readonly" \| "write" \| "bypass"` | Map to agent-specific flags |
+| Agent mode | `agentMode: "plan" \| "build"` | Map to agent-specific mechanisms |
+| Model selection | `model: string` | Pass through to agent |
+| Resume session | `sessionId: string` | Map to agent's resume flag |
+| Credentials | `credentials: { apiKey?, oauthToken? }` | Inject via env vars |
+| MCP servers | `mcp: McpConfig` | Write to agent's config (docs drafted) |
+| Skills | `skills: { paths: string[] }` | Link to agent's skill dirs (docs drafted) |
+
+### Tier 2: Partial Support (with fallbacks)
+
+| Feature | API | Notes |
+|---------|-----|-------|
+| Append system prompt | `appendSystemPrompt: string` | Falls back to projectInstructions for Amp |
+| Reasoning variant | `variant: string` | Ignored for Claude |
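The two fallbacks in this table could be applied in a single normalization step before the agent is spawned. The sketch below is illustrative only: `resolveTier2` and its types are hypothetical, and appending the system prompt text after any existing project instructions is an assumed ordering.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

interface Tier2Input {
  agent: AgentId;
  projectInstructions?: string;
  appendSystemPrompt?: string;
  variant?: string;
}

interface Tier2Resolved {
  projectInstructions?: string;
  appendSystemPrompt?: string;
  variant?: string;
}

// Hypothetical helper applying the fallbacks from the Tier 2 table.
function resolveTier2(input: Tier2Input): Tier2Resolved {
  const out: Tier2Resolved = {
    projectInstructions: input.projectInstructions,
    appendSystemPrompt: input.appendSystemPrompt,
    variant: input.variant,
  };
  if (input.agent === "amp" && input.appendSystemPrompt) {
    // Amp has no append mechanism, so fold the text into project instructions.
    // Assumption: the appended text goes after any existing instructions.
    out.projectInstructions = [input.projectInstructions, input.appendSystemPrompt]
      .filter((part): part is string => Boolean(part))
      .join("\n\n");
    out.appendSystemPrompt = undefined;
  }
  if (input.agent === "claude") {
    // The table lists the reasoning variant as ignored for Claude.
    out.variant = undefined;
  }
  return out;
}
```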
+
+### Tier 3: Agent-Specific (pass-through)
+
+| Feature | Notes |
+|---------|-------|
+| Replace system prompt | Codex only (`model_instructions_file`) |
+| Per-tool permissions | OpenCode/Amp only |
+| Custom agents | OpenCode only |
+| Hierarchical file discovery | Let agents handle natively |
+
+---
+
+## Recommended Universal API
+
+```typescript
+interface UniversalSessionConfig {
+  // Tier 1 - Universal
+  agent: "claude" | "codex" | "opencode" | "amp";
+  model?: string;
+  permissionMode?: "readonly" | "write" | "bypass";
+  agentMode?: "plan" | "build";
+  projectInstructions?: string;
+  sessionId?: string;  // For resume
+  workingDirectory?: string;
+  credentials?: {
+    apiKey?: string;
+    oauthToken?: string;
+  };
+
+  // MCP servers (docs drafted in docs/mcp.mdx)
+  mcp?: Record<string, McpServerConfig>;
+
+  // Skills (docs drafted in docs/skills.mdx)
+  skills?: {
+    paths: string[];
+  };
+
+  // Tier 2 - Partial (with fallbacks)
+  appendSystemPrompt?: string;
+  variant?: string;
+
+  // Tier 3 - Pass-through
+  agentSpecific?: {
+    claude?: { /* raw Claude options */ };
+    codex?: { replaceSystemPrompt?: string; /* etc */ };
+    opencode?: { customAgent?: AgentDef; /* etc */ };
+    amp?: { permissionRules?: Rule[]; /* etc */ };
+  };
+}
+
+interface McpServerConfig {
+  type: "local" | "remote";
+  // Local
+  command?: string;
+  args?: string[];
+  env?: Record<string, string>;
+  timeoutMs?: number;
+  // Remote
+  url?: string;
+  headers?: Record<string, string>;
+}
+```
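Because `McpServerConfig` keeps local and remote fields in one interface, a server can be declared `local` without a `command` or `remote` without a `url`. A small validator could catch that before any agent config is written. This is a sketch under assumptions: `validateMcpServer` is a hypothetical helper and the rules it enforces are inferred from the field comments, not taken from the project.

```typescript
interface McpServerConfig {
  type: "local" | "remote";
  // Local
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  timeoutMs?: number;
  // Remote
  url?: string;
  headers?: Record<string, string>;
}

// Hypothetical helper: returns a list of problems, empty when the entry looks usable.
function validateMcpServer(serverName: string, cfg: McpServerConfig): string[] {
  const problems: string[] = [];
  if (cfg.type === "local") {
    if (!cfg.command) problems.push(`${serverName}: local server requires "command"`);
    if (cfg.url) problems.push(`${serverName}: "url" is ignored for local servers`);
  } else {
    if (!cfg.url) problems.push(`${serverName}: remote server requires "url"`);
    if (cfg.command) problems.push(`${serverName}: "command" is ignored for remote servers`);
  }
  if (cfg.timeoutMs !== undefined && cfg.timeoutMs <= 0) {
    problems.push(`${serverName}: "timeoutMs" must be positive`);
  }
  return problems;
}
```

A discriminated union (`{ type: "local"; command: string } | { type: "remote"; url: string }`) would move the same checks to compile time, at the cost of a less flat JSON schema.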
+
+---
+
+## Implementation Notes
+
+### Priority Inversion Warning
+
+Claude and Codex rank project instructions and system prompt additions in opposite orders:
+
+- **Claude**: `--append-system-prompt` > base prompt > `CLAUDE.md`
+- **Codex**: `AGENTS.md` > `developer_instructions` > base prompt
+
+This means:
+- In Claude, system prompt additions override project files
+- In Codex, project files override system prompt additions
+
+When both `appendSystemPrompt` and `projectInstructions` are set, document this behavior clearly, or normalize by using only one mechanism.
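One way to surface the inversion is to emit a warning whenever both fields are set for an agent whose ordering is known. The sketch below is hypothetical (`dominantInstructionSource` and `priorityWarning` are illustrative names) and encodes only the Claude and Codex orderings listed above. OpenCode and Amp are treated as unknown.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";
type InstructionSource = "appendSystemPrompt" | "projectInstructions" | "unknown";

// Which universal field wins when both are set, per the priority lists above.
function dominantInstructionSource(agent: AgentId): InstructionSource {
  switch (agent) {
    case "claude":
      return "appendSystemPrompt"; // --append-system-prompt > CLAUDE.md
    case "codex":
      return "projectInstructions"; // AGENTS.md > developer_instructions
    default:
      return "unknown"; // no ordering documented here for OpenCode or Amp
  }
}

// Hypothetical helper: returns a warning string, or null when nothing conflicts.
function priorityWarning(
  agent: AgentId,
  cfg: { appendSystemPrompt?: string; projectInstructions?: string },
): string | null {
  if (!cfg.appendSystemPrompt || !cfg.projectInstructions) return null;
  const winner = dominantInstructionSource(agent);
  if (winner === "unknown") return null;
  return `${agent}: ${winner} takes precedence when both are set`;
}
```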
+
+### File Injection Strategy
+
+For `projectInstructions`, sandbox-agent should:
+
+1. Create a temp directory or use session working directory
+2. Write instructions to the appropriate file:
+   - Claude: `.claude/CLAUDE.md` or `CLAUDE.md` in cwd
+   - Codex: `.codex/AGENTS.md` or `AGENTS.md` in cwd
+   - OpenCode: Config file or environment
+   - Amp: Limited - may only influence via context
+3. Start agent in that directory
+4. Agent discovers and loads instructions automatically
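Step 2 above can be sketched as a pure path resolver. This is an assumption-laden illustration, not the project's code: `instructionFilePath` is a hypothetical name, it always picks the file in the working directory instead of the `.claude/` or `.codex/` variant, and it returns `null` for OpenCode and Amp because the notes name no instruction file for them.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

// Hypothetical helper: where to write projectInstructions for a given agent.
function instructionFilePath(agent: AgentId, cwd: string): string | null {
  const base = cwd.replace(/\/+$/, ""); // drop trailing slashes
  switch (agent) {
    case "claude":
      return `${base}/CLAUDE.md`;
    case "codex":
      return `${base}/AGENTS.md`;
    default:
      // OpenCode (config-based) and Amp (limited) need a different mechanism.
      return null;
  }
}
```

Keeping the resolver free of filesystem calls makes it easy to test. The caller would still have to write the file and then start the agent in `cwd`.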
+
+### MCP Server Injection
+
+For `mcp`, sandbox-agent should:
+
+1. Write MCP config to agent's settings file:
+   - Claude: `.mcp.json` or `.claude/settings.json` → `mcpServers` key
+   - Codex: `.codex/config.toml` → `mcp_servers`
+   - OpenCode: Call `/mcp` API
+   - Amp: Run `amp mcp add` or pass via `--toolbox`
+2. Ensure MCP server binaries are available in PATH
+3. Handle cleanup on session end
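For the Claude case in step 1, the writer could render the session `mcp` map into the body of `.mcp.json`. The sketch below is hypothetical: `renderClaudeMcpJson` is an illustrative name, the remote entry shape (`url` plus optional `headers`) is an assumption based on the mapping table and should be checked against Claude's own documentation, and `timeoutMs` is dropped.

```typescript
interface McpServerConfig {
  type: "local" | "remote";
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  timeoutMs?: number;
  url?: string;
  headers?: Record<string, string>;
}

// Assumed entry shapes under the `mcpServers` key.
type ClaudeMcpEntry =
  | { command: string; args: string[]; env: Record<string, string> }
  | { url: string; headers?: Record<string, string> };

// Hypothetical helper: returns the JSON text to write, without touching the filesystem.
function renderClaudeMcpJson(servers: Record<string, McpServerConfig>): string {
  const mcpServers: Record<string, ClaudeMcpEntry> = {};
  for (const [serverName, cfg] of Object.entries(servers)) {
    if (cfg.type === "local") {
      if (!cfg.command) throw new Error(`${serverName}: local server needs a command`);
      mcpServers[serverName] = { command: cfg.command, args: cfg.args ?? [], env: cfg.env ?? {} };
    } else {
      if (!cfg.url) throw new Error(`${serverName}: remote server needs a url`);
      mcpServers[serverName] = cfg.headers
        ? { url: cfg.url, headers: cfg.headers }
        : { url: cfg.url };
    }
  }
  return JSON.stringify({ mcpServers }, null, 2);
}
```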
+
+### Skill Linking
+
+For `skills.paths`, sandbox-agent should:
+
+1. For each skill path, symlink or copy to agent's skill directory:
+   - Claude: `.claude/skills/<name>/`
+   - Codex: `.agents/skills/<name>/`
+   - OpenCode: Update `skills.paths` in config
+2. Skill directory must contain `SKILL.md`
+3. Handle cleanup on session end
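The per-agent targets in step 1 can be computed before any symlink is created. This sketch is hypothetical: `skillLinkTargets` is an illustrative name, the OpenCode directory comes from the checklist entry earlier in this document and leaves the `skills.paths` config update to the caller, and linking Amp skills into both the Claude-style and Codex-style directories is an assumption.

```typescript
type AgentId = "claude" | "codex" | "opencode" | "amp";

// Derive the skill name from the last non-empty path segment.
function skillName(skillPath: string): string {
  const parts = skillPath.split("/").filter((part) => part.length > 0);
  const last = parts[parts.length - 1];
  if (last === undefined) throw new Error(`cannot derive a skill name from "${skillPath}"`);
  return last;
}

// Hypothetical helper: project-relative directories a skill should be linked into.
function skillLinkTargets(agent: AgentId, skillPath: string): string[] {
  const skill = skillName(skillPath);
  switch (agent) {
    case "claude":
      return [`.claude/skills/${skill}`];
    case "codex":
      return [`.agents/skills/${skill}`];
    case "opencode":
      return [`.opencode/skill/${skill}`];
    case "amp":
      // Assumption: link into both Claude-style and Codex-style directories.
      return [`.claude/skills/${skill}`, `.agents/skills/${skill}`];
  }
}
```

A caller would still need to confirm that the source directory contains `SKILL.md` and record the created links so they can be removed when the session ends.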
--- a/resources/agent-schemas/artifacts/json-schema/amp.json
+++ b/resources/agent-schemas/artifacts/json-schema/amp.json
@ -9,6 +9,10 @@
        "type": {
          "type": "string",
          "enum": [
+            "system",
+            "user",
+            "assistant",
+            "result",
            "message",
            "tool_call",
            "tool_result",
@ -27,6 +31,45 @@
        },
        "error": {
          "type": "string"
+        },
+        "subtype": {
+          "type": "string"
+        },
+        "cwd": {
+          "type": "string"
+        },
+        "session_id": {
+          "type": "string"
+        },
+        "tools": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "mcp_servers": {
+          "type": "array",
+          "items": {
+            "type": "object"
+          }
+        },
+        "message": {
+          "type": "object"
+        },
+        "parent_tool_use_id": {
+          "type": "string"
+        },
+        "duration_ms": {
+          "type": "number"
+        },
+        "is_error": {
+          "type": "boolean"
+        },
+        "num_turns": {
+          "type": "number"
+        },
+        "result": {
+          "type": "string"
        }
      },
      "required": [
--- a/resources/agent-schemas/src/amp.ts
+++ b/resources/agent-schemas/src/amp.ts
@ -204,12 +204,27 @@ function createFallbackSchema(): NormalizedSchema {
      properties: {
        type: {
          type: "string",
-          enum: ["message", "tool_call", "tool_result", "error", "done"],
+          enum: ["system", "user", "assistant", "result", "message", "tool_call", "tool_result", "error", "done"],
        },
+        // Common fields
        id: { type: "string" },
        content: { type: "string" },
        tool_call: { $ref: "#/definitions/ToolCall" },
        error: { type: "string" },
+        // System message fields
+        subtype: { type: "string" },
+        cwd: { type: "string" },
+        session_id: { type: "string" },
+        tools: { type: "array", items: { type: "string" } },
+        mcp_servers: { type: "array", items: { type: "object" } },
+        // User/Assistant message fields
+        message: { type: "object" },
+        parent_tool_use_id: { type: "string" },
+        // Result fields
+        duration_ms: { type: "number" },
+        is_error: { type: "boolean" },
+        num_turns: { type: "number" },
+        result: { type: "string" },
      },
      required: ["type"],
    },
--- a/sdks/cli-shared/package.json
+++ b/sdks/cli-shared/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-shared",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "Shared helpers for sandbox-agent CLI and SDK",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/package.json
+++ b/sdks/cli/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "CLI for sandbox-agent - run AI coding agents in sandboxes",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/platforms/darwin-arm64/package.json
+++ b/sdks/cli/platforms/darwin-arm64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-darwin-arm64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "sandbox-agent CLI binary for macOS ARM64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/platforms/darwin-x64/package.json
+++ b/sdks/cli/platforms/darwin-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-darwin-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "sandbox-agent CLI binary for macOS x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/platforms/linux-arm64/package.json
+++ b/sdks/cli/platforms/linux-arm64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-linux-arm64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "sandbox-agent CLI binary for Linux arm64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/platforms/linux-x64/package.json
+++ b/sdks/cli/platforms/linux-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-linux-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "sandbox-agent CLI binary for Linux x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/cli/platforms/win32-x64/package.json
+++ b/sdks/cli/platforms/win32-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/cli-win32-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "sandbox-agent CLI binary for Windows x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/package.json
+++ b/sdks/gigacode/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "Gigacode CLI (sandbox-agent with OpenCode attach by default)",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/platforms/darwin-arm64/package.json
+++ b/sdks/gigacode/platforms/darwin-arm64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode-darwin-arm64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "gigacode CLI binary for macOS arm64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/platforms/darwin-x64/package.json
+++ b/sdks/gigacode/platforms/darwin-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode-darwin-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "gigacode CLI binary for macOS x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/platforms/linux-arm64/package.json
+++ b/sdks/gigacode/platforms/linux-arm64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode-linux-arm64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "gigacode CLI binary for Linux arm64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/platforms/linux-x64/package.json
+++ b/sdks/gigacode/platforms/linux-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode-linux-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "gigacode CLI binary for Linux x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/gigacode/platforms/win32-x64/package.json
+++ b/sdks/gigacode/platforms/win32-x64/package.json
@ -1,6 +1,6 @@
 {
  "name": "@sandbox-agent/gigacode-win32-x64",
-  "version": "0.1.7",
+  "version": "0.1.12-rc.1",
  "description": "gigacode CLI binary for Windows x64",
  "license": "Apache-2.0",
  "repository": {
--- a/sdks/typescript/package.json
+++ b/sdks/typescript/package.json
@ -1,7 +1,7 @@
 {
  "name": "sandbox-agent",
-  "version": "0.1.7",
-  "description": "Universal API for automatic coding agents in sandboxes. Supprots Claude Code, Codex, OpenCode, and Amp.",
+  "version": "0.1.12-rc.1",
+  "description": "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, and Amp.",
  "license": "Apache-2.0",
  "repository": {
    "type": "git",
@ -39,6 +39,6 @@
    "vitest": "^3.0.0"
  },
  "optionalDependencies": {
-    "@sandbox-agent/cli": "0.1.0"
+    "@sandbox-agent/cli": "workspace:*"
  }
 }