docs: clean up orphaned docs and add session event types

Delete orphaned docs not in docs.json navigation (gigacode.mdx, foundry-self-hosting.mdx, session-transcript-schema.mdx, pi-support-plan.md). Remove outdated musl/glibc troubleshooting section. Add event types documentation with example payloads to agent-sessions.mdx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chore(release): update version to 0.4.2
2026-04-15 14:03:52 +00:00 · 2026-03-25 19:11:19 -07:00 · 2026-03-25 18:07:26 -07:00 · 2026-03-25 17:00:51 -04:00 · 2026-03-25 16:54:40 -04:00 · 2026-03-25 13:20:57 -07:00
1016 changed files with 169540 additions and 50467 deletions
--- a/.agents/skills/agent-browser/SKILL.md
+++ b/.agents/skills/agent-browser/SKILL.md
@@ -1,265 +1,24 @@
 ---
 name: agent-browser
-description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
-allowed-tools: Bash(agent-browser:*)
+description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
+allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*)
 ---

 # Browser Automation with agent-browser

-## Quick start
+## Core Workflow

-```bash
-agent-browser open <url>        # Navigate to page
-agent-browser snapshot -i       # Get interactive elements with refs
-agent-browser click @e1         # Click element by ref
-agent-browser fill @e2 "text"   # Fill input by ref
-agent-browser close             # Close browser
-```
+Every browser automation follows this pattern:

-## Core workflow
-
-1. Navigate: `agent-browser open <url>`
-2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
-3. Interact using refs from the snapshot
-4. Re-snapshot after navigation or significant DOM changes
-
-## Commands
-
-### Navigation
-
-```bash
-agent-browser open <url>      # Navigate to URL (aliases: goto, navigate)
-                              # Supports: https://, http://, file://, about:, data://
-                              # Auto-prepends https:// if no protocol given
-agent-browser back            # Go back
-agent-browser forward         # Go forward
-agent-browser reload          # Reload page
-agent-browser close           # Close browser (aliases: quit, exit)
-agent-browser connect 9222    # Connect to browser via CDP port
-```
-
-### Snapshot (page analysis)
-
-```bash
-agent-browser snapshot            # Full accessibility tree
-agent-browser snapshot -i         # Interactive elements only (recommended)
-agent-browser snapshot -c         # Compact output
-agent-browser snapshot -d 3       # Limit depth to 3
-agent-browser snapshot -s "#main" # Scope to CSS selector
-```
-
-### Interactions (use @refs from snapshot)
-
-```bash
-agent-browser click @e1           # Click
-agent-browser dblclick @e1        # Double-click
-agent-browser focus @e1           # Focus element
-agent-browser fill @e2 "text"     # Clear and type
-agent-browser type @e2 "text"     # Type without clearing
-agent-browser press Enter         # Press key (alias: key)
-agent-browser press Control+a     # Key combination
-agent-browser keydown Shift       # Hold key down
-agent-browser keyup Shift         # Release key
-agent-browser hover @e1           # Hover
-agent-browser check @e1           # Check checkbox
-agent-browser uncheck @e1         # Uncheck checkbox
-agent-browser select @e1 "value"  # Select dropdown option
-agent-browser select @e1 "a" "b"  # Select multiple options
-agent-browser scroll down 500     # Scroll page (default: down 300px)
-agent-browser scrollintoview @e1  # Scroll element into view (alias: scrollinto)
-agent-browser drag @e1 @e2        # Drag and drop
-agent-browser upload @e1 file.pdf # Upload files
-```
-
-### Get information
-
-```bash
-agent-browser get text @e1        # Get element text
-agent-browser get html @e1        # Get innerHTML
-agent-browser get value @e1       # Get input value
-agent-browser get attr @e1 href   # Get attribute
-agent-browser get title           # Get page title
-agent-browser get url             # Get current URL
-agent-browser get count ".item"   # Count matching elements
-agent-browser get box @e1         # Get bounding box
-agent-browser get styles @e1      # Get computed styles (font, color, bg, etc.)
-```
-
-### Check state
-
-```bash
-agent-browser is visible @e1      # Check if visible
-agent-browser is enabled @e1      # Check if enabled
-agent-browser is checked @e1      # Check if checked
-```
-
-### Screenshots & PDF
-
-```bash
-agent-browser screenshot          # Save to a temporary directory
-agent-browser screenshot path.png # Save to a specific path
-agent-browser screenshot --full   # Full page
-agent-browser pdf output.pdf      # Save as PDF
-```
-
-### Video recording
-
-```bash
-agent-browser record start ./demo.webm    # Start recording (uses current URL + state)
-agent-browser click @e1                   # Perform actions
-agent-browser record stop                 # Stop and save video
-agent-browser record restart ./take2.webm # Stop current + start new recording
-```
-
-Recording creates a fresh context but preserves cookies/storage from your session. If no URL is provided, it
-automatically returns to your current page. For smooth demos, explore first, then start recording.
-
-### Wait
-
-```bash
-agent-browser wait @e1                     # Wait for element
-agent-browser wait 2000                    # Wait milliseconds
-agent-browser wait --text "Success"        # Wait for text (or -t)
-agent-browser wait --url "**/dashboard"    # Wait for URL pattern (or -u)
-agent-browser wait --load networkidle      # Wait for network idle (or -l)
-agent-browser wait --fn "window.ready"     # Wait for JS condition (or -f)
-```
-
-### Mouse control
-
-```bash
-agent-browser mouse move 100 200      # Move mouse
-agent-browser mouse down left         # Press button
-agent-browser mouse up left           # Release button
-agent-browser mouse wheel 100         # Scroll wheel
-```
-
-### Semantic locators (alternative to refs)
-
-```bash
-agent-browser find role button click --name "Submit"
-agent-browser find text "Sign In" click
-agent-browser find text "Sign In" click --exact      # Exact match only
-agent-browser find label "Email" fill "user@test.com"
-agent-browser find placeholder "Search" type "query"
-agent-browser find alt "Logo" click
-agent-browser find title "Close" click
-agent-browser find testid "submit-btn" click
-agent-browser find first ".item" click
-agent-browser find last ".item" click
-agent-browser find nth 2 "a" hover
-```
-
-### Browser settings
-
-```bash
-agent-browser set viewport 1920 1080          # Set viewport size
-agent-browser set device "iPhone 14"          # Emulate device
-agent-browser set geo 37.7749 -122.4194       # Set geolocation (alias: geolocation)
-agent-browser set offline on                  # Toggle offline mode
-agent-browser set headers '{"X-Key":"v"}'     # Extra HTTP headers
-agent-browser set credentials user pass       # HTTP basic auth (alias: auth)
-agent-browser set media dark                  # Emulate color scheme
-agent-browser set media light reduced-motion  # Light mode + reduced motion
-```
-
-### Cookies & Storage
-
-```bash
-agent-browser cookies                     # Get all cookies
-agent-browser cookies set name value      # Set cookie
-agent-browser cookies clear               # Clear cookies
-agent-browser storage local               # Get all localStorage
-agent-browser storage local key           # Get specific key
-agent-browser storage local set k v       # Set value
-agent-browser storage local clear         # Clear all
-```
-
-### Network
-
-```bash
-agent-browser network route <url>              # Intercept requests
-agent-browser network route <url> --abort      # Block requests
-agent-browser network route <url> --body '{}'  # Mock response
-agent-browser network unroute [url]            # Remove routes
-agent-browser network requests                 # View tracked requests
-agent-browser network requests --filter api    # Filter requests
-```
-
-### Tabs & Windows
-
-```bash
-agent-browser tab                 # List tabs
-agent-browser tab new [url]       # New tab
-agent-browser tab 2               # Switch to tab by index
-agent-browser tab close           # Close current tab
-agent-browser tab close 2         # Close tab by index
-agent-browser window new          # New window
-```
-
-### Frames
-
-```bash
-agent-browser frame "#iframe"     # Switch to iframe
-agent-browser frame main          # Back to main frame
-```
-
-### Dialogs
-
-```bash
-agent-browser dialog accept [text]  # Accept dialog
-agent-browser dialog dismiss        # Dismiss dialog
-```
-
-### JavaScript
-
-```bash
-agent-browser eval "document.title"   # Run JavaScript
-```
-
-## Global options
-
-```bash
-agent-browser --session <name> ...    # Isolated browser session
-agent-browser --json ...              # JSON output for parsing
-agent-browser --headed ...            # Show browser window (not headless)
-agent-browser --full ...              # Full page screenshot (-f)
-agent-browser --cdp <port> ...        # Connect via Chrome DevTools Protocol
-agent-browser -p <provider> ...       # Cloud browser provider (--provider)
-agent-browser --proxy <url> ...       # Use proxy server
-agent-browser --headers <json> ...    # HTTP headers scoped to URL's origin
-agent-browser --executable-path <p>   # Custom browser executable
-agent-browser --extension <path> ...  # Load browser extension (repeatable)
-agent-browser --help                  # Show help (-h)
-agent-browser --version               # Show version (-V)
-agent-browser <command> --help        # Show detailed help for a command
-```
-
-### Proxy support
-
-```bash
-agent-browser --proxy http://proxy.com:8080 open example.com
-agent-browser --proxy http://user:pass@proxy.com:8080 open example.com
-agent-browser --proxy socks5://proxy.com:1080 open example.com
-```
-
-## Environment variables
-
-```bash
-AGENT_BROWSER_SESSION="mysession"            # Default session name
-AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
-AGENT_BROWSER_EXTENSIONS="/ext1,/ext2"       # Comma-separated extension paths
-AGENT_BROWSER_PROVIDER="your-cloud-browser-provider"  # Cloud browser provider (select browseruse or browserbase)
-AGENT_BROWSER_STREAM_PORT="9223"             # WebSocket streaming port
-AGENT_BROWSER_HOME="/path/to/agent-browser"  # Custom install location (for daemon.js)
-```
-
-## Example: Form submission
+1. **Navigate**: `agent-browser open <url>`
+2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
+3. **Interact**: Use refs to click, fill, select
+4. **Re-snapshot**: After navigation or DOM changes, get fresh refs

 ```bash
 agent-browser open https://example.com/form
 agent-browser snapshot -i
-# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
+# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

 agent-browser fill @e1 "user@example.com"
 agent-browser fill @e2 "password123"
@@ -268,72 +27,504 @@ agent-browser wait --load networkidle
 agent-browser snapshot -i  # Check result
 ```

-## Example: Authentication with saved state
+## Command Chaining
+
+Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.

 ```bash
-# Login once
+# Chain open + wait + snapshot in one call
+agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
+
+# Chain multiple interactions
+agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3
+
+# Navigate and capture
+agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
+```
+
+**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
+
+## Essential Commands
+
+```bash
+# Navigation
+agent-browser open <url>              # Navigate (aliases: goto, navigate)
+agent-browser close                   # Close browser
+
+# Snapshot
+agent-browser snapshot -i             # Interactive elements with refs (recommended)
+agent-browser snapshot -i -C          # Include cursor-interactive elements (divs with onclick, cursor:pointer)
+agent-browser snapshot -s "#selector" # Scope to CSS selector
+
+# Interaction (use @refs from snapshot)
+agent-browser click @e1               # Click element
+agent-browser click @e1 --new-tab     # Click and open in new tab
+agent-browser fill @e2 "text"         # Clear and type text
+agent-browser type @e2 "text"         # Type without clearing
+agent-browser select @e1 "option"     # Select dropdown option
+agent-browser check @e1               # Check checkbox
+agent-browser press Enter             # Press key
+agent-browser keyboard type "text"    # Type at current focus (no selector)
+agent-browser keyboard inserttext "text"  # Insert without key events
+agent-browser scroll down 500         # Scroll page
+agent-browser scroll down 500 --selector "div.content"  # Scroll within a specific container
+
+# Get information
+agent-browser get text @e1            # Get element text
+agent-browser get url                 # Get current URL
+agent-browser get title               # Get page title
+
+# Wait
+agent-browser wait @e1                # Wait for element
+agent-browser wait --load networkidle # Wait for network idle
+agent-browser wait --url "**/page"    # Wait for URL pattern
+agent-browser wait 2000               # Wait milliseconds
+
+# Downloads
+agent-browser download @e1 ./file.pdf          # Click element to trigger download
+agent-browser wait --download ./output.zip     # Wait for any download to complete
+agent-browser --download-path ./downloads open <url>  # Set default download directory
+
+# Capture
+agent-browser screenshot              # Screenshot to temp dir
+agent-browser screenshot --full       # Full page screenshot
+agent-browser screenshot --annotate   # Annotated screenshot with numbered element labels
+agent-browser pdf output.pdf          # Save as PDF
+
+# Diff (compare page states)
+agent-browser diff snapshot                          # Compare current vs last snapshot
+agent-browser diff snapshot --baseline before.txt    # Compare current vs saved file
+agent-browser diff screenshot --baseline before.png  # Visual pixel diff
+agent-browser diff url <url1> <url2>                 # Compare two pages
+agent-browser diff url <url1> <url2> --wait-until networkidle  # Custom wait strategy
+agent-browser diff url <url1> <url2> --selector "#main"  # Scope to element
+```
+
+## Common Patterns
+
+### Form Submission
+
+```bash
+agent-browser open https://example.com/signup
+agent-browser snapshot -i
+agent-browser fill @e1 "Jane Doe"
+agent-browser fill @e2 "jane@example.com"
+agent-browser select @e3 "California"
+agent-browser check @e4
+agent-browser click @e5
+agent-browser wait --load networkidle
+```
+
+### Authentication with Auth Vault (Recommended)
+
+```bash
+# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY)
+# Recommended: pipe password via stdin to avoid shell history exposure
+echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
+
+# Login using saved profile (LLM never sees password)
+agent-browser auth login github
+
+# List/show/delete profiles
+agent-browser auth list
+agent-browser auth show github
+agent-browser auth delete github
+```
+
+### Authentication with State Persistence
+
+```bash
+# Login once and save state
 agent-browser open https://app.example.com/login
 agent-browser snapshot -i
-agent-browser fill @e1 "username"
-agent-browser fill @e2 "password"
+agent-browser fill @e1 "$USERNAME"
+agent-browser fill @e2 "$PASSWORD"
 agent-browser click @e3
 agent-browser wait --url "**/dashboard"
 agent-browser state save auth.json

-# Later sessions: load saved state
+# Reuse in future sessions
 agent-browser state load auth.json
 agent-browser open https://app.example.com/dashboard
 ```

-## Sessions (parallel browsers)
+### Session Persistence

 ```bash
-agent-browser --session test1 open site-a.com
-agent-browser --session test2 open site-b.com
-agent-browser session list
+# Auto-save/restore cookies and localStorage across browser restarts
+agent-browser --session-name myapp open https://app.example.com/login
+# ... login flow ...
+agent-browser close  # State auto-saved to ~/.agent-browser/sessions/
+
+# Next time, state is auto-loaded
+agent-browser --session-name myapp open https://app.example.com/dashboard
+
+# Encrypt state at rest
+export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
+agent-browser --session-name secure open https://app.example.com
+
+# Manage saved states
+agent-browser state list
+agent-browser state show myapp-default.json
+agent-browser state clear myapp
+agent-browser state clean --older-than 7
 ```

-## JSON output (for parsing)
-
-Add `--json` for machine-readable output:
+### Data Extraction

 ```bash
+agent-browser open https://example.com/products
+agent-browser snapshot -i
+agent-browser get text @e5           # Get specific element text
+agent-browser get text body > page.txt  # Get all page text
+
+# JSON output for parsing
 agent-browser snapshot -i --json
 agent-browser get text @e1 --json
 ```

-## Debugging
+### Parallel Sessions

 ```bash
-agent-browser --headed open example.com   # Show browser window
-agent-browser --cdp 9222 snapshot         # Connect via CDP port
-agent-browser connect 9222                # Alternative: connect command
-agent-browser console                     # View console messages
-agent-browser console --clear             # Clear console
-agent-browser errors                      # View page errors
-agent-browser errors --clear              # Clear errors
-agent-browser highlight @e1               # Highlight element
-agent-browser trace start                 # Start recording trace
-agent-browser trace stop trace.zip        # Stop and save trace
-agent-browser record start ./debug.webm   # Record video from current page
-agent-browser record stop                 # Save recording
+agent-browser --session site1 open https://site-a.com
+agent-browser --session site2 open https://site-b.com
+
+agent-browser --session site1 snapshot -i
+agent-browser --session site2 snapshot -i
+
+agent-browser session list
 ```

-## Deep-dive documentation
+### Connect to Existing Chrome

-For detailed patterns and best practices, see:
+```bash
+# Auto-discover running Chrome with remote debugging enabled
+agent-browser --auto-connect open https://example.com
+agent-browser --auto-connect snapshot

-| Reference | Description |
+# Or with explicit CDP port
+agent-browser --cdp 9222 snapshot
+```
+
+### Color Scheme (Dark Mode)
+
+```bash
+# Persistent dark mode via flag (applies to all pages and new tabs)
+agent-browser --color-scheme dark open https://example.com
+
+# Or via environment variable
+AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
+
+# Or set during session (persists for subsequent commands)
+agent-browser set media dark
+```
+
+### Visual Browser (Debugging)
+
+```bash
+agent-browser --headed open https://example.com
+agent-browser highlight @e1          # Highlight element
+agent-browser record start demo.webm # Record session
+agent-browser profiler start         # Start Chrome DevTools profiling
+agent-browser profiler stop trace.json # Stop and save profile (path optional)
+```
+
+Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode.
+
+### Local Files (PDFs, HTML)
+
+```bash
+# Open local files with file:// URLs
+agent-browser --allow-file-access open file:///path/to/document.pdf
+agent-browser --allow-file-access open file:///path/to/page.html
+agent-browser screenshot output.png
+```
+
+### iOS Simulator (Mobile Safari)
+
+```bash
+# List available iOS simulators
+agent-browser device list
+
+# Launch Safari on a specific device
+agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
+
+# Same workflow as desktop - snapshot, interact, re-snapshot
+agent-browser -p ios snapshot -i
+agent-browser -p ios tap @e1          # Tap (alias for click)
+agent-browser -p ios fill @e2 "text"
+agent-browser -p ios swipe up         # Mobile-specific gesture
+
+# Take screenshot
+agent-browser -p ios screenshot mobile.png
+
+# Close session (shuts down simulator)
+agent-browser -p ios close
+```
+
+**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)
+
+**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.
+
+## Security
+
+All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output.
+
+### Content Boundaries (Recommended for AI Agents)
+
+Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content:
+
+```bash
+export AGENT_BROWSER_CONTENT_BOUNDARIES=1
+agent-browser snapshot
+# Output:
+# --- AGENT_BROWSER_PAGE_CONTENT nonce=<hex> origin=https://example.com ---
+# [accessibility tree]
+# --- END_AGENT_BROWSER_PAGE_CONTENT nonce=<hex> ---
+```
+
+### Domain Allowlist
+
+Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. Include CDN domains your target pages depend on:
+
+```bash
+export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
+agent-browser open https://example.com        # OK
+agent-browser open https://malicious.com       # Blocked
+```
+
+### Action Policy
+
+Use a policy file to gate destructive actions:
+
+```bash
+export AGENT_BROWSER_ACTION_POLICY=./policy.json
+```
+
+Example `policy.json`:
+```json
+{"default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"]}
+```
+
+Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies.
+
+### Output Limits
+
+Prevent context flooding from large pages:
+
+```bash
+export AGENT_BROWSER_MAX_OUTPUT=50000
+```
+
+## Diffing (Verifying Changes)
+
+Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session.
+
+```bash
+# Typical workflow: snapshot -> action -> diff
+agent-browser snapshot -i          # Take baseline snapshot
+agent-browser click @e2            # Perform action
+agent-browser diff snapshot        # See what changed (auto-compares to last snapshot)
+```
+
+For visual regression testing or monitoring:
+
+```bash
+# Save a baseline screenshot, then compare later
+agent-browser screenshot baseline.png
+# ... time passes or changes are made ...
+agent-browser diff screenshot --baseline baseline.png
+
+# Compare staging vs production
+agent-browser diff url https://staging.example.com https://prod.example.com --screenshot
+```
+
+`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage.
+
+## Timeouts and Slow Pages
+
+The default Playwright timeout is 25 seconds for local browsers. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). For slow websites or large pages, use explicit waits instead of relying on the default timeout:
+
+```bash
+# Wait for network activity to settle (best for slow pages)
+agent-browser wait --load networkidle
+
+# Wait for a specific element to appear
+agent-browser wait "#content"
+agent-browser wait @e1
+
+# Wait for a specific URL pattern (useful after redirects)
+agent-browser wait --url "**/dashboard"
+
+# Wait for a JavaScript condition
+agent-browser wait --fn "document.readyState === 'complete'"
+
+# Wait a fixed duration (milliseconds) as a last resort
+agent-browser wait 5000
+```
+
+When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait <selector>` or `wait @ref`.
+
+## Session Management and Cleanup
+
+When running multiple agents or automations concurrently, always use named sessions to avoid conflicts:
+
+```bash
+# Each agent gets its own isolated session
+agent-browser --session agent1 open site-a.com
+agent-browser --session agent2 open site-b.com
+
+# Check active sessions
+agent-browser session list
+```
+
+Always close your browser session when done to avoid leaked processes:
+
+```bash
+agent-browser close                    # Close default session
+agent-browser --session agent1 close   # Close specific session
+```
+
+If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
+
+## Ref Lifecycle (Important)
+
+Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
+
+- Clicking links or buttons that navigate
+- Form submissions
+- Dynamic content loading (dropdowns, modals)
+
+```bash
+agent-browser click @e5              # Navigates to new page
+agent-browser snapshot -i            # MUST re-snapshot
+agent-browser click @e1              # Use new refs
+```
+
+## Annotated Screenshots (Vision Mode)
+
+Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot.
+
+```bash
+agent-browser screenshot --annotate
+# Output includes the image path and a legend:
+#   [1] @e1 button "Submit"
+#   [2] @e2 link "Home"
+#   [3] @e3 textbox "Email"
+agent-browser click @e2              # Click using ref from annotated screenshot
+```
+
+Use annotated screenshots when:
+- The page has unlabeled icon buttons or visual-only elements
+- You need to verify visual layout or styling
+- Canvas or chart elements are present (invisible to text snapshots)
+- You need spatial reasoning about element positions
+
+## Semantic Locators (Alternative to Refs)
+
+When refs are unavailable or unreliable, use semantic locators:
+
+```bash
+agent-browser find text "Sign In" click
+agent-browser find label "Email" fill "user@test.com"
+agent-browser find role button click --name "Submit"
+agent-browser find placeholder "Search" type "query"
+agent-browser find testid "submit-btn" click
+```
+
+## JavaScript Evaluation (eval)
+
+Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues.
+
+```bash
+# Simple expressions work with regular quoting
+agent-browser eval 'document.title'
+agent-browser eval 'document.querySelectorAll("img").length'
+
+# Complex JS: use --stdin with heredoc (RECOMMENDED)
+agent-browser eval --stdin <<'EVALEOF'
+JSON.stringify(
+  Array.from(document.querySelectorAll("img"))
+    .filter(i => !i.alt)
+    .map(i => ({ src: i.src.split("/").pop(), width: i.width }))
+)
+EVALEOF
+
+# Alternative: base64 encoding (avoids all shell escaping issues)
+agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
+```
+
+**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely.
+
+**Rules of thumb:**
+- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine
+- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
+- Programmatic/generated scripts -> use `eval -b` with base64
+
+## Configuration File
+
+Create `agent-browser.json` in the project root for persistent settings:
+
+```json
+{
+  "headed": true,
+  "proxy": "http://localhost:8080",
+  "profile": "./browser-data"
+}
+```
+
+Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config <path>` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced.
+
+## Deep-Dive Documentation
+
+| Reference | When to Use |
 |-----------|-------------|
+| [references/commands.md](references/commands.md) | Full command reference with all options |
 | [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting |
 | [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping |
 | [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse |
 | [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation |
+| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis |
 | [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies |

-## Ready-to-use templates
+## Experimental: Native Mode

-Executable workflow scripts for common patterns:
+agent-browser has an experimental native Rust daemon that communicates with Chrome directly via CDP, bypassing Node.js and Playwright entirely. It is opt-in and not recommended for production use yet.
+
+```bash
+# Enable via flag
+agent-browser --native open example.com
+
+# Enable via environment variable (avoids passing --native every time)
+export AGENT_BROWSER_NATIVE=1
+agent-browser open example.com
+```
+
+The native daemon supports Chromium and Safari (via WebDriver). Firefox and WebKit are not yet supported. All core commands (navigate, snapshot, click, fill, screenshot, cookies, storage, tabs, eval, etc.) work identically in native mode. Use `agent-browser close` before switching between native and default mode within the same session.
+
+## Browser Engine Selection
+
+Use `--engine` to choose a local browser engine. The default is `chrome`.
+
+```bash
+# Use Lightpanda (fast headless browser, requires separate install)
+agent-browser --engine lightpanda open example.com
+
+# Via environment variable
+export AGENT_BROWSER_ENGINE=lightpanda
+agent-browser open example.com
+
+# With custom binary path
+agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com
+```
+
+Supported engines:
+- `chrome` (default) -- Chrome/Chromium via CDP
+- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome)
+
+Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
+
+## Ready-to-Use Templates

 | Template | Description |
 |----------|-------------|
@ -341,16 +532,8 @@ Executable workflow scripts for common patterns:
 | [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state |
 | [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots |

-Usage:
 ```bash
 ./templates/form-automation.sh https://example.com/form
 ./templates/authenticated-session.sh https://app.example.com/login
 ./templates/capture-workflow.sh https://example.com ./output
 ```
-
-## HTTPS Certificate Errors
-
-For sites with self-signed or invalid certificates:
-```bash
-agent-browser open https://localhost:8443 --ignore-https-errors
-```
--- a/.agents/skills/agent-browser/references/authentication.md
+++ b/.agents/skills/agent-browser/references/authentication.md
@ -1,6 +1,20 @@
 # Authentication Patterns

-Patterns for handling login flows, session persistence, and authenticated browsing.
+Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
+
+**Related**: [session-management.md](session-management.md) for state persistence details, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Login Flow](#basic-login-flow)
+- [Saving Authentication State](#saving-authentication-state)
+- [Restoring Authentication](#restoring-authentication)
+- [OAuth / SSO Flows](#oauth--sso-flows)
+- [Two-Factor Authentication](#two-factor-authentication)
+- [HTTP Basic Auth](#http-basic-auth)
+- [Cookie-Based Auth](#cookie-based-auth)
+- [Token Refresh Handling](#token-refresh-handling)
+- [Security Best Practices](#security-best-practices)

 ## Basic Login Flow

--- a/.agents/skills/agent-browser/references/proxy-support.md
+++ b/.agents/skills/agent-browser/references/proxy-support.md
@ -1,13 +1,29 @@
 # Proxy Support

-Configure proxy servers for browser automation, useful for geo-testing, rate limiting avoidance, and corporate environments.
+Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments.
+
+**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Proxy Configuration](#basic-proxy-configuration)
+- [Authenticated Proxy](#authenticated-proxy)
+- [SOCKS Proxy](#socks-proxy)
+- [Proxy Bypass](#proxy-bypass)
+- [Common Use Cases](#common-use-cases)
+- [Verifying Proxy Connection](#verifying-proxy-connection)
+- [Troubleshooting](#troubleshooting)
+- [Best Practices](#best-practices)

 ## Basic Proxy Configuration

-Set proxy via environment variable before starting:
+Use the `--proxy` flag or set proxy via environment variable:

 ```bash
-# HTTP proxy
+# Via CLI flag
+agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
+
+# Via environment variable
 export HTTP_PROXY="http://proxy.example.com:8080"
 agent-browser open https://example.com

@ -45,10 +61,13 @@ agent-browser open https://example.com

 ## Proxy Bypass

-Skip proxy for specific domains:
+Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`:

 ```bash
-# Bypass proxy for local addresses
+# Via CLI flag
+agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com
+
+# Via environment variable
 export NO_PROXY="localhost,127.0.0.1,.internal.company.com"
 agent-browser open https://internal.company.com  # Direct connection
 agent-browser open https://external.com          # Via proxy
--- a/.agents/skills/agent-browser/references/session-management.md
+++ b/.agents/skills/agent-browser/references/session-management.md
@ -1,6 +1,18 @@
 # Session Management

-Run multiple isolated browser sessions concurrently with state persistence.
+Multiple isolated browser sessions with state persistence and concurrent browsing.
+
+**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Named Sessions](#named-sessions)
+- [Session Isolation Properties](#session-isolation-properties)
+- [Session State Persistence](#session-state-persistence)
+- [Common Patterns](#common-patterns)
+- [Default Session](#default-session)
+- [Session Cleanup](#session-cleanup)
+- [Best Practices](#best-practices)

 ## Named Sessions

--- a/.agents/skills/agent-browser/references/snapshot-refs.md
+++ b/.agents/skills/agent-browser/references/snapshot-refs.md
@ -1,21 +1,29 @@
-# Snapshot + Refs Workflow
+# Snapshot and Refs

-The core innovation of agent-browser: compact element references that reduce context usage dramatically for AI agents.
+Compact element references that reduce context usage dramatically for AI agents.

-## How It Works
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.

-### The Problem
-Traditional browser automation sends full DOM to AI agents:
+## Contents
+
+- [How Refs Work](#how-refs-work)
+- [Snapshot Command](#the-snapshot-command)
+- [Using Refs](#using-refs)
+- [Ref Lifecycle](#ref-lifecycle)
+- [Best Practices](#best-practices)
+- [Ref Notation Details](#ref-notation-details)
+- [Troubleshooting](#troubleshooting)
+
+## How Refs Work
+
+Traditional approach:
 ```
-Full DOM/HTML sent → AI parses → Generates CSS selector → Executes action
-~3000-5000 tokens per interaction
+Full DOM/HTML → AI parses → CSS selector → Action (~3000-5000 tokens)
 ```

-### The Solution
-agent-browser uses compact snapshots with refs:
+agent-browser approach:
 ```
-Compact snapshot → @refs assigned → Direct ref interaction
-~200-400 tokens per interaction
+Compact snapshot → @refs assigned → Direct interaction (~200-400 tokens)
 ```

 ## The Snapshot Command
@ -166,8 +174,8 @@ agent-browser snapshot -i
 ### Element Not Visible in Snapshot

 ```bash
-# Scroll to reveal element
-agent-browser scroll --bottom
+# Scroll down to reveal element
+agent-browser scroll down 1000
 agent-browser snapshot -i

 # Or wait for dynamic content
--- a/.agents/skills/agent-browser/references/video-recording.md
+++ b/.agents/skills/agent-browser/references/video-recording.md
@ -1,6 +1,17 @@
 # Video Recording

-Capture browser automation sessions as video for debugging, documentation, or verification.
+Capture browser automation as video for debugging, documentation, or verification.
+
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+
+## Contents
+
+- [Basic Recording](#basic-recording)
+- [Recording Commands](#recording-commands)
+- [Use Cases](#use-cases)
+- [Best Practices](#best-practices)
+- [Output Format](#output-format)
+- [Limitations](#limitations)

 ## Basic Recording

--- a/.agents/skills/agent-browser/templates/authenticated-session.sh
+++ b/.agents/skills/agent-browser/templates/authenticated-session.sh
@ -1,67 +1,81 @@
 #!/bin/bash
 # Template: Authenticated Session Workflow
-# Login once, save state, reuse for subsequent runs
+# Purpose: Login once, save state, reuse for subsequent runs
+# Usage: ./authenticated-session.sh <login-url> [state-file]
 #
-# Usage:
-#   ./authenticated-session.sh <login-url> [state-file]
+# RECOMMENDED: Use the auth vault instead of this template:
+#   echo "<pass>" | agent-browser auth save myapp --url <login-url> --username <user> --password-stdin
+#   agent-browser auth login myapp
+# The auth vault stores credentials securely and the LLM never sees passwords.
 #
-# Setup:
-#   1. Run once to see your form structure
-#   2. Note the @refs for your fields
-#   3. Uncomment LOGIN FLOW section and update refs
+# Environment variables:
+#   APP_USERNAME - Login username/email
+#   APP_PASSWORD - Login password
+#
+# Two modes:
+#   1. Discovery mode (default): Shows form structure so you can identify refs
+#   2. Login mode: Performs actual login after you update the refs
+#
+# Setup steps:
+#   1. Run once to see form structure (discovery mode)
+#   2. Update refs in LOGIN FLOW section below
+#   3. Set APP_USERNAME and APP_PASSWORD
+#   4. Delete the DISCOVERY section

 set -euo pipefail

 LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}"
 STATE_FILE="${2:-./auth-state.json}"

-echo "Authentication workflow for: $LOGIN_URL"
+echo "Authentication workflow: $LOGIN_URL"

-# ══════════════════════════════════════════════════════════════
-# SAVED STATE: Skip login if we have valid saved state
-# ══════════════════════════════════════════════════════════════
+# ================================================================
+# SAVED STATE: Skip login if valid saved state exists
+# ================================================================
 if [[ -f "$STATE_FILE" ]]; then
-    echo "Loading saved authentication state..."
-    agent-browser state load "$STATE_FILE"
-    agent-browser open "$LOGIN_URL"
-    agent-browser wait --load networkidle
+    echo "Loading saved state from $STATE_FILE..."
+    if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then
+        agent-browser wait --load networkidle

-    CURRENT_URL=$(agent-browser get url)
-    if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
-        echo "Session restored successfully!"
-        agent-browser snapshot -i
-        exit 0
+        CURRENT_URL=$(agent-browser get url)
+        if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
+            echo "Session restored successfully"
+            agent-browser snapshot -i
+            exit 0
+        fi
+        echo "Session expired, performing fresh login..."
+        agent-browser close 2>/dev/null || true
+    else
+        echo "Failed to load state, re-authenticating..."
    fi
-    echo "Session expired, performing fresh login..."
    rm -f "$STATE_FILE"
 fi

-# ══════════════════════════════════════════════════════════════
-# DISCOVERY MODE: Show form structure (remove after setup)
-# ══════════════════════════════════════════════════════════════
+# ================================================================
+# DISCOVERY MODE: Shows form structure (delete after setup)
+# ================================================================
 echo "Opening login page..."
 agent-browser open "$LOGIN_URL"
 agent-browser wait --load networkidle

 echo ""
-echo "┌─────────────────────────────────────────────────────────┐"
-echo "│ LOGIN FORM STRUCTURE                                    │"
-echo "├─────────────────────────────────────────────────────────┤"
+echo "Login form structure:"
+echo "---"
 agent-browser snapshot -i
-echo "└─────────────────────────────────────────────────────────┘"
+echo "---"
 echo ""
 echo "Next steps:"
-echo "  1. Note refs: @e? = username, @e? = password, @e? = submit"
-echo "  2. Uncomment LOGIN FLOW section below"
-echo "  3. Replace @e1, @e2, @e3 with your refs"
+echo "  1. Note the refs: username=@e?, password=@e?, submit=@e?"
+echo "  2. Update the LOGIN FLOW section below with your refs"
+echo "  3. Set: export APP_USERNAME='...' APP_PASSWORD='...'"
 echo "  4. Delete this DISCOVERY MODE section"
 echo ""
 agent-browser close
 exit 0

-# ══════════════════════════════════════════════════════════════
+# ================================================================
 # LOGIN FLOW: Uncomment and customize after discovery
-# ══════════════════════════════════════════════════════════════
+# ================================================================
 # : "${APP_USERNAME:?Set APP_USERNAME environment variable}"
 # : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}"
 #
@ -78,14 +92,14 @@ exit 0
 # # Verify login succeeded
 # FINAL_URL=$(agent-browser get url)
 # if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then
-#     echo "ERROR: Login failed - still on login page"
+#     echo "Login failed - still on login page"
 #     agent-browser screenshot /tmp/login-failed.png
 #     agent-browser close
 #     exit 1
 # fi
 #
 # # Save state for future runs
-# echo "Saving authentication state to: $STATE_FILE"
+# echo "Saving state to $STATE_FILE"
 # agent-browser state save "$STATE_FILE"
-# echo "Login successful!"
+# echo "Login successful"
 # agent-browser snapshot -i
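
Note on `authenticated-session.sh`: both the saved-state branch and the commented LOGIN FLOW decide whether login worked by checking the current URL for `login` or `signin`. A minimal sketch of that check as a reusable function follows; the name `is_login_url` is hypothetical and does not appear in the template.

```bash
#!/bin/bash
# Hypothetical helper (not part of the template): returns success when the
# URL looks like a login page, using the same substring test as the template.
is_login_url() {
  local url="$1"
  [[ "$url" == *"login"* || "$url" == *"signin"* ]]
}

# Example use with a literal URL; the template would pass the output of
# `agent-browser get url` instead.
if is_login_url "https://app.example.com/signin?next=/home"; then
  echo "still on login page"
fi
```

The substring match is a heuristic: a post-login URL that happens to contain `login` (for example a help article path) would be misread as a failed login, so a site-specific check may be safer.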
--- a/.agents/skills/agent-browser/templates/capture-workflow.sh
+++ b/.agents/skills/agent-browser/templates/capture-workflow.sh
@ -1,68 +1,69 @@
 #!/bin/bash
 # Template: Content Capture Workflow
-# Extract content from web pages with optional authentication
+# Purpose: Extract content from web pages (text, screenshots, PDF)
+# Usage: ./capture-workflow.sh <url> [output-dir]
+#
+# Outputs:
+#   - page-full.png: Full page screenshot
+#   - page-structure.txt: Page element structure with refs
+#   - page-text.txt: All text content
+#   - page.pdf: PDF version
+#
+# Optional: Load auth state for protected pages

 set -euo pipefail

 TARGET_URL="${1:?Usage: $0 <url> [output-dir]}"
 OUTPUT_DIR="${2:-.}"

-echo "Capturing content from: $TARGET_URL"
+echo "Capturing: $TARGET_URL"
 mkdir -p "$OUTPUT_DIR"

-# Optional: Load authentication state if needed
+# Optional: Load authentication state
 # if [[ -f "./auth-state.json" ]]; then
+#     echo "Loading authentication state..."
 #     agent-browser state load "./auth-state.json"
 # fi

-# Navigate to target page
+# Navigate to target
 agent-browser open "$TARGET_URL"
 agent-browser wait --load networkidle

-# Get page metadata
-echo "Page title: $(agent-browser get title)"
-echo "Page URL: $(agent-browser get url)"
+# Get metadata
+TITLE=$(agent-browser get title)
+URL=$(agent-browser get url)
+echo "Title: $TITLE"
+echo "URL: $URL"

 # Capture full page screenshot
 agent-browser screenshot --full "$OUTPUT_DIR/page-full.png"
-echo "Screenshot saved: $OUTPUT_DIR/page-full.png"
+echo "Saved: $OUTPUT_DIR/page-full.png"

-# Get page structure
+# Get page structure with refs
 agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt"
-echo "Structure saved: $OUTPUT_DIR/page-structure.txt"
+echo "Saved: $OUTPUT_DIR/page-structure.txt"

-# Extract main content
-# Adjust selector based on target site structure
-# agent-browser get text @e1 > "$OUTPUT_DIR/main-content.txt"
-
-# Extract specific elements (uncomment as needed)
-# agent-browser get text "article" > "$OUTPUT_DIR/article.txt"
-# agent-browser get text "main" > "$OUTPUT_DIR/main.txt"
-# agent-browser get text ".content" > "$OUTPUT_DIR/content.txt"
-
-# Get full page text
+# Extract all text content
 agent-browser get text body > "$OUTPUT_DIR/page-text.txt"
-echo "Text content saved: $OUTPUT_DIR/page-text.txt"
+echo "Saved: $OUTPUT_DIR/page-text.txt"

-# Optional: Save as PDF
+# Save as PDF
 agent-browser pdf "$OUTPUT_DIR/page.pdf"
-echo "PDF saved: $OUTPUT_DIR/page.pdf"
+echo "Saved: $OUTPUT_DIR/page.pdf"

-# Optional: Capture with scrolling for infinite scroll pages
-# scroll_and_capture() {
-#     local count=0
-#     while [[ $count -lt 5 ]]; do
-#         agent-browser scroll down 1000
-#         agent-browser wait 1000
-#         ((count++))
-#     done
-#     agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"
-# }
-# scroll_and_capture
+# Optional: Extract specific elements using refs from structure
+# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt"
+
+# Optional: Handle infinite scroll pages
+# for i in {1..5}; do
+#     agent-browser scroll down 1000
+#     agent-browser wait 1000
+# done
+# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"

 # Cleanup
 agent-browser close

 echo ""
-echo "Capture complete! Files saved to: $OUTPUT_DIR"
+echo "Capture complete:"
 ls -la "$OUTPUT_DIR"
--- a/.agents/skills/agent-browser/templates/form-automation.sh
+++ b/.agents/skills/agent-browser/templates/form-automation.sh
@ -1,64 +1,62 @@
 #!/bin/bash
 # Template: Form Automation Workflow
-# Fills and submits web forms with validation
+# Purpose: Fill and submit web forms with validation
+# Usage: ./form-automation.sh <form-url>
+#
+# This template demonstrates the snapshot-interact-verify pattern:
+# 1. Navigate to form
+# 2. Snapshot to get element refs
+# 3. Fill fields using refs
+# 4. Submit and verify result
+#
+# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output

 set -euo pipefail

 FORM_URL="${1:?Usage: $0 <form-url>}"

-echo "Automating form at: $FORM_URL"
+echo "Form automation: $FORM_URL"

-# Navigate to form page
+# Step 1: Navigate to form
 agent-browser open "$FORM_URL"
 agent-browser wait --load networkidle

-# Get interactive snapshot to identify form fields
-echo "Analyzing form structure..."
+# Step 2: Snapshot to discover form elements
+echo ""
+echo "Form structure:"
 agent-browser snapshot -i

-# Example: Fill common form fields
-# Uncomment and modify refs based on snapshot output
+# Step 3: Fill form fields (customize these refs based on snapshot output)
+#
+# Common field types:
+#   agent-browser fill @e1 "John Doe"           # Text input
+#   agent-browser fill @e2 "user@example.com"   # Email input
+#   agent-browser fill @e3 "SecureP@ss123"      # Password input
+#   agent-browser select @e4 "Option Value"     # Dropdown
+#   agent-browser check @e5                     # Checkbox
+#   agent-browser click @e6                     # Radio button
+#   agent-browser fill @e7 "Multi-line text"    # Textarea
+#   agent-browser upload @e8 /path/to/file.pdf  # File upload
+#
+# Uncomment and modify:
+# agent-browser fill @e1 "Test User"
+# agent-browser fill @e2 "test@example.com"
+# agent-browser click @e3  # Submit button

-# Text inputs
-# agent-browser fill @e1 "John Doe"           # Name field
-# agent-browser fill @e2 "user@example.com"   # Email field
-# agent-browser fill @e3 "+1-555-123-4567"    # Phone field
-
-# Password fields
-# agent-browser fill @e4 "SecureP@ssw0rd!"
-
-# Dropdowns
-# agent-browser select @e5 "Option Value"
-
-# Checkboxes
-# agent-browser check @e6                      # Check
-# agent-browser uncheck @e7                    # Uncheck
-
-# Radio buttons
-# agent-browser click @e8                      # Select radio option
-
-# Text areas
-# agent-browser fill @e9 "Multi-line text content here"
-
-# File uploads
-# agent-browser upload @e10 /path/to/file.pdf
-
-# Submit form
-# agent-browser click @e11                     # Submit button
-
-# Wait for response
+# Step 4: Wait for submission
 # agent-browser wait --load networkidle
-# agent-browser wait --url "**/success"        # Or wait for redirect
+# agent-browser wait --url "**/success"  # Or wait for redirect

-# Verify submission
-echo "Form submission result:"
+# Step 5: Verify result
+echo ""
+echo "Result:"
 agent-browser get url
 agent-browser snapshot -i

-# Take screenshot of result
+# Optional: Capture evidence
 agent-browser screenshot /tmp/form-result.png
+echo "Screenshot saved: /tmp/form-result.png"

 # Cleanup
 agent-browser close
-
-echo "Form automation complete"
+echo "Done"
--- a/.claude/commands/post-release-testing.md
+++ b/.claude/commands/post-release-testing.md
@ -0,0 +1,56 @@
+# Post-Release Testing Agent
+
+You are a post-release testing agent. Your job is to verify that a sandbox-agent release works correctly.
+
+## Environment Setup
+
+First, source the environment file:
+
+```bash
+source ~/misc/env.txt
+```
+
+## Tests to Run
+
+Run these tests in order, reporting results as you go:
+
+### 1. Docker Example Test
+
+```bash
+RUN_DOCKER_EXAMPLES=1 pnpm --filter @sandbox-agent/example-docker test
+```
+
+This test:
+- Creates an Alpine container
+- Installs sandbox-agent via curl from releases.rivet.dev
+- Verifies the `/v1/health` endpoint responds correctly
+
+### 2. E2B Example Test
+
+```bash
+pnpm --filter @sandbox-agent/example-e2b test
+```
+
+This test:
+- Creates an E2B sandbox with internet access
+- Installs sandbox-agent via curl
+- Verifies the `/v1/health` endpoint responds correctly
+
+### 3. Install Script Test
+
+Manually verify the install script works in a fresh environment:
+
+```bash
+docker run --rm alpine:latest sh -c "
+  apk add --no-cache curl ca-certificates libstdc++ libgcc bash &&
+  curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh &&
+  sandbox-agent --version
+"
+```
+
+## Instructions
+
+1. Run each test sequentially
+2. Report the outcome of each test (pass/fail)
+3. If a test fails, capture and report the error output
+4. Provide a summary at the end with overall pass/fail status
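
Note on `post-release-testing.md`: the Docker and E2B tests both end by verifying `/v1/health`. A minimal sketch of such a check with retries is below. The function name, the retry defaults, and the base URL in the usage comment are assumptions for illustration; the example tests may do this differently.

```bash
#!/bin/bash
# Hypothetical helper: poll <base-url>/v1/health until it answers with a
# successful HTTP status, or give up after a fixed number of attempts.
wait_for_health() {
  local base_url="$1" attempts="${2:-30}" delay="${3:-1}" i
  for (( i = 1; i <= attempts; i++ )); do
    # -f makes curl fail on HTTP errors; -sS keeps it quiet except for errors
    if curl -fsS "${base_url}/v1/health" >/dev/null 2>&1; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "health check failed after ${attempts} attempts" >&2
  return 1
}

# Usage (the address is an assumption, substitute the real one):
#   wait_for_health "http://127.0.0.1:8080" 30 1
```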
--- a/.claude/commands/release.md
+++ b/.claude/commands/release.md
@ -0,0 +1,165 @@
+# Release Agent
+
+You are a release agent for the Gigacode project (sandbox-agent). Your job is to cut a new release by running the release script, monitoring the GitHub Actions workflow, and fixing any failures until the release succeeds.
+
+## Step 1: Gather Release Information
+
+Ask the user what type of release they want to cut:
+
+- **patch** - Bug fixes (e.g., 0.1.8 -> 0.1.9)
+- **minor** - New features (e.g., 0.1.8 -> 0.2.0)
+- **major** - Breaking changes (e.g., 0.1.8 -> 1.0.0)
+- **rc** - Release candidate (e.g., 0.2.0-rc.1)
+
+For **rc** releases, also ask:
+1. What base version the RC is for (e.g., 0.2.0). If the user doesn't specify, determine it by bumping the minor version from the current version.
+2. What RC number (e.g., 1, 2, 3). If the user doesn't specify, check existing git tags to auto-determine the next RC number:
+
+```bash
+git tag -l "v<base_version>-rc.*" | sort -V
+```
+
+If no prior RC tags exist for that base version, use `rc.1`. Otherwise, increment the highest existing RC number.
+
+The final RC version string is `<base_version>-rc.<number>` (e.g., `0.2.0-rc.1`).
+
+## Step 2: Confirm Release Details
+
+Before proceeding, display the release details to the user and ask for explicit confirmation:
+
+- Current version (read from `Cargo.toml` workspace.package.version)
+- New version
+- Current branch
+- Whether it will be tagged as "latest" (RC releases are never tagged as latest)
+
+Do NOT proceed without user confirmation.
+
+## Step 3: Run the Release Script (Setup Local)
+
+The release script handles version bumping, local checks, committing, pushing, and triggering the workflow.
+
+For **major**, **minor**, or **patch** releases:
+
+```bash
+echo "yes" | ./scripts/release/main.ts --<type> --phase setup-local
+```
+
+For **rc** releases (using explicit version):
+
+```bash
+echo "yes" | ./scripts/release/main.ts --version <version> --phase setup-local
+```
+
+Where `<type>` is `major`, `minor`, or `patch`, and `<version>` is the full RC version string like `0.2.0-rc.1`.
+
+The `--phase setup-local` runs these steps in order:
+1. Confirms release details (interactive prompt - piping "yes" handles this)
+2. Updates version in all files (Cargo.toml, package.json files)
+3. Runs local checks (cargo check, cargo fmt, pnpm typecheck)
+4. Git commits with message `chore(release): update version to X.Y.Z`
+5. Git pushes
+6. Triggers the GitHub Actions workflow
+
+If local checks fail at step 3, fix the issues in the codebase, then re-run using `--only-steps` to avoid re-running already-completed steps:
+
+```bash
+echo "yes" | ./scripts/release/main.ts --version <version> --only-steps run-local-checks,git-commit,git-push,trigger-workflow
+```
+
+## Step 4: Monitor the GitHub Actions Workflow
+
+After the workflow is triggered, wait 5 seconds for it to register, then begin polling.
+
+### Find the workflow run
+
+```bash
+gh run list --workflow=release.yaml --limit=1 --json databaseId,status,conclusion,createdAt,url
+```
+
+Verify the run was created recently (within the last 2 minutes) to confirm you are monitoring the correct run. Save the `databaseId` as the run ID.
+
+### Poll for completion
+
+Poll every 15 seconds using:
+
+```bash
+gh run view <run-id> --json status,conclusion
+```
+
+Report progress to the user periodically (every ~60 seconds or when status changes). The status values are:
+- `queued` / `in_progress` / `waiting` - Still running, keep polling
+- `completed` - Done, check `conclusion`
+
+When `status` is `completed`, check `conclusion`:
+- `success` - Release succeeded! Proceed to Step 6.
+- `failure` - Proceed to Step 5.
+- `cancelled` - Inform the user and stop.
+
+## Step 5: Handle Workflow Failures
+
+If the workflow fails:
+
+### 5a. Get failure logs
+
+```bash
+gh run view <run-id> --log-failed
+```
+
+### 5b. Analyze the error
+
+Read the failure logs carefully. Common failure categories:
+- **Build failures** (cargo build, TypeScript compilation) - Fix the code
+- **Formatting issues** (cargo fmt) - Run `cargo fmt` and commit
+- **Test failures** - Fix the failing tests
+- **Publishing failures** (crates.io, npm) - These may be transient; check if retry will help
+- **Docker build failures** - Check Dockerfile or build script issues
+- **Infrastructure/transient failures** (network timeouts, rate limits) - Just re-trigger without code changes
+
+### 5c. Fix and re-push
+
+If a code fix is needed:
+1. Make the fix in the codebase
+2. Amend the release commit (since the release version commit is the most recent):
+
+```bash
+git add -A
+git commit --amend --no-edit
+git push --force-with-lease
+```
+
+IMPORTANT: Use `--force-with-lease` (not `--force`) for safety. Amend the commit rather than creating a new one so the release stays as a single version-bump commit.
+
+3. Re-trigger the workflow:
+
+```bash
+gh workflow run .github/workflows/release.yaml \
+  -f version=<version> \
+  -f latest=<true|false> \
+  --ref <branch>
+```
+
+Where `<branch>` is the current branch (usually `main`). Set `latest` to `false` for RC releases, `true` for stable releases that are newer than the current latest tag.
+
+4. Return to Step 4 to monitor the new run.
+
+If no code fix is needed (transient failure), skip straight to re-triggering the workflow (step 3 above).
+
+### 5d. Retry limit
+
+If the workflow has failed **5 times**, stop and report all errors to the user. Ask whether they want to continue retrying or abort the release. Do not retry indefinitely.
+
+## Step 6: Report Success
+
+When the workflow completes successfully:
+1. Print the GitHub Actions run URL
+2. Print the new version number
+3. Suggest running post-release testing: "Run `/project:post-release-testing` to verify the release works correctly."
+
+## Important Notes
+
+- The product name is "Gigacode" (capital G, lowercase c). The CLI binary is `gigacode` (lowercase).
+- Do not include co-authors in any commit messages.
+- Use conventional commits style (e.g., `chore(release): update version to X.Y.Z`).
+- Keep commit messages to a single line.
+- The release script requires `tsx` to run (it's a TypeScript file with a shebang).
+- Always work on the current branch. Releases are typically cut from `main`.
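
Note on `release.md`, Step 1: the RC numbering rule (use `rc.1` when no tags exist, otherwise increment the highest existing number) could be sketched as follows. `next_rc_number` is a hypothetical helper, not part of `scripts/release/main.ts`; it reads tag names on stdin so it can be fed from the `git tag -l` command shown in the doc.

```bash
#!/bin/bash
# Hypothetical helper: given a base version and tag names on stdin,
# print the next RC number for that base version.
next_rc_number() {
  local base="$1" highest=0 tag n
  while IFS= read -r tag || [[ -n "$tag" ]]; do
    case "$tag" in
      "v${base}-rc."*)
        n="${tag##*-rc.}"            # text after the last "-rc."
        # Skip suffixes that are not plain integers; 10# forces base 10
        if [[ "$n" =~ ^[0-9]+$ ]] && (( 10#$n > highest )); then
          highest=$(( 10#$n ))
        fi
        ;;
    esac
  done
  echo $(( highest + 1 ))
}

# Usage:
#   git tag -l "v0.2.0-rc.*" | next_rc_number 0.2.0
```

Comparing numerically rather than taking the last line of `sort -V` avoids depending on version-sort support in the local `sort`.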
--- a/.dockerignore
+++ b/.dockerignore
@ -0,0 +1,34 @@
+# Build outputs
+target/
+dist/
+build/
+
+# Dependencies
+**/node_modules/
+
+# Cache
+.cache/
+.turbo/
+**/.turbo/
+*.tsbuildinfo
+.pnpm-store/
+coverage/
+
+# Environment
+.env
+.env.*
+.foundry/
+
+# IDE
+.idea/
+.vscode/
+
+# OS
+.DS_Store
+
+# Git
+.git/
+
+# Tests
+**/test/
+**/tests/
--- a/.env.development.example
+++ b/.env.development.example
@ -0,0 +1,34 @@
+# Foundry local development environment.
+# Copy ~/misc/the-foundry.env to .env in the repo root to populate secrets.
+# .env is gitignored — never commit it. The source of truth is ~/misc/the-foundry.env.
+#
+# Docker Compose (just foundry-dev) and the justfile (set dotenv-load := true)
+# both read .env automatically.
+
+APP_URL=http://localhost:4173
+BETTER_AUTH_URL=http://localhost:4173
+BETTER_AUTH_SECRET=sandbox-agent-foundry-development-only-change-me
+GITHUB_REDIRECT_URI=http://localhost:4173/v1/auth/callback/github
+
+# Fill these in when enabling live GitHub OAuth.
+GITHUB_CLIENT_ID=
+GITHUB_CLIENT_SECRET=
+
+# Fill these in when enabling GitHub App-backed org installation and repo import.
+GITHUB_APP_ID=
+GITHUB_APP_CLIENT_ID=
+GITHUB_APP_CLIENT_SECRET=
+# Store PEM material as a quoted single-line value with \n escapes.
+GITHUB_APP_PRIVATE_KEY=
+# Webhook secret for verifying GitHub webhook payloads.
+# Use smee.io for local development: https://smee.io/new
+GITHUB_WEBHOOK_SECRET=
+# Required for local GitHub webhook forwarding in compose.dev.
+SMEE_URL=
+SMEE_TARGET=http://backend:7741/v1/webhooks/github
+
+# Fill these in when enabling live Stripe billing.
+STRIPE_SECRET_KEY=
+STRIPE_PUBLISHABLE_KEY=
+STRIPE_WEBHOOK_SECRET=
+STRIPE_PRICE_TEAM=
--- a/.github/media/agent-diagram.gif
+++ b/.github/media/agent-diagram.gif
--- a/.github/media/banner.png
+++ b/.github/media/banner.png
--- a/.github/media/gigacode-header.jpeg
+++ b/.github/media/gigacode-header.jpeg
--- a/.github/media/inspector.png
+++ b/.github/media/inspector.png
--- a/.github/workflows/ci.yaml
+++ b/.github/workflows/ci.yaml
@ -11,6 +11,8 @@ jobs:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: rustfmt, clippy
@ -21,5 +23,43 @@ jobs:
          node-version: 20
          cache: pnpm
      - run: pnpm install
+      - name: Run formatter hooks
+        shell: bash
+        run: |
+          if [ "${{ github.event_name }}" = "pull_request" ]; then
+            git fetch origin "${{ github.base_ref }}" --depth=1
+            diff_range="origin/${{ github.base_ref }}...HEAD"
+          elif [ "${{ github.event_name }}" = "push" ] && [ "${{ github.event.before }}" != "0000000000000000000000000000000000000000" ]; then
+            diff_range="${{ github.event.before }}...${{ github.sha }}"
+          else
+            diff_range="HEAD^...HEAD"
+          fi
+
+          mapfile -t changed_files < <(
+            git diff --name-only --diff-filter=ACMR "$diff_range" \
+              | grep -E '\.(cjs|cts|js|jsx|json|jsonc|mjs|mts|rs|ts|tsx)$' \
+              || true
+          )
+
+          if [ ${#changed_files[@]} -eq 0 ]; then
+            echo "No formatter-managed files changed."
+            exit 0
+          fi
+
+          args=()
+          for file in "${changed_files[@]}"; do
+            args+=(--file "$file")
+          done
+
+          pnpm exec lefthook run pre-commit --no-stage-fixed --fail-on-changes "${args[@]}"
+      - run: npm install -g tsx
      - name: Run checks
-        run: ./scripts/release/main.ts --version 0.0.0 --check
+        run: ./scripts/release/main.ts --version 0.0.0 --only-steps run-ci-checks
+      - name: Run ACP v1 server tests
+        run: |
+          cargo test -p sandbox-agent-agent-management
+          cargo test -p sandbox-agent --test v1_api
+          cargo test -p sandbox-agent --test v1_agent_process_matrix
+          cargo test -p sandbox-agent --lib
+      - name: Run SDK tests
+        run: pnpm --dir sdks/typescript test
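
Note on `ci.yaml`: the "Run formatter hooks" step picks a diff range from the event type before listing changed files. The branch logic can be read in isolation as the sketch below. `select_diff_range` is a hypothetical name, and the four arguments stand in for `github.event_name`, `github.base_ref`, `github.event.before`, and `github.sha`.

```bash
#!/bin/bash
# Hypothetical restatement of the workflow's diff-range selection.
select_diff_range() {
  local event_name="$1" base_ref="$2" before="$3" sha="$4"
  # An all-zero "before" SHA means there is no previous commit to compare
  # against (for example the first push of a new branch).
  local zero_sha="0000000000000000000000000000000000000000"
  if [[ "$event_name" == "pull_request" ]]; then
    echo "origin/${base_ref}...HEAD"
  elif [[ "$event_name" == "push" && "$before" != "$zero_sha" ]]; then
    echo "${before}...${sha}"
  else
    echo "HEAD^...HEAD"
  fi
}

# Usage:
#   git diff --name-only --diff-filter=ACMR "$(select_diff_range push "" abc123 def456)"
```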
--- a/.github/workflows/claude-code-review.yml
+++ b/.github/workflows/claude-code-review.yml
@ -0,0 +1,52 @@
+name: Claude Code Review
+
+on:
+  pull_request:
+    types: [opened, synchronize, ready_for_review, reopened]
+    # Optional: Only run on specific file changes
+    # paths:
+    #   - "src/**/*.ts"
+    #   - "src/**/*.tsx"
+    #   - "src/**/*.js"
+    #   - "src/**/*.jsx"
+
+jobs:
+  claude-review:
+    # Optional: Filter by PR author
+    # if: |
+    #   github.event.pull_request.user.login == 'external-contributor' ||
+    #   github.event.pull_request.user.login == 'new-developer' ||
+    #   github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'
+
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: read
+      issues: read
+      id-token: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "20"
+
+      - name: Install sandbox-agent skill
+        run: npx skills add rivet-dev/skills -s sandbox-agent --yes
+
+      - name: Run Claude Code Review
+        id: claude-review
+        uses: anthropics/claude-code-action@v1
+        with:
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
+          plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
+          plugins: 'code-review@claude-code-plugins'
+          prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'
+          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
+          # or https://code.claude.com/docs/en/cli-reference for available options
+
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@ -0,0 +1,47 @@
+name: Claude Code
+
+on:
+  issue_comment:
+    types: [created]
+  pull_request_review_comment:
+    types: [created]
+  issues:
+    types: [opened, assigned]
+  pull_request_review:
+    types: [submitted]
+
+jobs:
+  claude:
+    if: |
+      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
+      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
+      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
+      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: read
+      issues: read
+      id-token: write
+      actions: read
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "20"
+
+      - name: Install sandbox-agent skill
+        run: npx skills add rivet-dev/skills -s sandbox-agent --yes
+
+      - name: Run Claude Code
+        id: claude
+        uses: anthropics/claude-code-action@v1
+        with:
+          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
+          additional_permissions: |
+            actions: read
--- a/.github/workflows/release.yaml
+++ b/.github/workflows/release.yaml
@ -25,8 +25,6 @@ defaults:
 env:
  # Disable incremental compilation for faster from-scratch builds
  CARGO_INCREMENTAL: 0
-  # Skip inspector frontend for CI (not needed for type checking)
-  SANDBOX_AGENT_SKIP_INSPECTOR: 1
  # Skip OpenAPI generation in CI (use pre-committed docs/openapi.json)
  SKIP_OPENAPI_GEN: 1

@ -91,6 +89,7 @@ jobs:
    needs: [setup]
    if: ${{ !inputs.reuse_engine_version }}
    strategy:
+      fail-fast: false
      matrix:
        include:
          - platform: linux
@ -98,6 +97,11 @@ jobs:
            target: x86_64-unknown-linux-musl
            binary_ext: ""
            arch: x86_64
+          - platform: linux
+            runner: depot-ubuntu-24.04-arm-8
+            target: aarch64-unknown-linux-musl
+            binary_ext: ""
+            arch: aarch64
          - platform: windows
            runner: depot-ubuntu-24.04-8
            target: x86_64-pc-windows-gnu
@ -127,8 +131,8 @@ jobs:
          # Use Docker BuildKit
          export DOCKER_BUILDKIT=1

-          # Build the binary using our Dockerfile
-          docker/release/build.sh ${{ matrix.target }}
+          # Build the binary using our Dockerfile with version
+          docker/release/build.sh ${{ matrix.target }} ${{ github.event.inputs.version }}

          # Make sure dist directory exists and binary is there
          ls -la dist/
@ -143,12 +147,13 @@ jobs:
          sudo apt-get install -y unzip curl

          # Install AWS CLI
-          curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
-          unzip awscliv2.zip
+          curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscli.zip"
+          unzip awscli.zip
          sudo ./aws/install --update

          COMMIT_SHA_SHORT="${GITHUB_SHA::7}"
          BINARY_PATH="dist/sandbox-agent-${{ matrix.target }}${{ matrix.binary_ext }}"
+          GIGACODE_PATH="dist/gigacode-${{ matrix.target }}${{ matrix.binary_ext }}"

          # Must specify --checksum-algorithm for compatibility with R2
          aws s3 cp \
@ -158,19 +163,37 @@ jobs:
            --endpoint-url https://2a94c6a0ced8d35ea63cddc86c2681e7.r2.cloudflarestorage.com \
            --checksum-algorithm CRC32

+          aws s3 cp \
+            "${GIGACODE_PATH}" \
+            "s3://rivet-releases/sandbox-agent/${COMMIT_SHA_SHORT}/binaries/gigacode-${{ matrix.target }}${{ matrix.binary_ext }}" \
+            --region auto \
+            --endpoint-url https://2a94c6a0ced8d35ea63cddc86c2681e7.r2.cloudflarestorage.com \
+            --checksum-algorithm CRC32
+
  docker:
    name: "Build & Push Docker Images"
    needs: [setup]
    if: ${{ !inputs.reuse_engine_version }}
    strategy:
+      fail-fast: false
      matrix:
        include:
          - platform: linux/arm64
            runner: depot-ubuntu-24.04-arm-8
-            arch_suffix: -arm64
+            tag_suffix: -arm64
+            dockerfile: docker/runtime/Dockerfile
          - platform: linux/amd64
            runner: depot-ubuntu-24.04-8
-            arch_suffix: -amd64
+            tag_suffix: -amd64
+            dockerfile: docker/runtime/Dockerfile
+          - platform: linux/arm64
+            runner: depot-ubuntu-24.04-arm-8
+            tag_suffix: -full-arm64
+            dockerfile: docker/runtime/Dockerfile.full
+          - platform: linux/amd64
+            runner: depot-ubuntu-24.04-8
+            tag_suffix: -full-amd64
+            dockerfile: docker/runtime/Dockerfile.full
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
@ -192,8 +215,8 @@ jobs:
        with:
          context: .
          push: true
-          tags: rivetdev/sandbox-agent:${{ steps.vars.outputs.sha_short }}${{ matrix.arch_suffix }}
-          file: docker/runtime/Dockerfile
+          tags: rivetdev/sandbox-agent:${{ steps.vars.outputs.sha_short }}${{ matrix.tag_suffix }}
+          file: ${{ matrix.dockerfile }}
          platforms: ${{ matrix.platform }}
          build-args: |
            TARGETARCH=${{ contains(matrix.platform, 'arm64') && 'arm64' || 'amd64' }}
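
Review note: the `TARGETARCH` expression above picks `arm64` when the platform string contains `arm64` and falls back to `amd64` otherwise. The same decision can be sketched in bash; `target_arch` is a hypothetical helper name, not something defined in this repository.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the workflow expression:
#   contains(matrix.platform, 'arm64') && 'arm64' || 'amd64'
target_arch() {
  case "$1" in
    *arm64*) echo "arm64" ;;
    *)       echo "amd64" ;;
  esac
}

for platform in linux/arm64 linux/amd64; do
  echo "$platform -> TARGETARCH=$(target_arch "$platform")"
done
```

Any platform string that does not contain `arm64` resolves to `amd64`, so a future third architecture would need its own branch.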
--- a/.github/workflows/skill-generator.yml
+++ b/.github/workflows/skill-generator.yml
@ -0,0 +1,56 @@
+name: sync-sandbox-agent-skill
+
+on:
+  push:
+    branches: [main]
+  workflow_dispatch:
+
+jobs:
+  generate:
+    runs-on: ubuntu-24.04
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 20
+
+      - name: Generate skill artifacts
+        run: node scripts/skill-generator/generate.js
+
+      - name: Sync to skills repo
+        env:
+          GH_TOKEN: ${{ secrets.RIVET_GITHUB_PAT }}
+        run: |
+          if [ -z "$GH_TOKEN" ]; then
+            echo "::error::RIVET_GITHUB_PAT secret is not set"
+            exit 1
+          fi
+
+          # Validate token before proceeding
+          if ! gh auth status 2>/dev/null; then
+            echo "::error::RIVET_GITHUB_PAT is invalid or expired. Rotate the token at https://github.com/settings/tokens"
+            exit 1
+          fi
+
+          git config --global user.name "github-actions[bot]"
+          git config --global user.email "github-actions[bot]@users.noreply.github.com"
+
+          # Clone public repo, configure auth via gh credential helper
+          gh auth setup-git
+          git clone https://github.com/rivet-dev/skills.git /tmp/rivet-skills
+
+          mkdir -p /tmp/rivet-skills/skills/sandbox-agent
+          rm -rf /tmp/rivet-skills/skills/sandbox-agent/*
+          cp -R scripts/skill-generator/dist/* /tmp/rivet-skills/skills/sandbox-agent/
+
+          cd /tmp/rivet-skills
+          git add skills/sandbox-agent
+
+          if git diff --cached --quiet; then
+            echo "No skill changes to publish"
+            exit 0
+          fi
+
+          git commit -m "chore: update sandbox-agent skill"
+          git push
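
Review note: the sync step above publishes only when `git diff --cached --quiet` reports staged changes. This is a minimal sketch of that exit-code check, assuming `git` is on the path; the temporary repository, file name, and identity values are placeholders for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the sketch does not touch a real checkout.
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "example"
git config user.email "example@example.invalid"

echo "v1" > skill.md
git add skill.md
git commit -q -m "initial"

# Re-copying identical content stages nothing, so the check exits 0.
echo "v1" > skill.md
git add skill.md
if git diff --cached --quiet; then unchanged="no changes"; else unchanged="changes"; fi

# Different content is staged, so the check exits 1.
echo "v2" > skill.md
git add skill.md
if git diff --cached --quiet; then changed="no changes"; else changed="changes"; fi

echo "identical copy: $unchanged; modified copy: $changed"
```

Because the check runs after `git add`, regenerating identical artifacts produces no commit and no push, which keeps repeated workflow runs quiet.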
--- a/.gitignore
+++ b/.gitignore
@ -15,6 +15,9 @@ yarn.lock
 .astro/
 *.tsbuildinfo
 .turbo/
+**/.turbo/
+.pnpm-store/
+coverage/

 # Environment
 .env
@ -39,3 +42,21 @@ npm-debug.log*
 # Rust
 Cargo.lock
 **/*.rs.bk
+
+# Agent runtime directories
+.agents/
+.claude/
+.opencode/
+
+# Example temp files
+.tmp-upload/
+*.db
+.foundry/
+
+# CLI binaries (downloaded during npm publish)
+sdks/cli/platforms/*/bin/
+
+# Foundry desktop app build artifacts
+foundry/packages/desktop/frontend-dist/
+foundry/packages/desktop/src-tauri/sidecars/
+.context/
--- a/.mcp.json
+++ b/.mcp.json
@ -0,0 +1,8 @@
+{
+  "mcpServers": {
+    "everything": {
+      "args": ["@modelcontextprotocol/server-everything"],
+      "command": "npx"
+    }
+  }
+}
--- a/.npmrc
+++ b/.npmrc
@ -0,0 +1 @@
+auto-install-peers=false
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -1,68 +1,80 @@
 # Instructions

-## SDK Modes
+## Naming and Ownership

-There are two ways to work with the SDKs:
+- This repository/product is **Sandbox Agent**.
+- **Gigacode** is a separate user-facing UI/client, not the server product name.
+- Gigacode integrates with Sandbox Agent via the OpenCode-compatible surface (`/opencode/*`) when that compatibility layer is enabled.
+- Canonical extension namespace/domain string is `sandboxagent.dev` (no hyphen).
+- Canonical custom ACP extension method prefix is `_sandboxagent/...` (no hyphen).

- **Embedded**: Spawns the `sandbox-agent` server as a subprocess on a unique port and communicates with it locally. Useful for local development or when running the SDK and agent in the same environment.
- **Server**: Connects to a remotely running `sandbox-agent` server. The server is typically running inside a sandbox (e.g., Docker, E2B, Daytona, Vercel Sandboxes) and the SDK connects to it over HTTP.
+## Docs Terminology

-## Agent Schemas
+- Never mention "ACP" in user-facing docs (`docs/**/*.mdx`) except in docs that are specifically about ACP itself (e.g. `docs/acp-http-client.mdx`).
+- Never expose underlying protocol method names (e.g. `session/request_permission`, `session/create`, `_sandboxagent/session/detach`) in non-ACP docs. Describe the behavior in user-facing terms instead.
+- Do not describe the underlying protocol implementation in docs. Only document the SDK surface (methods, types, options). ACP protocol details belong exclusively in ACP-specific pages.
+- Do not use em dashes (`—`) in docs. Use commas, periods, or parentheses instead.

-Agent schemas (Claude Code, Codex, OpenCode, Amp) are available for reference in `resources/agent-schemas/artifacts/json-schema/`.
+### Docs Source Of Truth (HTTP/CLI)

-Extraction methods:
- **Claude**: Uses `claude --output-format json --json-schema` CLI command
- **Codex**: Uses `codex app-server generate-json-schema` CLI command
- **OpenCode**: Fetches from GitHub OpenAPI spec
- **Amp**: Scrapes from `https://ampcode.com/manual/appendix?preview#message-schema`
+- For HTTP/CLI docs/examples, source of truth is:
+  - `server/packages/sandbox-agent/src/router.rs`
+  - `server/packages/sandbox-agent/src/cli.rs`
+- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy session REST APIs).

-All extractors have fallback schemas for when CLI/URL is unavailable.
+## Change Tracking

-Research on how different agents operate (CLI flags, streaming formats, HITL patterns, etc.) is in `research/agents/`. When adding or making changes to agent docs, follow the same structure as existing files.
+- If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
+- Keep CLI subcommands and HTTP endpoints in sync.
+- Update `docs/cli.mdx` when CLI behavior changes.
+- Regenerate `docs/openapi.json` when HTTP contracts change.
+- Keep `docs/inspector.mdx` and `docs/sdks/typescript.mdx` aligned with implementation.
+- Append blockers/decisions to `research/acp/friction.md` during ACP work.
+- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
+- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.

-Universal schema guidance:
- The universal schema should cover the full feature set of all agents.
- Conversions must be best-effort overlap without being lossy; preserve raw payloads when needed.
- **The mock agent acts as the reference implementation** for correct event behavior. Real agents should use synthetic events to match the mock agent's event patterns (e.g., emitting both daemon synthetic and agent native `session.started` events, proper `item.started` → `item.delta` → `item.completed` sequences).
+## Docker Test Image

-## Spec Tracking
+- Docker-backed Rust and TypeScript tests build `docker/test-agent/Dockerfile` directly in-process and cache the image tag only in memory (`OnceLock` in Rust, module-level variable in TypeScript).
+- Do not add cross-process image-build scripts unless there is a concrete need for them.

- Keep CLI subcommands in sync with every HTTP endpoint.
- Update `CLAUDE.md` to keep CLI endpoints in sync with HTTP API changes.
- When changing the HTTP API, update the TypeScript SDK and CLI together.
- Do not make breaking changes to API endpoints.
- When changing API routes, ensure the HTTP/SSE test suite has full coverage of every route.
- When agent schema changes, ensure API tests cover the new schema and event shapes end-to-end.
- When the universal schema changes, update mock-agent events to cover the new fields or event types.
- Update `docs/conversion.md` whenever agent-native schema terms, synthetic events, identifier mappings, or conversion logic change.
- Never use synthetic data or mocked responses in tests.
- Never manually write agent types; always use generated types in `resources/agent-schemas/`. If types are broken, fix the generated types.
- The universal schema must provide consistent behavior across providers; avoid requiring frontend/client logic to special-case agents.
- The UI must reflect every field in AgentCapabilities; keep it in sync with the README feature matrix and `agent_capabilities_for`.
- When parsing agent data, if something is unexpected or does not match the schema, bail out and surface the error rather than trying to continue with partial parsing.
- When defining the universal schema, choose the option most compatible with native agent APIs, and add synthetics to fill gaps for other agents.
- Use `docs/glossary.md` as the source of truth for universal schema terminology and keep it updated alongside schema changes.
- On parse failures, emit an `agent.unparsed` event (source=daemon, synthetic=true) and treat it as a test failure. Preserve raw payloads when `include_raw=true`.
- Track subagent support in `docs/conversion.md`. For now, normalize subagent activity into normal message/tool flow, but revisit explicit subagent modeling later.
+## Common Software Sync

-### CLI ⇄ HTTP endpoint map (keep in sync)
+- These three files must stay in sync:
+  - `docs/common-software.mdx` (user-facing documentation)
+  - `docker/test-common-software/Dockerfile` (packages installed in the test image)
+  - `server/packages/sandbox-agent/tests/common_software.rs` (test assertions)
+- When adding or removing software from `docs/common-software.mdx`, also add/remove the corresponding `apt-get install` line in the Dockerfile and add/remove the test in `common_software.rs`.
+- Run `cargo test -p sandbox-agent --test common_software` to verify.

- `sandbox-agent api agents list` ↔ `GET /v1/agents`
- `sandbox-agent api agents install` ↔ `POST /v1/agents/{agent}/install`
- `sandbox-agent api agents modes` ↔ `GET /v1/agents/{agent}/modes`
- `sandbox-agent api sessions list` ↔ `GET /v1/sessions`
- `sandbox-agent api sessions create` ↔ `POST /v1/sessions/{sessionId}`
- `sandbox-agent api sessions send-message` ↔ `POST /v1/sessions/{sessionId}/messages`
- `sandbox-agent api sessions send-message-stream` ↔ `POST /v1/sessions/{sessionId}/messages/stream`
- `sandbox-agent api sessions events` / `get-messages` ↔ `GET /v1/sessions/{sessionId}/events`
- `sandbox-agent api sessions events-sse` ↔ `GET /v1/sessions/{sessionId}/events/sse`
- `sandbox-agent api sessions reply-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reply`
- `sandbox-agent api sessions reject-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reject`
- `sandbox-agent api sessions reply-permission` ↔ `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply`
+## Install Version References

-## Git Commits
-
- Do not include any co-authors in commit messages (no `Co-Authored-By` lines)
- Use conventional commits style (e.g., `feat:`, `fix:`, `docs:`, `chore:`, `refactor:`)
- Keep commit messages to a single line
+- Channel policy:
+  - Sandbox Agent install/version references use a pinned minor channel `0.N.x` (for curl URLs and `sandbox-agent` / `@sandbox-agent/cli` npm/bun installs).
+  - Gigacode install/version references use `latest` (for `@sandbox-agent/gigacode` install/run commands and `gigacode-install.*` release promotion).
+  - Release promotion policy: `latest` releases must still update `latest`; when a release is `latest`, Sandbox Agent must also be promoted to the matching minor channel `0.N.x`.
+- Keep every install-version reference below in sync whenever versions/channels change:
+  - `README.md`
+  - `docs/acp-http-client.mdx`
+  - `docs/cli.mdx`
+  - `docs/quickstart.mdx`
+  - `docs/sdk-overview.mdx`
+  - `docs/react-components.mdx`
+  - `docs/session-persistence.mdx`
+  - `docs/deploy/local.mdx`
+  - `docs/deploy/cloudflare.mdx`
+  - `docs/deploy/vercel.mdx`
+  - `docs/deploy/daytona.mdx`
+  - `docs/deploy/e2b.mdx`
+  - `docs/deploy/docker.mdx`
+  - `frontend/packages/website/src/components/GetStarted.tsx`
+  - `.claude/commands/post-release-testing.md`
+  - `examples/cloudflare/Dockerfile`
+  - `examples/daytona/src/index.ts`
+  - `examples/shared/src/docker.ts`
+  - `examples/docker/src/index.ts`
+  - `examples/e2b/src/index.ts`
+  - `examples/vercel/src/index.ts`
+  - `scripts/release/main.ts`
+  - `scripts/release/promote-artifacts.ts`
+  - `scripts/release/sdk.ts`
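
Review note: the pinned minor channel `0.N.x` described in the channel policy above can be derived from a full version string. This is a sketch only; `minor_channel` is a hypothetical helper, and the release scripts listed above may compute the channel differently. It assumes a plain `major.minor.patch` version with no pre-release suffix.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map a full version to its pinned minor channel.
minor_channel() {
  local version="$1"
  # Strip the shortest trailing ".<patch>" and append ".x".
  echo "${version%.*}.x"
}

echo "sandbox-agent@$(minor_channel "0.4.2")"
```

With the workspace version `0.4.2` this yields `0.4.x`, matching the install references used in the README changes in this diff.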
--- a/Cargo.toml
+++ b/Cargo.toml
@ -1,23 +1,25 @@
 [workspace]
 resolver = "2"
-members = ["server/packages/*"]
+members = ["server/packages/*", "gigacode"]
+exclude = ["factory/packages/desktop/src-tauri", "foundry/packages/desktop/src-tauri"]

 [workspace.package]
-version = "0.1.0"
+version = "0.4.2"
 edition = "2021"
 authors = [ "Rivet Gaming, LLC <developer@rivet.gg>" ]
 license = "Apache-2.0"
 repository = "https://github.com/rivet-dev/sandbox-agent"
-description = "Universal API for automatic coding agents in sandboxes. Supprots Claude Code, Codex, OpenCode, and Amp."
+description = "Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, and Amp."

 [workspace.dependencies]
 # Internal crates
-sandbox-agent = { version = "0.1.0", path = "server/packages/sandbox-agent" }
-sandbox-agent-error = { version = "0.1.0", path = "server/packages/error" }
-sandbox-agent-agent-management = { version = "0.1.0", path = "server/packages/agent-management" }
-sandbox-agent-agent-credentials = { version = "0.1.0", path = "server/packages/agent-credentials" }
-sandbox-agent-universal-agent-schema = { version = "0.1.0", path = "server/packages/universal-agent-schema" }
-sandbox-agent-extracted-agent-schemas = { version = "0.1.0", path = "server/packages/extracted-agent-schemas" }
+sandbox-agent = { version = "0.4.2", path = "server/packages/sandbox-agent" }
+sandbox-agent-error = { version = "0.4.2", path = "server/packages/error" }
+sandbox-agent-agent-management = { version = "0.4.2", path = "server/packages/agent-management" }
+sandbox-agent-agent-credentials = { version = "0.4.2", path = "server/packages/agent-credentials" }
+sandbox-agent-opencode-adapter = { version = "0.4.2", path = "server/packages/opencode-adapter" }
+sandbox-agent-opencode-server-manager = { version = "0.4.2", path = "server/packages/opencode-server-manager" }
+acp-http-adapter = { version = "0.4.2", path = "server/packages/acp-http-adapter" }

 # Serialization
 serde = { version = "1.0", features = ["derive"] }
@ -31,7 +33,7 @@ schemars = "0.8"
 utoipa = { version = "4.2", features = ["axum_extras"] }

 # Web framework
-axum = "0.7"
+axum = { version = "0.7", features = ["ws"] }
 tower = { version = "0.5", features = ["util"] }
 tower-http = { version = "0.5", features = ["cors", "trace"] }

@ -68,6 +70,8 @@ zip = { version = "0.6", default-features = false, features = ["deflate"] }
 url = "2.5"
 regress = "0.10"
 include_dir = "0.7"
+base64 = "0.22"
+toml_edit = "0.22"

 # Code generation (build deps)
 typify = "0.4"
--- a/README.md
+++ b/README.md
@ -1,81 +1,92 @@
-# Sandbox Agent SDK
+<p align="center">
+  <img src=".github/media/banner.png" alt="Sandbox Agent SDK" />
+</p>

-Universal API for automatic coding agents in sandboxes. Supports Claude Code, Codex, OpenCode, and Amp.
+<h3 align="center">Run Coding Agents in Sandboxes. Control Them Over HTTP.</h3>

-Docs: https://rivet.dev/docs/
+<p align="center">
+  A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Cursor, Amp, or Pi — streaming events, handling permissions, managing sessions.
+</p>

- **Any coding agent**: Universal API to interact with all agents with full feature coverage
- **Server or SDK mode**: Run as an HTTP server or with the TypeScript SDK
- **Universal session schema**: Universal schema to store agent transcripts
- **Supports your sandbox provider**: Daytona, E2B, Vercel Sandboxes, and more
- **Lightweight, portable Rust binary**: Install anywhere with 1 curl command
- **Automatic agent installation**: Agents are installed on-demand when first used
- **OpenAPI spec**: https://rivet.dev/docs/api
+<p align="center">
+  <a href="https://sandboxagent.dev/docs">Documentation</a> — <a href="https://sandboxagent.dev/docs/api-reference">API Reference</a> — <a href="https://rivet.dev/discord">Discord</a>
+</p>

-Roadmap:
+<p align="center">
+  <em><strong>Experimental:</strong> <a href="./gigacode/">Gigacode</a> — use OpenCode's TUI with any coding agent.</em>
+</p>

- [ ] Python SDK
- [ ] Automatic MCP & skill & hook configuration
- [ ] Todo lists
+## Why Sandbox Agent?

-## Agent Compatibility
+Running coding agents remotely is hard. Existing SDKs assume local execution, SSH breaks TTY handling and streaming, and every agent has a different API. Building from scratch means reimplementing everything for each coding agent.

-| Feature | [Claude Code*](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview) | [Codex](https://github.com/openai/codex) | [OpenCode](https://github.com/opencode-ai/opencode) | [Amp](https://ampcode.com) |
-|---------|:-----------:|:-----:|:--------:|:---:|
-| Stability | Stable | Stable | Experimental | Experimental |
-| Text Messages | ✓ | ✓ | ✓ | ✓ |
-| Tool Calls | —* | ✓ | ✓ | ✓ |
-| Tool Results | —* | ✓ | ✓ | ✓ |
-| Questions (HITL) | —* | | ✓ | |
-| Permissions (HITL) | —* | | ✓ | |
-| Images | | ✓ | ✓ | |
-| File Attachments | | ✓ | ✓ | |
-| Session Lifecycle | | ✓ | ✓ | |
-| Error Events | | ✓ | ✓ | ✓ |
-| Reasoning/Thinking | | ✓ | | |
-| Command Execution | | ✓ | | |
-| File Changes | | ✓ | | |
-| MCP Tools | | ✓ | | |
-| Streaming Deltas | | ✓ | ✓ | |
+Sandbox Agent solves three problems:

-* Claude headless CLI does not natively support tool calls/results or HITL questions/permissions yet; these are WIP.
+1. **Coding agents need sandboxes** — You can't let AI execute arbitrary code on your production servers. Coding agents need isolated environments, but existing SDKs assume local execution. Sandbox Agent is a server that runs inside the sandbox and exposes HTTP/SSE.

-Want support for another agent? [Open an issue](https://github.com/anthropics/sandbox-agent/issues/new) to request it.
+2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Cursor, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
+
+3. **Sessions are ephemeral** — Agent transcripts live in the sandbox. When the process ends, you lose everything. Sandbox Agent streams events in a universal schema to your storage. Persist to Postgres, ClickHouse, or [Rivet](https://rivet.dev). Replay later, audit everything.
+
+## Features
+
+- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Cursor, Amp, and Pi with full feature coverage
+- **Universal Session Schema**: Standardized schema that normalizes all agent event formats for storage and replay
+- **Runs Inside Any Sandbox**: Lightweight static Rust binary. One curl command to install inside E2B, Daytona, Vercel Sandboxes, or Docker
+- **Server or SDK Mode**: Run as an HTTP server or embed with the TypeScript SDK
+- **OpenAPI Spec**: [Well documented](https://sandboxagent.dev/docs/api-reference) and easy to integrate from any language
+- **OpenCode SDK & UI Support** *(Experimental)*: [Connect OpenCode CLI, SDK, or web UI](https://sandboxagent.dev/docs/opencode-compatibility) to control agents through familiar OpenCode tooling

 ## Architecture

 ![Agent Architecture Diagram](./.github/media/agent-diagram.gif)

-The Sandbox Agent acts as a universal adapter between your client application and various coding agents (Claude Code, Codex, OpenCode, Amp). Each agent has its own adapter (e.g., `claude_adapter.rs`) that handles the translation between the universal API and the agent-specific interface.
+The Sandbox Agent acts as a universal adapter between your client application and various coding agents. Each agent has its own adapter that handles the translation between the universal API and the agent-specific interface.

 - **Embedded Mode**: Runs agents locally as subprocesses
 - **Server Mode**: Runs as HTTP server from any sandbox provider

-[Documentation](https://rivet.dev/docs/architecture)
+[Architecture documentation](https://sandboxagent.dev/docs)

 ## Components

- Server: Rust daemon (`sandbox-agent server`) exposing the HTTP + SSE API.
- SDK: TypeScript client with embedded and server modes.
- Inspector: `https://inspect.sandboxagent.dev` for browsing sessions and events.
- CLI: `sandbox-agent` (same binary, plus npm wrapper) mirrors the HTTP endpoints.
+| Component | Description |
+|-----------|-------------|
+| **Server** | Rust daemon (`sandbox-agent server`) exposing the HTTP + SSE API |
+| **SDK** | TypeScript client with embedded and server modes |
+| **Inspector** | Built-in UI for inspecting sessions and events |
+| **CLI** | `sandbox-agent` (same binary, plus npm wrapper) mirrors the HTTP endpoints |

-## Quickstart
+## Get Started
+
+Choose the installation method that works best for your use case.

 ### Skill

 Install skill with:

-```
-npx skills add https://sandboxagent.dev/docs
+```bash
+npx skills add rivet-dev/skills -s sandbox-agent
 ```

-### SDK
+```bash
+bunx skills add rivet-dev/skills -s sandbox-agent
+```
+
+### TypeScript SDK
+
+Import the SDK directly into your Node or browser application. Full type safety and streaming support.

 **Install**

 ```bash
-npm install sandbox-agent
+npm install sandbox-agent@0.4.x
+```
+
+```bash
+bun add sandbox-agent@0.4.x
+# Optional: allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
+bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
 ```

 **Setup**
@ -107,7 +118,6 @@ const agents = await client.listAgents();
 await client.createSession("demo", {
  agent: "codex",
  agentMode: "default",
-  permissionMode: "plan",
 });

 await client.postMessage("demo", { message: "Hello from the SDK." });
@ -117,15 +127,15 @@ for await (const event of client.streamEvents("demo", { offset: 0 })) {
 }
 ```

-Full guide: https://rivet.dev/docs/sdks/typescript
+[SDK documentation](https://sandboxagent.dev/docs/sdks/typescript) — [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions)

-### Server
+### HTTP Server

-Install the binary (fastest installation, no Node.js required):
+Run as an HTTP server and connect from any language. Deploy to E2B, Daytona, Vercel, or your own infrastructure.

 ```bash
 # Install it
-curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
+curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
 # Run it
 sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
 ```
@ -133,10 +143,7 @@ sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
 Optional: preinstall agent binaries (no server required; they will be installed lazily on first use if you skip this):

 ```bash
-sandbox-agent install-agent claude
-sandbox-agent install-agent codex
-sandbox-agent install-agent opencode
-sandbox-agent install-agent amp
+sandbox-agent install-agent --all
 ```

 To disable auth locally:
@ -145,15 +152,20 @@ To disable auth locally:
 sandbox-agent server --no-token --host 127.0.0.1 --port 2468
 ```

-Docs: https://rivet.dev/docs/quickstart
-Integration guides: https://rivet.dev/docs/deployments
+[Quickstart](https://sandboxagent.dev/docs/quickstart) — [Deployment guides](https://sandboxagent.dev/docs/deploy)

 ### CLI

 Install the CLI wrapper (optional but convenient):

 ```bash
-npm install -g @sandbox-agent/cli
+npm install -g @sandbox-agent/cli@0.4.x
+```
+
+```bash
+bun add -g @sandbox-agent/cli@0.4.x
+# Allow Bun to run postinstall scripts for native binaries.
+bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
 ```

 Create a session and send a message:
@ -164,69 +176,111 @@ sandbox-agent api sessions send-message my-session --message "Hello" --endpoint
 sandbox-agent api sessions send-message-stream my-session --message "Hello" --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
 ```

-Docs: https://rivet.dev/docs/cli
+You can also run the CLI without installing it globally:
+
+```bash
+npx @sandbox-agent/cli@0.4.x --help
+```
+
+```bash
+bunx @sandbox-agent/cli@0.4.x --help
+```
+
+[CLI documentation](https://sandboxagent.dev/docs/cli)
+
+### Inspector
+
+Debug sessions and events with the built-in Inspector UI (e.g., `http://localhost:2468/ui/`).
+
+![Sandbox Agent Inspector](./.github/media/inspector.png)
+
+[Inspector documentation](https://sandboxagent.dev/docs/inspector)
+
+### OpenAPI Specification
+
+[Explore API](https://sandboxagent.dev/docs/api-reference) — [View Specification](https://github.com/rivet-dev/sandbox-agent/blob/main/docs/openapi.json)

 ### Tip: Extract credentials

+Often you need to use your personal API tokens to test agents on sandboxes:
+
 ```bash
 sandbox-agent credentials extract-env --export
 ```

-This prints environment variables for your locally installed agents.
-Docs: https://rivet.dev/docs/quickstart
+This prints environment variables for your API keys (OpenAI, Anthropic, etc.) so you can test with the Sandbox Agent SDK.

-## Project Goals
+## FAQ

-This project aims to solve 3 problems with agents:
+<details>
+<summary><strong>Does this replace the Vercel AI SDK?</strong></summary>

- **Universal Agent API**: Claude Code, Codex, Amp, and OpenCode all have put a lot of work in to the agent scaffold. Each have respective pros and cons and need to be easy to be swapped between.
- **Agent Transcript**: Maintaining agent transcripts is difficult since the agent manages its own sessions. This provides a simpler way to read and retrieve agent transcripts in your system.
- **Agents In Sandboxes**: There are many complications with running agents inside of sandbox providers. This lets you run a simple curl command to spawn an HTTP server for using any agent from within the sandbox.
+No, they're complementary. The AI SDK is for building chat interfaces and calling LLMs. This SDK is for controlling autonomous coding agents that write code and run commands. Use the AI SDK for your UI, and use this SDK when you need an agent to actually write code.
+</details>

-Features out of scope:
+<details>
+<summary><strong>Which coding agents are supported?</strong></summary>
+
+Claude Code, Codex, OpenCode, Cursor, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
+</details>
+
+<details>
+<summary><strong>How is session data persisted?</strong></summary>
+
+This SDK does not handle persisting session data. Events stream in a universal JSON schema that you can persist anywhere. See [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions) for patterns using Postgres or [Rivet Actors](https://rivet.dev).
+</details>
+
+<details>
+<summary><strong>Can I run this locally or does it require a sandbox provider?</strong></summary>
+
+Both. Run locally for development, and deploy to E2B, Daytona, or Vercel Sandboxes for production.
+</details>
+
+<details>
+<summary><strong>Does it support [platform]?</strong></summary>
+
+The server is a single Rust binary that runs anywhere with a curl install. If your platform can run Linux binaries (Docker, VMs, etc.), it works. See the deployment guides for E2B, Daytona, and Vercel Sandboxes.
+</details>
+
+<details>
+<summary><strong>Can I use this with my personal API keys?</strong></summary>
+
+Yes. Use `sandbox-agent credentials extract-env` to extract API keys from your local agent configs (Claude Code, Codex, OpenCode, Amp, Pi) and pass them to the sandbox environment.
+</details>
+
+<details>
+<summary><strong>Why Rust and not [language]?</strong></summary>
+
+Rust gives us a single static binary, fast startup, and predictable memory usage. That makes it easy to run inside sandboxes or in CI without shipping a large runtime, such as Node.js.
+</details>
+
+<details>
+<summary><strong>Why can't I just run coding agents locally?</strong></summary>
+
+You can for development. But in production, you need isolation. Coding agents execute arbitrary code — that can't happen on your servers. Sandboxes provide the isolation; this SDK provides the HTTP API to control coding agents remotely.
+</details>
+
+<details>
+<summary><strong>How is this different from the agent's official SDK?</strong></summary>
+
+Official SDKs assume local execution. They spawn processes and expect interactive terminals. This SDK runs a server inside a sandbox that you connect to over HTTP — designed for remote control from the start.
+</details>
+
+<details>
+<summary><strong>Why not just SSH into the sandbox?</strong></summary>
+
+Coding agents expect interactive terminals with proper TTY handling. SSH with piped commands breaks tool confirmations, streaming output, and human-in-the-loop flows. The SDK handles all of this over a clean HTTP API.
+</details>
+
+## Out of Scope

 - **Storage of sessions on disk**: Sessions are already stored by the respective coding agents on disk. It's assumed that the consumer is streaming data from this machine to an external storage, such as Postgres, ClickHouse, or Rivet.
 - **Direct LLM wrappers**: Use the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction) if you want to implement your own agent from scratch.
 - **Git Repo Management**: Just use git commands or the features provided by your sandbox provider of choice.
- **Sandbox Provider API**: Sandbox providers have many nuanced differences in their API, it does not make sense for us to try to provide a custom layer. Instead, we opt to provide guides that let you integrate this project with sandbox providers.
+- **Sandbox Provider API**: Sandbox providers differ in many nuanced ways across their APIs, so it does not make sense for us to provide a custom layer. Instead, we provide guides that let you integrate this repository with sandbox providers.

-## FAQ
+## Roadmap

-**Why not use PTY?**
-
-PTY-based approaches require parsing terminal escape sequences and dealing with interactive prompts.
-
-The agents we support all have machine-readable output modes (JSONL, HTTP APIs) that provide structured events, making integration more reliable.
-
-**Why not use features that already exist on sandbox provider APIs?**
-
-Sandbox providers focus on infrastructure (containers, VMs, networking).
-
-This project focuses specifically on coding agent orchestration: session management, HITL (human-in-the-loop) flows, and universal event schemas. These concerns are complementary.
-
-**Does it support [platform]?**
-The server is a single Rust binary that runs anywhere with a curl install. If your platform can run Linux binaries (Docker, VMs, etc.), it works. See the deployment guides for E2B, Daytona, Vercel Sandboxes, and Docker.
-
-**Can I use this with my personal API keys?**
-Yes. Use `sandbox-agent credentials extract-env` to extract API keys from your local agent configs (Claude Code, Codex, OpenCode, Amp) and pass them to the sandbox environment.
-
-**Why Rust?**
-Rust gives us a single static binary, fast startup, and predictable memory usage. That makes it
-easy to run inside sandboxes or in CI without shipping a large runtime.
-
-**Why not use stdio/JSON-RPC?**
-
- has benefit of not having to listen on a port
- more difficult to interact with, harder to analyze, doesn't support inspector for debugging
- may add at some point
- Codex does this and Claude has a JSON stream, but HTTP/SSE gives us a consistent API surface and inspector UI.
-
-**Why not AI SDK?**
-
- AI SDK does not provide harness for bieng a fully fledged coding agent
- Fronteir coding agent harnesses have a lot of work put in to complex things like swarms, compaction, etc
-
-**Why not OpenCode server?**
-
- The harnesses do a lot of heavy lifting, but different agents have very different APIs and behavior.
- A universal API lets you swap agents without rewriting your orchestration code.
+- [ ] Python SDK
+- [ ] Automatic MCP & skill & hook configuration
+- [ ] Todo lists
--- a/biome.json
+++ b/biome.json
@ -0,0 +1,7 @@
+{
+  "$schema": "./node_modules/@biomejs/biome/configuration_schema.json",
+  "formatter": {
+    "indentStyle": "space",
+    "lineWidth": 160
+  }
+}
--- a/docker/inspector-dev/Dockerfile
+++ b/docker/inspector-dev/Dockerfile
@ -0,0 +1,7 @@
+FROM node:22-bookworm-slim
+
+RUN npm install -g pnpm@10.28.2
+
+WORKDIR /app
+
+CMD ["bash", "-lc", "pnpm install --filter @sandbox-agent/inspector... && cd frontend/packages/inspector && exec pnpm vite --host 0.0.0.0 --port 5173"]
--- a/docker/release/build.sh
+++ b/docker/release/build.sh
@ -2,6 +2,14 @@
 set -euo pipefail

 TARGET=${1:-x86_64-unknown-linux-musl}
+VERSION=${2:-}
+
+# Build arguments for Docker
+BUILD_ARGS=""
+if [ -n "$VERSION" ]; then
+  BUILD_ARGS="--build-arg SANDBOX_AGENT_VERSION=$VERSION"
+  echo "Building with version: $VERSION"
+fi

 case $TARGET in
  x86_64-unknown-linux-musl)
@ -9,24 +17,35 @@ case $TARGET in
    DOCKERFILE="linux-x86_64.Dockerfile"
    TARGET_STAGE="x86_64-builder"
    BINARY="sandbox-agent-$TARGET"
+    GIGACODE="gigacode-$TARGET"
+    ;;
+  aarch64-unknown-linux-musl)
+    echo "Building for Linux aarch64 musl"
+    DOCKERFILE="linux-aarch64.Dockerfile"
+    TARGET_STAGE="aarch64-builder"
+    BINARY="sandbox-agent-$TARGET"
+    GIGACODE="gigacode-$TARGET"
    ;;
  x86_64-pc-windows-gnu)
    echo "Building for Windows x86_64"
    DOCKERFILE="windows.Dockerfile"
    TARGET_STAGE=""
    BINARY="sandbox-agent-$TARGET.exe"
+    GIGACODE="gigacode-$TARGET.exe"
    ;;
  x86_64-apple-darwin)
    echo "Building for macOS x86_64"
    DOCKERFILE="macos-x86_64.Dockerfile"
    TARGET_STAGE="x86_64-builder"
    BINARY="sandbox-agent-$TARGET"
+    GIGACODE="gigacode-$TARGET"
    ;;
  aarch64-apple-darwin)
    echo "Building for macOS aarch64"
    DOCKERFILE="macos-aarch64.Dockerfile"
    TARGET_STAGE="aarch64-builder"
    BINARY="sandbox-agent-$TARGET"
+    GIGACODE="gigacode-$TARGET"
    ;;
  *)
    echo "Unsupported target: $TARGET"
@ -36,19 +55,22 @@ case $TARGET in

 DOCKER_BUILDKIT=1
 if [ -n "$TARGET_STAGE" ]; then
-  docker build --target "$TARGET_STAGE" -f "docker/release/$DOCKERFILE" -t "sandbox-agent-builder-$TARGET" .
+  docker build --target "$TARGET_STAGE" $BUILD_ARGS -f "docker/release/$DOCKERFILE" -t "sandbox-agent-builder-$TARGET" .
 else
-  docker build -f "docker/release/$DOCKERFILE" -t "sandbox-agent-builder-$TARGET" .
+  docker build $BUILD_ARGS -f "docker/release/$DOCKERFILE" -t "sandbox-agent-builder-$TARGET" .
 fi

 CONTAINER_ID=$(docker create "sandbox-agent-builder-$TARGET")
 mkdir -p dist

 docker cp "$CONTAINER_ID:/artifacts/$BINARY" "dist/"
+docker cp "$CONTAINER_ID:/artifacts/$GIGACODE" "dist/"
 docker rm "$CONTAINER_ID"

 if [[ "$BINARY" != *.exe ]]; then
  chmod +x "dist/$BINARY"
+  chmod +x "dist/$GIGACODE"
 fi

 echo "Binary saved to: dist/$BINARY"
+echo "Binary saved to: dist/$GIGACODE"
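Note on the `build.sh` change above: `BUILD_ARGS` is a plain string that is expanded unquoted, so it relies on shell word splitting to become two `docker build` arguments. A minimal sketch of an array-based alternative follows, assuming Bash; the helper name `collect_build_args` is hypothetical and is not part of the script in this diff.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of build.sh): collect the optional
# docker build arguments in an array so each one stays a single word,
# even if the version string contains unusual characters.
collect_build_args() {
  local version="${1:-}"
  local args=()
  if [ -n "$version" ]; then
    args+=(--build-arg "SANDBOX_AGENT_VERSION=$version")
  fi
  # Print one argument per line; print nothing when no version is given.
  if [ "${#args[@]}" -gt 0 ]; then
    printf '%s\n' "${args[@]}"
  fi
}

# Example call with the version passed as the script's second argument.
collect_build_args "0.4.2"
```

The length check before `printf` avoids expanding an empty array, which older Bash releases reject under `set -u`. Whether the extra robustness is worth it here is a judgment call, since release version strings rarely contain spaces.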
--- a/docker/release/linux-aarch64.Dockerfile
+++ b/docker/release/linux-aarch64.Dockerfile
@ -0,0 +1,81 @@
+# syntax=docker/dockerfile:1.10.0
+
+# Build inspector frontend
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
+# Use Alpine with native musl for ARM64 builds (runs natively on ARM64 runner)
+FROM rust:1.88-alpine AS aarch64-builder
+
+# Accept version as build arg
+ARG SANDBOX_AGENT_VERSION
+ENV SANDBOX_AGENT_VERSION=${SANDBOX_AGENT_VERSION}
+
+# Install dependencies
+RUN apk add --no-cache \
+    musl-dev \
+    clang \
+    llvm-dev \
+    openssl-dev \
+    openssl-libs-static \
+    pkgconfig \
+    git \
+    curl \
+    build-base
+
+# Add musl target
+RUN rustup target add aarch64-unknown-linux-musl
+
+# Set environment variables for native musl build
+ENV CARGO_INCREMENTAL=0 \
+    CARGO_NET_GIT_FETCH_WITH_CLI=true \
+    RUSTFLAGS="-C target-feature=+crt-static"
+
+WORKDIR /build
+
+# Copy the source code
+COPY . .
+
+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
+# Build for Linux with musl (static binary) - aarch64
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/usr/local/cargo/git \
+    --mount=type=cache,target=/build/target \
+    cargo build -p sandbox-agent -p gigacode --release --target aarch64-unknown-linux-musl && \
+    mkdir -p /artifacts && \
+    cp target/aarch64-unknown-linux-musl/release/sandbox-agent /artifacts/sandbox-agent-aarch64-unknown-linux-musl && \
+    cp target/aarch64-unknown-linux-musl/release/gigacode /artifacts/gigacode-aarch64-unknown-linux-musl
+
+# Default command: list the built artifacts
+CMD ["ls", "-la", "/artifacts"]
--- a/docker/release/linux-x86_64.Dockerfile
+++ b/docker/release/linux-x86_64.Dockerfile
@ -1,4 +1,38 @@
 # syntax=docker/dockerfile:1.10.0
+
+# Build inspector frontend
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
 FROM rust:1.88.0 AS base

 # Install dependencies
@ -41,6 +75,10 @@ WORKDIR /build
 # Build for x86_64
 FROM base AS x86_64-builder

+# Accept version as build arg
+ARG SANDBOX_AGENT_VERSION
+ENV SANDBOX_AGENT_VERSION=${SANDBOX_AGENT_VERSION}
+
 # Set up OpenSSL for x86_64 musl target
 ENV SSL_VER=1.1.1w
 RUN wget https://www.openssl.org/source/openssl-$SSL_VER.tar.gz \
@ -61,14 +99,17 @@ ENV OPENSSL_DIR=/musl \
 # Copy the source code
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 # Build for Linux with musl (static binary) - x86_64
-# SANDBOX_AGENT_SKIP_INSPECTOR=1 skips embedding the inspector frontend
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target x86_64-unknown-linux-musl && \
+    cargo build -p sandbox-agent -p gigacode --release --target x86_64-unknown-linux-musl && \
    mkdir -p /artifacts && \
-    cp target/x86_64-unknown-linux-musl/release/sandbox-agent /artifacts/sandbox-agent-x86_64-unknown-linux-musl
+    cp target/x86_64-unknown-linux-musl/release/sandbox-agent /artifacts/sandbox-agent-x86_64-unknown-linux-musl && \
+    cp target/x86_64-unknown-linux-musl/release/gigacode /artifacts/gigacode-x86_64-unknown-linux-musl

 # Default command to show help
 CMD ["ls", "-la", "/artifacts"]
--- a/docker/release/macos-aarch64.Dockerfile
+++ b/docker/release/macos-aarch64.Dockerfile
@ -1,4 +1,38 @@
 # syntax=docker/dockerfile:1.10.0
+
+# Build inspector frontend
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
 FROM rust:1.88.0 AS base

 # Install dependencies
@ -45,6 +79,10 @@ WORKDIR /build
 # Build for ARM64 macOS
 FROM base AS aarch64-builder

+# Accept version as build arg
+ARG SANDBOX_AGENT_VERSION
+ENV SANDBOX_AGENT_VERSION=${SANDBOX_AGENT_VERSION}
+
 # Install macOS ARM64 target
 RUN rustup target add aarch64-apple-darwin

@ -59,14 +97,17 @@ ar = "aarch64-apple-darwin20.4-ar"\n\
 # Copy the source code
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 # Build for ARM64 macOS
-# SANDBOX_AGENT_SKIP_INSPECTOR=1 skips embedding the inspector frontend
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target aarch64-apple-darwin && \
+    cargo build -p sandbox-agent -p gigacode --release --target aarch64-apple-darwin && \
    mkdir -p /artifacts && \
-    cp target/aarch64-apple-darwin/release/sandbox-agent /artifacts/sandbox-agent-aarch64-apple-darwin
+    cp target/aarch64-apple-darwin/release/sandbox-agent /artifacts/sandbox-agent-aarch64-apple-darwin && \
+    cp target/aarch64-apple-darwin/release/gigacode /artifacts/gigacode-aarch64-apple-darwin

 # Default command to show help
 CMD ["ls", "-la", "/artifacts"]
--- a/docker/release/macos-x86_64.Dockerfile
+++ b/docker/release/macos-x86_64.Dockerfile
@ -1,4 +1,38 @@
 # syntax=docker/dockerfile:1.10.0
+
+# Build inspector frontend
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
 FROM rust:1.88.0 AS base

 # Install dependencies
@ -45,6 +79,10 @@ WORKDIR /build
 # Build for x86_64 macOS
 FROM base AS x86_64-builder

+# Accept version as build arg
+ARG SANDBOX_AGENT_VERSION
+ENV SANDBOX_AGENT_VERSION=${SANDBOX_AGENT_VERSION}
+
 # Install macOS x86_64 target
 RUN rustup target add x86_64-apple-darwin

@ -59,14 +97,17 @@ ar = "x86_64-apple-darwin20.4-ar"\n\
 # Copy the source code
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 # Build for x86_64 macOS
-# SANDBOX_AGENT_SKIP_INSPECTOR=1 skips embedding the inspector frontend
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target x86_64-apple-darwin && \
+    cargo build -p sandbox-agent -p gigacode --release --target x86_64-apple-darwin && \
    mkdir -p /artifacts && \
-    cp target/x86_64-apple-darwin/release/sandbox-agent /artifacts/sandbox-agent-x86_64-apple-darwin
+    cp target/x86_64-apple-darwin/release/sandbox-agent /artifacts/sandbox-agent-x86_64-apple-darwin && \
+    cp target/x86_64-apple-darwin/release/gigacode /artifacts/gigacode-x86_64-apple-darwin

 # Default command to show help
 CMD ["ls", "-la", "/artifacts"]
--- a/docker/release/windows.Dockerfile
+++ b/docker/release/windows.Dockerfile
@ -1,6 +1,44 @@
 # syntax=docker/dockerfile:1.10.0
+
+# Build inspector frontend
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
 FROM rust:1.88.0

+# Accept version as build arg
+ARG SANDBOX_AGENT_VERSION
+ENV SANDBOX_AGENT_VERSION=${SANDBOX_AGENT_VERSION}
+
 # Install dependencies
 RUN apt-get update && apt-get install -y \
    llvm-14-dev \
@ -45,14 +83,17 @@ WORKDIR /build
 # Copy the source code
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 # Build for Windows
-# SANDBOX_AGENT_SKIP_INSPECTOR=1 skips embedding the inspector frontend
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target x86_64-pc-windows-gnu && \
+    cargo build -p sandbox-agent -p gigacode --release --target x86_64-pc-windows-gnu && \
    mkdir -p /artifacts && \
-    cp target/x86_64-pc-windows-gnu/release/sandbox-agent.exe /artifacts/sandbox-agent-x86_64-pc-windows-gnu.exe
+    cp target/x86_64-pc-windows-gnu/release/sandbox-agent.exe /artifacts/sandbox-agent-x86_64-pc-windows-gnu.exe && \
+    cp target/x86_64-pc-windows-gnu/release/gigacode.exe /artifacts/gigacode-x86_64-pc-windows-gnu.exe

 # Default command to show help
 CMD ["ls", "-la", "/artifacts"]
--- a/docker/runtime/Dockerfile
+++ b/docker/runtime/Dockerfile
@ -1,5 +1,40 @@
 # syntax=docker/dockerfile:1.10.0

+# ============================================================================
+# Build inspector frontend
+# ============================================================================
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+# Copy package files for workspaces
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+# Install dependencies
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+# Copy SDK source (with pre-generated types from docs/openapi.json)
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+# Copy inspector source and build
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
 # ============================================================================
 # AMD64 Builder - Uses cross-tools musl toolchain
 # ============================================================================
@ -59,10 +94,13 @@ ENV OPENSSL_DIR=/musl \
 WORKDIR /build
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target x86_64-unknown-linux-musl && \
+    cargo build -p sandbox-agent --release --target x86_64-unknown-linux-musl && \
    cp target/x86_64-unknown-linux-musl/release/sandbox-agent /sandbox-agent

 # ============================================================================
@ -90,10 +128,13 @@ ENV CARGO_INCREMENTAL=0 \
 WORKDIR /build
 COPY . .

+# Copy pre-built inspector frontend
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
 RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
-    SANDBOX_AGENT_SKIP_INSPECTOR=1 cargo build -p sandbox-agent --release --target aarch64-unknown-linux-musl && \
+    cargo build -p sandbox-agent --release --target aarch64-unknown-linux-musl && \
    cp target/aarch64-unknown-linux-musl/release/sandbox-agent /sandbox-agent

 # ============================================================================
@ -108,7 +149,8 @@ FROM debian:bookworm-slim
 RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
-    git && \
+    git \
+    ffmpeg && \
    rm -rf /var/lib/apt/lists/*

 # Copy the binary from builder
@ -123,4 +165,4 @@ WORKDIR /home/sandbox
 EXPOSE 2468

 ENTRYPOINT ["sandbox-agent"]
-CMD ["--host", "0.0.0.0", "--port", "2468"]
+CMD ["server", "--host", "0.0.0.0", "--port", "2468"]
--- a/docker/runtime/Dockerfile.full
+++ b/docker/runtime/Dockerfile.full
@ -0,0 +1,159 @@
+# syntax=docker/dockerfile:1.10.0
+
+# ============================================================================
+# Build inspector frontend
+# ============================================================================
+FROM node:22-alpine AS inspector-build
+WORKDIR /app
+RUN npm install -g pnpm
+
+COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
+COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
+COPY sdks/cli-shared/package.json ./sdks/cli-shared/
+COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
+COPY sdks/react/package.json ./sdks/react/
+COPY sdks/typescript/package.json ./sdks/typescript/
+
+RUN pnpm install --filter @sandbox-agent/inspector...
+
+COPY docs/openapi.json ./docs/
+COPY sdks/cli-shared ./sdks/cli-shared
+COPY sdks/acp-http-client ./sdks/acp-http-client
+COPY sdks/react ./sdks/react
+COPY sdks/typescript ./sdks/typescript
+
+RUN cd sdks/cli-shared && pnpm exec tsup
+RUN cd sdks/acp-http-client && pnpm exec tsup
+RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
+RUN cd sdks/react && pnpm exec tsup
+
+COPY frontend/packages/inspector ./frontend/packages/inspector
+RUN cd frontend/packages/inspector && pnpm exec vite build
+
+# ============================================================================
+# AMD64 Builder - Uses cross-tools musl toolchain
+# ============================================================================
+FROM --platform=linux/amd64 rust:1.88.0 AS builder-amd64
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+RUN apt-get update && apt-get install -y \
+    musl-tools \
+    musl-dev \
+    llvm-14-dev \
+    libclang-14-dev \
+    clang-14 \
+    libssl-dev \
+    pkg-config \
+    ca-certificates \
+    g++ \
+    g++-multilib \
+    git \
+    curl \
+    wget && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN wget -q https://github.com/cross-tools/musl-cross/releases/latest/download/x86_64-unknown-linux-musl.tar.xz && \
+    tar -xf x86_64-unknown-linux-musl.tar.xz -C /opt/ && \
+    rm x86_64-unknown-linux-musl.tar.xz && \
+    rustup target add x86_64-unknown-linux-musl
+
+ENV PATH="/opt/x86_64-unknown-linux-musl/bin:$PATH" \
+    LIBCLANG_PATH=/usr/lib/llvm-14/lib \
+    CLANG_PATH=/usr/bin/clang-14 \
+    CC_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-gcc \
+    CXX_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-g++ \
+    AR_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-ar \
+    CARGO_TARGET_X86_64_UNKNOWN_LINUX_MUSL_LINKER=x86_64-unknown-linux-musl-gcc \
+    CARGO_INCREMENTAL=0 \
+    CARGO_NET_GIT_FETCH_WITH_CLI=true
+
+ENV SSL_VER=1.1.1w
+RUN wget https://www.openssl.org/source/openssl-$SSL_VER.tar.gz && \
+    tar -xzf openssl-$SSL_VER.tar.gz && \
+    cd openssl-$SSL_VER && \
+    ./Configure no-shared no-async --prefix=/musl --openssldir=/musl/ssl linux-x86_64 && \
+    make -j$(nproc) && \
+    make install_sw && \
+    cd .. && \
+    rm -rf openssl-$SSL_VER*
+
+ENV OPENSSL_DIR=/musl \
+    OPENSSL_INCLUDE_DIR=/musl/include \
+    OPENSSL_LIB_DIR=/musl/lib \
+    PKG_CONFIG_ALLOW_CROSS=1 \
+    RUSTFLAGS="-C target-feature=+crt-static -C link-arg=-static-libgcc"
+
+WORKDIR /build
+COPY . .
+
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/usr/local/cargo/git \
+    --mount=type=cache,target=/build/target \
+    cargo build -p sandbox-agent --release --target x86_64-unknown-linux-musl && \
+    cp target/x86_64-unknown-linux-musl/release/sandbox-agent /sandbox-agent
+
+# ============================================================================
+# ARM64 Builder - Uses Alpine with native musl
+# ============================================================================
+FROM --platform=linux/arm64 rust:1.88-alpine AS builder-arm64
+
+RUN apk add --no-cache \
+    musl-dev \
+    clang \
+    llvm-dev \
+    openssl-dev \
+    openssl-libs-static \
+    pkgconfig \
+    git \
+    curl \
+    build-base
+
+RUN rustup target add aarch64-unknown-linux-musl
+
+ENV CARGO_INCREMENTAL=0 \
+    CARGO_NET_GIT_FETCH_WITH_CLI=true \
+    RUSTFLAGS="-C target-feature=+crt-static"
+
+WORKDIR /build
+COPY . .
+
+COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
+
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/usr/local/cargo/git \
+    --mount=type=cache,target=/build/target \
+    cargo build -p sandbox-agent --release --target aarch64-unknown-linux-musl && \
+    cp target/aarch64-unknown-linux-musl/release/sandbox-agent /sandbox-agent
+
+# ============================================================================
+# Select the appropriate builder based on target architecture
+# ============================================================================
+ARG TARGETARCH
+FROM builder-${TARGETARCH} AS builder
+
+# Runtime stage - full image with all supported agents preinstalled
+FROM node:22-bookworm-slim
+
+RUN apt-get update && apt-get install -y \
+    bash \
+    ca-certificates \
+    curl \
+    git && \
+    rm -rf /var/lib/apt/lists/*
+
+COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
+RUN chmod +x /usr/local/bin/sandbox-agent
+
+RUN useradd -m -s /bin/bash sandbox
+USER sandbox
+WORKDIR /home/sandbox
+
+RUN sandbox-agent install-agent --all
+
+EXPOSE 2468
+
+ENTRYPOINT ["sandbox-agent"]
+CMD ["server", "--host", "0.0.0.0", "--port", "2468"]
--- a/docker/test-agent/Dockerfile
+++ b/docker/test-agent/Dockerfile
@ -0,0 +1,61 @@
+FROM rust:1.88.0-bookworm AS builder
+WORKDIR /build
+
+COPY Cargo.toml Cargo.lock ./
+COPY server/ ./server/
+COPY gigacode/ ./gigacode/
+COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/
+COPY scripts/agent-configs/ ./scripts/agent-configs/
+COPY scripts/audit-acp-deps/ ./scripts/audit-acp-deps/
+
+ENV SANDBOX_AGENT_SKIP_INSPECTOR=1
+
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/usr/local/cargo/git \
+    --mount=type=cache,target=/build/target \
+    cargo build -p sandbox-agent --release && \
+    cp target/release/sandbox-agent /sandbox-agent
+
+# Extract neko binary from the official image for WebRTC desktop streaming.
+# Using neko v3 base image from GHCR which provides multi-arch support (amd64, arm64).
+# Pinned by digest to prevent breaking changes from upstream.
+# Reference client: https://github.com/demodesk/neko-client/blob/37f93eae6bd55b333c94bd009d7f2b079075a026/src/component/internal/webrtc.ts
+FROM ghcr.io/m1k1o/neko/base@sha256:0c384afa56268aaa2d5570211d284763d0840dcdd1a7d9a24be3081d94d3dfce AS neko-base
+
+FROM node:22-bookworm-slim
+RUN apt-get update -qq && \
+    apt-get install -y -qq --no-install-recommends \
+      ca-certificates \
+      bash \
+      libstdc++6 \
+      xvfb \
+      openbox \
+      xdotool \
+      imagemagick \
+      ffmpeg \
+      gstreamer1.0-tools \
+      gstreamer1.0-plugins-base \
+      gstreamer1.0-plugins-good \
+      gstreamer1.0-plugins-bad \
+      gstreamer1.0-plugins-ugly \
+      gstreamer1.0-nice \
+      gstreamer1.0-x \
+      gstreamer1.0-pulseaudio \
+      libxcvt0 \
+      x11-xserver-utils \
+      dbus-x11 \
+      xauth \
+      fonts-dejavu-core \
+      xterm \
+      > /dev/null 2>&1 && \
+    rm -rf /var/lib/apt/lists/*
+
+COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
+COPY --from=neko-base /usr/bin/neko /usr/local/bin/neko
+
+EXPOSE 3000
+# Expose UDP port range for WebRTC media transport
+EXPOSE 59050-59070/udp
+
+ENTRYPOINT ["/usr/local/bin/sandbox-agent"]
+CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"]
--- a/docker/test-common-software/Dockerfile
+++ b/docker/test-common-software/Dockerfile
@ -0,0 +1,37 @@
+# Extends the base test-agent image with common software pre-installed.
+# Used by the common_software integration test to verify that all documented
+# software in docs/common-software.mdx works correctly inside the sandbox.
+#
+# KEEP IN SYNC with docs/common-software.mdx
+
+ARG BASE_IMAGE=sandbox-agent-test:dev
+FROM ${BASE_IMAGE}
+
+USER root
+
+RUN apt-get update -qq && \
+    apt-get install -y -qq --no-install-recommends \
+      # Browsers
+      chromium \
+      firefox-esr \
+      # Languages
+      python3 python3-pip python3-venv \
+      default-jdk \
+      ruby-full \
+      # Databases
+      sqlite3 \
+      redis-server \
+      # Build tools
+      build-essential cmake pkg-config \
+      # CLI tools
+      git jq tmux \
+      # Media and graphics
+      imagemagick \
+      poppler-utils \
+      # Desktop apps
+      gimp \
+      > /dev/null 2>&1 && \
+    rm -rf /var/lib/apt/lists/*
+
+ENTRYPOINT ["/usr/local/bin/sandbox-agent"]
+CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"]
--- a/docs/agent-compatibility.mdx
+++ b/docs/agent-compatibility.mdx
@ -1,28 +0,0 @@
---
-title: "Agent Compatibility"
-description: "Supported agents, install methods, and streaming formats."
---
-
-## Compatibility matrix
-
-| Agent | Provider | Binary | Install method | Session ID | Streaming format |
-|-------|----------|--------|----------------|------------|------------------|
-| Claude Code | Anthropic | `claude` | curl raw binary from GCS | `session_id` | JSONL via stdout |
-| Codex | OpenAI | `codex` | curl tarball from GitHub releases | `thread_id` | JSON-RPC over stdio |
-| OpenCode | Multi-provider | `opencode` | curl tarball from GitHub releases | `session_id` | SSE or JSONL |
-| Amp | Sourcegraph | `amp` | curl raw binary from GCS | `session_id` | JSONL via stdout |
-| Mock | Built-in | — | bundled | `mock-*` | daemon-generated |
-
-## Agent modes
-
- **OpenCode**: discovered via the server API.
- **Claude Code / Codex / Amp**: hardcoded modes (typically `build`, `plan`, or `custom`).
-
-## Capability notes
-
- **Questions / permissions**: OpenCode natively supports these workflows. Claude plan approval is normalized into a question event (tests do not currently exercise Claude question/permission flows).
- **Streaming**: all agents stream events; OpenCode uses SSE, Codex uses JSON-RPC over stdio, others use JSONL. Codex is currently normalized to thread/turn starts plus user/assistant completed items (deltas and tool/reasoning items are not emitted yet).
- **User messages**: Claude CLI output does not include explicit user-message events in our snapshots, so only assistant messages are surfaced for Claude today.
- **Files and images**: normalized via `UniversalMessagePart` with `File` and `Image` parts.
-
-See [Universal API](/universal-api) for feature coverage details.
--- a/docs/agent-sessions.mdx
+++ b/docs/agent-sessions.mdx
@ -0,0 +1,268 @@
+---
+title: "Agent Sessions"
+description: "Create sessions, prompt agents, and inspect event history."
+sidebarTitle: "Sessions"
+icon: "comments"
+---
+
+Sessions are the unit of interaction with an agent. Create one session per task, send prompts, and consume event history.
+
+For SDK-based flows, sessions can be restored after runtime/session loss when persistence is enabled.
+See [Session Restoration](/session-restoration).
+
+## Create a session
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+const session = await sdk.createSession({
+  agent: "codex",
+  cwd: "/",
+});
+
+console.log(session.id, session.agentSessionId);
+```
+
+## Send a prompt
+
+```ts
+const response = await session.prompt([
+  { type: "text", text: "Summarize the repository structure." },
+]);
+
+console.log(response.stopReason);
+```
+
+## Subscribe to live events
+
+```ts
+const unsubscribe = session.onEvent((event) => {
+  console.log(event.eventIndex, event.sender, event.payload);
+});
+
+await session.prompt([
+  { type: "text", text: "Explain the main entrypoints." },
+]);
+
+unsubscribe();
+```
+
+### Event types
+
+Each event's `payload` contains a session update. The `sessionUpdate` field identifies the type.
+
+<AccordionGroup>
+<Accordion title="agent_message_chunk">
+Streamed text or content from the agent's response.
+
+```json
+{
+  "sessionUpdate": "agent_message_chunk",
+  "content": { "type": "text", "text": "Here's how the repository is structured..." }
+}
+```
+</Accordion>
+
+<Accordion title="agent_thought_chunk">
+Internal reasoning from the agent (chain-of-thought / extended thinking).
+
+```json
+{
+  "sessionUpdate": "agent_thought_chunk",
+  "content": { "type": "text", "text": "I should start by looking at the project structure..." }
+}
+```
+</Accordion>
+
+<Accordion title="user_message_chunk">
+Echo of the user's prompt being processed.
+
+```json
+{
+  "sessionUpdate": "user_message_chunk",
+  "content": { "type": "text", "text": "Summarize the repository structure." }
+}
+```
+</Accordion>
+
+<Accordion title="tool_call">
+The agent invoked a tool (file edit, terminal command, etc.).
+
+```json
+{
+  "sessionUpdate": "tool_call",
+  "toolCallId": "tc_abc123",
+  "title": "Read file",
+  "status": "in_progress",
+  "rawInput": { "path": "/src/index.ts" }
+}
+```
+</Accordion>
+
+<Accordion title="tool_call_update">
+Progress or result update for an in-progress tool call.
+
+```json
+{
+  "sessionUpdate": "tool_call_update",
+  "toolCallId": "tc_abc123",
+  "status": "completed",
+  "content": [{ "type": "text", "text": "import express from 'express';\n..." }]
+}
+```
+</Accordion>
+
+<Accordion title="plan">
+The agent's execution plan for the current task.
+
+```json
+{
+  "sessionUpdate": "plan",
+  "entries": [
+    { "content": "Read the project structure", "status": "completed" },
+    { "content": "Identify main entrypoints", "status": "in_progress" },
+    { "content": "Write summary", "status": "pending" }
+  ]
+}
+```
+</Accordion>
+
+<Accordion title="usage_update">
+Token usage metrics for the current turn.
+
+```json
+{
+  "sessionUpdate": "usage_update"
+}
+```
+</Accordion>
+
+<Accordion title="session_info_update">
+Session metadata changed (e.g. agent-generated title).
+
+```json
+{
+  "sessionUpdate": "session_info_update",
+  "title": "Repository structure analysis"
+}
+```
+</Accordion>
+</AccordionGroup>
+
+## Fetch persisted event history
+
+```ts
+const page = await sdk.getEvents({
+  sessionId: session.id,
+  limit: 50,
+});
+
+for (const event of page.items) {
+  console.log(event.id, event.createdAt, event.sender);
+}
+```
+
+## List and load sessions
+
+```ts
+const sessions = await sdk.listSessions({ limit: 20 });
+
+for (const item of sessions.items) {
+  console.log(item.id, item.agent, item.createdAt);
+}
+
+if (sessions.items.length > 0) {
+  const loaded = await sdk.resumeSession(sessions.items[0]!.id);
+  await loaded.prompt([{ type: "text", text: "Continue." }]);
+}
+```
+
+## Configure model, mode, and thought level
+
+Set the model, mode, or thought level either when creating a session or afterward:
+
+```ts
+// At creation time
+const session = await sdk.createSession({
+  agent: "codex",
+  model: "gpt-5.3-codex",
+  mode: "auto",
+  thoughtLevel: "high",
+});
+```
+
+```ts
+// After creation
+await session.setModel("gpt-5.2-codex");
+await session.setMode("full-access");
+await session.setThoughtLevel("medium");
+```
+
+Query available modes:
+
+```ts
+const modes = await session.getModes();
+console.log(modes?.currentModeId, modes?.availableModes);
+```
+
+### Advanced config options
+
+For config options beyond model, mode, and thought level, use `getConfigOptions` to discover what the agent supports and `setConfigOption` to set any option by ID:
+
+```ts
+const options = await session.getConfigOptions();
+for (const opt of options) {
+  console.log(opt.id, opt.category, opt.type);
+}
+```
+
+```ts
+await session.setConfigOption("some-agent-option", "value");
+```
+
+## Handle permission requests
+
+For agents that request tool-use permissions, register a permission listener and reply with `once`, `always`, or `reject`:
+
+```ts
+const session = await sdk.createSession({
+  agent: "claude",
+  mode: "default",
+});
+
+session.onPermissionRequest((request) => {
+  console.log(request.toolCall.title, request.availableReplies);
+  void session.respondPermission(request.id, "once");
+});
+
+await session.prompt([
+  { type: "text", text: "Create ./permission-example.txt with the text hello." },
+]);
+```
+
+
+### Auto-approving permissions
+
+To auto-approve all permission requests, respond with `"once"` or `"always"` in your listener:
+
+```ts
+session.onPermissionRequest((request) => {
+  void session.respondPermission(request.id, "always");
+});
+```
+
+See `examples/permissions/src/index.ts` for a complete permissions example that works with Claude and Codex.
+
+<Info>
+Some agents, such as Claude, let you configure permission behavior through modes (e.g. `bypassPermissions`, `acceptEdits`). We recommend leaving the mode as `default` and handling permission decisions explicitly in `onPermissionRequest` instead.
+</Info>
+
+## Destroy a session
+
+```ts
+await sdk.destroySession(session.id);
+```
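The event types documented in `docs/agent-sessions.mdx` above all share a `sessionUpdate` discriminator, so a consumer can branch on that field. The following is a minimal sketch, not the SDK's API: the `SessionUpdate` union and `describeUpdate` helper are hypothetical names modeled only on the example payloads in the doc, and the SDK's real exported types may differ.

```typescript
// Hypothetical local types modeled on the example payloads above;
// they are assumptions, not types exported by the sandbox-agent SDK.
type TextContent = { type: "text"; text: string };

type SessionUpdate =
  | { sessionUpdate: "agent_message_chunk"; content: TextContent }
  | { sessionUpdate: "agent_thought_chunk"; content: TextContent }
  | { sessionUpdate: "user_message_chunk"; content: TextContent }
  | { sessionUpdate: "tool_call"; toolCallId: string; title: string; status: string; rawInput?: unknown }
  | { sessionUpdate: "tool_call_update"; toolCallId: string; status: string; content?: TextContent[] }
  | { sessionUpdate: "plan"; entries: { content: string; status: string }[] }
  | { sessionUpdate: "usage_update" }
  | { sessionUpdate: "session_info_update"; title?: string };

// Turn one payload into a single log line by switching on the discriminator.
function describeUpdate(update: SessionUpdate): string {
  switch (update.sessionUpdate) {
    case "agent_message_chunk":
      return `agent: ${update.content.text}`;
    case "agent_thought_chunk":
      return `thought: ${update.content.text}`;
    case "user_message_chunk":
      return `user: ${update.content.text}`;
    case "tool_call":
      return `tool ${update.toolCallId} (${update.title}): ${update.status}`;
    case "tool_call_update":
      return `tool ${update.toolCallId}: ${update.status}`;
    case "plan": {
      // Count how many plan entries the agent has already finished.
      const done = update.entries.filter((e) => e.status === "completed").length;
      return `plan: ${done}/${update.entries.length} completed`;
    }
    case "usage_update":
      return "usage updated";
    case "session_info_update":
      return `session title: ${update.title ?? "(unchanged)"}`;
  }
}
```

In a listener this would presumably be called as `describeUpdate(event.payload as SessionUpdate)`; the cast is an assumption, since the doc does not state the static type of `event.payload`.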
--- a/docs/agents/amp.mdx
+++ b/docs/agents/amp.mdx
@ -0,0 +1,20 @@
+---
+title: "Amp"
+description: "Use Amp as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "amp",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | `amp-default` |
+| **Modes** | `default`, `bypass` |
+| **Thought levels** | Unsupported |
--- a/docs/agents/claude.mdx
+++ b/docs/agents/claude.mdx
@ -0,0 +1,49 @@
+---
+title: "Claude"
+description: "Use Claude Code as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "claude",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | `default`, `sonnet`, `opus`, `haiku` |
+| **Modes** | `default`, `acceptEdits`, `plan`, `dontAsk`, `bypassPermissions` |
+| **Thought levels** | Unsupported |
+
+## Configuring effort level
+
+Claude does not support changing effort level after a session starts. Configure it in the filesystem before creating the session.
+
+```ts
+import { mkdir, writeFile } from "node:fs/promises";
+import path from "node:path";
+
+const cwd = "/path/to/workspace";
+await mkdir(path.join(cwd, ".claude"), { recursive: true });
+await writeFile(
+  path.join(cwd, ".claude", "settings.json"),
+  JSON.stringify({ effortLevel: "high" }, null, 2),
+);
+
+const session = await client.createSession({
+  agent: "claude",
+  cwd,
+});
+```
+
+<Accordion title="Supported settings file locations (highest precedence last)">
+
+1. `~/.claude/settings.json`
+2. `<session cwd>/.claude/settings.json`
+3. `<session cwd>/.claude/settings.local.json`
+
+</Accordion>
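The "highest precedence last" ordering in `docs/agents/claude.mdx` above can be illustrated with a small sketch. It assumes a shallow, per-key override where later files win; Claude's actual merge behavior (for example, for nested objects or arrays) is not described in the doc and may differ. `settingsPaths` and `resolveSettings` are hypothetical helper names.

```typescript
type Settings = Record<string, unknown>;

// Candidate files in ascending precedence, mirroring the list in the doc.
function settingsPaths(home: string, cwd: string): string[] {
  return [
    `${home}/.claude/settings.json`,
    `${cwd}/.claude/settings.json`,
    `${cwd}/.claude/settings.local.json`,
  ];
}

// ASSUMPTION: shallow per-key override, applied in list order so that
// later layers replace keys set by earlier ones.
function resolveSettings(layers: Settings[]): Settings {
  return layers.reduce<Settings>((merged, layer) => ({ ...merged, ...layer }), {});
}

// Example: a user-level default overridden by a workspace-level file.
const effective = resolveSettings([
  { effortLevel: "low", theme: "dark" }, // ~/.claude/settings.json
  { effortLevel: "high" },               // <session cwd>/.claude/settings.json
]);
console.log(effective.effortLevel);
```

Under this assumption, writing `effortLevel` into the workspace file before creating the session takes effect even if a user-level file sets a different value.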
--- a/docs/agents/codex.mdx
+++ b/docs/agents/codex.mdx
@ -0,0 +1,20 @@
+---
+title: "Codex"
+description: "Use OpenAI Codex as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "codex",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | `gpt-5.3-codex` (default), `gpt-5.3-codex-spark`, `gpt-5.2-codex`, `gpt-5.1-codex-max`, `gpt-5.2`, `gpt-5.1-codex-mini` |
+| **Modes** | `read-only` (default), `auto`, `full-access` |
+| **Thought levels** | `low`, `medium`, `high` (default), `xhigh` |
--- a/docs/agents/cursor.mdx
+++ b/docs/agents/cursor.mdx
@ -0,0 +1,34 @@
+---
+title: "Cursor"
+description: "Use Cursor as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "cursor",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | See below |
+| **Modes** | Unsupported |
+| **Thought levels** | Unsupported |
+
+<Accordion title="All models">
+
+| Group | Models |
+|-------|--------|
+| **Auto** | `auto` |
+| **Composer** | `composer-1.5`, `composer-1` |
+| **GPT-5.3 Codex** | `gpt-5.3-codex`, `gpt-5.3-codex-low`, `gpt-5.3-codex-high`, `gpt-5.3-codex-xhigh`, `gpt-5.3-codex-fast`, `gpt-5.3-codex-low-fast`, `gpt-5.3-codex-high-fast`, `gpt-5.3-codex-xhigh-fast` |
+| **GPT-5.2** | `gpt-5.2`, `gpt-5.2-high`, `gpt-5.2-codex`, `gpt-5.2-codex-low`, `gpt-5.2-codex-high`, `gpt-5.2-codex-xhigh`, `gpt-5.2-codex-fast`, `gpt-5.2-codex-low-fast`, `gpt-5.2-codex-high-fast`, `gpt-5.2-codex-xhigh-fast` |
+| **GPT-5.1** | `gpt-5.1-high`, `gpt-5.1-codex-max`, `gpt-5.1-codex-max-high` |
+| **Claude** | `opus-4.6-thinking` (default), `opus-4.6`, `opus-4.5`, `opus-4.5-thinking`, `sonnet-4.5`, `sonnet-4.5-thinking` |
+| **Other** | `gemini-3-pro`, `gemini-3-flash`, `grok` |
+
+</Accordion>
--- a/docs/agents/opencode.mdx
+++ b/docs/agents/opencode.mdx
@ -0,0 +1,31 @@
+---
+title: "OpenCode"
+description: "Use OpenCode as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "opencode",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | See below |
+| **Modes** | `build` (default), `plan` |
+| **Thought levels** | Unsupported |
+
+<Accordion title="All models">
+
+| Provider | Models |
+|----------|--------|
+| **Anthropic** | `anthropic/claude-3-5-haiku-20241022`, `anthropic/claude-3-5-haiku-latest`, `anthropic/claude-3-5-sonnet-20240620`, `anthropic/claude-3-5-sonnet-20241022`, `anthropic/claude-3-7-sonnet-20250219`, `anthropic/claude-3-7-sonnet-latest`, `anthropic/claude-3-haiku-20240307`, `anthropic/claude-3-opus-20240229`, `anthropic/claude-3-sonnet-20240229`, `anthropic/claude-haiku-4-5`, `anthropic/claude-haiku-4-5-20251001`, `anthropic/claude-opus-4-0`, `anthropic/claude-opus-4-1`, `anthropic/claude-opus-4-1-20250805`, `anthropic/claude-opus-4-20250514`, `anthropic/claude-opus-4-5`, `anthropic/claude-opus-4-5-20251101`, `anthropic/claude-opus-4-6`, `anthropic/claude-sonnet-4-0`, `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5`, `anthropic/claude-sonnet-4-5-20250929` |
+| **OpenAI** | `openai/gpt-5.1-codex`, `openai/gpt-5.1-codex-max`, `openai/gpt-5.1-codex-mini`, `openai/gpt-5.2`, `openai/gpt-5.2-codex`, `openai/gpt-5.3-codex` |
+| **Cerebras** | `cerebras/gpt-oss-120b`, `cerebras/qwen-3-235b-a22b-instruct-2507`, `cerebras/zai-glm-4.7` |
+| **OpenCode Zen** | `opencode/big-pickle`, `opencode/claude-3-5-haiku`, `opencode/claude-haiku-4-5`, `opencode/claude-opus-4-1`, `opencode/claude-opus-4-5`, `opencode/claude-opus-4-6`, `opencode/claude-sonnet-4`, `opencode/claude-sonnet-4-5`, `opencode/gemini-3-flash`, `opencode/gemini-3-pro` (default), `opencode/glm-4.6`, `opencode/glm-4.7`, `opencode/gpt-5`, `opencode/gpt-5-codex`, `opencode/gpt-5-nano`, `opencode/gpt-5.1`, `opencode/gpt-5.1-codex`, `opencode/gpt-5.1-codex-max`, `opencode/gpt-5.1-codex-mini`, `opencode/gpt-5.2`, `opencode/gpt-5.2-codex`, `opencode/kimi-k2`, `opencode/kimi-k2-thinking`, `opencode/kimi-k2.5`, `opencode/kimi-k2.5-free`, `opencode/minimax-m2.1`, `opencode/minimax-m2.1-free`, `opencode/trinity-large-preview-free` |
+
+</Accordion>
--- a/docs/agents/pi.mdx
+++ b/docs/agents/pi.mdx
@ -0,0 +1,20 @@
+---
+title: "Pi"
+description: "Use Pi as a sandbox agent."
+---
+
+## Usage
+
+```typescript
+const session = await client.createSession({
+  agent: "pi",
+});
+```
+
+## Capabilities
+
+| Category | Values |
+|----------|--------|
+| **Models** | `default` |
+| **Modes** | Unsupported |
+| **Thought levels** | Unsupported |
--- a/docs/ai/llms-txt.mdx
+++ b/docs/ai/llms-txt.mdx
@ -8,8 +8,8 @@ Mintlify publishes `llms.txt` and `llms-full.txt` for this documentation site.
 Access them at:

 ```
-https://rivet.dev/docs/llms.txt
-https://rivet.dev/docs/llms-full.txt
+https://sandboxagent.dev/docs/llms.txt
+https://sandboxagent.dev/docs/llms-full.txt
 ```

 If you run a reverse proxy in front of the docs, forward `/llms.txt` and `/llms-full.txt` to Mintlify.
--- a/docs/ai/skill.mdx
+++ b/docs/ai/skill.mdx
@ -8,14 +8,23 @@ Mintlify hosts a `skill.md` file for this documentation site.
 Access it at:

 ```
-https://rivet.dev/docs/skill.md
+https://sandboxagent.dev/docs/skill.md
 ```

 To add it to an agent using the Skills CLI:

-```
-npx skills add rivet.dev/docs/skill.md
-```
+<Tabs>
+  <Tab title="npx">
+    ```bash
+    npx skills add rivet-dev/skills -s sandbox-agent
+    ```
+  </Tab>
+  <Tab title="bunx">
+    ```bash
+    bunx skills add rivet-dev/skills -s sandbox-agent
+    ```
+  </Tab>
+</Tabs>

 If you run a reverse proxy in front of the docs, make sure `/skill.md` and `/.well-known/skills/*`
 are forwarded to Mintlify.
--- a/docs/architecture.mdx
+++ b/docs/architecture.mdx
@ -1,350 +1,63 @@
 ---
 title: "Architecture"
-description: "How the daemon, schemas, and agents fit together."
+description: "How the Sandbox Agent server, SDK, and agent processes fit together."
 ---

-Sandbox Agent SDK is built around a single daemon that runs inside the sandbox and exposes a universal HTTP API. Clients use the API (or the TypeScript SDK / CLI) to create sessions, send messages, and stream events.
+Sandbox Agent is a lightweight HTTP server that runs **inside** a sandbox. It provides:
+
+- **Agent management**: Installs, spawns, and stops coding agent processes
+- **Sessions**: Routes prompts to agents and streams events back in real time
+- **Sandbox APIs**: Filesystem, process, and terminal access for the sandbox environment

 ## Components

- **Daemon**: Rust HTTP server that manages agent processes and streaming.
- **Universal schema**: Shared input/output types for messages and events.
- **SDKs & CLI**: Convenience wrappers around the HTTP API.
+```mermaid
+flowchart LR
+    CLIENT["Your App"]

-## Agent Schema Pipeline
+    subgraph SANDBOX["Sandbox"]
+        direction TB
+        SERVER["Sandbox Agent Server"]
+        AGENT["Agent Process<br/>(Claude, Codex, etc.)"]
+        SERVER --> AGENT
+    end

-The schema pipeline extracts type definitions from AI coding agents and converts them to a universal format.
-
-### Schema Extraction
-
-TypeScript extractors in `resources/agent-schemas/src/` pull schemas from each agent:
-
-| Agent | Source | Extractor |
-|-------|--------|-----------|
-| Claude | `claude --output-format json --json-schema` | `claude.ts` |
-| Codex | `codex app-server generate-json-schema` | `codex.ts` |
-| OpenCode | GitHub OpenAPI spec | `opencode.ts` |
-| Amp | Scrapes ampcode.com docs | `amp.ts` |
-
-All extractors include fallback schemas for when CLIs or URLs are unavailable.
-
-**Output:** JSON schemas written to `resources/agent-schemas/artifacts/json-schema/`
-
-### Rust Type Generation
-
-The `server/packages/extracted-agent-schemas/` package generates Rust types at build time:
-
- `build.rs` reads JSON schemas and uses the `typify` crate to generate Rust structs
- Generated code is written to `$OUT_DIR/{agent}.rs`
- Types are exposed via `include!()` macros in `src/lib.rs`
-
-```
-resources/agent-schemas/artifacts/json-schema/*.json
-        ↓ (build.rs + typify)
-$OUT_DIR/{claude,codex,opencode,amp}.rs
-        ↓ (include!)
-extracted_agent_schemas::{claude,codex,opencode,amp}::*
+    CLIENT -->|"SDK (HTTP)"| SERVER
 ```

-### Universal Schema
+- **Your app**: Uses the `sandbox-agent` TypeScript SDK to talk to the server over HTTP.
+- **Sandbox**: An isolated runtime (local process, Docker, E2B, Daytona, Vercel, Cloudflare).
+- **Sandbox Agent server**: A single binary inside the sandbox that manages agent lifecycles, routes prompts, streams events, and exposes filesystem/process/terminal APIs.
+- **Agent process**: A coding agent (Claude Code, Codex, etc.) spawned by the server. Each session maps to one agent process.

-The `server/packages/universal-agent-schema/` package defines agent-agnostic types:
+## What `SandboxAgent.start()` does

-**Core types** (`src/lib.rs`):
- `UniversalEvent` - Wrapper with id, timestamp, session_id, agent, data
- `UniversalEventData` - Enum: Message, Started, Error, QuestionAsked, PermissionAsked, Unknown
- `UniversalMessage` - Parsed (role, parts, metadata) or Unparsed (raw JSON)
- `UniversalMessagePart` - Text, ToolCall, ToolResult, FunctionCall, FunctionResult, File, Image, Error, Unknown
+1. **Provision**: The provider creates a sandbox (starts a container, creates a VM, etc.)
+2. **Install**: The Sandbox Agent binary is installed inside the sandbox
+3. **Boot**: The server starts listening on an HTTP port
+4. **Health check**: The SDK waits for `/v1/health` to respond
+5. **Ready**: The SDK returns a connected client

-**Converters** (`src/agents/{claude,codex,opencode,amp}.rs`):
- Each agent has a converter module that transforms native events to universal format
- Conversions are best-effort; unparseable data preserved in `Unparsed` or `Unknown` variants
+For the `local` provider, provisioning is a no-op and the server runs as a local subprocess.

-## Session Management
+### Server recovery

-Sessions track agent conversations with in-memory state.
+If the server process stops, the SDK automatically calls the provider's `ensureServer()` after 3 consecutive health-check failures. Most built-in providers implement this. Custom providers can add `ensureServer(sandboxId)` to their `SandboxProvider` object.

-### Session Model
+## Server HTTP API

- **Session ID**: Client-provided primary session identifier.
- **Agent session ID**: Underlying ID from the agent (thread/session). This is surfaced in events but is not the primary key.
+See the [HTTP API reference](/api-reference) for the full list of server endpoints.

-### Storage
+## Agent installation

-Sessions are stored in an in-memory `HashMap<String, SessionState>` inside `SessionManager`:
-
-```rust
-struct SessionManager {
-    sessions: Mutex<HashMap<String, SessionState>>,
-    // ...
-}
-```
-
-There is no disk persistence. Sessions are ephemeral and lost on server restart.
-
-### SessionState
-
-Each session tracks:
-
-| Field | Purpose |
-|-------|---------|
-| `session_id` | Client-provided identifier |
-| `agent` | Agent type (Claude, Codex, OpenCode, Amp) |
-| `agent_mode` | Operating mode (build, plan, custom) |
-| `permission_mode` | Permission handling (default, plan, bypass) |
-| `model` | Optional model override |
-| `events: Vec<UniversalEvent>` | Full event history |
-| `pending_questions` | Question IDs awaiting reply |
-| `pending_permissions` | Permission IDs awaiting reply |
-| `broadcaster` | Tokio broadcast channel for SSE streaming |
-| `ended` | Whether agent process has terminated |
-
-### Lifecycle
-
-```
-POST /v1/sessions/{sessionId}     Create session, auto-install agent
-        ↓
-POST /v1/sessions/{id}/messages   Spawn agent subprocess, stream output
-POST /v1/sessions/{id}/messages/stream   Post and stream a single turn
-        ↓
-GET /v1/sessions/{id}/events      Poll for new events (offset-based)
-GET /v1/sessions/{id}/events/sse  Subscribe to SSE stream
-        ↓
-POST .../questions/{id}/reply     Answer agent question
-POST .../permissions/{id}/reply   Grant/deny permission request
-        ↓
-(agent process terminates)        Session marked as ended
-```
-
-### Event Streaming
-
- Events are stored in memory per session and assigned a monotonically increasing `id`.
- `/events` returns a slice of events by offset/limit.
- `/events/sse` streams new events from the same offset semantics.
-
-When a message is sent:
-
-1. `send_message()` spawns the agent CLI as a subprocess
-2. `consume_spawn()` reads stdout/stderr line by line
-3. Each JSON line is parsed and converted via `parse_agent_line()`
-4. Events are recorded via `record_event()` which:
-   - Assigns incrementing event ID
-   - Appends to `events` vector
-   - Broadcasts to SSE subscribers
-
-## Agent Execution
-
-Each agent has a different execution model and communication pattern. There are two main architectural patterns:
-
-### Architecture Patterns
-
-**Subprocess Model (Claude, Amp):**
- New process spawned per message/turn
- Process terminates after turn completes
- Multi-turn via CLI resume flags (`--resume`, `--continue`)
- Simple but has process spawn overhead
-
-**Client/Server Model (OpenCode, Codex):**
- Single long-running server process
- Multiple sessions/threads multiplexed via RPC
- Multi-turn via server-side thread persistence
- More efficient for repeated interactions
-
-### Overview
-
-| Agent | Architecture | Binary Source | Multi-Turn Method |
-|-------|--------------|---------------|-------------------|
-| Claude Code | Subprocess (per-turn) | GCS (Anthropic) | `--resume` flag |
-| Codex | **Shared Server (JSON-RPC)** | GitHub releases | **Thread persistence** |
-| OpenCode | HTTP Server (SSE) | GitHub releases | Server-side sessions |
-| Amp | Subprocess (per-turn) | GCS (Amp) | `--continue` flag |
-
-### Claude Code
-
-Spawned as a subprocess with JSONL streaming:
+Agents are installed lazily on first use. To avoid the cold-start delay, pre-install them:

 ```bash
-claude --print --output-format stream-json --verbose \
-  [--model MODEL] [--resume SESSION_ID] \
-  [--permission-mode plan | --dangerously-skip-permissions] \
-  PROMPT
+sandbox-agent install-agent --all
 ```

- Streams JSON events to stdout, one per line
- Supports session resumption via `--resume`
- Permission modes: `--permission-mode plan` for approval workflow, `--dangerously-skip-permissions` for bypass
+The `rivetdev/sandbox-agent:0.4.2-full` Docker image ships with all agents pre-installed.

-### Codex
+## Production-ready agent orchestration

-Uses a **shared app-server process** that handles multiple sessions via JSON-RPC over stdio:
-
-```bash
-codex app-server
-```
-
-**Daemon flow:**
-1. First Codex session triggers `codex app-server` spawn
-2. Performs `initialize` / `initialized` handshake
-3. Each session creation sends `thread/start` → receives `thread_id`
-4. Messages sent via `turn/start` with `thread_id`
-5. Notifications routed back to session by `thread_id`
-
-**Key characteristics:**
- Single process handles all Codex sessions
- JSON-RPC over stdio (JSONL format)
- Thread IDs map to daemon session IDs
- Approval requests arrive as server-to-client JSON-RPC requests
- Process lifetime matches daemon lifetime (not per-turn)
-
-### OpenCode
-
-Unique architecture - runs as a **persistent HTTP server** rather than per-message subprocess:
-
-```bash
-opencode serve --port {4200-4300}
-```
-
-Then communicates via HTTP endpoints:
-
-| Endpoint | Purpose |
-|----------|---------|
-| `POST /session` | Create new session |
-| `POST /session/{id}/prompt` | Send message |
-| `GET /event/subscribe` | SSE event stream |
-| `POST /question/reply` | Answer HITL question |
-| `POST /permission/reply` | Grant/deny permission |
-
-The server is started once and reused across sessions. Events are received via Server-Sent Events (SSE) subscription.
-
-### Amp
-
-Spawned as a subprocess with dynamic flag detection:
-
-```bash
-amp [--execute|--print] [--output-format stream-json] \
-  [--model MODEL] [--continue SESSION_ID] \
-  [--dangerously-skip-permissions] PROMPT
-```
-
- **Dynamic flag detection**: Probes `--help` output to determine which flags the installed version supports
- **Fallback strategy**: If execution fails, retries with progressively simpler flag combinations
- Streams JSON events to stdout
- Supports session continuation via `--continue`
-
-### Communication Patterns
-
-**Per-turn subprocess agents (Claude, Amp):**
-1. Agent CLI spawned with appropriate flags
-2. Stdout/stderr read line-by-line
-3. Each line parsed as JSON
-4. Events converted via `parse_agent_line()` → agent-specific converter
-5. Universal events recorded and broadcast to SSE subscribers
-6. Process terminated on turn completion
-
-**Shared stdio server agent (Codex):**
-1. Single `codex app-server` process started on first session
-2. `initialize`/`initialized` handshake performed once
-3. New sessions send `thread/start`, receive `thread_id`
-4. Messages sent via `turn/start` with `thread_id`
-5. Notifications read from stdout, routed by `thread_id`
-6. Process persists across sessions and turns
-
-**HTTP server agent (OpenCode):**
-1. Server started on available port (if not running)
-2. Session created via HTTP POST
-3. Prompts sent via HTTP POST
-4. Events received via SSE subscription
-5. HITL responses forwarded via HTTP POST
-
-### Credential Handling
-
-All agents receive API keys via environment variables:
-
-| Agent | Environment Variables |
-|-------|----------------------|
-| Claude | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` |
-| Codex | `OPENAI_API_KEY`, `CODEX_API_KEY` |
-| OpenCode | `OPENAI_API_KEY` |
-| Amp | `ANTHROPIC_API_KEY` |
-
-## Human-in-the-Loop
-
-Questions and permission prompts are normalized into the universal schema:
-
- Question events surface as `questionAsked` with selectable options.
- Permission events surface as `permissionAsked` with `reply: once | always | reject`.
- Claude plan approval is normalized into a question event (approve/reject).
-
-## SDK Modes
-
-The TypeScript SDK supports two connection modes.
-
-### Embedded Mode
-
-Defined in `sdks/typescript/src/spawn.ts`:
-
-1. **Binary resolution**: Checks `SANDBOX_AGENT_BIN` env, then platform-specific npm package, then `PATH`
-2. **Port selection**: Uses provided port or finds a free one via `net.createServer()`
-3. **Token generation**: Uses provided token or generates random 24-byte hex string
-4. **Spawn**: Launches `sandbox-agent server --host <host> --port <port> --token <token>`
-5. **Health wait**: Polls `GET /v1/health` until server is ready (up to 15s timeout)
-6. **Cleanup**: On dispose, sends SIGTERM then SIGKILL if needed; also registers process exit handlers
-
-```typescript
-const handle = await spawnSandboxAgent({ log: "inherit" });
-// handle.baseUrl = "http://127.0.0.1:<port>"
-// handle.token = "<generated>"
-// handle.dispose() to cleanup
-```
-
-### Server Mode
-
-Defined in `sdks/typescript/src/client.ts`:
-
- Direct HTTP client to a remote `sandbox-agent` server
- Uses provided `baseUrl` and optional `token`
- No subprocess management
-
-```typescript
-const client = await SandboxAgent.connect({
-  baseUrl: "http://remote-server:8080",
-  token: "secret",
-});
-```
-
-### Auto-Detection
-
-`SandboxAgent` provides two factory methods:
-
-```typescript
-// Connect to existing server
-const client = await SandboxAgent.connect({
-  baseUrl: "http://remote:8080",
-});
-
-// Start embedded subprocess
-const client = await SandboxAgent.start();
-
-// With options
-const client = await SandboxAgent.start({
-  spawn: { port: 9000 },
-});
-```
-
-The `spawn` option can be:
- `true` / `false` - Enable/disable embedded mode
- `SandboxAgentSpawnOptions` - Fine-grained control over host, port, token, binary path, timeout, logging
-
-## Authentication
-
-The daemon uses a **global token** configured at startup. All HTTP and CLI operations reuse the same token and are validated against the `Authorization` header (`Bearer` or `Token`).
-
-## Key Files
-
-| Component | Path |
-|-----------|------|
-| Agent spawn/install | `server/packages/agent-management/src/agents.rs` |
-| Session routing | `server/packages/sandbox-agent/src/router.rs` |
-| Event converters | `server/packages/universal-agent-schema/src/agents/*.rs` |
-| Schema extractors | `resources/agent-schemas/src/*.ts` |
-| TypeScript SDK | `sdks/typescript/src/` |
+For production deployments, see [Orchestration Architecture](/orchestration-architecture) for recommended topology, backend requirements, and session persistence patterns.
--- a/docs/attachments.mdx
+++ b/docs/attachments.mdx
@ -0,0 +1,61 @@
+---
+title: "Attachments"
+description: "Upload files into the sandbox and reference them in prompts."
+sidebarTitle: "Attachments"
+icon: "paperclip"
+---
+
+Use the filesystem API to upload files, then include file references in prompt content.
+
+<Steps>
+  <Step title="Upload a file">
+    <CodeGroup>
+    ```ts TypeScript
+    import { SandboxAgent } from "sandbox-agent";
+    import fs from "node:fs";
+
+    const sdk = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+    });
+
+    const buffer = await fs.promises.readFile("./data.csv");
+
+    const upload = await sdk.writeFsFile(
+      { path: "./uploads/data.csv" },
+      buffer,
+    );
+
+    console.log(upload.path);
+    ```
+
+    ```bash cURL
+    curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv" \
+      --data-binary @./data.csv
+    ```
+    </CodeGroup>
+
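+    The upload response includes the file's absolute path (`upload.path` in the TypeScript example). Use it to build the `file://` URI in the next step.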
+  </Step>
+
+  <Step title="Reference the file in a prompt">
+    ```ts TypeScript
+    const session = await sdk.createSession({ agent: "mock" });
+
+    await session.prompt([
+      { type: "text", text: "Please analyze the attached CSV." },
+      {
+        type: "resource_link",
+        name: "data.csv",
+        uri: "file:///home/sandbox/uploads/data.csv",
+        mimeType: "text/csv",
+      },
+    ]);
+    ```
+  </Step>
+</Steps>
+
+## Notes
+
+- Use absolute file URIs in `resource_link` blocks.
+- If `mimeType` is omitted, the agent/runtime may infer a default.
+- Support for non-text resources depends on each agent's prompt capabilities.
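The note above about absolute file URIs can be sketched as a small helper. This is a minimal sketch, not part of the SDK: `toResourceLink` and the `ResourceLink` type are hypothetical names, and it assumes the sandbox reports POSIX-style absolute paths such as the one logged from `upload.path`.

```typescript
// Hypothetical helper (not part of the sandbox-agent SDK): builds a
// `resource_link` prompt block from an absolute POSIX path.
type ResourceLink = {
  type: "resource_link";
  name: string;
  uri: string;
  mimeType?: string;
};

function toResourceLink(absolutePath: string, mimeType?: string): ResourceLink {
  if (!absolutePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absolutePath}`);
  }
  // Percent-encode each path segment so spaces and similar characters are URI-safe.
  const encoded = absolutePath.split("/").map(encodeURIComponent).join("/");
  // Use the last path segment as the display name.
  const name = absolutePath.slice(absolutePath.lastIndexOf("/") + 1);
  const link: ResourceLink = { type: "resource_link", name, uri: `file://${encoded}` };
  if (mimeType !== undefined) {
    link.mimeType = mimeType;
  }
  return link;
}

const link = toResourceLink("/home/sandbox/uploads/data.csv", "text/csv");
console.log(link.uri); // file:///home/sandbox/uploads/data.csv
```

Rejecting relative paths up front keeps a value like `./uploads/data.csv` from turning into a malformed URI. Omitting `mimeType` leaves the field out entirely, so the agent or runtime can infer a default.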
--- a/docs/building-chat-ui.mdx
+++ b/docs/building-chat-ui.mdx
@ -1,167 +0,0 @@
---
-title: "Building a Chat UI"
-description: "Design a client that renders universal session events consistently across providers."
---
-
-This guide explains how to build a chat UI that works across all agents using the universal event
-stream.
-
-## High-level flow
-
-1. List agents and read their capabilities.
-2. Create a session for the selected agent.
-3. Send user messages.
-4. Subscribe to events (polling or SSE).
-5. Render items and deltas into a stable message timeline.
-
-## Use agent capabilities
-
-Capabilities tell you which features are supported for the selected agent:
-
- `tool_calls` and `tool_results` indicate tool execution events.
- `questions` and `permissions` indicate HITL flows.
- `plan_mode` indicates that the agent supports plan-only execution.
- `reasoning` and `status` indicate that the agent can emit reasoning/status content parts.
- `item_started` indicates that the agent emits `item.started` on its own; when false the daemon will emit a synthetic `item.started` immediately after sending a user message.
-
-Use these to enable or disable UI affordances (tool panels, approval buttons, etc.).
-
-## Event model
-
-Every event includes:
-
- `event_id`, `sequence`, and `time` for ordering.
- `session_id` for the universal session.
- `native_session_id` for provider-specific debugging.
- `type` with one of:
-  - `session.started`, `session.ended`
-  - `item.started`, `item.delta`, `item.completed`
-  - `permission.requested`, `permission.resolved`
-  - `question.requested`, `question.resolved`
-  - `error`, `agent.unparsed`
- `data` which holds the payload for the event type.
- `synthetic` and `source` to show daemon-generated events.
- `raw` (optional) when `include_raw=true`.
-
-## Rendering items
-
-Items are emitted in three phases:
-
- `item.started`: first snapshot of a message or tool item.
- `item.delta`: incremental updates (token streaming or synthetic deltas).
- `item.completed`: final snapshot.
-
-Recommended render flow:
-
-```ts
-type ItemState = {
-  item: UniversalItem;
-  deltas: string[];
-};
-
-const items = new Map<string, ItemState>();
-const order: string[] = [];
-
-function applyEvent(event: UniversalEvent) {
-  if (event.type === "item.started") {
-    const item = event.data.item;
-    items.set(item.item_id, { item, deltas: [] });
-    order.push(item.item_id);
-  }
-
-  if (event.type === "item.delta") {
-    const { item_id, delta } = event.data;
-    const state = items.get(item_id);
-    if (state) {
-      state.deltas.push(delta);
-    }
-  }
-
-  if (event.type === "item.completed") {
-    const item = event.data.item;
-    const state = items.get(item.item_id);
-    if (state) {
-      state.item = item;
-    }
-  }
-}
-```
-
-When rendering, combine the item content with accumulated deltas. If you receive a delta before a
-started event (should not happen), treat it as an error.
-
-## Content parts
-
-Each `UniversalItem` has `content` parts. Your UI can branch on `part.type`:
-
- `text` for normal chat text.
- `tool_call` and `tool_result` for tool execution.
- `file_ref` for file read/write/patch previews.
- `reasoning` if you display public reasoning text.
- `status` for progress updates.
- `image` for image outputs.
-
-Treat `item.kind` as the primary layout decision (message vs tool call vs system), and use content
-parts for the detailed rendering.
-
-## Questions and permissions
-
-Question and permission events are out-of-band from item flow. Render them as modal or inline UI
-blocks that must be resolved via:
-
- `POST /v1/sessions/{session_id}/questions/{question_id}/reply`
- `POST /v1/sessions/{session_id}/questions/{question_id}/reject`
- `POST /v1/sessions/{session_id}/permissions/{permission_id}/reply`
-
-If an agent does not advertise these capabilities, keep those UI controls hidden.
-
-## Error and unparsed events
-
- `error` events are structured failures from the daemon or agent.
- `agent.unparsed` indicates the provider emitted something the converter could not parse.
-
-Treat `agent.unparsed` as a hard failure in development so you can fix converters quickly.
-
-## Event ordering
-
-Prefer `sequence` for ordering. It is monotonic for a given session. The `time` field is for
-timestamps, not ordering.
-
-## Handling session end
-
-`session.ended` includes the reason and who terminated it. Disable input after a terminal event.
-
-## Optional raw payloads
-
-If you need provider-level debugging, pass `include_raw=true` when streaming or polling events
-(including one-turn streams) to receive the `raw` payload for each event.
-
-## SSE vs polling vs turn streaming
-
- SSE gives low-latency updates and simplifies streaming UIs.
- Polling is simpler to debug and works in any environment.
- Turn streaming (`POST /v1/sessions/{session_id}/messages/stream`) is a one-shot stream tied to a
-  single prompt. The stream closes automatically once the turn completes.
-
-Both yield the same event payloads.
-
-## Mock agent for UI testing
-
-Use the built-in `mock` agent to exercise UI behaviors without external credentials:
-
-```bash
-curl -X POST http://127.0.0.1:2468/v1/sessions/demo-session \
-  -H "content-type: application/json" \
-  -d '{"agent":"mock"}'
-```
-
-The mock agent sends a prompt telling you what commands it accepts. Send messages like `demo`,
-`markdown`, or `permission` to emit specific event sequences. Any other text is echoed back as an
-assistant message so you can test rendering, streaming, and approval flows on demand.
-
-## Reference implementation
-
-The [Inspector chat UI](https://github.com/rivet-dev/sandbox-agent/blob/main/frontend/packages/inspector/src/App.tsx)
-is a complete reference implementation showing how to build a chat interface using the universal event
-stream. It demonstrates session management, event rendering, item lifecycle handling, and HITL approval
-flows.
--- a/docs/cli.mdx
+++ b/docs/cli.mdx
@ -1,140 +1,298 @@
 ---
-title: "CLI"
-description: "CLI reference and server flags."
+title: "CLI Reference"
+description: "CLI reference for sandbox-agent."
+sidebarTitle: "CLI"
 ---

-The `sandbox-agent api` subcommand mirrors the HTTP API so you can script everything without writing client code.
+Global flags (available on all commands):

-## Server flags
+- `-t, --token <TOKEN>`: bearer token; the server requires it on incoming requests and client commands send it
+- `-n, --no-token`: disable auth
+
+## server
+
+Run the HTTP server.

 ```bash
-sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
+sandbox-agent server [OPTIONS]
 ```

- `--token`: global token for all requests.
- `--no-token`: disable auth (local dev only).
- `--host`, `--port`: bind address.
- `--cors-allow-origin`, `--cors-allow-method`, `--cors-allow-header`, `--cors-allow-credentials`: configure CORS.
- `--no-telemetry`: disable anonymous telemetry.
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host to bind |
+| `-p, --port <PORT>` | `2468` | Port to bind |
+| `-O, --cors-allow-origin <ORIGIN>` | - | Allowed CORS origin (repeatable) |
+| `-M, --cors-allow-method <METHOD>` | all | Allowed CORS method (repeatable) |
+| `-A, --cors-allow-header <HEADER>` | all | Allowed CORS header (repeatable) |
+| `-C, --cors-allow-credentials` | false | Enable CORS credentials |
+| `--no-telemetry` | false | Disable anonymous telemetry |

-## Install agent (no server required)
+```bash
+sandbox-agent server --port 3000
+```

-<details>
-<summary><strong>install-agent</strong></summary>
+Notes:
+
+- Server logs are redirected to files by default.
+- Set `SANDBOX_AGENT_LOG_STDOUT=1` to force stdout/stderr logging.
+- Use `SANDBOX_AGENT_LOG_DIR` to override the log directory.
+
+## install
+
+Install first-party runtime dependencies.
+
+### install desktop
+
+Install the Linux desktop runtime packages required by `/v1/desktop/*`.
+
+```bash
+sandbox-agent install desktop [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--yes` | Skip the confirmation prompt |
+| `--print-only` | Print the package-manager command without executing it |
+| `--package-manager <apt\|dnf\|apk>` | Override package-manager detection |
+| `--no-fonts` | Skip the default DejaVu font package |
+
+```bash
+sandbox-agent install desktop --yes
+sandbox-agent install desktop --print-only
+```
+
+Notes:
+
+- Supported on Linux only.
+- The command detects `apt`, `dnf`, or `apk`.
+- If the command is not already running as root, it requires `sudo`.
+
+## install-agent
+
+Install or reinstall a single agent, or every supported agent with `--all`.
+
+```bash
+sandbox-agent install-agent [<AGENT>] [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--all` | Install every supported agent |
+| `-r, --reinstall` | Force reinstall |
+| `--agent-version <VERSION>` | Override agent package version (conflicts with `--all`) |
+| `--agent-process-version <VERSION>` | Override agent process version (conflicts with `--all`) |
+
+Examples:

 ```bash
 sandbox-agent install-agent claude --reinstall
+sandbox-agent install-agent --all
 ```
-</details>

-## API agent commands
+### Custom Pi implementation path

-<details>
-<summary><strong>api agents list</strong></summary>
+If you use a forked or custom `pi` binary with `pi-acp`, you can override which executable is launched.
+
+#### Option 1: explicit command override (recommended)
+
+Set `PI_ACP_PI_COMMAND` in the environment where `sandbox-agent` runs:

 ```bash
-sandbox-agent api agents list --endpoint http://127.0.0.1:2468
+PI_ACP_PI_COMMAND=/absolute/path/to/your/pi-fork sandbox-agent server
 ```
-</details>

-<details>
-<summary><strong>api agents install</strong></summary>
+This is forwarded to `pi-acp`, which uses it instead of looking up `pi` on `PATH`.
+
+#### Option 2: PATH override
+
+Put your custom `pi` first on `PATH` before starting `sandbox-agent`:

 ```bash
-sandbox-agent api agents install claude --reinstall --endpoint http://127.0.0.1:2468
+export PATH="/path/to/custom-pi-dir:$PATH"
+sandbox-agent server
 ```
-</details>

-<details>
-<summary><strong>api agents modes</strong></summary>
+#### Option 3: symlink override
+
+Point `pi` to your custom binary via symlink in a directory that is early on `PATH`:

 ```bash
-sandbox-agent api agents modes claude --endpoint http://127.0.0.1:2468
+ln -sf /absolute/path/to/your/pi-fork /usr/local/bin/pi
 ```
-</details>

-## API session commands
+Then start `sandbox-agent` normally.

-<details>
-<summary><strong>api sessions list</strong></summary>
+## opencode (experimental)
+
+Start or reuse the daemon, then run `opencode attach` against `/opencode`.

 ```bash
-sandbox-agent api sessions list --endpoint http://127.0.0.1:2468
+sandbox-agent opencode [OPTIONS]
 ```
-</details>

-<details>
-<summary><strong>api sessions create</strong></summary>
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Daemon host |
+| `-p, --port <PORT>` | `2468` | Daemon port |
+| `--session-title <TITLE>` | - | Reserved option (currently no-op) |
+| `--yolo` | false | OpenCode attach mode flag |

 ```bash
-sandbox-agent api sessions create my-session \
-  --agent claude \
-  --agent-mode build \
-  --permission-mode default \
-  --endpoint http://127.0.0.1:2468
+sandbox-agent opencode
 ```
-</details>

-<details>
-<summary><strong>api sessions send-message</strong></summary>
+## daemon
+
+Manage the background daemon.
+
+### daemon start

 ```bash
-sandbox-agent api sessions send-message my-session \
-  --message "Summarize the repository" \
-  --endpoint http://127.0.0.1:2468
+sandbox-agent daemon start [OPTIONS]
 ```
-</details>

-<details>
-<summary><strong>api sessions send-message-stream</strong></summary>
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+| `--upgrade` | false | Use ensure-running + upgrade behavior |

 ```bash
-sandbox-agent api sessions send-message-stream my-session \
-  --message "Summarize the repository" \
-  --endpoint http://127.0.0.1:2468
+sandbox-agent daemon start
+sandbox-agent daemon start --upgrade
 ```
-</details>

-<details>
-<summary><strong>api sessions events</strong></summary>
+### daemon stop

 ```bash
-sandbox-agent api sessions events my-session --offset 0 --limit 50 --endpoint http://127.0.0.1:2468
+sandbox-agent daemon stop [OPTIONS]
 ```
-</details>

-<details>
-<summary><strong>api sessions events-sse</strong></summary>
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+
+### daemon status

 ```bash
-sandbox-agent api sessions events-sse my-session --offset 0 --endpoint http://127.0.0.1:2468
+sandbox-agent daemon status [OPTIONS]
 ```
-</details>

-<details>
-<summary><strong>api sessions reply-question</strong></summary>
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+
+## credentials
+
+### credentials extract

 ```bash
-sandbox-agent api sessions reply-question my-session QUESTION_ID \
-  --answers "yes" \
-  --endpoint http://127.0.0.1:2468
+sandbox-agent credentials extract [OPTIONS]
 ```
-</details>

-<details>
-<summary><strong>api sessions reject-question</strong></summary>
+| Option | Description |
+|--------|-------------|
+| `-a, --agent <AGENT>` | Filter by `claude`, `codex`, `opencode`, or `amp` |
+| `-p, --provider <PROVIDER>` | Filter by provider |
+| `-d, --home-dir <DIR>` | Override the home directory |
+| `--no-oauth` | Skip OAuth sources |
+| `-r, --reveal` | Show full credential values |

 ```bash
-sandbox-agent api sessions reject-question my-session QUESTION_ID --endpoint http://127.0.0.1:2468
+sandbox-agent credentials extract --agent claude --reveal
 ```
-</details>

-<details>
-<summary><strong>api sessions reply-permission</strong></summary>
+### credentials extract-env

 ```bash
-sandbox-agent api sessions reply-permission my-session PERMISSION_ID \
-  --reply once \
-  --endpoint http://127.0.0.1:2468
+sandbox-agent credentials extract-env [OPTIONS]
+```
+
+| Option | Description |
+|--------|-------------|
+| `-e, --export` | Prefix output with `export` |
+| `-d, --home-dir <DIR>` | Override the home directory |
+| `--no-oauth` | Skip OAuth sources |
+
+```bash
+eval "$(sandbox-agent credentials extract-env --export)"
+```
+
+## api
+
+API subcommands for scripting.
+
+Shared option:
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-e, --endpoint <URL>` | `http://127.0.0.1:2468` | Target server |
+
+### api agents
+
+```bash
+sandbox-agent api agents list [--endpoint <URL>]
+sandbox-agent api agents report [--endpoint <URL>]
+sandbox-agent api agents install <AGENT> [--reinstall] [--endpoint <URL>]
+```
+
+#### api agents list
+
+List all agents and their install status.
+
+```bash
+sandbox-agent api agents list
+```
+
+#### api agents report
+
+Emit a JSON report of available models, modes, and thought levels for every agent, grouped by category.
+
+```bash
+sandbox-agent api agents report --endpoint http://127.0.0.1:2468 | jq .
+```
+
+Example output:
+
+```json
+{
+  "generatedAtMs": 1740000000000,
+  "endpoint": "http://127.0.0.1:2468",
+  "agents": [
+    {
+      "id": "claude",
+      "installed": true,
+      "models": {
+        "currentValue": "default",
+        "values": [
+          { "value": "default", "name": "Default" },
+          { "value": "sonnet", "name": "Sonnet" },
+          { "value": "opus", "name": "Opus" },
+          { "value": "haiku", "name": "Haiku" }
+        ]
+      },
+      "modes": {
+        "currentValue": "default",
+        "values": [
+          { "value": "default", "name": "Default" },
+          { "value": "acceptEdits", "name": "Accept Edits" },
+          { "value": "plan", "name": "Plan" },
+          { "value": "dontAsk", "name": "Don't Ask" },
+          { "value": "bypassPermissions", "name": "Bypass Permissions" }
+        ]
+      },
+      "thoughtLevels": { "values": [] }
+    }
+  ]
+}
+```
+
+See individual agent pages (e.g. [Claude](/agents/claude), [Codex](/agents/codex)) for supported models, modes, and thought levels.
+
+#### api agents install
+
+```bash
+sandbox-agent api agents install codex --reinstall
 ```
-</details>
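The `api agents report` output shown above can be consumed from a script. The following is a minimal sketch, assuming the report has the shape of that example output. The `AgentReport` types and the `summarizeReport` helper are hypothetical names, not part of the SDK, and the sample data is abbreviated from the example.

```typescript
// Types inferred from the example output; the real schema may carry more fields.
type ReportOption = { value: string; name: string };
type OptionGroup = { currentValue?: string; values: ReportOption[] };
type AgentEntry = {
  id: string;
  installed: boolean;
  models: OptionGroup;
  modes: OptionGroup;
  thoughtLevels: OptionGroup;
};
type AgentReport = { generatedAtMs: number; endpoint: string; agents: AgentEntry[] };

// Hypothetical helper: one summary line per installed agent.
function summarizeReport(report: AgentReport): string[] {
  return report.agents
    .filter((agent) => agent.installed)
    .map((agent) => {
      const models = agent.models.values.map((m) => m.value).join(", ");
      const modes = agent.modes.values.map((m) => m.value).join(", ");
      return `${agent.id}: models=[${models}] modes=[${modes}]`;
    });
}

// Abbreviated sample; in practice, parse the JSON printed by the CLI instead.
const sample: AgentReport = JSON.parse(`{
  "generatedAtMs": 1740000000000,
  "endpoint": "http://127.0.0.1:2468",
  "agents": [
    {
      "id": "claude",
      "installed": true,
      "models": {
        "currentValue": "default",
        "values": [
          { "value": "default", "name": "Default" },
          { "value": "sonnet", "name": "Sonnet" }
        ]
      },
      "modes": {
        "currentValue": "default",
        "values": [
          { "value": "default", "name": "Default" },
          { "value": "plan", "name": "Plan" }
        ]
      },
      "thoughtLevels": { "values": [] }
    }
  ]
}`);

const summary = summarizeReport(sample);
console.log(summary.join("\n"));
```

Declaring the types by hand keeps the sketch self-contained. `JSON.parse` does not validate its input, so a script that depends on this shape may want to check the fields it reads before using them.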
--- a/docs/common-software.mdx
+++ b/docs/common-software.mdx
@ -0,0 +1,560 @@
+---
+title: "Common Software"
+description: "Install browsers, languages, databases, and other tools inside the sandbox."
+sidebarTitle: "Common Software"
+icon: "box-open"
+---
+
+The sandbox runs a Debian/Ubuntu base image. You can install software with `apt-get` via the [Process API](/processes) or by customizing your Docker image. On a fresh image the package index may be empty, so run `apt-get update` before the first `apt-get install`. This page covers commonly needed packages and how to install them.
+
+## Browsers
+
+### Chromium
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "chromium", "chromium-sandbox"],
+});
+
+// Launch headless
+await sdk.runProcess({
+  command: "chromium",
+  args: ["--headless", "--no-sandbox", "--disable-gpu", "https://example.com"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","chromium","chromium-sandbox"]}'
+```
+</CodeGroup>
+
+<Note>
+Use `--no-sandbox` when running Chromium inside a container. The container itself provides isolation.
+</Note>
+
+### Firefox
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "firefox-esr"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","firefox-esr"]}'
+```
+</CodeGroup>
+
+### Playwright browsers
+
+Playwright manages its own browser binaries. Run the Playwright CLI through `npx` to download a browser; `--with-deps` also installs the system packages that browser needs.
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "npx",
+  args: ["playwright", "install", "--with-deps", "chromium"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"npx","args":["playwright","install","--with-deps","chromium"]}'
+```
+</CodeGroup>
+
+---
+
+## Languages and runtimes
+
+### Node.js
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "nodejs", "npm"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","nodejs","npm"]}'
+```
+</CodeGroup>
+
+For a specific version, use [nvm](https://github.com/nvm-sh/nvm):
+
+```ts TypeScript
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash && . ~/.nvm/nvm.sh && nvm install 22"],
+});
+```
+
+### Python
+
+Python 3 is typically pre-installed. To add pip and common packages:
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "python3", "python3-pip", "python3-venv"],
+});
+
+await sdk.runProcess({
+  command: "pip3",
+  args: ["install", "numpy", "pandas", "matplotlib"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","python3","python3-pip","python3-venv"]}'
+
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"pip3","args":["install","numpy","pandas","matplotlib"]}'
+```
+</CodeGroup>
+
+### Go
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"],
+});
+
+// PATH changes do not persist across processes, so set PATH in each command that uses Go
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "export PATH=$PATH:/usr/local/go/bin && go version"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"bash","args":["-c","curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"]}'
+```
+</CodeGroup>
+
+### Rust
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"bash","args":["-c","curl --proto =https --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"]}'
+```
+</CodeGroup>
+
+### Java (OpenJDK)
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "default-jdk"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","default-jdk"]}'
+```
+</CodeGroup>
+
+### Ruby
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "ruby-full"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","ruby-full"]}'
+```
+</CodeGroup>
+
+---
+
+## Databases
+
+### PostgreSQL
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "postgresql", "postgresql-client"],
+});
+
+// Start the service (replace 15 with the installed PostgreSQL major version)
+const proc = await sdk.createProcess({
+  command: "bash",
+  args: ["-c", "su - postgres -c 'pg_ctlcluster 15 main start'"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","postgresql","postgresql-client"]}'
+```
+</CodeGroup>
+
+### SQLite
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "sqlite3"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","sqlite3"]}'
+```
+</CodeGroup>
+
+### Redis
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "redis-server"],
+});
+
+const proc = await sdk.createProcess({
+  command: "redis-server",
+  args: ["--daemonize", "no"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","redis-server"]}'
+
+curl -X POST "http://127.0.0.1:2468/v1/processes" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"redis-server","args":["--daemonize","no"]}'
+```
+</CodeGroup>
+
+### MySQL / MariaDB
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "mariadb-server", "mariadb-client"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","mariadb-server","mariadb-client"]}'
+```
+</CodeGroup>
+
+---
+
+## Build tools
+
+### Essential build toolchain
+
+Most compiled software needs the standard build toolchain:
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "build-essential", "cmake", "pkg-config"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","build-essential","cmake","pkg-config"]}'
+```
+</CodeGroup>
+
+This installs `gcc`, `g++`, `make`, `cmake`, and related tools.
+
+---
+
+## Desktop applications
+
+These require the [Computer Use](/computer-use) desktop to be started first.
+
+### LibreOffice
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "libreoffice"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","libreoffice"]}'
+```
+</CodeGroup>
+
+### GIMP
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "gimp"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","gimp"]}'
+```
+</CodeGroup>
+
+### VLC
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "vlc"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","vlc"]}'
+```
+</CodeGroup>
+
+### VS Code (code-server)
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "curl -fsSL https://code-server.dev/install.sh | sh"],
+});
+
+const proc = await sdk.createProcess({
+  command: "code-server",
+  args: ["--bind-addr", "0.0.0.0:8080", "--auth", "none"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"bash","args":["-c","curl -fsSL https://code-server.dev/install.sh | sh"]}'
+
+curl -X POST "http://127.0.0.1:2468/v1/processes" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"code-server","args":["--bind-addr","0.0.0.0:8080","--auth","none"]}'
+```
+</CodeGroup>
+
+---
+
+## CLI tools
+
+### Git
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "git"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","git"]}'
+```
+</CodeGroup>
+
+### Docker
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "bash",
+  args: ["-c", "curl -fsSL https://get.docker.com | sh"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"bash","args":["-c","curl -fsSL https://get.docker.com | sh"]}'
+```
+</CodeGroup>
+
+### jq
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "jq"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","jq"]}'
+```
+</CodeGroup>
+
+### tmux
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "tmux"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","tmux"]}'
+```
+</CodeGroup>
+
+---
+
+## Media and graphics
+
+### FFmpeg
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "ffmpeg"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","ffmpeg"]}'
+```
+</CodeGroup>
+
+### ImageMagick
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "imagemagick"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","imagemagick"]}'
+```
+</CodeGroup>
+
+### Poppler (PDF utilities)
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "poppler-utils"],
+});
+
+// Convert PDF to images
+await sdk.runProcess({
+  command: "pdftoppm",
+  args: ["-png", "document.pdf", "output"],
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
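+  -d '{"command":"apt-get","args":["install","-y","poppler-utils"]}'
+
+# Convert PDF to images
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"pdftoppm","args":["-png","document.pdf","output"]}'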
+```
+</CodeGroup>
+
+---
+
+## Pre-installing in a Docker image
+
+For production use, install software in your Dockerfile instead of at runtime. This avoids repeated downloads and makes startup faster.
+
+```dockerfile
+FROM ubuntu:22.04
+
+RUN apt-get update && apt-get install -y \
+    chromium \
+    firefox-esr \
+    nodejs npm \
+    python3 python3-pip \
+    git curl wget \
+    build-essential \
+    sqlite3 \
+    ffmpeg \
+    imagemagick \
+    jq \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN pip3 install numpy pandas matplotlib
+```
+
+See [Docker deployment](/deploy/docker) for how to use custom images with Sandbox Agent.
--- a/docs/computer-use.mdx
+++ b/docs/computer-use.mdx
@ -0,0 +1,859 @@
+---
+title: "Computer Use"
+description: "Control a virtual desktop inside the sandbox with mouse, keyboard, screenshots, recordings, and live streaming."
+sidebarTitle: "Computer Use"
+icon: "desktop"
+---
+
+Sandbox Agent provides a managed virtual desktop (Xvfb + openbox) that you can control programmatically. This is useful for browser automation, GUI testing, and AI computer-use workflows.
+
+## Start and stop
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+const status = await sdk.startDesktop({
+  width: 1920,
+  height: 1080,
+  dpi: 96,
+});
+
+console.log(status.state); // "active"
+console.log(status.display); // ":99"
+
+// When done
+await sdk.stopDesktop();
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \
+  -H "Content-Type: application/json" \
+  -d '{"width":1920,"height":1080,"dpi":96}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/stop"
+```
+</CodeGroup>
+
+All fields in the start request are optional. Defaults are 1440x900 at 96 DPI.
+
+### Start request options
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `width` | number | 1440 | Desktop width in pixels |
+| `height` | number | 900 | Desktop height in pixels |
+| `dpi` | number | 96 | Display DPI |
+| `displayNum` | number | 99 | Starting X display number. The runtime probes from this number upward to find an available display. |
+| `stateDir` | string | (auto) | Desktop state directory for home, logs, recordings |
+| `streamVideoCodec` | string | `"vp8"` | WebRTC video codec (`vp8`, `vp9`, `h264`) |
+| `streamAudioCodec` | string | `"opus"` | WebRTC audio codec (`opus`, `g722`) |
+| `streamFrameRate` | number | 30 | Streaming frame rate (1-60) |
+| `webrtcPortRange` | string | `"59050-59070"` | UDP port range for WebRTC media |
+| `recordingFps` | number | 30 | Default recording FPS when not specified in `startDesktopRecording` (1-60) |
+
+The streaming and recording options configure defaults for the desktop session. They take effect when streaming or recording is started later.
+
+<CodeGroup>
+```ts TypeScript
+const status = await sdk.startDesktop({
+  width: 1920,
+  height: 1080,
+  streamVideoCodec: "h264",
+  streamFrameRate: 60,
+  webrtcPortRange: "59100-59120",
+  recordingFps: 15,
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "width": 1920,
+    "height": 1080,
+    "streamVideoCodec": "h264",
+    "streamFrameRate": 60,
+    "webrtcPortRange": "59100-59120",
+    "recordingFps": 15
+  }'
+```
+</CodeGroup>
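+
+The documented ranges can be checked before a start request is sent. The sketch below is a hypothetical client-side helper (`validateStartOptions` and `StartOptions` are not part of the SDK). It assumes `webrtcPortRange` is always written as `low-high` with valid UDP port numbers, which the default value suggests but the table does not state.
+
+```ts
+// Hypothetical client-side check; not part of the Sandbox Agent SDK.
+interface StartOptions {
+  streamVideoCodec?: string;
+  streamAudioCodec?: string;
+  streamFrameRate?: number;
+  webrtcPortRange?: string;
+  recordingFps?: number;
+}
+
+// Returns a list of problems; an empty list means every supplied option
+// falls within the values listed in the start request options table.
+function validateStartOptions(opts: StartOptions): string[] {
+  const problems: string[] = [];
+  const inRange = (n: number, lo: number, hi: number) => n >= lo && n <= hi;
+
+  if (opts.streamVideoCodec !== undefined &&
+      ["vp8", "vp9", "h264"].indexOf(opts.streamVideoCodec) === -1) {
+    problems.push("streamVideoCodec must be vp8, vp9, or h264");
+  }
+  if (opts.streamAudioCodec !== undefined &&
+      ["opus", "g722"].indexOf(opts.streamAudioCodec) === -1) {
+    problems.push("streamAudioCodec must be opus or g722");
+  }
+  if (opts.streamFrameRate !== undefined && !inRange(opts.streamFrameRate, 1, 60)) {
+    problems.push("streamFrameRate must be between 1 and 60");
+  }
+  if (opts.recordingFps !== undefined && !inRange(opts.recordingFps, 1, 60)) {
+    problems.push("recordingFps must be between 1 and 60");
+  }
+  if (opts.webrtcPortRange !== undefined) {
+    // Assumption: the range is "low-high" with 1 <= low <= high <= 65535.
+    const m = /^(\d+)-(\d+)$/.exec(opts.webrtcPortRange);
+    const lo = m ? Number(m[1]) : NaN;
+    const hi = m ? Number(m[2]) : NaN;
+    if (!m || lo < 1 || hi > 65535 || lo > hi) {
+      problems.push('webrtcPortRange must look like "59050-59070"');
+    }
+  }
+  return problems;
+}
+```
+
+An empty result means the options match the table. Whether the server rejects or clamps out-of-range values is not covered by this sketch.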
+
+## Status
+
+<CodeGroup>
+```ts TypeScript
+const status = await sdk.getDesktopStatus();
+console.log(status.state); // "inactive" | "active" | "failed" | ...
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/status"
+```
+</CodeGroup>
+
+## Screenshots
+
+Capture the full desktop or a specific region. Optionally include the cursor position.
+
+<CodeGroup>
+```ts TypeScript
+// Full screenshot (PNG by default)
+const png = await sdk.takeDesktopScreenshot();
+
+// JPEG at 70% quality, half scale
+const jpeg = await sdk.takeDesktopScreenshot({
+  format: "jpeg",
+  quality: 70,
+  scale: 0.5,
+});
+
+// Include cursor overlay
+const withCursor = await sdk.takeDesktopScreenshot({
+  showCursor: true,
+});
+
+// Region screenshot
+const region = await sdk.takeDesktopRegionScreenshot({
+  x: 100,
+  y: 100,
+  width: 400,
+  height: 300,
+});
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/screenshot" --output screenshot.png
+
+curl "http://127.0.0.1:2468/v1/desktop/screenshot?format=jpeg&quality=70&scale=0.5" \
+  --output screenshot.jpg
+
+# Include cursor overlay
+curl "http://127.0.0.1:2468/v1/desktop/screenshot?show_cursor=true" \
+  --output with_cursor.png
+
+curl "http://127.0.0.1:2468/v1/desktop/screenshot/region?x=100&y=100&width=400&height=300" \
+  --output region.png
+```
+</CodeGroup>
+
+### Screenshot options
+
+| Param | Type | Default | Description |
+|-------|------|---------|-------------|
+| `format` | string | `"png"` | Output format: `png`, `jpeg`, or `webp` |
+| `quality` | number | 85 | Compression quality (1-100, JPEG/WebP only) |
+| `scale` | number | 1.0 | Scale factor (0.1-1.0) |
+| `showCursor` | boolean | `false` | Composite a crosshair at the cursor position |
+
+When `showCursor` is enabled (passed as `show_cursor=true` in the cURL example above), the cursor position is captured at the moment of the screenshot and a red crosshair is drawn at that location. This helps AI agents see where the cursor is in the screenshot.
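+
+When calling the REST endpoint directly, the query string can be assembled from the same options. This is a minimal sketch: `screenshotUrl` and `ScreenshotQuery` are hypothetical names, the query parameter names are copied from the cURL examples, and clamping to the documented ranges is a client-side choice made here, not confirmed server behavior.
+
+```ts
+// Hypothetical helper; query names mirror the cURL examples.
+interface ScreenshotQuery {
+  format?: "png" | "jpeg" | "webp";
+  quality?: number;
+  scale?: number;
+  showCursor?: boolean;
+}
+
+function clamp(n: number, lo: number, hi: number): number {
+  return Math.min(hi, Math.max(lo, n));
+}
+
+function screenshotUrl(baseUrl: string, q: ScreenshotQuery = {}): string {
+  const params: string[] = [];
+  if (q.format !== undefined) params.push("format=" + q.format);
+  // quality applies to JPEG/WebP only, so it is dropped for PNG (the default).
+  if (q.quality !== undefined && q.format !== undefined && q.format !== "png") {
+    params.push("quality=" + Math.round(clamp(q.quality, 1, 100)));
+  }
+  if (q.scale !== undefined) params.push("scale=" + clamp(q.scale, 0.1, 1));
+  if (q.showCursor) params.push("show_cursor=true");
+
+  // Strip trailing slashes so the path is not doubled up.
+  const path = baseUrl.replace(/\/+$/, "") + "/v1/desktop/screenshot";
+  return params.length > 0 ? path + "?" + params.join("&") : path;
+}
+```
+
+For example, `screenshotUrl("http://127.0.0.1:2468", { format: "jpeg", quality: 70, scale: 0.5 })` produces the same URL as the second cURL command above.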
+
+## Mouse
+
+<CodeGroup>
+```ts TypeScript
+// Get current position
+const pos = await sdk.getDesktopMousePosition();
+console.log(pos.x, pos.y);
+
+// Move
+await sdk.moveDesktopMouse({ x: 500, y: 300 });
+
+// Click (left by default)
+await sdk.clickDesktop({ x: 500, y: 300 });
+
+// Right click
+await sdk.clickDesktop({ x: 500, y: 300, button: "right" });
+
+// Double click
+await sdk.clickDesktop({ x: 500, y: 300, clickCount: 2 });
+
+// Drag
+await sdk.dragDesktopMouse({
+  startX: 100, startY: 100,
+  endX: 400, endY: 400,
+});
+
+// Scroll
+await sdk.scrollDesktop({ x: 500, y: 300, deltaY: -3 });
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/mouse/position"
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/click" \
+  -H "Content-Type: application/json" \
+  -d '{"x":500,"y":300}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/drag" \
+  -H "Content-Type: application/json" \
+  -d '{"startX":100,"startY":100,"endX":400,"endY":400}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/scroll" \
+  -H "Content-Type: application/json" \
+  -d '{"x":500,"y":300,"deltaY":-3}'
+```
+</CodeGroup>
+
+## Keyboard
+
+<CodeGroup>
+```ts TypeScript
+// Type text
+await sdk.typeDesktopText({ text: "Hello, world!" });
+
+// Press a key with modifiers
+await sdk.pressDesktopKey({
+  key: "c",
+  modifiers: { ctrl: true },
+});
+
+// Low-level key down/up
+await sdk.keyDownDesktop({ key: "Shift_L" });
+await sdk.keyUpDesktop({ key: "Shift_L" });
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/type" \
+  -H "Content-Type: application/json" \
+  -d '{"text":"Hello, world!"}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/press" \
+  -H "Content-Type: application/json" \
+  -d '{"key":"c","modifiers":{"ctrl":true}}'
+```
+</CodeGroup>
+
+## Clipboard
+
+Read and write the X11 clipboard programmatically.
+
+<CodeGroup>
+```ts TypeScript
+// Read clipboard
+const clipboard = await sdk.getDesktopClipboard();
+console.log(clipboard.text);
+
+// Read primary selection (mouse-selected text)
+const primary = await sdk.getDesktopClipboard({ selection: "primary" });
+
+// Write to clipboard
+await sdk.setDesktopClipboard({ text: "Pasted via API" });
+
+// Write to both clipboard and primary selection
+await sdk.setDesktopClipboard({
+  text: "Synced text",
+  selection: "both",
+});
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/clipboard"
+
+curl "http://127.0.0.1:2468/v1/desktop/clipboard?selection=primary"
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \
+  -H "Content-Type: application/json" \
+  -d '{"text":"Pasted via API"}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \
+  -H "Content-Type: application/json" \
+  -d '{"text":"Synced text","selection":"both"}'
+```
+</CodeGroup>
+
+The `selection` parameter controls which X11 selection to read or write:
+
+| Value | Description |
+|-------|-------------|
+| `clipboard` (default) | The standard clipboard (Ctrl+C / Ctrl+V) |
+| `primary` | The primary selection (text selected with the mouse) |
+| `both` | Write to both clipboard and primary selection (write only) |
+
+## Display and windows
+
+<CodeGroup>
+```ts TypeScript
+const display = await sdk.getDesktopDisplayInfo();
+console.log(display.resolution); // { width: 1920, height: 1080, dpi: 96 }
+
+const { windows } = await sdk.listDesktopWindows();
+for (const win of windows) {
+  console.log(win.title, win.x, win.y, win.width, win.height);
+}
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/display/info"
+
+curl "http://127.0.0.1:2468/v1/desktop/windows"
+```
+</CodeGroup>
+
+The windows endpoint filters out noise automatically: window manager internals (Openbox), windows with empty titles, and tiny helper windows (under 120x80) are excluded. The currently active/focused window is always included regardless of filters.
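+
+The geometry returned for each window can be turned into a click target. The sketch below assumes `x` and `y` are the window's top-left corner in desktop pixels, which the docs do not state explicitly; `windowCenter` is a hypothetical helper, not an SDK method.
+
+```ts
+// Hypothetical helper; assumes x/y mark the window's top-left corner.
+interface WindowGeometry {
+  x: number;
+  y: number;
+  width: number;
+  height: number;
+}
+
+// Returns the pixel at the middle of the window, rounded to whole pixels.
+function windowCenter(win: WindowGeometry): { x: number; y: number } {
+  return {
+    x: Math.round(win.x + win.width / 2),
+    y: Math.round(win.y + win.height / 2),
+  };
+}
+```
+
+The result has the `{ x, y }` shape that `clickDesktop` and `moveDesktopMouse` accept, so `await sdk.clickDesktop(windowCenter(win))` would click the middle of a listed window.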
+
+### Focused window
+
+Get the currently focused window without listing all windows.
+
+<CodeGroup>
+```ts TypeScript
+const focused = await sdk.getDesktopFocusedWindow();
+console.log(focused.title, focused.id);
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/windows/focused"
+```
+</CodeGroup>
+
+Returns 404 if no window currently has focus.
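+
+A client that calls the REST endpoint directly may want to treat that 404 as "nothing focused" and not as a failure. The following is a sketch under assumptions: `interpretFocusedWindow` is a hypothetical name, the response body is assumed to be JSON with at least `id` and `title` (the two fields used in the example above), and how the TypeScript SDK itself surfaces the 404 is not covered here.
+
+```ts
+// Hypothetical response handling for GET /v1/desktop/windows/focused.
+interface FocusedWindow {
+  id: string; // assumption: window IDs are strings, as in the launch example
+  title: string;
+}
+
+function interpretFocusedWindow(status: number, body: string): FocusedWindow | null {
+  // 404 means no window currently has focus; that is not an error.
+  if (status === 404) return null;
+  if (status < 200 || status >= 300) {
+    throw new Error("unexpected status " + status);
+  }
+  return JSON.parse(body) as FocusedWindow;
+}
+```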
+
+### Window management
+
+Focus, move, and resize windows by their X11 window ID.
+
+<CodeGroup>
+```ts TypeScript
+const { windows } = await sdk.listDesktopWindows();
+const win = windows[0];
+
+// Bring window to foreground
+await sdk.focusDesktopWindow(win.id);
+
+// Move window
+await sdk.moveDesktopWindow(win.id, { x: 100, y: 50 });
+
+// Resize window
+await sdk.resizeDesktopWindow(win.id, { width: 1280, height: 720 });
+```
+
+```bash cURL
+# Focus a window
+curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/focus"
+
+# Move a window
+curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/move" \
+  -H "Content-Type: application/json" \
+  -d '{"x":100,"y":50}'
+
+# Resize a window
+curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/resize" \
+  -H "Content-Type: application/json" \
+  -d '{"width":1280,"height":720}'
+```
+</CodeGroup>
+
+All three endpoints return the updated window info so you can verify the operation took effect. The window manager may adjust the requested position or size.
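+
+Because the window manager may adjust what was asked for, it can help to compare the request with the returned window info. A minimal sketch, assuming the returned object carries the same `x`, `y`, `width`, and `height` fields shown in the window list; `geometryDiff` is a hypothetical helper.
+
+```ts
+// Hypothetical helper for checking a move/resize result.
+interface Rect {
+  x: number;
+  y: number;
+  width: number;
+  height: number;
+}
+
+// Lists every requested field whose actual value differs from the request.
+function geometryDiff(requested: Partial<Rect>, actual: Rect): string[] {
+  const keys: (keyof Rect)[] = ["x", "y", "width", "height"];
+  const changed: string[] = [];
+  for (const key of keys) {
+    const want = requested[key];
+    if (want !== undefined && want !== actual[key]) {
+      changed.push(key + ": requested " + want + ", got " + actual[key]);
+    }
+  }
+  return changed;
+}
+```
+
+An empty array means the window manager applied the request unchanged.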
+
+## App launching
+
+Launch applications or open files/URLs on the desktop without needing to shell out.
+
+<CodeGroup>
+```ts TypeScript
+// Launch an app by name
+const result = await sdk.launchDesktopApp({
+  app: "firefox",
+  args: ["--private"],
+});
+console.log(result.processId); // "proc_7"
+
+// Launch and wait for the window to appear
+const withWindow = await sdk.launchDesktopApp({
+  app: "xterm",
+  wait: true,
+});
+console.log(withWindow.windowId); // "12345" or null if timed out
+
+// Open a URL with the default handler
+const opened = await sdk.openDesktopTarget({
+  target: "https://example.com",
+});
+console.log(opened.processId);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \
+  -H "Content-Type: application/json" \
+  -d '{"app":"firefox","args":["--private"]}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \
+  -H "Content-Type: application/json" \
+  -d '{"app":"xterm","wait":true}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/open" \
+  -H "Content-Type: application/json" \
+  -d '{"target":"https://example.com"}'
+```
+</CodeGroup>
+
+The returned `processId` can be used with the [Process API](/processes) to read logs (`GET /v1/processes/{id}/logs`) or stop the application (`POST /v1/processes/{id}/stop`).
+
+When `wait` is `true`, the API polls for up to 5 seconds for a window to appear. If the window appears, its ID is returned in `windowId`. If it times out, `windowId` is `null` but the process is still running.
+
+<Tip>
+**Launch/Open vs the Process API:** Both `launch` and `open` are convenience wrappers around the [Process API](/processes). They create managed processes (with `owner: "desktop"`) that you can inspect, log, and stop through the same Process endpoints. The difference is that `launch` validates the binary exists in PATH first and can optionally wait for a window to appear, while `open` delegates to the system default handler (`xdg-open`). Use the Process API directly when you need full control over command, environment, working directory, or restart policies.
+</Tip>
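+
+The two Process endpoints mentioned above can be derived from a returned `processId`. This is a small sketch with a hypothetical helper name (`processEndpoints`); the paths and methods are the ones quoted in this section.
+
+```ts
+// Hypothetical helper; paths come from the Process API references above.
+interface Endpoint {
+  method: "GET" | "POST";
+  url: string;
+}
+
+function processEndpoints(baseUrl: string, processId: string): { logs: Endpoint; stop: Endpoint } {
+  const root =
+    baseUrl.replace(/\/+$/, "") + "/v1/processes/" + encodeURIComponent(processId);
+  return {
+    logs: { method: "GET", url: root + "/logs" },
+    stop: { method: "POST", url: root + "/stop" },
+  };
+}
+```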
+
+## Recording
+
+Record the desktop to MP4.
+
+<CodeGroup>
+```ts TypeScript
+const recording = await sdk.startDesktopRecording({ fps: 30 });
+console.log(recording.id);
+
+// ... do things ...
+
+const stopped = await sdk.stopDesktopRecording();
+
+// List all recordings
+const { recordings } = await sdk.listDesktopRecordings();
+
+// Download
+const mp4 = await sdk.downloadDesktopRecording(recording.id);
+
+// Clean up
+await sdk.deleteDesktopRecording(recording.id);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/start" \
+  -H "Content-Type: application/json" \
+  -d '{"fps":30}'
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/stop"
+
+curl "http://127.0.0.1:2468/v1/desktop/recordings"
+
+curl "http://127.0.0.1:2468/v1/desktop/recordings/rec_1/download" --output recording.mp4
+
+curl -X DELETE "http://127.0.0.1:2468/v1/desktop/recordings/rec_1"
+```
+</CodeGroup>
+
+## Desktop processes
+
+The desktop runtime manages several background processes (Xvfb, openbox, neko, ffmpeg). These are all registered with the general [Process API](/processes) under the `desktop` owner, so you can inspect logs, check status, and troubleshoot using the same tools you use for any other managed process.
+
+<CodeGroup>
+```ts TypeScript
+// List all processes, including desktop-owned ones
+const { processes } = await sdk.listProcesses();
+
+const desktopProcs = processes.filter((p) => p.owner === "desktop");
+for (const p of desktopProcs) {
+  console.log(p.id, p.command, p.status);
+}
+
+// Read logs from a specific desktop process
+const logs = await sdk.getProcessLogs(desktopProcs[0].id, { tail: 50 });
+for (const entry of logs.entries) {
+  console.log(entry.stream, atob(entry.data));
+}
+```
+
+```bash cURL
+# List all processes (desktop processes have owner: "desktop")
+curl "http://127.0.0.1:2468/v1/processes"
+
+# Get logs from a specific desktop process
+curl "http://127.0.0.1:2468/v1/processes/proc_1/logs?tail=50"
+```
+</CodeGroup>
+
+The desktop status endpoint also includes a summary of running processes:
+
+<CodeGroup>
+```ts TypeScript
+const status = await sdk.getDesktopStatus();
+for (const proc of status.processes) {
+  console.log(proc.name, proc.pid, proc.running);
+}
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/desktop/status"
+# Response includes: processes: [{ name: "Xvfb", pid: 123, running: true }, ...]
+```
+</CodeGroup>
+
+| Process | Role | Restart policy |
+|---------|------|---------------|
+| Xvfb | Virtual X11 framebuffer | Auto-restart while desktop is active |
+| openbox | Window manager | Auto-restart while desktop is active |
+| neko | WebRTC streaming server (started by `startDesktopStream`) | No auto-restart |
+| ffmpeg | Screen recorder (started by `startDesktopRecording`) | No auto-restart |
+
+## Live streaming
+
+Start a WebRTC stream for real-time desktop viewing in a browser.
+
+<CodeGroup>
+```ts TypeScript
+await sdk.startDesktopStream();
+
+// Check stream status
+const status = await sdk.getDesktopStreamStatus();
+console.log(status.active); // true
+console.log(status.processId); // "proc_5"
+
+// Connect via the React DesktopViewer component or
+// use the WebSocket signaling endpoint directly
+// at ws://127.0.0.1:2468/v1/desktop/stream/signaling
+
+await sdk.stopDesktopStream();
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/start"
+
+# Check stream status
+curl "http://127.0.0.1:2468/v1/desktop/stream/status"
+
+# Connect to ws://127.0.0.1:2468/v1/desktop/stream/signaling for WebRTC signaling
+
+curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/stop"
+```
+</CodeGroup>
+
+For a drop-in React component, see [React Components](/react-components).
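+
+If you connect to the signaling endpoint yourself, the WebSocket URL can be derived from the HTTP base URL. A minimal sketch: `signalingUrl` is a hypothetical helper, and mapping `https` to `wss` is an assumption based on common practice, since only the `ws://` form appears above.
+
+```ts
+// Hypothetical helper; derives the signaling URL from the HTTP base URL.
+function signalingUrl(baseUrl: string): string {
+  const wsBase = baseUrl
+    .replace(/^http(s?):\/\//, "ws$1://") // http -> ws, https -> wss (assumed)
+    .replace(/\/+$/, ""); // drop trailing slashes
+  return wsBase + "/v1/desktop/stream/signaling";
+}
+```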
+
+## API reference
+
+### Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/v1/desktop/start` | Start the desktop runtime |
+| `POST` | `/v1/desktop/stop` | Stop the desktop runtime |
+| `GET` | `/v1/desktop/status` | Get desktop runtime status |
+| `GET` | `/v1/desktop/screenshot` | Capture full desktop screenshot |
+| `GET` | `/v1/desktop/screenshot/region` | Capture a region screenshot |
+| `GET` | `/v1/desktop/mouse/position` | Get current mouse position |
+| `POST` | `/v1/desktop/mouse/move` | Move the mouse |
+| `POST` | `/v1/desktop/mouse/click` | Click the mouse |
+| `POST` | `/v1/desktop/mouse/down` | Press mouse button down |
+| `POST` | `/v1/desktop/mouse/up` | Release mouse button |
+| `POST` | `/v1/desktop/mouse/drag` | Drag from one point to another |
+| `POST` | `/v1/desktop/mouse/scroll` | Scroll at a position |
+| `POST` | `/v1/desktop/keyboard/type` | Type text |
+| `POST` | `/v1/desktop/keyboard/press` | Press a key with optional modifiers |
+| `POST` | `/v1/desktop/keyboard/down` | Press a key down (hold) |
+| `POST` | `/v1/desktop/keyboard/up` | Release a key |
+| `GET` | `/v1/desktop/display/info` | Get display info |
+| `GET` | `/v1/desktop/windows` | List visible windows |
+| `GET` | `/v1/desktop/windows/focused` | Get focused window info |
+| `POST` | `/v1/desktop/windows/{id}/focus` | Focus a window |
+| `POST` | `/v1/desktop/windows/{id}/move` | Move a window |
+| `POST` | `/v1/desktop/windows/{id}/resize` | Resize a window |
+| `GET` | `/v1/desktop/clipboard` | Read clipboard contents |
+| `POST` | `/v1/desktop/clipboard` | Write to clipboard |
+| `POST` | `/v1/desktop/launch` | Launch an application |
+| `POST` | `/v1/desktop/open` | Open a file or URL |
+| `POST` | `/v1/desktop/recording/start` | Start recording |
+| `POST` | `/v1/desktop/recording/stop` | Stop recording |
+| `GET` | `/v1/desktop/recordings` | List recordings |
+| `GET` | `/v1/desktop/recordings/{id}` | Get recording metadata |
+| `GET` | `/v1/desktop/recordings/{id}/download` | Download recording |
+| `DELETE` | `/v1/desktop/recordings/{id}` | Delete recording |
+| `POST` | `/v1/desktop/stream/start` | Start WebRTC streaming |
+| `POST` | `/v1/desktop/stream/stop` | Stop WebRTC streaming |
+| `GET` | `/v1/desktop/stream/status` | Get stream status |
+| `GET` | `/v1/desktop/stream/signaling` | WebSocket for WebRTC signaling |
+
+### TypeScript SDK methods
+
+| Method | Returns | Description |
+|--------|---------|-------------|
+| `startDesktop(request?)` | `DesktopStatusResponse` | Start the desktop |
+| `stopDesktop()` | `DesktopStatusResponse` | Stop the desktop |
+| `getDesktopStatus()` | `DesktopStatusResponse` | Get desktop status |
+| `takeDesktopScreenshot(query?)` | `Uint8Array` | Capture screenshot |
+| `takeDesktopRegionScreenshot(query)` | `Uint8Array` | Capture region screenshot |
+| `getDesktopMousePosition()` | `DesktopMousePositionResponse` | Get mouse position |
+| `moveDesktopMouse(request)` | `DesktopMousePositionResponse` | Move mouse |
+| `clickDesktop(request)` | `DesktopMousePositionResponse` | Click mouse |
+| `mouseDownDesktop(request)` | `DesktopMousePositionResponse` | Mouse button down |
+| `mouseUpDesktop(request)` | `DesktopMousePositionResponse` | Mouse button up |
+| `dragDesktopMouse(request)` | `DesktopMousePositionResponse` | Drag mouse |
+| `scrollDesktop(request)` | `DesktopMousePositionResponse` | Scroll |
+| `typeDesktopText(request)` | `DesktopActionResponse` | Type text |
+| `pressDesktopKey(request)` | `DesktopActionResponse` | Press key |
+| `keyDownDesktop(request)` | `DesktopActionResponse` | Key down |
+| `keyUpDesktop(request)` | `DesktopActionResponse` | Key up |
+| `getDesktopDisplayInfo()` | `DesktopDisplayInfoResponse` | Get display info |
+| `listDesktopWindows()` | `DesktopWindowListResponse` | List windows |
+| `getDesktopFocusedWindow()` | `DesktopWindowInfo` | Get focused window |
+| `focusDesktopWindow(id)` | `DesktopWindowInfo` | Focus a window |
+| `moveDesktopWindow(id, request)` | `DesktopWindowInfo` | Move a window |
+| `resizeDesktopWindow(id, request)` | `DesktopWindowInfo` | Resize a window |
+| `getDesktopClipboard(query?)` | `DesktopClipboardResponse` | Read clipboard |
+| `setDesktopClipboard(request)` | `DesktopActionResponse` | Write clipboard |
+| `launchDesktopApp(request)` | `DesktopLaunchResponse` | Launch an app |
+| `openDesktopTarget(request)` | `DesktopOpenResponse` | Open file/URL |
+| `startDesktopRecording(request?)` | `DesktopRecordingInfo` | Start recording |
+| `stopDesktopRecording()` | `DesktopRecordingInfo` | Stop recording |
+| `listDesktopRecordings()` | `DesktopRecordingListResponse` | List recordings |
+| `getDesktopRecording(id)` | `DesktopRecordingInfo` | Get recording |
+| `downloadDesktopRecording(id)` | `Uint8Array` | Download recording |
+| `deleteDesktopRecording(id)` | `void` | Delete recording |
+| `startDesktopStream()` | `DesktopStreamStatusResponse` | Start streaming |
+| `stopDesktopStream()` | `DesktopStreamStatusResponse` | Stop streaming |
+| `getDesktopStreamStatus()` | `DesktopStreamStatusResponse` | Stream status |
+
+## Customizing the desktop environment
+
+The desktop runs inside the sandbox filesystem, so you can customize it using the [File System](/file-system) API before or after starting the desktop. The desktop HOME directory is located at `~/.local/state/sandbox-agent/desktop/home` (or `$XDG_STATE_HOME/sandbox-agent/desktop/home` if `XDG_STATE_HOME` is set).
+
+All configuration files below are written to paths relative to this HOME directory.
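+
+Since every example below repeats the full HOME prefix, a small path helper can keep client code shorter. This is a sketch under assumptions: `desktopHome` and `desktopPath` are hypothetical names, the `xdgStateHome` argument must be the value of `XDG_STATE_HOME` inside the sandbox (not on the client), and a custom `stateDir` passed to `startDesktop` is not handled.
+
+```ts
+// Hypothetical helpers for building paths under the desktop HOME directory.
+function desktopHome(xdgStateHome?: string): string {
+  const stateRoot = xdgStateHome
+    ? xdgStateHome.replace(/\/+$/, "")
+    : "~/.local/state";
+  return stateRoot + "/sandbox-agent/desktop/home";
+}
+
+// Joins a HOME-relative path such as ".config/openbox/rc.xml".
+function desktopPath(relative: string, xdgStateHome?: string): string {
+  return desktopHome(xdgStateHome) + "/" + relative.replace(/^\/+/, "");
+}
+```
+
+With these, `desktopPath(".config/openbox/rc.xml")` yields the same path string used in the openbox example.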
+
+### Window manager (openbox)
+
+The desktop uses [openbox](http://openbox.org/) as its window manager. You can customize its behavior, theme, and keyboard shortcuts by writing an `rc.xml` config file.
+
+<CodeGroup>
+```ts TypeScript
+const openboxConfig = `<?xml version="1.0" encoding="UTF-8"?>
+<openbox_config xmlns="http://openbox.org/3.4/rc">
+  <theme>
+    <name>Clearlooks</name>
+    <titleLayout>NLIMC</titleLayout>
+    <font place="ActiveWindow"><name>DejaVu Sans</name><size>10</size></font>
+  </theme>
+  <desktops><number>1</number></desktops>
+  <keyboard>
+    <keybind key="A-F4"><action name="Close"/></keybind>
+    <keybind key="A-Tab"><action name="NextWindow"/></keybind>
+  </keyboard>
+</openbox_config>`;
+
+await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" },
+  openboxConfig,
+);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox"
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @rc.xml
+```
+</CodeGroup>
+
+### Autostart programs
+
+Openbox runs the script at `~/.config/openbox/autostart` (relative to the desktop HOME directory) on startup. Use it to launch applications, set the background, or configure the environment.
+
+<CodeGroup>
+```ts TypeScript
+const autostart = `#!/bin/sh
+# Set a solid background color
+xsetroot -solid "#1e1e2e" &
+
+# Launch a terminal
+xterm -geometry 120x40+50+50 &
+
+# Launch a browser
+firefox --no-remote &
+`;
+
+await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" },
+  autostart,
+);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox"
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @autostart.sh
+```
+</CodeGroup>
+
+<Note>
+The autostart script runs when openbox starts, which happens during `startDesktop()`. Write the autostart file before calling `startDesktop()` for it to take effect.
+</Note>
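+
+When the list of startup programs is built dynamically, the script text can be generated and not hand-written. A minimal sketch with a hypothetical helper (`buildAutostart`); it backgrounds every command with `&`, following the pattern in the autostart example, so that one long-running program does not block the rest.
+
+```ts
+// Hypothetical helper that renders an openbox autostart script.
+function buildAutostart(commands: string[]): string {
+  const lines = ["#!/bin/sh"];
+  for (const cmd of commands) {
+    const trimmed = cmd.trim();
+    if (trimmed.length === 0) continue; // skip blank entries
+    // Append " &" unless the command is already backgrounded.
+    lines.push(/&\s*$/.test(trimmed) ? trimmed : trimmed + " &");
+  }
+  return lines.join("\n") + "\n";
+}
+```
+
+The returned string can be passed to `writeFsFile` in place of a hand-written script.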
+
+### Background
+
+There is no wallpaper set by default (the background is the X root window default). You can set it using `xsetroot` in the autostart script (as shown above), or use `feh` if you need an image:
+
+<CodeGroup>
+```ts TypeScript
+// Upload a wallpaper image
+import fs from "node:fs";
+
+const wallpaper = await fs.promises.readFile("./wallpaper.png");
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/wallpaper.png" },
+  wallpaper,
+);
+
+// Set the autostart to apply it
+const autostart = `#!/bin/sh
+feh --bg-fill ~/wallpaper.png &
+`;
+
+await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" },
+  autostart,
+);
+```
+
+```bash cURL
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/wallpaper.png" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @wallpaper.png
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @autostart.sh
+```
+</CodeGroup>
+
+<Note>
+`feh` is not installed by default. Install it via the [Process API](/processes) before starting the desktop: `await sdk.runProcess({ command: "apt-get", args: ["install", "-y", "feh"] })`.
+</Note>
+
+### Fonts
+
+Only `fonts-dejavu-core` is installed by default. To add more fonts, install them with your system package manager or copy font files into the sandbox:
+
+<CodeGroup>
+```ts TypeScript
+// Install a font package
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "fonts-noto", "fonts-liberation"],
+});
+
+// Or copy a custom font file
+import fs from "node:fs";
+
+const font = await fs.promises.readFile("./CustomFont.ttf");
+await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts" });
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" },
+  font,
+);
+
+// Rebuild the font cache
+await sdk.runProcess({ command: "fc-cache", args: ["-fv"] });
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","fonts-noto","fonts-liberation"]}'
+
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts"
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @CustomFont.ttf
+
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"fc-cache","args":["-fv"]}'
+```
+</CodeGroup>
+
+### Cursor theme
+
+<CodeGroup>
+```ts TypeScript
+await sdk.runProcess({
+  command: "apt-get",
+  args: ["install", "-y", "dmz-cursor-theme"],
+});
+
+const xresources = `Xcursor.theme: DMZ-White\nXcursor.size: 24\n`;
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.Xresources" },
+  xresources,
+);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"apt-get","args":["install","-y","dmz-cursor-theme"]}'
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.Xresources" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary $'Xcursor.theme: DMZ-White\nXcursor.size: 24\n'
+```
+</CodeGroup>
+
+<Note>
+Run `xrdb -merge ~/.Xresources` (from the autostart script or via the [Process API](/processes)) after writing the file for the changes to take effect.
+</Note>
+
+### Shell and terminal
+
+No terminal emulator or shell is launched by default. Add one to the openbox autostart:
+
+```sh
+# In ~/.config/openbox/autostart
+xterm -geometry 120x40+50+50 &
+```
+
+To use a different shell, set the `SHELL` environment variable in your Dockerfile or install your preferred shell and configure the terminal to use it.
+
+### GTK theme
+
+Applications using GTK will pick up settings from `~/.config/gtk-3.0/settings.ini`:
+
+<CodeGroup>
+```ts TypeScript
+const gtkSettings = `[Settings]
+gtk-theme-name=Adwaita
+gtk-icon-theme-name=Adwaita
+gtk-font-name=DejaVu Sans 10
+gtk-cursor-theme-name=DMZ-White
+gtk-cursor-theme-size=24
+`;
+
+await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0" });
+await sdk.writeFsFile(
+  { path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" },
+  gtkSettings,
+);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0"
+
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" \
+  -H "Content-Type: application/octet-stream" \
+  --data-binary @settings.ini
+```
+</CodeGroup>
+
+### Summary of configuration paths
+
+All paths are relative to the desktop HOME directory (`~/.local/state/sandbox-agent/desktop/home`).
+
+| What | Path | Notes |
+|------|------|-------|
+| Openbox config | `.config/openbox/rc.xml` | Window manager theme, keybindings, behavior |
+| Autostart | `.config/openbox/autostart` | Shell script run on desktop start |
+| Custom fonts | `.local/share/fonts/` | TTF/OTF files, run `fc-cache -fv` after |
+| Cursor theme | `.Xresources` | Requires `xrdb -merge` to apply |
+| GTK 3 settings | `.config/gtk-3.0/settings.ini` | Theme, icons, fonts for GTK apps |
+| Wallpaper | Any path, referenced from autostart | Requires `feh` or similar tool |
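Note on the path table above: every entry is relative to the desktop HOME directory, so a small helper can keep the long prefix in one place. This is a minimal sketch; `desktopPath` and `DESKTOP_HOME` are hypothetical names for illustration, not part of the SDK, and whether the server expands `~` is taken from the examples in this file rather than verified.

```typescript
// Desktop HOME used by the examples in this file; table paths are relative to it.
const DESKTOP_HOME = "~/.local/state/sandbox-agent/desktop/home";

// Hypothetical helper (not an SDK function): joins a table path onto DESKTOP_HOME.
function desktopPath(relative: string): string {
  // Drop leading slashes so the joined path never contains "//".
  return `${DESKTOP_HOME}/${relative.replace(/^\/+/, "")}`;
}

const gtkSettingsPath = desktopPath(".config/gtk-3.0/settings.ini");
const xresourcesPath = desktopPath("/.Xresources");

console.log(gtkSettingsPath);
console.log(xresourcesPath);
```

The resulting strings could then be passed as the `path` argument in the `writeFsFile` calls shown earlier.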
--- a/docs/conversion.mdx
+++ b/docs/conversion.mdx
@ -1,80 +0,0 @@
-# Universal ↔ Agent Term Mapping
-
-Source of truth: generated agent schemas in `resources/agent-schemas/artifacts/json-schema/`.
-
-Identifiers
-
-+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
-| Universal term       | Claude                 | Codex (app-server)                       | OpenCode                    | Amp                    |
-+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
-| session_id           | n/a (daemon-only)      | n/a (daemon-only)                        | n/a (daemon-only)           | n/a (daemon-only)      |
-| native_session_id    | none                   | threadId                                 | sessionID                   | none                   |
-| item_id              | synthetic              | ThreadItem.id                            | Message.id                  | StreamJSONMessage.id   |
-| native_item_id       | none                   | ThreadItem.id                            | Message.id                  | StreamJSONMessage.id   |
-+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
-
-Notes:
- When a provider does not supply IDs (Claude), we synthesize item_id values and keep native_item_id null.
- native_session_id is the only provider session identifier. It is intentionally used for thread/session/run ids.
- native_item_id preserves the agent-native item/message id when present.
- source indicates who emitted the event: agent (native) or daemon (synthetic).
- raw is always present on events. When clients do not opt-in to raw payloads, raw is null.
- opt-in via `include_raw=true` on events endpoints (HTTP + SSE).
- If parsing fails, emit agent.unparsed (source=daemon, synthetic=true). Tests must assert zero unparsed events.
-
-Events / Message Flow
-
-+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+
-| Universal term         | Claude                       | Codex (app-server)                         | OpenCode                                | Amp                              |
-+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+
-| session.started        | none                         | method=thread/started                      | type=session.created                    | none                             |
-| session.ended          | SDKMessage.type=result       | no explicit session end (turn/completed)   | no explicit session end (session.deleted)| type=done                        |
-| message (user)         | SDKMessage.type=user         | item/completed (ThreadItem.type=userMessage)| message.updated (Message.role=user)    | type=message                     |
-| message (assistant)    | SDKMessage.type=assistant    | item/completed (ThreadItem.type=agentMessage)| message.updated (Message.role=assistant)| type=message                  |
-| message.delta          | synthetic                   | method=item/agentMessage/delta             | type=message.part.updated (delta)       | synthetic                        |
-| tool call              | synthetic from tool usage    | method=item/mcpToolCall/progress           | message.part.updated (part.type=tool)   | type=tool_call                   |
-| tool result            | synthetic from tool usage    | item/completed (tool result ThreadItem variants) | message.part.updated (part.type=tool, state=completed) | type=tool_result     |
-| permission.requested   | none                         | none                                       | type=permission.asked                   | none                             |
-| permission.resolved    | none                         | none                                       | type=permission.replied                 | none                             |
-| question.requested     | ExitPlanMode tool (synthetic)| experimental request_user_input (payload)  | type=question.asked                     | none                             |
-| question.resolved      | ExitPlanMode reply (synthetic)| experimental request_user_input (payload) | type=question.replied / question.rejected | none                          |
-| error                  | SDKResultMessage.error       | method=error                               | type=session.error (or message error)   | type=error                        |
-+------------------------+------------------------------+--------------------------------------------+-----------------------------------------+----------------------------------+
-
-Synthetics
-
-+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| Synthetic element            | When it appears        | Stored as               | Notes                                                        |
-+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| session.started              | When agent emits no explicit start | session.started event | Mark source=daemon                                            |
-| session.ended                | When agent emits no explicit end   | session.ended event   | Mark source=daemon; reason may be inferred                    |
-| item_id (Claude)             | Claude provides no item IDs        | item_id               | Maintain provider_item_id map when possible                   |
-| user message (Claude)        | Claude emits only assistant output | item.completed        | Mark source=daemon; preserve raw input in event metadata       |
-| question events (Claude)     | Plan mode ExitPlanMode tool usage  | question.requested/resolved | Synthetic mapping from tool call/result                       |
-| native_session_id (Codex)    | Codex uses threadId                | native_session_id     | Intentionally merged threadId into native_session_id          |
-+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| message.delta (Claude/Amp)   | No native deltas               | item.delta             | Synthetic delta with full message content; source=daemon       |
-+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-| message.delta (OpenCode)     | part delta before message       | item.delta             | If part arrives first, create item.started stub then delta     |
-+------------------------------+------------------------+--------------------------+--------------------------------------------------------------+
-
-Delta handling
-
- Codex emits agent message and other deltas (e.g., item/agentMessage/delta).
- OpenCode emits part deltas via message.part.updated with a delta string.
- Claude and Amp do not emit deltas in their schemas.
-
-Policy:
- Always emit item.delta across all providers.
- For providers without native deltas, emit a single synthetic delta containing the full content prior to item.completed.
- For providers with native deltas, forward as-is; also emit item.completed when final content is known.
-
-Message normalization notes
-
- user vs assistant: normalized via role in the universal item; provider role fields or item types determine role.
- file artifacts: always represented as content parts (type=file_ref) inside message/tool_result items, not a separate item kind.
- reasoning: represented as content parts (type=reasoning) inside message items, with visibility when available.
- subagents: OpenCode subtask parts and Claude Task tool usage are currently normalized into standard message/tool flow (no dedicated subagent fields).
- OpenCode unrolling: message.updated creates/updates the parent message item; tool-related parts emit separate tool item events (item.started/ item.completed) with parent_id pointing to the message item.
- If a message.part.updated arrives before message.updated, we create a stub item.started (source=daemon) so deltas have a parent.
- Tool calls/results are always emitted as separate tool items to keep behavior consistent across agents.
--- a/docs/cors.mdx
+++ b/docs/cors.mdx
@ -0,0 +1,53 @@
+---
+title: "CORS Configuration"
+description: "Configure CORS for browser-based applications."
+sidebarTitle: "CORS"
+---
+
+When calling the Sandbox Agent server from a browser, CORS (Cross-Origin Resource Sharing) controls which origins can make requests.
+
+## Default Behavior
+
+By default, no CORS origins are allowed. You must explicitly specify origins for browser-based applications:
+
+```bash
+sandbox-agent server \
+  --cors-allow-origin "http://localhost:5173"
+```
+
+<Note>
+The built-in Inspector UI at `/ui/` is served from the same origin as the server, so it does not require CORS configuration.
+</Note>
+
+## Options
+
+| Flag | Description |
+|------|-------------|
+| `--cors-allow-origin` | Origins to allow |
+| `--cors-allow-method` | HTTP methods to allow (defaults to all if not specified) |
+| `--cors-allow-header` | Headers to allow (defaults to all if not specified) |
+| `--cors-allow-credentials` | Allow credentials (cookies, authorization headers) |
+
+## Multiple Origins
+
+Specify the flag multiple times to allow multiple origins:
+
+```bash
+sandbox-agent server \
+  --cors-allow-origin "http://localhost:5173" \
+  --cors-allow-origin "http://localhost:3000"
+```
+
+## Restricting Methods and Headers
+
+By default, all methods and headers are allowed. To restrict them:
+
+```bash
+sandbox-agent server \
+  --cors-allow-origin "https://your-app.com" \
+  --cors-allow-method "GET" \
+  --cors-allow-method "POST" \
+  --cors-allow-header "Authorization" \
+  --cors-allow-header "Content-Type" \
+  --cors-allow-credentials
+```
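Note on the CORS flags above: when the server is launched from a script, the repeated flags can be generated from a single policy object. The sketch below is illustrative only; `CorsPolicy` and `buildCorsArgs` are hypothetical names, and only the four flags listed in the Options table are used.

```typescript
// Hypothetical policy shape for illustration; not a sandbox-agent type.
interface CorsPolicy {
  origins: string[];
  methods?: string[];
  headers?: string[];
  credentials?: boolean;
}

// Expands a policy into the repeated CLI flags from the Options table.
function buildCorsArgs(policy: CorsPolicy): string[] {
  const args: string[] = ["server"];
  for (const origin of policy.origins) args.push("--cors-allow-origin", origin);
  for (const method of policy.methods ?? []) args.push("--cors-allow-method", method);
  for (const header of policy.headers ?? []) args.push("--cors-allow-header", header);
  if (policy.credentials) args.push("--cors-allow-credentials");
  return args;
}

const corsArgs = buildCorsArgs({
  origins: ["https://your-app.com"],
  methods: ["GET", "POST"],
  headers: ["Authorization"],
  credentials: true,
});

console.log(["sandbox-agent", ...corsArgs].join(" "));
```

Returning an argument array rather than a single string avoids shell-quoting problems when the result is handed to a process spawner.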
--- a/docs/custom-tools.mdx
+++ b/docs/custom-tools.mdx
@ -0,0 +1,159 @@
+---
+title: "Custom Tools"
+description: "Give agents custom tools inside the sandbox using MCP servers or skills."
+sidebarTitle: "Custom Tools"
+icon: "wrench"
+---
+
+There are two common patterns for sandbox-local custom tooling:
+
+| | MCP Server | Skill |
+|---|---|---|
+| **How it works** | Agent connects to an MCP server (`mcpServers`) | Agent follows `SKILL.md` instructions and runs scripts |
+| **Best for** | Typed tool calls and structured protocols | Lightweight task-specific guidance |
+| **Requires** | MCP server process (stdio/http/sse) | Script + `SKILL.md` |
+
+## Option A: MCP server (stdio)
+
+<Steps>
+  <Step title="Write and bundle your MCP server">
+
+```ts src/mcp-server.ts
+import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+import { z } from "zod";
+
+const server = new McpServer({ name: "rand", version: "1.0.0" });
+
+server.tool(
+  "random_number",
+  "Generate a random integer between min and max",
+  {
+    min: z.number(),
+    max: z.number(),
+  },
+  async ({ min, max }) => ({
+    content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
+  }),
+);
+
+await server.connect(new StdioServerTransport());
+```
+
+```bash
+npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/mcp-server.cjs
+```
+  </Step>
+
+  <Step title="Upload it into the sandbox">
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+import fs from "node:fs";
+
+const sdk = await SandboxAgent.connect({ baseUrl: "http://127.0.0.1:2468" });
+const content = await fs.promises.readFile("./dist/mcp-server.cjs");
+
+await sdk.writeFsFile({ path: "/opt/mcp/custom-tools/mcp-server.cjs" }, content);
+```
+
+```bash
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
+  --data-binary @./dist/mcp-server.cjs
+```
+  </Step>
+
+  <Step title="Register MCP config and create a session">
+
+```ts
+await sdk.setMcpConfig(
+  {
+    directory: "/workspace",
+    mcpName: "customTools",
+  },
+  {
+    type: "local",
+    command: "node",
+    args: ["/opt/mcp/custom-tools/mcp-server.cjs"],
+  },
+);
+
+const session = await sdk.createSession({
+  agent: "claude",
+  cwd: "/workspace",
+});
+
+await session.prompt([
+  { type: "text", text: "Use the random_number tool with min=1 and max=10." },
+]);
+```
+  </Step>
+</Steps>
+
+## Option B: Skills
+
+<Steps>
+  <Step title="Write script + skill file">
+
+```ts src/random-number.ts
+const min = Number(process.argv[2]);
+const max = Number(process.argv[3]);
+
+if (Number.isNaN(min) || Number.isNaN(max)) {
+  console.error("Usage: random-number <min> <max>");
+  process.exit(1);
+}
+
+console.log(Math.floor(Math.random() * (max - min + 1)) + min);
+```
+
+````md SKILL.md
+---
+name: random-number
+description: Generate a random integer between min and max.
+---
+
+Run:
+
+```bash
+node /opt/skills/random-number/random-number.cjs <min> <max>
+```
+````
+
+```bash
+npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/random-number.cjs
+```
+  </Step>
+
+  <Step title="Upload files">
+
+```ts
+import fs from "node:fs";
+
+const script = await fs.promises.readFile("./dist/random-number.cjs");
+await sdk.writeFsFile({ path: "/opt/skills/random-number/random-number.cjs" }, script);
+
+const skill = await fs.promises.readFile("./SKILL.md");
+await sdk.writeFsFile({ path: "/opt/skills/random-number/SKILL.md" }, skill);
+```
+  </Step>
+
+  <Step title="Use in a session">
+
+```ts
+const session = await sdk.createSession({
+  agent: "claude",
+  cwd: "/workspace",
+});
+
+await session.prompt([
+  { type: "text", text: "Use the random-number skill to pick a number from 1 to 100." },
+]);
+```
+  </Step>
+</Steps>
+
+## Notes
+
+- The sandbox runtime must include Node.js (or your chosen runtime).
+- For persistent skill-source wiring by directory, see [Skills](/skills-config).
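Note on the random-number examples above: both the MCP tool and the skill script use the same inclusive-range formula. The sketch below isolates that formula with an injectable random source so the bounds can be checked deterministically. `randomInt` and its `rng` parameter are hypothetical names, and the integer validation is an addition that the original examples do not perform.

```typescript
// Inclusive random integer in [min, max]; rng is injectable for testing.
function randomInt(min: number, max: number, rng: () => number = Math.random): number {
  // Extra validation, not present in the original examples.
  if (!Number.isInteger(min) || !Number.isInteger(max) || min > max) {
    throw new RangeError("min and max must be integers with min <= max");
  }
  // rng() is in [0, 1), so the floor lands in [0, max - min].
  return Math.floor(rng() * (max - min + 1)) + min;
}

// Extremes of the random source map to the two ends of the range.
const lowest = randomInt(1, 10, () => 0);
const highest = randomInt(1, 10, () => 0.999999);

console.log(lowest, highest);
```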
--- a/docs/daemon.mdx
+++ b/docs/daemon.mdx
@ -0,0 +1,69 @@
+---
+title: "Daemon"
+description: "Background daemon lifecycle and management."
+---
+
+The sandbox-agent daemon is a background server process. Commands like `sandbox-agent opencode` and `gigacode` can ensure it is running.
+
+## How it works
+
+1. A daemon-aware command checks for a healthy daemon at host/port.
+2. If missing, it starts one in the background and records PID/version files.
+3. Subsequent checks can compare build/version and restart when required.
+
+## Auto-upgrade behavior
+
+- `sandbox-agent opencode` and `gigacode` use ensure-running behavior with upgrade checks.
+- `sandbox-agent daemon start` uses direct start by default.
+- `sandbox-agent daemon start --upgrade` uses ensure-running behavior (including version check/restart).
+
+## Managing the daemon
+
+### Start
+
+```bash
+sandbox-agent daemon start [OPTIONS]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+| `--upgrade` | false | Use ensure-running + upgrade behavior |
+
+```bash
+sandbox-agent daemon start
+sandbox-agent daemon start --upgrade
+```
+
+### Stop
+
+```bash
+sandbox-agent daemon stop [OPTIONS]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+
+### Status
+
+```bash
+sandbox-agent daemon status [OPTIONS]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-H, --host <HOST>` | `127.0.0.1` | Host |
+| `-p, --port <PORT>` | `2468` | Port |
+
+## Files
+
+Daemon state is stored under the sandbox-agent data directory (for example `~/.local/share/sandbox-agent/daemon/`):
+
+| File | Purpose |
+|------|---------|
+| `daemon-{host}-{port}.pid` | PID of running daemon |
+| `daemon-{host}-{port}.version` | Build/version marker |
+| `daemon-{host}-{port}.log` | Daemon stdout/stderr log |
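Note on the file table above: the three state files share a `daemon-{host}-{port}` stem. A minimal sketch of deriving the names is shown below. `daemonFiles` is a hypothetical helper, and it assumes the host and port are substituted verbatim, which the table suggests but does not state.

```typescript
interface DaemonFiles {
  pid: string;
  version: string;
  log: string;
}

// Hypothetical helper: builds the file names from the table's naming pattern.
// Assumes host and port are inserted verbatim into the stem.
function daemonFiles(host: string, port: number): DaemonFiles {
  const stem = `daemon-${host}-${port}`;
  return { pid: `${stem}.pid`, version: `${stem}.version`, log: `${stem}.log` };
}

// Defaults from the option tables: host 127.0.0.1, port 2468.
const files = daemonFiles("127.0.0.1", 2468);
console.log(files.pid, files.version, files.log);
```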
--- a/docs/deploy/boxlite.mdx
+++ b/docs/deploy/boxlite.mdx
@ -0,0 +1,67 @@
+---
+title: "BoxLite"
+description: "Run Sandbox Agent inside a BoxLite micro-VM."
+---
+
+BoxLite is a local-first micro-VM sandbox — no cloud account needed.
+See [BoxLite docs](https://docs.boxlite.ai) for platform requirements (KVM on Linux, Apple Silicon on macOS).
+
+## Prerequisites
+
+- `@boxlite-ai/boxlite` installed (requires KVM or Apple Hypervisor)
+- Docker (to build the base image)
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+## Base image
+
+Build a Docker image with Sandbox Agent pre-installed, then export it as an OCI layout
+that BoxLite can load directly (BoxLite has its own image store separate from Docker):
+
+```dockerfile
+FROM node:22-bookworm-slim
+RUN apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*
+RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
+RUN sandbox-agent install-agent claude
+RUN sandbox-agent install-agent codex
+```
+
+```bash
+docker build -t sandbox-agent-boxlite .
+mkdir -p oci-image
+docker save sandbox-agent-boxlite | tar -xf - -C oci-image
+```
+
+## TypeScript example
+
+```typescript
+import { SimpleBox } from "@boxlite-ai/boxlite";
+import { SandboxAgent } from "sandbox-agent";
+
+const env: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+
+const box = new SimpleBox({
+  rootfsPath: "./oci-image",
+  env,
+  ports: [{ hostPort: 3000, guestPort: 3000 }],
+  diskSizeGb: 4,
+});
+
+await box.exec("sh", "-c",
+  "nohup sandbox-agent server --no-token --host 0.0.0.0 --port 3000 >/tmp/sandbox-agent.log 2>&1 &"
+);
+
+const baseUrl = "http://localhost:3000";
+const sdk = await SandboxAgent.connect({ baseUrl });
+
+const session = await sdk.createSession({ agent: "claude" });
+const off = session.onEvent((event) => {
+  console.log(event.sender, event.payload);
+});
+
+await session.prompt([{ type: "text", text: "Summarize this repository" }]);
+off();
+
+await box.stop();
+```
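Note on the env handling above: the pattern of copying only the API keys that are set recurs in several deploy guides. It could be factored into a small helper, sketched below. `pickEnv` is a hypothetical name, and the example passes a literal object instead of `process.env` so the snippet stays self-contained.

```typescript
// Hypothetical helper: copies only the variables that are set and non-empty.
function pickEnv(
  names: string[],
  source: Record<string, string | undefined>,
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of names) {
    const value = source[name];
    if (value) picked[name] = value; // skips undefined and empty strings
  }
  return picked;
}

// In real code the second argument would be process.env.
const exampleEnv = pickEnv(["ANTHROPIC_API_KEY", "OPENAI_API_KEY"], {
  ANTHROPIC_API_KEY: "sk-example",
  OPENAI_API_KEY: "",
});

console.log(exampleEnv);
```

Skipping unset keys matters because forwarding an empty or literal `"undefined"` value can mask a missing credential inside the sandbox.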
--- a/docs/deploy/cloudflare.mdx
+++ b/docs/deploy/cloudflare.mdx
@ -0,0 +1,188 @@
+---
+title: "Cloudflare"
+description: "Deploy Sandbox Agent inside a Cloudflare Sandbox."
+---
+
+## Prerequisites
+
+- Cloudflare account with Workers paid plan
+- Docker for local `wrangler dev`
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+<Note>
+Cloudflare Sandbox SDK is beta. See [Sandbox SDK docs](https://developers.cloudflare.com/sandbox/).
+</Note>
+
+## Quick start
+
+```bash
+npm create cloudflare@latest -- my-sandbox --template=cloudflare/sandbox-sdk/examples/minimal
+cd my-sandbox
+```
+
+## Dockerfile
+
+```dockerfile
+FROM cloudflare/sandbox:0.7.0
+
+RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
+RUN sandbox-agent install-agent claude && sandbox-agent install-agent codex
+
+EXPOSE 8000
+```
+
+## TypeScript example (with provider)
+
+For standalone scripts, use the `cloudflare` provider:
+
+```bash
+npm install sandbox-agent@0.4.x @cloudflare/sandbox
+```
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { cloudflare } from "sandbox-agent/cloudflare";
+
+const sdk = await SandboxAgent.start({
+  sandbox: cloudflare(),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "codex" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `cloudflare` provider uses `containerFetch` under the hood, automatically stripping `AbortSignal` to avoid dropped streaming updates.
+
+## TypeScript example (Durable Objects)
+
+For Workers with Durable Objects, use `SandboxAgent.connect(...)` with a custom `fetch` backed by `sandbox.containerFetch(...)`:
+
+```typescript
+import { getSandbox, type Sandbox } from "@cloudflare/sandbox";
+import { Hono } from "hono";
+import { SandboxAgent } from "sandbox-agent";
+
+export { Sandbox } from "@cloudflare/sandbox";
+
+type Bindings = {
+  Sandbox: DurableObjectNamespace<Sandbox>;
+  ASSETS: Fetcher;
+  ANTHROPIC_API_KEY?: string;
+  OPENAI_API_KEY?: string;
+};
+
+const app = new Hono<{ Bindings: Bindings }>();
+const PORT = 8000;
+
+async function isServerRunning(sandbox: Sandbox): Promise<boolean> {
+  try {
+    const result = await sandbox.exec(`curl -sf http://localhost:${PORT}/v1/health`);
+    return result.success;
+  } catch {
+    return false;
+  }
+}
+
+async function getReadySandbox(name: string, env: Bindings): Promise<Sandbox> {
+  const sandbox = getSandbox(env.Sandbox, name);
+  if (!(await isServerRunning(sandbox))) {
+    const envVars: Record<string, string> = {};
+    if (env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = env.ANTHROPIC_API_KEY;
+    if (env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = env.OPENAI_API_KEY;
+    await sandbox.setEnvVars(envVars);
+    await sandbox.startProcess(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`);
+  }
+  return sandbox;
+}
+
+app.post("/sandbox/:name/prompt", async (c) => {
+  const sandbox = await getReadySandbox(c.req.param("name"), c.env);
+
+  const sdk = await SandboxAgent.connect({
+    fetch: (input, init) =>
+      sandbox.containerFetch(
+        input as Request | string | URL,
+        {
+          ...(init ?? {}),
+          // Avoid passing AbortSignal through containerFetch; it can drop streamed session updates.
+          signal: undefined,
+        },
+        PORT,
+      ),
+  });
+
+  const session = await sdk.createSession({ agent: "codex" });
+  const response = await session.prompt([{ type: "text", text: "Summarize this repository" }]);
+  await sdk.destroySession(session.id);
+  await sdk.dispose();
+
+  return c.json(response);
+});
+
+app.all("/sandbox/:name/proxy/*", async (c) => {
+  const sandbox = await getReadySandbox(c.req.param("name"), c.env);
+  const wildcard = c.req.param("*");
+  const path = wildcard ? `/${wildcard}` : "/";
+  const query = new URL(c.req.raw.url).search;
+
+  return sandbox.containerFetch(new Request(`http://localhost${path}${query}`, c.req.raw), PORT);
+});
+
+app.all("*", (c) => c.env.ASSETS.fetch(c.req.raw));
+
+export default app;
+```
+
+This keeps all Sandbox Agent calls inside the Cloudflare sandbox routing path and does not require a `baseUrl`.
+
+## Troubleshooting streaming updates
+
+If you only receive:
+- the outbound prompt request
+- the final `{ stopReason: "end_turn" }` response
+
+then the streamed update channel dropped. In Cloudflare sandbox paths, this is typically caused by forwarding `AbortSignal` from SDK fetch init into `containerFetch(...)`.
+
+Fix:
+
+```ts
+const sdk = await SandboxAgent.connect({
+  fetch: (input, init) =>
+    sandbox.containerFetch(
+      input as Request | string | URL,
+      {
+        ...(init ?? {}),
+        // Avoid passing AbortSignal through containerFetch; it can drop streamed session updates.
+        signal: undefined,
+      },
+      PORT,
+    ),
+});
+```
+
+This keeps prompt completion behavior the same, but restores streamed text/tool updates.
+
+## Local development
+
+```bash
+npm run dev
+```
+
+Test health:
+
+```bash
+curl http://localhost:8787/sandbox/demo/proxy/v1/health
+```
+
+## Production deployment
+
+```bash
+wrangler deploy
+```
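Note on the streaming fix above: the essential step is clearing `signal` while preserving every other fetch option. The sketch below isolates that step as a pure function. `withoutSignal` and `FetchInit` are hypothetical names, and the loose type stands in for `RequestInit` so the snippet does not depend on any runtime's type definitions.

```typescript
// Loose stand-in for RequestInit, used only for this illustration.
type FetchInit = { signal?: unknown; [key: string]: unknown };

// Returns a copy of init with signal cleared and all other fields preserved.
function withoutSignal(init?: FetchInit): FetchInit {
  return { ...(init ?? {}), signal: undefined };
}

const cleaned = withoutSignal({ method: "POST", signal: {} });
console.log(cleaned.method, cleaned.signal);
```

Because the spread comes first, the explicit `signal: undefined` always wins over any value carried in `init`.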
--- a/docs/deploy/computesdk.mdx
+++ b/docs/deploy/computesdk.mdx
@ -0,0 +1,81 @@
+---
+title: "ComputeSDK"
+description: "Deploy Sandbox Agent using ComputeSDK's provider-agnostic sandbox API."
+---
+
+[ComputeSDK](https://computesdk.com) provides a unified interface for managing sandboxes across multiple providers. Write once, deploy anywhere by changing environment variables.
+
+## Prerequisites
+
+- `COMPUTESDK_API_KEY` from [console.computesdk.com](https://console.computesdk.com)
+- Provider API key (one of: `E2B_API_KEY`, `DAYTONA_API_KEY`, `VERCEL_TOKEN`, `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`, `BLAXEL_API_KEY`, `CSB_API_KEY`)
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+## TypeScript example
+
+```bash
+npm install sandbox-agent@0.4.x computesdk
+```
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { computesdk } from "sandbox-agent/computesdk";
+
+const envs: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+
+const sdk = await SandboxAgent.start({
+  sandbox: computesdk({
+    create: {
+      envs,
+      image: process.env.COMPUTESDK_IMAGE,
+      templateId: process.env.COMPUTESDK_TEMPLATE_ID,
+    },
+  }),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "claude" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `computesdk` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically. ComputeSDK routes to your configured provider behind the scenes.
+The `create` option now forwards the full ComputeSDK sandbox-create payload, including provider-specific fields such as `image` and `templateId` when the selected provider supports them.
+
+Before calling `SandboxAgent.start()`, configure ComputeSDK with your provider:
+
+```typescript
+import { compute } from "computesdk";
+
+compute.setConfig({
+  provider: "e2b", // or auto-detect via detectProvider()
+  computesdkApiKey: process.env.COMPUTESDK_API_KEY,
+});
+```
+
+## Supported providers
+
+ComputeSDK auto-detects your provider from environment variables:
+
+| Provider | Environment Variables |
+|----------|----------------------|
+| E2B | `E2B_API_KEY` |
+| Daytona | `DAYTONA_API_KEY` |
+| Vercel | `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN` |
+| Modal | `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET` |
+| Blaxel | `BLAXEL_API_KEY` |
+| CodeSandbox | `CSB_API_KEY` |
+
+## Notes
+
+- **Provider resolution**: Set `COMPUTESDK_PROVIDER` to force a specific provider, or let ComputeSDK auto-detect from API keys.
+- `sandbox.runCommand(..., { background: true })` keeps the server running while your app continues.
+- `sandbox.getUrl({ port })` returns a public URL for the sandbox port.
+- Always destroy the sandbox when done to avoid leaking resources.
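Note on the provider table above: it can be read as a mapping from provider to the environment variables that must be present. The sketch below lists which providers have complete credentials in a given environment. It is not ComputeSDK's detection logic, and it deliberately does not pick a winner when several match. `PROVIDER_ENV` and `providersWithCredentials` are hypothetical names, and the keys are the display names from the table rather than ComputeSDK provider identifiers.

```typescript
// Each inner array is one acceptable variable set; every name in a set must be present.
const PROVIDER_ENV: Record<string, string[][]> = {
  E2B: [["E2B_API_KEY"]],
  Daytona: [["DAYTONA_API_KEY"]],
  Vercel: [["VERCEL_TOKEN"], ["VERCEL_OIDC_TOKEN"]],
  Modal: [["MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"]],
  Blaxel: [["BLAXEL_API_KEY"]],
  CodeSandbox: [["CSB_API_KEY"]],
};

// Lists providers (by table display name) whose credentials are fully set.
function providersWithCredentials(env: Record<string, string | undefined>): string[] {
  return Object.entries(PROVIDER_ENV)
    .filter(([, alternatives]) =>
      alternatives.some((names) => names.every((name) => Boolean(env[name]))),
    )
    .map(([provider]) => provider);
}

// Modal needs both variables, so a lone MODAL_TOKEN_ID does not count.
const configured = providersWithCredentials({
  E2B_API_KEY: "e2b_example",
  MODAL_TOKEN_ID: "id-only",
});

console.log(configured);
```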
--- a/docs/deploy/daytona.mdx
+++ b/docs/deploy/daytona.mdx
@ -1,21 +1,70 @@
 ---
 title: "Daytona"
-description: "Run the daemon in a Daytona workspace." 
+description: "Run Sandbox Agent in a Daytona workspace."
 ---

-## Steps
+<Warning>
+Daytona Tier 3+ is required for access to common model provider endpoints.
+See [Daytona network limits](https://www.daytona.io/docs/en/network-limits/).
+</Warning>

-1. Create a Daytona workspace with Rust and curl available.
-2. Install or build the sandbox-agent binary.
-3. Start the daemon and expose port `2468` (or your preferred port).
+## Prerequisites
+
+- `DAYTONA_API_KEY`
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+## TypeScript example

 ```bash
-export SANDBOX_TOKEN="..."
-
-cargo run -p sandbox-agent -- server \
-  --token "$SANDBOX_TOKEN" \
-  --host 0.0.0.0 \
-  --port 2468
+npm install sandbox-agent@0.4.x @daytonaio/sdk
 ```

-4. Use your Daytona port forwarding to reach the daemon from your client.
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { daytona } from "sandbox-agent/daytona";
+
+const envVars: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+
+const sdk = await SandboxAgent.start({
+  sandbox: daytona({
+    create: { envVars },
+  }),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "claude" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `daytona` provider uses the `rivetdev/sandbox-agent:0.4.2-full` image by default and starts the server automatically.
+
+## Using snapshots for faster startup
+
+```typescript
+import { Daytona, Image } from "@daytonaio/sdk";
+
+const daytona = new Daytona();
+const SNAPSHOT = "sandbox-agent-ready";
+
+const hasSnapshot = await daytona.snapshot.get(SNAPSHOT).then(() => true, () => false);
+
+if (!hasSnapshot) {
+  await daytona.snapshot.create({
+    name: SNAPSHOT,
+    image: Image.base("ubuntu:22.04").runCommands(
+      "apt-get update && apt-get install -y curl ca-certificates",
+      "curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh",
+      "sandbox-agent install-agent claude",
+      "sandbox-agent install-agent codex",
+    ),
+  });
+}
+```
--- a/docs/deploy/docker.mdx
+++ b/docs/deploy/docker.mdx
@ -1,27 +1,108 @@
 ---
-title: "Docker (dev)"
-description: "Build and run the daemon in a Docker container."
+title: "Docker"
+description: "Build and run Sandbox Agent in a Docker container."
 ---

-## Build the binary
+<Warning>
+Docker is not recommended for production isolation of untrusted workloads. Use dedicated sandbox providers (E2B, Daytona, etc.) for stronger isolation.
+</Warning>

-Use the release Dockerfile to build a static binary:
+## Quick start
+
+Run the published full image with all supported agents pre-installed:
+
+```bash
+docker run --rm -p 3000:3000 \
+  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
+  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
+  rivetdev/sandbox-agent:0.4.2-full \
+  server --no-token --host 0.0.0.0 --port 3000
+```
+
+The `0.4.2-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image.
+
+If you also want the desktop API inside the container, install desktop dependencies before starting the server:
+
+```bash
+docker run --rm -p 3000:3000 \
+  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
+  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
+  node:22-bookworm-slim sh -c "\
+    apt-get update && \
+    DEBIAN_FRONTEND=noninteractive apt-get install -y curl ca-certificates bash libstdc++6 && \
+    rm -rf /var/lib/apt/lists/* && \
+    curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh && \
+    sandbox-agent install desktop --yes && \
+    sandbox-agent server --no-token --host 0.0.0.0 --port 3000"
+```
+
+In a Dockerfile:
+
+```dockerfile
+RUN sandbox-agent install desktop --yes
+```
+
+## TypeScript with dockerode
+
+```typescript
+import Docker from "dockerode";
+import { SandboxAgent } from "sandbox-agent";
+
+const docker = new Docker();
+const PORT = 3000;
+
+const container = await docker.createContainer({
+  Image: "rivetdev/sandbox-agent:0.4.2-full",
+  Cmd: ["server", "--no-token", "--host", "0.0.0.0", "--port", `${PORT}`],
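+  Env: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "CODEX_API_KEY"]
+    // Forward only the keys that are set. A template string such as
+    // `KEY=${undefined}` is always truthy, so filter before formatting.
+    .filter((name) => process.env[name])
+    .map((name) => `${name}=${process.env[name]}`),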
+  ExposedPorts: { [`${PORT}/tcp`]: {} },
+  HostConfig: {
+    AutoRemove: true,
+    PortBindings: { [`${PORT}/tcp`]: [{ HostPort: `${PORT}` }] },
+  },
+});
+
+await container.start();
+
+const baseUrl = `http://127.0.0.1:${PORT}`;
+const sdk = await SandboxAgent.connect({ baseUrl });
+
+const session = await sdk.createSession({ agent: "codex" });
+await session.prompt([{ type: "text", text: "Summarize this repository." }]);
+```
+
+## Building a custom image with everything preinstalled
+
+If you need to extend your own base image, install Sandbox Agent and preinstall every supported agent in one step:
+
+```dockerfile
+FROM node:22-bookworm-slim
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    bash ca-certificates curl git && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh && \
+    sandbox-agent install-agent --all
+
+RUN useradd -m -s /bin/bash sandbox
+USER sandbox
+WORKDIR /home/sandbox
+
+EXPOSE 2468
+ENTRYPOINT ["sandbox-agent"]
+CMD ["server", "--host", "0.0.0.0", "--port", "2468"]
+```
+
+## Building from source

 ```bash
 docker build -f docker/release/linux-x86_64.Dockerfile -t sandbox-agent-build .
-
 docker run --rm -v "$PWD/artifacts:/artifacts" sandbox-agent-build
 ```

-The binary will be written to `./artifacts/sandbox-agent-x86_64-unknown-linux-musl`.
-
-## Run the daemon
-
-```bash
-docker run --rm -p 2468:2468 \
-  -v "$PWD/artifacts:/artifacts" \
-  debian:bookworm-slim \
-  /artifacts/sandbox-agent-x86_64-unknown-linux-musl server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
-```
-
-You can now access the API at `http://localhost:2468`.
+Binary output: `./artifacts/sandbox-agent-x86_64-unknown-linux-musl`.
--- a/docs/deploy/e2b.mdx
+++ b/docs/deploy/e2b.mdx
@ -1,25 +1,52 @@
 ---
 title: "E2B"
-description: "Deploy the daemon inside an E2B sandbox."
+description: "Deploy Sandbox Agent inside an E2B sandbox."
 ---

-## Steps
+## Prerequisites

-1. Start an E2B sandbox with network access.
-2. Install the agent binaries you need (Claude, Codex, OpenCode, Amp).
-3. Run the daemon and expose its port.
+- `E2B_API_KEY`
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`

-Example startup script:
+## TypeScript example

 ```bash
-export SANDBOX_TOKEN="..."
-
-# Install sandbox-agent binary (or build from source)
-# TODO: replace with release download once published
-cargo run -p sandbox-agent -- server \
-  --token "$SANDBOX_TOKEN" \
-  --host 0.0.0.0 \
-  --port 2468
+npm install sandbox-agent@0.4.x @e2b/code-interpreter
 ```

-4. Configure your client to connect to the sandbox endpoint.
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { e2b } from "sandbox-agent/e2b";
+
+const envs: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+const template = process.env.E2B_TEMPLATE;
+
+const sdk = await SandboxAgent.start({
+  sandbox: e2b({
+    template,
+    create: { envs },
+  }),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "claude" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `e2b` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically. Sandboxes pause by default instead of being deleted, and reconnecting with the same `sandboxId` resumes them automatically.
+
+Pass `template` when you want to start from a custom E2B template alias or template ID. E2B base-image selection happens when you build the template; `sandbox-agent/e2b` then uses that template at sandbox creation time.
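
The example above copies an API key into `envs` only when it is set. A minimal sketch of that pattern as a reusable helper follows; `pickEnv` is a hypothetical name and is not part of the `sandbox-agent` SDK.

```typescript
// Hypothetical helper (not part of the sandbox-agent SDK): copy only the
// variables that are set and non-empty, so the sandbox never receives an
// entry such as ANTHROPIC_API_KEY="undefined".
function pickEnv(
  names: readonly string[],
  source: Record<string, string | undefined>,
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of names) {
    const value = source[name];
    if (value) picked[name] = value;
  }
  return picked;
}

// Usage with a stand-in object instead of process.env:
const envs = pickEnv(["ANTHROPIC_API_KEY", "OPENAI_API_KEY"], {
  ANTHROPIC_API_KEY: "sk-ant-example",
  OPENAI_API_KEY: undefined,
});
console.log(envs); // only ANTHROPIC_API_KEY is present
```

In real use the second argument would be `process.env`, and the result could be passed as `create: { envs }`.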
+
+## Faster cold starts
+
+For faster startup, create a custom E2B template with Sandbox Agent and target agents pre-installed.
+Build System 2.0 also lets you choose the template's base image in code.
+See [E2B Custom Templates](https://e2b.dev/docs/sandbox-template) and [E2B Base Images](https://e2b.dev/docs/template/base-image).
--- a/docs/deploy/index.mdx
+++ b/docs/deploy/index.mdx
@ -1,4 +0,0 @@
---
-sidebarTitle: Overview
---
-
--- a/docs/deploy/local.mdx
+++ b/docs/deploy/local.mdx
@ -0,0 +1,70 @@
+---
+title: "Local"
+description: "Run Sandbox Agent locally for development."
+---
+
+For local development, run Sandbox Agent directly on your machine.
+
+## With the CLI
+
+```bash
+# Install
+curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
+
+# Run
+sandbox-agent server --no-token --host 127.0.0.1 --port 2468
+```
+
+Or with npm/Bun:
+
+<Tabs>
+  <Tab title="npx">
+    ```bash
+    npx @sandbox-agent/cli@0.4.x server --no-token --host 127.0.0.1 --port 2468
+    ```
+  </Tab>
+  <Tab title="bunx">
+    ```bash
+    bunx @sandbox-agent/cli@0.4.x server --no-token --host 127.0.0.1 --port 2468
+    ```
+  </Tab>
+</Tabs>
+
+## With the TypeScript SDK
+
+The SDK can spawn and manage the server as a subprocess using the `local` provider:
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { local } from "sandbox-agent/local";
+
+const sdk = await SandboxAgent.start({
+  sandbox: local(),
+});
+
+const session = await sdk.createSession({
+  agent: "claude",
+});
+
+await session.prompt([
+  { type: "text", text: "Summarize this repository." },
+]);
+
+await sdk.destroySandbox();
+```
+
+This starts the server on an available local port and connects automatically.
+
+Pass options to customize the local provider:
+
+```typescript
+const sdk = await SandboxAgent.start({
+  sandbox: local({
+    port: 3000,
+    log: "inherit",
+    env: {
+      ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
+    },
+  }),
+});
+```
--- a/docs/deploy/modal.mdx
+++ b/docs/deploy/modal.mdx
@ -0,0 +1,55 @@
+---
+title: "Modal"
+description: "Deploy Sandbox Agent inside a Modal sandbox."
+---
+
+## Prerequisites
+
+- `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` from [modal.com/settings](https://modal.com/settings)
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+## TypeScript example
+
+```bash
+npm install sandbox-agent@0.4.x modal
+```
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { modal } from "sandbox-agent/modal";
+
+const secrets: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) secrets.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) secrets.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+const baseImage = process.env.MODAL_BASE_IMAGE ?? "node:22-slim";
+
+const sdk = await SandboxAgent.start({
+  sandbox: modal({
+    image: baseImage,
+    create: { secrets },
+  }),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "claude" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `modal` provider handles app creation, image building, sandbox provisioning, agent installation, server startup, and tunnel networking automatically.
+Set `image` to change the base Docker image before Sandbox Agent and its agent binaries are layered on top. You can also pass a prebuilt Modal `Image` object.
+
+## Faster cold starts
+
+Modal caches image layers, so the Dockerfile commands that install `curl` and `sandbox-agent` only run on the first build. Later sandbox creations reuse the cached image.
+
+## Notes
+
+- Modal sandboxes use [gVisor](https://gvisor.dev/) for strong isolation.
+- Ports are exposed via encrypted tunnels (`encryptedPorts`). The provider uses `sb.tunnels()` to get the public HTTPS URL.
+- Environment variables (API keys) are passed as Modal [Secrets](https://modal.com/docs/guide/secrets) for security.
--- a/docs/deploy/vercel-sandboxes.mdx
+++ b/docs/deploy/vercel-sandboxes.mdx
@ -1,21 +0,0 @@
---
-title: "Vercel Sandboxes"
-description: "Run the daemon inside Vercel Sandboxes." 
---
-
-## Steps
-
-1. Provision a Vercel Sandbox with network access and storage.
-2. Install the agent binaries you need.
-3. Run the daemon and expose the port.
-
-```bash
-export SANDBOX_TOKEN="..."
-
-cargo run -p sandbox-agent -- server \
-  --token "$SANDBOX_TOKEN" \
-  --host 0.0.0.0 \
-  --port 2468
-```
-
-4. Configure your client to use the sandbox URL.
--- a/docs/deploy/vercel.mdx
+++ b/docs/deploy/vercel.mdx
@ -0,0 +1,50 @@
+---
+title: "Vercel"
+description: "Deploy Sandbox Agent inside a Vercel Sandbox."
+---
+
+## Prerequisites
+
+- `VERCEL_OIDC_TOKEN` or `VERCEL_ACCESS_TOKEN`
+- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
+
+## TypeScript example
+
+```bash
+npm install sandbox-agent@0.4.x @vercel/sandbox
+```
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+import { vercel } from "sandbox-agent/vercel";
+
+const env: Record<string, string> = {};
+if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
+
+const sdk = await SandboxAgent.start({
+  sandbox: vercel({
+    create: {
+      runtime: "node24",
+      env,
+    },
+  }),
+});
+
+try {
+  const session = await sdk.createSession({ agent: "claude" });
+  const response = await session.prompt([
+    { type: "text", text: "Summarize this repository" },
+  ]);
+  console.log(response.stopReason);
+} finally {
+  await sdk.destroySandbox();
+}
+```
+
+The `vercel` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically.
+
+## Authentication
+
+Vercel Sandboxes support OIDC token auth (recommended) and access-token auth.
+See [Vercel Sandbox docs](https://vercel.com/docs/functions/sandbox).
--- a/docs/docs.json
+++ b/docs/docs.json
@ -1,72 +1,130 @@
 {
-	"$schema": "https://mintlify.com/docs.json",
-	"theme": "willow",
-	"name": "Sandbox Agent SDK",
-	"appearance": {
-		"default": "dark",
-		"strict": true
-	},
-	"colors": {
-		"primary": "#ff4f00",
-		"light": "#ff4f00",
-		"dark": "#ff4f00"
-	},
-	"favicon": "/favicon.svg",
-	"logo": {
-		"light": "/logo/light.svg",
-		"dark": "/logo/dark.svg"
-	},
-	"navbar": {
-		"links": [
-			{
-				"label": "Discord",
-				"icon": "discord",
-				"href": "https://rivet.dev/discord"
-			},
-			{
-				"label": "GitHub",
-				"icon": "github",
-				"href": "https://github.com/rivet-dev/sandbox-agent"
-			}
-		]
-	},
-	"navigation": {
-		"pages": [
-			{
-				"group": "Getting started",
-				"pages": [
-					"index",
-					"quickstart",
-					"architecture",
-					"agent-compatibility",
-					"universal-api",
-					"frontend",
-					"building-chat-ui",
-					"manage-session-state"
-				]
-			},
-			{
-				"group": "SDKs",
-				"pages": ["sdks/typescript"]
-			},
-			{
-				"group": "AI",
-				"pages": ["ai/skill", "ai/llms-txt"]
-			},
-			{
-				"group": "Reference",
-				"pages": ["cli", "telemetry", "http-api"]
-			},
-			{
-				"group": "Deploy",
-				"pages": [
-					"deploy/index",
-					"deploy/docker",
-					"deploy/e2b",
-					"deploy/daytona",
-					"deploy/vercel-sandboxes"
-				]
-			}
-		]
-	}
+  "$schema": "https://mintlify.com/docs.json",
+  "theme": "mint",
+  "name": "Sandbox Agent SDK",
+  "appearance": {
+    "default": "dark",
+    "strict": true
+  },
+  "colors": {
+    "primary": "#ff4f00",
+    "light": "#ff6a2a",
+    "dark": "#cc3f00"
+  },
+  "favicon": "/favicon.svg",
+  "logo": {
+    "light": "/logo/light.svg",
+    "dark": "/logo/dark.svg"
+  },
+  "integrations": {
+    "posthog": {
+      "apiKey": "phc_6kfTNEAVw7rn1LA51cO3D69FefbKupSWFaM7OUgEpEo",
+      "apiHost": "https://ph.rivet.gg",
+      "sessionRecording": true
+    }
+  },
+  "navbar": {
+    "links": [
+      {
+        "label": "Discord",
+        "icon": "discord",
+        "href": "https://discord.gg/auCecybynK"
+      },
+      {
+        "label": "GitHub",
+        "type": "github",
+        "href": "https://github.com/rivet-dev/sandbox-agent"
+      }
+    ]
+  },
+  "navigation": {
+    "tabs": [
+      {
+        "tab": "Documentation",
+        "pages": [
+          {
+            "group": "Getting started",
+            "pages": [
+              "quickstart",
+              "sdk-overview",
+              "llm-credentials",
+              "react-components",
+              {
+                "group": "Deploy",
+                "icon": "server",
+                "pages": [
+                  "deploy/local",
+                  "deploy/e2b",
+                  "deploy/daytona",
+                  "deploy/vercel",
+                  "deploy/cloudflare",
+                  "deploy/docker",
+                  "deploy/modal",
+                  "deploy/boxlite",
+                  "deploy/computesdk"
+                ]
+              }
+            ]
+          },
+          {
+            "group": "Agent",
+            "pages": [
+              "agent-sessions",
+              {
+                "group": "Agents",
+                "icon": "robot",
+                "pages": ["agents/claude", "agents/codex", "agents/opencode", "agents/cursor", "agents/amp", "agents/pi"]
+              },
+              "attachments",
+              "skills-config",
+              "mcp-config",
+              "custom-tools"
+            ]
+          },
+          {
+            "group": "System",
+            "pages": ["file-system", "processes", "computer-use", "common-software"]
+          },
+          {
+            "group": "Reference",
+            "pages": [
+              "troubleshooting",
+              "architecture",
+              "cli",
+              "inspector",
+              "opencode-compatibility",
+              {
+                "group": "More",
+                "pages": [
+                  "daemon",
+                  "cors",
+                  "session-restoration",
+                  "telemetry",
+                  {
+                    "group": "AI",
+                    "pages": ["ai/skill", "ai/llms-txt"]
+                  }
+                ]
+              }
+            ]
+          }
+        ]
+      },
+      {
+        "tab": "HTTP API",
+        "pages": [
+          {
+            "group": "HTTP Reference",
+            "openapi": "openapi.json"
+          }
+        ]
+      }
+    ]
+  },
+  "__removed": [
+    {
+      "group": "Orchestration",
+      "pages": ["orchestration-architecture", "session-persistence", "observability", "multiplayer", "security"]
+    }
+  ]
 }
--- a/docs/favicon.svg
+++ b/docs/favicon.svg
@ -1,19 +1 @@
-<svg width="24" height="24" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg">
-<path d="M9.06145 23.1079C5.26816 22.3769 -3.39077 20.6274 1.4173 5.06384C9.6344 6.09939 16.9728 14.0644 9.06145 23.1079Z" fill="url(#paint0_linear_17557_2021)"/>
-<path d="M8.91928 23.0939C5.27642 21.2223 0.78371 4.20891 17.0071 0C20.7569 7.19341 19.6212 16.5452 8.91928 23.0939Z" fill="url(#paint1_linear_17557_2021)"/>
-<path d="M8.91388 23.0788C8.73534 19.8817 10.1585 9.08525 23.5699 13.1107C23.1812 20.1229 18.984 26.4182 8.91388 23.0788Z" fill="url(#paint2_linear_17557_2021)"/>
-<defs>
-<linearGradient id="paint0_linear_17557_2021" x1="3.77557" y1="5.91571" x2="5.23185" y2="21.5589" gradientUnits="userSpaceOnUse">
-<stop stop-color="#18E299"/>
-<stop offset="1" stop-color="#15803D"/>
-</linearGradient>
-<linearGradient id="paint1_linear_17557_2021" x1="12.1711" y1="-0.718425" x2="10.1897" y2="22.9832" gradientUnits="userSpaceOnUse">
-<stop stop-color="#16A34A"/>
-<stop offset="1" stop-color="#4ADE80"/>
-</linearGradient>
-<linearGradient id="paint2_linear_17557_2021" x1="23.1327" y1="15.353" x2="9.33841" y2="18.5196" gradientUnits="userSpaceOnUse">
-<stop stop-color="#4ADE80"/>
-<stop offset="1" stop-color="#0D9373"/>
-</linearGradient>
-</defs>
-</svg>
+<svg width="128" height="128" fill="none" xmlns="http://www.w3.org/2000/svg"><rect x="1" y="1" width="126" height="126" rx="44" fill="#0F0F0F"/><rect x="18.25" y="18.25" width="91.5" height="91.5" rx="25.75" stroke="#F0F0F0" stroke-width="8.5"/><path fill-rule="evenodd" clip-rule="evenodd" d="M57.694 43.098c0-.622-.505-1.126-1.127-1.126h-8.444a5.114 5.114 0 0 0-5.112 5.111v33.824a5.114 5.114 0 0 0 5.112 5.112h8.444c.622 0 1.127-.505 1.127-1.127V43.098Zm24.424 27.869c-1.238-2.222-4.047-4.026-6.27-4.026H62.923c-.684 0-.93.555-.549 1.239l7.703 13.822c1.239 2.223 4.048 4.026 6.27 4.026h12.927c.683 0 .93-.555.548-1.239l-7.703-13.822Zm.538-18.718c0-5.672-4.605-10.277-10.277-10.277H63.31a1.21 1.21 0 0 0-1.209 1.209v18.137c0 .667.542 1.209 1.21 1.209h9.068c5.672 0 10.277-4.605 10.277-10.278Z" fill="#F0F0F0"/></svg>
--- a/docs/file-system.mdx
+++ b/docs/file-system.mdx
@ -0,0 +1,154 @@
+---
+title: "File System"
+description: "Read, write, and manage files inside the sandbox."
+sidebarTitle: "File System"
+icon: "folder"
+---
+
+The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload tar archives in batch.
+
+## Path resolution
+
+- Absolute paths are used as-is.
+- Relative paths resolve from the server process working directory.
+- Requests that attempt to escape allowed roots are rejected by the server.
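
The rules above can be sketched with `node:path`. This is an illustration only: the server's actual implementation is not shown here, and `resolveSandboxPath`, its parameters, and the single `allowedRoot` are assumptions.

```typescript
import * as path from "node:path";

// Illustrative only: mirrors the three rules above, not the server's code.
// `cwd` stands in for the server process working directory and
// `allowedRoot` for a single permitted root (both are assumptions).
function resolveSandboxPath(input: string, cwd: string, allowedRoot: string): string {
  const p = path.posix;
  // Absolute paths are used as-is (normalized so ".." segments cannot hide
  // an escape); relative paths resolve from the working directory.
  const resolved = p.isAbsolute(input) ? p.normalize(input) : p.resolve(cwd, input);
  // Reject anything that ends up outside the allowed root.
  const rel = p.relative(allowedRoot, resolved);
  if (rel === ".." || rel.startsWith("../")) {
    throw new Error(`path escapes allowed root: ${input}`);
  }
  return resolved;
}

console.log(resolveSandboxPath("./workspace", "/home/sandbox", "/home/sandbox"));
// /home/sandbox/workspace
```

Comparing the relative path from the root, rather than checking a string prefix, avoids treating `/home/sandbox-other` as if it were inside `/home/sandbox`.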
+
+## List entries
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+const entries = await sdk.listFsEntries({
+  path: "./workspace",
+});
+
+console.log(entries);
+```
+
+```bash cURL
+curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace"
+```
+</CodeGroup>
+
+## Read and write files
+
+`PUT /v1/fs/file` writes raw bytes. `GET /v1/fs/file` returns raw bytes.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+await sdk.writeFsFile({ path: "./notes.txt" }, "hello");
+
+const bytes = await sdk.readFsFile({ path: "./notes.txt" });
+const text = new TextDecoder().decode(bytes);
+
+console.log(text);
+```
+
+```bash cURL
+curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt" \
+  --data-binary "hello"
+
+curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt" \
+  --output ./notes.txt
+```
+</CodeGroup>
+
+## Create directories
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+await sdk.mkdirFs({ path: "./data" });
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data"
+```
+</CodeGroup>
+
+## Move, delete, and stat
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+await sdk.moveFs({
+  from: "./notes.txt",
+  to: "./notes-old.txt",
+  overwrite: true,
+});
+
+const stat = await sdk.statFs({ path: "./notes-old.txt" });
+await sdk.deleteFsEntry({ path: "./notes-old.txt" });
+
+console.log(stat);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/fs/move" \
+  -H "Content-Type: application/json" \
+  -d '{"from":"./notes.txt","to":"./notes-old.txt","overwrite":true}'
+
+curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt"
+
+curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt"
+```
+</CodeGroup>
+
+## Batch upload (tar)
+
+Batch upload accepts `application/x-tar` and extracts into the destination directory.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+import fs from "node:fs";
+import path from "node:path";
+import * as tar from "tar";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+const archivePath = path.join(process.cwd(), "skills.tar");
+await tar.c({
+  cwd: "./skills",
+  file: archivePath,
+}, ["."]);
+
+const tarBuffer = await fs.promises.readFile(archivePath);
+const result = await sdk.uploadFsBatch(tarBuffer, {
+  path: "./skills",
+});
+
+console.log(result);
+```
+
+```bash cURL
+tar -cf skills.tar -C ./skills .
+
+curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills" \
+  -H "Content-Type: application/x-tar" \
+  --data-binary @skills.tar
+```
+</CodeGroup>
--- a/docs/frontend.mdx
+++ b/docs/frontend.mdx
@ -1,22 +0,0 @@
---
-title: "Frontend Demo"
-description: "Run the Vite + React UI for testing the server."
---
-
-The demo frontend lives at `frontend/packages/inspector`.
-
-## Run locally
-
-```bash
-pnpm install
-pnpm --filter @sandbox-agent/inspector dev
-```
-
-The UI expects:
-
- Endpoint (e.g. `http://127.0.0.1:2468`)
- Optional token
-
-When running the server, the inspector is also served automatically at `http://127.0.0.1:2468/ui`.
-
-If you see CORS errors, enable CORS on the server with `sandbox-agent server --cors-allow-origin` and related flags.
--- a/docs/glossary.md
+++ b/docs/glossary.md
@ -1,62 +0,0 @@
-# Glossary (Universal Schema)
-
-This glossary defines the universal schema terms used across the daemon, SDK, and tests.
-
-Session terms
- session_id: daemon-generated identifier for a universal session.
- native_session_id: provider-native thread/session/run identifier (thread_id merged here).
- session.started: event emitted at session start (native or synthetic).
- session.ended: event emitted at session end (native or synthetic); includes reason and terminated_by.
- terminated_by: who ended the session: agent or daemon.
- reason: why the session ended: completed, error, or terminated.
-
-Event terms
- UniversalEvent: envelope that wraps all events; includes source, type, data, raw.
- event_id: unique identifier for the event.
- sequence: monotonic event sequence number within a session.
- time: RFC3339 timestamp for the event.
- source: event origin: agent (native) or daemon (synthetic).
- raw: original provider payload for native events; optional for synthetic events.
-
-Item terms
- item_id: daemon-generated identifier for a universal item.
- native_item_id: provider-native item/message identifier when available; null otherwise.
- parent_id: item_id of the parent item (e.g., tool call/result parented to a message).
- kind: item category: message, tool_call, tool_result, system, status, unknown.
- role: actor role for message items: user, assistant, system, tool (or null).
- status: item lifecycle status: in_progress, completed, failed (or null).
-
-Item event terms
- item.started: item creation event (may be synthetic).
- item.delta: streaming delta event (native where supported; synthetic otherwise).
- item.completed: final item event with complete content.
-
-Content terms
- content: ordered list of parts that make up an item payload.
- content part: a typed element inside content (text, json, tool_call, tool_result, file_ref, image, status, reasoning).
- text: plain text content part.
- json: structured JSON content part.
- tool_call: tool invocation content part (name, arguments, call_id).
- tool_result: tool result content part (call_id, output).
- file_ref: file reference content part (path, action, diff).
- image: image content part (path, mime).
- status: status content part (label, detail).
- reasoning: reasoning content part (text, visibility).
- visibility: reasoning visibility: public or private.
-
-HITL terms
- permission.requested / permission.resolved: human-in-the-loop permission flow events.
- permission_id: identifier for the permission request.
- question.requested / question.resolved: human-in-the-loop question flow events.
- question_id: identifier for the question request.
- options: question answer options.
- response: selected answer for a question.
-
-Synthetic terms
- synthetic event: a daemon-emitted event used to fill gaps in provider-native schemas.
- source=daemon: marks synthetic events.
- synthetic delta: a single full-content delta emitted for providers without native deltas.
-
-Provider terms
- agent: the native provider (claude, codex, opencode, amp).
- native payload: the provider’s original event/message object stored in raw.
--- a/docs/http-api.mdx
+++ b/docs/http-api.mdx
@ -1,169 +0,0 @@
---
-title: "HTTP API"
-description: "Endpoint reference for the sandbox agent daemon."
---
-
-All endpoints are under `/v1`. Authentication uses the daemon-level token via `Authorization: Bearer <token>`.
-
-## Health
-
-<details>
-<summary><strong>GET /v1/health</strong> - Connectivity check</summary>
-
-Response:
-
-```json
-{ "status": "ok" }
-```
-</details>
-
-## Sessions
-
-<details>
-<summary><strong>POST /v1/sessions/{sessionId}</strong> - Create session</summary>
-
-Request:
-
-```json
-{
-  "agent": "claude",
-  "agentMode": "build",
-  "permissionMode": "default",
-  "model": "claude-3-5-sonnet",
-  "variant": "high",
-  "agentVersion": "latest"
-}
-```
-
-Response:
-
-```json
-{
-  "healthy": true,
-  "agentSessionId": "..."
-}
-```
-</details>
-
-<details>
-<summary><strong>POST /v1/sessions/{sessionId}/messages</strong> - Send message</summary>
-
-Request:
-
-```json
-{
-  "message": "Describe the repository."
-}
-```
-</details>
-
-<details>
-<summary><strong>GET /v1/sessions/{sessionId}/events</strong> - Fetch events</summary>
-
-Query params:
-
- `offset`: last-seen event id (exclusive)
- `limit`: max number of events
-
-Response:
-
-```json
-{
-  "events": [
-    {
-      "id": 1,
-      "timestamp": "2026-01-25T10:00:00Z",
-      "sessionId": "my-session",
-      "agent": "claude",
-      "agentSessionId": "...",
-      "data": { "message": { "role": "assistant", "parts": [{ "type": "text", "text": "..." }] } }
-    }
-  ],
-  "hasMore": false
-}
-```
-</details>
-
-<details>
-<summary><strong>GET /v1/sessions/{sessionId}/events/sse</strong> - Stream events (SSE)</summary>
-
-Query params:
-
- `offset`: last-seen event id (exclusive)
-
-SSE payloads are `UniversalEvent` JSON.
-</details>
-
-<details>
-<summary><strong>POST /v1/sessions/{sessionId}/questions/{questionId}/reply</strong></summary>
-
-Request:
-
-```json
-{ "answers": [["Option A"], ["Option B", "Option C"]] }
-```
-</details>
-
-<details>
-<summary><strong>POST /v1/sessions/{sessionId}/questions/{questionId}/reject</strong></summary>
-
-Request:
-
-```json
-{}
-```
-</details>
-
-<details>
-<summary><strong>POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply</strong></summary>
-
-Request:
-
-```json
-{ "reply": "once" }
-```
-</details>
-
-## Agents
-
-<details>
-<summary><strong>GET /v1/agents</strong> - List agents</summary>
-
-Response:
-
-```json
-{
-  "agents": [
-    { "id": "claude", "installed": true, "version": "...", "path": "/usr/local/bin/claude" }
-  ]
-}
-```
-</details>
-
-<details>
-<summary><strong>POST /v1/agents/{agentId}/install</strong> - Install agent</summary>
-
-Request:
-
-```json
-{ "reinstall": false }
-```
-</details>
-
-<details>
-<summary><strong>GET /v1/agents/{agentId}/modes</strong> - List modes</summary>
-
-Response:
-
-```json
-{
-  "modes": [
-    { "id": "build", "name": "Build", "description": "Default coding mode" }
-  ]
-}
-```
-</details>
-
-## Error handling
-
-All errors use RFC 7807 Problem Details and stable `type` strings (e.g. `urn:sandbox-agent:error:session_not_found`).
--- a/docs/images/inspector.png
+++ b/docs/images/inspector.png
--- a/docs/index.mdx
+++ b/docs/index.mdx
@ -1,68 +0,0 @@
---
-title: "Overview"
-description: "Universal API for running Claude Code, Codex, OpenCode, and Amp inside sandboxes."
---
-
-Sandbox Agent SDK is a universal API and daemon for running coding agents inside sandboxes. It standardizes agent sessions, events, and human-in-the-loop workflows across Claude Code, Codex, OpenCode, and Amp.
-
-## At a glance
-
- Universal HTTP API and TypeScript SDK
- Runs inside sandboxes with a lightweight Rust daemon
- Streams events in a shared UniversalEvent schema
- Supports questions and permission workflows
- Designed for multi-provider sandbox environments
-
-## Quickstart
-
-Run the daemon locally:
-
-```bash
-sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
-```
-
-Send a message:
-
-```bash
-curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"agent":"claude"}'
-
-curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"message":"Explain the repo structure."}'
-```
-
-See the full quickstart in [Quickstart](/quickstart).
-
-## What this project solves
-
- **Universal Coding Agent API**: standardize tool calls, messages, and events across agents.
- **Agents in sandboxes**: run a single HTTP daemon inside any sandbox provider.
- **Agent transcripts**: stream or persist a universal event log in your own storage.
-
-## Project scope
-
-**In scope**
-
- Agent session orchestration inside a sandbox
- Streaming events in a universal schema
- Human-in-the-loop questions and permissions
- TypeScript SDK and CLI wrappers
-
-**Out of scope**
-
- Persistent storage of sessions on disk
- Building custom LLM agents (use Vercel AI SDK for that)
- Sandbox provider APIs (use provider SDKs or custom glue)
- Git repo management
-
-## Next steps
-
- Read the [Architecture](/architecture) overview
- Review [Agent compatibility](/agent-compatibility)
- See the [HTTP API](/http-api) and [CLI](/cli)
- Run the [Frontend demo](/frontend)
- Use the [TypeScript SDK](/typescript-sdk)
--- a/docs/inspector.mdx
+++ b/docs/inspector.mdx
@ -0,0 +1,66 @@
+---
+title: "Inspector"
+description: "Debug and inspect agent sessions with the Inspector UI."
+---
+
+The Inspector is a web UI for inspecting Sandbox Agent sessions. Use it to view events, inspect payloads, and troubleshoot behavior.
+
+<Frame>
+  <img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
+</Frame>
+
+## Open the Inspector
+
+The Inspector is served at `/ui/` on your Sandbox Agent server.
+For example, if your server runs at `http://localhost:2468`, open `http://localhost:2468/ui/`.
+
+You can also generate a pre-filled Inspector URL from the SDK:
+
+```typescript
+import { buildInspectorUrl } from "sandbox-agent";
+
+const url = buildInspectorUrl({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+console.log(url);
+// http://127.0.0.1:2468/ui/
+```
+
+## Features
+
+- Session list
+- Event stream view
+- Event JSON inspector
+- Prompt testing
+- Request/response debugging
+- Interactive permission prompts (approve, always-allow, or reject tool-use requests)
+- Desktop panel for status, remediation, start/stop, and screenshot refresh
+- Process management (create, stop, kill, delete, view logs)
+- Interactive PTY terminal for tty processes
+- One-shot command execution
+
+## When to use
+
+- Development: validate session behavior quickly
+- Debugging: inspect raw event payloads
+- Integration work: compare UI behavior with SDK/API calls
+
+## Process terminal
+
+The Inspector includes an embedded Ghostty-based terminal for interactive tty
+processes. The UI uses the SDK's high-level `connectProcessTerminal(...)`
+wrapper via the shared `@sandbox-agent/react` `ProcessTerminal` component.
+
+## Desktop panel
+
+The `Desktop` panel shows the current desktop runtime state, missing dependencies,
+the suggested install command, last error details, process/log paths, and the
+latest captured screenshot.
+
+Use it to:
+
+- Check whether desktop dependencies are installed
+- Start or stop the managed desktop runtime
+- Refresh desktop status
+- Capture a fresh screenshot on demand
--- a/docs/llm-credentials.mdx
+++ b/docs/llm-credentials.mdx
@ -0,0 +1,250 @@
+---
+title: "LLM Credentials"
+description: "Strategies for providing LLM provider credentials to agents."
+icon: "key"
+---
+
+Sandbox Agent needs LLM provider credentials (Anthropic, OpenAI, etc.) to run agent sessions.
+
+## Configuration
+
+Pass credentials via `spawn.env` when starting a sandbox. Each call to `SandboxAgent.start()` can use different credentials:
+
+```typescript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.start({
+  spawn: {
+    env: {
+      ANTHROPIC_API_KEY: "sk-ant-...",
+      OPENAI_API_KEY: "sk-...",
+    },
+  },
+});
+```
+
+Each agent requires credentials from a specific provider. Sandbox Agent checks environment variables (including those passed via `spawn.env`) and host config files:
+
+| Agent | Provider | Environment variables | Config files |
+|-------|----------|----------------------|--------------|
+| Claude Code | Anthropic | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | `~/.claude.json`, `~/.claude/.credentials.json` |
+| Amp | Anthropic | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | `~/.amp/config.json` |
+| Codex | OpenAI | `OPENAI_API_KEY`, `CODEX_API_KEY` | `~/.codex/auth.json` |
+| OpenCode | Anthropic or OpenAI | `ANTHROPIC_API_KEY`, `OPENAI_API_KEY` | `~/.local/share/opencode/auth.json` |
+| Mock | None | - | - |
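+
+The table above can be turned into a simple preflight check before starting a session. This is a minimal sketch, not part of the SDK: `AGENT_ENV_VARS` and `hasEnvCredentials` are hypothetical names, the agent ids are assumed to be the lowercase identifiers, and the check only covers environment variables (it ignores the host config files). It also assumes that any one of the listed variables is sufficient.
+
+```typescript
+// Environment variables accepted per agent, copied from the table above.
+// The agent ids used as keys are assumptions for illustration.
+const AGENT_ENV_VARS: Record<string, string[]> = {
+  claude: ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
+  amp: ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
+  codex: ["OPENAI_API_KEY", "CODEX_API_KEY"],
+  opencode: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
+  mock: [],
+};
+
+// Returns true when at least one accepted variable is set,
+// or when the agent needs no credentials at all.
+function hasEnvCredentials(
+  agent: string,
+  env: Record<string, string | undefined>,
+): boolean {
+  const accepted = AGENT_ENV_VARS[agent];
+  if (accepted === undefined) return false; // unknown agent
+  if (accepted.length === 0) return true; // e.g. the mock agent
+  return accepted.some((name) => Boolean(env[name]));
+}
+```
+
+Running a check like this against the object you pass as `spawn.env` can catch a provider mismatch, such as a Claude session started with only an OpenAI key, before the sandbox is created.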
+
+## Credential strategies
+
+LLM credentials are passed into the sandbox as environment variables. The agent and everything else inside the sandbox have access to the token, so it's important to choose the right strategy for how you provision and scope these credentials.
+
+| Strategy | Who pays | Cost attribution | Best for |
+|----------|----------|-----------------|----------|
+| **Per-tenant gateway** (recommended) | Your organization, billed back per tenant | Per-tenant keys with budgets | Multi-tenant SaaS, usage-based billing |
+| **Bring your own key** | Each user (usage-based) | Per-user by default | Dev environments, internal tools |
+| **Shared API key** | Your organization | None (single bill) | Single-tenant apps, internal platforms |
+| **Personal subscription** | Each user (existing subscription) | Per-user by default | Local dev, internal tools where users have Claude or Codex subscriptions |
+
+### Per-tenant gateway (recommended)
+
+Route LLM traffic through a gateway that mints per-tenant API keys, each with its own spend tracking and budget limits.
+
+```mermaid
+graph LR
+    B[Your Backend] -->|tenant key| S[Sandbox]
+    S -->|LLM requests| G[Gateway]
+    G -->|scoped key| P[LLM Provider]
+```
+
+Your backend issues a scoped key per tenant, then passes it to the sandbox. This is the typical pattern when using sandbox providers (E2B, Daytona, Docker).
+
+```typescript expandable
+import { SandboxAgent } from "sandbox-agent";
+
+async function createTenantSandbox(tenantId: string) {
+  // Issue a scoped key for this tenant via OpenRouter
+  const res = await fetch("https://openrouter.ai/api/v1/keys", {
+    method: "POST",
+    headers: {
+      Authorization: `Bearer ${process.env.OPENROUTER_PROVISIONING_KEY}`,
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({
+      name: `tenant-${tenantId}`,
+      limit: 50,
+      limitResetType: "monthly",
+    }),
+  });
+  const { key } = await res.json();
+
+  // Start a sandbox with the tenant's scoped key
+  const sdk = await SandboxAgent.start({
+    spawn: {
+      env: {
+        OPENAI_API_KEY: key, // OpenRouter uses OpenAI-compatible endpoints
+      },
+    },
+  });
+
+  const session = await sdk.createSession({
+    agent: "codex", // must match the provider of the key passed above (OPENAI_API_KEY)
+    sessionInit: { cwd: "/workspace" },
+  });
+
+  return { sdk, session };
+}
+```
+
+#### Security
+
+Recommended for multi-tenant applications. Each tenant gets a scoped key with its own budget, so exfiltration only exposes that tenant's allowance.
+
+#### Use cases
+
+- **Multi-tenant SaaS**: per-tenant spend tracking and budget limits
+- **Production apps**: apps exposed to end users, where each tenant needs isolated credentials
+- **Usage-based billing**: each tenant pays for their own consumption
+
+#### Choosing a gateway
+
+<AccordionGroup>
+
+<Accordion title="OpenRouter provisioned keys" icon="cloud">
+
+Managed service, zero infrastructure. [OpenRouter](https://openrouter.ai/docs/features/provisioning-api-keys) provides per-tenant API keys with spend tracking and budget limits via their Provisioning API. Pass the tenant key to Sandbox Agent as `OPENAI_API_KEY` (OpenRouter uses OpenAI-compatible endpoints).
+
+```bash
+# Create a key for a tenant with a $50/month budget
+curl https://openrouter.ai/api/v1/keys \
+  -H "Authorization: Bearer $PROVISIONING_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "tenant-acme",
+    "limit": 50,
+    "limitResetType": "monthly"
+  }'
+```
+
+Easiest to set up but not open-source. See [OpenRouter pricing](https://openrouter.ai/docs/framework/pricing) for details.
+
+</Accordion>
+
+<Accordion title="LiteLLM proxy" icon="server">
+
+Self-hosted, open-source (MIT). [LiteLLM](https://github.com/BerriAI/litellm) is an OpenAI-compatible proxy with hierarchical budgets (org, team, user, key), virtual keys, and spend tracking. Requires Python + PostgreSQL.
+
+```bash
+# Create a team (tenant) with a $500 budget
+curl http://litellm:4000/team/new \
+  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "team_alias": "tenant-acme",
+    "max_budget": 500
+  }'
+
+# Generate a key for that team (use the team_id returned by the call above)
+curl http://litellm:4000/key/generate \
+  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "team_id": "team-abc123",
+    "max_budget": 100
+  }'
+```
+
+Full control with no vendor lock-in. Organization-level features require an enterprise license.
+
+</Accordion>
+
+<Accordion title="Portkey gateway" icon="code-branch">
+
+Self-hosted, open-source (Apache 2.0). [Portkey](https://github.com/Portkey-AI/gateway) is a lightweight OpenAI-compatible gateway supporting 200+ providers. Single binary, no database required. Create virtual keys with per-tenant budget limits and pass them to Sandbox Agent.
+
+Lightest operational footprint of the self-hosted options. Observability and analytics require the managed platform or your own tooling.
+
+</Accordion>
+
+</AccordionGroup>
+
+To bill tenants for LLM usage, use [Stripe token billing](https://docs.stripe.com/billing/token-billing) (integrates natively with OpenRouter) or query your gateway's spend API and feed usage into your billing system.
+
+### Bring your own key
+
+Each user provides their own API key. Users are billed directly by the LLM provider with no additional infrastructure needed.
+
+Pass the user's key via `spawn.env`:
+
+```typescript
+const sdk = await SandboxAgent.start({
+  spawn: {
+    env: {
+      ANTHROPIC_API_KEY: userProvidedKey,
+    },
+  },
+});
+```
+
+#### Security
+
+API keys are typically long-lived. The key is visible to the agent and anything running inside the sandbox, so exfiltration is possible. This is usually acceptable for developer-facing tools where the user owns the key.
+
+#### Use cases
+
+- **Developer tools**: each user manages their own API key
+- **Internal platforms**: users already have LLM provider accounts
+- **Per-user billing**: no extra infrastructure needed
+
+### Shared API key
+
+A single organization-wide API key is used for all sessions. All token usage appears on one bill with no per-user or per-tenant cost attribution.
+
+```typescript
+const sdk = await SandboxAgent.start({
+  spawn: {
+    env: {
+      ANTHROPIC_API_KEY: process.env.ORG_ANTHROPIC_KEY!,
+      OPENAI_API_KEY: process.env.ORG_OPENAI_KEY!,
+    },
+  },
+});
+```
+
+If you need to track or limit spend per tenant, use a per-tenant gateway instead.
+
+#### Security
+
+Not recommended for anything other than internal tooling. A single exfiltrated key exposes your organization's entire LLM budget. If you need org-paid credentials for external users, use a per-tenant gateway with scoped keys instead.
+
+#### Use cases
+
+- **Single-tenant apps**: small number of users, one bill
+- **Prototyping**: cost attribution not needed yet
+- **Simplicity over security**: acceptable when exfiltration risk is low
+
+### Personal subscription
+
+If the user is signed into Claude Code or Codex on the host machine, Sandbox Agent automatically picks up their OAuth tokens. No configuration is needed.
+
+#### Remote sandboxes
+
+Extract credentials locally and pass them to a remote sandbox via `spawn.env`:
+
+```bash
+$ sandbox-agent credentials extract-env
+ANTHROPIC_API_KEY=sk-ant-...
+CLAUDE_API_KEY=sk-ant-...
+OPENAI_API_KEY=sk-...
+CODEX_API_KEY=sk-...
+```
+
+Use `-e` to prefix with `export` for shell sourcing.
+
+#### Security
+
+Personal subscriptions use OAuth tokens with a limited lifespan. These are the same credentials used when running an agent normally on the host. If a token is exfiltrated from the sandbox, the exposure window is short.
+
+#### Use cases
+
+- **Local development**: users are already signed into Claude Code or Codex
+- **Internal tools**: every user has their own subscription
+- **Prototyping**: no key management needed
--- a/docs/logo/dark.svg
+++ b/docs/logo/dark.svg
--- a/docs/logo/light.svg
+++ b/docs/logo/light.svg
--- a/docs/manage-sessions.mdx
+++ b/docs/manage-sessions.mdx
@ -1,21 +1,263 @@
 ---
-title: "Manage Session State"
-description: "TODO"
+title: "Manage Sessions"
+description: "Persist and replay agent transcripts across connections."
+icon: "database"
 ---

-TODO
+Sandbox Agent stores sessions in memory only. When the server restarts or the sandbox is destroyed, all session data is lost. It's your responsibility to persist events to your own database.

 ## Recommended approach

- Store the offset of the last message you have seen (the last event id).
- Update your server to stream events from the Events API using that offset.
- Write the resulting messages and events to your own database.
+1. Store events to your database as they arrive
+2. On reconnect, get the last event's `sequence` and pass it as `offset`
+3. The API returns events where `sequence > offset`

-This lets you resume from a known offset after a disconnect and prevents duplicate writes.
+This prevents duplicate writes and lets you recover from disconnects.
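+
+The resume rule above can be sketched in isolation. This is a minimal in-memory illustration, not the SDK's implementation: `StoredEvent`, `eventsAfter`, and `nextOffset` are hypothetical names, and the sketch assumes `sequence` only ever increases within a session.
+
+```typescript
+// Hypothetical minimal event shape; real events carry more fields.
+interface StoredEvent {
+  sequence: number;
+  type: string;
+}
+
+// Mirrors the API rule: only events with sequence > offset are returned.
+function eventsAfter(events: StoredEvent[], offset: number): StoredEvent[] {
+  return events.filter((event) => event.sequence > offset);
+}
+
+// The next offset is the highest sequence already stored (0 if nothing is stored).
+function nextOffset(stored: StoredEvent[]): number {
+  return stored.reduce((max, event) => Math.max(max, event.sequence), 0);
+}
+```
+
+Deriving the offset from what is already in your database, rather than from in-process state, means a restarted consumer resumes from the right place.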

-## Recommended: Rivet Actors
+## Receiving events

-If you want a managed way to keep long-running streams alive, consider [Rivet Actors](https://rivet.dev).
-They handle continuous event streaming plus fast reads and writes of data for agents, with built-in
-realtime support and observability. You can use them to stream `/events/sse` per session and persist
-each event to your database as it arrives.
+There are two ways to receive events: streaming (recommended) or polling.
+
+### Streaming
+
+Use streaming for real-time events with automatic reconnection support.
+
+```typescript
+import { SandboxAgentClient } from "sandbox-agent";
+
+const client = new SandboxAgentClient({
+  baseUrl: "http://127.0.0.1:2468",
+  agent: "mock",
+});
+
+// Get offset from last stored event (0 returns all events)
+const lastEvent = await db.getLastEvent("my-session");
+const offset = lastEvent?.sequence ?? 0;
+
+// Stream from where you left off
+for await (const event of client.streamEvents("my-session", { offset })) {
+  await db.insertEvent("my-session", event);
+}
+```
+
+### Polling
+
+If you can't use streaming, poll the events endpoint:
+
+```typescript
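+// Reuses `client` and `db` from the streaming example; `sleep` is a small local helper.
+const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
+
+const lastEvent = await db.getLastEvent("my-session");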
+let offset = lastEvent?.sequence ?? 0;
+
+while (true) {
+  const { events } = await client.getEvents("my-session", {
+    offset,
+    limit: 100
+  });
+
+  for (const event of events) {
+    await db.insertEvent("my-session", event);
+    offset = event.sequence;
+  }
+
+  await sleep(1000);
+}
+```
+
+## Database options
+
+Choose where to persist events based on your requirements. For most use cases, we recommend Rivet Actors.
+
+| | Durable | Real-time | Multiplayer | Scaling | Throughput | Complexity |
+|---------|:-------:|:---------:|:-----------:|---------|------------|------------|
+| Rivet Actors | ✓ | ✓ | ✓ | Auto-sharded, one actor per session | Millions of concurrent sessions | Zero infrastructure |
+| PostgreSQL | ✓ |  |  | Manual sharding | Connection pool limited | Connection pools, migrations |
+| Redis |  | ✓ |  | Redis Cluster | High, in-memory | Memory management, Sentinel for failover |
+
+### Rivet Actors
+
+For production workloads, [Rivet Actors](https://rivet.dev) provide a managed solution for:
+
+- **Persistent state**: Events survive crashes and restarts
+- **Real-time streaming**: Built-in WebSocket support for clients
+- **Horizontal scaling**: Run thousands of concurrent sessions
+- **Observability**: Built-in logging and metrics
+
+#### Actor
+
+```typescript
+import { actor } from "rivetkit";
+import { Daytona } from "@daytonaio/sdk";
+import { SandboxAgentClient, type AgentEvent } from "sandbox-agent";
+
+interface CodingSessionState {
+  sandboxId: string;
+  baseUrl: string;
+  sessionId: string;
+  events: AgentEvent[];
+}
+
+interface CodingSessionVars {
+  client: SandboxAgentClient;
+}
+
+const daytona = new Daytona();
+
+const codingSession = actor({
+  createState: async (): Promise<CodingSessionState> => {
+    const sandbox = await daytona.create({
+      snapshot: "sandbox-agent-ready",
+      envVars: {
+        ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
+        OPENAI_API_KEY: process.env.OPENAI_API_KEY,
+      },
+      autoStopInterval: 0,
+    });
+
+    await sandbox.process.executeCommand(
+      "nohup sandbox-agent server --no-token --host 0.0.0.0 --port 3000 &"
+    );
+
+    const baseUrl = (await sandbox.getSignedPreviewUrl(3000)).url;
+    const sessionId = crypto.randomUUID();
+
+    return {
+      sandboxId: sandbox.id,
+      baseUrl,
+      sessionId,
+      events: [],
+    };
+  },
+
+  createVars: async (c): Promise<CodingSessionVars> => {
+    const client = new SandboxAgentClient({
+      baseUrl: c.state.baseUrl,
+      agent: "mock",
+    });
+    await client.createSession(c.state.sessionId, { agent: "claude" });
+    return { client };
+  },
+
+  onDestroy: async (c) => {
+    const sandbox = await daytona.get(c.state.sandboxId);
+    await sandbox.delete();
+  },
+
+  run: async (c) => {
+    for await (const event of c.vars.client.streamEvents(c.state.sessionId)) {
+      c.state.events.push(event);
+      c.broadcast("agentEvent", event);
+    }
+  },
+
+  actions: {
+    postMessage: async (c, message: string) => {
+      await c.vars.client.postMessage(c.state.sessionId, message);
+    },
+
+    getTranscript: (c) => c.state.events,
+  },
+});
+```
+
+#### Client
+
+<CodeGroup>
+
+```typescript TypeScript
+import { createClient } from "rivetkit/client";
+
+const client = createClient();
+const session = client.codingSession.getOrCreate(["my-session"]);
+
+const conn = session.connect();
+conn.on("agentEvent", (event) => {
+  console.log(event.type, event.data);
+});
+
+await conn.postMessage("Create a new React component for user profiles");
+
+const transcript = await conn.getTranscript();
+```
+
+```typescript React
+import { useState } from "react";
+import { createRivetKit } from "@rivetkit/react";
+import type { AgentEvent } from "sandbox-agent";
+
+const { useActor } = createRivetKit();
+
+function CodingSession() {
+  const [messages, setMessages] = useState<AgentEvent[]>([]);
+  const session = useActor({ name: "codingSession", key: ["my-session"] });
+
+  session.useEvent("agentEvent", (event) => {
+    setMessages((prev) => [...prev, event]);
+  });
+
+  const sendPrompt = async (prompt: string) => {
+    await session.connection?.postMessage(prompt);
+  };
+
+  return (
+    <div>
+      {messages.map((msg, i) => (
+        <div key={i}>{JSON.stringify(msg)}</div>
+      ))}
+      <button onClick={() => sendPrompt("Build a login page")}>
+        Send Prompt
+      </button>
+    </div>
+  );
+}
+```
+
+</CodeGroup>
+
+### PostgreSQL
+
+```sql
+CREATE TABLE agent_events (
+  event_id TEXT PRIMARY KEY,
+  session_id TEXT NOT NULL,
+  native_session_id TEXT,
+  sequence INTEGER NOT NULL,
+  time TIMESTAMPTZ NOT NULL,
+  type TEXT NOT NULL,
+  source TEXT NOT NULL,
+  synthetic BOOLEAN NOT NULL DEFAULT FALSE,
+  data JSONB NOT NULL,
+  UNIQUE(session_id, sequence)
+);
+
+CREATE INDEX idx_events_session ON agent_events(session_id, sequence);
+```
+
+### Redis
+
+```typescript
+// Append event to list
+await redis.rpush(`session:${sessionId}`, JSON.stringify(event));
+
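+// Get events after `offset` (assumes sequences start at 1 with no gaps,
+// so the list index of the next event equals `offset`)
+const raw = await redis.lrange(`session:${sessionId}`, offset, -1);
+const events = raw.map((item) => JSON.parse(item));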
+```
+
+## Handling disconnects
+
+The event stream may disconnect due to network issues. Handle reconnection gracefully:
+
+```typescript
+async function streamWithRetry(sessionId: string) {
+  while (true) {
+    try {
+      const lastEvent = await db.getLastEvent(sessionId);
+      const offset = lastEvent?.sequence ?? 0;
+
+      for await (const event of client.streamEvents(sessionId, { offset })) {
+        await db.insertEvent(sessionId, event);
+      }
+    } catch (error) {
+      console.error("Stream disconnected, reconnecting...", error);
+      await sleep(1000);
+    }
+  }
+}
+```
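+
+A fixed one-second delay is the simplest retry policy. A common alternative is capped exponential backoff. The sketch below is one assumed way to compute the delay; `backoffDelayMs` and its default values are hypothetical and not part of the SDK.
+
+```typescript
+// Hypothetical helper: the delay doubles with each failed attempt, capped at maxMs.
+function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
+  // Guard against negative or fractional attempt counts.
+  const safeAttempt = Math.max(0, Math.floor(attempt));
+  return Math.min(maxMs, baseMs * 2 ** safeAttempt);
+}
+```
+
+In `streamWithRetry`, you could count consecutive failures, pass that count as `attempt`, and reset it to zero once an event is received.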
--- a/docs/mcp-config.mdx
+++ b/docs/mcp-config.mdx
@ -0,0 +1,82 @@
+---
+title: "MCP"
+description: "Configure MCP servers for agent sessions."
+sidebarTitle: "MCP"
+icon: "plug"
+---
+
+MCP (Model Context Protocol) servers extend agents with tools and external context.
+
+## Configuring MCP servers
+
+The HTTP config endpoints let you store and retrieve MCP server configs, keyed by directory and name.
+
+```ts
+// Create MCP config
+await sdk.setMcpConfig(
+  {
+    directory: "/workspace",
+    mcpName: "github",
+  },
+  {
+    type: "remote",
+    url: "https://example.com/mcp",
+  },
+);
+
+// Create a session using the configured MCP servers
+const session = await sdk.createSession({
+  agent: "claude",
+  cwd: "/workspace",
+});
+
+await session.prompt([
+  { type: "text", text: "Use available MCP servers to help with this task." },
+]);
+
+// Get a single MCP config
+const config = await sdk.getMcpConfig({
+  directory: "/workspace",
+  mcpName: "github",
+});
+
+console.log(config.type);
+
+// Delete MCP config
+await sdk.deleteMcpConfig({
+  directory: "/workspace",
+  mcpName: "github",
+});
+```
+
+## Config fields
+
+### Local server
+
+| Field | Description |
+|---|---|
+| `type` | `local` |
+| `command` | executable path |
+| `args` | array of CLI args |
+| `env` | environment variable map |
+| `cwd` | working directory |
+| `enabled` | enable/disable server |
+| `timeoutMs` | timeout override |
+
+### Remote server
+
+| Field | Description |
+|---|---|
+| `type` | `remote` |
+| `url` | MCP server URL |
+| `transport` | `http` or `sse` |
+| `headers` | static headers map |
+| `bearerTokenEnvVar` | env var name to inject in auth header |
+| `envHeaders` | header name to env var map |
+| `oauth` | optional OAuth config object |
+| `enabled` | enable/disable server |
+| `timeoutMs` | timeout override |
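+
+The two tables above can be read as a discriminated union on `type`. The sketch below is illustrative only: the field names come from the tables, but the value types, the choice of which fields are optional, and the names `LocalMcpConfig`, `RemoteMcpConfig`, `McpConfig`, and `describeMcpConfig` are assumptions rather than the SDK's exported types.
+
+```typescript
+// Field names follow the tables above; value types and optionality are assumptions.
+interface LocalMcpConfig {
+  type: "local";
+  command: string;
+  args?: string[];
+  env?: Record<string, string>;
+  cwd?: string;
+  enabled?: boolean;
+  timeoutMs?: number;
+}
+
+interface RemoteMcpConfig {
+  type: "remote";
+  url: string;
+  transport?: "http" | "sse";
+  headers?: Record<string, string>;
+  bearerTokenEnvVar?: string;
+  envHeaders?: Record<string, string>;
+  oauth?: Record<string, unknown>;
+  enabled?: boolean;
+  timeoutMs?: number;
+}
+
+type McpConfig = LocalMcpConfig | RemoteMcpConfig;
+
+// Narrowing on `type` gives access to the fields specific to each variant.
+function describeMcpConfig(config: McpConfig): string {
+  return config.type === "local"
+    ? `local: ${[config.command, ...(config.args ?? [])].join(" ")}`
+    : `remote: ${config.url}`;
+}
+```
+
+Modeling the config this way lets the compiler reject mixed shapes, such as a `local` config that carries a `url`.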
+
+## Custom MCP servers
+
+To bundle and upload your own MCP server into the sandbox, see [Custom Tools](/custom-tools).
--- a/docs/multiplayer.mdx
+++ b/docs/multiplayer.mdx
@ -0,0 +1,147 @@
+---
+title: "Multiplayer"
+description: "Use Rivet Actors to coordinate shared sessions."
+icon: "users"
+---
+
+For multiplayer orchestration, use [Rivet Actors](https://rivet.dev/docs/actors).
+
+Recommended model:
+
+- One actor per collaborative workspace/thread.
+- The actor owns Sandbox Agent session lifecycle and persistence.
+- Clients connect to the actor and receive realtime broadcasts.
+
+Use [actor keys](https://rivet.dev/docs/actors/keys) to map each workspace to one actor, [events](https://rivet.dev/docs/actors/events) for realtime updates, and [lifecycle hooks](https://rivet.dev/docs/actors/lifecycle) for cleanup.
+
+## Example
+
+<CodeGroup>
+
+```ts Actor (server)
+import { actor, setup } from "rivetkit";
+import { SandboxAgent, type SessionPersistDriver, type SessionRecord, type SessionEvent, type ListPageRequest, type ListPage, type ListEventsRequest } from "sandbox-agent";
+
+interface RivetPersistData { sessions: Record<string, SessionRecord>; events: Record<string, SessionEvent[]>; }
+type RivetPersistState = { _sandboxAgentPersist: RivetPersistData };
+
+class RivetSessionPersistDriver implements SessionPersistDriver {
+  private readonly stateKey: string;
+  private readonly ctx: { state: Record<string, unknown> };
+  constructor(ctx: { state: Record<string, unknown> }, options: { stateKey?: string } = {}) {
+    this.ctx = ctx;
+    this.stateKey = options.stateKey ?? "_sandboxAgentPersist";
+    if (!this.ctx.state[this.stateKey]) {
+      this.ctx.state[this.stateKey] = { sessions: {}, events: {} };
+    }
+  }
+  private get data(): RivetPersistData { return this.ctx.state[this.stateKey] as RivetPersistData; }
+  async getSession(id: string) { const s = this.data.sessions[id]; return s ? { ...s } : undefined; }
+  async listSessions(request: ListPageRequest = {}): Promise<ListPage<SessionRecord>> {
+    const sorted = Object.values(this.data.sessions).sort((a, b) => a.createdAt - b.createdAt || a.id.localeCompare(b.id));
+    const offset = Number(request.cursor ?? 0);
+    const limit = request.limit ?? 100;
+    const slice = sorted.slice(offset, offset + limit);
+    return { items: slice, nextCursor: offset + slice.length < sorted.length ? String(offset + slice.length) : undefined };
+  }
+  async updateSession(session: SessionRecord) { this.data.sessions[session.id] = { ...session }; if (!this.data.events[session.id]) this.data.events[session.id] = []; }
+  async listEvents(request: ListEventsRequest): Promise<ListPage<SessionEvent>> {
+    const all = [...(this.data.events[request.sessionId] ?? [])].sort((a, b) => a.eventIndex - b.eventIndex || a.id.localeCompare(b.id));
+    const offset = Number(request.cursor ?? 0);
+    const limit = request.limit ?? 100;
+    const slice = all.slice(offset, offset + limit);
+    return { items: slice, nextCursor: offset + slice.length < all.length ? String(offset + slice.length) : undefined };
+  }
+  async insertEvent(sessionId: string, event: SessionEvent) { const events = this.data.events[sessionId] ?? []; events.push({ ...event, payload: JSON.parse(JSON.stringify(event.payload)) }); this.data.events[sessionId] = events; }
+}
+
+type WorkspaceState = RivetPersistState & {
+  sandboxId: string;
+  baseUrl: string;
+};
+
+export const workspace = actor({
+  createState: async () => {
+    return {
+      sandboxId: "sbx_123",
+      baseUrl: "http://127.0.0.1:2468",
+    } satisfies Partial<WorkspaceState>;
+  },
+
+  createVars: async (c) => {
+    const persist = new RivetSessionPersistDriver(c);
+    const sdk = await SandboxAgent.connect({
+      baseUrl: c.state.baseUrl,
+      persist,
+    });
+
+    const session = await sdk.resumeOrCreateSession({ id: "default", agent: "codex" });
+
+    const unsubscribe = session.onEvent((event) => {
+      c.broadcast("session.event", event);
+    });
+
+    return { sdk, session, unsubscribe };
+  },
+
+  actions: {
+    getSessionInfo: (c) => ({
+      workspaceId: c.key[0],
+      sandboxId: c.state.sandboxId,
+    }),
+
+    prompt: async (c, input: { userId: string; text: string }) => {
+      c.broadcast("chat.user", {
+        userId: input.userId,
+        text: input.text,
+        createdAt: Date.now(),
+      });
+
+      await c.vars.session.prompt([{ type: "text", text: input.text }]);
+    },
+  },
+
+  onSleep: async (c) => {
+    c.vars.unsubscribe?.();
+    await c.vars.sdk.dispose();
+  },
+});
+
+export const registry = setup({
+  use: { workspace },
+});
+```
+
+```ts Client (browser)
+import { createClient } from "rivetkit/client";
+import type { registry } from "./actors";
+
+const client = createClient<typeof registry>({
+  endpoint: process.env.NEXT_PUBLIC_RIVET_ENDPOINT!,
+});
+
+const workspaceId = "workspace-42";
+const room = client.workspace.getOrCreate([workspaceId]);
+const conn = room.connect();
+
+conn.on("chat.user", (event) => {
+  console.log("user message", event);
+});
+
+conn.on("session.event", (event) => {
+  console.log("sandbox event", event);
+});
+
+await conn.prompt({
+  userId: "user-123",
+  text: "Propose a refactor plan for auth middleware.",
+});
+```
+
+</CodeGroup>
+
+## Notes
+
+- Keep sandbox calls actor-only. Browser clients should not call Sandbox Agent directly.
+- Copy the Rivet persist driver from the example above into your project so session history persists in actor state.
+- For client connection patterns, see [Rivet JavaScript client](https://rivet.dev/docs/clients/javascript).
--- a/docs/observability.mdx
+++ b/docs/observability.mdx
@ -0,0 +1,64 @@
+---
+title: "Observability"
+description: "Track session activity with OpenTelemetry."
+icon: "chart-line"
+---
+
+Use OpenTelemetry to instrument session traffic, then ship telemetry to your collector/backend.
+
+## Common collectors and backends
+
+- [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
+- [Jaeger](https://www.jaegertracing.io/)
+- [Grafana Tempo](https://grafana.com/oss/tempo/)
+- [Honeycomb](https://www.honeycomb.io/)
+- [Datadog APM](https://docs.datadoghq.com/tracing/)
+
+## Example: trace a prompt round-trip
+
+Wrap `session.prompt()` in a span to measure the full round-trip, then log individual events as span events.
+
+This example assumes your OTEL provider and exporter are already configured.
+
+```ts
+import { trace } from "@opentelemetry/api";
+import { SandboxAgent } from "sandbox-agent";
+
+const tracer = trace.getTracer("my-app/sandbox-agent");
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: process.env.SANDBOX_URL!,
+});
+
+const session = await sdk.createSession({ agent: "mock" });
+
+// Log each event as an OTEL span event on the active span
+const unsubscribe = session.onEvent((event) => {
+  const activeSpan = trace.getActiveSpan();
+  if (!activeSpan) return;
+
+  activeSpan.addEvent("session.event", {
+    "sandbox.sender": event.sender,
+    "sandbox.event_index": event.eventIndex,
+  });
+});
+
+// The span covers the full prompt round-trip
+await tracer.startActiveSpan("sandbox_agent.prompt", async (span) => {
+  span.setAttribute("sandbox.session_id", session.id);
+
+  try {
+    const result = await session.prompt([
+      { type: "text", text: "Summarize this repository." },
+    ]);
+    span.setAttribute("sandbox.stop_reason", result.stopReason);
+  } catch (error) {
+    span.recordException(error as Error);
+    throw error;
+  } finally {
+    span.end();
+  }
+});
+
+unsubscribe();
+```
--- a/docs/openapi.json
+++ b/docs/openapi.json
--- a/docs/opencode-compatibility.mdx
+++ b/docs/opencode-compatibility.mdx
@ -0,0 +1,125 @@
+---
+title: "OpenCode Compatibility"
+description: "Connect OpenCode clients, SDKs, and web UI to Sandbox Agent."
+---
+
+<Warning>
+  **Experimental**: OpenCode SDK/UI compatibility may change.
+</Warning>
+
+Sandbox Agent exposes an OpenCode-compatible API at `/opencode`.
+
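+## Why use OpenCode clients with Sandbox Agent?
+
+The compatibility API lets existing OpenCode tooling drive Sandbox Agent sessions. Supported clients: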
+
+- OpenCode CLI (`opencode attach`)
+- OpenCode web UI
+- OpenCode TypeScript SDK (`@opencode-ai/sdk`)
+
+## Quick start
+
+### OpenCode CLI / TUI
+
+```bash
+sandbox-agent opencode --port 2468 --no-token
+```
+
+Or start server + attach manually:
+
+```bash
+sandbox-agent server --no-token --host 127.0.0.1 --port 2468
+opencode attach http://localhost:2468/opencode
+```
+
+With authentication enabled:
+
+```bash
+sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
+opencode attach http://localhost:2468/opencode --password "$SANDBOX_TOKEN"
+```
+
+### OpenCode web UI
+
+<Steps>
+  <Step title="Start Sandbox Agent with CORS">
+    ```bash
+    sandbox-agent server --no-token --host 127.0.0.1 --port 2468 --cors-allow-origin http://127.0.0.1:5173
+    ```
+  </Step>
+  <Step title="Run OpenCode web app">
+    ```bash
+    git clone https://github.com/anomalyco/opencode
+    cd opencode/packages/app
+    export VITE_OPENCODE_SERVER_HOST=127.0.0.1
+    export VITE_OPENCODE_SERVER_PORT=2468
+    bun install
+    bun run dev -- --host 127.0.0.1 --port 5173
+    ```
+  </Step>
+  <Step title="Open UI">
+    Visit `http://127.0.0.1:5173/`.
+  </Step>
+</Steps>
+
+### OpenCode SDK
+
+```typescript
+import { createOpencodeClient } from "@opencode-ai/sdk";
+
+const client = createOpencodeClient({
+  baseUrl: "http://localhost:2468/opencode",
+});
+
+const session = await client.session.create();
+
+await client.session.promptAsync({
+  path: { id: session.data.id },
+  body: {
+    parts: [{ type: "text", text: "Hello, write a hello world script" }],
+  },
+});
+
+const events = await client.event.subscribe({});
+for await (const event of events.stream) {
+  console.log(event);
+}
+```
+
+## Notes
+
+- API base path: `/opencode`
+- If server auth is enabled, pass bearer auth (or `--password` in OpenCode CLI)
+- For browser UIs, configure CORS with `--cors-allow-origin`
+- Provider selector currently exposes compatible providers (`mock`, `amp`, `claude`, `codex`)
+- Provider/model metadata for compatibility endpoints is normalized and may differ from native OpenCode grouping
+- Optional proxy: set `OPENCODE_COMPAT_PROXY_URL` to forward selected endpoints to native OpenCode
+
+## Endpoint coverage
+
+<Accordion title="Endpoint Status Table">
+
+| Endpoint | Status | Notes |
+|---|---|---|
+| `GET /event` | ✓ | Session/message updates (SSE) |
+| `GET /global/event` | ✓ | GlobalEvent-wrapped stream |
+| `GET /session` | ✓ | Session list |
+| `POST /session` | ✓ | Create session |
+| `GET /session/{id}` | ✓ | Session details |
+| `POST /session/{id}/message` | ✓ | Send message |
+| `GET /session/{id}/message` | ✓ | Session messages |
+| `GET /permission` | ✓ | Pending permissions |
+| `POST /permission/{id}/reply` | ✓ | Permission reply |
+| `GET /question` | ✓ | Pending questions |
+| `POST /question/{id}/reply` | ✓ | Question reply |
+| `GET /provider` | ✓ | Provider metadata |
+| `GET /command` | ↔ | Proxied when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub |
+| `GET /config` | ↔ | Proxied when set; otherwise stub |
+| `PATCH /config` | ↔ | Proxied when set; otherwise local compatibility behavior |
+| `GET /global/config` | ↔ | Proxied when set; otherwise stub |
+| `PATCH /global/config` | ↔ | Proxied when set; otherwise local compatibility behavior |
+| `/tui/*` | ↔ | Proxied when set; otherwise local compatibility behavior |
+| `GET /agent` | − | Agent list |
+| *other endpoints* | − | Empty/stub responses |
+
+✓ Functional   ↔ Optionally proxied   − Stubbed
+
+</Accordion>
--- a/docs/orchestration-architecture.mdx
+++ b/docs/orchestration-architecture.mdx
@ -0,0 +1,43 @@
+---
+title: "Orchestration Architecture"
+description: "Production topology, backend requirements, and session persistence."
+icon: "sitemap"
+---
+
+This page covers production topology and backend requirements. Read [Architecture](/architecture) first for an overview of how the server, SDK, and agent processes fit together.
+
+## Suggested topology
+
+Run the SDK on your backend, then call it from your frontend.
+
+This extra hop is recommended because it keeps auth/token logic on the backend and makes persistence simpler.
+
+```mermaid placement="top-right"
+  flowchart LR
+    BROWSER["Browser"]
+    subgraph BACKEND["Your backend"]
+      direction TB
+      SDK["Sandbox Agent SDK"]
+    end
+    subgraph SANDBOX_SIMPLE["Sandbox"]
+      SERVER_SIMPLE["Sandbox Agent server"]
+    end
+
+    BROWSER --> BACKEND
+    BACKEND --> SDK --> SERVER_SIMPLE
+```
+
+### Backend requirements
+
+Your backend layer needs to handle:
+
+- **Long-running connections**: prompts can take minutes.
+- **Session affinity**: follow-up messages must reach the same session.
+- **State between requests**: session metadata and event history must persist across requests.
+- **Graceful recovery**: sessions should resume after backend restarts.
+
+We recommend [Rivet](https://rivet.dev) over serverless because actors natively support the long-lived connections, session routing, and state persistence that agent workloads require.
+
+## Session persistence
+
+For storage driver options and replay behavior, see [Persisting Sessions](/session-persistence).
--- a/docs/processes.mdx
+++ b/docs/processes.mdx
@ -0,0 +1,258 @@
+---
+title: "Processes"
+description: "Run commands and manage long-lived processes inside the sandbox."
+sidebarTitle: "Processes"
+icon: "terminal"
+---
+
+The process API supports:
+
+- **One-shot execution** — run a command to completion and capture stdout, stderr, and exit code
+- **Managed processes** — spawn, list, stop, kill, and delete long-lived processes
+- **Log streaming** — fetch buffered logs or follow live output
+- **Terminals** — full PTY support with bidirectional WebSocket I/O
+- **Configurable limits** — control concurrency, timeouts, and buffer sizes per runtime
+
+## Run a command
+
+Execute a command to completion and get its output.
+
+<CodeGroup>
+```ts TypeScript
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+const result = await sdk.runProcess({
+  command: "ls",
+  args: ["-la", "/workspace"],
+});
+
+console.log(result.exitCode); // 0
+console.log(result.stdout);
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"ls","args":["-la","/workspace"]}'
+```
+</CodeGroup>
+
+You can set a timeout and cap output size:
+
+<CodeGroup>
+```ts TypeScript
+const result = await sdk.runProcess({
+  command: "make",
+  args: ["build"],
+  timeoutMs: 60000,
+  maxOutputBytes: 1048576,
+});
+
+if (result.timedOut) {
+  console.log("Build timed out");
+}
+if (result.stdoutTruncated) {
+  console.log("Output was truncated");
+}
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"make","args":["build"],"timeoutMs":60000,"maxOutputBytes":1048576}'
+```
+</CodeGroup>
+
+## Managed processes
+
+Create a long-lived process that you can interact with, monitor, and stop later.
+
+### Create
+
+<CodeGroup>
+```ts TypeScript
+const proc = await sdk.createProcess({
+  command: "node",
+  args: ["server.js"],
+  cwd: "/workspace",
+});
+
+console.log(proc.id, proc.pid); // proc_1, 12345
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes" \
+  -H "Content-Type: application/json" \
+  -d '{"command":"node","args":["server.js"],"cwd":"/workspace"}'
+```
+</CodeGroup>
+
+### List and get
+
+<CodeGroup>
+```ts TypeScript
+const { processes } = await sdk.listProcesses();
+
+for (const p of processes) {
+  console.log(p.id, p.command, p.status);
+}
+
+const proc = await sdk.getProcess("proc_1");
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/processes"
+
+curl "http://127.0.0.1:2468/v1/processes/proc_1"
+```
+</CodeGroup>
+
+### Stop, kill, and delete
+
+<CodeGroup>
+```ts TypeScript
+// SIGTERM with optional wait
+await sdk.stopProcess("proc_1", { waitMs: 5000 });
+
+// SIGKILL
+await sdk.killProcess("proc_1", { waitMs: 1000 });
+
+// Remove exited process record
+await sdk.deleteProcess("proc_1");
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/stop?waitMs=5000"
+
+curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/kill?waitMs=1000"
+
+curl -X DELETE "http://127.0.0.1:2468/v1/processes/proc_1"
+```
+</CodeGroup>
+
+## Logs
+
+### Fetch buffered logs
+
+<CodeGroup>
+```ts TypeScript
+const logs = await sdk.getProcessLogs("proc_1", {
+  tail: 50,
+  stream: "combined",
+});
+
+for (const entry of logs.entries) {
+  console.log(entry.stream, atob(entry.data));
+}
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/processes/proc_1/logs?tail=50&stream=combined"
+```
+</CodeGroup>
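+
+The `atob` call in the TypeScript example returns a binary string, which is fine for ASCII output but garbles multi-byte UTF-8 characters. Below is a minimal sketch of a UTF-8-safe decoder, assuming `entry.data` is base64-encoded as the example implies. `decodeLogData` is a hypothetical helper, not an SDK export.
+
+```ts
+// Hypothetical helper for illustration; not part of the sandbox-agent SDK.
+function decodeLogData(base64: string): string {
+  // atob yields one character per byte (code points 0-255).
+  const binary = atob(base64);
+  // Convert that binary string back into raw bytes.
+  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
+  // Interpret the bytes as UTF-8 text.
+  return new TextDecoder("utf-8").decode(bytes);
+}
+
+console.log(decodeLogData("4pyTIGRvbmU=")); // prints "✓ done"
+```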
+
+### Follow logs
+
+Stream log entries in real time. The subscription replays buffered entries first, then streams new output as it arrives.
+
+```ts TypeScript
+const sub = await sdk.followProcessLogs("proc_1", (entry) => {
+  console.log(entry.stream, atob(entry.data));
+});
+
+// Later, stop following
+sub.close();
+await sub.closed;
+```
+
+## Terminals
+
+Create a process with `tty: true` to allocate a pseudo-terminal, then connect via WebSocket for full bidirectional I/O.
+
+```ts TypeScript
+const proc = await sdk.createProcess({
+  command: "bash",
+  tty: true,
+});
+```
+
+### Write input
+
+<CodeGroup>
+```ts TypeScript
+await sdk.sendProcessInput("proc_1", {
+  data: "echo hello\n",
+  encoding: "utf8",
+});
+```
+
+```bash cURL
+curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/input" \
+  -H "Content-Type: application/json" \
+  -d '{"data":"echo hello\n","encoding":"utf8"}'
+```
+</CodeGroup>
+
+### Connect to a terminal
+
+Use `ProcessTerminalSession` unless you need direct frame access.
+
+```ts TypeScript
+const terminal = sdk.connectProcessTerminal("proc_1");
+
+terminal.onReady(() => {
+  terminal.resize({ cols: 120, rows: 40 });
+  terminal.sendInput("ls\n");
+});
+
+terminal.onData((bytes) => {
+  process.stdout.write(new TextDecoder().decode(bytes));
+});
+
+terminal.onExit((status) => {
+  console.log("exit:", status.exitCode);
+});
+
+terminal.onError((error) => {
+  console.error(error.message);
+});
+
+terminal.onClose(() => {
+  console.log("terminal closed");
+});
+```
+
+Since the browser WebSocket API cannot send custom headers, the endpoint accepts an `access_token` query parameter for authentication. The SDK handles this automatically.
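+
+If you connect without the SDK, the token has to be placed in the URL yourself. The sketch below shows one way to build such a URL. The `/v1/processes/{id}/terminal/ws` path and the `buildTerminalWsUrl` helper are assumptions for illustration only; check the HTTP API reference for the actual route.
+
+```ts
+// Sketch only: the WebSocket path below is an assumed route, and
+// buildTerminalWsUrl is a hypothetical helper, not an SDK export.
+function buildTerminalWsUrl(baseUrl: string, processId: string, token?: string): string {
+  const url = new URL(`/v1/processes/${encodeURIComponent(processId)}/terminal/ws`, baseUrl);
+  // Map http -> ws and https -> wss.
+  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
+  // Browsers cannot set an Authorization header on WebSockets,
+  // so the token travels as a query parameter instead.
+  if (token) {
+    url.searchParams.set("access_token", token);
+  }
+  return url.toString();
+}
+```
+
+Because the token ends up in the URL, avoid logging the full URL on the client or in intermediaries.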
+
+### Browser terminal emulators
+
+The terminal session works with browser terminal emulators such as ghostty-web and xterm.js. For a drop-in React terminal, see [React Components](/react-components).
+
+## Configuration
+
+Adjust runtime limits like max concurrent processes, timeouts, and buffer sizes.
+
+<CodeGroup>
+```ts TypeScript
+const config = await sdk.getProcessConfig();
+console.log(config);
+
+await sdk.setProcessConfig({
+  ...config,
+  maxConcurrentProcesses: 32,
+  defaultRunTimeoutMs: 60000,
+});
+```
+
+```bash cURL
+curl "http://127.0.0.1:2468/v1/processes/config"
+
+curl -X POST "http://127.0.0.1:2468/v1/processes/config" \
+  -H "Content-Type: application/json" \
+  -d '{"maxConcurrentProcesses":32,"defaultRunTimeoutMs":60000,"maxRunTimeoutMs":300000,"maxOutputBytes":1048576,"maxLogBytesPerProcess":10485760,"maxInputBytesPerRequest":65536}'
+```
+</CodeGroup>
--- a/docs/quickstart.mdx
+++ b/docs/quickstart.mdx
@ -1,103 +1,306 @@
 ---
 title: "Quickstart"
 description: "Start the server and send your first message."
+icon: "rocket"
 ---

-## 1. Run the server
+<Steps>
+  <Step title="Install skill (optional)">
+    <Tabs>
+      <Tab title="npx">
+        ```bash
+        npx skills add rivet-dev/skills -s sandbox-agent
+        ```
+      </Tab>
+      <Tab title="bunx">
+        ```bash
+        bunx skills add rivet-dev/skills -s sandbox-agent
+        ```
+      </Tab>
+    </Tabs>
+  </Step>

-Use the installed binary, or `cargo run` in development.
+  <Step title="Set environment variables">
+    Each coding agent needs an API key for its LLM provider.

-```bash
-sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
-```
+    <Tabs>
+      <Tab title="Local shell">
+        ```bash
+        export ANTHROPIC_API_KEY="sk-ant-..."
+        export OPENAI_API_KEY="sk-..."
+        ```
+      </Tab>

-If you want to run without auth (local dev only):
+      <Tab title="E2B">
+        ```typescript
+        import { Sandbox } from "@e2b/code-interpreter";

-```bash
-sandbox-agent server --no-token --host 127.0.0.1 --port 2468
-```
+        const envs: Record<string, string> = {};
+        if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+        if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;

-If you're running from source instead of the installed CLI:
+        const sandbox = await Sandbox.create({ envs });
+        ```
+      </Tab>

-```bash
-cargo run -p sandbox-agent -- server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
-```
+      <Tab title="Daytona">
+        ```typescript
+        import { Daytona } from "@daytonaio/sdk";

-### CORS (frontend usage)
+        const envVars: Record<string, string> = {};
+        if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
+        if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY;

-If you are calling the server from a browser, enable CORS explicitly:
+        const daytona = new Daytona();
+        const sandbox = await daytona.create({
+          snapshot: "sandbox-agent-ready",
+          envVars,
+        });
+        ```
+      </Tab>

-```bash
-sandbox-agent server \
-  --token "$SANDBOX_TOKEN" \
-  --cors-allow-origin "http://localhost:5173" \
-  --cors-allow-method "GET" \
-  --cors-allow-method "POST" \
-  --cors-allow-header "Authorization" \
-  --cors-allow-header "Content-Type" \
-  --cors-allow-credentials
-```
+      <Tab title="Docker">
+        ```bash
+        docker run -p 2468:2468 \
+          -e ANTHROPIC_API_KEY="sk-ant-..." \
+          -e OPENAI_API_KEY="sk-..." \
+          rivetdev/sandbox-agent:0.4.2-full \
+          server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>
+    </Tabs>

-## 2. Install agents (optional)
+    <AccordionGroup>
+      <Accordion title="Extracting API keys from current machine">
+        Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from local Claude Code or Codex config files.
+      </Accordion>
+      <Accordion title="Testing without API keys">
+        Use the `mock` agent for SDK and integration testing without provider credentials.
+      </Accordion>
+      <Accordion title="Multi-tenant and per-user billing">
+        For per-tenant token tracking, budget enforcement, or usage-based billing, see [LLM Credentials](/llm-credentials) for gateway options like OpenRouter, LiteLLM, and Portkey.
+      </Accordion>
+    </AccordionGroup>
+  </Step>

-Agents install lazily on first use. To preinstall everything up front:
+  <Step title="Run the server">
+    <Tabs>
+      <Tab title="curl">
+        Install and run the binary directly.

-```bash
-sandbox-agent install-agent claude
-sandbox-agent install-agent codex
-sandbox-agent install-agent opencode
-sandbox-agent install-agent amp
-```
+        ```bash
+        curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
+        sandbox-agent server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>

-## 3. Create a session
+      <Tab title="npx">
+        Run without installing globally.

-```bash
-curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"agent":"claude","agentMode":"build","permissionMode":"default"}'
-```
+        ```bash
+        npx @sandbox-agent/cli@0.4.x server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>

-## 4. Send a message
+      <Tab title="bunx">
+        Run without installing globally.

-```bash
-curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"message":"Summarize the repository and suggest next steps."}'
-```
+        ```bash
+        bunx @sandbox-agent/cli@0.4.x server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>

-## 5. Read events
+      <Tab title="npm i -g">
+        Install globally, then run.

-```bash
-curl "http://127.0.0.1:2468/v1/sessions/my-session/events?offset=0&limit=50" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN"
-```
+        ```bash
+        npm install -g @sandbox-agent/cli@0.4.x
+        sandbox-agent server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>

-For streaming output, use SSE:
+      <Tab title="bun add -g">
+        Install globally, then run.

-```bash
-curl "http://127.0.0.1:2468/v1/sessions/my-session/events/sse?offset=0" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN"
-```
+        ```bash
+        bun add -g @sandbox-agent/cli@0.4.x
+        # Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
+        bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
+        sandbox-agent server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>

-For a single-turn stream (post a message and get one streamed response):
+      <Tab title="Node.js (local)">
+        For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.

-```bash
-curl -N -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages/stream" \
-  -H "Authorization: Bearer $SANDBOX_TOKEN" \
-  -H "content-type: application/json" \
-  -d '{"message":"Hello"}'
-```
+        ```bash
+        npm install sandbox-agent@0.4.x
+        ```

-## 6. CLI shortcuts
+        ```typescript
+        import { SandboxAgent } from "sandbox-agent";

-The `sandbox-agent api` subcommand mirrors the HTTP API:
+        const sdk = await SandboxAgent.start();
+        ```
+      </Tab>

-```bash
-sandbox-agent api sessions create my-session --agent claude --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
+      <Tab title="Bun (local)">
+        For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.

-sandbox-agent api sessions send-message my-session --message "Hello" --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
+        ```bash
+        bun add sandbox-agent@0.4.x
+        # Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
+        bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
+        ```

-sandbox-agent api sessions send-message-stream my-session --message "Hello" --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
-```
+        ```typescript
+        import { SandboxAgent } from "sandbox-agent";
+
+        const sdk = await SandboxAgent.start();
+        ```
+      </Tab>
+
+      <Tab title="Build from source">
+        Run the server from a source checkout instead of the installed CLI.
+
+        ```bash
+        cargo run -p sandbox-agent -- server --no-token --host 0.0.0.0 --port 2468
+        ```
+      </Tab>
+    </Tabs>
+
+    Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely.
+
+    <AccordionGroup>
+      <Accordion title="Configuring token">
+        Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure networking at the infrastructure layer.
+
+        If you expose the server publicly, use `--token "$SANDBOX_TOKEN"` to require authentication:
+
+        ```bash
+        sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
+        ```
+
+        Then pass the token when connecting:
+
+        <Tabs>
+          <Tab title="TypeScript">
+            ```typescript
+            import { SandboxAgent } from "sandbox-agent";
+
+            const sdk = await SandboxAgent.connect({
+              baseUrl: "http://your-server:2468",
+              token: process.env.SANDBOX_TOKEN,
+            });
+            ```
+          </Tab>
+
+          <Tab title="curl">
+            ```bash
+            curl "http://your-server:2468/v1/health" \
+              -H "Authorization: Bearer $SANDBOX_TOKEN"
+            ```
+          </Tab>
+
+          <Tab title="CLI">
+            ```bash
+            sandbox-agent --token "$SANDBOX_TOKEN" api agents list \
+              --endpoint http://your-server:2468
+            ```
+          </Tab>
+        </Tabs>
+      </Accordion>
+      <Accordion title="CORS">
+        If you're calling the server from a browser, see the [CORS configuration guide](/cors).
+      </Accordion>
+    </AccordionGroup>
+  </Step>
+
+  <Step title="Install agents (optional)">
+    To preinstall agents:
+
+    ```bash
+    sandbox-agent install-agent --all
+    ```
+
+    If agents are not installed up front, they are lazily installed when creating a session.
+  </Step>
+
+  <Step title="Install desktop dependencies (optional, Linux only)">
+    If you want to use `/v1/desktop/*`, install the desktop runtime packages first:
+
+    ```bash
+    sandbox-agent install desktop --yes
+    ```
+
+    Then use `GET /v1/desktop/status` or `sdk.getDesktopStatus()` to verify the runtime is ready before calling desktop screenshot or input APIs.
+  </Step>
+
+  <Step title="Create a session">
+    ```typescript
+    import { SandboxAgent } from "sandbox-agent";
+
+    const sdk = await SandboxAgent.connect({
+      baseUrl: "http://127.0.0.1:2468",
+    });
+
+    const session = await sdk.createSession({
+      agent: "claude",
+      sessionInit: {
+        cwd: "/",
+        mcpServers: [],
+      },
+    });
+
+    console.log(session.id);
+    ```
+  </Step>
+
+  <Step title="Send a message">
+    ```typescript
+    const result = await session.prompt([
+      { type: "text", text: "Summarize the repository and suggest next steps." },
+    ]);
+
+    console.log(result.stopReason);
+    ```
+  </Step>
+
+  <Step title="Read events">
+    ```typescript
+    const off = session.onEvent((event) => {
+      console.log(event.sender, event.payload);
+    });
+
+    const page = await sdk.getEvents({
+      sessionId: session.id,
+      limit: 50,
+    });
+
+    console.log(page.items.length);
+    off();
+    ```
+  </Step>
+
+  <Step title="Test with Inspector">
+    Open the Inspector UI at `/ui/` on your server (for example, `http://localhost:2468/ui/`) to inspect sessions and events in a GUI.
+
+    <Frame>
+      <img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
+    </Frame>
+  </Step>
+</Steps>
+
+## Next steps
+
+<CardGroup cols={3}>
+  <Card title="Session Persistence" icon="database" href="/session-persistence">
+    Configure in-memory, Rivet Actor state, IndexedDB, SQLite, and Postgres persistence.
+  </Card>
+  <Card title="Deploy to a Sandbox" icon="box" href="/deploy/local">
+    Deploy your agent to E2B, Daytona, Docker, Vercel, or Cloudflare.
+  </Card>
+  <Card title="SDK Overview" icon="compass" href="/sdk-overview">
+    Use the latest TypeScript SDK API.
+  </Card>
+</CardGroup>
--- a/docs/react-components.mdx
+++ b/docs/react-components.mdx
@ -0,0 +1,245 @@
+---
+title: "React Components"
+description: "Drop-in React components for Sandbox Agent frontends."
+icon: "react"
+---
+
+`@sandbox-agent/react` exposes small React components built on top of the `sandbox-agent` SDK.
+
+Current exports:
+
+- `AgentConversation` for a combined transcript + composer surface
+- `ProcessTerminal` for attaching to a running tty process
+- `AgentTranscript` for rendering session/message timelines without bundling any styles
+- `ChatComposer` for a reusable prompt input/send surface
+- `useTranscriptVirtualizer` for wiring large transcript lists to a scroll container
+
+## Install
+
+```bash
+npm install @sandbox-agent/react@0.4.x
+```
+
+## Full example
+
+This example connects to a running Sandbox Agent server, starts a tty shell, renders `ProcessTerminal`, and cleans up the process when the component unmounts.
+
+```tsx TerminalPane.tsx expandable highlight={5,32-36,71}
+"use client";
+
+import { useEffect, useState } from "react";
+import { SandboxAgent } from "sandbox-agent";
+import { ProcessTerminal } from "@sandbox-agent/react";
+
+export default function TerminalPane() {
+  const [client, setClient] = useState<SandboxAgent | null>(null);
+  const [processId, setProcessId] = useState<string | null>(null);
+  const [error, setError] = useState<string | null>(null);
+
+  useEffect(() => {
+    let cancelled = false;
+    let sdk: SandboxAgent | null = null;
+    let createdProcessId: string | null = null;
+
+    const cleanup = async () => {
+      if (!sdk || !createdProcessId) {
+        return;
+      }
+
+      await sdk.killProcess(createdProcessId, { waitMs: 1_000 }).catch(() => {});
+      await sdk.deleteProcess(createdProcessId).catch(() => {});
+    };
+
+    const start = async () => {
+      try {
+        sdk = await SandboxAgent.connect({
+          baseUrl: "http://127.0.0.1:2468",
+        });
+
+        const process = await sdk.createProcess({
+          command: "sh",
+          interactive: true,
+          tty: true,
+        });
+
+        createdProcessId = process.id;
+
+        if (cancelled) {
+          await cleanup();
+          await sdk.dispose();
+          return;
+        }
+
+        setClient(sdk);
+        setProcessId(process.id);
+      } catch (err) {
+        const message = err instanceof Error ? err.message : "Failed to start terminal.";
+        setError(message);
+      }
+    };
+
+    void start();
+
+    return () => {
+      cancelled = true;
+      void cleanup();
+      void sdk?.dispose();
+    };
+  }, []);
+
+  if (error) {
+    return <div>{error}</div>;
+  }
+
+  if (!client || !processId) {
+    return <div>Starting terminal...</div>;
+  }
+
+  return <ProcessTerminal client={client} processId={processId} height={480} />;
+}
+```
+
+## Component
+
+`ProcessTerminal` attaches to a running tty process.
+
+- `client`: a `SandboxAgent` client
+- `processId`: the process to attach to
+- `height`, `style`, `terminalStyle`: optional layout overrides
+- `onExit`, `onError`: optional lifecycle callbacks
+
+See [Processes](/processes) for the lower-level terminal APIs.
+
+## Headless transcript
+
+`AgentTranscript` is intentionally unstyled. It follows the common headless React pattern used by libraries like Radix, Headless UI, and React Aria: behavior lives in the component, while styling stays in your app through `className`, slot-level `classNames`, and `data-*` state attributes on the rendered DOM.
+
+```tsx TranscriptPane.tsx
+import {
+  AgentTranscript,
+  type AgentTranscriptClassNames,
+  type TranscriptEntry,
+} from "@sandbox-agent/react";
+
+const transcriptClasses: Partial<AgentTranscriptClassNames> = {
+  root: "transcript",
+  message: "transcript-message",
+  messageContent: "transcript-message-content",
+  toolGroupContainer: "transcript-tools",
+  toolGroupHeader: "transcript-tools-header",
+  toolItem: "transcript-tool-item",
+  toolItemHeader: "transcript-tool-item-header",
+  toolItemBody: "transcript-tool-item-body",
+  divider: "transcript-divider",
+  dividerText: "transcript-divider-text",
+  error: "transcript-error",
+};
+
+export function TranscriptPane({ entries }: { entries: TranscriptEntry[] }) {
+  return (
+    <AgentTranscript
+      entries={entries}
+      classNames={transcriptClasses}
+      renderMessageText={(entry) => <div>{entry.text}</div>}
+      renderInlinePendingIndicator={() => <span>...</span>}
+      renderToolGroupIcon={() => <span>Events</span>}
+      renderChevron={(expanded) => <span>{expanded ? "Hide" : "Show"}</span>}
+    />
+  );
+}
+```
+
+```css
+.transcript {
+  display: grid;
+  gap: 12px;
+}
+
+.transcript [data-slot="message"][data-variant="user"] .transcript-message-content {
+  background: #161616;
+  color: white;
+}
+
+.transcript [data-slot="message"][data-variant="assistant"] .transcript-message-content {
+  background: #f4f4f0;
+  color: #161616;
+}
+
+.transcript [data-slot="tool-item"][data-failed="true"] {
+  border-color: #d33;
+}
+
+.transcript [data-slot="tool-item-header"][data-expanded="true"] {
+  background: rgba(0, 0, 0, 0.06);
+}
+```
+
+`AgentTranscript` accepts `TranscriptEntry[]`, which matches the Inspector timeline shape:
+
+- `message` entries render user/assistant text
+- `tool` entries render expandable tool input/output sections
+- `reasoning` entries render expandable reasoning blocks
+- `meta` entries render status rows or expandable metadata details
+
+Useful props:
+
+- `className`: root class hook
+- `classNames`: slot-level class hooks for styling from outside the package
+- `scrollRef` + `virtualize`: opt into TanStack Virtual against an external scroll container
+- `renderMessageText`: custom text or markdown renderer
+- `renderToolItemIcon`, `renderToolGroupIcon`, `renderChevron`, `renderEventLinkContent`: presentation overrides
+- `renderInlinePendingIndicator`, `renderThinkingState`: loading/thinking UI overrides
+- `isDividerEntry`, `canOpenEvent`, `getToolGroupSummary`: behavior overrides for grouping and labels
+
+## Transcript virtualization hook
+
+`useTranscriptVirtualizer` exposes the same TanStack Virtual behavior used by `AgentTranscript` when `virtualize` is enabled.
+
+- Pass the grouped transcript rows you want to virtualize
+- Pass a `scrollRef` that points at the actual scrollable element
+- Use it when you need transcript-aware virtualization outside the stock `AgentTranscript` renderer
+
+## Composer and conversation
+
+`ChatComposer` is the headless message input. `AgentConversation` composes `AgentTranscript` and `ChatComposer` so apps can reuse the transcript/composer pairing without pulling in Inspector session chrome.
+
+```tsx ConversationPane.tsx
+import { AgentConversation, type TranscriptEntry } from "@sandbox-agent/react";
+
+export function ConversationPane({
+  entries,
+  message,
+  onMessageChange,
+  onSubmit,
+}: {
+  entries: TranscriptEntry[];
+  message: string;
+  onMessageChange: (value: string) => void;
+  onSubmit: () => void;
+}) {
+  return (
+    <AgentConversation
+      entries={entries}
+      emptyState={<div>Start the conversation.</div>}
+      transcriptProps={{
+        renderMessageText: (entry) => <div>{entry.text}</div>,
+      }}
+      composerProps={{
+        message,
+        onMessageChange,
+        onSubmit,
+        placeholder: "Send a message...",
+      }}
+    />
+  );
+}
+```
+
+Useful `ChatComposer` props:
+
+- `className` and `classNames` for external styling
+- `inputRef` to manage focus or autoresize from the consumer
+- `textareaProps` for lower-level textarea behavior
+- `allowEmptySubmit` when the submit action is valid without draft text, such as a stop button
+
+Use `transcriptProps` and `composerProps` when you want the shared composition but still need custom rendering or behavior. Use `transcriptClassNames` and `composerClassNames` when you want styling hooks for each subcomponent.
--- a/docs/sdk-overview.mdx
+++ b/docs/sdk-overview.mdx
@ -0,0 +1,276 @@
+---
+title: "SDK Overview"
+description: "Use the TypeScript SDK to manage Sandbox Agent sessions and APIs."
+icon: "compass"
+---
+
+The TypeScript SDK is centered on `sandbox-agent` and its `SandboxAgent` class.
+
+## Install
+
+<Tabs>
+  <Tab title="npm">
+    ```bash
+    npm install sandbox-agent@0.4.x
+    ```
+  </Tab>
+  <Tab title="bun">
+    ```bash
+    bun add sandbox-agent@0.4.x
+    # Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
+    bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
+    ```
+  </Tab>
+</Tabs>
+
+## Optional React components
+
+```bash
+npm install @sandbox-agent/react@0.4.x
+```
+
+## Create a client
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+```
+
+`SandboxAgent.connect(...)` waits for `/v1/health` by default before other SDK requests proceed. To disable that gate, pass `waitForHealth: false`. To keep the gate but fail after a bounded wait, pass `waitForHealth: { timeoutMs: 120_000 }`. To cancel the startup wait early, pass `signal: abortController.signal`.
+
+With a custom fetch handler (for example, proxying requests inside Workers):
+
+```ts
+const sdk = await SandboxAgent.connect({
+  fetch: (input, init) => customFetch(input, init),
+});
+```
+
+With an abort signal for the startup health gate:
+
+```ts
+const controller = new AbortController();
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  signal: controller.signal,
+});
+
+controller.abort();
+```
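+
+To make the health gate's behavior concrete, here is a rough sketch of a polling loop with a bounded wait and abort support. It is not the SDK's implementation: `waitUntilHealthy`, the `probe` callback, and the 250 ms retry interval are hypothetical choices for this example.
+
+```ts
+// Illustrative sketch; not the SDK's actual implementation.
+interface WaitOptions {
+  timeoutMs: number;
+  intervalMs?: number;
+  signal?: AbortSignal;
+}
+
+async function waitUntilHealthy(
+  probe: () => Promise<boolean>,
+  { timeoutMs, intervalMs = 250, signal }: WaitOptions,
+): Promise<void> {
+  const deadline = Date.now() + timeoutMs;
+  while (true) {
+    // Stop immediately if the caller cancelled the wait.
+    if (signal?.aborted) {
+      throw new Error("health wait aborted");
+    }
+    // A rejected probe (for example, connection refused) counts as "not healthy yet".
+    if (await probe().catch(() => false)) {
+      return;
+    }
+    if (Date.now() >= deadline) {
+      throw new Error(`server not healthy after ${timeoutMs}ms`);
+    }
+    await new Promise((resolve) => setTimeout(resolve, intervalMs));
+  }
+}
+```
+
+A probe for a real server could be as simple as `() => fetch(baseUrl + "/v1/health").then((res) => res.ok)`.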
+
+With persistence (see [Persisting Sessions](/session-persistence) for driver options):
+
+```ts
+import { SandboxAgent, InMemorySessionPersistDriver } from "sandbox-agent";
+
+const persist = new InMemorySessionPersistDriver();
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  persist,
+});
+```
+
+Local spawn with a sandbox provider:
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+import { local } from "sandbox-agent/local";
+
+const sdk = await SandboxAgent.start({
+  sandbox: local(),
+});
+
+// sdk.sandboxId — prefixed provider ID (e.g. "local/127.0.0.1:2468")
+
+await sdk.destroySandbox(); // provider-defined cleanup + disposes client
+```
+
+`SandboxAgent.start(...)` requires a `sandbox` provider. Built-in providers:
+
+| Import | Provider |
+|--------|----------|
+| `sandbox-agent/local` | Local subprocess |
+| `sandbox-agent/docker` | Docker container |
+| `sandbox-agent/e2b` | E2B sandbox |
+| `sandbox-agent/daytona` | Daytona workspace |
+| `sandbox-agent/vercel` | Vercel Sandbox |
+| `sandbox-agent/cloudflare` | Cloudflare Sandbox |
+
+Use `sdk.dispose()` to disconnect without changing sandbox state, `sdk.pauseSandbox()` for graceful suspension when supported, or `sdk.killSandbox()` for permanent deletion.
+
+## Session flow
+
+```ts
+const session = await sdk.createSession({
+  agent: "mock",
+  cwd: "/",
+});
+
+const prompt = await session.prompt([
+  { type: "text", text: "Summarize this repository." },
+]);
+
+console.log(prompt.stopReason);
+```
+
+Resume and destroy:
+
+```ts
+const restored = await sdk.resumeSession(session.id);
+await restored.prompt([{ type: "text", text: "Continue from previous context." }]);
+
+await sdk.destroySession(restored.id);
+```
+
+## Session configuration
+
+Set model, mode, or thought level at creation or on an existing session:
+
+```ts
+const session = await sdk.createSession({
+  agent: "codex",
+  model: "gpt-5.3-codex",
+});
+
+await session.setModel("gpt-5.2-codex");
+await session.setMode("auto");
+
+const options = await session.getConfigOptions();
+const modes = await session.getModes();
+```
+
+Handle permission requests from agents that ask before executing tools:
+
+```ts
+const claude = await sdk.createSession({
+  agent: "claude",
+  mode: "default",
+});
+
+claude.onPermissionRequest((request) => {
+  void claude.respondPermission(request.id, "once");
+});
+```
+
+See [Agent Sessions](/agent-sessions) for full details on config options and error handling.
+
+## Events
+
+Subscribe to live events:
+
+```ts
+const unsubscribe = session.onEvent((event) => {
+  console.log(event.eventIndex, event.sender, event.payload);
+});
+
+await session.prompt([{ type: "text", text: "Give me a short summary." }]);
+unsubscribe();
+```
+
+Fetch persisted events:
+
+```ts
+const page = await sdk.getEvents({
+  sessionId: session.id,
+  limit: 100,
+});
+
+console.log(page.items.length);
+```
+
+## Control-plane and HTTP helpers
+
+```ts
+const health = await sdk.getHealth();
+const agents = await sdk.listAgents();
+await sdk.installAgent("codex", { reinstall: true });
+
+const entries = await sdk.listFsEntries({ path: "." });
+const writeResult = await sdk.writeFsFile({ path: "./hello.txt" }, "hello");
+
+console.log(health.status, agents.agents.length, entries.length, writeResult.path);
+```
+
+## Desktop API
+
+The SDK also wraps the desktop host/runtime HTTP API.
+
+Install desktop dependencies first on Linux hosts:
+
+```bash
+sandbox-agent install desktop --yes
+```
+
+Then query status, surface remediation if needed, and start the runtime:
+
+```ts
+const status = await sdk.getDesktopStatus();
+
+if (status.state === "install_required") {
+  console.log(status.installCommand);
+}
+
+const started = await sdk.startDesktop({
+  width: 1440,
+  height: 900,
+  dpi: 96,
+});
+
+const screenshot = await sdk.takeDesktopScreenshot();
+const displayInfo = await sdk.getDesktopDisplayInfo();
+
+await sdk.moveDesktopMouse({ x: 400, y: 300 });
+await sdk.clickDesktop({ x: 400, y: 300, button: "left", clickCount: 1 });
+await sdk.typeDesktopText({ text: "hello world", delayMs: 10 });
+await sdk.pressDesktopKey({ key: "ctrl+l" });
+
+await sdk.stopDesktop();
+```
+
+Screenshot helpers return `Uint8Array` PNG bytes. The SDK does not attempt to install OS packages remotely; callers should surface `missingDependencies` and `installCommand` from `getDesktopStatus()`.
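+
+Because the helpers hand back raw bytes, it can be useful to sanity-check them before writing them to disk or forwarding them. The sketch below relies only on the PNG file layout (the fixed 8-byte signature, then the IHDR chunk with width and height as big-endian 32-bit integers at byte offsets 16 and 20). The names `looksLikePng` and `readPngSize` are hypothetical helpers for illustration, not SDK functions.
+
+```ts
+// Hypothetical helpers (not part of the sandbox-agent SDK).
+// Every PNG file starts with this fixed 8-byte signature.
+const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
+
+function looksLikePng(bytes: Uint8Array): boolean {
+  return (
+    bytes.length >= PNG_SIGNATURE.length &&
+    PNG_SIGNATURE.every((value, index) => bytes[index] === value)
+  );
+}
+
+// The IHDR chunk follows the signature; width and height are big-endian
+// uint32 values at byte offsets 16 and 20.
+function readPngSize(
+  bytes: Uint8Array,
+): { width: number; height: number } | null {
+  if (!looksLikePng(bytes) || bytes.length < 24) return null;
+  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
+  return { width: view.getUint32(16), height: view.getUint32(20) };
+}
+```
+
+Passing the result of `takeDesktopScreenshot()` to `readPngSize` would then give the image dimensions, assuming the returned bytes are a complete PNG file as described above.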
+
+## Error handling
+
+```ts
+import { SandboxAgentError } from "sandbox-agent";
+
+try {
+  await sdk.listAgents();
+} catch (error) {
+  if (error instanceof SandboxAgentError) {
+    console.error(error.status, error.problem);
+  }
+}
+```
+
+## Inspector URL
+
+```ts
+import { buildInspectorUrl } from "sandbox-agent";
+
+const url = buildInspectorUrl({
+  baseUrl: "https://your-sandbox-agent.example.com",
+  headers: { "X-Custom-Header": "value" },
+});
+
+console.log(url);
+```
+
+Parameters:
+
+- `baseUrl` (required unless `fetch` is provided): Sandbox Agent server URL
+- `token` (optional): Bearer token for authenticated servers
+- `headers` (optional): Additional request headers
+- `fetch` (optional): Custom fetch implementation used by SDK HTTP and session calls
+- `skipHealthCheck` (optional): set `true` to skip the startup `/v1/health` wait
+- `waitForHealth` (optional, defaults to enabled): waits for `/v1/health` before HTTP helpers and session setup proceed; pass `false` to disable or `{ timeoutMs }` to bound the wait
+- `signal` (optional): aborts the startup `/v1/health` wait used by `connect()`
+
+## LLM credentials
+
+Sandbox Agent supports personal API keys, shared organization keys, and per-tenant gateway keys with budget enforcement. See [LLM Credentials](/llm-credentials) for setup details.
--- a/docs/sdks/typescript.mdx
+++ b/docs/sdks/typescript.mdx
@ -1,150 +0,0 @@
---
-title: "TypeScript SDK"
-description: "Use the generated client to manage sessions and stream events."
---
-
-The TypeScript SDK is generated from the OpenAPI spec that ships with the server. It provides a typed
-client for sessions, events, and agent operations.
-
-## Install
-
-```bash
-npm install sandbox-agent
-```
-
-## Create a client
-
-```ts
-import { SandboxAgent } from "sandbox-agent";
-
-const client = await SandboxAgent.connect({
-  baseUrl: "http://127.0.0.1:2468",
-  token: process.env.SANDBOX_TOKEN,
-});
-```
-
-## Autospawn (Node only)
-
-If you run locally, the SDK can launch the server for you.
-
-```ts
-import { SandboxAgent } from "sandbox-agent";
-
-const client = await SandboxAgent.start();
-
-await client.dispose();
-```
-
-Autospawn uses the local `sandbox-agent` binary. Install `@sandbox-agent/cli` (recommended) or set
-`SANDBOX_AGENT_BIN` to a custom path.
-
-## Sessions and messages
-
-```ts
-await client.createSession("demo-session", {
-  agent: "codex",
-  agentMode: "default",
-  permissionMode: "plan",
-});
-
-await client.postMessage("demo-session", { message: "Hello" });
-```
-
-List agents and pick a compatible one:
-
-```ts
-const agents = await client.listAgents();
-const codex = agents.agents.find((agent) => agent.id === "codex");
-console.log(codex?.capabilities);
-```
-
-## Poll events
-
-```ts
-const events = await client.getEvents("demo-session", {
-  offset: 0,
-  limit: 200,
-  includeRaw: false,
-});
-
-for (const event of events.events) {
-  console.log(event.type, event.data);
-}
-```
-
-## Stream events (SSE)
-
-```ts
-for await (const event of client.streamEvents("demo-session", {
-  offset: 0,
-  includeRaw: false,
-})) {
-  console.log(event.type, event.data);
-}
-```
-
-The SDK parses `text/event-stream` into `UniversalEvent` objects. If you want full control, use
-`getEventsSse()` and parse the stream yourself.
-
-## Stream a single turn
-
-```ts
-for await (const event of client.streamTurn("demo-session", { message: "Hello" })) {
-  console.log(event.type, event.data);
-}
-```
-
-This method posts the message and streams only the next turn. For manual control, call
-`postMessageStream()` and parse the SSE response yourself.
-
-## Optional raw payloads
-
-Set `includeRaw: true` on `getEvents`, `streamEvents`, or `streamTurn` to include the raw provider
-payload in `event.raw`. This is useful for debugging and conversion analysis.
-
-## Error handling
-
-All HTTP errors throw `SandboxAgentError`:
-
-```ts
-import { SandboxAgentError } from "sandbox-agent";
-
-try {
-  await client.postMessage("missing-session", { message: "Hi" });
-} catch (error) {
-  if (error instanceof SandboxAgentError) {
-    console.error(error.status, error.problem);
-  }
-}
-```
-
-## Inspector URL
-
-Build a URL to open the sandbox-agent Inspector UI with pre-filled connection settings:
-
-```ts
-import { buildInspectorUrl } from "sandbox-agent";
-
-const url = buildInspectorUrl({
-  baseUrl: "https://your-sandbox-agent.example.com",
-  token: "optional-bearer-token",
-  headers: { "X-Custom-Header": "value" },
-});
-console.log(url);
-// https://inspect.sandboxagent.dev?url=https%3A%2F%2Fyour-sandbox-agent.example.com&token=...&headers=...
-```
-
-Parameters:
- `baseUrl` (required): The sandbox-agent server URL
- `token` (optional): Bearer token for authentication
- `headers` (optional): Extra headers to pass to the server (JSON-encoded in the URL)
-
-## Types
-
-The SDK exports OpenAPI-derived types for events, items, and capabilities:
-
-```ts
-import type { UniversalEvent, UniversalItem, AgentCapabilities } from "sandbox-agent";
-```
-
-See `docs/universal-api.mdx` for the universal schema fields and semantics.
--- a/docs/security.mdx
+++ b/docs/security.mdx
@ -0,0 +1,191 @@
+---
+title: "Security"
+description: "Backend-first auth and access control patterns."
+icon: "shield"
+---
+
+As covered in [Orchestration Architecture](/orchestration-architecture), run the Sandbox Agent client on your backend, not in the browser.
+
+This keeps sandbox credentials private and gives you one place for authz, rate limiting, and audit logging.
+
+## Auth model
+
+Implement auth however it fits your stack (sessions, JWT, API keys, etc.), but enforce it before any sandbox-bound request.
+
+Minimum checks:
+
+- Authenticate the caller.
+- Authorize access to the target workspace/sandbox/session.
+- Apply request rate limits and request logging.
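+
+As a rough sketch of how these checks can be ordered, the helper below rejects unauthenticated callers first, then unauthorized ones, then callers over a simple fixed-window rate limit. Everything here is hypothetical and for illustration only: the `Caller` shape, the `checkRequest` and `allowRequest` names, and the limit of 30 requests per minute are assumptions, and request logging is omitted for brevity.
+
+```ts
+// Hypothetical types and limits; adapt to your own auth stack.
+type Caller = { userId: string; workspaceIds: string[] };
+
+const WINDOW_MS = 60_000;
+const MAX_REQUESTS = 30;
+const windows = new Map<string, { start: number; count: number }>();
+
+// Fixed-window limiter: at most MAX_REQUESTS per user per WINDOW_MS.
+function allowRequest(userId: string, now: number = Date.now()): boolean {
+  const current = windows.get(userId);
+  if (!current || now - current.start >= WINDOW_MS) {
+    windows.set(userId, { start: now, count: 1 });
+    return true;
+  }
+  current.count += 1;
+  return current.count <= MAX_REQUESTS;
+}
+
+// Returns an HTTP status to reject with, or null when the request may
+// proceed to the sandbox.
+function checkRequest(
+  caller: Caller | null,
+  workspaceId: string,
+  now: number = Date.now(),
+): 401 | 403 | 429 | null {
+  if (!caller) return 401; // not authenticated
+  if (!caller.workspaceIds.includes(workspaceId)) return 403; // not authorized
+  if (!allowRequest(caller.userId, now)) return 429; // rate limited
+  return null;
+}
+```
+
+Running the authorization check before the rate limiter means rejected requests do not consume a caller's quota; whether that is desirable depends on your threat model.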
+
+## Examples
+
+### Rivet
+
+<CodeGroup>
+
+```ts Actor (server)
+import { UserError, actor } from "rivetkit";
+import { SandboxAgent } from "sandbox-agent";
+
+type ConnParams = {
+  accessToken: string;
+};
+
+type WorkspaceClaims = {
+  sub: string;
+  workspaceId: string;
+  role: "owner" | "member" | "viewer";
+};
+
+async function verifyWorkspaceToken(
+  token: string,
+  workspaceId: string,
+): Promise<WorkspaceClaims | null> {
+  // Validate JWT/session token here, then enforce workspace scope.
+  // Return null when invalid/expired/not a member.
+  if (!token) return null;
+  return { sub: "user_123", workspaceId, role: "member" };
+}
+
+export const workspace = actor({
+  state: {
+    events: [] as Array<{ userId: string; prompt: string; createdAt: number }>,
+  },
+
+  onBeforeConnect: async (c, params: ConnParams) => {
+    const claims = await verifyWorkspaceToken(params.accessToken, c.key[0]);
+    if (!claims) {
+      throw new UserError("Forbidden", { code: "forbidden" });
+    }
+  },
+
+  createConnState: async (c, params: ConnParams) => {
+    const claims = await verifyWorkspaceToken(params.accessToken, c.key[0]);
+    if (!claims) {
+      throw new UserError("Forbidden", { code: "forbidden" });
+    }
+
+    return {
+      userId: claims.sub,
+      role: claims.role,
+      workspaceId: claims.workspaceId,
+    };
+  },
+
+  actions: {
+    submitPrompt: async (c, prompt: string) => {
+      if (!c.conn) {
+        throw new UserError("Connection required", { code: "connection_required" });
+      }
+
+      if (c.conn.state.role === "viewer") {
+        throw new UserError("Insufficient permissions", { code: "forbidden" });
+      }
+
+      // Connect to Sandbox Agent from the actor (server-side only).
+      // Sandbox credentials never reach the client.
+      const sdk = await SandboxAgent.connect({
+        baseUrl: process.env.SANDBOX_URL!,
+        token: process.env.SANDBOX_TOKEN,
+      });
+
+      const session = await sdk.createSession({
+        agent: "claude",
+        cwd: "/workspace",
+      });
+
+      // Forward live events to connected clients. Keep the unsubscribe handle
+      // so the listener is removed once this prompt finishes.
+      const unsubscribe = session.onEvent((event) => {
+        c.broadcast("session.event", {
+          userId: c.conn!.state.userId,
+          eventIndex: event.eventIndex,
+          sender: event.sender,
+          payload: event.payload,
+        });
+      });
+
+      try {
+        const result = await session.prompt([
+          { type: "text", text: prompt },
+        ]);
+
+        c.state.events.push({
+          userId: c.conn.state.userId,
+          prompt,
+          createdAt: Date.now(),
+        });
+
+        return { stopReason: result.stopReason };
+      } finally {
+        unsubscribe();
+      }
+    },
+  },
+});
+```
+
+```ts Client (browser)
+import { createClient } from "rivetkit/client";
+import type { registry } from "./actors";
+
+const client = createClient<typeof registry>({
+  endpoint: process.env.NEXT_PUBLIC_RIVET_ENDPOINT!,
+});
+
+const handle = client.workspace.getOrCreate(["ws_123"], {
+  params: { accessToken: userJwt },
+});
+
+const conn = handle.connect();
+
+conn.on("session.event", (event) => {
+  console.log(event.sender, event.payload);
+});
+
+const result = await conn.submitPrompt("Plan a refactor for auth middleware.");
+console.log(result.stopReason);
+```
+
+</CodeGroup>
+
+Use [onBeforeConnect](https://rivet.dev/docs/actors/authentication), [connection params](https://rivet.dev/docs/actors/connections), and [actor keys](https://rivet.dev/docs/actors/keys) together so each actor enforces auth per workspace.
+
+### Hono
+
+```ts
+import { Hono } from "hono";
+import { bearerAuth } from "hono/bearer-auth";
+
+const app = new Hono();
+
+app.use("/sandbox/*", bearerAuth({ token: process.env.APP_API_TOKEN! }));
+
+app.all("/sandbox/*", async (c) => {
+  const incoming = new URL(c.req.url);
+  const upstreamUrl = new URL(process.env.SANDBOX_URL!);
+  upstreamUrl.pathname = incoming.pathname.replace(/^\/sandbox/, "/v1");
+  upstreamUrl.search = incoming.search;
+
+  const headers = new Headers();
+  headers.set("authorization", `Bearer ${process.env.SANDBOX_TOKEN ?? ""}`);
+
+  const accept = c.req.header("accept");
+  if (accept) headers.set("accept", accept);
+
+  const contentType = c.req.header("content-type");
+  if (contentType) headers.set("content-type", contentType);
+
+  const body =
+    c.req.method === "POST" || c.req.method === "PUT" || c.req.method === "PATCH"
+      ? await c.req.text()
+      : undefined;
+
+  const upstream = await fetch(upstreamUrl, {
+    method: c.req.method,
+    headers,
+    body,
+  });
+
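+  // fetch() has already decoded the upstream body, so drop the headers that
+  // describe the original encoding and length before passing it through.
+  const responseHeaders = new Headers(upstream.headers);
+  responseHeaders.delete("content-encoding");
+  responseHeaders.delete("content-length");
+
+  return new Response(upstream.body, {
+    status: upstream.status,
+    headers: responseHeaders,
+  });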
+});
+```
+
--- a/docs/session-persistence.mdx
+++ b/docs/session-persistence.mdx
@ -0,0 +1,121 @@
+---
+title: "Persisting Sessions"
+description: "Choose and configure session persistence for the TypeScript SDK."
+icon: "database"
+---
+
+The TypeScript SDK uses a `SessionPersistDriver` to store session records and event history.
+If you do not provide one, the SDK uses in-memory storage.
+With persistence enabled, sessions can be restored after runtime/session loss. See [Session Restoration](/session-restoration).
+
+Each driver stores:
+
+- `SessionRecord` (`id`, `agent`, `agentSessionId`, `lastConnectionId`, `createdAt`, optional `destroyedAt`, optional `sandboxId`, optional `sessionInit`, optional `configOptions`, optional `modes`)
+- `SessionEvent` (`id`, `eventIndex`, `sessionId`, `connectionId`, `sender`, `payload`, `createdAt`)
+
+## Persistence drivers
+
+### Rivet
+
+Recommended for sandbox orchestration with actor state. See [Multiplayer](/multiplayer) for a full Rivet actor example with persistence in actor state.
+
+### IndexedDB (browser)
+
+Best for browser apps that should survive reloads. See the [Inspector source](https://github.com/rivet-dev/sandbox-agent/tree/main/frontend/packages/inspector/src/persist-indexeddb.ts) for a complete IndexedDB driver you can copy into your project.
+
+### In-memory (built-in)
+
+Best for local dev and ephemeral workloads. No extra dependencies required.
+
+```ts
+import { InMemorySessionPersistDriver, SandboxAgent } from "sandbox-agent";
+
+const persist = new InMemorySessionPersistDriver({
+  maxSessions: 1024,
+  maxEventsPerSession: 500,
+});
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  persist,
+});
+```
+
+### SQLite
+
+Best for local/server Node apps that need durable storage without a DB server.
+
+```bash
+npm install better-sqlite3
+```
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+import { SQLiteSessionPersistDriver } from "./persist.ts";
+
+const persist = new SQLiteSessionPersistDriver({
+  filename: "./sandbox-agent.db",
+});
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  persist,
+});
+```
+
+See the [full SQLite example](https://github.com/rivet-dev/sandbox-agent/tree/main/examples/persist-sqlite) for the complete driver implementation you can copy into your project.
+
+### Postgres
+
+Use when you already run Postgres and want shared relational storage.
+
+```bash
+npm install pg
+```
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+import { PostgresSessionPersistDriver } from "./persist.ts";
+
+const persist = new PostgresSessionPersistDriver({
+  connectionString: process.env.DATABASE_URL,
+  schema: "public",
+});
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+  persist,
+});
+```
+
+See the [full Postgres example](https://github.com/rivet-dev/sandbox-agent/tree/main/examples/persist-postgres) for the complete driver implementation you can copy into your project.
+
+### Custom driver
+
+Implement `SessionPersistDriver` for custom backends.
+
+```ts
+import type { SessionPersistDriver } from "sandbox-agent";
+
+class MyDriver implements SessionPersistDriver {
+  async getSession(id) { return undefined; }
+  async listSessions(request) { return { items: [] }; }
+  async updateSession(session) {}
+  async listEvents(request) { return { items: [] }; }
+  async insertEvent(sessionId, event) {}
+}
+```
+
+## Replay controls
+
+`SandboxAgent.connect(...)` supports:
+
+- `replayMaxEvents` (default `50`)
+- `replayMaxChars` (default `12000`)
+
+These cap replay size when restoring sessions.
+
+## Related docs
+
+- [SDK Overview](/sdk-overview)
+- [Session Restoration](/session-restoration)
--- a/docs/session-restoration.mdx
+++ b/docs/session-restoration.mdx
@ -0,0 +1,33 @@
+---
+title: "Session Restoration"
+description: "How the TypeScript SDK restores sessions after connection/runtime loss."
+---
+
+Sandbox Agent automatically restores stale sessions when live session state is no longer available.
+
+This is driven by the configured `SessionPersistDriver` (in-memory, Rivet, IndexedDB, SQLite, Postgres, or custom).
+
+## How Auto-Restore Works
+
+When you call `session.prompt(...)` (or `resumeSession(...)`) and the saved session points to a stale connection, the SDK:
+
+1. Recreates a fresh session for the same local session id.
+2. Rebinds the local session to the new runtime session id.
+3. Replays recent persisted events into the next prompt as context.
+
+This happens automatically; you do not need to manually rebuild the session.
+
+## Replay Limits
+
+Replay payload size is capped by:
+
+- `replayMaxEvents` (default `50`)
+- `replayMaxChars` (default `12000`)
+
+These controls limit prompt growth during restore while preserving recent context.
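+
+One way to picture how the two limits interact is the sketch below. It is not the SDK's implementation: the `ReplayEvent` shape and the `selectReplayEvents` helper are hypothetical, and it assumes the caps are applied newest-first so that the most recent context survives.
+
+```ts
+// Hypothetical event shape and helper; not part of the sandbox-agent SDK.
+type ReplayEvent = { eventIndex: number; text: string };
+
+function selectReplayEvents(
+  events: ReplayEvent[],
+  replayMaxEvents: number = 50,
+  replayMaxChars: number = 12000,
+): ReplayEvent[] {
+  const selected: ReplayEvent[] = [];
+  let chars = 0;
+
+  // Walk backwards so the most recent events are considered first.
+  for (let i = events.length - 1; i >= 0; i--) {
+    const event = events[i];
+    if (selected.length >= replayMaxEvents) break;
+    if (chars + event.text.length > replayMaxChars) break;
+    selected.push(event);
+    chars += event.text.length;
+  }
+
+  // Restore chronological order before replaying.
+  return selected.reverse();
+}
+```
+
+With the defaults, a session holding 100 short events would replay only the last 50, and a session with a few very large events would stop as soon as the character budget is exhausted.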
+
+## Related Docs
+
+- [SDK Overview](/sdk-overview)
+- [Persisting Sessions](/session-persistence)
+- [Agent Sessions](/agent-sessions)
--- a/docs/skills-config.mdx
+++ b/docs/skills-config.mdx
@ -0,0 +1,79 @@
+---
+title: "Skills"
+description: "Configure skill sources for agent sessions."
+sidebarTitle: "Skills"
+icon: "sparkles"
+---
+
+Skills are local instruction bundles stored in `SKILL.md` files.
+
+## Configuring skills
+
+Use `setSkillsConfig`, `getSkillsConfig`, and `deleteSkillsConfig` to manage skill source configuration, keyed by directory and skill name.
+
+```ts
+import { SandboxAgent } from "sandbox-agent";
+
+const sdk = await SandboxAgent.connect({
+  baseUrl: "http://127.0.0.1:2468",
+});
+
+// Set the skill sources for this directory and skill name
+await sdk.setSkillsConfig(
+  {
+    directory: "/workspace",
+    skillName: "default",
+  },
+  {
+    sources: [
+      { type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
+      { type: "local", source: "/workspace/my-custom-skill" },
+    ],
+  },
+);
+
+// Create a session using the configured skills
+const session = await sdk.createSession({
+  agent: "claude",
+  cwd: "/workspace",
+});
+
+await session.prompt([
+  { type: "text", text: "Use available skills to help with this task." },
+]);
+
+// Read the stored skills config back
+const config = await sdk.getSkillsConfig({
+  directory: "/workspace",
+  skillName: "default",
+});
+
+console.log(config.sources.length);
+
+// Delete the skills config
+await sdk.deleteSkillsConfig({
+  directory: "/workspace",
+  skillName: "default",
+});
+```
+
+## Skill sources
+
+Each entry in the config's `sources` array describes where to find skills.
+
+| Type | `source` value | Example |
+|------|---------------|---------|
+| `github` | `owner/repo` | `"rivet-dev/skills"` |
+| `local` | filesystem path | `"/workspace/my-skill"` |
+| `git` | git clone URL | `"https://git.example.com/skills.git"` |
+
+Optional fields:
+
+- `skills`: subset of skill directory names to include
+- `ref`: branch/tag/commit (for `github` and `git`)
+- `subpath`: subdirectory within repo to scan
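+
+Put together, a source entry with the optional fields might look like the sketch below. The `SkillSource` type name and the `checkSkillSource` helper are hypothetical (the SDK may export its own types), and the `ref` and `subpath` values are illustrative; only the field names come from the table and list above.
+
+```ts
+// Hypothetical type mirroring the fields documented above.
+type SkillSource = {
+  type: "github" | "local" | "git";
+  source: string;
+  skills?: string[];
+  ref?: string;
+  subpath?: string;
+};
+
+// Returns a list of problems; an empty list means the entry looks usable.
+function checkSkillSource(entry: SkillSource): string[] {
+  const problems: string[] = [];
+  if (entry.source.trim() === "") {
+    problems.push("source must not be empty");
+  }
+  if (entry.ref !== undefined && entry.type === "local") {
+    problems.push("ref applies only to github and git sources");
+  }
+  return problems;
+}
+
+// Illustrative entry: pin a GitHub source to a branch and scan one subdirectory.
+const pinned: SkillSource = {
+  type: "github",
+  source: "rivet-dev/skills",
+  skills: ["sandbox-agent"],
+  ref: "main",
+  subpath: "skills",
+};
+
+console.log(checkSkillSource(pinned).length === 0 ? "ok" : "invalid");
+```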
+
+## Custom skills
+
+To write, upload, and configure your own skills inside the sandbox, see [Custom Tools](/custom-tools).
--- a/docs/telemetry.mdx
+++ b/docs/telemetry.mdx
@ -24,8 +24,3 @@ Disable it with:
 sandbox-agent server --no-telemetry
 ```

-Debug builds disable telemetry automatically. You can opt in with:
-
-```bash
-SANDBOX_AGENT_TELEMETRY_DEBUG=1 sandbox-agent server
-```