Compare commits

159 commits

Author SHA1 Message Date
Nathan Flurry
bf484e7c96 docs: clean up orphaned docs and add session event types
Delete orphaned docs not in docs.json navigation (gigacode.mdx,
foundry-self-hosting.mdx, session-transcript-schema.mdx, pi-support-plan.md).
Remove outdated musl/glibc troubleshooting section. Add event types
documentation with example payloads to agent-sessions.mdx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 19:11:19 -07:00
Nathan Flurry
d55b0dfb88 chore(release): update version to 0.4.2 2026-03-25 18:07:26 -07:00
ABC
251f731232
Merge pull request #284 from rivet-dev/03-25-fix_mock_pass_sandbox_agent_bin_to_mock_agent_launcher
fix(mock): pass SANDBOX_AGENT_BIN to mock agent launcher
2026-03-25 17:00:51 -04:00
abcxff
b45989a082 fix(mock): pass SANDBOX_AGENT_BIN to mock agent launcher 2026-03-25 16:54:40 -04:00
Nathan Flurry
78e84281e8 chore(release): update version to 0.4.1 2026-03-25 13:20:57 -07:00
Nathan Flurry
5da35e6dfa feat: sprites support 2026-03-25 12:23:14 -07:00
ABC
9cd9252725
Merge pull request #283 from rivet-dev/03-25-chore_providers_move_back_to_0.4.x_install_script
chore(providers): sync install script with latest 0.4.x
2026-03-25 14:24:55 -04:00
abcxff
858b9a4d2f chore(providers): move back to 0.4.x install script 2026-03-25 14:22:57 -04:00
Nathan Flurry
4fa28061e9
Merge pull request #279 from rivet-dev/NicholasKissel/docs-dark-theme
fix(docs): restore dark theme styling
2026-03-24 23:26:28 -07:00
ABCxFF
cb42971b56 chore(release): update version to 0.5.0-rc.2 2026-03-25 05:13:47 +00:00
ABC
e9fabbfe64
fix: surface agent stderr in RPC errors & add defaultCwd param (#278) 2026-03-25 00:49:35 -04:00
ABC
32dd5914ed
Merge pull request #269 from rivet-dev/e2b-base-image-support
feat(providers): add base image support and improve forward compatibility
2026-03-25 00:42:49 -04:00
ABC
fe8fbfc91c
Merge branch 'main' into e2b-base-image-support 2026-03-25 00:37:58 -04:00
Nicholas Kissel
32713ff453 fix(docs): keep dark mode strict appearance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 21:33:10 -07:00
abcxff
833b57deb1 fix: surface agent stderr in RPC errors and default cwd for remote providers 2026-03-25 04:26:48 +00:00
Nicholas Kissel
927e77c7e2 fix(docs): restore dark theme styling with custom CSS
Re-enable theme.css with full custom styling (links, inputs, cards,
code blocks, alerts) and update docs.json color values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 21:25:06 -07:00
Nathan Flurry
f353e39fc6
Merge pull request #273 from Crunchyman-ralph/fix/update-install-script-to-0.4.x
fix: update install script URL from 0.3.x to 0.4.x
2026-03-19 12:45:00 -07:00
Ralph Khreish
3525dcc315
fix: update install script URL from 0.3.x to 0.4.x
The E2B and Vercel providers install sandbox-agent 0.3.x inside sandboxes
while the SDK client speaks 0.4.0 ACP protocol, causing AcpRpcError -32603.

Fixes #272
2026-03-19 17:45:16 +01:00
Nathan Flurry
7b23e519c2 fix(foundry): add Bun idleTimeout safety net and subscription retry with backoff
Bun.serve() defaults to a 10s idle timeout that can kill long-running
requests. Actor RPCs go through the gateway tunnel with a 1s SSE ping,
so this likely never fires, but set idleTimeout to 255 as a safety net.

Subscription topics (app, org, session, task) previously had no retry
mechanism. If the initial connection or a mid-session error occurred,
the subscription stayed in error state permanently. Add exponential
backoff retry (1s base, 30s max) that cleans up the old connection
before each attempt and stops when disposed or no listeners remain.
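
The retry schedule described above (1s base, 30s max) can be sketched as follows. This is a minimal illustration, not the Foundry code: the doubling factor and the names `backoffDelayMs`, `shouldRetry`, and `SubscriptionRetryState` are assumptions, since the commit states only the base delay, the cap, and the stop conditions.

```typescript
// Values taken from the commit message: 1s base delay, 30s cap.
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;

// Delay before retry number `attempt` (0-based). Doubling per attempt is an
// assumption; the commit only says "exponential".
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
}

// Hypothetical view of a subscription's state.
interface SubscriptionRetryState {
  disposed: boolean;
  listenerCount: number;
}

// Retrying stops once the subscription is disposed or nobody is listening.
function shouldRetry(state: SubscriptionRetryState): boolean {
  return !state.disposed && state.listenerCount > 0;
}
```

Under these assumptions the delays run 1s, 2s, 4s, 8s, 16s, and then stay at the 30s cap.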

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 18:35:36 -07:00
Nathan Flurry
bea3b58199 fix(foundry): use $HOME instead of hardcoded /home/sandbox for sandbox repo paths
E2B sandboxes run as `user` (home: /home/user), not `sandbox`, so
`mkdir -p /home/sandbox` fails with "Permission denied". Replace all
hardcoded `/home/sandbox` paths with `$HOME` resolved at shell runtime
inside the sandbox, and dynamically resolve the repo CWD via the sandbox
actor so it works across providers (E2B, local Docker, Daytona).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:33:53 -07:00
Nathan Flurry
524f40ec02 feat(providers): simplify modal to use published base image
The `-full` base image already includes sandbox-agent and all agents
pre-installed. Remove redundant apt-get, install script, and
install-agent dockerfile commands from the Modal provider.

Also allow overriding the default image via SANDBOX_AGENT_IMAGE env var
across all providers for testing with different published versions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 16:54:20 -07:00
Nathan Flurry
4e76038a0d feat(providers): add base image support and improve forward compatibility
Add support for configuring base images across all compute providers:
- E2B: Accept optional `template` parameter to select custom templates
- Modal: Accept optional `image` parameter (string or Image object) for base images
- ComputeSDK: Expand `create` override to accept full CreateSandboxOptions payload (image, templateId, etc.)
- Daytona: Improve type safety for `image` option

Improve forward compatibility by making all `create` overrides accept full Partial SDK types, allowing any new provider fields to flow through without code changes. Fix Modal provider bug where `encryptedPorts` was hardcoded and would clobber user-provided values; now merges additional ports instead.
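
The `encryptedPorts` fix amounts to merging rather than overwriting. A rough sketch of that idea, assuming a hypothetical helper `mergeEncryptedPorts` and an illustrative port number (neither appears in the commit, and de-duplication is an assumption):

```typescript
// Merge the ports the provider itself needs with the ports the caller
// supplied, preserving order and dropping duplicates.
function mergeEncryptedPorts(providerPorts: number[], userPorts: number[] = []): number[] {
  const merged: number[] = [];
  for (const port of [...providerPorts, ...userPorts]) {
    if (merged.indexOf(port) === -1) merged.push(port);
  }
  return merged;
}

// 3000 stands in for whatever port the provider exposes; it is illustrative only.
const mergedPorts = mergeEncryptedPorts([3000], [8080, 3000]);
```

With a hardcoded value, the caller's `8080` would have been lost; with a merge, both ports survive.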

Update docs and examples to demonstrate base image configuration for E2B, Modal, and ComputeSDK. Add comprehensive provider lifecycle tests for Modal and ComputeSDK, including template and image passthrough verification.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-17 15:25:21 -07:00
Nathan Flurry
ffb9f1082b fix(foundry): fix runner version 2026-03-17 14:33:13 -07:00
Nathan Flurry
f25a92aca8 chore(release): update version to 0.5.0-rc.1 2026-03-17 02:44:41 -07:00
Nathan Flurry
3b8c74589d
Merge pull request #264 from rivet-dev/desktop-computer-use-neko
feat: desktop computer-use APIs with neko streaming
2026-03-17 02:36:50 -07:00
Nathan Flurry
dff7614b11 feat: desktop computer-use APIs with windows, launch/open, and neko streaming
Adds desktop computer-use endpoints (windows, screenshots, mouse/keyboard,
launch/open), enhances neko-based streaming integration, updates inspector
UI with desktop debug tab, and adds common software test infrastructure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 02:36:25 -07:00
Nathan Flurry
2d8508d6e2 feat: enhance desktop computer-use streaming with neko integration
Improve desktop streaming architecture, add inspector dev tooling,
React DesktopViewer updates, and computer-use documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 02:36:25 -07:00
Nathan Flurry
4252c705df chore: remove .context/ from git and add to .gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 02:36:25 -07:00
Nathan Flurry
33821d8660 feat: desktop computer-use APIs with neko-based streaming
Add desktop runtime management (Xvfb, openbox, dbus), screen capture,
mouse/keyboard input, and video streaming via neko binary extracted
from the m1k1o/neko container. Includes Docker test rig, TypeScript SDK
desktop support, and inspector Desktop tab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 02:36:17 -07:00
Nathan Flurry
3895e34bdb feat(foundry): add foundry base sandbox image with sudo, chromium, and dev tooling
Add a custom Docker image (foundry-base.Dockerfile) that builds sandbox-agent
from source and layers sudo, git, neovim, gh, node, bun, chromium, and
agent-browser. Includes publish script for timestamped + latest tags to
rivetdev/sandbox-agent on Docker Hub.

Update local sandbox provider default to use foundry-base-latest and wire
HF_LOCAL_SANDBOX_IMAGE env var through compose.dev.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 02:09:12 -07:00
Nathan Flurry
eafe0f9fe4 fix(foundry): use IF NOT EXISTS in org migration to handle pre-existing auth tables
Some org actors had auth tables created outside the migration system
(by earlier queue-based auth code). Migration m0001 fails with
"table auth_session_index already exists" on those actors, preventing
them from starting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 22:45:34 -07:00
Nathan Flurry
6ebe13cddd fix(foundry): use cookie-based OAuth state to prevent proxy retry auth failures
Switch storeStateStrategy from "database" to "cookie" so OAuth state is
stored encrypted in a temporary cookie instead of a DB verification record.
This makes the callback idempotent — proxy retries can't fail because the
state travels with the request itself rather than being deleted after the
first successful callback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 22:37:54 -07:00
Nathan Flurry
8ddec6831b fix(foundry): deduplicate OAuth callbacks and cache actor handles to fix production auth
The production proxy chain (Cloudflare -> Fastly -> Railway) retries
OAuth callback requests when they take >10s. The first request succeeds
and deletes the verification record, so the retry fails with
"verification not found" -> ?error=please_restart_the_process.

- Add callback deduplication by OAuth state param in the auth handler.
  Duplicate requests wait for the original and return a cloned response.
- Cache appOrganization() and getUser() actor handles to eliminate
  redundant getOrCreate RPCs during callbacks (was 10+ per sign-in).
- Add diagnostic logging for auth callback timing and adapter operations.
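
The deduplication in the first bullet can be pictured as an in-flight map keyed by the OAuth `state` parameter. The sketch below is an assumption about the general shape, not the actual handler; `CallbackDeduper` and its `run` method are hypothetical names.

```typescript
// Tracks in-flight callbacks keyed by the OAuth `state` parameter.
class CallbackDeduper<T> {
  private inFlight = new Map<string, Promise<T>>();

  run(state: string, handle: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(state);
    if (existing !== undefined) return existing; // duplicate waits on the original

    const pending = handle().then(
      (value) => {
        this.inFlight.delete(state); // done: later requests start fresh
        return value;
      },
      (error) => {
        this.inFlight.delete(state); // failed: do not cache the failure
        throw error;
      },
    );
    this.inFlight.set(state, pending);
    return pending;
  }
}
```

In an HTTP handler, `T` would be the response, and each duplicate would need its own copy (for example via `Response.clone()`), because a response body can be read only once.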

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 22:29:17 -07:00
Nathan Flurry
4ca77e4d83 Merge remote-tracking branch 'origin/main' into fix-foundry-auth-error 2026-03-16 21:26:25 -07:00
Nathan Flurry
e7b9ac6854 fix(foundry): move Better Auth operations from queues to actions to fix production auth timeout
The org actor's workflow queue is shared with GitHub sync, webhooks, task
mutations, and billing (20+ queue names processed sequentially). During
OAuth callback, auth operations would time out waiting behind long-running
queue handlers, causing Better Auth's parseState to redirect to
?error=please_restart_the_process.

Auth operations are simple SQLite reads/writes with no cross-actor side
effects, so they are safe to run as actions that execute immediately
without competing in the queue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 21:26:13 -07:00
Nathan Flurry
eab215c7cb feat(foundry): redirect to signin page on auth API errors
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 19:34:16 -07:00
Nathan Flurry
84a80d59d7
Merge pull request #265 from rivet-dev/revert-actions-to-queues
feat(foundry): revert actions to queue/workflow pattern
2026-03-16 18:48:21 -07:00
Nathan Flurry
a171956298 feat(foundry): revert actions to queue/workflow pattern with direct sends
Revert actor communication from direct action calls to queue/workflow-based
patterns for better observability (workflow history in RivetKit inspector),
replay/recovery semantics, and idiomatic RivetKit usage.

- Add queue/workflow infrastructure to all actors: organization, task, user,
  github-data, sandbox, and audit-log
- Mutations route through named queues processed by workflow command loops
  with ctx.step() wrapping for c.state/c.db access and observability
- Remove command action wrappers (~460 lines) — callers use .send() directly
  to queue names with expectQueueResponse() for wait:true results
- Keep sendPrompt and runProcess as direct sandbox actions (long-running /
  large responses that would block the workflow loop or exceed 128KB limit)
- Fix workspace fire-and-forget calls (enqueueWorkspaceEnsureSession,
  enqueueWorkspaceRefresh) to self-send to task queue instead of calling
  directly outside workflow step context

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:46:53 -07:00
Nathan Flurry
4111aebfce
feat(foundry): task owner git auth + manual owner change UI (#263)
* Add task owner git auth proposal and sandbox architecture docs

- Add proposal for primary user per task with OAuth token injection
  for sandbox git operations (.context/proposal-task-owner-git-auth.md)
- Document sandbox architecture constraints in CLAUDE.md: single sandbox
  per task assumption, OAuth token security implications, git auto-auth
  requirement, and git error surfacing rules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add proposals for reverting to queues and rivetkit sandbox resilience

- proposal-revert-actions-to-queues.md: Detailed plan for reverting the
  actions-only pattern back to queues/workflows now that the RivetKit
  queue.iter() bug is fixed. Lists what to keep (lazy tasks, resolveTaskRepoId,
  sync override threading, E2B fixes, frontend fixes) vs what to revert
  (communication pattern only).

- proposal-rivetkit-sandbox-resilience.md: Rivetkit sandbox actor changes for
  handling destroyed/paused sandboxes, keep-alive, and the UNIQUE constraint
  crash fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(foundry): add manual task owner change via UI dropdown

Add an owner dropdown to the Overview tab that lets users reassign task
ownership to any organization member. The owner's GitHub credentials are
used for git operations in the sandbox.

Full-stack implementation:
- Backend: changeTaskOwnerManually action on task actor, routed through
  org actor's changeWorkspaceTaskOwner action, with primaryUser schema
  columns on both task and org index tables
- Client: changeOwner method on workspace client (mock + remote)
- Frontend: owner dropdown in right sidebar Overview tab showing org
  members, with avatar and role display
- Shared: TaskWorkspaceChangeOwnerInput type and primaryUser fields on
  workspace snapshot types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 17:05:11 -07:00
Nathan Flurry
167712ace7 chore(release): update version to 0.4.1-rc.1 2026-03-16 15:53:00 -07:00
Nathan Flurry
9ce71c03c8
Merge pull request #261 from rivet-dev/e2b-autopause-provider
feat: add E2B auto-pause provider lifecycle support
2026-03-16 15:39:45 -07:00
Nathan Flurry
f45a467484
chore(foundry): migrate to actions (#262)
* feat(foundry): checkpoint actor and workspace refactor

* docs(foundry): add agent handoff context

* wip(foundry): continue actor refactor

* wip(foundry): capture remaining local changes

* Complete Foundry refactor checklist

* Fix Foundry validation fallout

* wip

* wip: convert all actors from workflow to plain run handlers

Workaround for RivetKit bug where c.queue.iter() never yields messages
for actors created via getOrCreate from another actor's context. The
queue accepts messages (visible in inspector) but the iterator hangs.
Sleep/wake fixes it, but actors with active connections never sleep.

Converted organization, github-data, task, and user actors from
run: workflow(...) to plain run: async (c) => { for await ... }.

Also fixes:
- Missing auth tables in org migration (auth_verification etc)
- default_model NOT NULL constraint on org profile upsert
- Nested workflow step in github-data (HistoryDivergedError)
- Removed --force from frontend Dockerfile pnpm install

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Convert all actors from queues/workflows to direct actions, lazy task creation

Major refactor replacing all queue-based workflow communication with direct
RivetKit action calls across all actors. This works around a RivetKit bug
where c.queue.iter() deadlocks for actors created from another actor's context.

Key changes:
- All actors (organization, task, user, audit-log, github-data) converted
  from run: workflow(...) to actions-only (no run handler, no queues)
- PR sync creates virtual task entries in org local DB instead of spawning
  task actors — prevents OOM from 200+ actors created simultaneously
- Task actors created lazily on first user interaction via getOrCreate,
  self-initialize from org's getTaskIndexEntry data
- Removed requireRepoExists cross-actor call (caused 500s), replaced with
  local resolveTaskRepoId from org's taskIndex table
- Fixed getOrganizationContext to thread overrides through all sync phases
- Fixed sandbox repo path (/home/user/repo for E2B compatibility)
- Fixed buildSessionDetail to skip transcript fetch for pending sessions
- Added process crash protection (uncaughtException/unhandledRejection)
- Fixed React infinite render loop in mock-layout useEffect dependencies
- Added sandbox listProcesses error handling for expired E2B sandboxes
- Set E2B sandbox timeout to 1 hour (was 5 min default)
- Updated CLAUDE.md with lazy task creation rules, no-silent-catch policy,
  React hook dependency safety rules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix E2B sandbox timeout comment, frontend stability, and create-flow improvements

- Add TEMPORARY comment on E2B timeoutMs with pointer to rivetkit sandbox
  resilience proposal for when autoPause lands
- Fix React useEffect dependency stability in mock-layout and
  organization-dashboard to prevent infinite re-render loops
- Fix terminal-pane ref handling
- Improve create-flow service and tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 15:23:59 -07:00
Nathan Flurry
77c8f1e3f3 feat: add E2B auto-pause support with pause/kill/reconnect provider lifecycle
Add `pause()`, `kill()`, and `reconnect()` methods to the SandboxProvider interface so providers can support graceful suspension and permanent deletion as distinct operations. The E2B provider now uses `betaCreate` with `autoPause: true` by default, `betaPause()` for suspension, and surfaces `SandboxDestroyedError` on reconnect to a deleted sandbox. SDK exposes `pauseSandbox()` and `killSandbox()` alongside the existing `destroySandbox()`.
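
To make the pause/kill distinction concrete, here is a small in-memory model. It is only a sketch: the real `SandboxProvider` methods are presumably asynchronous and their signatures are not shown in the commit, so `InMemorySandboxProvider`, its method signatures, and the error's constructor are all hypothetical.

```typescript
// Error name taken from the commit message; the constructor shape is assumed.
class SandboxDestroyedError extends Error {
  constructor(sandboxId: string) {
    super(`sandbox ${sandboxId} no longer exists`);
    this.name = "SandboxDestroyedError";
  }
}

type SandboxState = "running" | "paused";

// Hypothetical in-memory stand-in for a provider.
class InMemorySandboxProvider {
  private sandboxes = new Map<string, SandboxState>();

  create(sandboxId: string): void {
    this.sandboxes.set(sandboxId, "running");
  }

  // Graceful suspension: the sandbox can be resumed later.
  pause(sandboxId: string): void {
    this.requireExisting(sandboxId);
    this.sandboxes.set(sandboxId, "paused");
  }

  // Permanent deletion: the sandbox cannot be reconnected to.
  kill(sandboxId: string): void {
    this.sandboxes.delete(sandboxId);
  }

  // Resumes a paused sandbox, or fails if it was deleted.
  reconnect(sandboxId: string): SandboxState {
    this.requireExisting(sandboxId);
    this.sandboxes.set(sandboxId, "running");
    return "running";
  }

  private requireExisting(sandboxId: string): void {
    if (!this.sandboxes.has(sandboxId)) throw new SandboxDestroyedError(sandboxId);
  }
}
```

A paused sandbox reconnects successfully, while reconnecting after `kill()` raises `SandboxDestroyedError`.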

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:57:49 -07:00
Nathan Flurry
32f3c6c3bc chore(release): update version to 0.4.0 2026-03-16 00:48:05 -07:00
Nathan Flurry
7faed2f43a chore(release): update version to 0.4.0-rc.3 2026-03-15 23:26:42 -07:00
Nathan Flurry
f0ec8e497b fix: mock agent process launcher not written during install
agent_process_status() for mock always returned Some(...) even when the
launcher file did not exist. This caused install_agent_process() to
short-circuit with "already installed" and never write the launcher
script. Fix by checking that the launcher file exists before reporting
the mock agent as installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 23:17:52 -07:00
Nathan Flurry
56c80e6c9e chore(release): update version to 0.4.0-rc.2 2026-03-15 22:38:30 -07:00
Nathan Flurry
bf543d225d fix: mock agent process, React 18/19 types, release version refs
- Add hidden `mock-agent-process` CLI subcommand implementing a stdio
  JSON-RPC echo agent (ported from examples/mock-acp-agent)
- Update write_mock_agent_process_launcher() to exec the new subcommand
  instead of exiting with error
- Update sdks/react to support React 18 and 19 peer dependencies
- Update @types/react to v19 across workspace (pnpm override + inspector)
- Fix RefObject<T | null> compatibility for React 19 useRef() signatures
- Add version reference replacement logic to release update_version.ts
  covering all docs, examples, and code files listed in CLAUDE.md
- Add missing files to CLAUDE.md Install Version References list
  (architecture.mdx, boxlite, modal, computesdk docs and examples)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 22:36:48 -07:00
Nathan Flurry
2f9f25ae54 chore(release): update version to 0.4.0-rc.1 2026-03-15 20:53:54 -07:00
Nathan Flurry
cf7e2a92c6
SDK: Add ensureServer() for automatic server recovery (#260)
* SDK sandbox provisioning: built-in providers, docs restructure, and quickstart overhaul

- Add built-in sandbox providers (local, docker, e2b, daytona, vercel, cloudflare) to the TypeScript SDK so users import directly instead of passing client instances
- Restructure docs: rename architecture to orchestration-architecture, add new architecture page for server overview, improve getting started flow
- Rewrite quickstart to be TypeScript-first with provider CodeGroup and custom provider accordion
- Update all examples to use new provider APIs
- Update persist drivers and foundry for new SDK surface

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix SDK typecheck errors and update persist drivers for insertEvent signature

- Fix insertEvent call in client.ts to pass sessionId as first argument
- Update Daytona provider create options to use Partial type (image has default)
- Update StrictUniqueSessionPersistDriver in tests to match new insertEvent signature
- Sync persist packages, openapi spec, and docs with upstream changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add Modal and ComputeSDK built-in providers, update examples and docs

- Add `sandbox-agent/modal` provider using Modal SDK with node:22-slim image
- Add `sandbox-agent/computesdk` provider using ComputeSDK's unified sandbox API
- Update Modal and ComputeSDK examples to use new SDK providers
- Update Modal and ComputeSDK deploy docs with provider-based examples
- Add Modal to quickstart CodeGroup and docs.json navigation
- Add provider test entries for Modal and ComputeSDK
- Remove old standalone example files (modal.ts, computesdk.ts)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Modal provider: pre-install agents in image, fire-and-forget exec for server

- Pre-install agents in Dockerfile commands so they are cached across creates
- Use fire-and-forget exec (no wait) to keep server alive in Modal sandbox
- Add memoryMiB option (default 2GB) to avoid OOM during agent install

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Sync upstream changes: multiplayer docs, logos, openapi spec, foundry config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* SDK: Add ensureServer() for automatic server recovery

Add ensureServer() to SandboxProvider interface to handle cases where the
sandbox-agent server stops or goes to sleep. The SDK now calls this method
after 3 consecutive health-check failures, allowing providers to restart the
server if needed. Most built-in providers (E2B, Daytona, Vercel, Modal,
ComputeSDK) implement this. Docker and Cloudflare manage server lifecycle
differently, and Local uses managed child processes.

Also update docs for quickstart, architecture, multiplayer, and session
persistence; mark persist-* packages as deprecated; and add ensureServer
implementations to all applicable providers.
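
The "3 consecutive health-check failures" trigger can be sketched as a small counter. This is an illustration under assumptions, not the SDK's code: `HealthFailureTracker` is a hypothetical name, and resetting the counter after a trigger is a guess at reasonable behavior.

```typescript
// Threshold from the commit message: three consecutive failures.
const FAILURE_THRESHOLD = 3;

class HealthFailureTracker {
  private consecutiveFailures = 0;

  // Records one health-check result and returns true when the caller
  // should invoke the provider's ensureServer().
  record(healthy: boolean): boolean {
    if (healthy) {
      this.consecutiveFailures = 0; // any success breaks the streak
      return false;
    }
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures < FAILURE_THRESHOLD) return false;
    this.consecutiveFailures = 0; // assumption: start counting afresh after recovery
    return true;
  }
}
```

A single successful check in between resets the streak, so intermittent failures alone never trigger recovery.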

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* wip

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 20:29:28 -07:00
Nathan Flurry
3426cbc6ec
chore: update ACP SDK to 0.16.1 and add e2e testing guidance (#259)
- Bump @agentclientprotocol/sdk from 0.14.1 to 0.16.1 in acp-http-client
- Update adapters.json to reflect new SDK version
- Migrate unstableListSessions to listSessions (stabilized in SDK 0.16.0)
- Add CLAUDE.md guidance: request token location before e2e agent testing

All 5 ACP adapters remain at their latest versions. E2E testing confirms
Claude, Codex, Pi, and Cursor agents work end-to-end with credentials.

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-15 19:46:26 -07:00
Nathan Flurry
d850a3b77a
fix: normalize Pi ACP bootstrap payloads (#227)
* fix: normalize pi ACP bootstrap payloads

* docs(cli): document custom pi binary override

* docs(quickstart): list all supported agent IDs

* docs(code): clarify Pi payload normalization rationale
2026-03-15 18:52:59 -07:00
waltertang27
e740d28e0a
Add modal sandbox support (#192)
* add modal sandbox example

* add test instructions

---------

Co-authored-by: Nathan Flurry <NathanFlurry@users.noreply.github.com>
2026-03-15 13:14:59 -07:00
Nathan Flurry
284fe66be4
wip (#258) 2026-03-15 12:37:42 -07:00
Nathan Flurry
57a07f6a0a
wip (#256) 2026-03-14 23:47:43 -07:00
Nathan Flurry
99abb9d42e
chore(foundry): workbench action responsiveness (#254)
* wip

* wip
2026-03-14 20:42:18 -07:00
Nathan Flurry
400f9a214e
Add transcript virtualization to Foundry UI (#255) 2026-03-14 17:55:05 -07:00
Nathan Flurry
5ea9ec5e2f
wip (#253) 2026-03-14 14:38:29 -07:00
Nathan Flurry
70d31f819c
chore(foundry): improve sandbox impl + status pill (#252)
* Improve Daytona sandbox provisioning and frontend UI

Refactor git clone script in Daytona provider to use cleaner shell logic for GitHub token authentication and branch checkout. Add support for private repository clones with token-based auth. Improve Daytona provider error handling and git configuration setup.

Frontend improvements include enhanced dev panel, workspace dashboard, sidebar navigation, and UI components for better task/session management. Update interest manager and backend client to support improved session state handling.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* Add header status pill showing task/session/sandbox state

Surface aggregate status (error, provisioning, running, ready, no sandbox)
as a colored pill in the transcript panel header. Integrates task runtime
status, session status, and sandbox availability via the sandboxProcesses
interest topic so the pill accurately reflects unreachable sandboxes.

Includes mock tasks demonstrating error, provisioning, and running states,
unit tests for deriveHeaderStatus, and workspace-dashboard integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-14 12:14:06 -07:00
Nathan Flurry
5a1b32a271 chore(release): update version to 0.3.2 2026-03-13 21:47:55 -07:00
Nathan Flurry
8fb19b50da
Remove frontend errors and app passthrough (#251) 2026-03-13 21:14:31 -07:00
Nathan Flurry
d8b8b49f37
Fix Foundry UI bugs: org names, sessions, and repo selection (#250)
* Fix Foundry auth: migrate to Better Auth adapter, fix access token retrieval

- Remove @ts-nocheck from better-auth.ts, auth-user/index.ts, app-shell.ts
  and fix all type errors
- Fix getAccessTokenForSession: read GitHub token directly from account
  record instead of calling Better Auth's internal /get-access-token
  endpoint which returns 403 on server-side calls
- Re-implement workspaceAuth helper functions (workspaceAuthColumn,
  normalizeAuthValue, workspaceAuthClause, workspaceAuthWhere) that were
  accidentally deleted
- Remove all retry logic (withRetries, isRetryableAppActorError)
- Implement CORS origin allowlist from configured environment
- Document cachedAppWorkspace singleton pattern
- Add inline org sync fallback in buildAppSnapshot for post-OAuth flow
- Add no-retry rule to CLAUDE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add Foundry dev panel from fix-git-data branch

Port the dev panel component that was left out when PR #243 was replaced
by PR #247. Adapted to remove runtime/mock-debug references that don't
exist on the current branch.

- Toggle with Shift+D, persists visibility to localStorage
- Shows context, session, GitHub sync status sections
- Dev-only (import.meta.env.DEV)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add full Docker image defaults, fix actor deadlocks, and improve dev experience

- Add Dockerfile.full and --all flag to install-agent CLI for pre-built images
- Centralize Docker image constant (FULL_IMAGE) pinned to 0.3.1-full
- Remove examples/shared/Dockerfile{,.dev} and daytona snapshot example
- Expand Docker docs with full runnable Dockerfile
- Fix self-deadlock in createWorkbenchSession (fire-and-forget provisioning)
- Audit and convert 12 task actions from wait:true to wait:false
- Add bun --hot for dev backend hot reload
- Remove --force from pnpm install in dev Dockerfile for faster startup
- Add env_file support to compose.dev.yaml for automatic credential loading
- Add mock frontend compose config and dev panel
- Update CLAUDE.md with wait:true policy and dev environment setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* WIP: async action fixes and interest manager

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Foundry UI bugs: org names, hanging sessions, and wrong repo creation

- Fix org display name using GitHub description instead of name field
- Fix createWorkbenchSession hanging when sandbox is provisioning
- Fix auto-session creation retry storm on errors
- Fix task creation using wrong repo due to React state race conditions
- Remove Bun hot-reload from backend Dockerfile (causes port drift)
- Add GitHub sync/install status to dev panel

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 20:48:22 -07:00
Nicholas Kissel
58c54156f1
Remove Download Foundry section from website (#248)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 12:30:03 -07:00
Nathan Flurry
ae191d1ae1
Refactor Foundry GitHub state and sandbox runtime (#247)
* Move Foundry HTTP APIs out of /api/rivet

* Move Foundry HTTP APIs onto /v1

* Fix Foundry Rivet base path and frontend endpoint fallback

* Configure Foundry Rivet runner pool for /v1

* Remove Foundry Rivet runner override

* Serve Foundry Rivet routes directly from Bun

* Log Foundry RivetKit deployment friction

* Add actor display metadata

* Tighten actor schema constraints

* Reset actor persistence baseline

* Remove temporary actor key version prefix

Railway has no persistent volumes so stale actors are wiped on
each deploy. The v2 key rotation is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Cache app workspace actor handle across requests

Every request was calling getOrCreate on the Rivet engine API
to resolve the workspace actor, even though it's always the same
actor. Cache the handle and invalidate on error so retries
re-resolve. This eliminates redundant cross-region round-trips
to api.rivet.dev on every request.
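
The cache-and-invalidate pattern described here might look roughly like the following. It is a synchronous simplification for illustration (the real `getOrCreate` call is presumably asynchronous), and `CachedHandle` is a hypothetical name.

```typescript
// Caches one resolved handle and drops it when an operation on it fails.
class CachedHandle<H> {
  private handle: H | undefined;
  private readonly resolve: () => H;

  constructor(resolve: () => H) {
    this.resolve = resolve;
  }

  // Resolves the handle on first use and reuses it afterwards.
  get(): H {
    if (this.handle === undefined) this.handle = this.resolve();
    return this.handle;
  }

  invalidate(): void {
    this.handle = undefined;
  }

  // Runs `fn` against the cached handle; on error the cache is cleared so
  // the next call (for example a retry) re-resolves the handle.
  use<R>(fn: (handle: H) => R): R {
    try {
      return fn(this.get());
    } catch (error) {
      this.invalidate();
      throw error;
    }
  }
}
```

The expensive resolution then happens once per process rather than once per request, and again only after a failure.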

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add temporary debug logging to GitHub OAuth exchange

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make squashed baseline migrations idempotent

Use CREATE TABLE IF NOT EXISTS and CREATE UNIQUE INDEX IF NOT
EXISTS so the squashed baseline can run against actors that
already have tables from the pre-squash migration sequence.
This fixes the "table already exists" error when org workspace
actors wake up with stale migration journals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert "Make squashed baseline migrations idempotent"

This reverts commit 356c146035.

* Fix GitHub OAuth callback by removing retry wrapper

OAuth authorization codes are single-use. The appWorkspaceAction wrapper
retries failed calls up to 20 times, but if the code exchange succeeds
and a later step fails, every retry sends the already-consumed code,
producing "bad_verification_code" from GitHub.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add runner versioning to RivetKit registry

Uses Date.now() so each process start gets a unique version.
This ensures Rivet Cloud migrates actors to the new runner on
deploy instead of routing requests to stale runners.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add backend request and workspace logging

* Log callback request headers

* Make GitHub OAuth callback idempotent against duplicate requests

Clear oauthState before exchangeCode so duplicate callback requests
fail the state check instead of hitting GitHub with a consumed code.
Marked as HACK — root cause of duplicate HTTP requests is unknown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
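
A minimal sketch of the ordering this commit describes, assuming a hypothetical `Session` shape and `handleCallback` signature (only `oauthState` and `exchangeCode` are named in the commit message). It is synchronous for brevity.

```typescript
// Hypothetical session shape; only the pending OAuth state matters here.
type Session = { oauthState: string | null };

function handleCallback(
  session: Session,
  state: string,
  code: string,
  exchangeCode: (code: string) => string,
): string {
  if (session.oauthState === null || session.oauthState !== state) {
    throw new Error("invalid_state");
  }
  // Clear the state before the exchange, so a duplicate request is rejected by
  // the check above and never reaches the provider with a consumed code.
  session.oauthState = null;
  return exchangeCode(code);
}
```

In an asynchronous handler the ordering matters even more: a duplicate request can arrive while the first exchange is still in flight, so the state has to be cleared before awaiting the exchange rather than after it.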

* Add temporary header dump on GitHub OAuth callback

Log all request headers on the callback endpoint to diagnose
the source of duplicate requests (Railway proxy, Cloudflare, browser).
Remove once root cause is identified.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Defer slow GitHub org sync to workflow queue for fast OAuth callback

Split syncGithubSessionFromToken into a fast path (initGithubSession:
exchange code, get viewer, store token+identity) and a slow path
(syncGithubOrganizations: list orgs/installations, sync workspaces).

completeAppGithubAuth now returns the 302 redirect in ~2s instead of
~18s by enqueuing the org sync to the workspace workflow queue
(fire-and-forget). This eliminates the proxy timeout window that was
causing duplicate callback requests.

bootstrapAppGithubSession (dev-only) still calls the full synchronous
sync since proxy timeouts are not a concern and it needs the session
fully populated before returning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
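
The fast-path/slow-path split can be sketched as below. This is an illustration only: `WorkQueue`, `completeAuth`, its parameters, and the redirect location are hypothetical, and the in-memory queue merely stands in for the workspace workflow queue.

```typescript
type Job = () => Promise<void>;

// Minimal in-memory queue standing in for the workflow queue (hypothetical).
class WorkQueue {
  private jobs: Job[] = [];

  enqueue(job: Job): void {
    this.jobs.push(job);
  }

  // Runs queued jobs in order; in a real system a worker would do this.
  async drain(): Promise<void> {
    while (this.jobs.length > 0) {
      const job = this.jobs.shift()!;
      await job();
    }
  }
}

type Redirect = { status: 302; location: string };

async function completeAuth(
  queue: WorkQueue,
  initSession: () => Promise<string>,
  syncOrganizations: (userId: string) => Promise<void>,
): Promise<Redirect> {
  // Fast path: do only what the redirect needs (token and identity).
  const userId = await initSession();
  // Slow path: enqueue and return without awaiting the sync itself.
  queue.enqueue(() => syncOrganizations(userId));
  return { status: 302, location: "/" }; // placeholder location
}
```

The trade-off is that the session is only partially populated when the redirect is returned, which is why the dev-only bootstrap path described above keeps the fully synchronous sync.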

* foundry: async app repo import on org select

* foundry: parallelize app snapshot org reads

* repo: push all current workspace changes

* foundry: update runner version and snapshot logging

* Refactor Foundry GitHub state and sandbox runtime

Refactors Foundry around organization/repository ownership and adds an organization-scoped GitHub state actor plus a user-scoped GitHub auth actor, removing the old project PR/branch sync actors and repo PR cache.

Updates sandbox provisioning to rely on sandbox-agent for in-sandbox work, hardens Daytona startup and image-build behavior, and surfaces runtime and task-startup errors more clearly in the UI.

Extends workbench and GitHub state handling to track merged PR state, adds runtime-issue tracking, refreshes client/test/config wiring, and documents the main live Foundry test flow plus actor coordination rules.

Also updates the remaining Sandbox Agent install-version references in docs/examples to the current pinned minor channel.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 02:45:07 -07:00
Nathan Flurry
436eb4a3a3 Add legacy Foundry GitHub callback route 2026-03-12 19:20:34 -07:00
Nathan Flurry
cdac0aa937 Ship RivetKit runtime peers in Foundry backend 2026-03-12 19:08:56 -07:00
Nathan Flurry
31de559fbb Fix Foundry Railway backend Docker context 2026-03-12 19:02:15 -07:00
Nathan Flurry
70d2cc35d7 Split Railway config per Foundry service 2026-03-12 18:59:04 -07:00
Nathan Flurry
e79a3d9389 Add Railway Caddy frontend images 2026-03-12 18:58:57 -07:00
Nathan Flurry
940e49fcfa Use vanilla Rivet routing in Foundry backend 2026-03-12 18:48:11 -07:00
Nathan Flurry
4bccd5fc8d
Fix desktop build artifact ignore (#244) 2026-03-12 11:10:29 -07:00
Nicholas Kissel
fde8b481bd
Foundry UI polish: terminal empty state, history minimap redesign, styling tweaks (#242)
- Hide terminal pane body when no terminal tabs exist
- Redesign history minimap from orange bar to single icon with popover dropdown
- Simplify popover items to single-line user messages with ellipsis
- Adjust min-used badge hover padding
- Add right padding to message list for history icon clearance

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:03:04 -07:00
Nicholas Kissel
f09b9090bb
Standardize Foundry frontend colors with semantic design tokens (#241)
Extract hardcoded colors from 15+ component files into a centralized
token system (tokens.ts + shared-styles.ts) so all UI colors flow
through FoundryTokens. This eliminates 160+ scattered color values
and makes light mode a single-file change in the future.

- Add FoundryTokens interface with dark/light variants
- Add shared style helpers (buttons, cards, inputs, badges)
- Bridge CSS custom properties for styles.css theme support
- Add useFoundryTokens() hook and ColorMode context
- Migrate all mock-layout/* and mock-onboarding components

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:52:06 -07:00
Nicholas Kissel
ed6e6f6fa5 Polish Foundry desktop UI: billing redesign, sidebar hover menu, org switching fix
- Redesign billing page with task-hours pricing model (Free: 8h, Pro: 200h/seat)
- Add bulk hour purchase packages and Stripe payment management
- Remove Usage nav section, add upgrade CTA in Members for free plan
- Fix gear icon to open menu on hover with debounced timers
- Fix org switching in workspace flyout (portal outside-click detection)
- Fix tab strip padding when sidebar is collapsed
- Update website components and Tauri config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:34:25 -07:00
Nicholas Kissel
f6656a90af Add Foundry Tauri v2 desktop app with UI polish
- Scaffold Tauri v2 desktop package (foundry/packages/desktop)
- Sidecar build script compiles backend into standalone Bun binary
- Frontend build script packages Vite output for Tauri webview
- macOS glass-effect app icon following Big Sur design standards
- Collapsible sidebars with smooth width transitions
- Inset content framing with borders and nested border-radius (Outer R = Inner R + Padding)
- iMessage-style chat bubble styling with proper corner radii
- Styled composer input with matching border-radius
- Vertical separator between chat and right sidebar
- Website download button component
- Cargo workspace exclude for standalone Tauri build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:27:35 -07:00
Nathan Flurry
dbc2ff0682
Improve Foundry auth and task flows (#240) 2026-03-11 18:13:31 -07:00
Nathan Flurry
d75e8c31d1
Rename Foundry handoffs to tasks (#239)
* Restore foundry onboarding stack

* Consolidate foundry rename

* Create foundry tasks without prompts

* Rename Foundry handoffs to tasks
2026-03-11 13:23:54 -07:00
Nathan Flurry
d30cc0bcc8
Merge pull request #238 from rivet-dev/test-dev-webhooks-flow
Fix Foundry handoff creation flow
2026-03-11 11:17:00 -07:00
Nathan Flurry
c8a095b69f Merge remote-tracking branch 'origin/main' into test-dev-webhooks-flow
# Conflicts:
#	factory/packages/backend/src/actors/project/actions.ts
#	factory/packages/backend/src/actors/workspace/actions.ts
#	factory/packages/frontend/src/components/mock-layout.tsx
2026-03-11 11:14:04 -07:00
Nathan Flurry
5e733e8b37 Fix Foundry handoff creation flow 2026-03-11 11:06:51 -07:00
Nicholas Kissel
302bc7b674
Replace all:unset with explicit button/input resets across Foundry UI (#237)
Styletron's generated stylesheet ordering makes all:unset unreliable —
it can override font-size, border-radius, and background unpredictably.
Replace with appearance:none + targeted resets and clean up duplicate
CSS properties introduced during iterative fixes.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:04:20 -07:00
Nicholas Kissel
e03484848e
Foundry UI polish: favicon, icon alignment, and border refinements (#236)
Add Foundry favicon, fix icon centering across sidebar/composer/header
buttons, restore center panel top-left border curve, and position right
sidebar border between header actions and tab strip.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 02:59:58 -07:00
Nicholas Kissel
e792a720a0
Refine Foundry UI layout and styling (#235)
* feat: modernize chat UI and rename handoff to task

- Remove agent message bubbles, keep user bubbles (right-aligned)
- Rename "Handoffs" to "Tasks" with ListChecks icon in sidebar
- Move model picker inside composer, add renderFooter to ChatComposer SDK
- Make project sections collapsible with hover-only chevrons
- Remove divider between chat and composer
- Update model picker chevron to flip on open/close
- Replace all user-visible "handoff" strings with "task" across frontend

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: real org mock data, model picker styling, project icons, task minutes indicator

- Replace fake acme/* mock data with real rivet-dev GitHub org repos and PRs
- Fix model picker popover: dark gray surface with backdrop blur instead of pure black
- Add colored letter icons to project section headers (swap to chevron on hover)
- Add "847 min used" indicator in transcript header
- Rename browser tab title from OpenHandoff to Foundry
- Reduce transcript header title font weight to 500

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: refine Foundry UI — single-line task cards, dark user bubbles, curved panel corners, send icon

- Collapse task sidebar cards to single-line layout (title, number, diffs, timestamp)
- Dark-themed user message bubbles matching site theme
- Curved top-left corner on center chat panel with border line
- Subtle focus border on composer input
- Replace ArrowUpFromLine with SendHorizonal icon
- Tab strip gaps, padding, and divider alignment fixes
- Plus button with visible background
- Right sidebar header color matching

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 01:50:36 -07:00
Nathan Flurry
20082512a3
Merge pull request #234 from rivet-dev/foundry-terminal-pane
Add Foundry terminal and process pane
2026-03-11 00:00:24 -07:00
Nathan Flurry
b00c0109d0 Merge remote-tracking branch 'origin/main' into foundry-terminal-pane
# Conflicts:
#	factory/packages/backend/src/driver.ts
#	factory/packages/backend/src/integrations/sandbox-agent/client.ts
#	factory/packages/backend/test/helpers/test-driver.ts
#	factory/packages/frontend/src/components/mock-layout.tsx
#	pnpm-lock.yaml
#	sdks/react/src/ProcessTerminal.tsx
2026-03-10 23:59:58 -07:00
Nathan Flurry
28c4ac22ff Add foundry terminal and process pane 2026-03-10 23:55:43 -07:00
Nicholas Kissel
32008797da
Foundry UI polish: real org data, project icons, model picker (#233)
* feat: modernize chat UI and rename handoff to task

- Remove agent message bubbles, keep user bubbles (right-aligned)
- Rename "Handoffs" to "Tasks" with ListChecks icon in sidebar
- Move model picker inside composer, add renderFooter to ChatComposer SDK
- Make project sections collapsible with hover-only chevrons
- Remove divider between chat and composer
- Update model picker chevron to flip on open/close
- Replace all user-visible "handoff" strings with "task" across frontend

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: real org mock data, model picker styling, project icons, task minutes indicator

- Replace fake acme/* mock data with real rivet-dev GitHub org repos and PRs
- Fix model picker popover: dark gray surface with backdrop blur instead of pure black
- Add colored letter icons to project section headers (swap to chevron on hover)
- Add "847 min used" indicator in transcript header
- Rename browser tab title from OpenHandoff to Foundry
- Reduce transcript header title font weight to 500

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 23:49:48 -07:00
Nathan Flurry
34a0587cbc
Add star repo onboarding flow (#232) 2026-03-10 23:47:33 -07:00
Nathan Flurry
d2346bafb3
Configure lefthook formatter checks (#231)
* Add lefthook formatter checks

* Fix SDK mode hydration

* Stabilize SDK mode integration test
2026-03-10 23:03:11 -07:00
Nathan Flurry
0471214d65
Share chat UI components in @sandbox-agent/react (#228)
* Extract shared chat UI components

* chore(release): update version to 0.3.1

* Use shared chat UI in Foundry
2026-03-10 22:31:36 -07:00
Nathan Flurry
6d7e67fe72 chore(release): update version to 0.3.1 2026-03-10 22:12:56 -07:00
Nathan Flurry
76586f409f
Add ACP permission mode support to the SDK (#224)
* chore: recover hamburg workspace state

* chore: drop workspace context files

* refactor: generalize permissions example

* refactor: parse permissions example flags

* docs: clarify why fs and terminal stay native

* feat: add interactive permission prompt UI to Inspector

Add permission request handling to the Inspector UI so users can
Allow, Always Allow, or Reject tool calls that require permissions
instead of having them auto-cancelled. Wires up SDK
onPermissionRequest/respondPermission through App → ChatPanel →
ChatMessages with proper toolCallId-to-pendingId mapping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent permission reply from silently escalating "once" to "always"

Remove allow_always from the fallback chain when the user replies "once",
aligning with the ACP spec which says "map by option kind first" with no
fallback for allow_once. Also fix Inspector to use rawSend, revert
hydration guard to accept empty configOptions, and handle respondPermission
errors by rejecting the pending promise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
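
The no-escalation rule can be sketched as a mapping from the user's reply to an ACP option kind. The `Reply` type, `pickOption`, and the fallbacks chosen for "always" and "reject" are assumptions for illustration; only the absence of a fallback for "once" comes from the commit message.

```typescript
type OptionKind = "allow_once" | "allow_always" | "reject_once" | "reject_always";
type PermissionOption = { optionId: string; kind: OptionKind };
type Reply = "once" | "always" | "reject";

// Returns the option to select, or undefined if the agent offered nothing suitable.
function pickOption(
  reply: Reply,
  options: PermissionOption[],
): PermissionOption | undefined {
  const byKind = (kind: OptionKind) => options.find((o) => o.kind === kind);
  switch (reply) {
    case "once":
      // No fallback to allow_always: a one-time approval must not become permanent.
      return byKind("allow_once");
    case "always":
      // Assumed fallback: degrading to a narrower grant does not widen access.
      return byKind("allow_always") ?? byKind("allow_once");
    case "reject":
      // Assumed fallback between the two reject kinds.
      return byKind("reject_once") ?? byKind("reject_always");
  }
}
```

When `pickOption` returns `undefined`, the caller has to decide what to do (for example, cancel the request) rather than silently picking a broader grant.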

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 21:52:43 -07:00
Nicholas Kissel
5d65013aa5
Merge pull request #213 from rivet-dev/NicholasKissel/rivet-font-update
Switch fonts to Manrope to match Rivet.dev
2026-03-09 17:43:36 -07:00
Nicholas Kissel
0e4a44f1f3 fix: increase base font-weight to 500 for Manrope readability
Manrope at weight 400 is thinner than Open Sans was, making small body
text hard to read. Bumping to 500 matches Rivet.dev's usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 17:26:31 -07:00
Nicholas Kissel
bf3df519e4 feat: switch website and inspector fonts to Manrope to match Rivet.dev
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 17:20:47 -07:00
Nathan Flurry
bf282199b5
Integrate OpenHandoff factory workspace (#212) 2026-03-09 14:00:20 -07:00
Nathan Flurry
3d9476ed0b chore(release): update version to 0.3.0 2026-03-07 18:54:35 -08:00
Nathan Flurry
cd29dd57c4 chore(release): update version to 0.3.0 2026-03-07 18:01:57 -08:00
Nathan Flurry
febe8601f6
feat: add process management support (#207)
* feat: improve inspector UI for processes and fix PTY terminal

- Simplify ProcessRunTab layout: compact form with collapsible Advanced section for timeout/maxOutputBytes
- Rewrite ProcessesTab: collapsible create form, lightweight list items with status dots, clean detail panel with tabs
- Extract error details: use problem.detail instead of generic "Stream Error" title for better error messages
- Fix GhosttyTerminal binary frame parsing: handle server's binary ArrayBuffer control frames (ready/exit/error)
- Enable WebSocket proxying in Vite dev server with ws: true
- Set TERM=xterm-256color default for TTY processes so tools like tmux, vim, htop work out of the box
- Remove orange gradient background from terminal container for cleaner look
- Remove orange left border from selected process list items
- Update inspector CSS with new process/terminal styles

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: address review issues and add processes documentation

- Fix unstable onExit callback in ProcessesTab (useCallback)
- Fix SSE follow stream race condition (subscribe before history read)
- Update inspector.mdx with new process management features
- Change observability icon to avoid conflict with processes
- Add docs/processes.mdx covering the full process management API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: simplify processes doc — rename sections, remove low-level protocol

- Rename "Interactive terminals" to "Terminals" with "Connect to a terminal" sub-heading
- Add TTY process creation step at top of Terminals section
- Remove low-level WebSocket protocol table and raw WebSocket example
- Keep browser terminal emulator reference with Ghostty link

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update GhosttyTerminal permalink to latest commit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: use main branch permalink for GhosttyTerminal reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: refine process API — WebSocket binary protocol, SDK terminal session, updated tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update GhosttyTerminal permalink to 636eefb

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* inspector: use websocket terminal API

* sdk: restore high-level terminal session

* docs: update inspector terminal permalink

* inspector: update run once placeholder

* Fix lazy install v1 API test fixture

* Add reusable React terminal component

* Fix terminal WebSocket ready state checks

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-07 17:58:31 -08:00
Nathan Flurry
e7656d78f0
perf: improve startup instrumentation and replace npx with npm install (#208)
Add comprehensive tracing instrumentation across the entire agent startup path (gigacode CLI, ACP HTTP adapter, agent installation, and process spawning) to enable detailed performance profiling. Replace npm-based agent process launchers that use npx (incurring resolution overhead on every spawn) with pre-installed npm packages, reducing startup latency. Improve error diagnostics when agent processes crash by capturing exit codes and stderr tails. Update error handling to map exited processes to dedicated error variants with actionable error messages.

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-06 12:05:19 -08:00
Nathan Flurry
9ada842cf2 chore(release): update version to 0.2.2 2026-03-06 00:28:08 -08:00
Nathan Flurry
c91791f88d
feat: add configuration for model, mode, and thought level (#205)
* feat: add configuration for model, mode, and thought level

* docs: document Claude effort-level filesystem config

* fix: prevent panic on empty modes/thoughtLevels in parse_agent_config

Use `.first()` with safe fallback instead of direct `[0]` index access,
which would panic if the Vec is empty and no default is set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden session lifecycle and align cli.mdx example with claude.json

- destroySession: wrap session/cancel RPC in try/catch so local cleanup
  always succeeds even when the agent is unreachable
- createSession/resumeOrCreateSession: clean up the remote session if
  post-creation config calls (setMode/setModel/setThoughtLevel) fail,
  preventing leaked orphan sessions
- cli.mdx: fix example output to match current claude.json (model name,
  model order, and populated modes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden session lifecycle and align config persistence logic

- resumeOrCreateSession: Remove destroy-on-error for the resume path. Config
  errors now propagate without destroying a pre-existing session. The destroy
  pattern remains in createSession (where the session is newly created and has
  no prior state to preserve).

- setSessionMode fallback: When session/set_mode returns -32601 and the
  fallback uses session/set_config_option, now keep modes.currentModeId
  in sync with the updated currentValue. Prevents stale cached state in
  getModes() when the fallback path is used.

- persistSessionStateFromMethod: Re-read the record from persistence instead
  of using a stale pre-await snapshot. Prevents race conditions where
  concurrent session/update events (processed by persistSessionStateFromEvent)
  are silently overwritten by optimistic updates.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
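
The stale-snapshot fix in the last bullet can be illustrated with a small sketch. `Store`, `SessionRecord`, and `persistMode` are hypothetical; the real persistence driver and record shape are not shown in this log.

```typescript
type SessionRecord = { id: string; mode: string; title: string };

// In-memory store standing in for the persistence driver (hypothetical interface).
class Store {
  private records = new Map<string, SessionRecord>();

  async get(id: string): Promise<SessionRecord | undefined> {
    const record = this.records.get(id);
    return record ? { ...record } : undefined;
  }

  async put(record: SessionRecord): Promise<void> {
    this.records.set(record.id, { ...record });
  }
}

async function persistMode(
  store: Store,
  id: string,
  mode: string,
  rpc: () => Promise<void>,
): Promise<void> {
  // Other events may update the record while this call is pending.
  await rpc();
  // Re-read after the await so concurrent updates are merged, not overwritten.
  const fresh = await store.get(id);
  if (!fresh) return;
  await store.put({ ...fresh, mode });
}
```

Reading the record before `rpc()` and writing that snapshot back afterwards would discard any update that landed in between. Re-reading narrows that window, although a small gap between the read and the write remains in this sketch.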

* fix: correct doc examples with valid Codex modes and update stable API list

- Replace invalid Codex mode values ("plan", "build") with valid ones
  ("auto", "full-access") in agent-sessions.mdx and sdk-overview.mdx
- Update CLAUDE.md stable method enumerations to include new session
  config methods (setSessionMode, setSessionModel, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add OpenAPI annotations for process endpoints and fix config persistence race

Add summary/description to all process management endpoint specs and the
not_found error type. Fix hydrateSessionConfigOptions to re-read from
persistence after the network call, and sync mode-category configOptions
on session/update current_mode_update events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 00:24:32 -08:00
Nathan Flurry
e7343e14bd
Add SDK health wait gate (#206)
* Add SDK health wait gate

* Default connect to waiting for health

* Document connect health wait default

* Add abort signal to connect health wait

* Refactor SDK health probe helper

* Update quickstart health wait note

* Remove example health polling

* Fix docker example codex startup
2026-03-06 00:05:06 -08:00
Nathan Flurry
4335ef6af6
feat: add process management API (#203)
* feat: add process management API

Introduces a complete Process Management API for Sandbox Agent with process lifecycle management (start, stop, kill, delete), one-shot command execution, log streaming via SSE and WebSocket, stdin input, and PTY/terminal support. Includes new process_runtime module for managing process state, HTTP route handlers, OpenAPI documentation, and integration tests.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: address review issues in process management API

- Add doc comments to all 13 new #[utoipa::path] handlers (CLAUDE.md compliance)
- Fix send_signal ESRCH check: use raw_os_error() == Some(libc::ESRCH) instead of ErrorKind::NotFound
- Add max_input_bytes_per_request enforcement in WebSocket terminal handler
- URL-decode access_token query parameter for WebSocket auth
- Replace fragile string prefix matching with proper SandboxError::NotFound variant

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* feat: add TypeScript SDK support for process management

Add process CRUD operations (create, get, list, update, delete) and
event streaming to the TypeScript SDK. Includes integration tests,
mock agent updates, and test environment fixes for cross-platform
home directory handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: provide WebSocket impl for process terminal test on Node 20

Node 20 lacks globalThis.WebSocket. Add ws as a devDependency and
pass it to connectProcessTerminalWebSocket in the integration test
so CI no longer fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-05 18:20:20 -08:00
Nathan Flurry
fba06d3304
Fix broken CI checks workflow (#204)
* fix: repair ci checks workflow

* fix: run release checks via pnpm exec

* fix: install tsx in ci
2026-03-05 17:49:06 -08:00
Nathan Flurry
c3a95c3611 chore: add boxlite 2026-02-25 02:18:16 -08:00
Nathan Flurry
a3fe0cc764 fix(cloudflare): fix streaming responses 2026-02-25 01:39:27 -08:00
Nathan Flurry
e24b7cb140 fix: don't exclude cli-shared from library publishing
The startsWith("sdks/cli") filter was also matching sdks/cli-shared,
preventing it from being published. Use "sdks/cli/" (with trailing
slash) to only match the CLI directory itself.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
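
The prefix collision is easy to reproduce in isolation. The function name and the example paths below are hypothetical; only the two prefixes come from the commit message.

```typescript
// Keeps every path that does not fall under the CLI prefix.
function libraryPackages(paths: string[], cliPrefix: string): string[] {
  return paths.filter((p) => !p.startsWith(cliPrefix));
}

// Example paths (hypothetical layout).
const examplePaths = ["sdks/cli/platform-a", "sdks/cli-shared", "sdks/typescript"];

// "sdks/cli" also matches "sdks/cli-shared", so it is wrongly excluded.
const withoutSlash = libraryPackages(examplePaths, "sdks/cli");

// "sdks/cli/" only matches paths inside the CLI directory.
const withSlash = libraryPackages(examplePaths, "sdks/cli/");
```

Note that a trailing-slash prefix does not match the bare path `sdks/cli` itself, so this form assumes the discovered paths always point inside that directory.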
2026-02-23 13:00:51 -08:00
Nathan Flurry
1fcc009156 fix: build cli-shared before sandbox-agent SDK
The sandbox-agent SDK build script was missing the cli-shared build
step, causing DTS build failures in CI when cli-shared wasn't
pre-built.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:32:06 -08:00
Nathan Flurry
7eabbf13c7 chore(release): update version to 0.2.1 2026-02-23 12:15:43 -08:00
Nathan Flurry
2aad675389 fix: optional chain onEventClick to fix typecheck
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:15:29 -08:00
NathanFlurry
4201bd204b
chore: simplify cloudflare compatibility (#191) 2026-02-23 19:31:54 +00:00
Nathan Flurry
03e06e956d
Merge pull request #177 from akalitenya/docs-fix-e2b-timeout
fix timeout for e2b (docs)
2026-02-17 23:45:32 -08:00
NicholasKissel
8a78e068cb feat(inspector): add SDK session clarification to persistence note (#184) 2026-02-13 20:21:07 +00:00
NicholasKissel
d4d7c66c05 feat(inspector): improve contrast and add collapsible debug panel (#182) 2026-02-13 07:15:40 +00:00
NicholasKissel
a897fbcb7c feat(inspector): markdown support and image update (#181) 2026-02-13 06:23:40 +00:00
NicholasKissel
e134012955 feat(inspector): improve session UI, skills dropdown, and visual polish (#179)
- Add delete button on ended sessions (visible on hover)
- Darken ended sessions with opacity and "ended" pill badge
- Sort ended sessions to bottom of list
- Add token usage pill in chat header
- Disable input when session ended
- Add Official Skills dropdown with SDK and Rivet presets
- Format session IDs shorter with full ID on hover
- Add arrow icon to "Configure persistence" link
- Add agent logo SVGs
2026-02-13 05:54:53 +00:00
akalitenya
c68c614a57 fix timeout for e2b (docs) 2026-02-13 05:01:18 +05:00
Nathan Flurry
1c381c552a
Merge pull request #176 from akalitenya/docs-ai
fix host for docs
2026-02-12 15:57:03 -08:00
akalitenya
80feb7fa41 fix host for docs 2026-02-13 04:53:53 +05:00
Nathan Flurry
783ea1086a chore: remove old sdk docs 2026-02-12 15:11:17 -08:00
Nathan Flurry
24fe22f42a fix: remove outdated skill file 2026-02-12 15:06:48 -08:00
Nathan Flurry
ee9ad25069 chore(release): update version to 0.2.1 2026-02-11 20:51:01 -08:00
Nathan Flurry
a0955ba752
Merge pull request #167 from rivet-dev/feat_add_session_persistence_examples_and_SQLite_driver
feat: add session persistence examples and SQLite driver
2026-02-11 20:40:44 -08:00
Nathan Flurry
3c2a9cbbbb feat: add session persistence examples and SQLite driver 2026-02-11 20:35:14 -08:00
Nathan Flurry
64d1324628 docs: fix bun trust command 2026-02-11 18:52:32 -08:00
Nathan Flurry
cb1f770b47 fix: bump missing packages to 0.2.0 and handle 404 in npmVersionExists
- Bump acp-http-client and persist-* packages from 0.1.0 to 0.2.0
- Fix npmVersionExists to handle 404 for never-published packages
2026-02-11 17:57:02 -08:00
Nathan Flurry
46193747e6 fix: dynamically discover packages in release script instead of hardcoding
- sdk.ts: discoverNpmPackages() + topoSort() for library packages
- sdk.ts: discoverCrates() via cargo metadata for workspace crates
- sdk.ts: publishNpmLibraries replaces publishNpmCliShared + publishNpmSdk
- sdk.ts: publishNpmCli now discovers CLI packages from filesystem
- update_version.ts: discovers SDK package.json files via glob
- update_version.ts: discovers internal crates from Cargo.toml path deps

This prevents packages like persist-*, acp-http-client from being
silently skipped during releases.
2026-02-11 17:51:38 -08:00
NicholasKissel
cdbe920070 feat(inspector): update visual styling to match landing page (#166)
feat(inspector): update visual styling to match landing page

- Update color scheme to match website (black bg, white/10 borders)
- Add Open Sans font
- Update button styles (white primary buttons)
- Add collapsible tool results and status messages
- Replace avatar letters with icons (User, Settings, AlertTriangle)
- Add status dividers for session/turn events
- Update feature coverage badges to lighter grey
- Remove pill styling from event times
- Update popup menus to solid black background

feat(website): add Pi agent to hero diagram and update styling

- Add Pi agent with cyan color (#06B6D4) to the diagram
- Update layout to 3 agents on top row, 2 on bottom row
- Add backdrop-blur glass effects for modern look
- Add animated dot background that changes with active adapter
- Add scroll fade effect for hero section
- Update subtitle to include Pi in supported agents list
- Increase 'CONNECTED TO' label font size

feat(website): add site styling updates and SEO improvements

- Update component styling to match Rivet design (FAQ, FeatureGrid, etc.)
- Add SEO improvements (sitemap, robots.txt, meta tags, Open Graph)
- Remove CTASection component
- Update footer tagline
- Add Pi logo
2026-02-12 01:42:54 +00:00
Nathan Flurry
89933c5f80 fix: use correct provider home dirs and make SSE failures non-fatal
- E2B: /home/user, Daytona: /home/daytona, Vercel: /home/vercel-sandbox, ComputeSDK: /home
- Docker-based examples keep /root (correct)
- Add missing install-agent steps to Daytona example and doc
- Make SSE loop failure non-fatal in acp-http-client transport
2026-02-11 09:29:14 -08:00
Nathan Flurry
8a1d17f165 fix: release pipeline for npm 2026-02-11 09:23:35 -08:00
Nathan Flurry
6b1950f9ab chore(release): update version to 0.2.0 2026-02-11 08:51:15 -08:00
Nathan Flurry
94353f7696 chore: fix bad merge 2026-02-11 07:57:02 -08:00
Nathan Flurry
1dd45908a3
Merge pull request #160 from rivet-dev/02-11-chore_fix_bad_merge
chore: fix bad merge
2026-02-11 07:40:01 -08:00
Nathan Flurry
b9efe971ff chore: fix bad merge 2026-02-11 07:33:19 -08:00
NathanFlurry
e72eb9f611
acp spec (#155) 2026-02-11 14:47:41 +00:00
NicholasKissel
70287ec471 chore(site): site updates and seo (#158) 2026-02-11 08:36:10 +00:00
Nathan Flurry
a33b1323ff feat: support Pi harness (#121) 2026-02-10 22:27:07 -08:00
Nathan Flurry
4c6c5983c0 Merge branch 'main' into feat/support-pi 2026-02-10 22:27:03 -08:00
Nathan Flurry
8f93d51883 feat: add ComputeSDK example & documentation (#66) 2026-02-10 22:15:01 -08:00
Franklin
11950d2a39 adding compute example 2026-02-10 22:14:56 -08:00
Nathan Flurry
d67cc6edf4 fix: OpenCode event streaming + bypass permission mode (#48) 2026-02-10 22:13:38 -08:00
WellDunDun
9c7a08a165 fix: OpenCode event streaming + bypass permission mode
Three independent fixes for the OpenCode agent adapter:

1. Wrong API endpoints: /event/subscribe → /event, /session/{id}/prompt → /session/{id}/message
2. Untagged enum mis-dispatch: replace serde_json::from_value with manual type-field dispatch
3. Wire permissionMode "bypass" for OpenCode: allow in normalize_permission_mode() and pass
   --dangerously-skip-permissions to CLI (both spawn and spawn_streaming)

Tested with OpenCode 1.1.48 + Kimi K2.5.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-10 22:13:17 -08:00
Nathan Flurry
87a4e81d31
Merge pull request #153 from soilSpoon/feature/ampcode
feature(ampcode): Enhances ampcode schema with new message types and fields
2026-02-10 22:12:08 -08:00
Nathan Flurry
4322cb1d8e
Merge pull request #148 from bobbythelobster/add-cursor-agent-support
Add cursor-agent support
2026-02-10 22:11:57 -08:00
Nathan Flurry
edf5c5d299
Merge pull request #59 from gregce/fix/spawn-streaming-env-vars
fix(agent-management): pass env vars to agent in spawn_streaming
2026-02-10 22:11:45 -08:00
이대희
9486343f4c feature(ampcode): Enhances ampcode schema with new message types and fields
Adds support for system, user, assistant, and result message types to the AMP schema, along with associated fields like subtype, session_id, tools, and duration metrics. Updates the schema validation and adds corresponding test cases. Also improves the command-line argument handling in the agent management package to accommodate the new message types and streamlined permission flags.

The changes enhance the schema's flexibility for different interaction patterns and provide better tracking of agent operations.
2026-02-10 22:20:51 +09:00
Bobby The Lobster
2cb2c07c6f Add cursor-agent support (#118)
- Add Cursor to AgentId enum
- Implement install_cursor() function for binary installation
- Add Cursor spawn logic with JSON format support
- Update README to mention Cursor support in all relevant sections

Cursor-agent runs on localhost:32123 and uses OpenCode-compatible format.
Based on opencode-cursor-auth pattern for Cursor Pro integration.

Resolves #118
2026-02-09 00:01:10 +00:00
Franklin
8b068eb1ae Merge remote-tracking branch 'origin/main' into feat/support-pi 2026-02-06 23:22:50 -05:00
Franklin
e37bde0103 adding pi for gigacode 2026-02-06 23:20:50 -05:00
Franklin
bd030904bc pi tests 2026-02-06 19:16:53 -05:00
Franklin
e2e7f11b9a pi working 2026-02-06 18:18:43 -05:00
Franklin
9a26604001 wip: pi working with variants 2026-02-06 17:17:00 -05:00
Franklin
bef2e84d0c wip: pi working 2026-02-06 16:54:53 -05:00
Franklin
a6064e7027 wip: pi working 2026-02-06 16:54:43 -05:00
Franklin
e8ac963be4 fix: handle Pi in opencode agent display names 2026-02-05 17:10:53 -05:00
Franklin
a744a8086a Merge remote-tracking branch 'origin/main' into feat/support-pi
# Conflicts:
#	server/packages/sandbox-agent/src/lib.rs
#	server/packages/sandbox-agent/src/router.rs
2026-02-05 17:09:51 -05:00
Franklin
843498e9db support pi 2026-02-05 17:06:53 -05:00
Greg Ceccarelli
c4b033a5c0 fix(agent-management): pass env vars to agent in spawn_streaming
The spawn_streaming() function was not passing environment variables
from SpawnOptions.env to the spawned process. This caused agents like
Claude to not receive ANTHROPIC_API_KEY, resulting in silent
authentication failures.

The non-streaming spawn() method correctly passes env vars (lines 298-300),
but spawn_streaming() was missing this code path.

This fix adds the same env var loop to spawn_streaming(), ensuring that
credentials extracted from the host environment are properly passed to
spawned agents.
2026-01-29 17:05:24 -05:00
946 changed files with 137556 additions and 71142 deletions


@ -1,265 +1,24 @@
---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
allowed-tools: Bash(agent-browser:*)
description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*)
---
# Browser Automation with agent-browser
## Quick start
## Core Workflow
```bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
```
Every browser automation follows this pattern:
## Core workflow
1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
# Supports: https://, http://, file://, about:, data://
# Auto-prepends https:// if no protocol given
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser (aliases: quit, exit)
agent-browser connect 9222 # Connect to browser via CDP port
```
### Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key (alias: key)
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown option
agent-browser select @e1 "a" "b" # Select multiple options
agent-browser scroll down 500 # Scroll page (default: down 300px)
agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto)
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
```
### Get information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.)
```
### Check state
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
```
### Screenshots & PDF
```bash
agent-browser screenshot # Save to a temporary directory
agent-browser screenshot path.png # Save to a specific path
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
### Video recording
```bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
```
Recording creates a fresh context but preserves cookies/storage from your session. If no URL is provided, it
automatically returns to your current page. For smooth demos, explore first, then start recording.
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text (or -t)
agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u)
agent-browser wait --load networkidle # Wait for network idle (or -l)
agent-browser wait --fn "window.ready" # Wait for JS condition (or -f)
```
### Mouse control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find text "Sign In" click --exact # Exact match only
agent-browser find label "Email" fill "user@test.com"
agent-browser find placeholder "Search" type "query"
agent-browser find alt "Logo" click
agent-browser find title "Close" click
agent-browser find testid "submit-btn" click
agent-browser find first ".item" click
agent-browser find last ".item" click
agent-browser find nth 2 "a" hover
```
### Browser settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation)
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth (alias: auth)
agent-browser set media dark # Emulate color scheme
agent-browser set media light reduced-motion # Light mode + reduced motion
```
### Cookies & Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
```
### Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
### Tabs & Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab by index
agent-browser tab close # Close current tab
agent-browser tab close 2 # Close tab by index
agent-browser window new # New window
```
### Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
### Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
### JavaScript
```bash
agent-browser eval "document.title" # Run JavaScript
```
## Global options
```bash
agent-browser --session <name> ... # Isolated browser session
agent-browser --json ... # JSON output for parsing
agent-browser --headed ... # Show browser window (not headless)
agent-browser --full ... # Full page screenshot (-f)
agent-browser --cdp <port> ... # Connect via Chrome DevTools Protocol
agent-browser -p <provider> ... # Cloud browser provider (--provider)
agent-browser --proxy <url> ... # Use proxy server
agent-browser --headers <json> ... # HTTP headers scoped to URL's origin
agent-browser --executable-path <p> # Custom browser executable
agent-browser --extension <path> ... # Load browser extension (repeatable)
agent-browser --help # Show help (-h)
agent-browser --version # Show version (-V)
agent-browser <command> --help # Show detailed help for a command
```
### Proxy support
```bash
agent-browser --proxy http://proxy.com:8080 open example.com
agent-browser --proxy http://user:pass@proxy.com:8080 open example.com
agent-browser --proxy socks5://proxy.com:1080 open example.com
```
## Environment variables
```bash
AGENT_BROWSER_SESSION="mysession" # Default session name
AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path
AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths
AGENT_BROWSER_PROVIDER="your-cloud-browser-provider" # Cloud browser provider (select browseruse or browserbase)
AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port
AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location (for daemon.js)
```
## Example: Form submission
1. **Navigate**: `agent-browser open <url>`
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
3. **Interact**: Use refs to click, fill, select
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
@ -268,72 +27,504 @@ agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
## Example: Authentication with saved state
## Command Chaining
Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
```bash
# Login once
# Chain open + wait + snapshot in one call
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
# Chain multiple interactions
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3
# Navigate and capture
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```
**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
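The short-circuit behaviour of `&&` is what makes chaining safe: if one command exits non-zero, the rest of the chain is skipped. The sketch below shows this with a stand-in function rather than the real CLI; `step` is hypothetical and only simulates a command that succeeds or fails.

```bash
# Stand-in for an agent-browser command (hypothetical, for illustration only).
# It reports that it ran and fails when its argument is "fail".
step() {
  echo "ran: $1"
  [ "$1" != "fail" ]
}

# The third step is skipped because the second one exits non-zero.
step open && step fail && step screenshot || echo "chain stopped early"
```

In a real chain this means a failed `open` or `wait` prevents a later `screenshot` from capturing the wrong page.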
## Essential Commands
```bash
# Navigation
agent-browser open <url> # Navigate (aliases: goto, navigate)
agent-browser close # Close browser
# Snapshot
agent-browser snapshot -i # Interactive elements with refs (recommended)
agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, cursor:pointer)
agent-browser snapshot -s "#selector" # Scope to CSS selector
# Interaction (use @refs from snapshot)
agent-browser click @e1 # Click element
agent-browser click @e1 --new-tab # Click and open in new tab
agent-browser fill @e2 "text" # Clear and type text
agent-browser type @e2 "text" # Type without clearing
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1 # Check checkbox
agent-browser press Enter # Press key
agent-browser keyboard type "text" # Type at current focus (no selector)
agent-browser keyboard inserttext "text" # Insert without key events
agent-browser scroll down 500 # Scroll page
agent-browser scroll down 500 --selector "div.content" # Scroll within a specific container
# Get information
agent-browser get text @e1 # Get element text
agent-browser get url # Get current URL
agent-browser get title # Get page title
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page" # Wait for URL pattern
agent-browser wait 2000 # Wait milliseconds
# Downloads
agent-browser download @e1 ./file.pdf # Click element to trigger download
agent-browser wait --download ./output.zip # Wait for any download to complete
agent-browser --download-path ./downloads open <url> # Set default download directory
# Capture
agent-browser screenshot # Screenshot to temp dir
agent-browser screenshot --full # Full page screenshot
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
agent-browser pdf output.pdf # Save as PDF
# Diff (compare page states)
agent-browser diff snapshot # Compare current vs last snapshot
agent-browser diff snapshot --baseline before.txt # Compare current vs saved file
agent-browser diff screenshot --baseline before.png # Visual pixel diff
agent-browser diff url <url1> <url2> # Compare two pages
agent-browser diff url <url1> <url2> --wait-until networkidle # Custom wait strategy
agent-browser diff url <url1> <url2> --selector "#main" # Scope to element
```
## Common Patterns
### Form Submission
```bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
```
### Authentication with Auth Vault (Recommended)
```bash
# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY)
# Recommended: pipe password via stdin to avoid shell history exposure
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
# Login using saved profile (LLM never sees password)
agent-browser auth login github
# List/show/delete profiles
agent-browser auth list
agent-browser auth show github
agent-browser auth delete github
```
### Authentication with State Persistence
```bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
## Sessions (parallel browsers)
### Session Persistence
```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
# Auto-save/restore cookies and localStorage across browser restarts
agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close # State auto-saved to ~/.agent-browser/sessions/
# Next time, state is auto-loaded
agent-browser --session-name myapp open https://app.example.com/dashboard
# Encrypt state at rest
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
agent-browser --session-name secure open https://app.example.com
# Manage saved states
agent-browser state list
agent-browser state show myapp-default.json
agent-browser state clear myapp
agent-browser state clean --older-than 7
```
## JSON output (for parsing)
Add `--json` for machine-readable output:
### Data Extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5 # Get specific element text
agent-browser get text body > page.txt # Get all page text
# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
## Debugging
### Parallel Sessions
```bash
agent-browser --headed open example.com # Show browser window
agent-browser --cdp 9222 snapshot # Connect via CDP port
agent-browser connect 9222 # Alternative: connect command
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
agent-browser record start ./debug.webm # Record video from current page
agent-browser record stop # Save recording
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
agent-browser session list
```
## Deep-dive documentation
### Connect to Existing Chrome
For detailed patterns and best practices, see:
```bash
# Auto-discover running Chrome with remote debugging enabled
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect snapshot
| Reference | Description |
# Or with explicit CDP port
agent-browser --cdp 9222 snapshot
```
### Color Scheme (Dark Mode)
```bash
# Persistent dark mode via flag (applies to all pages and new tabs)
agent-browser --color-scheme dark open https://example.com
# Or via environment variable
AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com
# Or set during session (persists for subsequent commands)
agent-browser set media dark
```
### Visual Browser (Debugging)
```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1 # Highlight element
agent-browser record start demo.webm # Record session
agent-browser profiler start # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile (path optional)
```
Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode.
### Local Files (PDFs, HTML)
```bash
# Open local files with file:// URLs
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png
```
### iOS Simulator (Mobile Safari)
```bash
# List available iOS simulators
agent-browser device list
# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1 # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up # Mobile-specific gesture
# Take screenshot
agent-browser -p ios screenshot mobile.png
# Close session (shuts down simulator)
agent-browser -p ios close
```
**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)
**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.
## Security
All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output.
### Content Boundaries (Recommended for AI Agents)
Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content:
```bash
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
agent-browser snapshot
# Output:
# --- AGENT_BROWSER_PAGE_CONTENT nonce=<hex> origin=https://example.com ---
# [accessibility tree]
# --- END_AGENT_BROWSER_PAGE_CONTENT nonce=<hex> ---
```
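A script that post-processes this output could use the markers to isolate the untrusted portion. The sketch below assumes the exact marker layout shown above; `extract_page_content` is a hypothetical helper, not an agent-browser command.

```bash
# Hypothetical helper (not part of agent-browser): print only the lines between
# the page-content markers. The block ends only at a closing marker whose nonce
# matches the opening one.
extract_page_content() {
  awk '
    /^--- AGENT_BROWSER_PAGE_CONTENT nonce=/ {
      split($3, kv, "=")          # $3 is "nonce=<hex>"
      nonce = kv[2]; inside = 1
      next
    }
    /^--- END_AGENT_BROWSER_PAGE_CONTENT nonce=/ {
      split($3, kv, "=")
      if (kv[2] == nonce) { inside = 0; next }
      # A closing marker with a different nonce falls through and is
      # printed as ordinary page content.
    }
    inside { print }
  '
}
```

With boundaries enabled, this could be used as `agent-browser snapshot | extract_page_content`. A line inside the page that imitates the closing marker but carries a different nonce is kept as page content, which is presumably why the markers carry a nonce.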
### Domain Allowlist
Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. Include CDN domains your target pages depend on:
```bash
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
agent-browser open https://example.com # OK
agent-browser open https://malicious.com # Blocked
```
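The matching rule described above can be sketched as a small shell function. This illustrates the stated semantics only (a `*.example.com` entry also matches the bare `example.com`); it is not agent-browser's implementation, and `domain_allowed` is a hypothetical name.

```bash
# Hypothetical illustration of the allowlist rule; not agent-browser's code.
# Usage: domain_allowed <host> <comma-separated allowlist>
# The body runs in a subshell, so IFS and "set -f" do not leak to the caller.
domain_allowed() (
  host=$1
  IFS=,
  set -f                        # keep "*" in patterns from globbing against files
  for pattern in $2; do
    case $pattern in
      \*.*)
        base=${pattern#\*.}     # "*.example.com" -> "example.com"
        case $host in
          "$base" | *."$base") exit 0 ;;
        esac
        ;;
      *)
        if [ "$host" = "$pattern" ]; then exit 0; fi
        ;;
    esac
  done
  exit 1
)
```

For example, `domain_allowed api.example.com "*.example.com"` succeeds, while `domain_allowed notexample.com "*.example.com"` fails because the wildcard requires a dot before the base domain.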
### Action Policy
Use a policy file to gate destructive actions:
```bash
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```
Example `policy.json`:
```json
{"default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"]}
```
Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies.
### Output Limits
Prevent context flooding from large pages:
```bash
export AGENT_BROWSER_MAX_OUTPUT=50000
```
## Diffing (Verifying Changes)
Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session.
```bash
# Typical workflow: snapshot -> action -> diff
agent-browser snapshot -i # Take baseline snapshot
agent-browser click @e2 # Perform action
agent-browser diff snapshot # See what changed (auto-compares to last snapshot)
```
For visual regression testing or monitoring:
```bash
# Save a baseline screenshot, then compare later
agent-browser screenshot baseline.png
# ... time passes or changes are made ...
agent-browser diff screenshot --baseline baseline.png
# Compare staging vs production
agent-browser diff url https://staging.example.com https://prod.example.com --screenshot
```
`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage.
## Timeouts and Slow Pages
The default Playwright timeout is 25 seconds for local browsers. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). For slow websites or large pages, use explicit waits instead of relying on the default timeout:
```bash
# Wait for network activity to settle (best for slow pages)
agent-browser wait --load networkidle
# Wait for a specific element to appear
agent-browser wait "#content"
agent-browser wait @e1
# Wait for a specific URL pattern (useful after redirects)
agent-browser wait --url "**/dashboard"
# Wait for a JavaScript condition
agent-browser wait --fn "document.readyState === 'complete'"
# Wait a fixed duration (milliseconds) as a last resort
agent-browser wait 5000
```
When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait <selector>` or `wait @ref`.
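If explicit waits are not enough, the default timeout itself can be raised through the environment variable mentioned above. Because the value is in milliseconds, computing it from seconds avoids unit mistakes. The 60-second figure and the `timeout_seconds` variable below are arbitrary examples.

```bash
# Raise the default timeout to 60 seconds; the variable expects milliseconds.
timeout_seconds=60
export AGENT_BROWSER_DEFAULT_TIMEOUT=$((timeout_seconds * 1000))
```

Subsequent `agent-browser` commands started from the same shell inherit the exported value.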
## Session Management and Cleanup
When running multiple agents or automations concurrently, always use named sessions to avoid conflicts:
```bash
# Each agent gets its own isolated session
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com
# Check active sessions
agent-browser session list
```
Always close your browser session when done to avoid leaked processes:
```bash
agent-browser close # Close default session
agent-browser --session agent1 close # Close specific session
```
If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
## Ref Lifecycle (Important)
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
```bash
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # MUST re-snapshot
agent-browser click @e1 # Use new refs
```
## Annotated Screenshots (Vision Mode)
Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot.
```bash
agent-browser screenshot --annotate
# Output includes the image path and a legend:
# [1] @e1 button "Submit"
# [2] @e2 link "Home"
# [3] @e3 textbox "Email"
agent-browser click @e2 # Click using ref from annotated screenshot
```
Use annotated screenshots when:
- The page has unlabeled icon buttons or visual-only elements
- You need to verify visual layout or styling
- Canvas or chart elements are present (invisible to text snapshots)
- You need spatial reasoning about element positions
## Semantic Locators (Alternative to Refs)
When refs are unavailable or unreliable, use semantic locators:
```bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
```
## JavaScript Evaluation (eval)
Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues.
```bash
# Simple expressions work with regular quoting
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'
# Complex JS: use --stdin with heredoc (RECOMMENDED)
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(
Array.from(document.querySelectorAll("img"))
.filter(i => !i.alt)
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
)
EVALEOF
# Alternative: base64 encoding (avoids all shell escaping issues)
agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
```
**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely.
**Rules of thumb:**
- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine
- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
- Programmatic/generated scripts -> use `eval -b` with base64
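For the base64 route, it can help to build the payload in a variable and check that it decodes back to the original expression before handing it to `eval -b`. This is a minimal sketch that assumes a `base64` binary accepting `-d` for decoding (GNU coreutils does; older macOS releases use `-D`). The variable names are illustrative.

```bash
# Build the base64 payload separately so it can be inspected before use.
js='Array.from(document.querySelectorAll("a")).map(a => a.href)'

# Some base64 builds wrap long output; strip newlines to get a single token.
payload=$(printf '%s' "$js" | base64 | tr -d '\n')

# Round-trip check: decoding must give back the exact expression.
decoded=$(printf '%s' "$payload" | base64 -d)
if [ "$decoded" = "$js" ]; then
  echo "payload ok"
fi
```

Once the check passes, the payload can be passed on with `agent-browser eval -b "$payload"`.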
## Configuration File
Create `agent-browser.json` in the project root for persistent settings:
```json
{
"headed": true,
"proxy": "http://localhost:8080",
"profile": "./browser-data"
}
```
Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config <path>` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced.
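The precedence order can be pictured as "the last source that sets a value wins". The function below is a hypothetical illustration of that rule for a single setting; it is not how agent-browser resolves its configuration internally, and it does not model the merging of extensions.

```bash
# Hypothetical illustration of the precedence order; not agent-browser's code.
# Arguments run from lowest to highest priority:
#   resolve_setting <user-config> <project-config> <env-var> <cli-flag>
# An empty string means "not set by this source".
resolve_setting() (
  result=
  for candidate in "$@"; do
    # A later (higher-priority) source overrides earlier ones when it is set.
    if [ -n "$candidate" ]; then
      result=$candidate
    fi
  done
  printf '%s\n' "$result"
)
```

For instance, `resolve_setting "" "true" "" "false"` prints `false`, mirroring the case where `--headed false` on the command line overrides `"headed": true` in `./agent-browser.json`.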
## Deep-Dive Documentation
| Reference | When to Use |
|-----------|-------------|
| [references/commands.md](references/commands.md) | Full command reference with all options |
| [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting |
| [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping |
| [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse |
| [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation |
| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis |
| [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies |
## Ready-to-use templates
## Experimental: Native Mode
Executable workflow scripts for common patterns:
agent-browser has an experimental native Rust daemon that communicates with Chrome directly via CDP, bypassing Node.js and Playwright entirely. It is opt-in and not recommended for production use yet.
```bash
# Enable via flag
agent-browser --native open example.com
# Enable via environment variable (avoids passing --native every time)
export AGENT_BROWSER_NATIVE=1
agent-browser open example.com
```
The native daemon supports Chromium and Safari (via WebDriver). Firefox and WebKit are not yet supported. All core commands (navigate, snapshot, click, fill, screenshot, cookies, storage, tabs, eval, etc.) work identically in native mode. Use `agent-browser close` before switching between native and default mode within the same session.
## Browser Engine Selection
Use `--engine` to choose a local browser engine. The default is `chrome`.
```bash
# Use Lightpanda (fast headless browser, requires separate install)
agent-browser --engine lightpanda open example.com
# Via environment variable
export AGENT_BROWSER_ENGINE=lightpanda
agent-browser open example.com
# With custom binary path
agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com
```
Supported engines:
- `chrome` (default) -- Chrome/Chromium via CDP
- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome)
Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation.
## Ready-to-Use Templates
| Template | Description |
|----------|-------------|
@ -341,16 +532,8 @@ Executable workflow scripts for common patterns:
| [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state |
| [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots |
Usage:
```bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
```
## HTTPS Certificate Errors
For sites with self-signed or invalid certificates:
```bash
agent-browser open https://localhost:8443 --ignore-https-errors
```


@ -1,6 +1,20 @@
# Authentication Patterns
Patterns for handling login flows, session persistence, and authenticated browsing.
Login flows, session persistence, OAuth, 2FA, and authenticated browsing.
**Related**: [session-management.md](session-management.md) for state persistence details, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Login Flow](#basic-login-flow)
- [Saving Authentication State](#saving-authentication-state)
- [Restoring Authentication](#restoring-authentication)
- [OAuth / SSO Flows](#oauth--sso-flows)
- [Two-Factor Authentication](#two-factor-authentication)
- [HTTP Basic Auth](#http-basic-auth)
- [Cookie-Based Auth](#cookie-based-auth)
- [Token Refresh Handling](#token-refresh-handling)
- [Security Best Practices](#security-best-practices)
## Basic Login Flow


@ -1,13 +1,29 @@
# Proxy Support
Configure proxy servers for browser automation, useful for geo-testing, rate limiting avoidance, and corporate environments.
Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments.
**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Proxy Configuration](#basic-proxy-configuration)
- [Authenticated Proxy](#authenticated-proxy)
- [SOCKS Proxy](#socks-proxy)
- [Proxy Bypass](#proxy-bypass)
- [Common Use Cases](#common-use-cases)
- [Verifying Proxy Connection](#verifying-proxy-connection)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
## Basic Proxy Configuration
Set proxy via environment variable before starting:
Use the `--proxy` flag or set proxy via environment variable:
```bash
# HTTP proxy
# Via CLI flag
agent-browser --proxy "http://proxy.example.com:8080" open https://example.com
# Via environment variable
export HTTP_PROXY="http://proxy.example.com:8080"
agent-browser open https://example.com
@ -45,10 +61,13 @@ agent-browser open https://example.com
## Proxy Bypass
Skip proxy for specific domains:
Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`:
```bash
# Bypass proxy for local addresses
# Via CLI flag
agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com
# Via environment variable
export NO_PROXY="localhost,127.0.0.1,.internal.company.com"
agent-browser open https://internal.company.com # Direct connection
agent-browser open https://external.com # Via proxy


@ -1,6 +1,18 @@
# Session Management
Run multiple isolated browser sessions concurrently with state persistence.
Multiple isolated browser sessions with state persistence and concurrent browsing.
**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Named Sessions](#named-sessions)
- [Session Isolation Properties](#session-isolation-properties)
- [Session State Persistence](#session-state-persistence)
- [Common Patterns](#common-patterns)
- [Default Session](#default-session)
- [Session Cleanup](#session-cleanup)
- [Best Practices](#best-practices)
## Named Sessions


@ -1,21 +1,29 @@
# Snapshot + Refs Workflow
# Snapshot and Refs
The core innovation of agent-browser: compact element references that reduce context usage dramatically for AI agents.
Compact element references that reduce context usage dramatically for AI agents.
## How It Works
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
### The Problem
Traditional browser automation sends full DOM to AI agents:
## Contents
- [How Refs Work](#how-refs-work)
- [Snapshot Command](#the-snapshot-command)
- [Using Refs](#using-refs)
- [Ref Lifecycle](#ref-lifecycle)
- [Best Practices](#best-practices)
- [Ref Notation Details](#ref-notation-details)
- [Troubleshooting](#troubleshooting)
## How Refs Work
Traditional approach:
```
Full DOM/HTML sent → AI parses → Generates CSS selector → Executes action
~3000-5000 tokens per interaction
Full DOM/HTML → AI parses → CSS selector → Action (~3000-5000 tokens)
```
### The Solution
agent-browser uses compact snapshots with refs:
agent-browser approach:
```
Compact snapshot → @refs assigned → Direct ref interaction
~200-400 tokens per interaction
Compact snapshot → @refs assigned → Direct interaction (~200-400 tokens)
```
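As a rough worked comparison of the two per-interaction figures above, the reduction factor can be computed directly. This is a back-of-the-envelope sketch that uses only the approximate token ranges quoted here; the function name `ref_savings` is hypothetical and not part of agent-browser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare the approximate token ranges quoted above.
# Conservative case: cheapest full-DOM run vs. most expensive snapshot.
# Optimistic case: most expensive full-DOM run vs. cheapest snapshot.
ref_savings() {
  awk 'BEGIN { printf "%.1fx to %.1fx\n", 3000 / 400, 5000 / 200 }'
}

ref_savings   # prints: 7.5x to 25.0x
```

Under these assumed ranges, refs cut per-interaction context by roughly one order of magnitude.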
## The Snapshot Command
@ -166,8 +174,8 @@ agent-browser snapshot -i
### Element Not Visible in Snapshot
```bash
# Scroll to reveal element
agent-browser scroll --bottom
# Scroll down to reveal element
agent-browser scroll down 1000
agent-browser snapshot -i
# Or wait for dynamic content


@ -1,6 +1,17 @@
# Video Recording
Capture browser automation sessions as video for debugging, documentation, or verification.
Capture browser automation as video for debugging, documentation, or verification.
**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
## Contents
- [Basic Recording](#basic-recording)
- [Recording Commands](#recording-commands)
- [Use Cases](#use-cases)
- [Best Practices](#best-practices)
- [Output Format](#output-format)
- [Limitations](#limitations)
## Basic Recording


@ -1,67 +1,81 @@
#!/bin/bash
# Template: Authenticated Session Workflow
# Login once, save state, reuse for subsequent runs
# Purpose: Login once, save state, reuse for subsequent runs
# Usage: ./authenticated-session.sh <login-url> [state-file]
#
# Usage:
# ./authenticated-session.sh <login-url> [state-file]
# RECOMMENDED: Use the auth vault instead of this template:
# echo "<pass>" | agent-browser auth save myapp --url <login-url> --username <user> --password-stdin
# agent-browser auth login myapp
# The auth vault stores credentials securely and the LLM never sees passwords.
#
# Setup:
# 1. Run once to see your form structure
# 2. Note the @refs for your fields
# 3. Uncomment LOGIN FLOW section and update refs
# Environment variables:
# APP_USERNAME - Login username/email
# APP_PASSWORD - Login password
#
# Two modes:
# 1. Discovery mode (default): Shows form structure so you can identify refs
# 2. Login mode: Performs actual login after you update the refs
#
# Setup steps:
# 1. Run once to see form structure (discovery mode)
# 2. Update refs in LOGIN FLOW section below
# 3. Set APP_USERNAME and APP_PASSWORD
# 4. Delete the DISCOVERY section
set -euo pipefail
LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}"
STATE_FILE="${2:-./auth-state.json}"
echo "Authentication workflow for: $LOGIN_URL"
echo "Authentication workflow: $LOGIN_URL"
# ══════════════════════════════════════════════════════════════
# SAVED STATE: Skip login if we have valid saved state
# ══════════════════════════════════════════════════════════════
# ================================================================
# SAVED STATE: Skip login if valid saved state exists
# ================================================================
if [[ -f "$STATE_FILE" ]]; then
echo "Loading saved authentication state..."
agent-browser state load "$STATE_FILE"
agent-browser open "$LOGIN_URL"
agent-browser wait --load networkidle
echo "Loading saved state from $STATE_FILE..."
if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then
agent-browser wait --load networkidle
CURRENT_URL=$(agent-browser get url)
if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
echo "Session restored successfully!"
agent-browser snapshot -i
exit 0
CURRENT_URL=$(agent-browser get url)
if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
echo "Session restored successfully"
agent-browser snapshot -i
exit 0
fi
echo "Session expired, performing fresh login..."
agent-browser close 2>/dev/null || true
else
echo "Failed to load state, re-authenticating..."
fi
echo "Session expired, performing fresh login..."
rm -f "$STATE_FILE"
fi
# ══════════════════════════════════════════════════════════════
# DISCOVERY MODE: Show form structure (remove after setup)
# ══════════════════════════════════════════════════════════════
# ================================================================
# DISCOVERY MODE: Shows form structure (delete after setup)
# ================================================================
echo "Opening login page..."
agent-browser open "$LOGIN_URL"
agent-browser wait --load networkidle
echo ""
echo "┌─────────────────────────────────────────────────────────┐"
echo "│ LOGIN FORM STRUCTURE │"
echo "├─────────────────────────────────────────────────────────┤"
echo "Login form structure:"
echo "---"
agent-browser snapshot -i
echo "└─────────────────────────────────────────────────────────┘"
echo "---"
echo ""
echo "Next steps:"
echo " 1. Note refs: @e? = username, @e? = password, @e? = submit"
echo " 2. Uncomment LOGIN FLOW section below"
echo " 3. Replace @e1, @e2, @e3 with your refs"
echo " 1. Note the refs: username=@e?, password=@e?, submit=@e?"
echo " 2. Update the LOGIN FLOW section below with your refs"
echo " 3. Set: export APP_USERNAME='...' APP_PASSWORD='...'"
echo " 4. Delete this DISCOVERY MODE section"
echo ""
agent-browser close
exit 0
# ══════════════════════════════════════════════════════════════
# ================================================================
# LOGIN FLOW: Uncomment and customize after discovery
# ══════════════════════════════════════════════════════════════
# ================================================================
# : "${APP_USERNAME:?Set APP_USERNAME environment variable}"
# : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}"
#
@ -78,14 +92,14 @@ exit 0
# # Verify login succeeded
# FINAL_URL=$(agent-browser get url)
# if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then
# echo "ERROR: Login failed - still on login page"
# echo "Login failed - still on login page"
# agent-browser screenshot /tmp/login-failed.png
# agent-browser close
# exit 1
# fi
#
# # Save state for future runs
# echo "Saving authentication state to: $STATE_FILE"
# echo "Saving state to $STATE_FILE"
# agent-browser state save "$STATE_FILE"
# echo "Login successful!"
# echo "Login successful"
# agent-browser snapshot -i


@ -1,68 +1,69 @@
#!/bin/bash
# Template: Content Capture Workflow
# Extract content from web pages with optional authentication
# Purpose: Extract content from web pages (text, screenshots, PDF)
# Usage: ./capture-workflow.sh <url> [output-dir]
#
# Outputs:
# - page-full.png: Full page screenshot
# - page-structure.txt: Page element structure with refs
# - page-text.txt: All text content
# - page.pdf: PDF version
#
# Optional: Load auth state for protected pages
set -euo pipefail
TARGET_URL="${1:?Usage: $0 <url> [output-dir]}"
OUTPUT_DIR="${2:-.}"
echo "Capturing content from: $TARGET_URL"
echo "Capturing: $TARGET_URL"
mkdir -p "$OUTPUT_DIR"
# Optional: Load authentication state if needed
# Optional: Load authentication state
# if [[ -f "./auth-state.json" ]]; then
# echo "Loading authentication state..."
# agent-browser state load "./auth-state.json"
# fi
# Navigate to target page
# Navigate to target
agent-browser open "$TARGET_URL"
agent-browser wait --load networkidle
# Get page metadata
echo "Page title: $(agent-browser get title)"
echo "Page URL: $(agent-browser get url)"
# Get metadata
TITLE=$(agent-browser get title)
URL=$(agent-browser get url)
echo "Title: $TITLE"
echo "URL: $URL"
# Capture full page screenshot
agent-browser screenshot --full "$OUTPUT_DIR/page-full.png"
echo "Screenshot saved: $OUTPUT_DIR/page-full.png"
echo "Saved: $OUTPUT_DIR/page-full.png"
# Get page structure
# Get page structure with refs
agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt"
echo "Structure saved: $OUTPUT_DIR/page-structure.txt"
echo "Saved: $OUTPUT_DIR/page-structure.txt"
# Extract main content
# Adjust selector based on target site structure
# agent-browser get text @e1 > "$OUTPUT_DIR/main-content.txt"
# Extract specific elements (uncomment as needed)
# agent-browser get text "article" > "$OUTPUT_DIR/article.txt"
# agent-browser get text "main" > "$OUTPUT_DIR/main.txt"
# agent-browser get text ".content" > "$OUTPUT_DIR/content.txt"
# Get full page text
# Extract all text content
agent-browser get text body > "$OUTPUT_DIR/page-text.txt"
echo "Text content saved: $OUTPUT_DIR/page-text.txt"
echo "Saved: $OUTPUT_DIR/page-text.txt"
# Optional: Save as PDF
# Save as PDF
agent-browser pdf "$OUTPUT_DIR/page.pdf"
echo "PDF saved: $OUTPUT_DIR/page.pdf"
echo "Saved: $OUTPUT_DIR/page.pdf"
# Optional: Capture with scrolling for infinite scroll pages
# scroll_and_capture() {
# local count=0
# while [[ $count -lt 5 ]]; do
# agent-browser scroll down 1000
# agent-browser wait 1000
# ((count++))
# done
# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"
# }
# scroll_and_capture
# Optional: Extract specific elements using refs from structure
# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt"
# Optional: Handle infinite scroll pages
# for i in {1..5}; do
# agent-browser scroll down 1000
# agent-browser wait 1000
# done
# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"
# Cleanup
agent-browser close
echo ""
echo "Capture complete! Files saved to: $OUTPUT_DIR"
echo "Capture complete:"
ls -la "$OUTPUT_DIR"


@ -1,64 +1,62 @@
#!/bin/bash
# Template: Form Automation Workflow
# Fills and submits web forms with validation
# Purpose: Fill and submit web forms with validation
# Usage: ./form-automation.sh <form-url>
#
# This template demonstrates the snapshot-interact-verify pattern:
# 1. Navigate to form
# 2. Snapshot to get element refs
# 3. Fill fields using refs
# 4. Submit and verify result
#
# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output
set -euo pipefail
FORM_URL="${1:?Usage: $0 <form-url>}"
echo "Automating form at: $FORM_URL"
echo "Form automation: $FORM_URL"
# Navigate to form page
# Step 1: Navigate to form
agent-browser open "$FORM_URL"
agent-browser wait --load networkidle
# Get interactive snapshot to identify form fields
echo "Analyzing form structure..."
# Step 2: Snapshot to discover form elements
echo ""
echo "Form structure:"
agent-browser snapshot -i
# Example: Fill common form fields
# Uncomment and modify refs based on snapshot output
# Step 3: Fill form fields (customize these refs based on snapshot output)
#
# Common field types:
# agent-browser fill @e1 "John Doe" # Text input
# agent-browser fill @e2 "user@example.com" # Email input
# agent-browser fill @e3 "SecureP@ss123" # Password input
# agent-browser select @e4 "Option Value" # Dropdown
# agent-browser check @e5 # Checkbox
# agent-browser click @e6 # Radio button
# agent-browser fill @e7 "Multi-line text" # Textarea
# agent-browser upload @e8 /path/to/file.pdf # File upload
#
# Uncomment and modify:
# agent-browser fill @e1 "Test User"
# agent-browser fill @e2 "test@example.com"
# agent-browser click @e3 # Submit button
# Text inputs
# agent-browser fill @e1 "John Doe" # Name field
# agent-browser fill @e2 "user@example.com" # Email field
# agent-browser fill @e3 "+1-555-123-4567" # Phone field
# Password fields
# agent-browser fill @e4 "SecureP@ssw0rd!"
# Dropdowns
# agent-browser select @e5 "Option Value"
# Checkboxes
# agent-browser check @e6 # Check
# agent-browser uncheck @e7 # Uncheck
# Radio buttons
# agent-browser click @e8 # Select radio option
# Text areas
# agent-browser fill @e9 "Multi-line text content here"
# File uploads
# agent-browser upload @e10 /path/to/file.pdf
# Submit form
# agent-browser click @e11 # Submit button
# Wait for response
# Step 4: Wait for submission
# agent-browser wait --load networkidle
# agent-browser wait --url "**/success" # Or wait for redirect
# agent-browser wait --url "**/success" # Or wait for redirect
# Verify submission
echo "Form submission result:"
# Step 5: Verify result
echo ""
echo "Result:"
agent-browser get url
agent-browser snapshot -i
# Take screenshot of result
# Optional: Capture evidence
agent-browser screenshot /tmp/form-result.png
echo "Screenshot saved: /tmp/form-result.png"
# Cleanup
agent-browser close
echo "Form automation complete"
echo "Done"


@ -43,7 +43,7 @@ Manually verify the install script works in a fresh environment:
```bash
docker run --rm alpine:latest sh -c "
apk add --no-cache curl ca-certificates libstdc++ libgcc bash &&
curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh &&
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh &&
sandbox-agent --version
"
```


@ -9,11 +9,15 @@ build/
# Cache
.cache/
.turbo/
**/.turbo/
*.tsbuildinfo
.pnpm-store/
coverage/
# Environment
.env
.env.*
.foundry/
# IDE
.idea/
@ -24,3 +28,7 @@ build/
# Git
.git/
# Tests
**/test/
**/tests/

34
.env.development.example Normal file

@ -0,0 +1,34 @@
# Foundry local development environment.
# Copy ~/misc/the-foundry.env to .env in the repo root to populate secrets.
# .env is gitignored — never commit it. The source of truth is ~/misc/the-foundry.env.
#
# Docker Compose (just foundry-dev) and the justfile (set dotenv-load := true)
# both read .env automatically.
APP_URL=http://localhost:4173
BETTER_AUTH_URL=http://localhost:4173
BETTER_AUTH_SECRET=sandbox-agent-foundry-development-only-change-me
GITHUB_REDIRECT_URI=http://localhost:4173/v1/auth/callback/github
# Fill these in when enabling live GitHub OAuth.
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
# Fill these in when enabling GitHub App-backed org installation and repo import.
GITHUB_APP_ID=
GITHUB_APP_CLIENT_ID=
GITHUB_APP_CLIENT_SECRET=
# Store PEM material as a quoted single-line value with \n escapes.
GITHUB_APP_PRIVATE_KEY=
# Webhook secret for verifying GitHub webhook payloads.
# Use smee.io for local development: https://smee.io/new
GITHUB_WEBHOOK_SECRET=
# Required for local GitHub webhook forwarding in compose.dev.
SMEE_URL=
SMEE_TARGET=http://backend:7741/v1/webhooks/github
# Fill these in when enabling live Stripe billing.
STRIPE_SECRET_KEY=
STRIPE_PUBLISHABLE_KEY=
STRIPE_WEBHOOK_SECRET=
STRIPE_PRICE_TEAM=
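
The PEM note above for `GITHUB_APP_PRIVATE_KEY` can be handled with a small conversion step. This is a minimal sketch, assuming the key lives in a local PEM file; the function name `pem_to_env_value` and the example path are hypothetical and not part of this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: flatten a multi-line PEM file into one line,
# replacing each real line break with the two characters "\n".
# Trailing carriage returns (Windows line endings) are dropped first.
pem_to_env_value() {
  awk '{ sub(/\r$/, ""); printf "%s\\n", $0 }' "$1"
}

# Example (path is illustrative): emit a quoted line to paste into .env
# printf 'GITHUB_APP_PRIVATE_KEY="%s"\n' "$(pem_to_env_value ./github-app.pem)"
```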

Binary file not shown (size before: 1.1 MiB, after: 1.7 MiB).


@ -11,6 +11,8 @@ jobs:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt, clippy
@ -21,5 +23,43 @@ jobs:
node-version: 20
cache: pnpm
- run: pnpm install
- name: Run formatter hooks
shell: bash
run: |
if [ "${{ github.event_name }}" = "pull_request" ]; then
git fetch origin "${{ github.base_ref }}" --depth=1
diff_range="origin/${{ github.base_ref }}...HEAD"
elif [ "${{ github.event_name }}" = "push" ] && [ "${{ github.event.before }}" != "0000000000000000000000000000000000000000" ]; then
diff_range="${{ github.event.before }}...${{ github.sha }}"
else
diff_range="HEAD^...HEAD"
fi
mapfile -t changed_files < <(
git diff --name-only --diff-filter=ACMR "$diff_range" \
| grep -E '\.(cjs|cts|js|jsx|json|jsonc|mjs|mts|rs|ts|tsx)$' \
|| true
)
if [ ${#changed_files[@]} -eq 0 ]; then
echo "No formatter-managed files changed."
exit 0
fi
args=()
for file in "${changed_files[@]}"; do
args+=(--file "$file")
done
pnpm exec lefthook run pre-commit --no-stage-fixed --fail-on-changes "${args[@]}"
- run: npm install -g tsx
- name: Run checks
run: ./scripts/release/main.ts --version 0.0.0 --check
run: ./scripts/release/main.ts --version 0.0.0 --only-steps run-ci-checks
- name: Run ACP v1 server tests
run: |
cargo test -p sandbox-agent-agent-management
cargo test -p sandbox-agent --test v1_api
cargo test -p sandbox-agent --test v1_agent_process_matrix
cargo test -p sandbox-agent --lib
- name: Run SDK tests
run: pnpm --dir sdks/typescript test
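
The filtering logic in the "Run formatter hooks" step above can be tried locally without Git or lefthook. The following is a sketch only: the function name `build_file_args` is hypothetical, and it reads paths from standard input instead of `git diff`, but it uses the same extension list and builds the same `--file <path>` pairs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep only formatter-managed paths from stdin and
# turn each one into a "--file <path>" pair stored in the global `args` array.
build_file_args() {
  local file
  local re='\.(cjs|cts|js|jsx|json|jsonc|mjs|mts|rs|ts|tsx)$'
  args=()
  while IFS= read -r file; do
    [[ "$file" =~ $re ]] || continue
    args+=(--file "$file")
  done
}

# Example input; in CI the paths would come from `git diff --name-only`.
build_file_args < <(printf '%s\n' src/lib.rs README.md sdks/index.ts)
printf '%s\n' "${args[*]}"   # prints: --file src/lib.rs --file sdks/index.ts
```

Keeping the paths in an array rather than a single string means file names containing spaces still reach the hook as one argument each.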


@ -147,8 +147,8 @@ jobs:
sudo apt-get install -y unzip curl
# Install AWS CLI
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscli.zip"
unzip awscli.zip
sudo ./aws/install --update
COMMIT_SHA_SHORT="${GITHUB_SHA::7}"
@ -180,10 +180,20 @@ jobs:
include:
- platform: linux/arm64
runner: depot-ubuntu-24.04-arm-8
arch_suffix: -arm64
tag_suffix: -arm64
dockerfile: docker/runtime/Dockerfile
- platform: linux/amd64
runner: depot-ubuntu-24.04-8
arch_suffix: -amd64
tag_suffix: -amd64
dockerfile: docker/runtime/Dockerfile
- platform: linux/arm64
runner: depot-ubuntu-24.04-arm-8
tag_suffix: -full-arm64
dockerfile: docker/runtime/Dockerfile.full
- platform: linux/amd64
runner: depot-ubuntu-24.04-8
tag_suffix: -full-amd64
dockerfile: docker/runtime/Dockerfile.full
runs-on: ${{ matrix.runner }}
steps:
- uses: actions/checkout@v4
@ -205,8 +215,8 @@ jobs:
with:
context: .
push: true
tags: rivetdev/sandbox-agent:${{ steps.vars.outputs.sha_short }}${{ matrix.arch_suffix }}
file: docker/runtime/Dockerfile
tags: rivetdev/sandbox-agent:${{ steps.vars.outputs.sha_short }}${{ matrix.tag_suffix }}
file: ${{ matrix.dockerfile }}
platforms: ${{ matrix.platform }}
build-args: |
TARGETARCH=${{ contains(matrix.platform, 'arm64') && 'arm64' || 'amd64' }}

10
.gitignore vendored

@ -15,6 +15,9 @@ yarn.lock
.astro/
*.tsbuildinfo
.turbo/
**/.turbo/
.pnpm-store/
coverage/
# Environment
.env
@ -47,6 +50,13 @@ Cargo.lock
# Example temp files
.tmp-upload/
*.db
.foundry/
# CLI binaries (downloaded during npm publish)
sdks/cli/platforms/*/bin/
# Foundry desktop app build artifacts
foundry/packages/desktop/frontend-dist/
foundry/packages/desktop/src-tauri/sidecars/
.context/


@ -1,10 +1,8 @@
{
"mcpServers": {
"everything": {
"args": [
"@modelcontextprotocol/server-everything"
],
"args": ["@modelcontextprotocol/server-everything"],
"command": "npx"
}
}
}
}

1
.npmrc Normal file

@ -0,0 +1 @@
auto-install-peers=false

186
CLAUDE.md

@ -1,136 +1,80 @@
# Instructions
## SDK Modes
## Naming and Ownership
There are two ways to work with the SDKs:
- This repository/product is **Sandbox Agent**.
- **Gigacode** is a separate user-facing UI/client, not the server product name.
- Gigacode integrates with Sandbox Agent via the OpenCode-compatible surface (`/opencode/*`) when that compatibility layer is enabled.
- Canonical extension namespace/domain string is `sandboxagent.dev` (no hyphen).
- Canonical custom ACP extension method prefix is `_sandboxagent/...` (no hyphen).
- **Embedded**: Spawns the `sandbox-agent` server as a subprocess on a unique port and communicates with it locally. Useful for local development or when running the SDK and agent in the same environment.
- **Server**: Connects to a remotely running `sandbox-agent` server. The server is typically running inside a sandbox (e.g., Docker, E2B, Daytona, Vercel Sandboxes) and the SDK connects to it over HTTP.
## Docs Terminology
## Agent Schemas
- Never mention "ACP" in user-facing docs (`docs/**/*.mdx`) except in docs that are specifically about ACP itself (e.g. `docs/acp-http-client.mdx`).
- Never expose underlying protocol method names (e.g. `session/request_permission`, `session/create`, `_sandboxagent/session/detach`) in non-ACP docs. Describe the behavior in user-facing terms instead.
- Do not describe the underlying protocol implementation in docs. Only document the SDK surface (methods, types, options). ACP protocol details belong exclusively in ACP-specific pages.
- Do not use em dashes (`—`) in docs. Use commas, periods, or parentheses instead.
Agent schemas (Claude Code, Codex, OpenCode, Amp) are available for reference in `resources/agent-schemas/artifacts/json-schema/`.
### Docs Source Of Truth (HTTP/CLI)
Extraction methods:
- **Claude**: Uses `claude --output-format json --json-schema` CLI command
- **Codex**: Uses `codex app-server generate-json-schema` CLI command
- **OpenCode**: Fetches from GitHub OpenAPI spec
- **Amp**: Scrapes from `https://ampcode.com/manual/appendix?preview#message-schema`
- For HTTP/CLI docs/examples, source of truth is:
- `server/packages/sandbox-agent/src/router.rs`
- `server/packages/sandbox-agent/src/cli.rs`
- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy session REST APIs).
All extractors have fallback schemas for when CLI/URL is unavailable.
## Change Tracking
Research on how different agents operate (CLI flags, streaming formats, HITL patterns, etc.) is in `research/agents/`. When adding or making changes to agent docs, follow the same structure as existing files.
- If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
- Keep CLI subcommands and HTTP endpoints in sync.
- Update `docs/cli.mdx` when CLI behavior changes.
- Regenerate `docs/openapi.json` when HTTP contracts change.
- Keep `docs/inspector.mdx` and `docs/sdks/typescript.mdx` aligned with implementation.
- Append blockers/decisions to `research/acp/friction.md` during ACP work.
- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.
Universal schema guidance:
- The universal schema should cover the full feature set of all agents.
- Conversions must be best-effort overlap without being lossy; preserve raw payloads when needed.
- **The mock agent acts as the reference implementation** for correct event behavior. Real agents should use synthetic events to match the mock agent's event patterns (e.g., emitting both daemon synthetic and agent native `session.started` events, proper `item.started` → `item.delta` → `item.completed` sequences).
## Docker Test Image
## Spec Tracking
- Docker-backed Rust and TypeScript tests build `docker/test-agent/Dockerfile` directly in-process and cache the image tag only in memory (`OnceLock` in Rust, module-level variable in TypeScript).
- Do not add cross-process image-build scripts unless there is a concrete need for them.
- Keep CLI subcommands in sync with every HTTP endpoint.
- Update `CLAUDE.md` to keep CLI endpoints in sync with HTTP API changes.
- When adding or modifying CLI commands, update `docs/cli.mdx` to reflect the changes.
- When changing the HTTP API, update the TypeScript SDK and CLI together.
- Do not make breaking changes to API endpoints.
- When changing API routes, ensure the HTTP/SSE test suite has full coverage of every route.
- When agent schema changes, ensure API tests cover the new schema and event shapes end-to-end.
- When the universal schema changes, update mock-agent events to cover the new fields or event types.
- Update `docs/conversion.md` whenever agent-native schema terms, synthetic events, identifier mappings, or conversion logic change.
- Never use synthetic data or mocked responses in tests.
- Never manually write agent types; always use generated types in `resources/agent-schemas/`. If types are broken, fix the generated types.
- The universal schema must provide consistent behavior across providers; avoid requiring frontend/client logic to special-case agents.
- The UI must reflect every field in AgentCapabilities (feature coverage); keep it in sync with `docs/session-transcript-schema.mdx` and `agent_capabilities_for`.
- When parsing agent data, if something is unexpected or does not match the schema, bail out and surface the error rather than trying to continue with partial parsing.
- When defining the universal schema, choose the option most compatible with native agent APIs, and add synthetics to fill gaps for other agents.
- Use `docs/session-transcript-schema.mdx` as the source of truth for schema terminology and keep it updated alongside schema changes.
- On parse failures, emit an `agent.unparsed` event (source=daemon, synthetic=true) and treat it as a test failure. Preserve raw payloads when `include_raw=true`.
- Track subagent support in `docs/conversion.md`. For now, normalize subagent activity into normal message/tool flow, but revisit explicit subagent modeling later.
- Keep the FAQ in `README.md` and `frontend/packages/website/src/components/FAQ.tsx` in sync. When adding or modifying FAQ entries, update both files.
- Update `research/wip-agent-support.md` as agent support changes are implemented.
## Common Software Sync
### OpenAPI / utoipa requirements
- These three files must stay in sync:
- `docs/common-software.mdx` (user-facing documentation)
- `docker/test-common-software/Dockerfile` (packages installed in the test image)
- `server/packages/sandbox-agent/tests/common_software.rs` (test assertions)
- When adding or removing software from `docs/common-software.mdx`, also add/remove the corresponding `apt-get install` line in the Dockerfile and add/remove the test in `common_software.rs`.
- Run `cargo test -p sandbox-agent --test common_software` to verify.
Every `#[utoipa::path(...)]` handler function must have a doc comment where:
- The **first line** becomes the OpenAPI `summary` (short human-readable title, e.g. `"List Agents"`). This is used as the sidebar label and page heading in the docs site.
- The **remaining lines** become the OpenAPI `description` (one-sentence explanation of what the endpoint does).
- Every `responses(...)` entry must have a `description` (no empty descriptions).
## Install Version References
When adding or modifying endpoints, regenerate `docs/openapi.json` and verify titles render correctly in the docs site.
### CLI ⇄ HTTP endpoint map (keep in sync)
- `sandbox-agent api agents list` ↔ `GET /v1/agents`
- `sandbox-agent api agents install` ↔ `POST /v1/agents/{agent}/install`
- `sandbox-agent api agents modes` ↔ `GET /v1/agents/{agent}/modes`
- `sandbox-agent api agents models` ↔ `GET /v1/agents/{agent}/models`
- `sandbox-agent api sessions list` ↔ `GET /v1/sessions`
- `sandbox-agent api sessions create` ↔ `POST /v1/sessions/{sessionId}`
- `sandbox-agent api sessions send-message` ↔ `POST /v1/sessions/{sessionId}/messages`
- `sandbox-agent api sessions send-message-stream` ↔ `POST /v1/sessions/{sessionId}/messages/stream`
- `sandbox-agent api sessions terminate` ↔ `POST /v1/sessions/{sessionId}/terminate`
- `sandbox-agent api sessions events` / `get-messages` ↔ `GET /v1/sessions/{sessionId}/events`
- `sandbox-agent api sessions events-sse` ↔ `GET /v1/sessions/{sessionId}/events/sse`
- `sandbox-agent api sessions reply-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reply`
- `sandbox-agent api sessions reject-question` ↔ `POST /v1/sessions/{sessionId}/questions/{questionId}/reject`
- `sandbox-agent api sessions reply-permission` ↔ `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply`
- `sandbox-agent api fs entries` ↔ `GET /v1/fs/entries`
- `sandbox-agent api fs read` ↔ `GET /v1/fs/file`
- `sandbox-agent api fs write` ↔ `PUT /v1/fs/file`
- `sandbox-agent api fs delete` ↔ `DELETE /v1/fs/entry`
- `sandbox-agent api fs mkdir` ↔ `POST /v1/fs/mkdir`
- `sandbox-agent api fs move` ↔ `POST /v1/fs/move`
- `sandbox-agent api fs stat` ↔ `GET /v1/fs/stat`
- `sandbox-agent api fs upload-batch` ↔ `POST /v1/fs/upload-batch`
## OpenCode Compatibility Layer
`sandbox-agent opencode` starts a sandbox-agent server and attaches an OpenCode session (uses `/opencode`).
### Session ownership
Sessions are stored **only** in sandbox-agent's v1 `SessionManager` — they are never sent to or stored in the native OpenCode server. The OpenCode TUI reads sessions via `GET /session` which the compat layer serves from the v1 store. The native OpenCode process has no knowledge of sessions.
### Proxy elimination strategy
The `/opencode` compat layer (`opencode_compat.rs`) historically proxied many endpoints to the native OpenCode server via `proxy_native_opencode()`. The goal is to **eliminate proxying** by implementing each endpoint natively using the v1 `SessionManager` as the single source of truth.
**Already de-proxied** (use v1 SessionManager directly):
- `GET /session` → `oc_session_list` reads from `SessionManager::list_sessions()`
- `GET /session/{id}` → `oc_session_get` reads from `SessionManager::get_session_info()`
- `GET /session/status` → `oc_session_status` derives busy/idle from v1 session `ended` flag
- `POST /tui/open-sessions` — returns `true` directly (TUI fetches sessions from `GET /session`)
- `POST /tui/select-session` — emits `tui.session.select` event via the OpenCode event broadcaster
**Still proxied** (none of these reference session IDs or the session list — all are session-agnostic):
- `GET /command` — command list
- `GET /config`, `PATCH /config` — project config read/write
- `GET /global/config`, `PATCH /global/config` — global config read/write
- `GET /tui/control/next`, `POST /tui/control/response` — TUI control loop
- `POST /tui/append-prompt`, `/tui/submit-prompt`, `/tui/clear-prompt` — prompt management
- `POST /tui/open-help`, `/tui/open-themes`, `/tui/open-models` — TUI navigation
- `POST /tui/execute-command`, `/tui/show-toast`, `/tui/publish` — TUI actions
When converting a proxied endpoint: add needed fields to `SessionState`/`SessionInfo` in `router.rs`, implement the logic natively in `opencode_compat.rs`, and use `session_info_to_opencode_value()` to format responses.
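
To make the de-proxied versus proxied split above concrete, the sketch below classifies a request path (relative to the `/opencode` prefix) as natively served or still proxied. It is written in TypeScript purely for illustration; the real compat layer is the Rust file `opencode_compat.rs`, and `NATIVE_ROUTES` and `isServedNatively` are hypothetical names rather than identifiers from the codebase.

```typescript
// Illustrative only: mirrors the "already de-proxied" list above.
// NATIVE_ROUTES and isServedNatively are hypothetical names for this sketch.
type Route = { method: string; pattern: RegExp };

const NATIVE_ROUTES: Route[] = [
  { method: "GET", pattern: /^\/session$/ },
  { method: "GET", pattern: /^\/session\/status$/ },
  { method: "GET", pattern: /^\/session\/[^/]+$/ }, // GET /session/{id}
  { method: "POST", pattern: /^\/tui\/open-sessions$/ },
  { method: "POST", pattern: /^\/tui\/select-session$/ },
];

/** Returns true when the route is answered from the v1 store instead of being proxied. */
function isServedNatively(method: string, path: string): boolean {
  const upper = method.toUpperCase();
  return NATIVE_ROUTES.some((route) => route.method === upper && route.pattern.test(path));
}

console.log(isServedNatively("GET", "/session"));
console.log(isServedNatively("PATCH", "/config"));
```

Under this model, converting another endpoint amounts to moving it from the proxied set into the native table once its handler no longer needs the native OpenCode server.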
## Post-Release Testing
After cutting a release, verify the release works correctly. Run `/project:post-release-testing` to execute the testing agent.
## OpenCode Compatibility Tests
The OpenCode compatibility suite lives at `server/packages/sandbox-agent/tests/opencode-compat` and validates the `@opencode-ai/sdk` against the `/opencode` API. Run it with:
```bash
SANDBOX_AGENT_SKIP_INSPECTOR=1 pnpm --filter @sandbox-agent/opencode-compat-tests test
```
## Naming
- The product name is "Gigacode" (capital G, lowercase c). The CLI binary/package is `gigacode` (lowercase).
## Git Commits
- Do not include any co-authors in commit messages (no `Co-Authored-By` lines)
- Use conventional commits style (e.g., `feat:`, `fix:`, `docs:`, `chore:`, `refactor:`)
- Keep commit messages to a single line
- Channel policy:
  - Sandbox Agent install/version references use a pinned minor channel `0.N.x` (for curl URLs and `sandbox-agent` / `@sandbox-agent/cli` npm/bun installs).
  - Gigacode install/version references use `latest` (for `@sandbox-agent/gigacode` install/run commands and `gigacode-install.*` release promotion).
  - Release promotion policy: `latest` releases must still update `latest`; when a release is `latest`, Sandbox Agent must also be promoted to the matching minor channel `0.N.x`.
- Keep every install-version reference below in sync whenever versions/channels change:
  - `README.md`
  - `docs/acp-http-client.mdx`
  - `docs/cli.mdx`
  - `docs/quickstart.mdx`
  - `docs/sdk-overview.mdx`
  - `docs/react-components.mdx`
  - `docs/session-persistence.mdx`
  - `docs/deploy/local.mdx`
  - `docs/deploy/cloudflare.mdx`
  - `docs/deploy/vercel.mdx`
  - `docs/deploy/daytona.mdx`
  - `docs/deploy/e2b.mdx`
  - `docs/deploy/docker.mdx`
  - `frontend/packages/website/src/components/GetStarted.tsx`
  - `.claude/commands/post-release-testing.md`
  - `examples/cloudflare/Dockerfile`
  - `examples/daytona/src/index.ts`
  - `examples/shared/src/docker.ts`
  - `examples/docker/src/index.ts`
  - `examples/e2b/src/index.ts`
  - `examples/vercel/src/index.ts`
  - `scripts/release/main.ts`
  - `scripts/release/promote-artifacts.ts`
  - `scripts/release/sdk.ts`
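
The pinned minor channel rule above can be sketched as a small helper. This is a minimal illustration, not the project's release tooling: `minorChannel` and `installScriptUrl` are hypothetical names, and the URL layout is assumed from the `0.4.x` curl install command used elsewhere in this repository. Whether prerelease versions are ever promoted to a minor channel is not stated in the policy; the parser only tolerates the suffix.

```typescript
// Hypothetical helpers illustrating the `0.N.x` channel policy.
// Neither function is part of the sandbox-agent release scripts.

/** Derive the minor channel from a full version such as "0.4.2". */
function minorChannel(version: string): string {
  // Accepts MAJOR.MINOR.PATCH with an optional prerelease suffix (e.g. "-rc.2").
  const match = /^(\d+)\.(\d+)\.\d+(?:-[0-9A-Za-z.-]+)?$/.exec(version);
  if (!match) {
    throw new Error(`not a semver-style version: ${version}`);
  }
  return `${match[1]}.${match[2]}.x`;
}

/** Build the install-script URL for a channel (URL layout is an assumption). */
function installScriptUrl(channel: string): string {
  return `https://releases.rivet.dev/sandbox-agent/${channel}/install.sh`;
}

const channel = minorChannel("0.4.2");
console.log(channel);
console.log(installScriptUrl(channel));
```

A helper like this could be used to check that every file in the list above references the same channel string after a version bump.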


@ -1,9 +1,10 @@
[workspace]
resolver = "2"
members = ["server/packages/*", "gigacode"]
exclude = ["factory/packages/desktop/src-tauri", "foundry/packages/desktop/src-tauri"]
[workspace.package]
version = "0.1.12-rc.1"
version = "0.4.2"
edition = "2021"
authors = [ "Rivet Gaming, LLC <developer@rivet.gg>" ]
license = "Apache-2.0"
@ -12,12 +13,13 @@ description = "Universal API for automatic coding agents in sandboxes. Supports
[workspace.dependencies]
# Internal crates
sandbox-agent = { version = "0.1.12-rc.1", path = "server/packages/sandbox-agent" }
sandbox-agent-error = { version = "0.1.12-rc.1", path = "server/packages/error" }
sandbox-agent-agent-management = { version = "0.1.12-rc.1", path = "server/packages/agent-management" }
sandbox-agent-agent-credentials = { version = "0.1.12-rc.1", path = "server/packages/agent-credentials" }
sandbox-agent-universal-agent-schema = { version = "0.1.12-rc.1", path = "server/packages/universal-agent-schema" }
sandbox-agent-extracted-agent-schemas = { version = "0.1.12-rc.1", path = "server/packages/extracted-agent-schemas" }
sandbox-agent = { version = "0.4.2", path = "server/packages/sandbox-agent" }
sandbox-agent-error = { version = "0.4.2", path = "server/packages/error" }
sandbox-agent-agent-management = { version = "0.4.2", path = "server/packages/agent-management" }
sandbox-agent-agent-credentials = { version = "0.4.2", path = "server/packages/agent-credentials" }
sandbox-agent-opencode-adapter = { version = "0.4.2", path = "server/packages/opencode-adapter" }
sandbox-agent-opencode-server-manager = { version = "0.4.2", path = "server/packages/opencode-server-manager" }
acp-http-adapter = { version = "0.4.2", path = "server/packages/acp-http-adapter" }
# Serialization
serde = { version = "1.0", features = ["derive"] }
@ -31,7 +33,7 @@ schemars = "0.8"
utoipa = { version = "4.2", features = ["axum_extras"] }
# Web framework
axum = "0.7"
axum = { version = "0.7", features = ["ws"] }
tower = { version = "0.5", features = ["util"] }
tower-http = { version = "0.5", features = ["cors", "trace"] }


@ -5,7 +5,7 @@
<h3 align="center">Run Coding Agents in Sandboxes. Control Them Over HTTP.</h3>
<p align="center">
A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, or Amp — streaming events, handling permissions, managing sessions.
A server that runs inside your sandbox. Your app connects remotely to control Claude Code, Codex, OpenCode, Cursor, Amp, or Pi — streaming events, handling permissions, managing sessions.
</p>
<p align="center">
@ -24,17 +24,14 @@ Sandbox Agent solves three problems:
1. **Coding agents need sandboxes** — You can't let AI execute arbitrary code on your production servers. Coding agents need isolated environments, but existing SDKs assume local execution. Sandbox Agent is a server that runs inside the sandbox and exposes HTTP/SSE.
2. **Every coding agent is different** — Claude Code, Codex, OpenCode, and Amp each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
2. **Every coding agent is different** — Claude Code, Codex, OpenCode, Cursor, Amp, and Pi each have proprietary APIs, event formats, and behaviors. Swapping agents means rewriting your integration. Sandbox Agent provides one HTTP API — write your code once, swap agents with a config change.
3. **Sessions are ephemeral** — Agent transcripts live in the sandbox. When the process ends, you lose everything. Sandbox Agent streams events in a universal schema to your storage. Persist to Postgres, ClickHouse, or [Rivet](https://rivet.dev). Replay later, audit everything.
## Features
- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, and Amp with full feature coverage
- **Streaming Events**: Real-time SSE stream of everything the agent does — tool calls, permission requests, file edits, and more
- **Universal Session Schema**: [Standardized schema](https://sandboxagent.dev/docs/session-transcript-schema) that normalizes all agent event formats for storage and replay
- **Human-in-the-Loop**: Approve or deny tool executions and answer agent questions remotely over HTTP
- **Automatic Agent Installation**: Agents are installed on-demand when first used — no setup required
- **Universal Agent API**: Single interface to control Claude Code, Codex, OpenCode, Cursor, Amp, and Pi with full feature coverage
- **Universal Session Schema**: Standardized schema that normalizes all agent event formats for storage and replay
- **Runs Inside Any Sandbox**: Lightweight static Rust binary. One curl command to install inside E2B, Daytona, Vercel Sandboxes, or Docker
- **Server or SDK Mode**: Run as an HTTP server or embed with the TypeScript SDK
- **OpenAPI Spec**: [Well documented](https://sandboxagent.dev/docs/api-reference) and easy to integrate from any language
@ -83,13 +80,13 @@ Import the SDK directly into your Node or browser application. Full type safety
**Install**
```bash
npm install sandbox-agent
npm install sandbox-agent@0.4.x
```
```bash
bun add sandbox-agent
bun add sandbox-agent@0.4.x
# Optional: allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
**Setup**
@ -121,7 +118,6 @@ const agents = await client.listAgents();
await client.createSession("demo", {
agent: "codex",
agentMode: "default",
permissionMode: "plan",
});
await client.postMessage("demo", { message: "Hello from the SDK." });
@ -131,9 +127,7 @@ for await (const event of client.streamEvents("demo", { offset: 0 })) {
}
```
`permissionMode: "acceptEdits"` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
[SDK documentation](https://sandboxagent.dev/docs/sdks/typescript) — [Building a Chat UI](https://sandboxagent.dev/docs/building-chat-ui) — [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions)
[SDK documentation](https://sandboxagent.dev/docs/sdks/typescript) — [Managing Sessions](https://sandboxagent.dev/docs/manage-sessions)
### HTTP Server
@ -141,7 +135,7 @@ Run as an HTTP server and connect from any language. Deploy to E2B, Daytona, Ver
```bash
# Install it
curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
# Run it
sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
```
@ -149,10 +143,7 @@ sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
Optional: preinstall agent binaries (no server required; they will be installed lazily on first use if you skip this):
```bash
sandbox-agent install-agent claude
sandbox-agent install-agent codex
sandbox-agent install-agent opencode
sandbox-agent install-agent amp
sandbox-agent install-agent --all
```
To disable auth locally:
@ -168,13 +159,13 @@ sandbox-agent server --no-token --host 127.0.0.1 --port 2468
Install the CLI wrapper (optional but convenient):
```bash
npm install -g @sandbox-agent/cli
npm install -g @sandbox-agent/cli@0.4.x
```
```bash
# Allow Bun to run postinstall scripts for native binaries.
bun add -g @sandbox-agent/cli
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
bun add -g @sandbox-agent/cli@0.4.x
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
Create a session and send a message:
@ -188,11 +179,11 @@ sandbox-agent api sessions send-message-stream my-session --message "Hello" --en
You can also use npx like:
```bash
npx sandbox-agent --help
npx @sandbox-agent/cli@0.4.x --help
```
```bash
bunx sandbox-agent --help
bunx @sandbox-agent/cli@0.4.x --help
```
[CLI documentation](https://sandboxagent.dev/docs/cli)
@ -209,10 +200,6 @@ Debug sessions and events with the built-in Inspector UI (e.g., `http://localhos
[Explore API](https://sandboxagent.dev/docs/api-reference) — [View Specification](https://github.com/rivet-dev/sandbox-agent/blob/main/docs/openapi.json)
### Session Transcript Schema
All events follow a [session transcript schema](https://sandboxagent.dev/docs/session-transcript-schema) that normalizes differences between agents.
### Tip: Extract credentials
Often you need to use your personal API tokens to test agents on sandboxes:
@ -234,7 +221,7 @@ No, they're complementary. AI SDK is for building chat interfaces and calling LL
<details>
<summary><strong>Which coding agents are supported?</strong></summary>
Claude Code, Codex, OpenCode, and Amp. The SDK normalizes their APIs so you can swap between them without changing your code.
Claude Code, Codex, OpenCode, Cursor, Amp, and Pi. The SDK normalizes their APIs so you can swap between them without changing your code.
</details>
<details>
@ -258,7 +245,7 @@ The server is a single Rust binary that runs anywhere with a curl install. If yo
<details>
<summary><strong>Can I use this with my personal API keys?</strong></summary>
Yes. Use `sandbox-agent credentials extract-env` to extract API keys from your local agent configs (Claude Code, Codex, OpenCode, Amp) and pass them to the sandbox environment.
Yes. Use `sandbox-agent credentials extract-env` to extract API keys from your local agent configs (Claude Code, Codex, OpenCode, Amp, Pi) and pass them to the sandbox environment.
</details>
<details>
@ -290,7 +277,7 @@ Coding agents expect interactive terminals with proper TTY handling. SSH with pi
- **Storage of sessions on disk**: Sessions are already stored by the respective coding agents on disk. It's assumed that the consumer is streaming data from this machine to an external storage, such as Postgres, ClickHouse, or Rivet.
- **Direct LLM wrappers**: Use the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction) if you want to implement your own agent from scratch.
- **Git Repo Management**: Just use git commands or the features provided by your sandbox provider of choice.
- **Sandbox Provider API**: Sandbox providers have many nuanced differences in their API, it does not make sense for us to try to provide a custom layer. Instead, we opt to provide guides that let you integrate this project with sandbox providers.
- **Sandbox Provider API**: Sandbox providers have many nuanced differences in their API, it does not make sense for us to try to provide a custom layer. Instead, we opt to provide guides that let you integrate this repository with sandbox providers.
## Roadmap

biome.json Normal file

@ -0,0 +1,7 @@
{
"$schema": "./node_modules/@biomejs/biome/configuration_schema.json",
"formatter": {
"indentStyle": "space",
"lineWidth": 160
}
}


@ -0,0 +1,7 @@
FROM node:22-bookworm-slim
RUN npm install -g pnpm@10.28.2
WORKDIR /app
CMD ["bash", "-lc", "pnpm install --filter @sandbox-agent/inspector... && cd frontend/packages/inspector && exec pnpm vite --host 0.0.0.0 --port 5173"]


@ -9,6 +9,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -17,11 +19,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector


@ -9,6 +9,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -17,11 +19,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector


@ -9,6 +9,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -17,11 +19,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector


@ -9,6 +9,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -17,11 +19,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector


@ -9,6 +9,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -17,11 +19,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector


@ -11,6 +11,8 @@ RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
# Install dependencies
@ -19,11 +21,15 @@ RUN pnpm install --filter @sandbox-agent/inspector...
# Copy SDK source (with pre-generated types from docs/openapi.json)
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
# Build cli-shared and SDK (just tsup, skip generate since types are pre-generated)
# Build cli-shared, acp-http-client, SDK, then persist-indexeddb and react (depends on SDK)
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
# Copy inspector source and build
COPY frontend/packages/inspector ./frontend/packages/inspector
@ -143,7 +149,8 @@ FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y \
ca-certificates \
curl \
git && \
git \
ffmpeg && \
rm -rf /var/lib/apt/lists/*
# Copy the binary from builder
@ -158,4 +165,4 @@ WORKDIR /home/sandbox
EXPOSE 2468
ENTRYPOINT ["sandbox-agent"]
CMD ["--host", "0.0.0.0", "--port", "2468"]
CMD ["server", "--host", "0.0.0.0", "--port", "2468"]


@ -0,0 +1,159 @@
# syntax=docker/dockerfile:1.10.0
# ============================================================================
# Build inspector frontend
# ============================================================================
FROM node:22-alpine AS inspector-build
WORKDIR /app
RUN npm install -g pnpm
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY frontend/packages/inspector/package.json ./frontend/packages/inspector/
COPY sdks/cli-shared/package.json ./sdks/cli-shared/
COPY sdks/acp-http-client/package.json ./sdks/acp-http-client/
COPY sdks/react/package.json ./sdks/react/
COPY sdks/typescript/package.json ./sdks/typescript/
RUN pnpm install --filter @sandbox-agent/inspector...
COPY docs/openapi.json ./docs/
COPY sdks/cli-shared ./sdks/cli-shared
COPY sdks/acp-http-client ./sdks/acp-http-client
COPY sdks/react ./sdks/react
COPY sdks/typescript ./sdks/typescript
RUN cd sdks/cli-shared && pnpm exec tsup
RUN cd sdks/acp-http-client && pnpm exec tsup
RUN cd sdks/typescript && SKIP_OPENAPI_GEN=1 pnpm exec tsup
RUN cd sdks/react && pnpm exec tsup
COPY frontend/packages/inspector ./frontend/packages/inspector
RUN cd frontend/packages/inspector && pnpm exec vite build
# ============================================================================
# AMD64 Builder - Uses cross-tools musl toolchain
# ============================================================================
FROM --platform=linux/amd64 rust:1.88.0 AS builder-amd64
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
musl-tools \
musl-dev \
llvm-14-dev \
libclang-14-dev \
clang-14 \
libssl-dev \
pkg-config \
ca-certificates \
g++ \
g++-multilib \
git \
curl \
wget && \
rm -rf /var/lib/apt/lists/*
RUN wget -q https://github.com/cross-tools/musl-cross/releases/latest/download/x86_64-unknown-linux-musl.tar.xz && \
tar -xf x86_64-unknown-linux-musl.tar.xz -C /opt/ && \
rm x86_64-unknown-linux-musl.tar.xz && \
rustup target add x86_64-unknown-linux-musl
ENV PATH="/opt/x86_64-unknown-linux-musl/bin:$PATH" \
LIBCLANG_PATH=/usr/lib/llvm-14/lib \
CLANG_PATH=/usr/bin/clang-14 \
CC_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-gcc \
CXX_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-g++ \
AR_x86_64_unknown_linux_musl=x86_64-unknown-linux-musl-ar \
CARGO_TARGET_X86_64_UNKNOWN_LINUX_MUSL_LINKER=x86_64-unknown-linux-musl-gcc \
CARGO_INCREMENTAL=0 \
CARGO_NET_GIT_FETCH_WITH_CLI=true
ENV SSL_VER=1.1.1w
RUN wget https://www.openssl.org/source/openssl-$SSL_VER.tar.gz && \
tar -xzf openssl-$SSL_VER.tar.gz && \
cd openssl-$SSL_VER && \
./Configure no-shared no-async --prefix=/musl --openssldir=/musl/ssl linux-x86_64 && \
make -j$(nproc) && \
make install_sw && \
cd .. && \
rm -rf openssl-$SSL_VER*
ENV OPENSSL_DIR=/musl \
OPENSSL_INCLUDE_DIR=/musl/include \
OPENSSL_LIB_DIR=/musl/lib \
PKG_CONFIG_ALLOW_CROSS=1 \
RUSTFLAGS="-C target-feature=+crt-static -C link-arg=-static-libgcc"
WORKDIR /build
COPY . .
COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
RUN --mount=type=cache,target=/usr/local/cargo/registry \
--mount=type=cache,target=/usr/local/cargo/git \
--mount=type=cache,target=/build/target \
cargo build -p sandbox-agent --release --target x86_64-unknown-linux-musl && \
cp target/x86_64-unknown-linux-musl/release/sandbox-agent /sandbox-agent
# ============================================================================
# ARM64 Builder - Uses Alpine with native musl
# ============================================================================
FROM --platform=linux/arm64 rust:1.88-alpine AS builder-arm64
RUN apk add --no-cache \
musl-dev \
clang \
llvm-dev \
openssl-dev \
openssl-libs-static \
pkgconfig \
git \
curl \
build-base
RUN rustup target add aarch64-unknown-linux-musl
ENV CARGO_INCREMENTAL=0 \
CARGO_NET_GIT_FETCH_WITH_CLI=true \
RUSTFLAGS="-C target-feature=+crt-static"
WORKDIR /build
COPY . .
COPY --from=inspector-build /app/frontend/packages/inspector/dist ./frontend/packages/inspector/dist
RUN --mount=type=cache,target=/usr/local/cargo/registry \
--mount=type=cache,target=/usr/local/cargo/git \
--mount=type=cache,target=/build/target \
cargo build -p sandbox-agent --release --target aarch64-unknown-linux-musl && \
cp target/aarch64-unknown-linux-musl/release/sandbox-agent /sandbox-agent
# ============================================================================
# Select the appropriate builder based on target architecture
# ============================================================================
ARG TARGETARCH
FROM builder-${TARGETARCH} AS builder
# Runtime stage - full image with all supported agents preinstalled
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y \
bash \
ca-certificates \
curl \
git && \
rm -rf /var/lib/apt/lists/*
COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
RUN chmod +x /usr/local/bin/sandbox-agent
RUN useradd -m -s /bin/bash sandbox
USER sandbox
WORKDIR /home/sandbox
RUN sandbox-agent install-agent --all
EXPOSE 2468
ENTRYPOINT ["sandbox-agent"]
CMD ["server", "--host", "0.0.0.0", "--port", "2468"]


@ -0,0 +1,61 @@
FROM rust:1.88.0-bookworm AS builder
WORKDIR /build
COPY Cargo.toml Cargo.lock ./
COPY server/ ./server/
COPY gigacode/ ./gigacode/
COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/
COPY scripts/agent-configs/ ./scripts/agent-configs/
COPY scripts/audit-acp-deps/ ./scripts/audit-acp-deps/
ENV SANDBOX_AGENT_SKIP_INSPECTOR=1
RUN --mount=type=cache,target=/usr/local/cargo/registry \
--mount=type=cache,target=/usr/local/cargo/git \
--mount=type=cache,target=/build/target \
cargo build -p sandbox-agent --release && \
cp target/release/sandbox-agent /sandbox-agent
# Extract neko binary from the official image for WebRTC desktop streaming.
# Using neko v3 base image from GHCR which provides multi-arch support (amd64, arm64).
# Pinned by digest to prevent breaking changes from upstream.
# Reference client: https://github.com/demodesk/neko-client/blob/37f93eae6bd55b333c94bd009d7f2b079075a026/src/component/internal/webrtc.ts
FROM ghcr.io/m1k1o/neko/base@sha256:0c384afa56268aaa2d5570211d284763d0840dcdd1a7d9a24be3081d94d3dfce AS neko-base
FROM node:22-bookworm-slim
RUN apt-get update -qq && \
apt-get install -y -qq --no-install-recommends \
ca-certificates \
bash \
libstdc++6 \
xvfb \
openbox \
xdotool \
imagemagick \
ffmpeg \
gstreamer1.0-tools \
gstreamer1.0-plugins-base \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly \
gstreamer1.0-nice \
gstreamer1.0-x \
gstreamer1.0-pulseaudio \
libxcvt0 \
x11-xserver-utils \
dbus-x11 \
xauth \
fonts-dejavu-core \
xterm \
> /dev/null 2>&1 && \
rm -rf /var/lib/apt/lists/*
COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent
COPY --from=neko-base /usr/bin/neko /usr/local/bin/neko
EXPOSE 3000
# Expose UDP port range for WebRTC media transport
EXPOSE 59050-59070/udp
ENTRYPOINT ["/usr/local/bin/sandbox-agent"]
CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"]


@ -0,0 +1,37 @@
# Extends the base test-agent image with common software pre-installed.
# Used by the common_software integration test to verify that all documented
# software in docs/common-software.mdx works correctly inside the sandbox.
#
# KEEP IN SYNC with docs/common-software.mdx
ARG BASE_IMAGE=sandbox-agent-test:dev
FROM ${BASE_IMAGE}
USER root
RUN apt-get update -qq && \
apt-get install -y -qq --no-install-recommends \
# Browsers
chromium \
firefox-esr \
# Languages
python3 python3-pip python3-venv \
default-jdk \
ruby-full \
# Databases
sqlite3 \
redis-server \
# Build tools
build-essential cmake pkg-config \
# CLI tools
git jq tmux \
# Media and graphics
imagemagick \
poppler-utils \
# Desktop apps
gimp \
> /dev/null 2>&1 && \
rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/usr/local/bin/sandbox-agent"]
CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"]


@ -1,278 +1,268 @@
---
title: "Agent Sessions"
description: "Create sessions and send messages to agents."
description: "Create sessions, prompt agents, and inspect event history."
sidebarTitle: "Sessions"
icon: "comments"
---
Sessions are the unit of interaction with an agent. You create one session per task, then send messages and stream events.
Sessions are the unit of interaction with an agent. Create one session per task, send prompts, and consume event history.
## Session Options
For SDK-based flows, sessions can be restored after runtime/session loss when persistence is enabled.
See [Session Restoration](/session-restoration).
`POST /v1/sessions/{sessionId}` accepts the following fields:
## Create a session
- `agent` (required): `claude`, `codex`, `opencode`, `amp`, or `mock`
- `agentMode`: agent mode string (for example, `build`, `plan`)
- `permissionMode`: permission mode string (`default`, `plan`, `bypass`, etc.)
- `model`: model override (agent-specific)
- `variant`: model variant (agent-specific)
- `agentVersion`: agent version override
- `mcp`: MCP server config map (see `MCP`)
- `skills`: skill path config (see `Skills`)
## Create A Session
<CodeGroup>
```ts TypeScript
```ts
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("build-session", {
const session = await sdk.createSession({
agent: "codex",
agentMode: "build",
permissionMode: "default",
model: "gpt-4.1",
variant: "reasoning",
agentVersion: "latest",
cwd: "/",
});
console.log(session.id, session.agentSessionId);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "codex",
"agentMode": "build",
"permissionMode": "default",
"model": "gpt-4.1",
"variant": "reasoning",
"agentVersion": "latest"
}'
```
</CodeGroup>
## Send a prompt
## Send A Message
```ts
const response = await session.prompt([
{ type: "text", text: "Summarize the repository structure." },
]);
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("build-session", {
message: "Summarize the repository structure.",
});
console.log(response.stopReason);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Summarize the repository structure."}'
## Subscribe to live events
```ts
const unsubscribe = session.onEvent((event) => {
console.log(event.eventIndex, event.sender, event.payload);
});
await session.prompt([
{ type: "text", text: "Explain the main entrypoints." },
]);
unsubscribe();
```
</CodeGroup>
## Stream A Turn
### Event types
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
Each event's `payload` contains a session update. The `sessionUpdate` field identifies the type.
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
<AccordionGroup>
<Accordion title="agent_message_chunk">
Streamed text or content from the agent's response.
const response = await client.postMessageStream("build-session", {
message: "Explain the main entrypoints.",
});
const reader = response.body?.getReader();
if (reader) {
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value, { stream: true }));
}
```json
{
"sessionUpdate": "agent_message_chunk",
"content": { "type": "text", "text": "Here's how the repository is structured..." }
}
```
</Accordion>
```bash cURL
curl -N -X POST "http://127.0.0.1:2468/v1/sessions/build-session/messages/stream" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message":"Explain the main entrypoints."}'
<Accordion title="agent_thought_chunk">
Internal reasoning from the agent (chain-of-thought / extended thinking).
```json
{
"sessionUpdate": "agent_thought_chunk",
"content": { "type": "text", "text": "I should start by looking at the project structure..." }
}
```
</CodeGroup>
</Accordion>
## Fetch Events
<Accordion title="user_message_chunk">
Echo of the user's prompt being processed.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
```json
{
"sessionUpdate": "user_message_chunk",
"content": { "type": "text", "text": "Summarize the repository structure." }
}
```
</Accordion>
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
<Accordion title="tool_call">
The agent invoked a tool (file edit, terminal command, etc.).
const events = await client.getEvents("build-session", {
offset: 0,
```json
{
"sessionUpdate": "tool_call",
"toolCallId": "tc_abc123",
"title": "Read file",
"status": "in_progress",
"rawInput": { "path": "/src/index.ts" }
}
```
</Accordion>
<Accordion title="tool_call_update">
Progress or result update for an in-progress tool call.
```json
{
"sessionUpdate": "tool_call_update",
"toolCallId": "tc_abc123",
"status": "completed",
"content": [{ "type": "text", "text": "import express from 'express';\n..." }]
}
```
</Accordion>
<Accordion title="plan">
The agent's execution plan for the current task.
```json
{
"sessionUpdate": "plan",
"entries": [
{ "content": "Read the project structure", "status": "completed" },
{ "content": "Identify main entrypoints", "status": "in_progress" },
{ "content": "Write summary", "status": "pending" }
]
}
```
</Accordion>
<Accordion title="usage_update">
Token usage metrics for the current turn.
```json
{
"sessionUpdate": "usage_update"
}
```
</Accordion>
<Accordion title="session_info_update">
Session metadata changed (e.g. agent-generated title).
```json
{
"sessionUpdate": "session_info_update",
"title": "Repository structure analysis"
}
```
</Accordion>
</AccordionGroup>
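A consumer typically branches on the `sessionUpdate` field. The sketch below is a minimal illustration rather than SDK code: the `SessionUpdate` type is a hand-written subset inferred from the example payloads above, and `collectTurn` is a hypothetical helper name.

```ts
// Hand-written subset of update shapes, inferred from the example payloads
// above. These types are assumptions for illustration, not SDK exports.
type TextContent = { type: "text"; text: string };

type SessionUpdate =
  | { sessionUpdate: "agent_message_chunk"; content: TextContent }
  | { sessionUpdate: "agent_thought_chunk"; content: TextContent }
  | { sessionUpdate: "tool_call"; toolCallId: string; title: string; status: string }
  | { sessionUpdate: "tool_call_update"; toolCallId: string; status: string }
  | { sessionUpdate: "session_info_update"; title: string };

type TurnSummary = {
  message: string;
  thoughts: string;
  toolStatus: Record<string, string>;
  title?: string;
};

// Hypothetical helper: folds a list of updates into one summary object.
function collectTurn(updates: SessionUpdate[]): TurnSummary {
  const summary: TurnSummary = { message: "", thoughts: "", toolStatus: {} };
  for (const update of updates) {
    switch (update.sessionUpdate) {
      case "agent_message_chunk":
        // Streamed response text arrives in pieces; concatenate in order.
        summary.message += update.content.text;
        break;
      case "agent_thought_chunk":
        summary.thoughts += update.content.text;
        break;
      case "tool_call":
      case "tool_call_update":
        // Keep only the latest status seen for each tool call.
        summary.toolStatus[update.toolCallId] = update.status;
        break;
      case "session_info_update":
        summary.title = update.title;
        break;
    }
  }
  return summary;
}
```

Inside a `session.onEvent` listener, the same switch could run on each `event.payload` as it arrives, assuming the payload matches these shapes.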
## Fetch persisted event history
```ts
const page = await sdk.getEvents({
sessionId: session.id,
limit: 50,
includeRaw: false,
});
console.log(events.events);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events?offset=0&limit=50" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
`GET /v1/sessions/{sessionId}/get-messages` is an alias for `events`.
## Stream Events (SSE)
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
for await (const event of client.streamEvents("build-session", { offset: 0 })) {
console.log(event.type, event.data);
for (const event of page.items) {
console.log(event.id, event.createdAt, event.sender);
}
```
```bash cURL
curl -N -X GET "http://127.0.0.1:2468/v1/sessions/build-session/events/sse?offset=0" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## List and load sessions
## List Sessions
```ts
const sessions = await sdk.listSessions({ limit: 20 });
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
for (const item of sessions.items) {
console.log(item.id, item.agent, item.createdAt);
}
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const sessions = await client.listSessions();
console.log(sessions.sessions);
if (sessions.items.length > 0) {
const loaded = await sdk.resumeSession(sessions.items[0]!.id);
await loaded.prompt([{ type: "text", text: "Continue." }]);
}
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/sessions" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Configure model, mode, and thought level
## Reply To A Question
Set the model, mode, or thought level on a session at creation time or after:
When the agent asks a question, reply with an array of answers. Each inner array is one multi-select response.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.replyQuestion("build-session", "question-1", {
answers: [["yes"]],
```ts
// At creation time
const session = await sdk.createSession({
agent: "codex",
model: "gpt-5.3-codex",
mode: "auto",
thoughtLevel: "high",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"answers":[["yes"]]}'
```ts
// After creation
await session.setModel("gpt-5.2-codex");
await session.setMode("full-access");
await session.setThoughtLevel("medium");
```
</CodeGroup>
## Reject A Question
Query available modes:
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
```ts
const modes = await session.getModes();
console.log(modes?.currentModeId, modes?.availableModes);
```
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
### Advanced config options
For config options beyond model, mode, and thought level, use `getConfigOptions` to discover what the agent supports and `setConfigOption` to set any option by ID:
```ts
const options = await session.getConfigOptions();
for (const opt of options) {
console.log(opt.id, opt.category, opt.type);
}
```
```ts
await session.setConfigOption("some-agent-option", "value");
```
## Handle permission requests
For agents that request tool-use permissions, register a permission listener and reply with `once`, `always`, or `reject`:
```ts
const session = await sdk.createSession({
agent: "claude",
mode: "default",
});
await client.rejectQuestion("build-session", "question-1");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/questions/question-1/reject" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>
## Reply To A Permission Request
Use `once`, `always`, or `reject`.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
session.onPermissionRequest((request) => {
console.log(request.toolCall.title, request.availableReplies);
void session.respondPermission(request.id, "once");
});
await client.replyPermission("build-session", "permission-1", {
reply: "once",
await session.prompt([
{ type: "text", text: "Create ./permission-example.txt with the text hello." },
]);
```
### Auto-approving permissions
To auto-approve all permission requests, respond with `"once"` or `"always"` in your listener:
```ts
session.onPermissionRequest((request) => {
void session.respondPermission(request.id, "always");
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/permissions/permission-1/reply" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{"reply":"once"}'
See `examples/permissions/src/index.ts` for a complete permissions example that works with Claude and Codex.
<Info>
Some agents like Claude allow configuring permission behavior through modes (e.g. `bypassPermissions`, `acceptEdits`). We recommend leaving the mode as `default` and handling permission decisions explicitly in `onPermissionRequest` instead.
</Info>
## Destroy a session
```ts
await sdk.destroySession(session.id);
```
</CodeGroup>
## Terminate A Session
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.terminateSession("build-session");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/build-session/terminate" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</CodeGroup>

20
docs/agents/amp.mdx Normal file

@ -0,0 +1,20 @@
---
title: "Amp"
description: "Use Amp as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "amp",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | `amp-default` |
| **Modes** | `default`, `bypass` |
| **Thought levels** | Unsupported |

49
docs/agents/claude.mdx Normal file

@ -0,0 +1,49 @@
---
title: "Claude"
description: "Use Claude Code as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "claude",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | `default`, `sonnet`, `opus`, `haiku` |
| **Modes** | `default`, `acceptEdits`, `plan`, `dontAsk`, `bypassPermissions` |
| **Thought levels** | Unsupported |
## Configuring effort level
Claude does not support changing effort level after a session starts. Set `effortLevel` in a Claude settings file before creating the session.
```ts
import { mkdir, writeFile } from "node:fs/promises";
import path from "node:path";
const cwd = "/path/to/workspace";
await mkdir(path.join(cwd, ".claude"), { recursive: true });
await writeFile(
path.join(cwd, ".claude", "settings.json"),
JSON.stringify({ effortLevel: "high" }, null, 2),
);
const session = await client.createSession({
agent: "claude",
cwd,
});
```
<Accordion title="Supported settings file locations (highest precedence last)">
1. `~/.claude/settings.json`
2. `<session cwd>/.claude/settings.json`
3. `<session cwd>/.claude/settings.local.json`
</Accordion>
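The precedence order above can be pictured as a layered merge where later files override earlier ones. This is only a sketch under the assumption of a shallow, key-by-key merge; Claude's actual resolution logic may differ, and `resolveSettings` and `exampleKey` are hypothetical names used for illustration.

```ts
type Settings = Record<string, unknown>;

// Layers are passed lowest precedence first, matching the list above.
// A shallow merge is an assumption; the real behavior may be more nuanced.
function resolveSettings(layers: Array<Settings | undefined>): Settings {
  const resolved: Settings = {};
  for (const layer of layers) {
    // Missing files are represented as undefined and simply skipped.
    if (layer) Object.assign(resolved, layer);
  }
  return resolved;
}

const effective = resolveSettings([
  { effortLevel: "low", exampleKey: "kept" }, // ~/.claude/settings.json
  { effortLevel: "high" },                    // <session cwd>/.claude/settings.json
  undefined,                                  // settings.local.json not present
]);

console.log(effective.effortLevel);
```

Under that assumption, a value written to the session's `.claude/settings.json` wins over the same key in the home directory file.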

20
docs/agents/codex.mdx Normal file

@ -0,0 +1,20 @@
---
title: "Codex"
description: "Use OpenAI Codex as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "codex",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | `gpt-5.3-codex` (default), `gpt-5.3-codex-spark`, `gpt-5.2-codex`, `gpt-5.1-codex-max`, `gpt-5.2`, `gpt-5.1-codex-mini` |
| **Modes** | `read-only` (default), `auto`, `full-access` |
| **Thought levels** | `low`, `medium`, `high` (default), `xhigh` |

34
docs/agents/cursor.mdx Normal file

@ -0,0 +1,34 @@
---
title: "Cursor"
description: "Use Cursor as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "cursor",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | See below |
| **Modes** | Unsupported |
| **Thought levels** | Unsupported |
<Accordion title="All models">
| Group | Models |
|-------|--------|
| **Auto** | `auto` |
| **Composer** | `composer-1.5`, `composer-1` |
| **GPT-5.3 Codex** | `gpt-5.3-codex`, `gpt-5.3-codex-low`, `gpt-5.3-codex-high`, `gpt-5.3-codex-xhigh`, `gpt-5.3-codex-fast`, `gpt-5.3-codex-low-fast`, `gpt-5.3-codex-high-fast`, `gpt-5.3-codex-xhigh-fast` |
| **GPT-5.2** | `gpt-5.2`, `gpt-5.2-high`, `gpt-5.2-codex`, `gpt-5.2-codex-low`, `gpt-5.2-codex-high`, `gpt-5.2-codex-xhigh`, `gpt-5.2-codex-fast`, `gpt-5.2-codex-low-fast`, `gpt-5.2-codex-high-fast`, `gpt-5.2-codex-xhigh-fast` |
| **GPT-5.1** | `gpt-5.1-high`, `gpt-5.1-codex-max`, `gpt-5.1-codex-max-high` |
| **Claude** | `opus-4.6-thinking` (default), `opus-4.6`, `opus-4.5`, `opus-4.5-thinking`, `sonnet-4.5`, `sonnet-4.5-thinking` |
| **Other** | `gemini-3-pro`, `gemini-3-flash`, `grok` |
</Accordion>

31
docs/agents/opencode.mdx Normal file

@ -0,0 +1,31 @@
---
title: "OpenCode"
description: "Use OpenCode as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "opencode",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | See below |
| **Modes** | `build` (default), `plan` |
| **Thought levels** | Unsupported |
<Accordion title="All models">
| Provider | Models |
|----------|--------|
| **Anthropic** | `anthropic/claude-3-5-haiku-20241022`, `anthropic/claude-3-5-haiku-latest`, `anthropic/claude-3-5-sonnet-20240620`, `anthropic/claude-3-5-sonnet-20241022`, `anthropic/claude-3-7-sonnet-20250219`, `anthropic/claude-3-7-sonnet-latest`, `anthropic/claude-3-haiku-20240307`, `anthropic/claude-3-opus-20240229`, `anthropic/claude-3-sonnet-20240229`, `anthropic/claude-haiku-4-5`, `anthropic/claude-haiku-4-5-20251001`, `anthropic/claude-opus-4-0`, `anthropic/claude-opus-4-1`, `anthropic/claude-opus-4-1-20250805`, `anthropic/claude-opus-4-20250514`, `anthropic/claude-opus-4-5`, `anthropic/claude-opus-4-5-20251101`, `anthropic/claude-opus-4-6`, `anthropic/claude-sonnet-4-0`, `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5`, `anthropic/claude-sonnet-4-5-20250929` |
| **OpenAI** | `openai/gpt-5.1-codex`, `openai/gpt-5.1-codex-max`, `openai/gpt-5.1-codex-mini`, `openai/gpt-5.2`, `openai/gpt-5.2-codex`, `openai/gpt-5.3-codex` |
| **Cerebras** | `cerebras/gpt-oss-120b`, `cerebras/qwen-3-235b-a22b-instruct-2507`, `cerebras/zai-glm-4.7` |
| **OpenCode Zen** | `opencode/big-pickle`, `opencode/claude-3-5-haiku`, `opencode/claude-haiku-4-5`, `opencode/claude-opus-4-1`, `opencode/claude-opus-4-5`, `opencode/claude-opus-4-6`, `opencode/claude-sonnet-4`, `opencode/claude-sonnet-4-5`, `opencode/gemini-3-flash`, `opencode/gemini-3-pro` (default), `opencode/glm-4.6`, `opencode/glm-4.7`, `opencode/gpt-5`, `opencode/gpt-5-codex`, `opencode/gpt-5-nano`, `opencode/gpt-5.1`, `opencode/gpt-5.1-codex`, `opencode/gpt-5.1-codex-max`, `opencode/gpt-5.1-codex-mini`, `opencode/gpt-5.2`, `opencode/gpt-5.2-codex`, `opencode/kimi-k2`, `opencode/kimi-k2-thinking`, `opencode/kimi-k2.5`, `opencode/kimi-k2.5-free`, `opencode/minimax-m2.1`, `opencode/minimax-m2.1-free`, `opencode/trinity-large-preview-free` |
</Accordion>

20
docs/agents/pi.mdx Normal file

@ -0,0 +1,20 @@
---
title: "Pi"
description: "Use Pi as a sandbox agent."
---
## Usage
```typescript
const session = await client.createSession({
agent: "pi",
});
```
## Capabilities
| Category | Values |
|----------|--------|
| **Models** | `default` |
| **Modes** | Unsupported |
| **Thought levels** | Unsupported |

View file

@ -8,8 +8,8 @@ Mintlify publishes `llms.txt` and `llms-full.txt` for this documentation site.
Access them at:
```
https://rivet.dev/docs/llms.txt
https://rivet.dev/docs/llms-full.txt
https://sandboxagent.dev/docs/llms.txt
https://sandboxagent.dev/docs/llms-full.txt
```
If you run a reverse proxy in front of the docs, forward `/llms.txt` and `/llms-full.txt` to Mintlify.

View file

@ -8,7 +8,7 @@ Mintlify hosts a `skill.md` file for this documentation site.
Access it at:
```
https://rivet.dev/docs/skill.md
https://sandboxagent.dev/docs/skill.md
```
To add it to an agent using the Skills CLI:

63
docs/architecture.mdx Normal file

@ -0,0 +1,63 @@
---
title: "Architecture"
description: "How the Sandbox Agent server, SDK, and agent processes fit together."
---
Sandbox Agent is a lightweight HTTP server that runs **inside** a sandbox. It provides:
- **Agent management**: Installs, spawns, and stops coding agent processes
- **Sessions**: Routes prompts to agents and streams events back in real time
- **Sandbox APIs**: Filesystem, process, and terminal access for the sandbox environment
## Components
```mermaid
flowchart LR
CLIENT["Your App"]
subgraph SANDBOX["Sandbox"]
direction TB
SERVER["Sandbox Agent Server"]
AGENT["Agent Process<br/>(Claude, Codex, etc.)"]
SERVER --> AGENT
end
CLIENT -->|"SDK (HTTP)"| SERVER
```
- **Your app**: Uses the `sandbox-agent` TypeScript SDK to talk to the server over HTTP.
- **Sandbox**: An isolated runtime (local process, Docker, E2B, Daytona, Vercel, Cloudflare).
- **Sandbox Agent server**: A single binary inside the sandbox that manages agent lifecycles, routes prompts, streams events, and exposes filesystem/process/terminal APIs.
- **Agent process**: A coding agent (Claude Code, Codex, etc.) spawned by the server. Each session maps to one agent process.
## What `SandboxAgent.start()` does
1. **Provision**: The provider creates a sandbox (starts a container, creates a VM, etc.)
2. **Install**: The Sandbox Agent binary is installed inside the sandbox
3. **Boot**: The server starts listening on an HTTP port
4. **Health check**: The SDK waits for `/v1/health` to respond
5. **Ready**: The SDK returns a connected client
For the `local` provider, provisioning is a no-op and the server runs as a local subprocess.
### Server recovery
If the server process stops, the SDK automatically calls the provider's `ensureServer()` after 3 consecutive health-check failures. Most built-in providers implement this. Custom providers can add `ensureServer(sandboxId)` to their `SandboxProvider` object.
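The "3 consecutive failures" rule can be sketched as a small counter that resets on any successful check. This is an illustration of the idea only, not the SDK's implementation; `ConsecutiveFailureTracker` and its `record` method are hypothetical names.

```ts
// Hypothetical helper illustrating "N consecutive failures" detection.
// The class name and API are assumptions, not the SDK's implementation.
class ConsecutiveFailureTracker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Record one health-check result. Returns true when the threshold of
  // consecutive failures is reached, then resets the counter.
  record(healthy: boolean): boolean {
    if (healthy) {
      this.failures = 0;
      return false;
    }
    this.failures += 1;
    if (this.failures < this.threshold) return false;
    this.failures = 0;
    return true;
  }
}

const tracker = new ConsecutiveFailureTracker(3);
const results = [false, false, true, false, false, false];
const triggers = results.map((healthy) => tracker.record(healthy));
console.log(triggers);
```

In this sketch, the point where `record` returns `true` is where a caller would invoke the provider's `ensureServer(sandboxId)`. A single successful check in between resets the count, so isolated failures do not trigger recovery.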
## Server HTTP API
See the [HTTP API reference](/api-reference) for the full list of server endpoints.
## Agent installation
Agents are installed lazily on first use. To avoid the cold-start delay, pre-install them:
```bash
sandbox-agent install-agent --all
```
The `rivetdev/sandbox-agent:0.4.2-full` Docker image ships with all agents pre-installed.
## Production-ready agent orchestration
For production deployments, see [Orchestration Architecture](/orchestration-architecture) for recommended topology, backend requirements, and session persistence patterns.

View file

@ -1,11 +1,11 @@
---
title: "Attachments"
description: "Upload files into the sandbox and attach them to prompts."
description: "Upload files into the sandbox and reference them in prompts."
sidebarTitle: "Attachments"
icon: "paperclip"
---
Use the filesystem API to upload files, then reference them as attachments when sending prompts.
Use the filesystem API to upload files, then include file references in prompt content.
<Steps>
<Step title="Upload a file">
@ -14,15 +14,14 @@ Use the filesystem API to upload files, then reference them as attachments when
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const buffer = await fs.promises.readFile("./data.csv");
const upload = await client.writeFsFile(
{ path: "./uploads/data.csv", sessionId: "my-session" },
const upload = await sdk.writeFsFile(
{ path: "./uploads/data.csv" },
buffer,
);
@ -30,58 +29,33 @@ Use the filesystem API to upload files, then reference them as attachments when
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./uploads/data.csv" \
--data-binary @./data.csv
```
</CodeGroup>
The response returns the absolute path that you should use for attachments.
The upload response returns the absolute path.
</Step>
<Step title="Attach the file in a prompt">
<CodeGroup>
<Step title="Reference the file in a prompt">
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const session = await sdk.createSession({ agent: "mock" });
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.postMessage("my-session", {
message: "Please analyze the attached CSV.",
attachments: [
{
path: "/home/sandbox/uploads/data.csv",
mime: "text/csv",
filename: "data.csv",
},
],
});
await session.prompt([
{ type: "text", text: "Please analyze the attached CSV." },
{
type: "resource_link",
name: "data.csv",
uri: "file:///home/sandbox/uploads/data.csv",
mimeType: "text/csv",
},
]);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"message": "Please analyze the attached CSV.",
"attachments": [
{
"path": "/home/sandbox/uploads/data.csv",
"mime": "text/csv",
"filename": "data.csv"
}
]
}'
```
</CodeGroup>
</Step>
</Steps>
## Notes
- Use absolute paths from the upload response to avoid ambiguity.
- If `mime` is omitted, the server defaults to `application/octet-stream`.
- OpenCode receives file parts directly; other agents will see the attachment paths appended to the prompt.
- Use absolute file URIs in `resource_link` blocks.
- If `mimeType` is omitted, the agent/runtime may infer a default.
- Support for non-text resources depends on each agent's prompt capabilities.
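To make the absolute-URI note concrete, here is a small sketch that turns an absolute POSIX path into a `resource_link` block. The `ResourceLink` type mirrors the fields used in the prompt example above; `toResourceLink` is a hypothetical helper, not part of the SDK, and it assumes POSIX-style paths.

```ts
// Shape taken from the `resource_link` block in the prompt example above.
type ResourceLink = {
  type: "resource_link";
  name: string;
  uri: string;
  mimeType?: string;
};

// Hypothetical helper: builds a resource_link from an absolute POSIX path.
function toResourceLink(absolutePath: string, mimeType?: string): ResourceLink {
  if (!absolutePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absolutePath}`);
  }
  // Use the final path segment as the display name.
  const name = absolutePath.slice(absolutePath.lastIndexOf("/") + 1);
  // Percent-encode each segment so spaces and reserved characters are valid.
  const encoded = absolutePath
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  const link: ResourceLink = { type: "resource_link", name, uri: "file://" + encoded };
  if (mimeType) link.mimeType = mimeType;
  return link;
}

console.log(toResourceLink("/home/sandbox/uploads/data.csv", "text/csv"));
```

Passing the absolute path returned by the upload response into a helper like this avoids hand-building URIs and rejects relative paths early.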

View file

@ -1,370 +0,0 @@
---
title: "Building a Chat UI"
description: "Build a chat interface using the universal event stream."
icon: "comments"
---
## Setup
### List agents
```ts
const { agents } = await client.listAgents();
// Each agent exposes feature coverage via `capabilities` to determine what UI to show
const claude = agents.find((a) => a.id === "claude");
if (claude?.capabilities.permissions) {
// Show permission approval UI
}
if (claude?.capabilities.questions) {
// Show question response UI
}
```
### Create a session
```ts
const sessionId = `session-${crypto.randomUUID()}`;
await client.createSession(sessionId, {
agent: "claude",
agentMode: "code", // Optional: agent-specific mode
permissionMode: "default", // Optional: "default" | "plan" | "bypass" | "acceptEdits" (Claude: accept edits; Codex: auto-approve file changes; others: default)
model: "claude-sonnet-4", // Optional: model override
});
```
### Send a message
```ts
await client.postMessage(sessionId, { message: "Hello, world!" });
```
### Stream events
Three options for receiving events:
```ts
// Option 1: SSE (recommended for real-time UI)
const stream = client.streamEvents(sessionId, { offset: 0 });
for await (const event of stream) {
handleEvent(event);
}
// Option 2: Polling
const { events, hasMore } = await client.getEvents(sessionId, { offset: 0 });
events.forEach(handleEvent);
// Option 3: Turn streaming (send + stream in one call)
const stream = client.streamTurn(sessionId, { message: "Hello" });
for await (const event of stream) {
handleEvent(event);
}
```
Use `offset` to track the last seen `sequence` number and resume from where you left off.
---
## Handling Events
### Bare minimum
Handle item lifecycle plus turn lifecycle to render a basic chat:
```ts
type ItemState = {
item: UniversalItem;
deltas: string[];
};
const items = new Map<string, ItemState>();
let turnInProgress = false;
function handleEvent(event: UniversalEvent) {
switch (event.type) {
case "turn.started": {
turnInProgress = true;
break;
}
case "turn.ended": {
turnInProgress = false;
break;
}
case "item.started": {
const { item } = event.data as ItemEventData;
items.set(item.item_id, { item, deltas: [] });
break;
}
case "item.delta": {
const { item_id, delta } = event.data as ItemDeltaData;
const state = items.get(item_id);
if (state) {
state.deltas.push(delta);
}
break;
}
case "item.completed": {
const { item } = event.data as ItemEventData;
const state = items.get(item.item_id);
if (state) {
state.item = item;
state.deltas = []; // Clear deltas, use final content
}
break;
}
}
}
```
When rendering:
- Use `turnInProgress` for turn-level UI state (disable send button, show global "Agent is responding", etc.).
- Use `item.status === "in_progress"` for per-item streaming state.
```ts
function renderItem(state: ItemState) {
const { item, deltas } = state;
const isItemLoading = item.status === "in_progress";
// For streaming text, combine item content with accumulated deltas
const text = item.content
.filter((p) => p.type === "text")
.map((p) => p.text)
.join("");
const streamedText = text + deltas.join("");
return {
content: streamedText,
isItemLoading,
isTurnLoading: turnInProgress,
role: item.role,
kind: item.kind,
};
}
```
### Extra events
Handle these for a complete implementation:
```ts
function handleEvent(event: UniversalEvent) {
switch (event.type) {
// ... bare minimum events above ...
case "session.started": {
// Session is ready
break;
}
case "session.ended": {
const { reason, terminated_by } = event.data as SessionEndedData;
// Disable input, show end reason
// reason: "completed" | "error" | "terminated"
// terminated_by: "agent" | "daemon"
break;
}
case "error": {
const { message, code } = event.data as ErrorData;
// Display error to user
break;
}
case "agent.unparsed": {
const { error, location } = event.data as AgentUnparsedData;
// Parsing failure - treat as bug in development
console.error(`Parse error at ${location}: ${error}`);
break;
}
}
}
```
### Content parts
Each item has `content` parts. Render based on `type`:
```ts
function renderContentPart(part: ContentPart) {
switch (part.type) {
case "text":
return <Markdown>{part.text}</Markdown>;
case "tool_call":
return <ToolCall name={part.name} args={part.arguments} />;
case "tool_result":
return <ToolResult output={part.output} />;
case "file_ref":
return <FileChange path={part.path} action={part.action} diff={part.diff} />;
case "reasoning":
return <Reasoning>{part.text}</Reasoning>;
case "status":
return <Status label={part.label} detail={part.detail} />;
case "image":
return <Image src={part.path} />;
}
}
```
---
## Handling Permissions
When `permission.requested` arrives, show an approval UI:
```ts
const pendingPermissions = new Map<string, PermissionEventData>();
function handleEvent(event: UniversalEvent) {
if (event.type === "permission.requested") {
const data = event.data as PermissionEventData;
pendingPermissions.set(data.permission_id, data);
}
if (event.type === "permission.resolved") {
const data = event.data as PermissionEventData;
pendingPermissions.delete(data.permission_id);
}
}
// User clicks approve/deny
async function replyPermission(id: string, reply: "once" | "always" | "reject") {
await client.replyPermission(sessionId, id, { reply });
pendingPermissions.delete(id);
}
```
Render permission requests:
```ts
function PermissionRequest({ data }: { data: PermissionEventData }) {
return (
<div>
<p>Allow: {data.action}</p>
<button onClick={() => replyPermission(data.permission_id, "once")}>
Allow Once
</button>
<button onClick={() => replyPermission(data.permission_id, "always")}>
Always Allow
</button>
<button onClick={() => replyPermission(data.permission_id, "reject")}>
Reject
</button>
</div>
);
}
```
---
## Handling Questions
When `question.requested` arrives, show a selection UI:
```ts
const pendingQuestions = new Map<string, QuestionEventData>();
function handleEvent(event: UniversalEvent) {
if (event.type === "question.requested") {
const data = event.data as QuestionEventData;
pendingQuestions.set(data.question_id, data);
}
if (event.type === "question.resolved") {
const data = event.data as QuestionEventData;
pendingQuestions.delete(data.question_id);
}
}
// User selects answer(s)
async function answerQuestion(id: string, answers: string[][]) {
await client.replyQuestion(sessionId, id, { answers });
pendingQuestions.delete(id);
}
async function rejectQuestion(id: string) {
await client.rejectQuestion(sessionId, id);
pendingQuestions.delete(id);
}
```
Render question requests:
```ts
function QuestionRequest({ data }: { data: QuestionEventData }) {
const [selected, setSelected] = useState<string[]>([]);
return (
<div>
<p>{data.prompt}</p>
{data.options.map((option) => (
<label key={option}>
<input
type="checkbox"
checked={selected.includes(option)}
onChange={(e) => {
if (e.target.checked) {
setSelected([...selected, option]);
} else {
setSelected(selected.filter((s) => s !== option));
}
}}
/>
{option}
</label>
))}
<button onClick={() => answerQuestion(data.question_id, [selected])}>
Submit
</button>
<button onClick={() => rejectQuestion(data.question_id)}>
Reject
</button>
</div>
);
}
```
---
## Testing with Mock Agent
The `mock` agent lets you test UI behaviors without external credentials:
```ts
await client.createSession("test-session", { agent: "mock" });
```
Send `help` to see available commands:
| Command | Tests |
|---------|-------|
| `help` | Lists all commands |
| `demo` | Full UI coverage sequence with markers |
| `markdown` | Streaming markdown rendering |
| `tool` | Tool call + result with file refs |
| `status` | Status item updates |
| `image` | Image content part |
| `permission` | Permission request flow |
| `question` | Question request flow |
| `error` | Error + unparsed events |
| `end` | Session ended event |
| `echo <text>` | Echo text as assistant message |
Any unrecognized text is echoed back as an assistant message.
---
## Reference Implementation
The [Inspector UI](https://github.com/rivet-dev/sandbox-agent/blob/main/frontend/packages/inspector/src/App.tsx)
is a complete reference showing session management, event rendering, and HITL flows.

View file

@ -1,12 +1,17 @@
---
title: "CLI Reference"
description: "Complete CLI reference for sandbox-agent."
description: "CLI reference for sandbox-agent."
sidebarTitle: "CLI"
---
## Server
Global flags (available on all commands):
Start the HTTP server:
- `-t, --token <TOKEN>`: require/use bearer auth
- `-n, --no-token`: disable auth
## server
Run the HTTP server.
```bash
sandbox-agent server [OPTIONS]
@ -14,50 +19,112 @@ sandbox-agent server [OPTIONS]
| Option | Default | Description |
|--------|---------|-------------|
| `-t, --token <TOKEN>` | - | Authentication token for all requests |
| `-n, --no-token` | - | Disable authentication (local dev only) |
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `-O, --cors-allow-origin <ORIGIN>` | - | CORS origin to allow (repeatable) |
| `-M, --cors-allow-method <METHOD>` | all | CORS allowed method (repeatable) |
| `-A, --cors-allow-header <HEADER>` | all | CORS allowed header (repeatable) |
| `-C, --cors-allow-credentials` | - | Enable CORS credentials |
| `--no-telemetry` | - | Disable anonymous telemetry |
| `--log-to-file` | - | Redirect server logs to a daily log file |
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind |
| `-p, --port <PORT>` | `2468` | Port to bind |
| `-O, --cors-allow-origin <ORIGIN>` | - | Allowed CORS origin (repeatable) |
| `-M, --cors-allow-method <METHOD>` | all | Allowed CORS method (repeatable) |
| `-A, --cors-allow-header <HEADER>` | all | Allowed CORS header (repeatable) |
| `-C, --cors-allow-credentials` | false | Enable CORS credentials |
| `--no-telemetry` | false | Disable anonymous telemetry |
```bash
sandbox-agent server --token "$TOKEN" --port 3000
sandbox-agent server --port 3000
```
Server logs print to stdout/stderr by default. Use `--log-to-file` or `SANDBOX_AGENT_LOG_TO_FILE=1` to redirect logs to a daily log file under the sandbox-agent data directory (for example, `~/.local/share/sandbox-agent/logs`). Override the directory with `SANDBOX_AGENT_LOG_DIR`, or set `SANDBOX_AGENT_LOG_STDOUT=1` to force stdout/stderr.
Notes:
HTTP request logging is enabled by default. Control it with:
- `SANDBOX_AGENT_LOG_HTTP=0` to disable request logs
- `SANDBOX_AGENT_LOG_HTTP_HEADERS=1` to include request headers (Authorization is redacted)
- Server logs are redirected to files by default.
- Set `SANDBOX_AGENT_LOG_STDOUT=1` to force stdout/stderr logging.
- Use `SANDBOX_AGENT_LOG_DIR` to override the log directory.
---
## install
## Install Agent (Local)
Install first-party runtime dependencies.
Install an agent without running the server:
### install desktop
Install the Linux desktop runtime packages required by `/v1/desktop/*`.
```bash
sandbox-agent install-agent <AGENT> [OPTIONS]
sandbox-agent install desktop [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-r, --reinstall` | Force reinstall even if already installed |
| `--yes` | Skip the confirmation prompt |
| `--print-only` | Print the package-manager command without executing it |
| `--package-manager <apt\|dnf\|apk>` | Override package-manager detection |
| `--no-fonts` | Skip the default DejaVu font package |
```bash
sandbox-agent install desktop --yes
sandbox-agent install desktop --print-only
```
Notes:
- Supported on Linux only.
- The command detects `apt`, `dnf`, or `apk`.
- If the host is not already running as root, the command requires `sudo`.
## install-agent
Install or reinstall a single agent, or every supported agent with `--all`.
```bash
sandbox-agent install-agent [<AGENT>] [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--all` | Install every supported agent |
| `-r, --reinstall` | Force reinstall |
| `--agent-version <VERSION>` | Override agent package version (conflicts with `--all`) |
| `--agent-process-version <VERSION>` | Override agent process version (conflicts with `--all`) |
Examples:
```bash
sandbox-agent install-agent claude --reinstall
sandbox-agent install-agent --all
```
---
### Custom Pi implementation path
## OpenCode (Experimental)
If you use a forked/custom `pi` binary with `pi-acp`, you can override what executable gets launched.
Start (or reuse) a sandbox-agent daemon and attach an OpenCode session (uses `opencode attach`):
#### Option 1: explicit command override (recommended)
Set `PI_ACP_PI_COMMAND` in the environment where `sandbox-agent` runs:
```bash
PI_ACP_PI_COMMAND=/absolute/path/to/your/pi-fork sandbox-agent server
```
This is forwarded to `pi-acp`, which uses it instead of looking up `pi` on `PATH`.
#### Option 2: PATH override
Put your custom `pi` first on `PATH` before starting `sandbox-agent`:
```bash
export PATH="/path/to/custom-pi-dir:$PATH"
sandbox-agent server
```
#### Option 3: symlink override
Point `pi` to your custom binary via symlink in a directory that is early on `PATH`:
```bash
ln -sf /absolute/path/to/your/pi-fork /usr/local/bin/pi
```
Then start `sandbox-agent` normally.
## opencode (experimental)
Start/reuse daemon and run `opencode attach` against `/opencode`.
```bash
sandbox-agent opencode [OPTIONS]
@ -65,27 +132,20 @@ sandbox-agent opencode [OPTIONS]
| Option | Default | Description |
|--------|---------|-------------|
| `-t, --token <TOKEN>` | - | Authentication token for all requests |
| `-n, --no-token` | - | Disable authentication (local dev only) |
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `--session-title <TITLE>` | - | Title for the OpenCode session |
| `-H, --host <HOST>` | `127.0.0.1` | Daemon host |
| `-p, --port <PORT>` | `2468` | Daemon port |
| `--session-title <TITLE>` | - | Reserved option (currently no-op) |
| `--yolo` | false | OpenCode attach mode flag |
```bash
sandbox-agent opencode --token "$TOKEN"
sandbox-agent opencode
```
The daemon logs to a per-host log file under the sandbox-agent data directory (for example, `~/.local/share/sandbox-agent/daemon/daemon-127-0-0-1-2468.log`).
## daemon
Existing installs are reused and missing binaries are installed automatically.
Manage the background daemon.
---
## Daemon
Manage the background daemon. See the [Daemon](/daemon) docs for details on lifecycle and auto-upgrade.
### Start
### daemon start
```bash
sandbox-agent daemon start [OPTIONS]
@ -93,16 +153,16 @@ sandbox-agent daemon start [OPTIONS]
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `-t, --token <TOKEN>` | - | Authentication token |
| `-n, --no-token` | - | Disable authentication |
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
| `--upgrade` | false | Use ensure-running + upgrade behavior |
```bash
sandbox-agent daemon start --no-token
sandbox-agent daemon start
sandbox-agent daemon start --upgrade
```
### Stop
### daemon stop
```bash
sandbox-agent daemon stop [OPTIONS]
@ -110,10 +170,10 @@ sandbox-agent daemon stop [OPTIONS]
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host of the daemon |
| `-p, --port <PORT>` | `2468` | Port of the daemon |
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
### Status
### daemon status
```bash
sandbox-agent daemon status [OPTIONS]
@ -121,16 +181,12 @@ sandbox-agent daemon status [OPTIONS]
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host of the daemon |
| `-p, --port <PORT>` | `2468` | Port of the daemon |
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
---
## credentials
## Credentials
### Extract
Extract locally discovered credentials:
### credentials extract
```bash
sandbox-agent credentials extract [OPTIONS]
@ -138,20 +194,17 @@ sandbox-agent credentials extract [OPTIONS]
| Option | Description |
|--------|-------------|
| `-a, --agent <AGENT>` | Filter by agent (`claude`, `codex`, `opencode`, `amp`) |
| `-p, --provider <PROVIDER>` | Filter by provider (`anthropic`, `openai`) |
| `-d, --home-dir <DIR>` | Custom home directory for credential search |
| `-r, --reveal` | Show full credential values (default: redacted) |
| `--no-oauth` | Exclude OAuth credentials |
| `-a, --agent <AGENT>` | Filter by `claude`, `codex`, `opencode`, or `amp` |
| `-p, --provider <PROVIDER>` | Filter by provider |
| `-d, --home-dir <DIR>` | Override home dir |
| `--no-oauth` | Skip OAuth sources |
| `-r, --reveal` | Show full credential values |
```bash
sandbox-agent credentials extract --agent claude --reveal
sandbox-agent credentials extract --provider anthropic
```
### Extract as Environment Variables
Output credentials as shell environment variables:
### credentials extract-env
```bash
sandbox-agent credentials extract-env [OPTIONS]
@ -159,378 +212,87 @@ sandbox-agent credentials extract-env [OPTIONS]
| Option | Description |
|--------|-------------|
| `-e, --export` | Prefix each line with `export` |
| `-d, --home-dir <DIR>` | Custom home directory for credential search |
| `--no-oauth` | Exclude OAuth credentials |
| `-e, --export` | Prefix output with `export` |
| `-d, --home-dir <DIR>` | Override home dir |
| `--no-oauth` | Skip OAuth sources |
```bash
# Source directly into shell
eval "$(sandbox-agent credentials extract-env --export)"
```
---
## api
## API Commands
API subcommands for scripting.
The `sandbox-agent api` subcommand mirrors the HTTP API for scripting without client code.
All API commands support:
Shared option:
| Option | Default | Description |
|--------|---------|-------------|
| `-e, --endpoint <URL>` | `http://127.0.0.1:2468` | API endpoint |
| `-t, --token <TOKEN>` | - | Authentication token |
| `-e, --endpoint <URL>` | `http://127.0.0.1:2468` | Target server |
---
### api agents
### Agents
```bash
sandbox-agent api agents list [--endpoint <URL>]
sandbox-agent api agents report [--endpoint <URL>]
sandbox-agent api agents install <AGENT> [--reinstall] [--endpoint <URL>]
```
#### List Agents
#### api agents list
List all agents and their install status.
```bash
sandbox-agent api agents list
```
#### Install Agent
#### api agents report
Emit a JSON report of available models, modes, and thought levels for every agent, grouped by category.
```bash
sandbox-agent api agents install <AGENT> [OPTIONS]
sandbox-agent api agents report --endpoint http://127.0.0.1:2468 | jq .
```
| Option | Description |
|--------|-------------|
| `-r, --reinstall` | Force reinstall |
Example output:
```json
{
"generatedAtMs": 1740000000000,
"endpoint": "http://127.0.0.1:2468",
"agents": [
{
"id": "claude",
"installed": true,
"models": {
"currentValue": "default",
"values": [
{ "value": "default", "name": "Default" },
{ "value": "sonnet", "name": "Sonnet" },
{ "value": "opus", "name": "Opus" },
{ "value": "haiku", "name": "Haiku" }
]
},
"modes": {
"currentValue": "default",
"values": [
{ "value": "default", "name": "Default" },
{ "value": "acceptEdits", "name": "Accept Edits" },
{ "value": "plan", "name": "Plan" },
{ "value": "dontAsk", "name": "Don't Ask" },
{ "value": "bypassPermissions", "name": "Bypass Permissions" }
]
},
"thoughtLevels": { "values": [] }
}
]
}
```
See individual agent pages (e.g. [Claude](/agents/claude), [Codex](/agents/codex)) for supported models, modes, and thought levels.
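As a rough illustration of consuming this report in a script, the sketch below types only the fields visible in the example output and pulls out the model values for one agent. The interface names and the `modelValuesFor` helper are hypothetical, and the real report may carry additional fields.

```ts
// Types inferred from the example output above; not an official schema.
interface ReportOption {
  value: string;
  name: string;
}

interface ReportCategory {
  currentValue?: string;
  values: ReportOption[];
}

interface AgentReportEntry {
  id: string;
  installed: boolean;
  models: ReportCategory;
  modes: ReportCategory;
  thoughtLevels: ReportCategory;
}

interface AgentsReport {
  generatedAtMs: number;
  endpoint: string;
  agents: AgentReportEntry[];
}

// Hypothetical helper: list the model identifiers reported for one agent.
// Returns an empty array when the agent is missing from the report.
function modelValuesFor(report: AgentsReport, agentId: string): string[] {
  const agent = report.agents.find((entry) => entry.id === agentId);
  return agent ? agent.models.values.map((option) => option.value) : [];
}
```

In a script, the JSON printed by `sandbox-agent api agents report` could be parsed with `JSON.parse` and passed to `modelValuesFor`.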
#### api agents install
```bash
sandbox-agent api agents install claude --reinstall
sandbox-agent api agents install codex --reinstall
```
#### Get Agent Modes
```bash
sandbox-agent api agents modes <AGENT>
```
```bash
sandbox-agent api agents modes claude
```
#### Get Agent Models
```bash
sandbox-agent api agents models <AGENT>
```
```bash
sandbox-agent api agents models claude
```
---
### Sessions
#### List Sessions
```bash
sandbox-agent api sessions list
```
#### Create Session
```bash
sandbox-agent api sessions create <SESSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-a, --agent <AGENT>` | Agent identifier (required) |
| `-g, --agent-mode <MODE>` | Agent mode |
| `-p, --permission-mode <MODE>` | Permission mode (`default`, `plan`, `bypass`, `acceptEdits`) |
| `-m, --model <MODEL>` | Model override |
| `-v, --variant <VARIANT>` | Model variant |
| `-A, --agent-version <VERSION>` | Agent version |
| `--mcp-config <PATH>` | JSON file with MCP server config (see `mcp` docs) |
| `--skill <PATH>` | Skill directory or `SKILL.md` path (repeatable) |
```bash
sandbox-agent api sessions create my-session \
--agent claude \
--agent-mode code \
--permission-mode default
```
`acceptEdits` passes through to Claude, auto-approves file changes for Codex, and is treated as `default` for other agents.
#### Send Message
```bash
sandbox-agent api sessions send-message <SESSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-m, --message <TEXT>` | Message text (required) |
```bash
sandbox-agent api sessions send-message my-session \
--message "Summarize the repository"
```
#### Send Message (Streaming)
Send a message and stream the response:
```bash
sandbox-agent api sessions send-message-stream <SESSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-m, --message <TEXT>` | Message text (required) |
| `--include-raw` | Include raw agent data |
```bash
sandbox-agent api sessions send-message-stream my-session \
--message "Help me debug this"
```
#### Terminate Session
```bash
sandbox-agent api sessions terminate <SESSION_ID>
```
```bash
sandbox-agent api sessions terminate my-session
```
#### Get Events
Fetch session events:
```bash
sandbox-agent api sessions events <SESSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-o, --offset <N>` | Event offset |
| `-l, --limit <N>` | Max events to return |
| `--include-raw` | Include raw agent data |
```bash
sandbox-agent api sessions events my-session --offset 0 --limit 50
```
`get-messages` is an alias for `events`.
#### Stream Events (SSE)
Stream session events via Server-Sent Events:
```bash
sandbox-agent api sessions events-sse <SESSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-o, --offset <N>` | Event offset to start from |
| `--include-raw` | Include raw agent data |
```bash
sandbox-agent api sessions events-sse my-session --offset 0
```
#### Reply to Question
```bash
sandbox-agent api sessions reply-question <SESSION_ID> <QUESTION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-a, --answers <JSON>` | JSON array of answers (required) |
```bash
sandbox-agent api sessions reply-question my-session q1 \
--answers '[["yes"]]'
```
#### Reject Question
```bash
sandbox-agent api sessions reject-question <SESSION_ID> <QUESTION_ID>
```
```bash
sandbox-agent api sessions reject-question my-session q1
```
#### Reply to Permission
```bash
sandbox-agent api sessions reply-permission <SESSION_ID> <PERMISSION_ID> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `-r, --reply <REPLY>` | `once`, `always`, or `reject` (required) |
```bash
sandbox-agent api sessions reply-permission my-session perm1 --reply once
```
---
### Filesystem
#### List Entries
```bash
sandbox-agent api fs entries [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--path <PATH>` | Directory path (default: `.`) |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs entries --path ./workspace
```
#### Read File
`api fs read` writes raw bytes to stdout.
```bash
sandbox-agent api fs read <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs read ./notes.txt > ./notes.txt
```
#### Write File
```bash
sandbox-agent api fs write <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--content <TEXT>` | Write UTF-8 content |
| `--from-file <PATH>` | Read content from a local file |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs write ./hello.txt --content "hello"
sandbox-agent api fs write ./image.bin --from-file ./image.bin
```
#### Delete Entry
```bash
sandbox-agent api fs delete <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--recursive` | Delete directories recursively |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs delete ./old.log
```
#### Create Directory
```bash
sandbox-agent api fs mkdir <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs mkdir ./cache
```
#### Move/Rename
```bash
sandbox-agent api fs move <FROM> <TO> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--overwrite` | Overwrite destination if it exists |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs move ./a.txt ./b.txt --overwrite
```
#### Stat
```bash
sandbox-agent api fs stat <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs stat ./notes.txt
```
#### Upload Batch (tar)
```bash
sandbox-agent api fs upload-batch --tar <PATH> [OPTIONS]
```
| Option | Description |
|--------|-------------|
| `--tar <PATH>` | Tar archive to extract |
| `--path <PATH>` | Destination directory |
| `--session-id <SESSION_ID>` | Resolve relative paths from the session working directory |
```bash
sandbox-agent api fs upload-batch --tar ./skills.tar --path ./skills
```
---
## CLI to HTTP Mapping
| CLI Command | HTTP Endpoint |
|-------------|---------------|
| `api agents list` | `GET /v1/agents` |
| `api agents install` | `POST /v1/agents/{agent}/install` |
| `api agents modes` | `GET /v1/agents/{agent}/modes` |
| `api agents models` | `GET /v1/agents/{agent}/models` |
| `api sessions list` | `GET /v1/sessions` |
| `api sessions create` | `POST /v1/sessions/{sessionId}` |
| `api sessions send-message` | `POST /v1/sessions/{sessionId}/messages` |
| `api sessions send-message-stream` | `POST /v1/sessions/{sessionId}/messages/stream` |
| `api sessions terminate` | `POST /v1/sessions/{sessionId}/terminate` |
| `api sessions events` | `GET /v1/sessions/{sessionId}/events` |
| `api sessions events-sse` | `GET /v1/sessions/{sessionId}/events/sse` |
| `api sessions reply-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reply` |
| `api sessions reject-question` | `POST /v1/sessions/{sessionId}/questions/{questionId}/reject` |
| `api sessions reply-permission` | `POST /v1/sessions/{sessionId}/permissions/{permissionId}/reply` |
| `api fs entries` | `GET /v1/fs/entries` |
| `api fs read` | `GET /v1/fs/file` |
| `api fs write` | `PUT /v1/fs/file` |
| `api fs delete` | `DELETE /v1/fs/entry` |
| `api fs mkdir` | `POST /v1/fs/mkdir` |
| `api fs move` | `POST /v1/fs/move` |
| `api fs stat` | `GET /v1/fs/stat` |
| `api fs upload-batch` | `POST /v1/fs/upload-batch` |

560
docs/common-software.mdx Normal file

@ -0,0 +1,560 @@
---
title: "Common Software"
description: "Install browsers, languages, databases, and other tools inside the sandbox."
sidebarTitle: "Common Software"
icon: "box-open"
---
The sandbox runs a Debian/Ubuntu base image. You can install software with `apt-get` via the [Process API](/processes) or by customizing your Docker image. This page covers commonly needed packages and how to install them.
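Most examples on this page send the same `{ command, args }` shape to `POST /v1/processes/run`. A small helper can build that payload. This is a sketch only: `aptInstallRequest` and `aptUpdateRequest` are hypothetical names, not part of the SDK. Refreshing package lists first is an assumption about the base image, since fresh Debian/Ubuntu container images often ship with empty package lists, in which case `apt-get install` cannot locate packages.

```ts
// Request shape copied from the /v1/processes/run examples on this page.
interface RunProcessRequest {
  command: string;
  args: string[];
}

// Hypothetical helper (not part of the SDK): build the payload for a
// non-interactive `apt-get install` of one or more packages.
function aptInstallRequest(packages: string[]): RunProcessRequest {
  if (packages.length === 0) {
    throw new Error("at least one package name is required");
  }
  return { command: "apt-get", args: ["install", "-y", ...packages] };
}

// Payload for refreshing package lists before installing.
const aptUpdateRequest: RunProcessRequest = {
  command: "apt-get",
  args: ["update"],
};
```

Each object can be passed to `sdk.runProcess(...)` or serialized with `JSON.stringify` as the cURL request body.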
## Browsers
### Chromium
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "chromium", "chromium-sandbox"],
});
// Launch headless
await sdk.runProcess({
command: "chromium",
args: ["--headless", "--no-sandbox", "--disable-gpu", "https://example.com"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","chromium","chromium-sandbox"]}'
```
</CodeGroup>
<Note>
Use `--no-sandbox` when running Chromium inside a container. The container itself provides isolation.
</Note>
### Firefox
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "firefox-esr"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","firefox-esr"]}'
```
</CodeGroup>
### Playwright browsers
Playwright bundles its own browser binaries. Install the Playwright CLI and let it download browsers for you.
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "npx",
args: ["playwright", "install", "--with-deps", "chromium"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"npx","args":["playwright","install","--with-deps","chromium"]}'
```
</CodeGroup>
---
## Languages and runtimes
### Node.js
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "nodejs", "npm"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","nodejs","npm"]}'
```
</CodeGroup>
For a specific version, use [nvm](https://github.com/nvm-sh/nvm):
```ts TypeScript
await sdk.runProcess({
command: "bash",
args: ["-c", "curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash && . ~/.nvm/nvm.sh && nvm install 22"],
});
```
### Python
Python 3 is typically pre-installed. To add pip and common packages:
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "python3", "python3-pip", "python3-venv"],
});
await sdk.runProcess({
command: "pip3",
args: ["install", "numpy", "pandas", "matplotlib"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","python3","python3-pip","python3-venv"]}'
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"pip3","args":["install","numpy","pandas","matplotlib"]}'
```
</CodeGroup>
### Go
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "bash",
args: ["-c", "curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"],
});
// PATH changes do not persist across processes, so set PATH in the same shell invocation that runs go
await sdk.runProcess({
command: "bash",
args: ["-c", "export PATH=$PATH:/usr/local/go/bin && go version"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"bash","args":["-c","curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"]}'
```
</CodeGroup>
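The download URL above is hard-coded for `linux-amd64`; on an arm64 sandbox the archive name differs. A minimal sketch of choosing the URL, assuming Go's usual `go<version>.linux-<arch>.tar.gz` archive naming. `goTarballUrl` and `goInstallCommand` are hypothetical helpers, not SDK functions.

```ts
type GoArch = "amd64" | "arm64";

// Build the go.dev download URL for a Linux release archive.
function goTarballUrl(version: string, arch: GoArch): string {
  return `https://go.dev/dl/go${version}.linux-${arch}.tar.gz`;
}

// Shell command that downloads the archive and unpacks it into /usr/local,
// mirroring the install command used above.
function goInstallCommand(version: string, arch: GoArch): string {
  return `curl -fsSL ${goTarballUrl(version, arch)} | tar -C /usr/local -xz`;
}
```

The resulting string can be passed as the second element of `args` for a `bash -c` process, as in the example above.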
### Rust
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "bash",
args: ["-c", "curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"bash","args":["-c","curl --proto =https --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"]}'
```
</CodeGroup>
### Java (OpenJDK)
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "default-jdk"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","default-jdk"]}'
```
</CodeGroup>
### Ruby
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "ruby-full"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","ruby-full"]}'
```
</CodeGroup>
---
## Databases
### PostgreSQL
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "postgresql", "postgresql-client"],
});
// Start the cluster. Replace 15 with the installed PostgreSQL major version (`pg_lsclusters` lists it)
const proc = await sdk.createProcess({
command: "bash",
args: ["-c", "su - postgres -c 'pg_ctlcluster 15 main start'"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","postgresql","postgresql-client"]}'
```
</CodeGroup>
### SQLite
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "sqlite3"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","sqlite3"]}'
```
</CodeGroup>
### Redis
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "redis-server"],
});
const proc = await sdk.createProcess({
command: "redis-server",
args: ["--daemonize", "no"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","redis-server"]}'
curl -X POST "http://127.0.0.1:2468/v1/processes" \
-H "Content-Type: application/json" \
-d '{"command":"redis-server","args":["--daemonize","no"]}'
```
</CodeGroup>
### MySQL / MariaDB
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "mariadb-server", "mariadb-client"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","mariadb-server","mariadb-client"]}'
```
</CodeGroup>
---
## Build tools
### Essential build toolchain
Most compiled software needs the standard build toolchain:
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "build-essential", "cmake", "pkg-config"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","build-essential","cmake","pkg-config"]}'
```
</CodeGroup>
This installs `gcc`, `g++`, `make`, `cmake`, and related tools.
---
## Desktop applications
These require the [Computer Use](/computer-use) desktop to be started first.
### LibreOffice
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "libreoffice"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","libreoffice"]}'
```
</CodeGroup>
### GIMP
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "gimp"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","gimp"]}'
```
</CodeGroup>
### VLC
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "vlc"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","vlc"]}'
```
</CodeGroup>
### VS Code (code-server)
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "bash",
args: ["-c", "curl -fsSL https://code-server.dev/install.sh | sh"],
});
const proc = await sdk.createProcess({
command: "code-server",
args: ["--bind-addr", "0.0.0.0:8080", "--auth", "none"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"bash","args":["-c","curl -fsSL https://code-server.dev/install.sh | sh"]}'
curl -X POST "http://127.0.0.1:2468/v1/processes" \
-H "Content-Type: application/json" \
-d '{"command":"code-server","args":["--bind-addr","0.0.0.0:8080","--auth","none"]}'
```
</CodeGroup>
---
## CLI tools
### Git
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "git"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","git"]}'
```
</CodeGroup>
### Docker
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "bash",
args: ["-c", "curl -fsSL https://get.docker.com | sh"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"bash","args":["-c","curl -fsSL https://get.docker.com | sh"]}'
```
</CodeGroup>
### jq
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "jq"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","jq"]}'
```
</CodeGroup>
### tmux
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "tmux"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","tmux"]}'
```
</CodeGroup>
---
## Media and graphics
### FFmpeg
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "ffmpeg"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","ffmpeg"]}'
```
</CodeGroup>
### ImageMagick
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "imagemagick"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","imagemagick"]}'
```
</CodeGroup>
### Poppler (PDF utilities)
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "poppler-utils"],
});
// Convert PDF to images
await sdk.runProcess({
command: "pdftoppm",
args: ["-png", "document.pdf", "output"],
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","poppler-utils"]}'
```
</CodeGroup>
---
## Pre-installing in a Docker image
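For production use, install software in your Dockerfile instead of at runtime. This avoids repeated downloads and makes startup faster. Package names can differ between distributions: `chromium` and `firefox-esr` are Debian package names and may be missing or named differently on Ubuntu, so check them against your base image.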
```dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y \
chromium \
firefox-esr \
nodejs npm \
python3 python3-pip \
git curl wget \
build-essential \
sqlite3 \
ffmpeg \
imagemagick \
jq \
&& rm -rf /var/lib/apt/lists/*
RUN pip3 install numpy pandas matplotlib
```
See [Docker deployment](/deploy/docker) for how to use custom images with Sandbox Agent.

859
docs/computer-use.mdx Normal file

@ -0,0 +1,859 @@
---
title: "Computer Use"
description: "Control a virtual desktop inside the sandbox with mouse, keyboard, screenshots, recordings, and live streaming."
sidebarTitle: "Computer Use"
icon: "desktop"
---
Sandbox Agent provides a managed virtual desktop (Xvfb + openbox) that you can control programmatically. This is useful for browser automation, GUI testing, and AI computer-use workflows.
## Start and stop
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
const status = await sdk.startDesktop({
width: 1920,
height: 1080,
dpi: 96,
});
console.log(status.state); // "active"
console.log(status.display); // ":99"
// When done
await sdk.stopDesktop();
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \
-H "Content-Type: application/json" \
-d '{"width":1920,"height":1080,"dpi":96}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/stop"
```
</CodeGroup>
All fields in the start request are optional. Defaults are 1440x900 at 96 DPI.
### Start request options
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `width` | number | 1440 | Desktop width in pixels |
| `height` | number | 900 | Desktop height in pixels |
| `dpi` | number | 96 | Display DPI |
| `displayNum` | number | 99 | Starting X display number. The runtime probes from this number upward to find an available display. |
| `stateDir` | string | (auto) | Desktop state directory for home, logs, recordings |
| `streamVideoCodec` | string | `"vp8"` | WebRTC video codec (`vp8`, `vp9`, `h264`) |
| `streamAudioCodec` | string | `"opus"` | WebRTC audio codec (`opus`, `g722`) |
| `streamFrameRate` | number | 30 | Streaming frame rate (1-60) |
| `webrtcPortRange` | string | `"59050-59070"` | UDP port range for WebRTC media |
| `recordingFps` | number | 30 | Default recording FPS when not specified in `startDesktopRecording` (1-60) |
The streaming and recording options configure defaults for the desktop session. They take effect when streaming or recording is started later.
<CodeGroup>
```ts TypeScript
const status = await sdk.startDesktop({
width: 1920,
height: 1080,
streamVideoCodec: "h264",
streamFrameRate: 60,
webrtcPortRange: "59100-59120",
recordingFps: 15,
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \
-H "Content-Type: application/json" \
-d '{
"width": 1920,
"height": 1080,
"streamVideoCodec": "h264",
"streamFrameRate": 60,
"webrtcPortRange": "59100-59120",
"recordingFps": 15
}'
```
</CodeGroup>
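The ranges listed in the options table can be checked on the client before the request is sent. The following is a minimal sketch under that assumption: `StartOptions` and `validateStartOptions` are hypothetical helpers, not part of the SDK, and the `low-high` port range format is inferred from the default value shown above. The server remains the source of truth for validation.

```ts
// Hypothetical client-side check. Field names and limits mirror the options table above.
type StartOptions = {
  streamVideoCodec?: string;
  streamAudioCodec?: string;
  streamFrameRate?: number;
  webrtcPortRange?: string;
  recordingFps?: number;
};

const VIDEO_CODECS = ["vp8", "vp9", "h264"];
const AUDIO_CODECS = ["opus", "g722"];

// Returns a list of human-readable problems; an empty list means the options look valid.
function validateStartOptions(opts: StartOptions): string[] {
  const problems: string[] = [];
  // True when a value is present but falls outside [lo, hi] (NaN counts as outside).
  const outOfRange = (v: number | undefined, lo: number, hi: number) =>
    v !== undefined && !(v >= lo && v <= hi);

  if (opts.streamVideoCodec !== undefined && VIDEO_CODECS.indexOf(opts.streamVideoCodec) === -1) {
    problems.push("streamVideoCodec must be one of: " + VIDEO_CODECS.join(", "));
  }
  if (opts.streamAudioCodec !== undefined && AUDIO_CODECS.indexOf(opts.streamAudioCodec) === -1) {
    problems.push("streamAudioCodec must be one of: " + AUDIO_CODECS.join(", "));
  }
  if (outOfRange(opts.streamFrameRate, 1, 60)) {
    problems.push("streamFrameRate must be between 1 and 60");
  }
  if (outOfRange(opts.recordingFps, 1, 60)) {
    problems.push("recordingFps must be between 1 and 60");
  }
  if (opts.webrtcPortRange !== undefined) {
    // Assumption: the range is written as "low-high" using valid UDP port numbers.
    const m = /^(\d+)-(\d+)$/.exec(opts.webrtcPortRange);
    const ok =
      m !== null &&
      Number(m[1]) >= 1 &&
      Number(m[1]) <= Number(m[2]) &&
      Number(m[2]) <= 65535;
    if (!ok) problems.push('webrtcPortRange must look like "59050-59070"');
  }
  return problems;
}
```

Calling `validateStartOptions` on the request body before `sdk.startDesktop(...)` lets a caller report every problem at once rather than failing on the first rejected field.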
## Status
<CodeGroup>
```ts TypeScript
const status = await sdk.getDesktopStatus();
console.log(status.state); // "inactive" | "active" | "failed" | ...
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/status"
```
</CodeGroup>
## Screenshots
Capture the full desktop or a specific region. Optionally include the cursor position.
<CodeGroup>
```ts TypeScript
// Full screenshot (PNG by default)
const png = await sdk.takeDesktopScreenshot();
// JPEG at 70% quality, half scale
const jpeg = await sdk.takeDesktopScreenshot({
format: "jpeg",
quality: 70,
scale: 0.5,
});
// Include cursor overlay
const withCursor = await sdk.takeDesktopScreenshot({
showCursor: true,
});
// Region screenshot
const region = await sdk.takeDesktopRegionScreenshot({
x: 100,
y: 100,
width: 400,
height: 300,
});
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/screenshot" --output screenshot.png
curl "http://127.0.0.1:2468/v1/desktop/screenshot?format=jpeg&quality=70&scale=0.5" \
--output screenshot.jpg
# Include cursor overlay
curl "http://127.0.0.1:2468/v1/desktop/screenshot?show_cursor=true" \
--output with_cursor.png
curl "http://127.0.0.1:2468/v1/desktop/screenshot/region?x=100&y=100&width=400&height=300" \
--output region.png
```
</CodeGroup>
### Screenshot options
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | `"png"` | Output format: `png`, `jpeg`, or `webp` |
| `quality` | number | 85 | Compression quality (1-100, JPEG/WebP only) |
| `scale` | number | 1.0 | Scale factor (0.1-1.0) |
| `showCursor` | boolean | `false` | Composite a crosshair at the cursor position |
When `showCursor` is enabled, the cursor position is captured at the moment of the screenshot and a red crosshair is drawn at that location. This is useful for AI agents that need to see where the cursor is in the screenshot.
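The cURL examples above pass the cursor flag as `show_cursor`, while the SDK option is `showCursor`. A small sketch of how SDK-style options could be turned into the query string used by the HTTP endpoint: `ScreenshotOptions` and `screenshotQuery` are hypothetical names, and dropping `quality` for PNG output is an assumption based on the table's note that it applies to JPEG and WebP only.

```ts
// Hypothetical helper that builds the query string for GET /v1/desktop/screenshot.
type ScreenshotOptions = {
  format?: "png" | "jpeg" | "webp";
  quality?: number;
  scale?: number;
  showCursor?: boolean;
};

function screenshotQuery(opts: ScreenshotOptions = {}): string {
  const params: string[] = [];
  const format = opts.format === undefined ? "png" : opts.format;

  if (opts.format !== undefined) params.push("format=" + format);

  // Assumption: quality is only meaningful for lossy formats, so skip it for PNG.
  if (opts.quality !== undefined && format !== "png") {
    if (!(opts.quality >= 1 && opts.quality <= 100)) {
      throw new RangeError("quality must be between 1 and 100");
    }
    params.push("quality=" + opts.quality);
  }

  if (opts.scale !== undefined) {
    if (!(opts.scale >= 0.1 && opts.scale <= 1.0)) {
      throw new RangeError("scale must be between 0.1 and 1.0");
    }
    params.push("scale=" + opts.scale);
  }

  // The HTTP query parameter is snake_case, as in the cURL example above.
  if (opts.showCursor) params.push("show_cursor=true");

  return params.length > 0 ? "?" + params.join("&") : "";
}
```

For example, `screenshotQuery` called with `format: "jpeg"`, `quality: 70`, and `scale: 0.5` produces the same query string as the second cURL command above.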
## Mouse
<CodeGroup>
```ts TypeScript
// Get current position
const pos = await sdk.getDesktopMousePosition();
console.log(pos.x, pos.y);
// Move
await sdk.moveDesktopMouse({ x: 500, y: 300 });
// Click (left by default)
await sdk.clickDesktop({ x: 500, y: 300 });
// Right click
await sdk.clickDesktop({ x: 500, y: 300, button: "right" });
// Double click
await sdk.clickDesktop({ x: 500, y: 300, clickCount: 2 });
// Drag
await sdk.dragDesktopMouse({
startX: 100, startY: 100,
endX: 400, endY: 400,
});
// Scroll
await sdk.scrollDesktop({ x: 500, y: 300, deltaY: -3 });
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/mouse/position"
curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/click" \
-H "Content-Type: application/json" \
-d '{"x":500,"y":300}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/drag" \
-H "Content-Type: application/json" \
-d '{"startX":100,"startY":100,"endX":400,"endY":400}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/scroll" \
-H "Content-Type: application/json" \
-d '{"x":500,"y":300,"deltaY":-3}'
```
</CodeGroup>
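When a click target is picked from a downscaled screenshot, its pixel coordinates need to be mapped back to desktop coordinates before they are passed to `clickDesktop`. The sketch below assumes that `scale` shrinks both axes uniformly and that the mouse endpoints expect full-resolution desktop pixels; neither is stated explicitly on this page, and `screenshotToDesktop` is a hypothetical helper.

```ts
type Point = { x: number; y: number };
type Size = { width: number; height: number };

// Map a pixel in a scaled screenshot back to desktop coordinates,
// clamped so the result always lies inside the desktop.
function screenshotToDesktop(p: Point, scale: number, desktop: Size): Point {
  if (!(scale > 0 && scale <= 1)) {
    throw new RangeError("scale must be greater than 0 and at most 1");
  }
  const clamp = (v: number, max: number) => Math.min(Math.max(v, 0), max);
  return {
    x: clamp(Math.round(p.x / scale), desktop.width - 1),
    y: clamp(Math.round(p.y / scale), desktop.height - 1),
  };
}
```

With a 1920x1080 desktop and a screenshot taken at `scale: 0.5`, the screenshot pixel (250, 150) maps to the desktop point (500, 300), which can then be passed to `sdk.clickDesktop`.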
## Keyboard
<CodeGroup>
```ts TypeScript
// Type text
await sdk.typeDesktopText({ text: "Hello, world!" });
// Press a key with modifiers
await sdk.pressDesktopKey({
key: "c",
modifiers: { ctrl: true },
});
// Low-level key down/up
await sdk.keyDownDesktop({ key: "Shift_L" });
await sdk.keyUpDesktop({ key: "Shift_L" });
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/type" \
-H "Content-Type: application/json" \
-d '{"text":"Hello, world!"}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/press" \
-H "Content-Type: application/json" \
-d '{"key":"c","modifiers":{"ctrl":true}}'
```
</CodeGroup>
## Clipboard
Read and write the X11 clipboard programmatically.
<CodeGroup>
```ts TypeScript
// Read clipboard
const clipboard = await sdk.getDesktopClipboard();
console.log(clipboard.text);
// Read primary selection (mouse-selected text)
const primary = await sdk.getDesktopClipboard({ selection: "primary" });
// Write to clipboard
await sdk.setDesktopClipboard({ text: "Pasted via API" });
// Write to both clipboard and primary selection
await sdk.setDesktopClipboard({
text: "Synced text",
selection: "both",
});
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/clipboard"
curl "http://127.0.0.1:2468/v1/desktop/clipboard?selection=primary"
curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \
-H "Content-Type: application/json" \
-d '{"text":"Pasted via API"}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \
-H "Content-Type: application/json" \
-d '{"text":"Synced text","selection":"both"}'
```
</CodeGroup>
The `selection` parameter controls which X11 selection to read or write:
| Value | Description |
|-------|-------------|
| `clipboard` (default) | The standard clipboard (Ctrl+C / Ctrl+V) |
| `primary` | The primary selection (text selected with the mouse) |
| `both` | Write to both clipboard and primary selection (write only) |
## Display and windows
<CodeGroup>
```ts TypeScript
const display = await sdk.getDesktopDisplayInfo();
console.log(display.resolution); // { width: 1920, height: 1080, dpi: 96 }
const { windows } = await sdk.listDesktopWindows();
for (const win of windows) {
console.log(win.title, win.x, win.y, win.width, win.height);
}
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/display/info"
curl "http://127.0.0.1:2468/v1/desktop/windows"
```
</CodeGroup>
The windows endpoint filters out noise automatically: window manager internals (Openbox), windows with empty titles, and tiny helper windows (under 120x80) are excluded. The currently active/focused window is always included regardless of filters.
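The filtering rules above can be read as a single predicate. This is a rough sketch of that logic, not the server's actual code: the `isActive` and `isWindowManager` fields are hypothetical stand-ins for however the runtime detects focus and Openbox internals, and treating a window as tiny when either dimension is below the threshold is only one reading of "under 120x80".

```ts
// Hypothetical window shape; only the fields needed for filtering are modeled.
type CandidateWindow = {
  id: string;
  title: string;
  width: number;
  height: number;
  isActive: boolean; // assumption: set for the currently focused window
  isWindowManager: boolean; // assumption: set for Openbox internal windows
};

const MIN_WIDTH = 120;
const MIN_HEIGHT = 80;

function filterVisibleWindows(windows: CandidateWindow[]): CandidateWindow[] {
  return windows.filter((w) => {
    // The focused window is always kept, regardless of the other rules.
    if (w.isActive) return true;
    if (w.isWindowManager) return false;
    if (w.title.trim() === "") return false;
    // Assumption: "tiny" means either dimension is under the threshold.
    if (w.width < MIN_WIDTH || w.height < MIN_HEIGHT) return false;
    return true;
  });
}
```

Checking the focus rule first is what lets an untitled or very small window still appear in the list while it has focus.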
### Focused window
Get the currently focused window without listing all windows.
<CodeGroup>
```ts TypeScript
const focused = await sdk.getDesktopFocusedWindow();
console.log(focused.title, focused.id);
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/windows/focused"
```
</CodeGroup>
Returns 404 if no window currently has focus.
### Window management
Focus, move, and resize windows by their X11 window ID.
<CodeGroup>
```ts TypeScript
const { windows } = await sdk.listDesktopWindows();
const win = windows[0];
// Bring window to foreground
await sdk.focusDesktopWindow(win.id);
// Move window
await sdk.moveDesktopWindow(win.id, { x: 100, y: 50 });
// Resize window
await sdk.resizeDesktopWindow(win.id, { width: 1280, height: 720 });
```
```bash cURL
# Focus a window
curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/focus"
# Move a window
curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/move" \
-H "Content-Type: application/json" \
-d '{"x":100,"y":50}'
# Resize a window
curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/resize" \
-H "Content-Type: application/json" \
-d '{"width":1280,"height":720}'
```
</CodeGroup>
All three endpoints return the updated window info so you can verify the operation took effect. The window manager may adjust the requested position or size.
## App launching
Launch applications or open files/URLs on the desktop without needing to shell out.
<CodeGroup>
```ts TypeScript
// Launch an app by name
const result = await sdk.launchDesktopApp({
app: "firefox",
args: ["--private"],
});
console.log(result.processId); // "proc_7"
// Launch and wait for the window to appear
const withWindow = await sdk.launchDesktopApp({
app: "xterm",
wait: true,
});
console.log(withWindow.windowId); // "12345" or null if timed out
// Open a URL with the default handler
const opened = await sdk.openDesktopTarget({
target: "https://example.com",
});
console.log(opened.processId);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \
-H "Content-Type: application/json" \
-d '{"app":"firefox","args":["--private"]}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \
-H "Content-Type: application/json" \
-d '{"app":"xterm","wait":true}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/open" \
-H "Content-Type: application/json" \
-d '{"target":"https://example.com"}'
```
</CodeGroup>
The returned `processId` can be used with the [Process API](/processes) to read logs (`GET /v1/processes/{id}/logs`) or stop the application (`POST /v1/processes/{id}/stop`).
When `wait` is `true`, the API polls for up to 5 seconds for a window to appear. If the window appears, its ID is returned in `windowId`. If it times out, `windowId` is `null` but the process is still running.
<Tip>
**Launch/Open vs the Process API:** Both `launch` and `open` are convenience wrappers around the [Process API](/processes). They create managed processes (with `owner: "desktop"`) that you can inspect, log, and stop through the same Process endpoints. The difference is that `launch` validates the binary exists in PATH first and can optionally wait for a window to appear, while `open` delegates to the system default handler (`xdg-open`). Use the Process API directly when you need full control over command, environment, working directory, or restart policies.
</Tip>
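The `wait` behavior described above (poll for up to 5 seconds, return `null` on timeout while leaving the process running) can be illustrated with a generic polling loop. This is a sketch of the pattern only: `pollForWindow`, the `findWindow` callback, and the 100 ms interval are hypothetical, and the server's real polling interval is not documented here.

```ts
// Poll `findWindow` until it yields a window ID or the timeout elapses.
// Returns the ID, or null on timeout. It never stops the launched process.
async function pollForWindow(
  findWindow: () => Promise<string | null>,
  timeoutMs: number = 5000,
  intervalMs: number = 100, // assumption: not specified by the API docs
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<string | null> {
  let waited = 0;
  while (true) {
    const id = await findWindow();
    if (id !== null) return id;
    if (waited >= timeoutMs) return null;
    await sleep(intervalMs);
    waited += intervalMs;
  }
}
```

The same loop is useful on the client when a window is expected to appear later than the built-in 5 second limit, for example by passing a `findWindow` callback that searches `sdk.listDesktopWindows()` for a matching title and a larger `timeoutMs`.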
## Recording
Record the desktop to MP4.
<CodeGroup>
```ts TypeScript
const recording = await sdk.startDesktopRecording({ fps: 30 });
console.log(recording.id);
// ... do things ...
const stopped = await sdk.stopDesktopRecording();
// List all recordings
const { recordings } = await sdk.listDesktopRecordings();
// Download
const mp4 = await sdk.downloadDesktopRecording(recording.id);
// Clean up
await sdk.deleteDesktopRecording(recording.id);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/start" \
-H "Content-Type: application/json" \
-d '{"fps":30}'
curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/stop"
curl "http://127.0.0.1:2468/v1/desktop/recordings"
curl "http://127.0.0.1:2468/v1/desktop/recordings/rec_1/download" --output recording.mp4
curl -X DELETE "http://127.0.0.1:2468/v1/desktop/recordings/rec_1"
```
</CodeGroup>
## Desktop processes
The desktop runtime manages several background processes (Xvfb, openbox, neko, ffmpeg). These are all registered with the general [Process API](/processes) under the `desktop` owner, so you can inspect logs, check status, and troubleshoot using the same tools you use for any other managed process.
<CodeGroup>
```ts TypeScript
// List all processes, including desktop-owned ones
const { processes } = await sdk.listProcesses();
const desktopProcs = processes.filter((p) => p.owner === "desktop");
for (const p of desktopProcs) {
console.log(p.id, p.command, p.status);
}
// Read logs from a specific desktop process
const logs = await sdk.getProcessLogs(desktopProcs[0].id, { tail: 50 });
for (const entry of logs.entries) {
console.log(entry.stream, atob(entry.data));
}
```
```bash cURL
# List all processes (desktop processes have owner: "desktop")
curl "http://127.0.0.1:2468/v1/processes"
# Get logs from a specific desktop process
curl "http://127.0.0.1:2468/v1/processes/proc_1/logs?tail=50"
```
</CodeGroup>
The desktop status endpoint also includes a summary of running processes:
<CodeGroup>
```ts TypeScript
const status = await sdk.getDesktopStatus();
for (const proc of status.processes) {
console.log(proc.name, proc.pid, proc.running);
}
```
```bash cURL
curl "http://127.0.0.1:2468/v1/desktop/status"
# Response includes: processes: [{ name: "Xvfb", pid: 123, running: true }, ...]
```
</CodeGroup>
| Process | Role | Restart policy |
|---------|------|---------------|
| Xvfb | Virtual X11 framebuffer | Auto-restart while desktop is active |
| openbox | Window manager | Auto-restart while desktop is active |
| neko | WebRTC streaming server (started by `startDesktopStream`) | No auto-restart |
| ffmpeg | Screen recorder (started by `startDesktopRecording`) | No auto-restart |
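The restart policies in the table determine how a stopped process should be interpreted. A minimal sketch, assuming the `name` and `running` fields shown in the status example above: `summarizeStopped` is a hypothetical helper, and a stopped `neko` or `ffmpeg` may simply mean that streaming or recording was never started.

```ts
// Shape of one entry in status.processes, based on the status example above.
type DesktopProcess = { name: string; pid: number | null; running: boolean };

// Processes that the runtime restarts on its own while the desktop is active.
const AUTO_RESTARTED = ["Xvfb", "openbox"];

function summarizeStopped(processes: DesktopProcess[]): {
  autoRestarting: string[];
  manual: string[];
} {
  const stopped = processes.filter((p) => !p.running).map((p) => p.name);
  return {
    // Expected to come back without intervention.
    autoRestarting: stopped.filter((n) => AUTO_RESTARTED.indexOf(n) !== -1),
    // Need an explicit startDesktopStream() or startDesktopRecording() call.
    manual: stopped.filter((n) => AUTO_RESTARTED.indexOf(n) === -1),
  };
}
```

Passing `status.processes` from `sdk.getDesktopStatus()` gives a quick view of which components need a manual restart.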
## Live streaming
Start a WebRTC stream for real-time desktop viewing in a browser.
<CodeGroup>
```ts TypeScript
await sdk.startDesktopStream();
// Check stream status
const status = await sdk.getDesktopStreamStatus();
console.log(status.active); // true
console.log(status.processId); // "proc_5"
// Connect via the React DesktopViewer component or
// use the WebSocket signaling endpoint directly
// at ws://127.0.0.1:2468/v1/desktop/stream/signaling
await sdk.stopDesktopStream();
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/start"
# Check stream status
curl "http://127.0.0.1:2468/v1/desktop/stream/status"
# Connect to ws://127.0.0.1:2468/v1/desktop/stream/signaling for WebRTC signaling
curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/stop"
```
</CodeGroup>
For a drop-in React component, see [React Components](/react-components).
## API reference
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/v1/desktop/start` | Start the desktop runtime |
| `POST` | `/v1/desktop/stop` | Stop the desktop runtime |
| `GET` | `/v1/desktop/status` | Get desktop runtime status |
| `GET` | `/v1/desktop/screenshot` | Capture full desktop screenshot |
| `GET` | `/v1/desktop/screenshot/region` | Capture a region screenshot |
| `GET` | `/v1/desktop/mouse/position` | Get current mouse position |
| `POST` | `/v1/desktop/mouse/move` | Move the mouse |
| `POST` | `/v1/desktop/mouse/click` | Click the mouse |
| `POST` | `/v1/desktop/mouse/down` | Press mouse button down |
| `POST` | `/v1/desktop/mouse/up` | Release mouse button |
| `POST` | `/v1/desktop/mouse/drag` | Drag from one point to another |
| `POST` | `/v1/desktop/mouse/scroll` | Scroll at a position |
| `POST` | `/v1/desktop/keyboard/type` | Type text |
| `POST` | `/v1/desktop/keyboard/press` | Press a key with optional modifiers |
| `POST` | `/v1/desktop/keyboard/down` | Press a key down (hold) |
| `POST` | `/v1/desktop/keyboard/up` | Release a key |
| `GET` | `/v1/desktop/display/info` | Get display info |
| `GET` | `/v1/desktop/windows` | List visible windows |
| `GET` | `/v1/desktop/windows/focused` | Get focused window info |
| `POST` | `/v1/desktop/windows/{id}/focus` | Focus a window |
| `POST` | `/v1/desktop/windows/{id}/move` | Move a window |
| `POST` | `/v1/desktop/windows/{id}/resize` | Resize a window |
| `GET` | `/v1/desktop/clipboard` | Read clipboard contents |
| `POST` | `/v1/desktop/clipboard` | Write to clipboard |
| `POST` | `/v1/desktop/launch` | Launch an application |
| `POST` | `/v1/desktop/open` | Open a file or URL |
| `POST` | `/v1/desktop/recording/start` | Start recording |
| `POST` | `/v1/desktop/recording/stop` | Stop recording |
| `GET` | `/v1/desktop/recordings` | List recordings |
| `GET` | `/v1/desktop/recordings/{id}` | Get recording metadata |
| `GET` | `/v1/desktop/recordings/{id}/download` | Download recording |
| `DELETE` | `/v1/desktop/recordings/{id}` | Delete recording |
| `POST` | `/v1/desktop/stream/start` | Start WebRTC streaming |
| `POST` | `/v1/desktop/stream/stop` | Stop WebRTC streaming |
| `GET` | `/v1/desktop/stream/status` | Get stream status |
| `GET` | `/v1/desktop/stream/signaling` | WebSocket for WebRTC signaling |
### TypeScript SDK methods
| Method | Returns | Description |
|--------|---------|-------------|
| `startDesktop(request?)` | `DesktopStatusResponse` | Start the desktop |
| `stopDesktop()` | `DesktopStatusResponse` | Stop the desktop |
| `getDesktopStatus()` | `DesktopStatusResponse` | Get desktop status |
| `takeDesktopScreenshot(query?)` | `Uint8Array` | Capture screenshot |
| `takeDesktopRegionScreenshot(query)` | `Uint8Array` | Capture region screenshot |
| `getDesktopMousePosition()` | `DesktopMousePositionResponse` | Get mouse position |
| `moveDesktopMouse(request)` | `DesktopMousePositionResponse` | Move mouse |
| `clickDesktop(request)` | `DesktopMousePositionResponse` | Click mouse |
| `mouseDownDesktop(request)` | `DesktopMousePositionResponse` | Mouse button down |
| `mouseUpDesktop(request)` | `DesktopMousePositionResponse` | Mouse button up |
| `dragDesktopMouse(request)` | `DesktopMousePositionResponse` | Drag mouse |
| `scrollDesktop(request)` | `DesktopMousePositionResponse` | Scroll |
| `typeDesktopText(request)` | `DesktopActionResponse` | Type text |
| `pressDesktopKey(request)` | `DesktopActionResponse` | Press key |
| `keyDownDesktop(request)` | `DesktopActionResponse` | Key down |
| `keyUpDesktop(request)` | `DesktopActionResponse` | Key up |
| `getDesktopDisplayInfo()` | `DesktopDisplayInfoResponse` | Get display info |
| `listDesktopWindows()` | `DesktopWindowListResponse` | List windows |
| `getDesktopFocusedWindow()` | `DesktopWindowInfo` | Get focused window |
| `focusDesktopWindow(id)` | `DesktopWindowInfo` | Focus a window |
| `moveDesktopWindow(id, request)` | `DesktopWindowInfo` | Move a window |
| `resizeDesktopWindow(id, request)` | `DesktopWindowInfo` | Resize a window |
| `getDesktopClipboard(query?)` | `DesktopClipboardResponse` | Read clipboard |
| `setDesktopClipboard(request)` | `DesktopActionResponse` | Write clipboard |
| `launchDesktopApp(request)` | `DesktopLaunchResponse` | Launch an app |
| `openDesktopTarget(request)` | `DesktopOpenResponse` | Open file/URL |
| `startDesktopRecording(request?)` | `DesktopRecordingInfo` | Start recording |
| `stopDesktopRecording()` | `DesktopRecordingInfo` | Stop recording |
| `listDesktopRecordings()` | `DesktopRecordingListResponse` | List recordings |
| `getDesktopRecording(id)` | `DesktopRecordingInfo` | Get recording |
| `downloadDesktopRecording(id)` | `Uint8Array` | Download recording |
| `deleteDesktopRecording(id)` | `void` | Delete recording |
| `startDesktopStream()` | `DesktopStreamStatusResponse` | Start streaming |
| `stopDesktopStream()` | `DesktopStreamStatusResponse` | Stop streaming |
| `getDesktopStreamStatus()` | `DesktopStreamStatusResponse` | Stream status |
## Customizing the desktop environment
The desktop runs inside the sandbox filesystem, so you can customize it using the [File System](/file-system) API before or after starting the desktop. The desktop HOME directory is located at `~/.local/state/sandbox-agent/desktop/home` (or `$XDG_STATE_HOME/sandbox-agent/desktop/home` if `XDG_STATE_HOME` is set).
All configuration files below are written to paths relative to this HOME directory.
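The HOME location rule above can be expressed as a small path helper. This sketch assumes the environment is passed in as a plain object and that an empty `XDG_STATE_HOME` counts as unset; `desktopHome` and `desktopConfigPath` are hypothetical helpers, not SDK functions.

```ts
type StateEnv = { HOME?: string; XDG_STATE_HOME?: string };

// Resolve the desktop HOME directory following the rule described above.
function desktopHome(env: StateEnv): string {
  const home = env.HOME !== undefined ? env.HOME : "~";
  const stateRoot =
    env.XDG_STATE_HOME !== undefined && env.XDG_STATE_HOME !== ""
      ? env.XDG_STATE_HOME
      : home + "/.local/state";
  // Drop trailing slashes so the joined path has no doubled separators.
  return stateRoot.replace(/\/+$/, "") + "/sandbox-agent/desktop/home";
}

// Join a HOME-relative config path (for example ".config/openbox/rc.xml").
function desktopConfigPath(env: StateEnv, relative: string): string {
  return desktopHome(env) + "/" + relative.replace(/^\/+/, "");
}
```

The examples in the following sections hard-code the default location; a helper like this keeps them correct when `XDG_STATE_HOME` is set in the sandbox.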
### Window manager (openbox)
The desktop uses [openbox](http://openbox.org/) as its window manager. You can customize its behavior, theme, and keyboard shortcuts by writing an `rc.xml` config file.
<CodeGroup>
```ts TypeScript
const openboxConfig = `<?xml version="1.0" encoding="UTF-8"?>
<openbox_config xmlns="http://openbox.org/3.4/rc">
<theme>
<name>Clearlooks</name>
<titleLayout>NLIMC</titleLayout>
<font place="ActiveWindow"><name>DejaVu Sans</name><size>10</size></font>
</theme>
<desktops><number>1</number></desktops>
<keyboard>
<keybind key="A-F4"><action name="Close"/></keybind>
<keybind key="A-Tab"><action name="NextWindow"/></keybind>
</keyboard>
</openbox_config>`;
await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" },
openboxConfig,
);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox"
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" \
-H "Content-Type: application/octet-stream" \
--data-binary @rc.xml
```
</CodeGroup>
### Autostart programs
Openbox runs the `~/.config/openbox/autostart` script on startup. Use it to launch applications, set the background, or configure the environment.
<CodeGroup>
```ts TypeScript
const autostart = `#!/bin/sh
# Set a solid background color
xsetroot -solid "#1e1e2e" &
# Launch a terminal
xterm -geometry 120x40+50+50 &
# Launch a browser
firefox --no-remote &
`;
await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" },
autostart,
);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox"
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \
-H "Content-Type: application/octet-stream" \
--data-binary @autostart.sh
```
</CodeGroup>
<Note>
The autostart script runs when openbox starts, which happens during `startDesktop()`. Write the autostart file before calling `startDesktop()` for it to take effect.
</Note>
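The ordering requirement in the note can be captured in one function that writes the autostart file and only then starts the desktop. This is a sketch under stated assumptions: `DesktopSdk` is a hypothetical structural type that covers only the three calls used here, with signatures copied from the examples on this page, and `startDesktopWithAutostart` is not an SDK method.

```ts
// Minimal structural type for the SDK calls used below (signatures follow the examples above).
type DesktopSdk = {
  mkdirFs(query: { path: string }): Promise<unknown>;
  writeFsFile(query: { path: string }, body: string): Promise<unknown>;
  startDesktop(request?: { width?: number; height?: number }): Promise<unknown>;
};

const OPENBOX_DIR = "~/.local/state/sandbox-agent/desktop/home/.config/openbox";

// Write the autostart script first, then start the desktop so openbox picks it up.
async function startDesktopWithAutostart(sdk: DesktopSdk, commands: string[]): Promise<void> {
  // Each command is backgrounded so one long-running app does not block the rest.
  const script = "#!/bin/sh\n" + commands.map((c) => c + " &\n").join("");
  await sdk.mkdirFs({ path: OPENBOX_DIR });
  await sdk.writeFsFile({ path: OPENBOX_DIR + "/autostart" }, script);
  await sdk.startDesktop();
}
```

Keeping the three calls in one function makes it harder to start the desktop before the script exists.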
### Background
There is no wallpaper set by default (the background is the X root window default). You can set it using `xsetroot` in the autostart script (as shown above), or use `feh` if you need an image:
<CodeGroup>
```ts TypeScript
// Upload a wallpaper image
import fs from "node:fs";
const wallpaper = await fs.promises.readFile("./wallpaper.png");
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/wallpaper.png" },
wallpaper,
);
// Set the autostart to apply it
const autostart = `#!/bin/sh
feh --bg-fill ~/wallpaper.png &
`;
await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" });
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" },
autostart,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/wallpaper.png" \
-H "Content-Type: application/octet-stream" \
--data-binary @wallpaper.png
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \
-H "Content-Type: application/octet-stream" \
--data-binary @autostart.sh
```
</CodeGroup>
<Note>
`feh` is not installed by default. Install it via the [Process API](/processes) before starting the desktop: `await sdk.runProcess({ command: "apt-get", args: ["install", "-y", "feh"] })`.
</Note>
### Fonts
Only `fonts-dejavu-core` is installed by default. To add more fonts, install them with your system package manager or copy font files into the sandbox:
<CodeGroup>
```ts TypeScript
// Install a font package
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "fonts-noto", "fonts-liberation"],
});
// Or copy a custom font file
import fs from "node:fs";
const font = await fs.promises.readFile("./CustomFont.ttf");
await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts" });
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" },
font,
);
// Rebuild the font cache
await sdk.runProcess({ command: "fc-cache", args: ["-fv"] });
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","fonts-noto","fonts-liberation"]}'
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts"
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" \
-H "Content-Type: application/octet-stream" \
--data-binary @CustomFont.ttf
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"fc-cache","args":["-fv"]}'
```
</CodeGroup>
### Cursor theme
<CodeGroup>
```ts TypeScript
await sdk.runProcess({
command: "apt-get",
args: ["install", "-y", "dmz-cursor-theme"],
});
const xresources = `Xcursor.theme: DMZ-White\nXcursor.size: 24\n`;
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.Xresources" },
xresources,
);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"apt-get","args":["install","-y","dmz-cursor-theme"]}'
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.Xresources" \
-H "Content-Type: application/octet-stream" \
--data-binary $'Xcursor.theme: DMZ-White\nXcursor.size: 24\n'
```
</CodeGroup>
<Note>
Run `xrdb -merge ~/.Xresources` (via the autostart or process API) after writing the file for changes to take effect.
</Note>
### Shell and terminal
No terminal emulator or shell is launched by default. Add one to the openbox autostart:
```sh
# In ~/.config/openbox/autostart
xterm -geometry 120x40+50+50 &
```
To use a different shell, set the `SHELL` environment variable in your Dockerfile or install your preferred shell and configure the terminal to use it.
### GTK theme
Applications using GTK will pick up settings from `~/.config/gtk-3.0/settings.ini`:
<CodeGroup>
```ts TypeScript
const gtkSettings = `[Settings]
gtk-theme-name=Adwaita
gtk-icon-theme-name=Adwaita
gtk-font-name=DejaVu Sans 10
gtk-cursor-theme-name=DMZ-White
gtk-cursor-theme-size=24
`;
await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0" });
await sdk.writeFsFile(
{ path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" },
gtkSettings,
);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0"
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" \
-H "Content-Type: application/octet-stream" \
--data-binary @settings.ini
```
</CodeGroup>
### Summary of configuration paths
All paths are relative to the desktop HOME directory (`~/.local/state/sandbox-agent/desktop/home`).
| What | Path | Notes |
|------|------|-------|
| Openbox config | `.config/openbox/rc.xml` | Window manager theme, keybindings, behavior |
| Autostart | `.config/openbox/autostart` | Shell script run on desktop start |
| Custom fonts | `.local/share/fonts/` | TTF/OTF files, run `fc-cache -fv` after |
| Cursor theme | `.Xresources` | Requires `xrdb -merge` to apply |
| GTK 3 settings | `.config/gtk-3.0/settings.ini` | Theme, icons, fonts for GTK apps |
| Wallpaper | Any path, referenced from autostart | Requires `feh` or similar tool |


@ -1,91 +0,0 @@
# Universal ↔ Agent Term Mapping
Source of truth: generated agent schemas in `resources/agent-schemas/artifacts/json-schema/`.
Identifiers
+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
| Universal term       | Claude                 | Codex (app-server)                       | OpenCode                    | Amp                    |
+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
| session_id           | n/a (daemon-only)      | n/a (daemon-only)                        | n/a (daemon-only)           | n/a (daemon-only)      |
| native_session_id    | none                   | threadId                                 | sessionID                   | none                   |
| item_id              | synthetic              | ThreadItem.id                            | Message.id                  | StreamJSONMessage.id   |
| native_item_id       | none                   | ThreadItem.id                            | Message.id                  | StreamJSONMessage.id   |
+----------------------+------------------------+------------------------------------------+-----------------------------+------------------------+
Notes:
- When a provider does not supply IDs (Claude), we synthesize item_id values and keep native_item_id null.
- native_session_id is the only provider session identifier. It is intentionally used for thread/session/run ids.
- native_item_id preserves the agent-native item/message id when present.
- source indicates who emitted the event: agent (native) or daemon (synthetic).
- raw is always present on events. When clients do not opt in to raw payloads, raw is null. Clients opt in via `include_raw=true` on events endpoints (HTTP + SSE).
- If parsing fails, emit agent.unparsed (source=daemon, synthetic=true). Tests must assert zero unparsed events.
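The ID rules in the notes and table above can be sketched as a small assigner: reuse the agent-native ID when one exists, otherwise synthesize an `item_id` and leave `native_item_id` null. The `makeItemIdAssigner` name and the `<session>_item_<n>` format are hypothetical; the mapping does not specify how synthetic IDs are formed.

```ts
type ItemIds = { item_id: string; native_item_id: string | null };

// Returns a function that assigns IDs for one session.
function makeItemIdAssigner(sessionId: string): (nativeId: string | null | undefined) => ItemIds {
  let counter = 0;
  return (nativeId) => {
    if (nativeId) {
      // Provider supplied an ID (Codex, OpenCode, Amp): keep it in both fields.
      return { item_id: nativeId, native_item_id: nativeId };
    }
    // No provider ID (Claude): synthesize one; the format here is illustrative only.
    counter += 1;
    return { item_id: sessionId + "_item_" + counter, native_item_id: null };
  };
}
```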
Events / Message Flow
+----------------------+-------------------------------------+--------------------------------------------------+--------------------------------------------------------+---------------------------+
| Universal term       | Claude                              | Codex (app-server)                               | OpenCode                                               | Amp                       |
+----------------------+-------------------------------------+--------------------------------------------------+--------------------------------------------------------+---------------------------+
| session.started      | none                                | method=thread/started                            | type=session.created                                   | none                      |
| session.ended        | SDKMessage.type=result              | no explicit session end (turn/completed)         | no explicit session end (session.deleted)              | type=done                 |
| turn.started         | synthetic on message send           | method=turn/started                              | type=session.status (busy)                             | synthetic on message send |
| turn.ended           | synthetic after result              | method=turn/completed                            | type=session.idle                                      | synthetic on done         |
| message (user)       | SDKMessage.type=user                | item/completed (ThreadItem.type=userMessage)     | message.updated (Message.role=user)                    | type=message              |
| message (assistant)  | SDKMessage.type=assistant           | item/completed (ThreadItem.type=agentMessage)    | message.updated (Message.role=assistant)               | type=message              |
| message.delta        | stream_event (partial) or synthetic | method=item/agentMessage/delta                   | type=message.part.updated (text-part delta)            | synthetic                 |
| tool call            | type=tool_use                       | method=item/mcpToolCall/progress                 | message.part.updated (part.type=tool)                  | type=tool_call            |
| tool result          | user.message.content.tool_result    | item/completed (tool result ThreadItem variants) | message.part.updated (part.type=tool, state=completed) | type=tool_result          |
| permission.requested | control_request.can_use_tool        | none                                             | type=permission.asked                                  | none                      |
| permission.resolved  | daemon reply to can_use_tool        | none                                             | type=permission.replied                                | none                      |
| question.requested   | tool_use (AskUserQuestion)          | experimental request_user_input (payload)        | type=question.asked                                    | none                      |
| question.resolved    | tool_result (AskUserQuestion)       | experimental request_user_input (payload)        | type=question.replied / question.rejected              | none                      |
| error                | SDKResultMessage.error              | method=error                                     | type=session.error (or message error)                  | type=error                |
+----------------------+-------------------------------------+--------------------------------------------------+--------------------------------------------------------+---------------------------+
Permission status normalization:
- `permission.requested` uses `status=requested`.
- `permission.resolved` uses `status=accept`, `accept_for_session`, or `reject`.
Synthetics
+---------------------------+-----------------------------------------+-----------------------------+------------------------------------------------------------+
| Synthetic element         | When it appears                         | Stored as                   | Notes                                                      |
+---------------------------+-----------------------------------------+-----------------------------+------------------------------------------------------------+
| session.started           | When agent emits no explicit start      | session.started event       | Mark source=daemon                                         |
| session.ended             | When agent emits no explicit end        | session.ended event         | Mark source=daemon; reason may be inferred                 |
| turn.started              | When agent emits no explicit turn start | turn.started event          | Mark source=daemon                                         |
| turn.ended                | When agent emits no explicit turn end   | turn.ended event            | Mark source=daemon                                         |
| item_id (Claude)          | Claude provides no item IDs             | item_id                     | Maintain provider_item_id map when possible                |
| user message (Claude)     | Claude emits only assistant output      | item.completed              | Mark source=daemon; preserve raw input in event metadata   |
| question events (Claude)  | AskUserQuestion tool usage              | question.requested/resolved | Derived from tool_use blocks (source=agent)                |
| native_session_id (Codex) | Codex uses threadId                     | native_session_id           | Intentionally merged threadId into native_session_id       |
| message.delta (Claude)    | No native deltas emitted                | item.delta                  | Synthetic delta with full message content; source=daemon   |
| message.delta (Amp)       | No native deltas                        | item.delta                  | Synthetic delta with full message content; source=daemon   |
| message.delta (OpenCode)  | text part delta before message          | item.delta                  | If part arrives first, create item.started stub then delta |
+---------------------------+-----------------------------------------+-----------------------------+------------------------------------------------------------+
Delta handling
- Codex emits agent message and other deltas (e.g., item/agentMessage/delta).
- OpenCode emits part deltas via message.part.updated with a delta string.
- Claude can emit stream_event deltas when partial streaming is enabled; Amp does not emit deltas.
Policy:
- Emit item.delta for streamable text content across providers.
- For providers without native deltas, emit a single synthetic delta containing the full content prior to item.completed.
- For Claude when partial streaming is enabled, forward native deltas and skip the synthetic full-content delta.
- For providers with native deltas, forward as-is; also emit item.completed when final content is known.
- For OpenCode reasoning part deltas, emit typed reasoning item updates (item.started/item.completed with content.type=reasoning) instead of item.delta.
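The policy above can be sketched roughly as follows. This is an illustrative sketch only: the `SketchEvent` shape, the `finalizeMessage` helper, and the `sawNativeDelta` flag are hypothetical names introduced here, not the daemon's actual types or functions.

```ts
// Hypothetical event shape for illustration; field names are assumptions.
type SketchEvent =
  | { type: "item.delta"; itemId: string; delta: string; source: "agent" | "daemon" }
  | { type: "item.completed"; itemId: string; content: string };

// Build the closing events for a message once its final content is known.
// If no native delta was forwarded, one synthetic full-content delta
// (source=daemon) is emitted before item.completed.
function finalizeMessage(itemId: string, content: string, sawNativeDelta: boolean): SketchEvent[] {
  const events: SketchEvent[] = [];
  if (!sawNativeDelta) {
    events.push({ type: "item.delta", itemId, delta: content, source: "daemon" });
  }
  events.push({ type: "item.completed", itemId, content });
  return events;
}
```

Under this sketch, a provider that streamed native deltas only gets the closing `item.completed`, while a provider without native deltas gets a synthetic delta followed by `item.completed`.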
Message normalization notes
- user vs assistant: normalized via role in the universal item; provider role fields or item types determine role.
- file artifacts: always represented as content parts (type=file_ref) inside message/tool_result items, not a separate item kind.
- reasoning: represented as content parts (type=reasoning) inside message items, with visibility when available.
- subagents: OpenCode subtask parts and Claude Task tool usage are currently normalized into standard message/tool flow (no dedicated subagent fields).
- OpenCode unrolling: message.updated creates/updates the parent message item; tool-related parts emit separate tool item events (item.started/item.completed) with parent_id pointing to the message item.
- If a message.part.updated arrives before message.updated, we create a stub item.started (source=daemon) so deltas have a parent.
- Tool calls/results are always emitted as separate tool items to keep behavior consistent across agents.


@ -2,7 +2,6 @@
title: "CORS Configuration"
description: "Configure CORS for browser-based applications."
sidebarTitle: "CORS"
icon: "globe"
---
When calling the Sandbox Agent server from a browser, CORS (Cross-Origin Resource Sharing) controls which origins can make requests.
@ -13,7 +12,6 @@ By default, no CORS origins are allowed. You must explicitly specify origins for
```bash
sandbox-agent server \
--token "$SANDBOX_TOKEN" \
--cors-allow-origin "http://localhost:5173"
```
@ -36,7 +34,6 @@ Specify the flag multiple times to allow multiple origins:
```bash
sandbox-agent server \
--token "$SANDBOX_TOKEN" \
--cors-allow-origin "http://localhost:5173" \
--cors-allow-origin "http://localhost:3000"
```
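When the server is launched from a script, the repeated flag can be generated from a list of origins. A minimal sketch, assuming a hypothetical `corsOriginArgs` helper (not part of the SDK or CLI); the `--token` flag is left out for brevity:

```ts
// Hypothetical helper: expands a list of origins into repeated
// --cors-allow-origin flags, one flag per origin.
function corsOriginArgs(origins: string[]): string[] {
  const args: string[] = [];
  for (const origin of origins) {
    args.push("--cors-allow-origin", origin);
  }
  return args;
}

// Arguments for `sandbox-agent server` allowing two local dev origins.
const serverArgs = [
  "server",
  ...corsOriginArgs(["http://localhost:5173", "http://localhost:3000"]),
];
console.log(serverArgs.join(" "));
```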
@ -47,7 +44,6 @@ By default, all methods and headers are allowed. To restrict them:
```bash
sandbox-agent server \
--token "$SANDBOX_TOKEN" \
--cors-allow-origin "https://your-app.com" \
--cors-allow-method "GET" \
--cors-allow-method "POST" \


@ -1,144 +0,0 @@
---
title: "Credentials"
description: "How sandbox-agent discovers and uses provider credentials."
icon: "key"
---
Sandbox-agent automatically discovers API credentials from environment variables and agent config files. Credentials are used to authenticate with AI providers (Anthropic, OpenAI) when spawning agents.
## Credential sources
Credentials are extracted in priority order. The first valid credential found for each provider is used.
### Environment variables (highest priority)
**API keys** (checked first):
| Variable | Provider |
|----------|----------|
| `ANTHROPIC_API_KEY` | Anthropic |
| `CLAUDE_API_KEY` | Anthropic (fallback) |
| `OPENAI_API_KEY` | OpenAI |
| `CODEX_API_KEY` | OpenAI (fallback) |
**OAuth tokens** (checked if no API key found):
| Variable | Provider |
|----------|----------|
| `CLAUDE_CODE_OAUTH_TOKEN` | Anthropic (OAuth) |
| `ANTHROPIC_AUTH_TOKEN` | Anthropic (OAuth fallback) |
OAuth tokens from environment variables are only used when `include_oauth` is enabled (the default).
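The environment-variable order described above can be sketched like this for the Anthropic provider. The variable names and their order come from the tables; the `resolveAnthropicFromEnv` helper, its return shape, and the `includeOauth` parameter name are hypothetical and do not reflect sandbox-agent's actual implementation.

```ts
type EnvMap = Record<string, string | undefined>;

// Names and order taken from the tables above.
const ANTHROPIC_API_KEY_VARS = ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"];
const ANTHROPIC_OAUTH_VARS = ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_AUTH_TOKEN"];

// Return the first non-empty value among the given variable names.
function firstSet(env: EnvMap, names: string[]): string | undefined {
  for (const name of names) {
    const value = env[name];
    if (value) return value;
  }
  return undefined;
}

// Hypothetical helper: API keys win over OAuth tokens; OAuth tokens are
// only considered when includeOauth is true (assumed default).
function resolveAnthropicFromEnv(
  env: EnvMap,
  includeOauth = true,
): { kind: "api_key" | "oauth"; value: string } | undefined {
  const apiKey = firstSet(env, ANTHROPIC_API_KEY_VARS);
  if (apiKey) return { kind: "api_key", value: apiKey };
  if (includeOauth) {
    const token = firstSet(env, ANTHROPIC_OAUTH_VARS);
    if (token) return { kind: "oauth", value: token };
  }
  // Nothing in the environment: a caller would fall through to config files.
  return undefined;
}
```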
### Agent config files
If no environment variable is set, sandbox-agent checks agent-specific config files:
| Agent | Config path | Provider |
|-------|-------------|----------|
| Amp | `~/.amp/config.json` | Anthropic |
| Claude Code | `~/.claude.json`, `~/.claude/.credentials.json` | Anthropic |
| Codex | `~/.codex/auth.json` | OpenAI |
| OpenCode | `~/.local/share/opencode/auth.json` | Both |
OAuth tokens are supported for Claude Code, Codex, and OpenCode. Expired tokens are automatically skipped.
## Provider requirements by agent
| Agent | Required provider |
|-------|-------------------|
| Claude Code | Anthropic |
| Amp | Anthropic |
| Codex | OpenAI |
| OpenCode | Anthropic or OpenAI |
| Mock | None |
## Error handling behavior
Sandbox-agent uses a **best-effort, fail-forward** approach to credentials:
### Extraction failures are silent
If a config file is missing, unreadable, or malformed, extraction continues to the next source. No errors are thrown. Missing credentials simply mean the provider is marked as unavailable.
```
~/.claude.json missing → try ~/.claude/.credentials.json
~/.claude/.credentials.json missing → try OpenCode config
All sources exhausted → anthropic = None (not an error)
```
### Agents spawn without credential validation
When you send a message to a session, sandbox-agent does **not** pre-validate credentials. The agent process is spawned with whatever credentials were found (or none), and the agent's native error surfaces if authentication fails.
This design:
- Lets you test agent error handling behavior
- Avoids duplicating provider-specific auth validation
- Ensures sandbox-agent faithfully proxies agent behavior
For example, sending a message to Claude Code without Anthropic credentials will spawn the agent, which will then emit its own "ANTHROPIC_API_KEY not set" error through the event stream.
## Checking credential status
### API endpoint
The `GET /v1/agents` endpoint includes a `credentialsAvailable` field for each agent:
```json
{
"agents": [
{
"id": "claude",
"installed": true,
"credentialsAvailable": true,
...
},
{
"id": "codex",
"installed": true,
"credentialsAvailable": false,
...
}
]
}
```
### TypeScript SDK
```typescript
const { agents } = await client.listAgents();
for (const agent of agents) {
console.log(`${agent.id}: ${agent.credentialsAvailable ? 'authenticated' : 'no credentials'}`);
}
```
### OpenCode compatibility
The `/opencode/provider` endpoint returns a `connected` array listing providers with valid credentials:
```json
{
"all": [...],
"connected": ["claude", "mock"]
}
```
## Passing credentials explicitly
You can override auto-discovered credentials by setting environment variables before starting sandbox-agent:
```bash
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
sandbox-agent daemon start
```
Or when using the SDK in embedded mode:
```typescript
const client = await SandboxAgentClient.spawn({
env: {
ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
},
});
```


@ -5,241 +5,155 @@ sidebarTitle: "Custom Tools"
icon: "wrench"
---
There are two ways to give agents custom tools that run inside the sandbox:
There are two common patterns for sandbox-local custom tooling:
| | MCP Server | Skill |
|---|---|---|
| **How it works** | Sandbox Agent spawns your MCP server process and routes tool calls to it via stdio | A markdown file that instructs the agent to run your script with `node` (or any command) |
| **Tool discovery** | Agent sees tools automatically via MCP protocol | Agent reads instructions from the skill file |
| **Best for** | Structured tools with typed inputs/outputs | Lightweight scripts with natural-language instructions |
| **Requires** | `@modelcontextprotocol/sdk` dependency | Just a markdown file and a script |
| **How it works** | Agent connects to an MCP server (`mcpServers`) | Agent follows `SKILL.md` instructions and runs scripts |
| **Best for** | Typed tool calls and structured protocols | Lightweight task-specific guidance |
| **Requires** | MCP server process (stdio/http/sse) | Script + `SKILL.md` |
Both approaches execute code inside the sandbox, so your tools have full access to the sandbox filesystem, network, and installed system tools.
## Option A: Tools via MCP
## Option A: MCP server (stdio)
<Steps>
<Step title="Write your MCP server">
Create an MCP server that exposes tools using `@modelcontextprotocol/sdk` with `StdioServerTransport`. This server will run inside the sandbox.
<Step title="Write and bundle your MCP server">
```ts src/mcp-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
```ts src/mcp-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
const server = new McpServer({
name: "rand",
version: "1.0.0",
});
const server = new McpServer({ name: "rand", version: "1.0.0" });
server.tool(
"random_number",
"Generate a random integer between min and max (inclusive)",
{
min: z.number().describe("Minimum value"),
max: z.number().describe("Maximum value"),
},
async ({ min, max }) => ({
content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
}),
);
server.tool(
"random_number",
"Generate a random integer between min and max",
{
min: z.number(),
max: z.number(),
},
async ({ min, max }) => ({
content: [{ type: "text", text: String(Math.floor(Math.random() * (max - min + 1)) + min) }],
}),
);
const transport = new StdioServerTransport();
await server.connect(transport);
```
await server.connect(new StdioServerTransport());
```
This is a simple example. Your MCP server runs inside the sandbox, so you can execute any code you'd like: query databases, call internal APIs, run shell commands, or interact with any service available in the container.
```bash
npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/mcp-server.cjs
```
</Step>
<Step title="Package the MCP server">
Bundle into a single JS file so it can be uploaded and executed without a `node_modules` folder.
<Step title="Upload it into the sandbox">
```bash
npx esbuild src/mcp-server.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/mcp-server.cjs
```
```ts
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
This creates `dist/mcp-server.cjs` ready to upload.
const sdk = await SandboxAgent.connect({ baseUrl: "http://127.0.0.1:2468" });
const content = await fs.promises.readFile("./dist/mcp-server.cjs");
await sdk.writeFsFile({ path: "/opt/mcp/custom-tools/mcp-server.cjs" }, content);
```
```bash
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
--data-binary @./dist/mcp-server.cjs
```
</Step>
<Step title="Create sandbox and upload MCP server">
Start your sandbox, then write the bundled file into it.
<Step title="Register MCP config and create a session">
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
```ts
await sdk.setMcpConfig(
{
directory: "/workspace",
mcpName: "customTools",
},
{
type: "local",
command: "node",
args: ["/opt/mcp/custom-tools/mcp-server.cjs"],
},
);
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const session = await sdk.createSession({
agent: "claude",
cwd: "/workspace",
});
const content = await fs.promises.readFile("./dist/mcp-server.cjs");
await client.writeFsFile(
{ path: "/opt/mcp/custom-tools/mcp-server.cjs" },
content,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/mcp/custom-tools/mcp-server.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/mcp-server.cjs
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point an MCP server config at the bundled JS file. When the session starts, Sandbox Agent spawns the MCP server process and routes tool calls to it.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
mcp: {
customTools: {
type: "local",
command: ["node", "/opt/mcp/custom-tools/mcp-server.cjs"],
},
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"customTools": {
"type": "local",
"command": ["node", "/opt/mcp/custom-tools/mcp-server.cjs"]
}
}
}'
```
</CodeGroup>
await session.prompt([
{ type: "text", text: "Use the random_number tool with min=1 and max=10." },
]);
```
</Step>
</Steps>
## Option B: Tools via Skills
Skills are markdown files that instruct the agent how to use a script. Upload the script and a skill file, then point the session at the skill directory.
## Option B: Skills
<Steps>
<Step title="Write your script">
Write a script that the agent will execute. This runs inside the sandbox just like an MCP server, but the agent invokes it directly via its shell tool.
<Step title="Write script + skill file">
```ts src/random-number.ts
const min = Number(process.argv[2]);
const max = Number(process.argv[3]);
```ts src/random-number.ts
const min = Number(process.argv[2]);
const max = Number(process.argv[3]);
if (Number.isNaN(min) || Number.isNaN(max)) {
console.error("Usage: random-number <min> <max>");
process.exit(1);
}
if (Number.isNaN(min) || Number.isNaN(max)) {
console.error("Usage: random-number <min> <max>");
process.exit(1);
}
console.log(Math.floor(Math.random() * (max - min + 1)) + min);
```
console.log(Math.floor(Math.random() * (max - min + 1)) + min);
```
````md SKILL.md
---
name: random-number
description: Generate a random integer between min and max.
---
Run:
```bash
node /opt/skills/random-number/random-number.cjs <min> <max>
```
````
```bash
npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/random-number.cjs
```
</Step>
<Step title="Write a skill file">
Create a `SKILL.md` that tells the agent what the script does and how to run it. The frontmatter `name` and `description` fields are required. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
<Step title="Upload files">
```md SKILL.md
---
name: random-number
description: Generate a random integer between min and max (inclusive). Use when the user asks for a random number.
---
```ts
import fs from "node:fs";
To generate a random number, run:
const script = await fs.promises.readFile("./dist/random-number.cjs");
await sdk.writeFsFile({ path: "/opt/skills/random-number/random-number.cjs" }, script);
```bash
node /opt/skills/random-number/random-number.cjs <min> <max>
```
This prints a single random integer between min and max (inclusive).
const skill = await fs.promises.readFile("./SKILL.md");
await sdk.writeFsFile({ path: "/opt/skills/random-number/SKILL.md" }, skill);
```
</Step>
<Step title="Package the script">
Bundle the script just like an MCP server so it has no dependencies at runtime.
<Step title="Use in a session">
```bash
npx esbuild src/random-number.ts --bundle --format=cjs --platform=node --target=node18 --minify --outfile=dist/random-number.cjs
```
</Step>
```ts
const session = await sdk.createSession({
agent: "claude",
cwd: "/workspace",
});
<Step title="Create sandbox and upload files">
Upload both the bundled script and the skill file.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
import fs from "node:fs";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const script = await fs.promises.readFile("./dist/random-number.cjs");
await client.writeFsFile(
{ path: "/opt/skills/random-number/random-number.cjs" },
script,
);
const skill = await fs.promises.readFile("./SKILL.md");
await client.writeFsFile(
{ path: "/opt/skills/random-number/SKILL.md" },
skill,
);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/random-number.cjs" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./dist/random-number.cjs
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=/opt/skills/random-number/SKILL.md" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
--data-binary @./SKILL.md
```
</CodeGroup>
</Step>
<Step title="Create a session">
Point the session at the skill directory. The agent reads `SKILL.md` and learns how to use your script.
<CodeGroup>
```ts TypeScript
await client.createSession("custom-tools", {
agent: "claude",
skills: {
sources: [
{ type: "local", source: "/opt/skills/random-number" },
],
},
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/custom-tools" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "local", "source": "/opt/skills/random-number" }
]
}
}'
```
</CodeGroup>
await session.prompt([
{ type: "text", text: "Use the random-number skill to pick a number from 1 to 100." },
]);
```
</Step>
</Steps>
## Notes
- The sandbox image must include a Node.js runtime that can execute the bundled files.
- The sandbox runtime must include Node.js (or your chosen runtime).
- For persistent skill-source wiring by directory, see [Skills](/skills-config).


@ -1,96 +1,69 @@
---
title: "Daemon"
description: "Background daemon lifecycle, auto-upgrade, and management."
icon: "microchip"
description: "Background daemon lifecycle and management."
---
The sandbox-agent daemon is a background server process that stays running between sessions. Commands like `sandbox-agent opencode` and `gigacode` automatically start it when needed and restart it when the binary is updated.
The sandbox-agent daemon is a background server process. Commands like `sandbox-agent opencode` and `gigacode` can ensure it is running.
## How it works
1. When you run `sandbox-agent opencode`, `sandbox-agent daemon start`, or `gigacode`, the CLI checks if a daemon is already healthy at the configured host and port.
2. If no daemon is running, one is spawned in the background with stdout/stderr redirected to a log file.
3. The CLI writes a PID file and a build ID file to track the running process and its version.
4. On subsequent invocations, if the daemon is still running but was built from a different commit, the CLI automatically stops the old daemon and starts a new one.
1. A daemon-aware command checks for a healthy daemon at host/port.
2. If missing, it starts one in the background and records PID/version files.
3. Subsequent checks can compare build/version and restart when required.
## Auto-upgrade
## Auto-upgrade behavior
Each build of sandbox-agent embeds a unique **build ID** (the git short hash, or a version-timestamp fallback). When a daemon is started, this build ID is written to a version file alongside the PID file.
On every invocation of `ensure_running` (called by `opencode`, `gigacode`, and `daemon start`), the CLI compares the stored build ID against the current binary's build ID. If they differ, the running daemon is stopped and replaced automatically:
```
daemon outdated (build a1b2c3d -> f4e5d6c), restarting...
```
This means installing a new version of sandbox-agent and running any daemon-aware command is enough to upgrade — no manual restart needed.
- `sandbox-agent opencode` and `gigacode` use ensure-running behavior with upgrade checks.
- `sandbox-agent daemon start` uses direct start by default.
- `sandbox-agent daemon start --upgrade` uses ensure-running behavior (including version check/restart).
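The ensure-running check with an upgrade comparison can be pictured as a small decision function. This is a sketch under stated assumptions: `decideDaemonAction` and its parameters are hypothetical, and treating a missing version file as "keep" is a guess rather than documented behavior.

```ts
type DaemonAction = "start" | "keep" | "restart";

// Hypothetical helper mirroring the described check; the real CLI logic may differ.
function decideDaemonAction(
  healthy: boolean,
  storedBuildId: string | null, // contents of the version file, if any
  currentBuildId: string, // build ID embedded in the running binary
): DaemonAction {
  // No healthy daemon at this host/port: start one.
  if (!healthy) return "start";
  // A healthy daemon built from a different build ID is replaced.
  if (storedBuildId !== null && storedBuildId.trim() !== currentBuildId) {
    return "restart";
  }
  // Same build, or no recorded build ID (assumption): leave it running.
  return "keep";
}
```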
## Managing the daemon
### Start
Start a daemon in the background. If one is already running and healthy, this is a no-op.
```bash
sandbox-agent daemon start [OPTIONS]
```
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host to bind to |
| `-p, --port <PORT>` | `2468` | Port to bind to |
| `-t, --token <TOKEN>` | - | Authentication token |
| `-n, --no-token` | - | Disable authentication |
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
| `--upgrade` | false | Use ensure-running + upgrade behavior |
```bash
sandbox-agent daemon start --no-token
sandbox-agent daemon start
sandbox-agent daemon start --upgrade
```
### Stop
Stop a running daemon. Sends SIGTERM and waits up to 5 seconds for a graceful shutdown before falling back to SIGKILL.
```bash
sandbox-agent daemon stop [OPTIONS]
```
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host of the daemon |
| `-p, --port <PORT>` | `2468` | Port of the daemon |
```bash
sandbox-agent daemon stop
```
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
### Status
Show whether the daemon is running, its PID, build ID, and log path.
```bash
sandbox-agent daemon status [OPTIONS]
```
| Option | Default | Description |
|--------|---------|-------------|
| `-H, --host <HOST>` | `127.0.0.1` | Host of the daemon |
| `-p, --port <PORT>` | `2468` | Port of the daemon |
```bash
sandbox-agent daemon status
# Daemon running (PID 12345, build a1b2c3d, logs: ~/.local/share/sandbox-agent/daemon/daemon-127-0-0-1-2468.log)
```
If the daemon was started with an older binary, the status includes an `[outdated, restart recommended]` notice.
| `-H, --host <HOST>` | `127.0.0.1` | Host |
| `-p, --port <PORT>` | `2468` | Port |
## Files
All daemon state files live under the sandbox-agent data directory (typically `~/.local/share/sandbox-agent/daemon/`):
Daemon state is stored under the sandbox-agent data directory (for example `~/.local/share/sandbox-agent/daemon/`):
| File | Purpose |
|------|---------|
| `daemon-{host}-{port}.pid` | PID of the running daemon process |
| `daemon-{host}-{port}.version` | Build ID of the running daemon |
| `daemon-{host}-{port}.log` | Daemon stdout/stderr log output |
Multiple daemons can run on different host/port combinations without conflicting.
| `daemon-{host}-{port}.pid` | PID of running daemon |
| `daemon-{host}-{port}.version` | Build/version marker |
| `daemon-{host}-{port}.log` | Daemon stdout/stderr log |
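The status output shown earlier (`daemon-127-0-0-1-2468.log` for `127.0.0.1:2468`) suggests that dots in the host are replaced with hyphens. A minimal sketch of that naming, assuming a hypothetical `daemonFileNames` helper; how other characters (for example `:` in IPv6 addresses) are handled is not covered here.

```ts
// Hypothetical helper: derives the state file names for a host/port pair.
// Assumption: only dots in the host are rewritten, to hyphens.
function daemonFileNames(host: string, port: number): { pid: string; version: string; log: string } {
  const safeHost = host.replace(/\./g, "-");
  const base = `daemon-${safeHost}-${port}`;
  return {
    pid: `${base}.pid`,
    version: `${base}.version`,
    log: `${base}.log`,
  };
}

console.log(daemonFileNames("127.0.0.1", 2468).log);
```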

67
docs/deploy/boxlite.mdx Normal file

@ -0,0 +1,67 @@
---
title: "BoxLite"
description: "Run Sandbox Agent inside a BoxLite micro-VM."
---
BoxLite is a local-first micro-VM sandbox — no cloud account needed.
See [BoxLite docs](https://docs.boxlite.ai) for platform requirements (KVM on Linux, Apple Silicon on macOS).
## Prerequisites
- `@boxlite-ai/boxlite` installed (requires KVM or Apple Hypervisor)
- Docker (to build the base image)
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## Base image
Build a Docker image with Sandbox Agent pre-installed, then export it as an OCI layout
that BoxLite can load directly (BoxLite has its own image store separate from Docker):
```dockerfile
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
RUN sandbox-agent install-agent claude
RUN sandbox-agent install-agent codex
```
```bash
docker build -t sandbox-agent-boxlite .
mkdir -p oci-image
docker save sandbox-agent-boxlite | tar -xf - -C oci-image
```
## TypeScript example
```typescript
import { SimpleBox } from "@boxlite-ai/boxlite";
import { SandboxAgent } from "sandbox-agent";
const env: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const box = new SimpleBox({
rootfsPath: "./oci-image",
env,
ports: [{ hostPort: 3000, guestPort: 3000 }],
diskSizeGb: 4,
});
await box.exec("sh", "-c",
"nohup sandbox-agent server --no-token --host 0.0.0.0 --port 3000 >/tmp/sandbox-agent.log 2>&1 &"
);
const baseUrl = "http://localhost:3000";
const sdk = await SandboxAgent.connect({ baseUrl });
const session = await sdk.createSession({ agent: "claude" });
const off = session.onEvent((event) => {
console.log(event.sender, event.payload);
});
await session.prompt([{ type: "text", text: "Summarize this repository" }]);
off();
await box.stop();
```
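Because the server is started in the background, the first connection attempt may race with startup. One way to guard against that is to poll before connecting. The sketch below assumes a hypothetical `waitUntilReady` helper; the attempt count and delay are arbitrary defaults, not values taken from the SDK.

```ts
// Hypothetical helper: polls `check` until it resolves to true or the
// attempts run out. Errors from `check` are treated as "not ready yet".
async function waitUntilReady(
  check: () => Promise<boolean>,
  attempts = 30,
  delayMs = 200,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await check()) return true;
    } catch {
      // Connection refused or similar: keep polling.
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

In the example above, this could be called before `SandboxAgent.connect(...)` with a check such as `async () => (await fetch(baseUrl + "/v1/health")).ok`, using the `/v1/health` endpoint that appears elsewhere in these docs.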


@ -1,21 +1,19 @@
---
title: "Cloudflare"
description: "Deploy the daemon inside a Cloudflare Sandbox."
description: "Deploy Sandbox Agent inside a Cloudflare Sandbox."
---
## Prerequisites
- Cloudflare account with Workers Paid plan
- Docker running locally for `wrangler dev`
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
- Cloudflare account with Workers paid plan
- Docker for local `wrangler dev`
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
<Note>
Cloudflare Sandbox SDK is in beta. See [Sandbox SDK docs](https://developers.cloudflare.com/sandbox/) for details.
Cloudflare Sandbox SDK is beta. See [Sandbox SDK docs](https://developers.cloudflare.com/sandbox/).
</Note>
## Quick Start
Create a new Sandbox SDK project:
## Quick start
```bash
npm create cloudflare@latest -- my-sandbox --template=cloudflare/sandbox-sdk/examples/minimal
@ -24,78 +22,65 @@ cd my-sandbox
## Dockerfile
Create a `Dockerfile` with sandbox-agent and agents pre-installed:
```dockerfile
FROM cloudflare/sandbox:0.7.0
# Install sandbox-agent
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
RUN sandbox-agent install-agent claude && sandbox-agent install-agent codex
# Pre-install agents
RUN sandbox-agent install-agent claude && \
sandbox-agent install-agent codex
# Required for local development with wrangler dev
EXPOSE 8000
```
<Note>
The `EXPOSE 8000` directive is required for `wrangler dev` to proxy requests to the container. Port 3000 is reserved for the Cloudflare control plane.
</Note>
## TypeScript example (with provider)
## Wrangler Configuration
For standalone scripts, use the `cloudflare` provider:
Update `wrangler.jsonc` to use your Dockerfile:
```bash
npm install sandbox-agent@0.4.x @cloudflare/sandbox
```
```jsonc
{
"name": "my-sandbox-agent",
"main": "src/index.ts",
"compatibility_date": "2025-01-01",
"compatibility_flags": ["nodejs_compat"],
"containers": [
{
"class_name": "Sandbox",
"image": "./Dockerfile",
"instance_type": "lite",
"max_instances": 1
}
],
"durable_objects": {
"bindings": [
{
"class_name": "Sandbox",
"name": "Sandbox"
}
]
},
"migrations": [
{
"new_sqlite_classes": ["Sandbox"],
"tag": "v1"
}
]
```typescript
import { SandboxAgent } from "sandbox-agent";
import { cloudflare } from "sandbox-agent/cloudflare";
const sdk = await SandboxAgent.start({
sandbox: cloudflare(),
});
try {
const session = await sdk.createSession({ agent: "codex" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
```
## TypeScript Example
The `cloudflare` provider uses `containerFetch` under the hood, automatically stripping `AbortSignal` to avoid dropped streaming updates.
This example proxies requests to sandbox-agent via `containerFetch`, which works reliably in both local development and production:
## TypeScript example (Durable Objects)
For Workers with Durable Objects, use `SandboxAgent.connect(...)` with a custom `fetch` backed by `sandbox.containerFetch(...)`:
```typescript
import { getSandbox, type Sandbox } from "@cloudflare/sandbox";
import { Hono } from "hono";
import { SandboxAgent } from "sandbox-agent";
export { Sandbox } from "@cloudflare/sandbox";
type Env = {
type Bindings = {
Sandbox: DurableObjectNamespace<Sandbox>;
ASSETS: Fetcher;
ANTHROPIC_API_KEY?: string;
OPENAI_API_KEY?: string;
};
const app = new Hono<{ Bindings: Bindings }>();
const PORT = 8000;
/** Check if sandbox-agent is already running */
async function isServerRunning(sandbox: Sandbox): Promise<boolean> {
try {
const result = await sandbox.exec(`curl -sf http://localhost:${PORT}/v1/health`);
@ -105,147 +90,99 @@ async function isServerRunning(sandbox: Sandbox): Promise<boolean> {
}
}
/** Ensure sandbox-agent is running in the container */
async function ensureRunning(sandbox: Sandbox, env: Env): Promise<void> {
if (await isServerRunning(sandbox)) return;
// Set environment variables for agents
const envVars: Record<string, string> = {};
if (env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = env.ANTHROPIC_API_KEY;
if (env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = env.OPENAI_API_KEY;
await sandbox.setEnvVars(envVars);
// Start sandbox-agent server
await sandbox.startProcess(
`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`
);
// Poll health endpoint until server is ready
for (let i = 0; i < 30; i++) {
if (await isServerRunning(sandbox)) return;
await new Promise((r) => setTimeout(r, 200));
async function getReadySandbox(name: string, env: Bindings): Promise<Sandbox> {
const sandbox = getSandbox(env.Sandbox, name);
if (!(await isServerRunning(sandbox))) {
const envVars: Record<string, string> = {};
if (env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = env.ANTHROPIC_API_KEY;
if (env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = env.OPENAI_API_KEY;
await sandbox.setEnvVars(envVars);
await sandbox.startProcess(`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`);
}
return sandbox;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const url = new URL(request.url);
app.post("/sandbox/:name/prompt", async (c) => {
const sandbox = await getReadySandbox(c.req.param("name"), c.env);
// Proxy requests: /sandbox/:name/v1/...
const match = url.pathname.match(/^\/sandbox\/([^/]+)(\/.*)?$/);
if (match) {
const [, name, path = "/"] = match;
const sandbox = getSandbox(env.Sandbox, name);
const sdk = await SandboxAgent.connect({
fetch: (input, init) =>
sandbox.containerFetch(
input as Request | string | URL,
{
...(init ?? {}),
// Avoid passing AbortSignal through containerFetch; it can drop streamed session updates.
signal: undefined,
},
PORT,
),
});
await ensureRunning(sandbox, env);
const session = await sdk.createSession({ agent: "codex" });
const response = await session.prompt([{ type: "text", text: "Summarize this repository" }]);
await sdk.destroySession(session.id);
await sdk.dispose();
// Proxy request to container
return sandbox.containerFetch(
new Request(`http://localhost${path}${url.search}`, request),
PORT
);
}
return new Response("Not found", { status: 404 });
},
};
```
## Connect from Client
```typescript
import { SandboxAgent } from "sandbox-agent";
// Connect via the proxy endpoint
const client = await SandboxAgent.connect({
baseUrl: "http://localhost:8787/sandbox/my-sandbox",
return c.json(response);
});
// Wait for server to be ready
for (let i = 0; i < 30; i++) {
try {
await client.getHealth();
break;
} catch {
await new Promise((r) => setTimeout(r, 1000));
}
}
app.all("/sandbox/:name/proxy/*", async (c) => {
const sandbox = await getReadySandbox(c.req.param("name"), c.env);
const wildcard = c.req.param("*");
const path = wildcard ? `/${wildcard}` : "/";
const query = new URL(c.req.raw.url).search;
// Create a session and start coding
await client.createSession("my-session", { agent: "claude" });
await client.postMessage("my-session", {
message: "Summarize this repository",
return sandbox.containerFetch(new Request(`http://localhost${path}${query}`, c.req.raw), PORT);
});
for await (const event of client.streamEvents("my-session")) {
// Auto-approve permissions
if (event.type === "permission.requested") {
await client.replyPermission("my-session", event.data.permission_id, {
reply: "once",
});
}
app.all("*", (c) => c.env.ASSETS.fetch(c.req.raw));
// Handle text output
if (event.type === "item.delta" && event.data?.delta) {
process.stdout.write(event.data.delta);
}
}
export default app;
```
## Environment Variables
This keeps all Sandbox Agent calls inside the Cloudflare sandbox routing path and does not require a `baseUrl`.
Use `.dev.vars` for local development:
## Troubleshooting streaming updates
```bash
echo "ANTHROPIC_API_KEY=your-api-key" > .dev.vars
If you only receive:
- the outbound prompt request
- the final `{ stopReason: "end_turn" }` response
then the streamed update channel dropped. In Cloudflare sandbox paths, this is typically caused by forwarding `AbortSignal` from SDK fetch init into `containerFetch(...)`.
Fix:
```ts
const sdk = await SandboxAgent.connect({
fetch: (input, init) =>
sandbox.containerFetch(
input as Request | string | URL,
{
...(init ?? {}),
// Avoid passing AbortSignal through containerFetch; it can drop streamed session updates.
signal: undefined,
},
PORT,
),
});
```
<Warning>
Use plain `KEY=value` format in `.dev.vars`. Do not use `export KEY=value` - wrangler won't parse the bash syntax.
</Warning>
This keeps prompt completion behavior the same, but restores streamed text/tool updates.
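The fix above can also be factored into a small helper. This is a minimal sketch, not part of the SDK: `withoutSignal` is a hypothetical name, and it only assumes that the fetch init is a plain object that may carry a `signal` field.

```typescript
// Hypothetical helper (not part of sandbox-agent): copy the fetch init and
// clear `signal` so no AbortSignal is forwarded to containerFetch.
function withoutSignal(init?: Record<string, unknown> | null): Record<string, unknown> {
  // Spread first so every other option (method, headers, body) is preserved,
  // then overwrite `signal` explicitly.
  return { ...(init ?? {}), signal: undefined };
}

// Example: the method survives, the signal does not.
const cleaned = withoutSignal({ method: "POST", signal: new AbortController().signal });
console.log(cleaned.method, cleaned.signal);
```

With a helper like this, the custom `fetch` passed to `SandboxAgent.connect` would hand `withoutSignal(init)` to `containerFetch` instead of repeating the spread inline.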
<Note>
The `.dev.vars` file is automatically gitignored and only used during local development with `npm run dev`.
</Note>
For production, set secrets via wrangler:
```bash
wrangler secret put ANTHROPIC_API_KEY
```
## Local Development
Start the development server:
## Local development
```bash
npm run dev
```
<Note>
First run builds the Docker container (2-3 minutes). Subsequent runs are much faster.
</Note>
Test with curl:
Test health:
```bash
curl http://localhost:8787/sandbox/demo/v1/health
curl http://localhost:8787/sandbox/demo/proxy/v1/health
```
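The same health check can be scripted when a test or setup step needs to wait for the container. This is a rough sketch under stated assumptions: `waitForHealth` and the `Fetcher` type are hypothetical names, and the 30-attempt, 1-second defaults are illustrative values rather than anything the SDK prescribes.

```typescript
// Hypothetical readiness poll: the fetcher is injected so the function can be
// used with the global `fetch` or with a stub in tests.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

async function waitForHealth(
  url: string,
  fetcher: Fetcher,
  attempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetcher(url);
      if (res.ok) return true; // Server answered with a 2xx status.
    } catch {
      // Server not reachable yet; fall through to the delay and retry.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // Gave up after `attempts` tries.
}

// Usage (assumes the local dev server from this section is running):
// const ready = await waitForHealth("http://localhost:8787/sandbox/demo/proxy/v1/health", fetch);
```

Returning a boolean instead of throwing leaves the caller free to decide whether a slow first container build is an error.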
<Tip>
Containers cache environment variables. If you change `.dev.vars`, either use a new sandbox name or clear existing containers:
```bash
docker ps -a | grep sandbox | awk '{print $1}' | xargs -r docker rm -f
```
</Tip>
## Production Deployment
Deploy to Cloudflare:
## Production deployment
```bash
wrangler deploy
```
For production with preview URLs (direct container access), you'll need a custom domain with wildcard DNS routing. See [Cloudflare Production Deployment](https://developers.cloudflare.com/sandbox/guides/production-deployment/) for setup instructions.


@ -0,0 +1,81 @@
---
title: "ComputeSDK"
description: "Deploy Sandbox Agent using ComputeSDK's provider-agnostic sandbox API."
---
[ComputeSDK](https://computesdk.com) provides a unified interface for managing sandboxes across multiple providers. Write once, deploy anywhere by changing environment variables.
## Prerequisites
- `COMPUTESDK_API_KEY` from [console.computesdk.com](https://console.computesdk.com)
- Provider API key (one of: `E2B_API_KEY`, `DAYTONA_API_KEY`, `VERCEL_TOKEN`, `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`, `BLAXEL_API_KEY`, `CSB_API_KEY`)
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## TypeScript example
```bash
npm install sandbox-agent@0.4.x computesdk
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { computesdk } from "sandbox-agent/computesdk";
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const sdk = await SandboxAgent.start({
sandbox: computesdk({
create: {
envs,
image: process.env.COMPUTESDK_IMAGE,
templateId: process.env.COMPUTESDK_TEMPLATE_ID,
},
}),
});
try {
const session = await sdk.createSession({ agent: "claude" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
```
The `computesdk` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically. ComputeSDK routes to your configured provider behind the scenes.
The `create` option now forwards the full ComputeSDK sandbox-create payload, including provider-specific fields such as `image` and `templateId` when the selected provider supports them.
Before calling `SandboxAgent.start()`, configure ComputeSDK with your provider:
```typescript
import { compute } from "computesdk";
compute.setConfig({
provider: "e2b", // or auto-detect via detectProvider()
computesdkApiKey: process.env.COMPUTESDK_API_KEY,
});
```
## Supported providers
ComputeSDK auto-detects your provider from environment variables:
| Provider | Environment Variables |
|----------|----------------------|
| E2B | `E2B_API_KEY` |
| Daytona | `DAYTONA_API_KEY` |
| Vercel | `VERCEL_TOKEN` or `VERCEL_OIDC_TOKEN` |
| Modal | `MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET` |
| Blaxel | `BLAXEL_API_KEY` |
| CodeSandbox | `CSB_API_KEY` |
## Notes
- **Provider resolution**: Set `COMPUTESDK_PROVIDER` to force a specific provider, or let ComputeSDK auto-detect from API keys.
- `sandbox.runCommand(..., { background: true })` keeps the server running while your app continues.
- `sandbox.getUrl({ port })` returns a public URL for the sandbox port.
- Always destroy the sandbox when done to avoid leaking resources.
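The provider resolution described above (an explicit `COMPUTESDK_PROVIDER` wins, otherwise the provider is inferred from whichever API keys are present) could look roughly like the sketch below. It is an illustration only and not ComputeSDK's actual implementation: `resolveProvider` is a hypothetical function, the lookup order is an assumption, and every provider identifier other than `"e2b"` is a guess.

```typescript
type Env = Record<string, string | undefined>;

// Mirrors the "Supported providers" table. Each provider lists alternative
// key sets; a set matches only when all of its keys are present.
// Identifiers other than "e2b" and the ordering are assumptions.
const PROVIDER_KEYS: Array<[string, string[][]]> = [
  ["e2b", [["E2B_API_KEY"]]],
  ["daytona", [["DAYTONA_API_KEY"]]],
  ["vercel", [["VERCEL_TOKEN"], ["VERCEL_OIDC_TOKEN"]]],
  ["modal", [["MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"]]],
  ["blaxel", [["BLAXEL_API_KEY"]]],
  ["codesandbox", [["CSB_API_KEY"]]],
];

function resolveProvider(env: Env): string | undefined {
  // An explicit override takes precedence over auto-detection.
  if (env.COMPUTESDK_PROVIDER) return env.COMPUTESDK_PROVIDER;
  for (const [provider, alternatives] of PROVIDER_KEYS) {
    const matched = alternatives.some((keys) => keys.every((key) => Boolean(env[key])));
    if (matched) return provider;
  }
  return undefined; // Nothing configured.
}

console.log(resolveProvider({ MODAL_TOKEN_ID: "id", MODAL_TOKEN_SECRET: "secret" }));
```

Modal is the one row that needs two variables together, which is why each provider maps to a list of key sets rather than a flat list of keys.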


@ -1,63 +1,52 @@
---
title: "Daytona"
description: "Run the daemon in a Daytona workspace."
description: "Run Sandbox Agent in a Daytona workspace."
---
<Warning>
Daytona Tier 3+ is required to access api.anthropic.com and api.openai.com. Tier 1/2 sandboxes have restricted network access that will cause agent failures. See [Daytona network limits](https://www.daytona.io/docs/en/network-limits/) for details.
Daytona Tier 3+ is required for access to common model provider endpoints.
See [Daytona network limits](https://www.daytona.io/docs/en/network-limits/).
</Warning>
## Prerequisites
- `DAYTONA_API_KEY` environment variable
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
- `DAYTONA_API_KEY`
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## TypeScript Example
## TypeScript example
```bash
npm install sandbox-agent@0.4.x @daytonaio/sdk
```
```typescript
import { Daytona } from "@daytonaio/sdk";
import { SandboxAgent } from "sandbox-agent";
import { daytona } from "sandbox-agent/daytona";
const daytona = new Daytona();
// Pass API keys to the sandbox
const envVars: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const sandbox = await daytona.create({ envVars });
// Install sandbox-agent
await sandbox.process.executeCommand(
"curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh"
);
// Start the server in the background
await sandbox.process.executeCommand(
"nohup sandbox-agent server --no-token --host 0.0.0.0 --port 3000 >/tmp/sandbox-agent.log 2>&1 &"
);
// Wait for server to be ready
await new Promise((r) => setTimeout(r, 2000));
// Get the public URL
const baseUrl = (await sandbox.getSignedPreviewUrl(3000, 4 * 60 * 60)).url;
// Connect and use the SDK
const client = await SandboxAgent.connect({ baseUrl });
await client.createSession("my-session", {
agent: "claude",
permissionMode: "default",
const sdk = await SandboxAgent.start({
sandbox: daytona({
create: { envVars },
}),
});
// Cleanup when done
await sandbox.delete();
try {
const session = await sdk.createSession({ agent: "claude" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
```
## Using Snapshots for Faster Startup
The `daytona` provider uses the `rivetdev/sandbox-agent:0.4.2-full` image by default and starts the server automatically.
For production, use snapshots with pre-installed binaries:
## Using snapshots for faster startup
```typescript
import { Daytona, Image } from "@daytonaio/sdk";
@ -65,7 +54,6 @@ import { Daytona, Image } from "@daytonaio/sdk";
const daytona = new Daytona();
const SNAPSHOT = "sandbox-agent-ready";
// Create snapshot once (takes 2-3 minutes)
const hasSnapshot = await daytona.snapshot.get(SNAPSHOT).then(() => true, () => false);
if (!hasSnapshot) {
@ -73,18 +61,10 @@ if (!hasSnapshot) {
name: SNAPSHOT,
image: Image.base("ubuntu:22.04").runCommands(
"apt-get update && apt-get install -y curl ca-certificates",
"curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh",
"curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh",
"sandbox-agent install-agent claude",
"sandbox-agent install-agent codex",
),
});
}
// Now sandboxes start instantly
const sandbox = await daytona.create({
snapshot: SNAPSHOT,
envVars,
});
```
See [Daytona Snapshots](https://daytona.io/docs/snapshots) for details.


@ -1,33 +1,46 @@
---
title: "Docker"
description: "Build and run the daemon in a Docker container."
description: "Build and run Sandbox Agent in a Docker container."
---
<Warning>
Docker is not recommended for production. Standard Docker containers don't provide sufficient isolation for running untrusted code. Use a dedicated sandbox provider like E2B or Daytona for production workloads.
Docker is not recommended for production isolation of untrusted workloads. Use dedicated sandbox providers (E2B, Daytona, etc.) for stronger isolation.
</Warning>
## Quick Start
## Quick start
Run sandbox-agent in a container with agents pre-installed:
Run the published full image with all supported agents pre-installed:
```bash
docker run --rm -p 3000:3000 \
-e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
-e OPENAI_API_KEY="$OPENAI_API_KEY" \
alpine:latest sh -c "\
apk add --no-cache curl ca-certificates libstdc++ libgcc bash && \
curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh && \
sandbox-agent install-agent claude && \
sandbox-agent install-agent codex && \
rivetdev/sandbox-agent:0.4.2-full \
server --no-token --host 0.0.0.0 --port 3000
```
The `0.4.2-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image.
If you also want the desktop API inside the container, install desktop dependencies before starting the server:
```bash
docker run --rm -p 3000:3000 \
-e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
-e OPENAI_API_KEY="$OPENAI_API_KEY" \
node:22-bookworm-slim sh -c "\
apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y curl ca-certificates bash libstdc++6 && \
rm -rf /var/lib/apt/lists/* && \
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh && \
sandbox-agent install desktop --yes && \
sandbox-agent server --no-token --host 0.0.0.0 --port 3000"
```
<Note>
Alpine is required because Claude Code is built for musl libc. Debian/Ubuntu images use glibc and won't work.
</Note>
In a Dockerfile:
Access the API at `http://localhost:3000`.
```dockerfile
RUN sandbox-agent install desktop --yes
```
## TypeScript with dockerode
@ -39,17 +52,12 @@ const docker = new Docker();
const PORT = 3000;
const container = await docker.createContainer({
Image: "alpine:latest",
Cmd: ["sh", "-c", [
"apk add --no-cache curl ca-certificates libstdc++ libgcc bash",
"curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh",
"sandbox-agent install-agent claude",
"sandbox-agent install-agent codex",
`sandbox-agent server --no-token --host 0.0.0.0 --port ${PORT}`,
].join(" && ")],
Image: "rivetdev/sandbox-agent:0.4.2-full",
Cmd: ["server", "--no-token", "--host", "0.0.0.0", "--port", `${PORT}`],
Env: [
`ANTHROPIC_API_KEY=${process.env.ANTHROPIC_API_KEY}`,
`OPENAI_API_KEY=${process.env.OPENAI_API_KEY}`,
`CODEX_API_KEY=${process.env.CODEX_API_KEY}`,
].filter(Boolean),
ExposedPorts: { [`${PORT}/tcp`]: {} },
HostConfig: {
@ -60,24 +68,41 @@ const container = await docker.createContainer({
await container.start();
// Wait for server and connect
const baseUrl = `http://127.0.0.1:${PORT}`;
const client = await SandboxAgent.connect({ baseUrl });
const sdk = await SandboxAgent.connect({ baseUrl });
// Use the client...
await client.createSession("my-session", {
agent: "claude",
permissionMode: "default",
});
const session = await sdk.createSession({ agent: "codex" });
await session.prompt([{ type: "text", text: "Summarize this repository." }]);
```
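One detail in the `Env` list above is worth a second look: each entry is a template string, and a non-empty string is always truthy, so `.filter(Boolean)` does not drop keys that are unset. An unset variable would be passed to the container as `KEY=undefined`. A minimal sketch of one way to build the list instead, using a hypothetical `buildContainerEnv` helper that is not part of the SDK or dockerode:

```typescript
// Hypothetical helper: emit `KEY=value` entries only for keys that are
// actually set to a non-empty value.
function buildContainerEnv(
  keys: string[],
  env: Record<string, string | undefined>,
): string[] {
  return keys
    .filter((key) => env[key] !== undefined && env[key] !== "")
    .map((key) => `${key}=${env[key]}`);
}

// Example with an explicit env object; in the dockerode snippet this would be
// `process.env`.
const containerEnv = buildContainerEnv(
  ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "CODEX_API_KEY"],
  { ANTHROPIC_API_KEY: "sk-example" },
);
console.log(containerEnv);
```

Filtering on the value before formatting keeps placeholder strings out of the container environment, where an agent might otherwise treat `undefined` as a real credential.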
## Building from Source
## Building a custom image with everything preinstalled
To build a static binary for use in minimal containers:
If you need to extend your own base image, install Sandbox Agent and preinstall every supported agent in one step:
```dockerfile
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
bash ca-certificates curl git && \
rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh && \
sandbox-agent install-agent --all
RUN useradd -m -s /bin/bash sandbox
USER sandbox
WORKDIR /home/sandbox
EXPOSE 2468
ENTRYPOINT ["sandbox-agent"]
CMD ["server", "--host", "0.0.0.0", "--port", "2468"]
```
## Building from source
```bash
docker build -f docker/release/linux-x86_64.Dockerfile -t sandbox-agent-build .
docker run --rm -v "$PWD/artifacts:/artifacts" sandbox-agent-build
```
The binary will be at `./artifacts/sandbox-agent-x86_64-unknown-linux-musl`.
Binary output: `./artifacts/sandbox-agent-x86_64-unknown-linux-musl`.


@ -1,79 +1,52 @@
---
title: "E2B"
description: "Deploy the daemon inside an E2B sandbox."
description: "Deploy Sandbox Agent inside an E2B sandbox."
---
## Prerequisites
- `E2B_API_KEY` environment variable
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
- `E2B_API_KEY`
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## TypeScript Example
## TypeScript example
```bash
npm install sandbox-agent@0.4.x @e2b/code-interpreter
```
```typescript
import { Sandbox } from "@e2b/code-interpreter";
import { SandboxAgent } from "sandbox-agent";
import { e2b } from "sandbox-agent/e2b";
// Pass API keys to the sandbox
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const template = process.env.E2B_TEMPLATE;
const sandbox = await Sandbox.create({ allowInternetAccess: true, envs });
// Install sandbox-agent
await sandbox.commands.run(
"curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh"
);
// Install agents before starting the server
await sandbox.commands.run("sandbox-agent install-agent claude");
await sandbox.commands.run("sandbox-agent install-agent codex");
// Start the server in the background
await sandbox.commands.run(
"sandbox-agent server --no-token --host 0.0.0.0 --port 3000",
{ background: true }
);
// Connect to the server
const baseUrl = `https://${sandbox.getHost(3000)}`;
const client = await SandboxAgent.connect({ baseUrl });
// Wait for server to be ready
for (let i = 0; i < 30; i++) {
try {
await client.getHealth();
break;
} catch {
await new Promise((r) => setTimeout(r, 1000));
}
}
// Create a session and start coding
await client.createSession("my-session", {
agent: "claude",
permissionMode: "default",
const sdk = await SandboxAgent.start({
sandbox: e2b({
template,
create: { envs },
}),
});
await client.postMessage("my-session", {
message: "Summarize this repository",
});
for await (const event of client.streamEvents("my-session")) {
console.log(event.type, event.data);
try {
const session = await sdk.createSession({ agent: "claude" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
// Cleanup
await sandbox.kill();
```
## Faster Cold Starts
The `e2b` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically. Sandboxes pause by default instead of being deleted, and reconnecting with the same `sandboxId` resumes them automatically.
For faster startup, create a custom E2B template with sandbox-agent and agents pre-installed:
Pass `template` when you want to start from a custom E2B template alias or template ID. E2B base-image selection happens when you build the template, then `sandbox-agent/e2b` uses that template at sandbox creation time.
1. Create a template with the install script baked in
2. Pre-install agents: `sandbox-agent install-agent claude codex`
3. Use the template ID when creating sandboxes
## Faster cold starts
See [E2B Custom Templates](https://e2b.dev/docs/sandbox-template) for details.
For faster startup, create a custom E2B template with Sandbox Agent and target agents pre-installed.
Build System 2.0 also lets you choose the template's base image in code.
See [E2B Custom Templates](https://e2b.dev/docs/sandbox-template) and [E2B Base Images](https://e2b.dev/docs/template/base-image).


@ -1,52 +1,70 @@
---
title: "Local"
description: "Run the daemon locally for development."
description: "Run Sandbox Agent locally for development."
---
For local development, you can run the daemon directly on your machine.
For local development, run Sandbox Agent directly on your machine.
## With the CLI
```bash
# Install
curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
# Run
sandbox-agent server --no-token --host 127.0.0.1 --port 2468
```
Or with npm or Bun:
Or with npm/Bun:
<Tabs>
<Tab title="npx">
```bash
npx sandbox-agent server --no-token --host 127.0.0.1 --port 2468
npx @sandbox-agent/cli@0.4.x server --no-token --host 127.0.0.1 --port 2468
```
</Tab>
<Tab title="bunx">
```bash
bunx sandbox-agent server --no-token --host 127.0.0.1 --port 2468
bunx @sandbox-agent/cli@0.4.x server --no-token --host 127.0.0.1 --port 2468
```
</Tab>
</Tabs>
## With the TypeScript SDK
The SDK can automatically spawn and manage the server as a subprocess:
The SDK can spawn and manage the server as a subprocess using the `local` provider:
```typescript
import { SandboxAgent } from "sandbox-agent";
import { local } from "sandbox-agent/local";
// Spawns sandbox-agent server as a subprocess
const client = await SandboxAgent.start();
await client.createSession("my-session", {
agent: "claude",
permissionMode: "default",
const sdk = await SandboxAgent.start({
sandbox: local(),
});
// When done
await client.dispose();
const session = await sdk.createSession({
agent: "claude",
});
await session.prompt([
{ type: "text", text: "Summarize this repository." },
]);
await sdk.destroySandbox();
```
This installs the binary (if needed) and starts the server on a random available port. No manual setup required.
This starts the server on an available local port and connects automatically.
Pass options to customize the local provider:
```typescript
const sdk = await SandboxAgent.start({
sandbox: local({
port: 3000,
log: "inherit",
env: {
ANTHROPIC_API_KEY: process.env.MY_ANTHROPIC_KEY,
},
}),
});
```

55
docs/deploy/modal.mdx Normal file

@ -0,0 +1,55 @@
---
title: "Modal"
description: "Deploy Sandbox Agent inside a Modal sandbox."
---
## Prerequisites
- `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` from [modal.com/settings](https://modal.com/settings)
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## TypeScript example
```bash
npm install sandbox-agent@0.4.x modal
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { modal } from "sandbox-agent/modal";
const secrets: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) secrets.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) secrets.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const baseImage = process.env.MODAL_BASE_IMAGE ?? "node:22-slim";
const sdk = await SandboxAgent.start({
sandbox: modal({
image: baseImage,
create: { secrets },
}),
});
try {
const session = await sdk.createSession({ agent: "claude" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
```
The `modal` provider handles app creation, image building, sandbox provisioning, agent installation, server startup, and tunnel networking automatically.
Set `image` to change the base Docker image before Sandbox Agent and its agent binaries are layered on top. You can also pass a prebuilt Modal `Image` object.
## Faster cold starts
Modal caches image layers, so the Dockerfile commands that install `curl` and `sandbox-agent` only run on the first build. Subsequent sandbox creates reuse the cached image.
## Notes
- Modal sandboxes use [gVisor](https://gvisor.dev/) for strong isolation.
- Ports are exposed via encrypted tunnels (`encryptedPorts`). The provider uses `sb.tunnels()` to get the public HTTPS URL.
- Environment variables (API keys) are passed as Modal [Secrets](https://modal.com/docs/guide/secrets) for security.


@ -1,91 +1,50 @@
---
title: "Vercel"
description: "Deploy the daemon inside a Vercel Sandbox."
description: "Deploy Sandbox Agent inside a Vercel Sandbox."
---
## Prerequisites
- `VERCEL_OIDC_TOKEN` or `VERCEL_ACCESS_TOKEN` environment variable
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
- `VERCEL_OIDC_TOKEN` or `VERCEL_ACCESS_TOKEN`
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`
## TypeScript Example
## TypeScript example
```bash
npm install sandbox-agent@0.4.x @vercel/sandbox
```
```typescript
import { Sandbox } from "@vercel/sandbox";
import { SandboxAgent } from "sandbox-agent";
import { vercel } from "sandbox-agent/vercel";
// Pass API keys to the sandbox
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const env: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
// Create sandbox with port 3000 exposed
const sandbox = await Sandbox.create({
runtime: "node24",
ports: [3000],
const sdk = await SandboxAgent.start({
sandbox: vercel({
create: {
runtime: "node24",
env,
},
}),
});
// Helper to run commands
const run = async (cmd: string, args: string[] = []) => {
const result = await sandbox.runCommand({ cmd, args, env: envs });
if (result.exitCode !== 0) {
throw new Error(`Command failed: ${cmd} ${args.join(" ")}`);
}
return result;
};
// Install sandbox-agent
await run("sh", ["-c", "curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh"]);
// Install agents before starting the server
await run("sandbox-agent", ["install-agent", "claude"]);
await run("sandbox-agent", ["install-agent", "codex"]);
// Start the server in the background
await sandbox.runCommand({
cmd: "sandbox-agent",
args: ["server", "--no-token", "--host", "0.0.0.0", "--port", "3000"],
env: envs,
detached: true,
});
// Connect to the server
const baseUrl = sandbox.domain(3000);
const client = await SandboxAgent.connect({ baseUrl });
// Wait for server to be ready
for (let i = 0; i < 30; i++) {
try {
await client.getHealth();
break;
} catch {
await new Promise((r) => setTimeout(r, 1000));
}
try {
const session = await sdk.createSession({ agent: "claude" });
const response = await session.prompt([
{ type: "text", text: "Summarize this repository" },
]);
console.log(response.stopReason);
} finally {
await sdk.destroySandbox();
}
// Create a session and start coding
await client.createSession("my-session", {
agent: "claude",
permissionMode: "default",
});
await client.postMessage("my-session", {
message: "Summarize this repository",
});
for await (const event of client.streamEvents("my-session")) {
console.log(event.type, event.data);
}
// Cleanup
await sandbox.stop();
```
The `vercel` provider handles sandbox creation, Sandbox Agent installation, agent setup, and server startup automatically.
## Authentication
Vercel Sandboxes support two authentication methods:
- **OIDC Token**: Set `VERCEL_OIDC_TOKEN` (recommended for CI/CD)
- **Access Token**: Set `VERCEL_ACCESS_TOKEN` (for local development, run `vercel env pull`)
See [Vercel Sandbox docs](https://vercel.com/docs/functions/sandbox) for details.
Vercel Sandboxes support OIDC token auth (recommended) and access-token auth.
See [Vercel Sandbox docs](https://vercel.com/docs/functions/sandbox).


@ -1,122 +1,130 @@
{
"$schema": "https://mintlify.com/docs.json",
"theme": "willow",
"name": "Sandbox Agent SDK",
"appearance": {
"default": "dark",
"strict": true
},
"colors": {
"primary": "#ff4f00",
"light": "#ff4f00",
"dark": "#ff4f00"
},
"favicon": "/favicon.svg",
"logo": {
"light": "/logo/light.svg",
"dark": "/logo/dark.svg"
},
"integrations": {
"posthog": {
"apiKey": "phc_6kfTNEAVw7rn1LA51cO3D69FefbKupSWFaM7OUgEpEo",
"apiHost": "https://ph.rivet.gg",
"sessionRecording": true
}
},
"navbar": {
"links": [
{
"label": "Gigacode",
"icon": "terminal",
"href": "https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode"
},
{
"label": "Discord",
"icon": "discord",
"href": "https://discord.gg/auCecybynK"
},
{
"type": "github",
"href": "https://github.com/rivet-dev/sandbox-agent"
}
]
},
"navigation": {
"tabs": [
{
"tab": "Documentation",
"pages": [
{
"group": "Getting started",
"pages": [
"quickstart",
"building-chat-ui",
"manage-sessions",
{
"group": "Deploy",
"icon": "server",
"pages": [
"deploy/local",
"deploy/e2b",
"deploy/daytona",
"deploy/vercel",
"deploy/cloudflare",
"deploy/docker"
]
}
]
},
{
"group": "SDKs",
"pages": ["sdks/typescript", "sdks/python"]
},
{
"group": "Agent Features",
"pages": [
"agent-sessions",
"attachments",
"skills-config",
"mcp-config",
"custom-tools"
]
},
{
"group": "Features",
"pages": ["file-system"]
},
{
"group": "Reference",
"pages": [
"cli",
"inspector",
"session-transcript-schema",
"opencode-compatibility",
{
"group": "More",
"pages": [
"credentials",
"daemon",
"cors",
"telemetry",
{
"group": "AI",
"pages": ["ai/skill", "ai/llms-txt"]
}
]
}
]
}
]
},
{
"tab": "HTTP API",
"pages": [
{
"group": "HTTP Reference",
"openapi": "openapi.json"
}
]
}
]
}
"$schema": "https://mintlify.com/docs.json",
"theme": "mint",
"name": "Sandbox Agent SDK",
"appearance": {
"default": "dark",
"strict": true
},
"colors": {
"primary": "#ff4f00",
"light": "#ff6a2a",
"dark": "#cc3f00"
},
"favicon": "/favicon.svg",
"logo": {
"light": "/logo/light.svg",
"dark": "/logo/dark.svg"
},
"integrations": {
"posthog": {
"apiKey": "phc_6kfTNEAVw7rn1LA51cO3D69FefbKupSWFaM7OUgEpEo",
"apiHost": "https://ph.rivet.gg",
"sessionRecording": true
}
},
"navbar": {
"links": [
{
"label": "Discord",
"icon": "discord",
"href": "https://discord.gg/auCecybynK"
},
{
"label": "GitHub",
"type": "github",
"href": "https://github.com/rivet-dev/sandbox-agent"
}
]
},
"navigation": {
"tabs": [
{
"tab": "Documentation",
"pages": [
{
"group": "Getting started",
"pages": [
"quickstart",
"sdk-overview",
"llm-credentials",
"react-components",
{
"group": "Deploy",
"icon": "server",
"pages": [
"deploy/local",
"deploy/e2b",
"deploy/daytona",
"deploy/vercel",
"deploy/cloudflare",
"deploy/docker",
"deploy/modal",
"deploy/boxlite",
"deploy/computesdk"
]
}
]
},
{
"group": "Agent",
"pages": [
"agent-sessions",
{
"group": "Agents",
"icon": "robot",
"pages": ["agents/claude", "agents/codex", "agents/opencode", "agents/cursor", "agents/amp", "agents/pi"]
},
"attachments",
"skills-config",
"mcp-config",
"custom-tools"
]
},
{
"group": "System",
"pages": ["file-system", "processes", "computer-use", "common-software"]
},
{
"group": "Reference",
"pages": [
"troubleshooting",
"architecture",
"cli",
"inspector",
"opencode-compatibility",
{
"group": "More",
"pages": [
"daemon",
"cors",
"session-restoration",
"telemetry",
{
"group": "AI",
"pages": ["ai/skill", "ai/llms-txt"]
}
]
}
]
}
]
},
{
"tab": "HTTP API",
"pages": [
{
"group": "HTTP Reference",
"openapi": "openapi.json"
}
]
}
]
},
"__removed": [
{
"group": "Orchestration",
"pages": ["orchestration-architecture", "session-persistence", "observability", "multiplayer", "security"]
}
]
}


@ -5,43 +5,37 @@ sidebarTitle: "File System"
icon: "folder"
---
The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload batches of files via tar archives.
The filesystem API lets you list, read, write, move, and delete files inside the sandbox, plus upload tar archives in batch.
## Path Resolution
## Path resolution
- Absolute paths are used as-is.
- Relative paths use the session working directory when `sessionId` is provided.
- Without `sessionId`, relative paths resolve against the server home directory.
- Relative paths cannot contain `..` or absolute prefixes; requests that attempt to escape the root are rejected.
- Relative paths resolve from the server process working directory.
- Requests that attempt to escape allowed roots are rejected by the server.
The session working directory is the server process current working directory at the moment the session is created.
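To make the resolution rules concrete, here is a rough sketch of how a path might be resolved and checked. It is not the server's implementation: `resolveSandboxPath` and `normalizePosix` are hypothetical names, the single `root` argument stands in for whatever "allowed roots" the server enforces, and applying the root check to absolute paths as well is an assumption.

```typescript
// Collapse ".", "..", and repeated slashes in a POSIX-style absolute path.
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const part of p.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Hypothetical resolver: absolute paths are taken as given, relative paths
// resolve from `cwd`, and anything that lands outside `root` is rejected.
function resolveSandboxPath(requested: string, cwd: string, root: string): string {
  const absolute = requested.startsWith("/") ? requested : `${cwd}/${requested}`;
  const resolved = normalizePosix(absolute);
  const normalizedRoot = normalizePosix(root);
  const inside =
    normalizedRoot === "/" ||
    resolved === normalizedRoot ||
    resolved.startsWith(normalizedRoot + "/");
  if (!inside) throw new Error(`path escapes root: ${requested}`);
  return resolved;
}

console.log(resolveSandboxPath("./workspace", "/home/sandbox", "/home/sandbox"));
```

Comparing against `normalizedRoot + "/"` rather than the bare prefix avoids treating a sibling such as `/home/sandboxed` as being inside `/home/sandbox`.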
## List Entries
## List entries
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const entries = await client.listFsEntries({
const entries = await sdk.listFsEntries({
path: "./workspace",
sessionId: "my-session",
});
console.log(entries);
```
```bash cURL
curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace"
```
</CodeGroup>
## Read And Write Files
## Read and write files
`PUT /v1/fs/file` writes raw bytes. `GET /v1/fs/file` returns raw bytes.
@ -49,102 +43,81 @@ curl -X GET "http://127.0.0.1:2468/v1/fs/entries?path=./workspace&sessionId=my-s
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.writeFsFile({ path: "./notes.txt", sessionId: "my-session" }, "hello");
const bytes = await client.readFsFile({
path: "./notes.txt",
sessionId: "my-session",
});
await sdk.writeFsFile({ path: "./notes.txt" }, "hello");
const bytes = await sdk.readFsFile({ path: "./notes.txt" });
const text = new TextDecoder().decode(bytes);
console.log(text);
```
```bash cURL
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt" \
--data-binary "hello"
curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
curl -X GET "http://127.0.0.1:2468/v1/fs/file?path=./notes.txt" \
--output ./notes.txt
```
</CodeGroup>
## Create Directories
## Create directories
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.mkdirFs({
path: "./data",
sessionId: "my-session",
});
await sdk.mkdirFs({ path: "./data" });
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=./data"
```
</CodeGroup>
## Move, Delete, And Stat
## Move, delete, and stat
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.moveFs(
{ from: "./notes.txt", to: "./notes-old.txt", overwrite: true },
{ sessionId: "my-session" },
);
const stat = await client.statFs({
path: "./notes-old.txt",
sessionId: "my-session",
await sdk.moveFs({
from: "./notes.txt",
to: "./notes-old.txt",
overwrite: true,
});
await client.deleteFsEntry({
path: "./notes-old.txt",
sessionId: "my-session",
});
const stat = await sdk.statFs({ path: "./notes-old.txt" });
await sdk.deleteFsEntry({ path: "./notes-old.txt" });
console.log(stat);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/fs/move?sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
curl -X POST "http://127.0.0.1:2468/v1/fs/move" \
-H "Content-Type: application/json" \
-d '{"from":"./notes.txt","to":"./notes-old.txt","overwrite":true}'
curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X GET "http://127.0.0.1:2468/v1/fs/stat?path=./notes-old.txt"
curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
curl -X DELETE "http://127.0.0.1:2468/v1/fs/entry?path=./notes-old.txt"
```
</CodeGroup>
## Batch Upload (Tar)
## Batch upload (tar)
Batch upload accepts `application/x-tar` only and extracts into the destination directory. The response returns absolute paths for extracted files, capped at 1024 entries.
Batch upload accepts `application/x-tar` and extracts into the destination directory.
<CodeGroup>
```ts TypeScript
@ -153,9 +126,8 @@ import fs from "node:fs";
import path from "node:path";
import tar from "tar";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
const archivePath = path.join(process.cwd(), "skills.tar");
@ -165,9 +137,8 @@ await tar.c({
}, ["."]);
const tarBuffer = await fs.promises.readFile(archivePath);
const result = await client.uploadFsBatch(tarBuffer, {
const result = await sdk.uploadFsBatch(tarBuffer, {
path: "./skills",
sessionId: "my-session",
});
console.log(result);
@ -176,8 +147,7 @@ console.log(result);
```bash cURL
tar -cf skills.tar -C ./skills .
curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills&sessionId=my-session" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
curl -X POST "http://127.0.0.1:2468/v1/fs/upload-batch?path=./skills" \
-H "Content-Type: application/x-tar" \
--data-binary @skills.tar
```


@ -1,6 +0,0 @@
---
title: Gigacode
url: "https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode"
---

Binary file not shown (image; 1.1 MiB before, 1.7 MiB after).


@ -3,7 +3,7 @@ title: "Inspector"
description: "Debug and inspect agent sessions with the Inspector UI."
---
The Inspector is a web-based GUI for debugging and inspecting Sandbox Agent sessions. Use it to view events, send messages, and troubleshoot agent behavior in real-time.
The Inspector is a web UI for inspecting Sandbox Agent sessions. Use it to view events, inspect payloads, and troubleshoot behavior.
<Frame>
<img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
@ -11,34 +11,56 @@ The Inspector is a web-based GUI for debugging and inspecting Sandbox Agent sess
## Open the Inspector
The Inspector UI is served at `/ui/` on your sandbox-agent server. For example, if your server is running at `http://localhost:2468`, open `http://localhost:2468/ui/` in your browser.
The Inspector is served at `/ui/` on your Sandbox Agent server.
For example, if your server runs at `http://localhost:2468`, open `http://localhost:2468/ui/`.
You can also generate a pre-filled Inspector URL with authentication from the TypeScript SDK:
You can also generate a pre-filled Inspector URL from the SDK:
```typescript
import { buildInspectorUrl } from "sandbox-agent";
const url = buildInspectorUrl({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
console.log(url);
// http://127.0.0.1:2468/ui/?token=...
// http://127.0.0.1:2468/ui/
```
## Features
- **Session list**: View all active sessions and their status
- **Event stream**: See events in real-time as they arrive (SSE or polling)
- **Event details**: Expand any event to see its full JSON payload
- **Send messages**: Post messages to a session directly from the UI
- **Agent selection**: Switch between agents and modes
- **Request log**: View raw HTTP requests and responses for debugging
- Session list
- Event stream view
- Event JSON inspector
- Prompt testing
- Request/response debugging
- Interactive permission prompts (approve, always-allow, or reject tool-use requests)
- Desktop panel for status, remediation, start/stop, and screenshot refresh
- Process management (create, stop, kill, delete, view logs)
- Interactive PTY terminal for tty processes
- One-shot command execution
## When to Use
## When to use
The Inspector is useful for:
- Development: validate session behavior quickly
- Debugging: inspect raw event payloads
- Integration work: compare UI behavior with SDK/API calls
- **Development**: Test your integration without writing client code
- **Debugging**: Inspect event payloads and timing issues
- **Learning**: Understand how agents respond to different prompts
## Process terminal
The Inspector includes an embedded Ghostty-based terminal for interactive tty
processes. The UI uses the SDK's high-level `connectProcessTerminal(...)`
wrapper via the shared `@sandbox-agent/react` `ProcessTerminal` component.
## Desktop panel
The `Desktop` panel shows the current desktop runtime state, missing dependencies,
the suggested install command, last error details, process/log paths, and the
latest captured screenshot.
Use it to:
- Check whether desktop dependencies are installed
- Start or stop the managed desktop runtime
- Refresh desktop status
- Capture a fresh screenshot on demand

250
docs/llm-credentials.mdx Normal file

@ -0,0 +1,250 @@
---
title: "LLM Credentials"
description: "Strategies for providing LLM provider credentials to agents."
icon: "key"
---
Sandbox Agent needs LLM provider credentials (Anthropic, OpenAI, etc.) to run agent sessions.
## Configuration
Pass credentials via `spawn.env` when starting a sandbox. Each call to `SandboxAgent.start()` can use different credentials:
```typescript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.start({
spawn: {
env: {
ANTHROPIC_API_KEY: "sk-ant-...",
OPENAI_API_KEY: "sk-...",
},
},
});
```
Each agent requires credentials from a specific provider. Sandbox Agent checks environment variables (including those passed via `spawn.env`) and host config files:
| Agent | Provider | Environment variables | Config files |
|-------|----------|----------------------|--------------|
| Claude Code | Anthropic | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | `~/.claude.json`, `~/.claude/.credentials.json` |
| Amp | Anthropic | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | `~/.amp/config.json` |
| Codex | OpenAI | `OPENAI_API_KEY`, `CODEX_API_KEY` | `~/.codex/auth.json` |
| OpenCode | Anthropic or OpenAI | `ANTHROPIC_API_KEY`, `OPENAI_API_KEY` | `~/.local/share/opencode/auth.json` |
| Mock | None | - | - |
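The environment-variable column of the table above can be turned into a small pre-flight check. The sketch below is a hypothetical helper, not part of the Sandbox Agent SDK. It assumes that any one of the listed variables is enough for an agent, that the agent identifiers are `claude`, `amp`, `codex`, `opencode`, and `mock`, and it ignores the host config files entirely.

```typescript
// Hypothetical helper: not part of the sandbox-agent SDK.
// Encodes only the "Environment variables" column of the table above.
type AgentId = "claude" | "amp" | "codex" | "opencode" | "mock";

const CREDENTIAL_ENV_VARS: Record<AgentId, string[]> = {
  claude: ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
  amp: ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
  codex: ["OPENAI_API_KEY", "CODEX_API_KEY"],
  opencode: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
  mock: [], // the mock agent needs no credentials
};

// Returns true when at least one accepted variable is set to a non-empty value.
// Assumption: a single matching variable is sufficient for the agent.
function hasEnvCredentials(
  agent: AgentId,
  env: Record<string, string | undefined>,
): boolean {
  const accepted = CREDENTIAL_ENV_VARS[agent];
  if (accepted.length === 0) return true;
  return accepted.some((name) => (env[name] ?? "").length > 0);
}
```

Running a check like this against the object you pass as `spawn.env` can surface a missing key before a session is created, rather than as an agent error afterwards.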
## Credential strategies
LLM credentials are passed into the sandbox as environment variables. The agent and everything else running inside the sandbox can read them, so it's important to choose the right strategy for how you provision and scope these credentials.
| Strategy | Who pays | Cost attribution | Best for |
|----------|----------|-----------------|----------|
| **Per-tenant gateway** (recommended) | Your organization, billed back per tenant | Per-tenant keys with budgets | Multi-tenant SaaS, usage-based billing |
| **Bring your own key** | Each user (usage-based) | Per-user by default | Dev environments, internal tools |
| **Shared API key** | Your organization | None (single bill) | Single-tenant apps, internal platforms |
| **Personal subscription** | Each user (existing subscription) | Per-user by default | Local dev, internal tools where users have Claude or Codex subscriptions |
### Per-tenant gateway (recommended)
Route LLM traffic through a gateway that mints per-tenant API keys, each with its own spend tracking and budget limits.
```mermaid
graph LR
B[Your Backend] -->|tenant key| S[Sandbox]
S -->|LLM requests| G[Gateway]
G -->|scoped key| P[LLM Provider]
```
Your backend issues a scoped key per tenant, then passes it to the sandbox. This is the typical pattern when using sandbox providers (E2B, Daytona, Docker).
```typescript expandable
import { SandboxAgent } from "sandbox-agent";
async function createTenantSandbox(tenantId: string) {
// Issue a scoped key for this tenant via OpenRouter
const res = await fetch("https://openrouter.ai/api/v1/keys", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.OPENROUTER_PROVISIONING_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
name: `tenant-${tenantId}`,
limit: 50,
limitResetType: "monthly",
}),
});
const { key } = await res.json();
// Start a sandbox with the tenant's scoped key
const sdk = await SandboxAgent.start({
spawn: {
env: {
OPENAI_API_KEY: key, // OpenRouter uses OpenAI-compatible endpoints
},
},
});
const session = await sdk.createSession({
agent: "codex", // Codex reads OPENAI_API_KEY; Claude Code would need ANTHROPIC_API_KEY
sessionInit: { cwd: "/workspace" },
});
return { sdk, session };
}
```
#### Security
Recommended for multi-tenant applications. Each tenant gets a scoped key with its own budget, so exfiltration only exposes that tenant's allowance.
#### Use cases
- **Multi-tenant SaaS**: per-tenant spend tracking and budget limits
- **Production apps**: exposed to end users who need isolated credentials
- **Usage-based billing**: each tenant pays for their own consumption
#### Choosing a gateway
<AccordionGroup>
<Accordion title="OpenRouter provisioned keys" icon="cloud">
Managed service, zero infrastructure. [OpenRouter](https://openrouter.ai/docs/features/provisioning-api-keys) provides per-tenant API keys with spend tracking and budget limits via their Provisioning API. Pass the tenant key to Sandbox Agent as `OPENAI_API_KEY` (OpenRouter uses OpenAI-compatible endpoints).
```bash
# Create a key for a tenant with a $50/month budget
curl https://openrouter.ai/api/v1/keys \
-H "Authorization: Bearer $PROVISIONING_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "tenant-acme",
"limit": 50,
"limitResetType": "monthly"
}'
```
Easiest to set up but not open-source. See [OpenRouter pricing](https://openrouter.ai/docs/framework/pricing) for details.
</Accordion>
<Accordion title="LiteLLM proxy" icon="server">
Self-hosted, open-source (MIT). [LiteLLM](https://github.com/BerriAI/litellm) is an OpenAI-compatible proxy with hierarchical budgets (org, team, user, key), virtual keys, and spend tracking. Requires Python + PostgreSQL.
```bash
# Create a team (tenant) with a $500 budget
curl http://litellm:4000/team/new \
-H "Authorization: Bearer $LITELLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{
"team_alias": "tenant-acme",
"max_budget": 500
}'
# Generate a key for that team
curl http://litellm:4000/key/generate \
-H "Authorization: Bearer $LITELLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{
"team_id": "team-abc123",
"max_budget": 100
}'
```
Full control with no vendor lock-in. Organization-level features require an enterprise license.
</Accordion>
<Accordion title="Portkey gateway" icon="code-branch">
Self-hosted, open-source (Apache 2.0). [Portkey](https://github.com/Portkey-AI/gateway) is a lightweight OpenAI-compatible gateway supporting 200+ providers. Single binary, no database required. Create virtual keys with per-tenant budget limits and pass them to Sandbox Agent.
Lightest operational footprint of the self-hosted options. Observability and analytics require the managed platform or your own tooling.
</Accordion>
</AccordionGroup>
To bill tenants for LLM usage, use [Stripe token billing](https://docs.stripe.com/billing/token-billing) (integrates natively with OpenRouter) or query your gateway's spend API and feed usage into your billing system.
### Bring your own key
Each user provides their own API key. Users are billed directly by the LLM provider with no additional infrastructure needed.
Pass the user's key via `spawn.env`:
```typescript
const sdk = await SandboxAgent.start({
spawn: {
env: {
ANTHROPIC_API_KEY: userProvidedKey,
},
},
});
```
#### Security
API keys are typically long-lived. The key is visible to the agent and anything running inside the sandbox, so exfiltration is possible. This is usually acceptable for developer-facing tools where the user owns the key.
#### Use cases
- **Developer tools**: each user manages their own API key
- **Internal platforms**: users already have LLM provider accounts
- **Per-user billing**: no extra infrastructure needed
### Shared API key
A single organization-wide API key is used for all sessions. All token usage appears on one bill with no per-user or per-tenant cost attribution.
```typescript
const sdk = await SandboxAgent.start({
spawn: {
env: {
ANTHROPIC_API_KEY: process.env.ORG_ANTHROPIC_KEY!,
OPENAI_API_KEY: process.env.ORG_OPENAI_KEY!,
},
},
});
```
If you need to track or limit spend per tenant, use a per-tenant gateway instead.
#### Security
Not recommended for anything other than internal tooling. A single exfiltrated key exposes your organization's entire LLM budget. If you need org-paid credentials for external users, use a per-tenant gateway with scoped keys instead.
#### Use cases
- **Single-tenant apps**: small number of users, one bill
- **Prototyping**: cost attribution not needed yet
- **Simplicity over security**: acceptable when exfiltration risk is low
### Personal subscription
If the user is signed into Claude Code or Codex on the host machine, Sandbox Agent automatically picks up their OAuth tokens. No configuration is needed.
#### Remote sandboxes
Extract credentials locally and pass them to a remote sandbox via `spawn.env`:
```bash
$ sandbox-agent credentials extract-env
ANTHROPIC_API_KEY=sk-ant-...
CLAUDE_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
CODEX_API_KEY=sk-...
```
Use `-e` to prefix with `export` for shell sourcing.
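To feed that output into `spawn.env`, it has to be parsed into a key/value object first. The sketch below is one way to do that, assuming the command prints one `KEY=value` pair per line and that `-e` only adds an `export ` prefix. `parseEnvOutput` is a hypothetical helper, and values are taken verbatim (no quote or escape handling).

```typescript
// Hypothetical helper: parses `KEY=value` lines, optionally prefixed with `export `.
// Assumption: values are unquoted, so they are copied verbatim.
function parseEnvOutput(output: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of output.split("\n")) {
    // Drop surrounding whitespace and an optional leading `export `.
    const line = rawLine.trim().replace(/^export\s+/, "");
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines that have no key before `=`
    env[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return env;
}
```

The returned object has the same shape as the `env` map shown in the `SandboxAgent.start()` examples above.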
#### Security
Personal subscriptions use OAuth tokens with a limited lifespan. These are the same credentials used when running an agent normally on the host. If a token is exfiltrated from the sandbox, the exposure window is short.
#### Use cases
- **Local development**: users are already signed into Claude Code or Codex
- **Internal tools**: every user has their own subscription
- **Prototyping**: no key management needed


@ -6,8 +6,6 @@ icon: "database"
Sandbox Agent stores sessions in memory only. When the server restarts or the sandbox is destroyed, all session data is lost. It's your responsibility to persist events to your own database.
See the [Building a Chat UI](/building-chat-ui) guide for understanding session lifecycle events like `session.started` and `session.ended`.
## Recommended approach
1. Store events to your database as they arrive
@ -18,17 +16,18 @@ This prevents duplicate writes and lets you recover from disconnects.
## Receiving Events
Two ways to receive events: SSE streaming (recommended) or polling.
Two ways to receive events: streaming (recommended) or polling.
### Streaming
Use SSE for real-time events with automatic reconnection support.
Use streaming for real-time events with automatic reconnection support.
```typescript
import { SandboxAgent } from "sandbox-agent";
import { SandboxAgentClient } from "sandbox-agent";
const client = await SandboxAgent.connect({
const client = new SandboxAgentClient({
baseUrl: "http://127.0.0.1:2468",
agent: "mock",
});
// Get offset from last stored event (0 returns all events)
@ -43,7 +42,7 @@ for await (const event of client.streamEvents("my-session", { offset })) {
### Polling
If you can't use SSE streaming, poll the events endpoint:
If you can't use streaming, poll the events endpoint:
```typescript
const lastEvent = await db.getLastEvent("my-session");
@ -130,7 +129,10 @@ const codingSession = actor({
},
createVars: async (c): Promise<CodingSessionVars> => {
const client = await SandboxAgent.connect({ baseUrl: c.state.baseUrl });
const client = new SandboxAgentClient({
baseUrl: c.state.baseUrl,
agent: "mock",
});
await client.createSession(c.state.sessionId, { agent: "claude" });
return { client };
},
@ -240,7 +242,7 @@ const events = await redis.lrange(`session:${sessionId}`, offset, -1);
## Handling disconnects
The SSE stream may disconnect due to network issues. Handle reconnection gracefully:
The event stream may disconnect due to network issues. Handle reconnection gracefully:
```typescript
async function streamWithRetry(sessionId: string) {


@ -5,118 +5,78 @@ sidebarTitle: "MCP"
icon: "plug"
---
MCP (Model Context Protocol) servers extend agents with tools. Sandbox Agent can auto-load MCP servers when a session starts by passing an `mcp` map in the create-session request.
MCP (Model Context Protocol) servers extend agents with tools and external context.
## Session Config
## Configuring MCP servers
The `mcp` field is a map of server name to config. Use `type: "local"` for stdio servers and `type: "remote"` for HTTP/SSE servers:
The HTTP config endpoints let you store and retrieve MCP server configs, keyed by directory and server name.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-mcp", {
agent: "claude",
mcp: {
filesystem: {
type: "local",
command: "my-mcp-server",
args: ["--root", "."],
},
github: {
type: "remote",
url: "https://example.com/mcp",
headers: {
Authorization: "Bearer ${GITHUB_TOKEN}",
},
},
```ts
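import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
// Create MCP config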
await sdk.setMcpConfig(
{
directory: "/workspace",
mcpName: "github",
},
{
type: "remote",
url: "https://example.com/mcp",
},
);
// Create a session using the configured MCP servers
const session = await sdk.createSession({
agent: "claude",
cwd: "/workspace",
});
await session.prompt([
{ type: "text", text: "Use available MCP servers to help with this task." },
]);
// Get MCP config
const config = await sdk.getMcpConfig({
directory: "/workspace",
mcpName: "github",
});
console.log(config.type);
// Delete MCP config
await sdk.deleteMcpConfig({
directory: "/workspace",
mcpName: "github",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-mcp" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"mcp": {
"filesystem": {
"type": "local",
"command": "my-mcp-server",
"args": ["--root", "."]
},
"github": {
"type": "remote",
"url": "https://example.com/mcp",
"headers": {
"Authorization": "Bearer ${GITHUB_TOKEN}"
}
}
}
}'
```
## Config fields
</CodeGroup>
## Config Fields
### Local Server
Stdio servers that run inside the sandbox.
### Local server
| Field | Description |
|---|---|
| `type` | `local` |
| `command` | string or array (`["node", "server.js"]`) |
| `args` | array of string arguments |
| `env` | environment variables map |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `cwd` | working directory for the MCP process |
| `command` | executable path |
| `args` | array of CLI args |
| `env` | environment variable map |
| `cwd` | working directory |
| `enabled` | enable/disable server |
| `timeoutMs` | timeout override |
```json
{
"type": "local",
"command": ["node", "./mcp/server.js"],
"args": ["--root", "."],
"env": { "LOG_LEVEL": "debug" },
"cwd": "/workspace"
}
```
### Remote Server
HTTP/SSE servers accessed over the network.
### Remote server
| Field | Description |
|---|---|
| `type` | `remote` |
| `url` | MCP server URL |
| `headers` | static headers map |
| `bearerTokenEnvVar` | env var name to inject into `Authorization: Bearer ...` |
| `envHeaders` | map of header name to env var name |
| `oauth` | object with `clientId`, `clientSecret`, `scope`, or `false` to disable |
| `enabled` | enable or disable the server |
| `timeoutMs` | tool timeout override |
| `transport` | `http` or `sse` |
| `headers` | static headers map |
| `bearerTokenEnvVar` | env var name to inject in auth header |
| `envHeaders` | header name to env var map |
| `oauth` | optional OAuth config object |
| `enabled` | enable/disable server |
| `timeoutMs` | timeout override |
```json
{
"type": "remote",
"url": "https://example.com/mcp",
"headers": { "x-client": "sandbox-agent" },
"bearerTokenEnvVar": "MCP_TOKEN",
"transport": "sse"
}
```
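The two field tables above can be read as a discriminated union on `type`. The sketch below is an illustration only: the type names are made up here and the SDK may export its own, which fields are optional is an assumption, and the `oauth` shape follows the `clientId` / `clientSecret` / `scope` description rather than a confirmed schema.

```typescript
// Illustrative types only; names and optionality are assumptions, not SDK exports.
type LocalMcpConfig = {
  type: "local";
  command: string | string[];
  args?: string[];
  env?: Record<string, string>;
  cwd?: string;
  enabled?: boolean;
  timeoutMs?: number;
};

type RemoteMcpConfig = {
  type: "remote";
  url: string;
  transport?: "http" | "sse";
  headers?: Record<string, string>;
  bearerTokenEnvVar?: string;
  envHeaders?: Record<string, string>;
  oauth?: { clientId?: string; clientSecret?: string; scope?: string } | false;
  enabled?: boolean;
  timeoutMs?: number;
};

type McpConfig = LocalMcpConfig | RemoteMcpConfig;

// Narrows on the `type` discriminant and returns a one-line description.
function describeMcpConfig(config: McpConfig): string {
  if (config.type === "local") {
    const command = Array.isArray(config.command)
      ? config.command.join(" ")
      : config.command;
    return `local: ${[command, ...(config.args ?? [])].join(" ")}`;
  }
  return `remote: ${config.url} (${config.transport ?? "default transport"})`;
}
```

Narrowing on `type` lets the compiler reject mixed configs, such as a `local` entry that carries a `url`, before they are sent to the server.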
## Custom MCP Servers
## Custom MCP servers
To bundle and upload your own MCP server into the sandbox, see [Custom Tools](/custom-tools).

147
docs/multiplayer.mdx Normal file

@ -0,0 +1,147 @@
---
title: "Multiplayer"
description: "Use Rivet Actors to coordinate shared sessions."
icon: "users"
---
For multiplayer orchestration, use [Rivet Actors](https://rivet.dev/docs/actors).
Recommended model:
- One actor per collaborative workspace/thread.
- The actor owns Sandbox Agent session lifecycle and persistence.
- Clients connect to the actor and receive realtime broadcasts.
Use [actor keys](https://rivet.dev/docs/actors/keys) to map each workspace to one actor, [events](https://rivet.dev/docs/actors/events) for realtime updates, and [lifecycle hooks](https://rivet.dev/docs/actors/lifecycle) for cleanup.
## Example
<CodeGroup>
```ts Actor (server)
import { actor, setup } from "rivetkit";
import { SandboxAgent, type SessionPersistDriver, type SessionRecord, type SessionEvent, type ListPageRequest, type ListPage, type ListEventsRequest } from "sandbox-agent";
interface RivetPersistData { sessions: Record<string, SessionRecord>; events: Record<string, SessionEvent[]>; }
type RivetPersistState = { _sandboxAgentPersist: RivetPersistData };
class RivetSessionPersistDriver implements SessionPersistDriver {
private readonly stateKey: string;
private readonly ctx: { state: Record<string, unknown> };
constructor(ctx: { state: Record<string, unknown> }, options: { stateKey?: string } = {}) {
this.ctx = ctx;
this.stateKey = options.stateKey ?? "_sandboxAgentPersist";
if (!this.ctx.state[this.stateKey]) {
this.ctx.state[this.stateKey] = { sessions: {}, events: {} };
}
}
private get data(): RivetPersistData { return this.ctx.state[this.stateKey] as RivetPersistData; }
async getSession(id: string) { const s = this.data.sessions[id]; return s ? { ...s } : undefined; }
async listSessions(request: ListPageRequest = {}): Promise<ListPage<SessionRecord>> {
const sorted = Object.values(this.data.sessions).sort((a, b) => a.createdAt - b.createdAt || a.id.localeCompare(b.id));
const offset = Number(request.cursor ?? 0);
const limit = request.limit ?? 100;
const slice = sorted.slice(offset, offset + limit);
return { items: slice, nextCursor: offset + slice.length < sorted.length ? String(offset + slice.length) : undefined };
}
async updateSession(session: SessionRecord) { this.data.sessions[session.id] = { ...session }; if (!this.data.events[session.id]) this.data.events[session.id] = []; }
async listEvents(request: ListEventsRequest): Promise<ListPage<SessionEvent>> {
const all = [...(this.data.events[request.sessionId] ?? [])].sort((a, b) => a.eventIndex - b.eventIndex || a.id.localeCompare(b.id));
const offset = Number(request.cursor ?? 0);
const limit = request.limit ?? 100;
const slice = all.slice(offset, offset + limit);
return { items: slice, nextCursor: offset + slice.length < all.length ? String(offset + slice.length) : undefined };
}
async insertEvent(sessionId: string, event: SessionEvent) { const events = this.data.events[sessionId] ?? []; events.push({ ...event, payload: JSON.parse(JSON.stringify(event.payload)) }); this.data.events[sessionId] = events; }
}
type WorkspaceState = RivetPersistState & {
sandboxId: string;
baseUrl: string;
};
export const workspace = actor({
createState: async () => {
return {
sandboxId: "sbx_123",
baseUrl: "http://127.0.0.1:2468",
} satisfies Partial<WorkspaceState>;
},
createVars: async (c) => {
const persist = new RivetSessionPersistDriver(c);
const sdk = await SandboxAgent.connect({
baseUrl: c.state.baseUrl,
persist,
});
const session = await sdk.resumeOrCreateSession({ id: "default", agent: "codex" });
const unsubscribe = session.onEvent((event) => {
c.broadcast("session.event", event);
});
return { sdk, session, unsubscribe };
},
actions: {
getSessionInfo: (c) => ({
workspaceId: c.key[0],
sandboxId: c.state.sandboxId,
}),
prompt: async (c, input: { userId: string; text: string }) => {
c.broadcast("chat.user", {
userId: input.userId,
text: input.text,
createdAt: Date.now(),
});
await c.vars.session.prompt([{ type: "text", text: input.text }]);
},
},
onSleep: async (c) => {
c.vars.unsubscribe?.();
await c.vars.sdk.dispose();
},
});
export const registry = setup({
use: { workspace },
});
```
```ts Client (browser)
import { createClient } from "rivetkit/client";
import type { registry } from "./actors";
const client = createClient<typeof registry>({
endpoint: process.env.NEXT_PUBLIC_RIVET_ENDPOINT!,
});
const workspaceId = "workspace-42";
const room = client.workspace.getOrCreate([workspaceId]);
const conn = room.connect();
conn.on("chat.user", (event) => {
console.log("user message", event);
});
conn.on("session.event", (event) => {
console.log("sandbox event", event);
});
await conn.prompt({
userId: "user-123",
text: "Propose a refactor plan for auth middleware.",
});
```
</CodeGroup>
## Notes
- Keep sandbox calls actor-only. Browser clients should not call Sandbox Agent directly.
- Copy the Rivet persist driver from the example above into your project so session history persists in actor state.
- For client connection patterns, see [Rivet JavaScript client](https://rivet.dev/docs/clients/javascript).

64
docs/observability.mdx Normal file

@ -0,0 +1,64 @@
---
title: "Observability"
description: "Track session activity with OpenTelemetry."
icon: "chart-line"
---
Use OpenTelemetry to instrument session traffic, then ship telemetry to your collector/backend.
## Common collectors and backends
- [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
- [Jaeger](https://www.jaegertracing.io/)
- [Grafana Tempo](https://grafana.com/oss/tempo/)
- [Honeycomb](https://www.honeycomb.io/)
- [Datadog APM](https://docs.datadoghq.com/tracing/)
## Example: trace a prompt round-trip
Wrap `session.prompt()` in a span to measure the full round-trip, then log individual events as span events.
This example assumes your OpenTelemetry provider and exporter are already configured.
```ts
import { trace } from "@opentelemetry/api";
import { SandboxAgent } from "sandbox-agent";
const tracer = trace.getTracer("my-app/sandbox-agent");
const sdk = await SandboxAgent.connect({
baseUrl: process.env.SANDBOX_URL!,
});
const session = await sdk.createSession({ agent: "mock" });
// Log each event as an OTEL span event on the active span
const unsubscribe = session.onEvent((event) => {
const activeSpan = trace.getActiveSpan();
if (!activeSpan) return;
activeSpan.addEvent("session.event", {
"sandbox.sender": event.sender,
"sandbox.event_index": event.eventIndex,
});
});
// The span covers the full prompt round-trip
await tracer.startActiveSpan("sandbox_agent.prompt", async (span) => {
span.setAttribute("sandbox.session_id", session.id);
try {
const result = await session.prompt([
{ type: "text", text: "Summarize this repository." },
]);
span.setAttribute("sandbox.stop_reason", result.stopReason);
} catch (error) {
span.recordException(error as Error);
throw error;
} finally {
span.end();
}
});
unsubscribe();
```

File diff suppressed because it is too large.


@ -4,52 +4,40 @@ description: "Connect OpenCode clients, SDKs, and web UI to Sandbox Agent."
---
<Warning>
**Experimental**: OpenCode SDK & UI support is experimental and may change without notice.
**Experimental**: OpenCode SDK/UI compatibility may change.
</Warning>
Sandbox Agent exposes an OpenCode-compatible API, allowing you to connect any OpenCode client, SDK, or web UI to control coding agents running inside sandboxes.
Sandbox Agent exposes an OpenCode-compatible API at `/opencode`.
## Why Use OpenCode Clients with Sandbox Agent?
## Why use OpenCode clients with Sandbox Agent?
OpenCode provides a rich ecosystem of clients:
- OpenCode CLI (`opencode attach`)
- OpenCode web UI
- OpenCode TypeScript SDK (`@opencode-ai/sdk`)
- **OpenCode CLI** (`opencode attach`): Terminal-based interface
- **OpenCode Web UI**: Browser-based chat interface
- **OpenCode SDK** (`@opencode-ai/sdk`): Rich TypeScript SDK
## Quick start
## Quick Start
### Using OpenCode CLI & TUI
Sandbox Agent provides an all-in-one command to setup Sandbox Agent and connect an OpenCode session, great for local development:
### OpenCode CLI / TUI
```bash
sandbox-agent opencode --port 2468 --no-token
```
Or, start the server and attach separately:
Or start the server and attach manually:
```bash
# Start sandbox-agent
sandbox-agent server --no-token --host 127.0.0.1 --port 2468
# Attach OpenCode CLI
opencode attach http://localhost:2468/opencode
```
With authentication enabled:
```bash
# Start with token
sandbox-agent server --token "$SANDBOX_TOKEN" --host 127.0.0.1 --port 2468
# Attach with password
opencode attach http://localhost:2468/opencode --password "$SANDBOX_TOKEN"
```
### Using the OpenCode Web UI
The OpenCode web UI can connect to Sandbox Agent for a full browser-based experience.
### OpenCode web UI
<Steps>
<Step title="Start Sandbox Agent with CORS">
@ -57,7 +45,7 @@ The OpenCode web UI can connect to Sandbox Agent for a full browser-based experi
sandbox-agent server --no-token --host 127.0.0.1 --port 2468 --cors-allow-origin http://127.0.0.1:5173
```
</Step>
<Step title="Clone and Start the OpenCode Web App">
<Step title="Run OpenCode web app">
```bash
git clone https://github.com/anomalyco/opencode
cd opencode/packages/app
@ -67,31 +55,22 @@ The OpenCode web UI can connect to Sandbox Agent for a full browser-based experi
bun run dev -- --host 127.0.0.1 --port 5173
```
</Step>
<Step title="Open the UI">
Navigate to `http://127.0.0.1:5173/` in your browser.
<Step title="Open UI">
Visit `http://127.0.0.1:5173/`.
</Step>
</Steps>
<Note>
If you see `Error: Could not connect to server`, check that:
- The sandbox-agent server is running
- `--cors-allow-origin` matches the **exact** browser origin (`localhost` and `127.0.0.1` are different origins)
</Note>
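The origin mismatch described in the note can be checked with the standard `URL` API. This is a minimal sketch; `sameOrigin` is a hypothetical helper, not a Sandbox Agent API, and it only compares the scheme, host, and port the way a browser does when it derives an origin.

```typescript
// Hypothetical helper: compares two URLs by origin (scheme + host + port).
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// `localhost` and `127.0.0.1` are different hosts, so the origins differ.
console.log(sameOrigin("http://localhost:5173", "http://127.0.0.1:5173")); // false

// Paths are not part of the origin, so these two match.
console.log(sameOrigin("http://127.0.0.1:5173/", "http://127.0.0.1:5173/app")); // true
```

If the first comparison describes your setup, pass the origin shown in the browser's address bar to `--cors-allow-origin`.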
### Using OpenCode SDK
### OpenCode SDK
```typescript
import { createOpencodeClient } from "@opencode-ai/sdk";
const client = createOpencodeClient({
baseUrl: "http://localhost:2468/opencode",
headers: { Authorization: "Bearer YOUR_TOKEN" }, // if using auth
});
// Create a session
const session = await client.session.create();
// Send a prompt
await client.session.promptAsync({
path: { id: session.data.id },
body: {
@ -99,7 +78,6 @@ await client.session.promptAsync({
},
});
// Subscribe to events
const events = await client.event.subscribe({});
for await (const event of events.stream) {
console.log(event);
@ -108,42 +86,40 @@ for await (const event of events.stream) {
## Notes
- **API Routing**: The OpenCode API is available at the `/opencode` base path
- **Authentication**: If sandbox-agent is started with `--token`, include `Authorization: Bearer <token>` header or use `--password` flag with CLI
- **CORS**: When using the web UI from a different origin, configure `--cors-allow-origin`
- **Provider Selection**: Use the provider/model selector in the UI to choose which backing agent to use (claude, codex, opencode, amp)
- **Models & Variants**: Providers are grouped by backing agent (e.g. Claude Code, Codex, Amp). OpenCode models are grouped by `OpenCode (<provider>)` to preserve their native provider grouping. Each model keeps its real model ID, and variants are exposed when available (Codex/OpenCode/Amp).
- **Optional Native Proxy for TUI/Config Endpoints**: Set `OPENCODE_COMPAT_PROXY_URL` (for example `http://127.0.0.1:4096`) to proxy select OpenCode-native endpoints to a real OpenCode server. This currently applies to `/command`, `/config`, `/global/config`, and `/tui/*`. If not set, sandbox-agent uses its built-in compatibility handlers.
- API base path: `/opencode`
- If server auth is enabled, pass bearer auth (or `--password` in OpenCode CLI)
- For browser UIs, configure CORS with `--cors-allow-origin`
- Provider selector currently exposes compatible providers (`mock`, `amp`, `claude`, `codex`)
- Provider/model metadata for compatibility endpoints is normalized and may differ from native OpenCode grouping
- Optional proxy: set `OPENCODE_COMPAT_PROXY_URL` to forward selected endpoints to native OpenCode
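The proxy rule in the notes above can be sketched as a small routing predicate. This only mirrors the documented behavior; `isProxied` is a hypothetical name, not the server's actual code, and paths are relative to the `/opencode` base path.

```typescript
// Endpoints the notes list as proxied when OPENCODE_COMPAT_PROXY_URL is set.
const PROXIED_EXACT = ["/command", "/config", "/global/config"];
const PROXIED_PREFIX = "/tui/";

// Hypothetical helper: returns true when a request would be forwarded
// to native OpenCode under the documented rule.
function isProxied(path: string, proxyUrl: string | undefined): boolean {
  // Without a proxy URL, the built-in compatibility handlers are used.
  if (!proxyUrl) return false;
  return PROXIED_EXACT.includes(path) || path.startsWith(PROXIED_PREFIX);
}

console.log(isProxied("/config", "http://127.0.0.1:4096"));  // true
console.log(isProxied("/session", "http://127.0.0.1:4096")); // false
```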
## Endpoint Coverage
See the full endpoint compatibility table below. Most endpoints are functional for session management, messaging, and event streaming. Some endpoints return stub responses for features not yet implemented.
## Endpoint coverage
<Accordion title="Endpoint Status Table">
| Endpoint | Status | Notes |
|---|---|---|
| `GET /event` | ✓ | Emits events for session/message updates (SSE) |
| `GET /global/event` | ✓ | Wraps events in GlobalEvent format (SSE) |
| `GET /session` | ✓ | In-memory session store |
| `POST /session` | ✓ | Create new sessions |
| `GET /session/{id}` | ✓ | Get session details |
| `POST /session/{id}/message` | ✓ | Send messages to session |
| `GET /session/{id}/message` | ✓ | Get session messages |
| `GET /permission` | ✓ | List pending permissions |
| `POST /permission/{id}/reply` | ✓ | Respond to permission requests |
| `GET /question` | ✓ | List pending questions |
| `POST /question/{id}/reply` | ✓ | Answer agent questions |
| `GET /provider` | ✓ | Returns provider metadata |
| `GET /command` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `GET /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub response |
| `PATCH /global/config` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `/tui/*` | ↔ | Proxied to native OpenCode when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise local compatibility behavior |
| `GET /agent` | | Returns agent list |
| *other endpoints* | | Return empty/stub responses |
| `GET /event` | ✓ | Session/message updates (SSE) |
| `GET /global/event` | ✓ | GlobalEvent-wrapped stream |
| `GET /session` | ✓ | Session list |
| `POST /session` | ✓ | Create session |
| `GET /session/{id}` | ✓ | Session details |
| `POST /session/{id}/message` | ✓ | Send message |
| `GET /session/{id}/message` | ✓ | Session messages |
| `GET /permission` | ✓ | Pending permissions |
| `POST /permission/{id}/reply` | ✓ | Permission reply |
| `GET /question` | ✓ | Pending questions |
| `POST /question/{id}/reply` | ✓ | Question reply |
| `GET /provider` | ✓ | Provider metadata |
| `GET /command` | ↔ | Proxied when `OPENCODE_COMPAT_PROXY_URL` is set; otherwise stub |
| `GET /config` | ↔ | Proxied when set; otherwise stub |
| `PATCH /config` | ↔ | Proxied when set; otherwise local compatibility behavior |
| `GET /global/config` | ↔ | Proxied when set; otherwise stub |
| `PATCH /global/config` | ↔ | Proxied when set; otherwise local compatibility behavior |
| `/tui/*` | ↔ | Proxied when set; otherwise local compatibility behavior |
| `GET /agent` | | Agent list |
| *other endpoints* | | Empty/stub responses |
✓ Functional &nbsp;&nbsp; ↔ Proxied (optional) &nbsp;&nbsp; Stubbed
✓ Functional ↔ Proxied optional Stubbed
</Accordion>


@ -0,0 +1,43 @@
---
title: "Orchestration Architecture"
description: "Production topology, backend requirements, and session persistence."
icon: "sitemap"
---
This page covers production topology and backend requirements. Read [Architecture](/architecture) first for an overview of how the server, SDK, and agent processes fit together.
## Suggested Topology
Run the SDK on your backend, then call it from your frontend.
This extra hop is recommended because it keeps auth/token logic on the backend and makes persistence simpler.
```mermaid placement="top-right"
flowchart LR
BROWSER["Browser"]
subgraph BACKEND["Your backend"]
direction TB
SDK["Sandbox Agent SDK"]
end
subgraph SANDBOX_SIMPLE["Sandbox"]
SERVER_SIMPLE["Sandbox Agent server"]
end
BROWSER --> BACKEND
BACKEND --> SDK --> SERVER_SIMPLE
```
### Backend requirements
Your backend layer needs to handle:
- **Long-running connections**: prompts can take minutes.
- **Session affinity**: follow-up messages must reach the same session.
- **State between requests**: session metadata and event history must persist across requests.
- **Graceful recovery**: sessions should resume after backend restarts.
We recommend [Rivet](https://rivet.dev) over serverless because actors natively support the long-lived connections, session routing, and state persistence that agent workloads require.
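The session-affinity requirement above amounts to remembering which sandbox owns each session. A minimal sketch, assuming an in-memory map; `SessionRouter` is a hypothetical class, and a real backend would store this mapping durably so it survives restarts.

```typescript
// Hypothetical registry mapping session IDs to sandbox base URLs.
// In-memory only; production code would persist this mapping.
class SessionRouter {
  private readonly routes = new Map<string, string>();

  // Record which sandbox owns a session when the session is created.
  assign(sessionId: string, sandboxBaseUrl: string): void {
    this.routes.set(sessionId, sandboxBaseUrl);
  }

  // Resolve the sandbox for a follow-up message; throws for unknown sessions.
  resolve(sessionId: string): string {
    const url = this.routes.get(sessionId);
    if (url === undefined) {
      throw new Error(`unknown session: ${sessionId}`);
    }
    return url;
  }
}

const router = new SessionRouter();
router.assign("session-a", "http://127.0.0.1:2468");
console.log(router.resolve("session-a"));
```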
## Session persistence
For storage driver options and replay behavior, see [Persisting Sessions](/session-persistence).

258
docs/processes.mdx Normal file

@ -0,0 +1,258 @@
---
title: "Processes"
description: "Run commands and manage long-lived processes inside the sandbox."
sidebarTitle: "Processes"
icon: "terminal"
---
The process API supports:
- **One-shot execution** — run a command to completion and capture stdout, stderr, and exit code
- **Managed processes** — spawn, list, stop, kill, and delete long-lived processes
- **Log streaming** — fetch buffered logs or follow live output
- **Terminals** — full PTY support with bidirectional WebSocket I/O
- **Configurable limits** — control concurrency, timeouts, and buffer sizes per runtime
## Run a command
Execute a command to completion and get its output.
<CodeGroup>
```ts TypeScript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
const result = await sdk.runProcess({
command: "ls",
args: ["-la", "/workspace"],
});
console.log(result.exitCode); // 0
console.log(result.stdout);
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"ls","args":["-la","/workspace"]}'
```
</CodeGroup>
You can set a timeout and cap output size:
<CodeGroup>
```ts TypeScript
const result = await sdk.runProcess({
command: "make",
args: ["build"],
timeoutMs: 60000,
maxOutputBytes: 1048576,
});
if (result.timedOut) {
console.log("Build timed out");
}
if (result.stdoutTruncated) {
console.log("Output was truncated");
}
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/run" \
-H "Content-Type: application/json" \
-d '{"command":"make","args":["build"],"timeoutMs":60000,"maxOutputBytes":1048576}'
```
</CodeGroup>
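The result flags shown above can be folded into a single outcome before deciding how to report a run. This is a sketch under assumptions: `RunResultLike` lists only the fields used in the examples above, and their exact types in the SDK are not confirmed here.

```typescript
// Assumed shape: only the fields used in the examples above.
interface RunResultLike {
  exitCode?: number | null;
  timedOut?: boolean;
  stdoutTruncated?: boolean;
}

type RunOutcome = "ok" | "failed" | "timed-out";

// Hypothetical helper: a timeout takes priority over the exit code,
// and truncation is reported separately from success or failure.
function classifyRun(result: RunResultLike): { outcome: RunOutcome; complete: boolean } {
  let outcome: RunOutcome;
  if (result.timedOut) {
    outcome = "timed-out";
  } else if (result.exitCode === 0) {
    outcome = "ok";
  } else {
    outcome = "failed";
  }
  return { outcome, complete: !result.stdoutTruncated };
}

console.log(classifyRun({ exitCode: 0, timedOut: false, stdoutTruncated: true }));
```

Keeping truncation separate from the outcome lets a caller treat a successful build with clipped output differently from a failed one.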
## Managed processes
Create a long-lived process that you can interact with, monitor, and stop later.
### Create
<CodeGroup>
```ts TypeScript
const proc = await sdk.createProcess({
command: "node",
args: ["server.js"],
cwd: "/workspace",
});
console.log(proc.id, proc.pid); // proc_1, 12345
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes" \
-H "Content-Type: application/json" \
-d '{"command":"node","args":["server.js"],"cwd":"/workspace"}'
```
</CodeGroup>
### List and get
<CodeGroup>
```ts TypeScript
const { processes } = await sdk.listProcesses();
for (const p of processes) {
console.log(p.id, p.command, p.status);
}
const proc = await sdk.getProcess("proc_1");
```
```bash cURL
curl "http://127.0.0.1:2468/v1/processes"
curl "http://127.0.0.1:2468/v1/processes/proc_1"
```
</CodeGroup>
### Stop, kill, and delete
<CodeGroup>
```ts TypeScript
// SIGTERM with optional wait
await sdk.stopProcess("proc_1", { waitMs: 5000 });
// SIGKILL
await sdk.killProcess("proc_1", { waitMs: 1000 });
// Remove exited process record
await sdk.deleteProcess("proc_1");
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/stop?waitMs=5000"
curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/kill?waitMs=1000"
curl -X DELETE "http://127.0.0.1:2468/v1/processes/proc_1"
```
</CodeGroup>
## Logs
### Fetch buffered logs
<CodeGroup>
```ts TypeScript
const logs = await sdk.getProcessLogs("proc_1", {
tail: 50,
stream: "combined",
});
for (const entry of logs.entries) {
console.log(entry.stream, atob(entry.data));
}
```
```bash cURL
curl "http://127.0.0.1:2468/v1/processes/proc_1/logs?tail=50&stream=combined"
```
</CodeGroup>
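The `atob(entry.data)` call above implies that log data arrives base64-encoded. `atob` returns one character per byte, so multi-byte UTF-8 output can come out garbled. A minimal sketch of a UTF-8-safe decode, assuming the data is base64; `decodeLogData` is a hypothetical helper.

```typescript
// Hypothetical helper: decode base64 log data as UTF-8 text.
function decodeLogData(base64: string): string {
  // atob yields a "binary string" where each character holds one byte.
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // Interpret the raw bytes as UTF-8.
  return new TextDecoder().decode(bytes);
}

console.log(decodeLogData("aGVsbG8K")); // "hello" followed by a newline
```

For plain ASCII output the result matches `atob` alone; the difference only shows up once a process prints non-ASCII characters.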
### Follow logs
Stream log entries in real time. The subscription replays buffered entries first, then streams new output as it arrives.
```ts TypeScript
const sub = await sdk.followProcessLogs("proc_1", (entry) => {
console.log(entry.stream, atob(entry.data));
});
// Later, stop following
sub.close();
await sub.closed;
```
## Terminals
Create a process with `tty: true` to allocate a pseudo-terminal, then connect via WebSocket for full bidirectional I/O.
```ts TypeScript
const proc = await sdk.createProcess({
command: "bash",
tty: true,
});
```
### Write input
<CodeGroup>
```ts TypeScript
await sdk.sendProcessInput("proc_1", {
data: "echo hello\n",
encoding: "utf8",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/processes/proc_1/input" \
-H "Content-Type: application/json" \
-d '{"data":"echo hello\n","encoding":"utf8"}'
```
</CodeGroup>
### Connect to a terminal
Use `ProcessTerminalSession` unless you need direct frame access.
```ts TypeScript
const terminal = sdk.connectProcessTerminal("proc_1");
terminal.onReady(() => {
terminal.resize({ cols: 120, rows: 40 });
terminal.sendInput("ls\n");
});
terminal.onData((bytes) => {
process.stdout.write(new TextDecoder().decode(bytes));
});
terminal.onExit((status) => {
console.log("exit:", status.exitCode);
});
terminal.onError((error) => {
console.error(error.message);
});
terminal.onClose(() => {
console.log("terminal closed");
});
```
Since the browser WebSocket API cannot send custom headers, the endpoint accepts an `access_token` query parameter for authentication. The SDK handles this automatically.
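For clients that build the WebSocket URL themselves, the query-parameter approach can be sketched as follows. The `access_token` parameter name comes from the text above; the endpoint path and the `terminalWsUrl` helper are assumptions for illustration, so check the API reference for the real route.

```typescript
// Hypothetical helper. The path below is an assumed route, not confirmed
// by this page; only the access_token parameter is documented above.
function terminalWsUrl(baseUrl: string, processId: string, token?: string): string {
  const url = new URL(`/v1/processes/${encodeURIComponent(processId)}/terminal/ws`, baseUrl);
  // Map http(s) to ws(s) so the URL can be passed to new WebSocket(...).
  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
  if (token) {
    // Browsers cannot set an Authorization header on WebSocket handshakes.
    url.searchParams.set("access_token", token);
  }
  return url.toString();
}

console.log(terminalWsUrl("http://127.0.0.1:2468", "proc_1", "secret"));
```

Tokens in query strings can end up in access logs, so prefer short-lived tokens when the server is reachable from outside the sandbox.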
### Browser terminal emulators
The terminal session works with any browser terminal emulator like ghostty-web or xterm.js. For a drop-in React terminal, see [React Components](/react-components).
## Configuration
Adjust runtime limits like max concurrent processes, timeouts, and buffer sizes.
<CodeGroup>
```ts TypeScript
const config = await sdk.getProcessConfig();
console.log(config);
await sdk.setProcessConfig({
...config,
maxConcurrentProcesses: 32,
defaultRunTimeoutMs: 60000,
});
```
```bash cURL
curl "http://127.0.0.1:2468/v1/processes/config"
curl -X POST "http://127.0.0.1:2468/v1/processes/config" \
-H "Content-Type: application/json" \
-d '{"maxConcurrentProcesses":32,"defaultRunTimeoutMs":60000,"maxRunTimeoutMs":300000,"maxOutputBytes":1048576,"maxLogBytesPerProcess":10485760,"maxInputBytesPerRequest":65536}'
```
</CodeGroup>
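Because the update sends the full config object, it can help to merge and sanity-check values before posting them. A minimal sketch: the field names come from the cURL body above, while `mergeProcessConfig` and the rule that the default timeout must not exceed the maximum are assumptions, not documented server behavior.

```typescript
// Field names taken from the cURL example above.
interface ProcessConfigLike {
  maxConcurrentProcesses: number;
  defaultRunTimeoutMs: number;
  maxRunTimeoutMs: number;
  maxOutputBytes: number;
  maxLogBytesPerProcess: number;
  maxInputBytesPerRequest: number;
}

// Hypothetical helper: merge a partial update and apply an assumed sanity check.
function mergeProcessConfig(
  current: ProcessConfigLike,
  patch: Partial<ProcessConfigLike>,
): ProcessConfigLike {
  const next: ProcessConfigLike = { ...current, ...patch };
  // Assumed rule: a default timeout above the maximum is likely a mistake.
  if (next.defaultRunTimeoutMs > next.maxRunTimeoutMs) {
    throw new Error("defaultRunTimeoutMs must not exceed maxRunTimeoutMs");
  }
  return next;
}

const current: ProcessConfigLike = {
  maxConcurrentProcesses: 16,
  defaultRunTimeoutMs: 30000,
  maxRunTimeoutMs: 300000,
  maxOutputBytes: 1048576,
  maxLogBytesPerProcess: 10485760,
  maxInputBytesPerRequest: 65536,
};
console.log(mergeProcessConfig(current, { maxConcurrentProcesses: 32 }));
```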


@ -61,21 +61,26 @@ icon: "rocket"
<Tab title="Docker">
```bash
docker run -e ANTHROPIC_API_KEY="sk-ant-..." \
docker run -p 2468:2468 \
-e ANTHROPIC_API_KEY="sk-ant-..." \
-e OPENAI_API_KEY="sk-..." \
your-image
rivetdev/sandbox-agent:0.4.2-full \
server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
</Tabs>
<AccordionGroup>
<Accordion title="Extracting API keys from current machine">
Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from your existing Claude Code or Codex config files on your machine.
</Accordion>
<Accordion title="Testing without API keys">
If you want to test Sandbox Agent without API keys, use the `mock` agent to test the SDK without any credentials. It simulates agent responses for development and testing.
</Accordion>
</AccordionGroup>
<AccordionGroup>
<Accordion title="Extracting API keys from current machine">
Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from local Claude Code or Codex config files.
</Accordion>
<Accordion title="Testing without API keys">
Use the `mock` agent for SDK and integration testing without provider credentials.
</Accordion>
<Accordion title="Multi-tenant and per-user billing">
For per-tenant token tracking, budget enforcement, or usage-based billing, see [LLM Credentials](/llm-credentials) for gateway options like OpenRouter, LiteLLM, and Portkey.
</Accordion>
</AccordionGroup>
</Step>
<Step title="Run the server">
@ -84,7 +89,7 @@ icon: "rocket"
Install and run the binary directly.
```bash
curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
@ -93,7 +98,7 @@ icon: "rocket"
Run without installing globally.
```bash
npx @sandbox-agent/cli server --no-token --host 0.0.0.0 --port 2468
npx @sandbox-agent/cli@0.4.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
@ -101,7 +106,7 @@ icon: "rocket"
Run without installing globally.
```bash
bunx @sandbox-agent/cli server --no-token --host 0.0.0.0 --port 2468
bunx @sandbox-agent/cli@0.4.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
@ -109,7 +114,7 @@ icon: "rocket"
Install globally, then run.
```bash
npm install -g @sandbox-agent/cli
npm install -g @sandbox-agent/cli@0.4.x
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
@ -118,44 +123,41 @@ icon: "rocket"
Install globally, then run.
```bash
bun add -g @sandbox-agent/cli
bun add -g @sandbox-agent/cli@0.4.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="Node.js (local)">
For local development, use `SandboxAgent.start()` to automatically spawn and manage the server as a subprocess.
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.
```bash
npm install sandbox-agent
npm install sandbox-agent@0.4.x
```
```typescript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.start();
const sdk = await SandboxAgent.start();
```
</Tab>
<Tab title="Bun (local)">
For local development, use `SandboxAgent.start()` to automatically spawn and manage the server as a subprocess.
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.
```bash
bun add sandbox-agent
bun add sandbox-agent@0.4.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
```typescript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.start();
const sdk = await SandboxAgent.start();
```
This installs the binary and starts the server for you. No manual setup required.
</Tab>
<Tab title="Build from source">
@ -167,179 +169,121 @@ icon: "rocket"
</Tab>
</Tabs>
Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely.
Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely.
<AccordionGroup>
<Accordion title="Configuring token">
Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure their networking at the infrastructure level, so the server endpoint is never publicly accessible. For local development, binding to `127.0.0.1` ensures only local connections are accepted.
<AccordionGroup>
<Accordion title="Configuring token">
Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure networking at the infrastructure layer.
If you need to expose the server on a public endpoint, use `--token "$SANDBOX_TOKEN"` to require authentication on all requests:
If you expose the server publicly, use `--token "$SANDBOX_TOKEN"` to require authentication:
```bash
sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
```
```bash
sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
```
Then pass the token when connecting:
Then pass the token when connecting:
<Tabs>
<Tab title="TypeScript">
```typescript
const client = await SandboxAgent.connect({
baseUrl: "http://your-server:2468",
token: process.env.SANDBOX_TOKEN,
});
```
</Tab>
<Tabs>
<Tab title="TypeScript">
```typescript
import { SandboxAgent } from "sandbox-agent";
<Tab title="curl">
```bash
curl "http://your-server:2468/v1/sessions" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</Tab>
const sdk = await SandboxAgent.connect({
baseUrl: "http://your-server:2468",
token: process.env.SANDBOX_TOKEN,
});
```
</Tab>
<Tab title="CLI">
```bash
sandbox-agent api sessions list \
--endpoint http://your-server:2468 \
--token "$SANDBOX_TOKEN"
```
</Tab>
</Tabs>
</Accordion>
<Accordion title="CORS">
If you're calling the server from a browser, see the [CORS configuration guide](/docs/cors).
</Accordion>
</AccordionGroup>
<Tab title="curl">
```bash
curl "http://your-server:2468/v1/health" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</Tab>
<Tab title="CLI">
```bash
sandbox-agent --token "$SANDBOX_TOKEN" api agents list \
--endpoint http://your-server:2468
```
</Tab>
</Tabs>
</Accordion>
<Accordion title="CORS">
If you're calling the server from a browser, see the [CORS configuration guide](/cors).
</Accordion>
</AccordionGroup>
</Step>
<Step title="Install agents (optional)">
To preinstall agents:
```bash
sandbox-agent install-agent claude
sandbox-agent install-agent codex
sandbox-agent install-agent opencode
sandbox-agent install-agent amp
sandbox-agent install-agent --all
```
If agents are not installed up front, they will be lazily installed when creating a session. It's recommended to pre-install agents then take a snapshot of the sandbox for faster coldstarts.
If agents are not installed up front, they are lazily installed when creating a session.
</Step>
<Step title="Install desktop dependencies (optional, Linux only)">
If you want to use `/v1/desktop/*`, install the desktop runtime packages first:
```bash
sandbox-agent install desktop --yes
```
Then use `GET /v1/desktop/status` or `sdk.getDesktopStatus()` to verify the runtime is ready before calling desktop screenshot or input APIs.
</Step>
<Step title="Create a session">
<Tabs>
<Tab title="TypeScript">
```typescript
import { SandboxAgent } from "sandbox-agent";
```typescript
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
await client.createSession("my-session", {
agent: "claude",
agentMode: "build",
permissionMode: "default",
});
```
</Tab>
const session = await sdk.createSession({
agent: "claude",
sessionInit: {
cwd: "/",
mcpServers: [],
},
});
<Tab title="curl">
```bash
curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session" \
-H "Content-Type: application/json" \
-d '{"agent":"claude","agentMode":"build","permissionMode":"default"}'
```
</Tab>
<Tab title="CLI">
```bash
sandbox-agent api sessions create my-session \
--agent claude \
--endpoint http://127.0.0.1:2468
```
</Tab>
</Tabs>
console.log(session.id);
```
</Step>
<Step title="Send a message">
<Tabs>
<Tab title="TypeScript">
```typescript
await client.postMessage("my-session", {
message: "Summarize the repository and suggest next steps.",
});
```
</Tab>
```typescript
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
<Tab title="curl">
```bash
curl -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages" \
-H "Content-Type: application/json" \
-d '{"message":"Summarize the repository and suggest next steps."}'
```
</Tab>
<Tab title="CLI">
```bash
sandbox-agent api sessions send-message my-session \
--message "Summarize the repository and suggest next steps." \
--endpoint http://127.0.0.1:2468
```
</Tab>
</Tabs>
console.log(result.stopReason);
```
</Step>
<Step title="Read events">
<Tabs>
<Tab title="TypeScript">
```typescript
// Poll for events
const events = await client.getEvents("my-session", { offset: 0, limit: 50 });
```typescript
const off = session.onEvent((event) => {
console.log(event.sender, event.payload);
});
// Or stream events
for await (const event of client.streamEvents("my-session", { offset: 0 })) {
console.log(event.type, event.data);
}
```
</Tab>
const page = await sdk.getEvents({
sessionId: session.id,
limit: 50,
});
<Tab title="curl">
```bash
# Poll for events
curl "http://127.0.0.1:2468/v1/sessions/my-session/events?offset=0&limit=50"
# Stream events via SSE
curl "http://127.0.0.1:2468/v1/sessions/my-session/events/sse?offset=0"
# Single-turn stream (post message and get streamed response)
curl -N -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages/stream" \
-H "Content-Type: application/json" \
-d '{"message":"Hello"}'
```
</Tab>
<Tab title="CLI">
```bash
# Poll for events
sandbox-agent api sessions events my-session \
--endpoint http://127.0.0.1:2468
# Stream events via SSE
sandbox-agent api sessions events-sse my-session \
--endpoint http://127.0.0.1:2468
# Single-turn stream
sandbox-agent api sessions send-message-stream my-session \
--message "Hello" \
--endpoint http://127.0.0.1:2468
```
</Tab>
</Tabs>
console.log(page.items.length);
off();
```
</Step>
<Step title="Test with Inspector">
Open the Inspector UI at `/ui/` on your server (e.g., `http://localhost:2468/ui/`) to inspect session state using a GUI.
Open the Inspector UI at `/ui/` on your server (for example, `http://localhost:2468/ui/`) to inspect sessions and events in a GUI.
<Frame>
<img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
@ -350,13 +294,13 @@ icon: "rocket"
## Next steps
<CardGroup cols={3}>
<Card title="Build a Chat UI" icon="comments" href="/building-chat-ui">
Learn how to build a chat interface for your agent.
<Card title="Session Persistence" icon="database" href="/session-persistence">
Configure in-memory, Rivet Actor state, IndexedDB, SQLite, and Postgres persistence.
</Card>
<Card title="Manage Sessions" icon="database" href="/manage-sessions">
Persist and replay agent transcripts.
<Card title="Deploy to a Sandbox" icon="box" href="/deploy/local">
Deploy your agent to E2B, Daytona, Docker, Vercel, or Cloudflare.
</Card>
<Card title="Deploy to a Sandbox" icon="box" href="/deploy">
Deploy your agent to E2B, Daytona, or Vercel Sandboxes.
<Card title="SDK Overview" icon="compass" href="/sdk-overview">
Use the latest TypeScript SDK API.
</Card>
</CardGroup>

245
docs/react-components.mdx Normal file

@ -0,0 +1,245 @@
---
title: "React Components"
description: "Drop-in React components for Sandbox Agent frontends."
icon: "react"
---
`@sandbox-agent/react` exposes small React components built on top of the `sandbox-agent` SDK.
Current exports:
- `AgentConversation` for a combined transcript + composer surface
- `ProcessTerminal` for attaching to a running tty process
- `AgentTranscript` for rendering session/message timelines without bundling any styles
- `ChatComposer` for a reusable prompt input/send surface
- `useTranscriptVirtualizer` for wiring large transcript lists to a scroll container
## Install
```bash
npm install @sandbox-agent/react@0.4.x
```
## Full example
This example connects to a running Sandbox Agent server, starts a tty shell, renders `ProcessTerminal`, and cleans up the process when the component unmounts.
```tsx TerminalPane.tsx expandable highlight={5,32-36,71}
"use client";
import { useEffect, useState } from "react";
import { SandboxAgent } from "sandbox-agent";
import { ProcessTerminal } from "@sandbox-agent/react";
export default function TerminalPane() {
const [client, setClient] = useState<SandboxAgent | null>(null);
const [processId, setProcessId] = useState<string | null>(null);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
let cancelled = false;
let sdk: SandboxAgent | null = null;
let createdProcessId: string | null = null;
const cleanup = async () => {
if (!sdk || !createdProcessId) {
return;
}
await sdk.killProcess(createdProcessId, { waitMs: 1_000 }).catch(() => {});
await sdk.deleteProcess(createdProcessId).catch(() => {});
};
const start = async () => {
try {
sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
const process = await sdk.createProcess({
command: "sh",
interactive: true,
tty: true,
});
if (cancelled) {
createdProcessId = process.id;
await cleanup();
await sdk.dispose();
return;
}
createdProcessId = process.id;
setClient(sdk);
setProcessId(process.id);
} catch (err) {
const message = err instanceof Error ? err.message : "Failed to start terminal.";
setError(message);
}
};
void start();
return () => {
cancelled = true;
void cleanup();
void sdk?.dispose();
};
}, []);
if (error) {
return <div>{error}</div>;
}
if (!client || !processId) {
return <div>Starting terminal...</div>;
}
return <ProcessTerminal client={client} processId={processId} height={480} />;
}
```
## Component
`ProcessTerminal` attaches to a running tty process.
- `client`: a `SandboxAgent` client
- `processId`: the process to attach to
- `height`, `style`, `terminalStyle`: optional layout overrides
- `onExit`, `onError`: optional lifecycle callbacks
See [Processes](/processes) for the lower-level terminal APIs.
## Headless transcript
`AgentTranscript` is intentionally unstyled. It follows the common headless React pattern used by libraries like Radix, Headless UI, and React Aria: behavior lives in the component, while styling stays in your app through `className`, slot-level `classNames`, and `data-*` state attributes on the rendered DOM.
```tsx TranscriptPane.tsx
import {
AgentTranscript,
type AgentTranscriptClassNames,
type TranscriptEntry,
} from "@sandbox-agent/react";
const transcriptClasses: Partial<AgentTranscriptClassNames> = {
root: "transcript",
message: "transcript-message",
messageContent: "transcript-message-content",
toolGroupContainer: "transcript-tools",
toolGroupHeader: "transcript-tools-header",
toolItem: "transcript-tool-item",
toolItemHeader: "transcript-tool-item-header",
toolItemBody: "transcript-tool-item-body",
divider: "transcript-divider",
dividerText: "transcript-divider-text",
error: "transcript-error",
};
export function TranscriptPane({ entries }: { entries: TranscriptEntry[] }) {
return (
<AgentTranscript
entries={entries}
classNames={transcriptClasses}
renderMessageText={(entry) => <div>{entry.text}</div>}
renderInlinePendingIndicator={() => <span>...</span>}
renderToolGroupIcon={() => <span>Events</span>}
renderChevron={(expanded) => <span>{expanded ? "Hide" : "Show"}</span>}
/>
);
}
```
```css
.transcript {
display: grid;
gap: 12px;
}
.transcript [data-slot="message"][data-variant="user"] .transcript-message-content {
background: #161616;
color: white;
}
.transcript [data-slot="message"][data-variant="assistant"] .transcript-message-content {
background: #f4f4f0;
color: #161616;
}
.transcript [data-slot="tool-item"][data-failed="true"] {
border-color: #d33;
}
.transcript [data-slot="tool-item-header"][data-expanded="true"] {
background: rgba(0, 0, 0, 0.06);
}
```
`AgentTranscript` accepts `TranscriptEntry[]`, which matches the Inspector timeline shape:
- `message` entries render user/assistant text
- `tool` entries render expandable tool input/output sections
- `reasoning` entries render expandable reasoning blocks
- `meta` entries render status rows or expandable metadata details
Useful props:
- `className`: root class hook
- `classNames`: slot-level class hooks for styling from outside the package
- `scrollRef` + `virtualize`: opt into TanStack Virtual against an external scroll container
- `renderMessageText`: custom text or markdown renderer
- `renderToolItemIcon`, `renderToolGroupIcon`, `renderChevron`, `renderEventLinkContent`: presentation overrides
- `renderInlinePendingIndicator`, `renderThinkingState`: loading/thinking UI overrides
- `isDividerEntry`, `canOpenEvent`, `getToolGroupSummary`: behavior overrides for grouping and labels
## Transcript virtualization hook
`useTranscriptVirtualizer` exposes the same TanStack Virtual behavior used by `AgentTranscript` when `virtualize` is enabled.
- Pass the grouped transcript rows you want to virtualize
- Pass a `scrollRef` that points at the actual scrollable element
- Use it when you need transcript-aware virtualization outside the stock `AgentTranscript` renderer
## Composer and conversation
`ChatComposer` is the headless message input. `AgentConversation` composes `AgentTranscript` and `ChatComposer` so apps can reuse the transcript/composer pairing without pulling in Inspector session chrome.
```tsx ConversationPane.tsx
import { AgentConversation, type TranscriptEntry } from "@sandbox-agent/react";
export function ConversationPane({
entries,
message,
onMessageChange,
onSubmit,
}: {
entries: TranscriptEntry[];
message: string;
onMessageChange: (value: string) => void;
onSubmit: () => void;
}) {
return (
<AgentConversation
entries={entries}
emptyState={<div>Start the conversation.</div>}
transcriptProps={{
renderMessageText: (entry) => <div>{entry.text}</div>,
}}
composerProps={{
message,
onMessageChange,
onSubmit,
placeholder: "Send a message...",
}}
/>
);
}
```
Useful `ChatComposer` props:
- `className` and `classNames` for external styling
- `inputRef` to manage focus or autoresize from the consumer
- `textareaProps` for lower-level textarea behavior
- `allowEmptySubmit` when the submit action is valid without draft text, such as a stop button
Use `transcriptProps` and `composerProps` when you want the shared composition but still need custom rendering or behavior. Use `transcriptClassNames` and `composerClassNames` when you want styling hooks for each subcomponent.

276
docs/sdk-overview.mdx Normal file

@ -0,0 +1,276 @@
---
title: "SDK Overview"
description: "Use the TypeScript SDK to manage Sandbox Agent sessions and APIs."
icon: "compass"
---
The TypeScript SDK is centered on `sandbox-agent` and its `SandboxAgent` class.
## Install
<Tabs>
<Tab title="npm">
```bash
npm install sandbox-agent@0.4.x
```
</Tab>
<Tab title="bun">
```bash
bun add sandbox-agent@0.4.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
</Tab>
</Tabs>
## Optional React components
```bash
npm install @sandbox-agent/react@0.4.x
```
## Create a client
```ts
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
```
`SandboxAgent.connect(...)` now waits for `/v1/health` by default before other SDK requests proceed. To disable that gate, pass `waitForHealth: false`. To keep the default gate but fail after a bounded wait, pass `waitForHealth: { timeoutMs: 120_000 }`. To cancel the startup wait early, pass `signal: abortController.signal`.
With a custom fetch handler (for example, proxying requests inside Workers):
```ts
const sdk = await SandboxAgent.connect({
fetch: (input, init) => customFetch(input, init),
});
```
With an abort signal for the startup health gate:
```ts
const controller = new AbortController();
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
signal: controller.signal,
});
controller.abort();
```
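The health gate described above can be pictured as a small polling loop. The sketch below is illustrative only and is not the SDK's implementation: `waitForHealth`, `FetchLike`, and `intervalMs` are hypothetical names, and the retry interval is an assumption. It shows how a bounded timeout and an abort signal might interact with repeated `/v1/health` checks.
```ts
// Hypothetical helper for illustration; not part of the sandbox-agent SDK.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean }>;

interface WaitOptions {
  timeoutMs?: number; // give up after this long; undefined waits indefinitely
  intervalMs?: number; // delay between attempts (assumed default)
  signal?: AbortSignal; // lets the caller cancel the wait
}

async function waitForHealth(
  baseUrl: string,
  fetchImpl: FetchLike,
  options: WaitOptions = {},
): Promise<void> {
  const { timeoutMs, intervalMs = 500, signal } = options;
  const deadline = timeoutMs === undefined ? Infinity : Date.now() + timeoutMs;
  for (;;) {
    // Check the signal before every attempt so cancellation wins over retries.
    if (signal?.aborted) throw new Error("health wait aborted");
    try {
      const response = await fetchImpl(`${baseUrl}/v1/health`, { signal });
      if (response.ok) return;
    } catch {
      // Server not reachable yet (or the request was aborted); the loop re-checks the signal.
    }
    // Stop if another sleep would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) throw new Error("health wait timed out");
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```
In this sketch an abort raised while the loop is sleeping is only noticed on the next iteration, so cancellation can lag by up to `intervalMs`.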
With persistence (see [Persisting Sessions](/session-persistence) for driver options):
```ts
import { SandboxAgent, InMemorySessionPersistDriver } from "sandbox-agent";
const persist = new InMemorySessionPersistDriver();
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
persist,
});
```
Local spawn with a sandbox provider:
```ts
import { SandboxAgent } from "sandbox-agent";
import { local } from "sandbox-agent/local";
const sdk = await SandboxAgent.start({
sandbox: local(),
});
// sdk.sandboxId — prefixed provider ID (e.g. "local/127.0.0.1:2468")
await sdk.destroySandbox(); // provider-defined cleanup + disposes client
```
`SandboxAgent.start(...)` requires a `sandbox` provider. Built-in providers:
| Import | Provider |
|--------|----------|
| `sandbox-agent/local` | Local subprocess |
| `sandbox-agent/docker` | Docker container |
| `sandbox-agent/e2b` | E2B sandbox |
| `sandbox-agent/daytona` | Daytona workspace |
| `sandbox-agent/vercel` | Vercel Sandbox |
| `sandbox-agent/cloudflare` | Cloudflare Sandbox |
Use `sdk.dispose()` to disconnect without changing sandbox state, `sdk.pauseSandbox()` for graceful suspension when supported, or `sdk.killSandbox()` for permanent deletion.
## Session flow
```ts
const session = await sdk.createSession({
agent: "mock",
cwd: "/",
});
const prompt = await session.prompt([
{ type: "text", text: "Summarize this repository." },
]);
console.log(prompt.stopReason);
```
Resume and destroy:
```ts
const restored = await sdk.resumeSession(session.id);
await restored.prompt([{ type: "text", text: "Continue from previous context." }]);
await sdk.destroySession(restored.id);
```
## Session configuration
Set model, mode, or thought level at creation or on an existing session:
```ts
const session = await sdk.createSession({
agent: "codex",
model: "gpt-5.3-codex",
});
await session.setModel("gpt-5.2-codex");
await session.setMode("auto");
const options = await session.getConfigOptions();
const modes = await session.getModes();
```
Handle permission requests from agents that ask before executing tools:
```ts
const claude = await sdk.createSession({
agent: "claude",
mode: "default",
});
claude.onPermissionRequest((request) => {
void claude.respondPermission(request.id, "once");
});
```
See [Agent Sessions](/agent-sessions) for full details on config options and error handling.
## Events
Subscribe to live events:
```ts
const unsubscribe = session.onEvent((event) => {
console.log(event.eventIndex, event.sender, event.payload);
});
await session.prompt([{ type: "text", text: "Give me a short summary." }]);
unsubscribe();
```
Fetch persisted events:
```ts
const page = await sdk.getEvents({
sessionId: session.id,
limit: 100,
});
console.log(page.items.length);
```
## Control-plane and HTTP helpers
```ts
const health = await sdk.getHealth();
const agents = await sdk.listAgents();
await sdk.installAgent("codex", { reinstall: true });
const entries = await sdk.listFsEntries({ path: "." });
const writeResult = await sdk.writeFsFile({ path: "./hello.txt" }, "hello");
console.log(health.status, agents.agents.length, entries.length, writeResult.path);
```
## Desktop API
The SDK also wraps the desktop host/runtime HTTP API.
Install desktop dependencies first on Linux hosts:
```bash
sandbox-agent install desktop --yes
```
Then query status, surface remediation if needed, and start the runtime:
```ts
const status = await sdk.getDesktopStatus();
if (status.state === "install_required") {
console.log(status.installCommand);
}
const started = await sdk.startDesktop({
width: 1440,
height: 900,
dpi: 96,
});
const screenshot = await sdk.takeDesktopScreenshot();
const displayInfo = await sdk.getDesktopDisplayInfo();
await sdk.moveDesktopMouse({ x: 400, y: 300 });
await sdk.clickDesktop({ x: 400, y: 300, button: "left", clickCount: 1 });
await sdk.typeDesktopText({ text: "hello world", delayMs: 10 });
await sdk.pressDesktopKey({ key: "ctrl+l" });
await sdk.stopDesktop();
```
Screenshot helpers return `Uint8Array` PNG bytes. The SDK does not attempt to install OS packages remotely; callers should surface `missingDependencies` and `installCommand` from `getDesktopStatus()`.
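Because the screenshot helpers return raw PNG bytes, it can be useful to sanity-check a buffer before storing or forwarding it. The sketch below relies only on the PNG file layout (fixed 8-byte signature followed by the `IHDR` chunk); `isPng` and `pngDimensions` are hypothetical helper names, not SDK functions.
```ts
// PNG files start with this fixed 8-byte signature.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
// ASCII "IHDR", the type of the first chunk in every PNG.
const IHDR_TYPE = [0x49, 0x48, 0x44, 0x52];

// Hypothetical helper: true when the buffer begins with the PNG signature.
function isPng(bytes: Uint8Array): boolean {
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((value, index) => bytes[index] === value)
  );
}

// Hypothetical helper: reads width and height from the IHDR chunk.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type,
// then width and height as big-endian uint32 values.
function pngDimensions(bytes: Uint8Array): { width: number; height: number } | null {
  if (!isPng(bytes) || bytes.length < 24) return null;
  if (!IHDR_TYPE.every((value, index) => bytes[12 + index] === value)) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16), height: view.getUint32(20) };
}
```
With the result of `takeDesktopScreenshot()` in hand, a caller could pass the bytes to `pngDimensions` and compare the result with the size requested in `startDesktop`.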
## Error handling
```ts
import { SandboxAgentError } from "sandbox-agent";
try {
await sdk.listAgents();
} catch (error) {
if (error instanceof SandboxAgentError) {
console.error(error.status, error.problem);
}
}
```
## Inspector URL
```ts
import { buildInspectorUrl } from "sandbox-agent";
const url = buildInspectorUrl({
baseUrl: "https://your-sandbox-agent.example.com",
headers: { "X-Custom-Header": "value" },
});
console.log(url);
```
Parameters:
- `baseUrl` (required unless `fetch` is provided): Sandbox Agent server URL
- `token` (optional): Bearer token for authenticated servers
- `headers` (optional): Additional request headers
- `fetch` (optional): Custom fetch implementation used by SDK HTTP and session calls
- `skipHealthCheck` (optional): set `true` to skip the startup `/v1/health` wait
- `waitForHealth` (optional, defaults to enabled): waits for `/v1/health` before HTTP helpers and session setup proceed; pass `false` to disable or `{ timeoutMs }` to bound the wait
- `signal` (optional): aborts the startup `/v1/health` wait used by `connect()`
## LLM credentials
Sandbox Agent supports personal API keys, shared organization keys, and per-tenant gateway keys with budget enforcement. See [LLM Credentials](/llm-credentials) for setup details.


@ -1,41 +0,0 @@
---
title: "Python"
description: "Python client for managing sessions and streaming events."
icon: "python"
tag: "Coming Soon"
---
The Python SDK is on our roadmap. It will provide a typed client for managing sessions and streaming events, similar to the TypeScript SDK.
In the meantime, you can use the [HTTP API](/http-api) directly with any HTTP client like `requests` or `httpx`.
```python
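import os
import httpx
base_url = "http://127.0.0.1:2468"
# Bearer token for the Sandbox Agent server, read from the environment.
token = os.environ["SANDBOX_TOKEN"]
headers = {"Authorization": f"Bearer {token}"}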
# Create a session
httpx.post(
f"{base_url}/v1/sessions/my-session",
headers=headers,
json={"agent": "claude", "permissionMode": "default"}
)
# Send a message
httpx.post(
f"{base_url}/v1/sessions/my-session/messages",
headers=headers,
json={"message": "Hello from Python"}
)
# Get events
response = httpx.get(
f"{base_url}/v1/sessions/my-session/events",
headers=headers,
params={"offset": 0, "limit": 50}
)
events = response.json()["events"]
```
Want the Python SDK sooner? [Open an issue](https://github.com/rivet-dev/sandbox-agent/issues) to let us know.


@ -1,162 +0,0 @@
---
title: "TypeScript"
description: "Use the generated client to manage sessions and stream events."
icon: "js"
---
The TypeScript SDK is generated from the OpenAPI spec that ships with the server. It provides a typed
client for sessions, events, and agent operations.
## Install
<Tabs>
<Tab title="npm">
```bash
npm install sandbox-agent
```
</Tab>
<Tab title="bun">
```bash
bun add sandbox-agent
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
</Tab>
</Tabs>
## Create a client
```ts
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
```
## Autospawn (Node only)
If you run locally, the SDK can launch the server for you.
```ts
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.start();
await client.dispose();
```
Autospawn uses the local `sandbox-agent` binary. Install `@sandbox-agent/cli` (recommended) or set
`SANDBOX_AGENT_BIN` to a custom path.
## Sessions and messages
```ts
await client.createSession("demo-session", {
agent: "codex",
agentMode: "default",
permissionMode: "plan",
});
await client.postMessage("demo-session", { message: "Hello" });
```
List agents and inspect feature coverage (available on `capabilities`):
```ts
const agents = await client.listAgents();
const codex = agents.agents.find((agent) => agent.id === "codex");
console.log(codex?.capabilities);
```
## Poll events
```ts
const events = await client.getEvents("demo-session", {
offset: 0,
limit: 200,
includeRaw: false,
});
for (const event of events.events) {
console.log(event.type, event.data);
}
```
## Stream events (SSE)
```ts
for await (const event of client.streamEvents("demo-session", {
offset: 0,
includeRaw: false,
})) {
console.log(event.type, event.data);
}
```
The SDK parses `text/event-stream` into `UniversalEvent` objects. If you want full control, use
`getEventsSse()` and parse the stream yourself.
## Stream a single turn
```ts
for await (const event of client.streamTurn("demo-session", { message: "Hello" })) {
console.log(event.type, event.data);
}
```
This method posts the message and streams only the next turn. For manual control, call
`postMessageStream()` and parse the SSE response yourself.
## Optional raw payloads
Set `includeRaw: true` on `getEvents`, `streamEvents`, or `streamTurn` to include the raw provider
payload in `event.raw`. This is useful for debugging and conversion analysis.
## Error handling
All HTTP errors throw `SandboxAgentError`:
```ts
import { SandboxAgentError } from "sandbox-agent";
try {
await client.postMessage("missing-session", { message: "Hi" });
} catch (error) {
if (error instanceof SandboxAgentError) {
console.error(error.status, error.problem);
}
}
```
## Inspector URL
Build a URL to open the sandbox-agent Inspector UI with pre-filled connection settings:
```ts
import { buildInspectorUrl } from "sandbox-agent";
const url = buildInspectorUrl({
baseUrl: "https://your-sandbox-agent.example.com",
token: "optional-bearer-token",
headers: { "X-Custom-Header": "value" },
});
console.log(url);
// https://your-sandbox-agent.example.com/ui/?token=...&headers=...
```
Parameters:
- `baseUrl` (required): The sandbox-agent server URL
- `token` (optional): Bearer token for authentication
- `headers` (optional): Extra headers to pass to the server (JSON-encoded in the URL)
## Types
The SDK exports OpenAPI-derived types for events, items, and feature coverage:
```ts
import type { UniversalEvent, UniversalItem, AgentCapabilities } from "sandbox-agent";
```
See the [API Reference](/api) for schema details.

191
docs/security.mdx Normal file

@ -0,0 +1,191 @@
---
title: "Security"
description: "Backend-first auth and access control patterns."
icon: "shield"
---
As covered in [Orchestration Architecture](/orchestration-architecture), run the Sandbox Agent client on your backend, not in the browser.
This keeps sandbox credentials private and gives you one place for authz, rate limiting, and audit logging.
## Auth model
Implement auth however it fits your stack (sessions, JWT, API keys, etc.), but enforce it before any sandbox-bound request.
Minimum checks:
- Authenticate the caller.
- Authorize access to the target workspace/sandbox/session.
- Apply request rate limits and request logging.
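The minimum checks above can be sketched as one guard that runs before any sandbox-bound call. Everything in this sketch is hypothetical (the `Caller` shape, the `guard` function, and the fixed-window limiter); it only illustrates the ordering of authenticate, authorize, rate-limit, and log, and is not tied to any particular framework.
```ts
type Role = "owner" | "member" | "viewer";

// Hypothetical caller shape: the result of validating a JWT or session.
interface Caller {
  userId: string;
  workspaces: Record<string, Role>;
}

type GuardResult =
  | { ok: true; role: Role }
  | { ok: false; status: 401 | 403 | 429 };

// Minimal fixed-window limiter kept in memory (an assumption; production
// deployments usually back this with shared storage).
class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, now: number): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

function guard(
  caller: Caller | null,
  workspaceId: string,
  limiter: FixedWindowLimiter,
  log: (line: string) => void,
  now: number = Date.now(),
): GuardResult {
  let result: GuardResult;
  if (!caller) {
    result = { ok: false, status: 401 }; // not authenticated
  } else {
    const role = caller.workspaces[workspaceId];
    if (!role) {
      result = { ok: false, status: 403 }; // authenticated, but no access
    } else if (!limiter.allow(caller.userId, now)) {
      result = { ok: false, status: 429 }; // over the request budget
    } else {
      result = { ok: true, role };
    }
  }
  // Log every decision, including rejections, for auditing.
  log(`${now} user=${caller?.userId ?? "anonymous"} workspace=${workspaceId} ok=${result.ok}`);
  return result;
}
```
Only requests that pass the guard should reach `SandboxAgent.connect(...)` or a proxied `/v1` route.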
## Examples
### Rivet
<CodeGroup>
```ts Actor (server)
import { UserError, actor } from "rivetkit";
import { SandboxAgent } from "sandbox-agent";
type ConnParams = {
accessToken: string;
};
type WorkspaceClaims = {
sub: string;
workspaceId: string;
role: "owner" | "member" | "viewer";
};
async function verifyWorkspaceToken(
token: string,
workspaceId: string,
): Promise<WorkspaceClaims | null> {
// Validate JWT/session token here, then enforce workspace scope.
// Return null when invalid/expired/not a member.
if (!token) return null;
return { sub: "user_123", workspaceId, role: "member" };
}
export const workspace = actor({
state: {
events: [] as Array<{ userId: string; prompt: string; createdAt: number }>,
},
onBeforeConnect: async (c, params: ConnParams) => {
const claims = await verifyWorkspaceToken(params.accessToken, c.key[0]);
if (!claims) {
throw new UserError("Forbidden", { code: "forbidden" });
}
},
createConnState: async (c, params: ConnParams) => {
const claims = await verifyWorkspaceToken(params.accessToken, c.key[0]);
if (!claims) {
throw new UserError("Forbidden", { code: "forbidden" });
}
return {
userId: claims.sub,
role: claims.role,
workspaceId: claims.workspaceId,
};
},
actions: {
submitPrompt: async (c, prompt: string) => {
if (!c.conn) {
throw new UserError("Connection required", { code: "connection_required" });
}
if (c.conn.state.role === "viewer") {
throw new UserError("Insufficient permissions", { code: "forbidden" });
}
// Connect to Sandbox Agent from the actor (server-side only).
// Sandbox credentials never reach the client.
const sdk = await SandboxAgent.connect({
baseUrl: process.env.SANDBOX_URL!,
token: process.env.SANDBOX_TOKEN,
});
const session = await sdk.createSession({
agent: "claude",
cwd: "/workspace",
});
session.onEvent((event) => {
c.broadcast("session.event", {
userId: c.conn!.state.userId,
eventIndex: event.eventIndex,
sender: event.sender,
payload: event.payload,
});
});
const result = await session.prompt([
{ type: "text", text: prompt },
]);
c.state.events.push({
userId: c.conn.state.userId,
prompt,
createdAt: Date.now(),
});
return { stopReason: result.stopReason };
},
},
});
```
```ts Client (browser)
import { createClient } from "rivetkit/client";
import type { registry } from "./actors";
const client = createClient<typeof registry>({
endpoint: process.env.NEXT_PUBLIC_RIVET_ENDPOINT!,
});
const handle = client.workspace.getOrCreate(["ws_123"], {
params: { accessToken: userJwt },
});
const conn = handle.connect();
conn.on("session.event", (event) => {
console.log(event.sender, event.payload);
});
const result = await conn.submitPrompt("Plan a refactor for auth middleware.");
console.log(result.stopReason);
```
</CodeGroup>
Use [onBeforeConnect](https://rivet.dev/docs/actors/authentication), [connection params](https://rivet.dev/docs/actors/connections), and [actor keys](https://rivet.dev/docs/actors/keys) together so each actor enforces auth per workspace.
### Hono
```ts
import { Hono } from "hono";
import { bearerAuth } from "hono/bearer-auth";
const app = new Hono();
app.use("/sandbox/*", bearerAuth({ token: process.env.APP_API_TOKEN! }));
app.all("/sandbox/*", async (c) => {
const incoming = new URL(c.req.url);
const upstreamUrl = new URL(process.env.SANDBOX_URL!);
upstreamUrl.pathname = incoming.pathname.replace(/^\/sandbox/, "/v1");
upstreamUrl.search = incoming.search;
const headers = new Headers();
headers.set("authorization", `Bearer ${process.env.SANDBOX_TOKEN ?? ""}`);
const accept = c.req.header("accept");
if (accept) headers.set("accept", accept);
const contentType = c.req.header("content-type");
if (contentType) headers.set("content-type", contentType);
const body =
c.req.method === "POST" || c.req.method === "PUT" || c.req.method === "PATCH"
? await c.req.text()
: undefined;
const upstream = await fetch(upstreamUrl, {
method: c.req.method,
headers,
body,
});
return new Response(upstream.body, {
status: upstream.status,
headers: upstream.headers,
});
});
```


@ -0,0 +1,121 @@
---
title: "Persisting Sessions"
description: "Choose and configure session persistence for the TypeScript SDK."
icon: "database"
---
The TypeScript SDK uses a `SessionPersistDriver` to store session records and event history.
If you do not provide one, the SDK uses in-memory storage.
With persistence enabled, sessions can be restored after runtime/session loss. See [Session Restoration](/session-restoration).
Each driver stores:
- `SessionRecord` (`id`, `agent`, `agentSessionId`, `lastConnectionId`, `createdAt`, optional `destroyedAt`, optional `sandboxId`, optional `sessionInit`, optional `configOptions`, optional `modes`)
- `SessionEvent` (`id`, `eventIndex`, `sessionId`, `connectionId`, `sender`, `payload`, `createdAt`)
## Persistence drivers
### Rivet
Recommended for sandbox orchestration with actor state. See [Multiplayer](/multiplayer) for a full Rivet actor example with persistence in actor state.
### IndexedDB (browser)
Best for browser apps that should survive reloads. See the [Inspector source](https://github.com/rivet-dev/sandbox-agent/tree/main/frontend/packages/inspector/src/persist-indexeddb.ts) for a complete IndexedDB driver you can copy into your project.
### In-memory (built-in)
Best for local dev and ephemeral workloads. No extra dependencies required.
```ts
import { InMemorySessionPersistDriver, SandboxAgent } from "sandbox-agent";
const persist = new InMemorySessionPersistDriver({
maxSessions: 1024,
maxEventsPerSession: 500,
});
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
persist,
});
```
### SQLite
Best for local/server Node apps that need durable storage without a DB server.
```bash
npm install better-sqlite3
```
```ts
import { SandboxAgent } from "sandbox-agent";
import { SQLiteSessionPersistDriver } from "./persist.ts";
const persist = new SQLiteSessionPersistDriver({
filename: "./sandbox-agent.db",
});
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
persist,
});
```
See the [full SQLite example](https://github.com/rivet-dev/sandbox-agent/tree/main/examples/persist-sqlite) for the complete driver implementation you can copy into your project.
### Postgres
Use when you already run Postgres and want shared relational storage.
```bash
npm install pg
```
```ts
import { SandboxAgent } from "sandbox-agent";
import { PostgresSessionPersistDriver } from "./persist.ts";
const persist = new PostgresSessionPersistDriver({
connectionString: process.env.DATABASE_URL,
schema: "public",
});
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
persist,
});
```
See the [full Postgres example](https://github.com/rivet-dev/sandbox-agent/tree/main/examples/persist-postgres) for the complete driver implementation you can copy into your project.
### Custom driver
Implement `SessionPersistDriver` for custom backends.
```ts
import type { SessionPersistDriver } from "sandbox-agent";
class MyDriver implements SessionPersistDriver {
async getSession(id) { return undefined; }
async listSessions(request) { return { items: [] }; }
async updateSession(session) {}
async listEvents(request) { return { items: [] }; }
async insertEvent(sessionId, event) {}
}
```
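As a slightly fuller illustration, the sketch below backs the same five methods with `Map`s. The local `SessionRecord` and `SessionEvent` types are trimmed copies of the fields listed at the top of this page, and the request/response shapes (`limit`, `sessionId`, `{ items }`) are assumptions made for this example; check the types exported by `sandbox-agent` before relying on them.
```ts
// Local stand-ins for the SDK types (assumed shapes, trimmed for brevity).
interface SessionRecord {
  id: string;
  agent: string;
  agentSessionId: string;
  lastConnectionId: string;
  createdAt: number;
  destroyedAt?: number;
}

interface SessionEvent {
  id: string;
  eventIndex: number;
  sessionId: string;
  connectionId: string;
  sender: string;
  payload: unknown;
  createdAt: number;
}

// Hypothetical request/response shapes.
interface ListRequest { limit?: number }
interface ListEventsRequest extends ListRequest { sessionId: string }
interface Page<T> { items: T[] }

class MapSessionPersistDriver {
  private sessions = new Map<string, SessionRecord>();
  private events = new Map<string, SessionEvent[]>();

  async getSession(id: string): Promise<SessionRecord | undefined> {
    return this.sessions.get(id);
  }

  async listSessions(request: ListRequest = {}): Promise<Page<SessionRecord>> {
    return { items: Array.from(this.sessions.values()).slice(0, request.limit) };
  }

  // Upsert: creates the record or overwrites the existing one.
  async updateSession(session: SessionRecord): Promise<void> {
    this.sessions.set(session.id, session);
  }

  async listEvents(request: ListEventsRequest): Promise<Page<SessionEvent>> {
    const all = this.events.get(request.sessionId) ?? [];
    return { items: all.slice(0, request.limit) };
  }

  // Keeps each session's events ordered by eventIndex.
  async insertEvent(sessionId: string, event: SessionEvent): Promise<void> {
    const list = this.events.get(sessionId) ?? [];
    list.push(event);
    list.sort((a, b) => a.eventIndex - b.eventIndex);
    this.events.set(sessionId, list);
  }
}
```
A durable driver would follow the same outline with the `Map` operations swapped for queries against its storage backend.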
## Replay controls
`SandboxAgent.connect(...)` supports:
- `replayMaxEvents` (default `50`)
- `replayMaxChars` (default `12000`)
These cap replay size when restoring sessions.
## Related docs
- [SDK Overview](/sdk-overview)
- [Session Restoration](/session-restoration)


@ -0,0 +1,33 @@
---
title: "Session Restoration"
description: "How the TypeScript SDK restores sessions after connection/runtime loss."
---
Sandbox Agent automatically restores stale sessions when live session state is no longer available.
This is driven by the configured `SessionPersistDriver` (in-memory, IndexedDB, SQLite, Postgres, or custom).
## How Auto-Restore Works
When you call `session.prompt(...)` (or `resumeSession(...)`) and the saved session points to a stale connection, the SDK:
1. Recreates a fresh session for the same local session id.
2. Rebinds the local session to the new runtime session id.
3. Replays recent persisted events into the next prompt as context.
This happens automatically; you do not need to manually rebuild the session.
## Replay Limits
Replay payload size is capped by:
- `replayMaxEvents` (default `50`)
- `replayMaxChars` (default `12000`)
These controls limit prompt growth during restore while preserving recent context.
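One way to picture how the two caps combine is sketched below. This is not the SDK's implementation: `selectReplayEvents` is a hypothetical function, and measuring size as the length of the JSON-serialized payload and keeping the newest events first are both assumptions.
```ts
// Trimmed stand-in for a persisted event (assumed shape).
interface ReplayEvent {
  eventIndex: number;
  payload: unknown;
}

// Hypothetical selection: keep the most recent events that fit both caps.
function selectReplayEvents(
  events: ReplayEvent[],
  maxEvents: number = 50,
  maxChars: number = 12000,
): ReplayEvent[] {
  const selected: ReplayEvent[] = [];
  let chars = 0;
  // Walk from newest to oldest so recent context is preferred.
  for (const event of events.slice().reverse()) {
    if (selected.length >= maxEvents) break;
    // JSON.stringify returns undefined for an undefined payload; count it as empty.
    const size = (JSON.stringify(event.payload) ?? "").length;
    if (chars + size > maxChars) break;
    chars += size;
    selected.push(event);
  }
  // Restore chronological order for replay.
  return selected.reverse();
}
```
Under these assumptions, whichever cap is reached first ends the selection, so a few very large events can exhaust `replayMaxChars` well before `replayMaxEvents` is hit.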
## Related Docs
- [SDK Overview](/sdk-overview)
- [Persisting Sessions](/session-persistence)
- [Agent Sessions](/agent-sessions)


@ -1,387 +0,0 @@
---
title: "Session Transcript Schema"
description: "Universal event schema for session transcripts across all agents."
---
Each coding agent outputs events in its own native format. The sandbox-agent converts these into a universal event schema, giving you a consistent session transcript regardless of which agent you use.
The schema is defined in [OpenAPI format](https://github.com/rivet-dev/sandbox-agent/blob/main/docs/openapi.json). See the [HTTP API Reference](/api-reference) for endpoint documentation.
## Coverage Matrix
This table shows which agent feature coverage appears in the universal event stream. All agents retain their full native feature coverage—this only reflects what's normalized into the schema.
| Feature | Claude | Codex | OpenCode | Amp |
|--------------------|:------:|:-----:|:------------:|:------------:|
| Stability | Stable | Stable| Experimental | Experimental |
| Text Messages | ✓ | ✓ | ✓ | ✓ |
| Tool Calls | ✓ | ✓ | ✓ | ✓ |
| Tool Results | ✓ | ✓ | ✓ | ✓ |
| Questions (HITL) | ✓ | | ✓ | |
| Permissions (HITL) | ✓ | ✓ | ✓ | - |
| Images | - | ✓ | ✓ | - |
| File Attachments | - | ✓ | ✓ | - |
| Session Lifecycle | - | ✓ | ✓ | - |
| Error Events | - | ✓ | ✓ | ✓ |
| Reasoning/Thinking | - | ✓ | - | - |
| Command Execution | - | ✓ | - | - |
| File Changes | - | ✓ | - | - |
| MCP Tools | ✓ | ✓ | ✓ | ✓ |
| Streaming Deltas | ✓ | ✓ | ✓ | - |
| Variants | | ✓ | ✓ | ✓ |
Agents: [Claude Code](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview) · [Codex](https://github.com/openai/codex) · [OpenCode](https://github.com/opencode-ai/opencode) · [Amp](https://ampcode.com)
- ✓ = Appears in session events
- \- = Agent supports natively, schema conversion coming soon
- (blank) = Not supported by agent
<AccordionGroup>
<Accordion title="Text Messages">
Basic message exchange between user and assistant.
</Accordion>
<Accordion title="Tool Calls & Results">
Visibility into tool invocations (file reads, command execution, etc.) and their results. When not natively supported, tool activity is embedded in message content.
</Accordion>
<Accordion title="Questions (HITL)">
Interactive questions the agent asks the user. Emits `question.requested` and `question.resolved` events.
</Accordion>
<Accordion title="Permissions (HITL)">
Permission requests for sensitive operations. Emits `permission.requested` and `permission.resolved` events.
</Accordion>
<Accordion title="Images">
Support for image attachments in messages.
</Accordion>
<Accordion title="File Attachments">
Support for file attachments in messages.
</Accordion>
<Accordion title="Session Lifecycle">
Native `session.started` and `session.ended` events. When not supported, the daemon emits synthetic lifecycle events.
</Accordion>
<Accordion title="Error Events">
Structured error events for runtime failures.
</Accordion>
<Accordion title="Reasoning/Thinking">
Extended thinking or reasoning content with visibility controls.
</Accordion>
<Accordion title="Command Execution">
Detailed command execution events with stdout/stderr.
</Accordion>
<Accordion title="File Changes">
Structured file modification events with diffs.
</Accordion>
<Accordion title="MCP Tools">
Model Context Protocol tool support.
</Accordion>
<Accordion title="Streaming Deltas">
Native streaming of content deltas. When not supported, the daemon emits a single synthetic delta before `item.completed`.
</Accordion>
<Accordion title="Variants">
Model variants such as reasoning effort or depth. Agents may expose different variant sets per model.
</Accordion>
</AccordionGroup>
Want support for another agent? [Open an issue](https://github.com/rivet-dev/sandbox-agent/issues/new) to request it.
## UniversalEvent
Every event from the API is wrapped in a `UniversalEvent` envelope.
| Field | Type | Description |
|-------|------|-------------|
| `event_id` | string | Unique identifier for this event |
| `sequence` | integer | Monotonic sequence number within the session (starts at 1) |
| `time` | string | RFC3339 timestamp |
| `session_id` | string | Daemon-generated session identifier |
| `native_session_id` | string? | Provider-native session/thread identifier (e.g., Codex `threadId`, OpenCode `sessionID`) |
| `source` | string | Event origin: `agent` (native) or `daemon` (synthetic) |
| `synthetic` | boolean | Whether this event was generated by the daemon to fill gaps |
| `type` | string | Event type (see [Event Types](#event-types)) |
| `data` | object | Event-specific payload |
| `raw` | any? | Original provider payload (only when `include_raw=true`) |
```json
{
"event_id": "evt_abc123",
"sequence": 1,
"time": "2025-01-28T12:00:00Z",
"session_id": "my-session",
"native_session_id": "thread_xyz",
"source": "agent",
"synthetic": false,
"type": "item.completed",
"data": { ... }
}
```
## Event Types
### Session Lifecycle
| Type | Description | Data |
|------|-------------|------|
| `session.started` | Session has started | `{ metadata?: any }` |
| `session.ended` | Session has ended | `{ reason, terminated_by, message?, exit_code?, stderr? }` |
**SessionEndedData**
| Field | Type | Values |
|-------|------|--------|
| `reason` | string | `completed`, `error`, `terminated` |
| `terminated_by` | string | `agent`, `daemon` |
| `message` | string? | Error message (only present when reason is `error`) |
| `exit_code` | int? | Process exit code (only present when reason is `error`) |
| `stderr` | StderrOutput? | Structured stderr output (only present when reason is `error`) |
**StderrOutput**
| Field | Type | Description |
|-------|------|-------------|
| `head` | string? | First 20 lines of stderr (if truncated) or full stderr (if not truncated) |
| `tail` | string? | Last 50 lines of stderr (only present if truncated) |
| `truncated` | boolean | Whether the output was truncated |
| `total_lines` | int? | Total number of lines in stderr |
### Turn Lifecycle
| Type | Description | Data |
|------|-------------|------|
| `turn.started` | Turn has started | `{ phase: "started", turn_id?, metadata? }` |
| `turn.ended` | Turn has ended | `{ phase: "ended", turn_id?, metadata? }` |
### Item Lifecycle
| Type | Description | Data |
|------|-------------|------|
| `item.started` | Item creation | `{ item }` |
| `item.delta` | Streaming content delta | `{ item_id, native_item_id?, delta }` |
| `item.completed` | Item finalized | `{ item }` |
Items follow a consistent lifecycle: `item.started` → `item.delta` (0 or more) → `item.completed`.
### HITL (Human-in-the-Loop)
| Type | Description | Data |
|------|-------------|------|
| `permission.requested` | Permission request pending | `{ permission_id, action, status, metadata? }` |
| `permission.resolved` | Permission decision recorded | `{ permission_id, action, status, metadata? }` |
| `question.requested` | Question pending user input | `{ question_id, prompt, options, status }` |
| `question.resolved` | Question answered or rejected | `{ question_id, prompt, options, status, response? }` |
**PermissionEventData**
| Field | Type | Description |
|-------|------|-------------|
| `permission_id` | string | Identifier for the permission request |
| `action` | string | What the agent wants to do |
| `status` | string | `requested`, `accept`, `accept_for_session`, `reject` |
| `metadata` | any? | Additional context |
**QuestionEventData**
| Field | Type | Description |
|-------|------|-------------|
| `question_id` | string | Identifier for the question |
| `prompt` | string | Question text |
| `options` | string[] | Available answer options |
| `status` | string | `requested`, `answered`, `rejected` |
| `response` | string? | Selected answer (when resolved) |
### Errors
| Type | Description | Data |
|------|-------------|------|
| `error` | Runtime error | `{ message, code?, details? }` |
| `agent.unparsed` | Parse failure | `{ error, location, raw_hash? }` |
The `agent.unparsed` event indicates the daemon failed to parse an agent payload. This should be treated as a bug.
## UniversalItem
Items represent discrete units of content within a session.
| Field | Type | Description |
|-------|------|-------------|
| `item_id` | string | Daemon-generated identifier |
| `native_item_id` | string? | Provider-native item/message identifier |
| `parent_id` | string? | Parent item ID (e.g., tool call/result parented to a message) |
| `kind` | string | Item category (see below) |
| `role` | string? | Actor role for message items |
| `status` | string | Lifecycle status |
| `content` | ContentPart[] | Ordered list of content parts |
### ItemKind
| Value | Description |
|-------|-------------|
| `message` | User or assistant message |
| `tool_call` | Tool invocation |
| `tool_result` | Tool execution result |
| `system` | System message |
| `status` | Status update |
| `unknown` | Unrecognized item type |
### ItemRole
| Value | Description |
|-------|-------------|
| `user` | User message |
| `assistant` | Assistant response |
| `system` | System prompt |
| `tool` | Tool-related message |
### ItemStatus
| Value | Description |
|-------|-------------|
| `in_progress` | Item is streaming or pending |
| `completed` | Item is finalized |
| `failed` | Item execution failed |
## Content Parts
The `content` array contains typed parts that make up an item's payload.
### text
Plain text content.
```json
{ "type": "text", "text": "Hello, world!" }
```
### json
Structured JSON content.
```json
{ "type": "json", "json": { "key": "value" } }
```
### tool_call
Tool invocation.
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | Tool name |
| `arguments` | string | JSON-encoded arguments |
| `call_id` | string | Unique call identifier |
```json
{
"type": "tool_call",
"name": "read_file",
"arguments": "{\"path\": \"/src/main.ts\"}",
"call_id": "call_abc123"
}
```
### tool_result
Tool execution result.
| Field | Type | Description |
|-------|------|-------------|
| `call_id` | string | Matching call identifier |
| `output` | string | Tool output |
```json
{
"type": "tool_result",
"call_id": "call_abc123",
"output": "File contents here..."
}
```
### file_ref
File reference with optional diff.
| Field | Type | Description |
|-------|------|-------------|
| `path` | string | File path |
| `action` | string | `read`, `write`, `patch` |
| `diff` | string? | Unified diff (for patches) |
```json
{
"type": "file_ref",
"path": "/src/main.ts",
"action": "write",
"diff": "@@ -1,3 +1,4 @@\n+import { foo } from 'bar';"
}
```
### image
Image reference.
| Field | Type | Description |
|-------|------|-------------|
| `path` | string | Image file path |
| `mime` | string? | MIME type |
```json
{ "type": "image", "path": "/tmp/screenshot.png", "mime": "image/png" }
```
### reasoning
Model reasoning/thinking content.
| Field | Type | Description |
|-------|------|-------------|
| `text` | string | Reasoning text |
| `visibility` | string | `public` or `private` |
```json
{ "type": "reasoning", "text": "Let me think about this...", "visibility": "public" }
```
### status
Status indicator.
| Field | Type | Description |
|-------|------|-------------|
| `label` | string | Status label |
| `detail` | string? | Additional detail |
```json
{ "type": "status", "label": "Running tests", "detail": "3 of 10 passed" }
```
## Source & Synthetics
### EventSource
The `source` field indicates who emitted the event:
| Value | Description |
|-------|-------------|
| `agent` | Native event from the agent |
| `daemon` | Synthetic event generated by the daemon |
### Synthetic Events
The daemon emits synthetic events (`synthetic: true`, `source: "daemon"`) to provide a consistent event stream across all agents. Common synthetics:
| Synthetic | When |
|-----------|------|
| `session.started` | Agent doesn't emit explicit session start |
| `session.ended` | Agent doesn't emit explicit session end |
| `turn.started` | Agent doesn't emit explicit turn start |
| `turn.ended` | Agent doesn't emit explicit turn end |
| `item.started` | Agent doesn't emit item start events |
| `item.delta` | Agent doesn't stream deltas natively |
| `question.*` | Claude Code plan mode (from ExitPlanMode tool) |
### Raw Payloads
Pass `include_raw=true` to event endpoints to receive the original agent payload in the `raw` field. Useful for debugging or accessing agent-specific data not in the universal schema.
```typescript
const events = await client.getEvents("my-session", { includeRaw: true });
// events[0].raw contains the original agent payload
```


@ -1,87 +1,79 @@
---
title: "Skills"
description: "Auto-load skills into agent sessions."
description: "Configure skill sources for agent sessions."
sidebarTitle: "Skills"
icon: "sparkles"
---
Skills are local instruction bundles stored in `SKILL.md` files. Sandbox Agent can fetch, discover, and link skill directories into agent-specific skill paths at session start using the `skills.sources` field. The format is fully compatible with [skills.sh](https://skills.sh).
Skills are local instruction bundles stored in `SKILL.md` files.
## Session Config
## Configuring skills
Pass `skills.sources` when creating a session to load skills from GitHub repos, local paths, or git URLs.
Use `setSkillsConfig` / `getSkillsConfig` / `deleteSkillsConfig` to manage skill source config by directory + skill name.
<CodeGroup>
```ts TypeScript
```ts
import { SandboxAgent } from "sandbox-agent";
const client = await SandboxAgent.connect({
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
token: process.env.SANDBOX_TOKEN,
});
await client.createSession("claude-skills", {
agent: "claude",
skills: {
// Add a skill
await sdk.setSkillsConfig(
{
directory: "/workspace",
skillName: "default",
},
{
sources: [
{ type: "github", source: "rivet-dev/skills", skills: ["sandbox-agent"] },
{ type: "local", source: "/workspace/my-custom-skill" },
],
},
);
// Create a session using the configured skills
const session = await sdk.createSession({
agent: "claude",
cwd: "/workspace",
});
await session.prompt([
{ type: "text", text: "Use available skills to help with this task." },
]);
// List skills
const config = await sdk.getSkillsConfig({
directory: "/workspace",
skillName: "default",
});
console.log(config.sources.length);
// Delete skill
await sdk.deleteSkillsConfig({
directory: "/workspace",
skillName: "default",
});
```
```bash cURL
curl -X POST "http://127.0.0.1:2468/v1/sessions/claude-skills" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"agent": "claude",
"skills": {
"sources": [
{ "type": "github", "source": "rivet-dev/skills", "skills": ["sandbox-agent"] },
{ "type": "local", "source": "/workspace/my-custom-skill" }
]
}
}'
```
## Skill sources
</CodeGroup>
Each skill directory must contain `SKILL.md`. See [Skill authoring best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for tips on writing effective skills.
## Skill Sources
Each entry in `skills.sources` describes where to find skills. Three source types are supported:
Each `skills.sources` entry describes where to find skills.
| Type | `source` value | Example |
|------|---------------|---------|
| `github` | `owner/repo` | `"rivet-dev/skills"` |
| `local` | Filesystem path | `"/workspace/my-skill"` |
| `git` | Git clone URL | `"https://git.example.com/skills.git"` |
| `local` | filesystem path | `"/workspace/my-skill"` |
| `git` | git clone URL | `"https://git.example.com/skills.git"` |
### Optional fields
Optional fields:
- **`skills`** — Array of skill directory names to include. When omitted, all discovered skills are installed.
- **`ref`** — Branch, tag, or commit to check out (default: HEAD). Applies to `github` and `git` types.
- **`subpath`** — Subdirectory within the repo to search for skills.
- `skills`: subset of skill directory names to include
- `ref`: branch/tag/commit (for `github` and `git`)
- `subpath`: subdirectory within repo to scan
## Custom Skills
## Custom skills
To write, upload, and configure your own skills inside the sandbox, see [Custom Tools](/custom-tools).
## Advanced
### Discovery logic
After resolving a source to a local directory (cloning if needed), Sandbox Agent discovers skills by:
1. Checking if the directory itself contains `SKILL.md`.
2. Scanning `skills/` subdirectory for child directories containing `SKILL.md`.
3. Scanning immediate children of the directory for `SKILL.md`.
Discovered skills are symlinked into project-local skill roots (`.claude/skills/<name>`, `.agents/skills/<name>`, `.opencode/skill/<name>`).
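The three discovery steps can be sketched as follows. This is a minimal illustration, not the actual implementation: the `discover_skills` helper is hypothetical, and it assumes the steps are cumulative, that the directory name serves as the skill name, and that the first match wins when two directories share a name.

```python
import tempfile
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map skill name -> skill directory for every directory holding SKILL.md."""
    found: dict[str, Path] = {}

    def consider(directory: Path) -> None:
        # Assumption: the first directory seen for a given name wins.
        if (directory / "SKILL.md").is_file():
            found.setdefault(directory.name, directory)

    # Step 1: the resolved directory itself may be a skill.
    consider(root)
    # Step 2: children of `skills/`; step 3: immediate children of the root.
    for parent in (root / "skills", root):
        if parent.is_dir():
            for child in sorted(parent.iterdir()):
                if child.is_dir():
                    consider(child)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("skills/alpha", "beta", "not-a-skill"):
        (root / rel).mkdir(parents=True)
    (root / "skills" / "alpha" / "SKILL.md").write_text("# alpha\n")
    (root / "beta" / "SKILL.md").write_text("# beta\n")
    names = sorted(discover_skills(root))
    print(names)
```

Linking each discovered directory into the agent-specific skill roots is left out of the sketch.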
### Caching
GitHub sources are downloaded as zip archives and git sources are cloned to `~/.sandbox-agent/skills-cache/` and updated on subsequent session creations. GitHub sources do not require `git` to be installed.


@ -20,7 +20,6 @@ body {
color: var(--sa-text);
}
/*
a {
color: var(--sa-primary);
}
@ -41,6 +40,13 @@ select {
color: var(--sa-text);
}
code,
pre {
background-color: var(--sa-card);
border: 1px solid var(--sa-border);
color: var(--sa-text);
}
.card,
.mintlify-card,
.docs-card {
@ -64,4 +70,3 @@ select {
.alert-danger {
border-color: var(--sa-danger);
}
*/


@ -29,25 +29,6 @@ Verify the agent is installed:
ls -la ~/.local/share/sandbox-agent/bin/
```
### 4. Binary libc mismatch (musl vs glibc)
Claude Code binaries are available in both musl and glibc variants. If you see errors like:
```
cannot execute: required file not found
Error loading shared library libstdc++.so.6: No such file or directory
```
This means the wrong binary variant was downloaded.
**For sandbox-agent 0.2.0+**: Platform detection is automatic. The correct binary (musl or glibc) is downloaded based on the runtime environment.
**For sandbox-agent 0.1.x**: Use Alpine Linux which has native musl support:
```dockerfile
FROM alpine:latest
RUN apk add --no-cache curl ca-certificates libstdc++ libgcc bash
```
## Daytona Network Restrictions


@ -6,12 +6,17 @@
- Do not bind mount host files or host directories into Docker example containers.
- If an example needs tools, skills, or MCP servers, install them inside the container during setup.
## Testing Examples
## Testing Examples (ACP v2)
Examples can be tested by starting them in the background and communicating directly with the sandbox-agent API:
Examples should be validated against v2 endpoints:
1. Start the example: `SANDBOX_AGENT_DEV=1 pnpm start &`
2. Note the base URL and session ID from the output.
3. Send messages: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/messages -H "Content-Type: application/json" -d '{"message":"..."}'`
4. Poll events: `curl http://127.0.0.1:<port>/v1/sessions/<sessionId>/events`
5. Approve permissions: `curl -X POST http://127.0.0.1:<port>/v1/sessions/<sessionId>/permissions/<permissionId>/reply -H "Content-Type: application/json" -d '{"reply":"once"}'`
1. Start the example: `SANDBOX_AGENT_DEV=1 pnpm start`
2. Create an ACP client by POSTing `initialize` to `/v2/rpc` with `x-acp-agent: mock` (or another installed agent).
3. Capture `x-acp-connection-id` from the response headers.
4. Open SSE stream: `GET /v2/rpc` with `x-acp-connection-id`.
5. Send `session/new` then `session/prompt` via `POST /v2/rpc` with the same connection id.
6. Close connection via `DELETE /v2/rpc` with `x-acp-connection-id`.
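The request sequence above can be written down as data before wiring it to an HTTP client. The sketch below only builds the planned requests and sends nothing. The `PlannedRequest` type and both helpers are hypothetical, the JSON-RPC parameter shapes are assumptions modelled on common ACP payloads, and the connection and session ids are placeholders for values that would come from earlier responses.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class PlannedRequest:
    """One HTTP call in the validation flow; nothing is sent by this sketch."""
    method: str
    path: str
    headers: dict[str, str] = field(default_factory=dict)
    body: Optional[dict[str, Any]] = None


def rpc(request_id: int, method: str, params: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def plan_validation(connection_id: str, session_id: str, agent: str = "mock") -> list[PlannedRequest]:
    conn = {"x-acp-connection-id": connection_id}
    prompt = [{"type": "text", "text": "Say hello."}]  # assumed prompt shape
    return [
        # Step 2: initialize; step 3 reads the connection id from the response headers.
        PlannedRequest("POST", "/v2/rpc", {"x-acp-agent": agent},
                       rpc(1, "initialize", {"protocolVersion": 1})),
        # Step 4: open the SSE stream.
        PlannedRequest("GET", "/v2/rpc", dict(conn)),
        # Step 5: create a session, then prompt it.
        PlannedRequest("POST", "/v2/rpc", dict(conn),
                       rpc(2, "session/new", {"cwd": "/workspace", "mcpServers": []})),
        PlannedRequest("POST", "/v2/rpc", dict(conn),
                       rpc(3, "session/prompt", {"sessionId": session_id, "prompt": prompt})),
        # Step 6: close the connection.
        PlannedRequest("DELETE", "/v2/rpc", dict(conn)),
    ]


plan = plan_validation("conn-123", "sess-1")
for request in plan:
    print(request.method, request.path, sorted(request.headers))
```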
v1 reminder:
- `/v1/*` is removed and returns `410 Gone`.

examples/boxlite-python/.gitignore

@ -0,0 +1,4 @@
__pycache__/
*.pyc
.venv/
oci-image/


@ -0,0 +1,5 @@
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
RUN sandbox-agent install-agent claude
RUN sandbox-agent install-agent codex


@ -0,0 +1,145 @@
"""Minimal JSON-RPC client for sandbox-agent's streamable HTTP transport."""
import json
import threading
import time
import uuid

import httpx


class SandboxConnection:
    """Connects to a sandbox-agent server via JSON-RPC over streamable HTTP.

    Endpoints used:
        POST /v1/acp/{server_id}?agent=... (bootstrap + requests)
        GET /v1/acp/{server_id} (SSE event stream)
        DELETE /v1/acp/{server_id} (close)
    """

    def __init__(self, base_url: str, agent: str):
        self.base_url = base_url.rstrip("/")
        self.agent = agent
        self.server_id = f"py-{uuid.uuid4().hex[:8]}"
        self.url = f"{self.base_url}/v1/acp/{self.server_id}"
        self._next_id = 0
        self._events: list[dict] = []
        self._stop = threading.Event()
        self._sse_thread: threading.Thread | None = None

    def _alloc_id(self) -> int:
        self._next_id += 1
        return self._next_id

    def _post(self, method: str, params: dict | None = None, *, bootstrap: bool = False) -> dict:
        payload: dict = {
            "jsonrpc": "2.0",
            "id": self._alloc_id(),
            "method": method,
        }
        if params is not None:
            payload["params"] = params
        url = f"{self.url}?agent={self.agent}" if bootstrap else self.url
        r = httpx.post(url, json=payload, timeout=120)
        r.raise_for_status()
        body = r.text.strip()
        return json.loads(body) if body else {}

    # -- Lifecycle -----------------------------------------------------------

    def initialize(self) -> dict:
        result = self._post(
            "initialize",
            {
                "protocolVersion": 1,
                "clientInfo": {"name": "python-example", "version": "0.1.0"},
            },
            bootstrap=True,
        )
        self._start_sse()
        # Auto-authenticate if the agent advertises env-var-based auth methods.
        auth_methods = result.get("result", {}).get("authMethods", [])
        env_ids = ("anthropic-api-key", "codex-api-key", "openai-api-key")
        for method in auth_methods:
            if method.get("id") not in env_ids:
                continue
            try:
                resp = self._post("authenticate", {"methodId": method["id"]})
                if "error" not in resp:
                    break
            except Exception:
                continue
        return result

    def new_session(self, cwd: str = "/root") -> str:
        result = self._post("session/new", {"cwd": cwd, "mcpServers": []})
        if "error" in result:
            raise RuntimeError(f"session/new failed: {result['error'].get('message', result['error'])}")
        return result["result"]["sessionId"]

    def prompt(self, session_id: str, text: str) -> dict:
        result = self._post(
            "session/prompt",
            {
                "sessionId": session_id,
                "prompt": [{"type": "text", "text": text}],
            },
        )
        return result

    def close(self) -> None:
        self._stop.set()
        try:
            httpx.delete(self.url, timeout=2)
        except Exception:
            pass

    # -- SSE event stream (background thread) --------------------------------

    @property
    def events(self) -> list[dict]:
        return list(self._events)

    def _start_sse(self) -> None:
        self._sse_thread = threading.Thread(target=self._sse_loop, daemon=True)
        self._sse_thread.start()

    def _sse_loop(self) -> None:
        while not self._stop.is_set():
            try:
                with httpx.stream(
                    "GET",
                    self.url,
                    headers={"Accept": "text/event-stream"},
                    timeout=httpx.Timeout(connect=5, read=None, write=5, pool=5),
                ) as resp:
                    buffer = ""
                    for chunk in resp.iter_text():
                        if self._stop.is_set():
                            break
                        buffer += chunk.replace("\r\n", "\n")
                        while "\n\n" in buffer:
                            event_chunk, buffer = buffer.split("\n\n", 1)
                            self._process_sse_event(event_chunk)
            except Exception:
                if self._stop.is_set():
                    return
                time.sleep(0.15)

    def _process_sse_event(self, chunk: str) -> None:
        data_lines: list[str] = []
        for line in chunk.split("\n"):
            if line.startswith("data:"):
                data_lines.append(line[5:].lstrip())
        if not data_lines:
            return
        payload = "\n".join(data_lines).strip()
        if not payload:
            return
        try:
            self._events.append(json.loads(payload))
        except json.JSONDecodeError:
            pass


@ -0,0 +1,32 @@
"""Agent detection and credential helpers for sandbox-agent examples."""
import os
import sys


def detect_agent() -> str:
    """Pick an agent based on env vars. Exits if no credentials are found."""
    if os.environ.get("SANDBOX_AGENT"):
        return os.environ["SANDBOX_AGENT"]
    has_claude = bool(
        os.environ.get("ANTHROPIC_API_KEY")
        or os.environ.get("CLAUDE_API_KEY")
        or os.environ.get("CLAUDE_CODE_OAUTH_TOKEN")
    )
    has_codex = (os.environ.get("OPENAI_API_KEY") or "").startswith("sk-")
    if has_codex:
        return "codex"
    if has_claude:
        return "claude"
    print("No API keys found. Set ANTHROPIC_API_KEY or OPENAI_API_KEY.")
    sys.exit(1)


def build_box_env() -> list[tuple[str, str]]:
    """Collect credential env vars to forward into the BoxLite sandbox."""
    env: list[tuple[str, str]] = []
    for key in ("ANTHROPIC_API_KEY", "CLAUDE_API_KEY", "OPENAI_API_KEY", "CODEX_API_KEY"):
        val = os.environ.get(key)
        if val:
            env.append((key, val))
    return env


@ -0,0 +1,110 @@
"""
Sandbox Agent Python + BoxLite example.
Builds a Docker image, exports it to OCI layout, runs it inside a BoxLite
sandbox, connects to the sandbox-agent server, creates a session, and sends a prompt.
Usage:
pip install -r requirements.txt
python main.py
"""
import asyncio
import json
import signal
import time

import boxlite
import httpx

from client import SandboxConnection
from credentials import build_box_env, detect_agent
from setup_image import OCI_DIR, setup_image

PORT = 3000


def wait_for_health(base_url: str, timeout_s: float = 120) -> None:
    deadline = time.monotonic() + timeout_s
    last_err: str | None = None
    while time.monotonic() < deadline:
        try:
            r = httpx.get(f"{base_url}/v1/health", timeout=5)
            if r.status_code == 200 and r.json().get("status") == "ok":
                return
            last_err = f"health returned {r.status_code}"
        except Exception as exc:
            last_err = str(exc)
        time.sleep(0.5)
    raise RuntimeError(f"Timed out waiting for /v1/health: {last_err}")


async def main() -> None:
    agent = detect_agent()
    print(f"Agent: {agent}")
    setup_image()
    env = build_box_env()
    print("Creating BoxLite sandbox...")
    box = boxlite.SimpleBox(
        rootfs_path=OCI_DIR,
        env=env,
        ports=[(PORT, PORT, "tcp")],
    )
    async with box:
        print("Starting server...")
        result = await box.exec(
            "sh", "-c",
            f"nohup sandbox-agent server --no-token --host 0.0.0.0 --port {PORT} "
            ">/tmp/sandbox-agent.log 2>&1 &",
        )
        if result.exit_code != 0:
            raise RuntimeError(f"Failed to start server: {result.stderr}")
        base_url = f"http://localhost:{PORT}"
        print("Waiting for server...")
        wait_for_health(base_url)
        print("Server ready.")
        print(f"Inspector: {base_url}/ui/")

        # -- Session flow ----------------------------------------------------
        conn = SandboxConnection(base_url, agent)
        print("Connecting...")
        init_result = conn.initialize()
        agent_info = init_result.get("result", {}).get("agentInfo", {})
        print(f"Connected to: {agent_info.get('title', agent)} {agent_info.get('version', '')}")
        session_id = conn.new_session()
        print(f"Session: {session_id}")
        prompt_text = "Say hello and tell me what you are. Be brief (one sentence)."
        print(f"\n> {prompt_text}")
        response = conn.prompt(session_id, prompt_text)
        if "error" in response:
            err = response["error"]
            print(f"Error: {err.get('message', err)}")
        else:
            print(f"Stop reason: {response.get('result', {}).get('stopReason', 'unknown')}")
        # Give SSE events a moment to arrive.
        time.sleep(1)
        if conn.events:
            for ev in conn.events:
                if ev.get("method") == "session/update":
                    content = ev.get("params", {}).get("update", {}).get("content", {})
                    if content.get("text"):
                        print(content["text"], end="")
            print()
        conn.close()
    print("\nDone.")


if __name__ == "__main__":
    asyncio.run(main())


@ -0,0 +1,2 @@
boxlite>=0.5.0
httpx>=0.27.0


@ -0,0 +1,29 @@
"""Build the sandbox-agent Docker image and export it to OCI layout."""
import os
import subprocess

DOCKER_IMAGE = "sandbox-agent-boxlite"
OCI_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "oci-image")


def setup_image() -> None:
    dockerfile_dir = os.path.dirname(os.path.abspath(__file__))
    print(f'Building image "{DOCKER_IMAGE}" (cached after first run)...')
    subprocess.run(
        ["docker", "build", "-t", DOCKER_IMAGE, dockerfile_dir],
        check=True,
    )
    if not os.path.exists(os.path.join(OCI_DIR, "oci-layout")):
        print("Exporting to OCI layout...")
        os.makedirs(OCI_DIR, exist_ok=True)
        subprocess.run(
            [
                "skopeo", "copy",
                f"docker-daemon:{DOCKER_IMAGE}:latest",
                f"oci:{OCI_DIR}:latest",
            ],
            check=True,
        )

examples/boxlite/.gitignore

@ -0,0 +1 @@
oci-image/


@ -0,0 +1,5 @@
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/0.4.x/install.sh | sh
RUN sandbox-agent install-agent claude
RUN sandbox-agent install-agent codex


@ -0,0 +1,19 @@
{
"name": "@sandbox-agent/example-boxlite",
"private": true,
"type": "module",
"scripts": {
"start": "tsx src/index.ts",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@boxlite-ai/boxlite": "latest",
"@sandbox-agent/example-shared": "workspace:*",
"sandbox-agent": "workspace:*"
},
"devDependencies": {
"@types/node": "latest",
"tsx": "latest",
"typescript": "latest"
}
}


@ -0,0 +1,41 @@
import { SimpleBox } from "@boxlite-ai/boxlite";
import { SandboxAgent } from "sandbox-agent";
import { detectAgent, buildInspectorUrl } from "@sandbox-agent/example-shared";
import { setupImage, OCI_DIR } from "./setup-image.ts";

const env: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) env.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) env.OPENAI_API_KEY = process.env.OPENAI_API_KEY;

setupImage();

console.log("Creating BoxLite sandbox...");
const box = new SimpleBox({
  rootfsPath: OCI_DIR,
  env,
  ports: [{ hostPort: 3000, guestPort: 3000 }],
  diskSizeGb: 4,
});

console.log("Starting server...");
const result = await box.exec("sh", "-c", "nohup sandbox-agent server --no-token --host 0.0.0.0 --port 3000 >/tmp/sandbox-agent.log 2>&1 &");
if (result.exitCode !== 0) throw new Error(`Failed to start server: ${result.stderr}`);

const baseUrl = "http://localhost:3000";
console.log("Connecting to server...");
const client = await SandboxAgent.connect({ baseUrl });
const session = await client.createSession({ agent: detectAgent(), cwd: "/root" });
const sessionId = session.id;
console.log(` UI: ${buildInspectorUrl({ baseUrl, sessionId })}`);
console.log(" Press Ctrl+C to stop.");

const keepAlive = setInterval(() => {}, 60_000);
const cleanup = async () => {
  clearInterval(keepAlive);
  await box.stop();
  process.exit(0);
};
process.once("SIGINT", cleanup);
process.once("SIGTERM", cleanup);


@ -0,0 +1,16 @@
import { execSync } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";

export const DOCKER_IMAGE = "sandbox-agent-boxlite";
export const OCI_DIR = new URL("../oci-image", import.meta.url).pathname;

export function setupImage() {
  console.log(`Building image "${DOCKER_IMAGE}" (cached after first run)...`);
  execSync(`docker build -t ${DOCKER_IMAGE} ${new URL("..", import.meta.url).pathname}`, { stdio: "inherit" });
  if (!existsSync(`${OCI_DIR}/oci-layout`)) {
    console.log("Exporting to OCI layout...");
    mkdirSync(OCI_DIR, { recursive: true });
    execSync(`docker save ${DOCKER_IMAGE} | tar -xf - -C ${OCI_DIR}`, { stdio: "inherit" });
  }
}
