diff --git a/.context/proposal-revert-actions-to-queues.md b/.context/proposal-revert-actions-to-queues.md
deleted file mode 100644
index 65a28e2..0000000
--- a/.context/proposal-revert-actions-to-queues.md
+++ /dev/null
@@ -1,202 +0,0 @@
-# Proposal: Revert Actions-Only Pattern Back to Queues/Workflows
-
-## Background
-
-We converted all actors from queue/workflow-based communication to direct actions as a workaround for a RivetKit bug where `c.queue.iter()` deadlocked for actors created from another actor's context. That bug has since been fixed in RivetKit. We want to revert to queues/workflows because they provide better observability (workflow history in the inspector), replay/recovery semantics, and are the idiomatic RivetKit pattern.
-
-## Reference branches
-
-- **`main`** at commit `32f3c6c3` — the original queue/workflow code BEFORE the actions refactor
-- **`queues-to-actions`** — the actions refactor code with bug fixes (E2B, lazy tasks, etc.)
-- **`task-owner-git-auth`** at commit `3684e2e5` — the CURRENT branch with all work including task owner system, lazy tasks, and actions refactor
-
-Use `main` as the reference for the queue/workflow communication patterns. Use `task-owner-git-auth` (current HEAD) as the authoritative source for ALL features and bug fixes that MUST be preserved — it has everything from `queues-to-actions` plus the task owner system.
-
-## What to KEEP (do NOT revert these)
-
-These are bug fixes and improvements made during the actions refactor that are independent of the communication pattern:
-
-### 1. Lazy task actor creation
-- Virtual task entries in org's `taskIndex` + `taskSummaries` tables (no actor fan-out during PR sync)
-- `refreshTaskSummaryForBranchMutation` writes directly to org tables instead of spawning task actors
-- Task actors self-initialize in `getCurrentRecord()` from `getTaskIndexEntry` when lazily created
-- `getTaskIndexEntry` action on org actor
-- See CLAUDE.md "Lazy Task Actor Creation" section
-
-### 2. `resolveTaskRepoId` replacing `requireRepoExists`
-- `requireRepoExists` was removed — it did a cross-actor call from org to github-data that was fragile
-- Replaced with `resolveTaskRepoId` which reads from the org's local `taskIndex` table
-- `getTask` action resolves `repoId` from task index when not provided (sandbox actor only has taskId)
-
-### 3. `getOrganizationContext` overrides threaded through sync phases
-- `fullSyncBranchBatch`, `fullSyncMembers`, `fullSyncPullRequestBatch` now pass `connectedAccount`, `installationStatus`, `installationId` overrides from `FullSyncConfig`
-- Without this, phases 2-4 fail with "Organization not initialized" when the org profile doesn't exist yet (webhook-triggered sync before user sign-in)
-
-### 4. E2B sandbox fixes
-- `timeoutMs: 60 * 60 * 1000` in E2B create options (TEMPORARY until rivetkit autoPause lands)
-- Sandbox repo path uses `/home/user/repo` for E2B compatibility
-- `listProcesses` error handling for expired E2B sandboxes
-
-### 5. Frontend fixes
-- React `useEffect` dependency stability in `mock-layout.tsx` and `organization-dashboard.tsx` (prevents infinite re-render loops)
-- Terminal pane ref handling
-
-### 6. Process crash protection
-- `process.on("uncaughtException")` and `process.on("unhandledRejection")` handlers in `foundry/packages/backend/src/index.ts`
-
-### 7. CLAUDE.md updates
-- All new sections: lazy task creation rules, no-silent-catch policy, React hook dependency safety, dev workflow instructions, debugging section
-
-### 8. `requireWorkspaceTask` uses `getOrCreate`
-- User-initiated actions (createSession, sendMessage, etc.) use `getOrCreate` to lazily materialize virtual tasks
-- The `getOrCreate` call passes `{ organizationId, repoId, taskId }` as `createWithInput`
-
-### 9. `getTask` uses `getOrCreate` with `resolveTaskRepoId`
-- When `repoId` is not provided (sandbox actor), resolves from task index
-- Uses `getOrCreate` since the task may be virtual
-
-### 10. Audit log deleted workflow file
-- `foundry/packages/backend/src/actors/audit-log/workflow.ts` was deleted
-- The audit-log actor was simplified to a single `append` action
-- Keep this simplification — audit-log doesn't need a workflow
-
-### 11. Task owner (primary user) system
-- New `task_owner` single-row table in task actor DB schema (`foundry/packages/backend/src/actors/task/db/schema.ts`) — stores `primaryUserId`, `primaryGithubLogin`, `primaryGithubEmail`, `primaryGithubAvatarUrl`
-- New migration in `foundry/packages/backend/src/actors/task/db/migrations.ts` creating the `task_owner` table
-- `primaryUserLogin` and `primaryUserAvatarUrl` columns added to org's `taskSummaries` table (`foundry/packages/backend/src/actors/organization/db/schema.ts`) + corresponding migration
-- `readTaskOwner()`, `upsertTaskOwner()` helpers in `workspace.ts`
-- `maybeSwapTaskOwner()` — called from `sendWorkspaceMessage()`, checks if a different user is sending and swaps owner + injects git credentials into sandbox
-- `changeTaskOwnerManually()` — called from the new `changeOwner` action on the task actor, updates owner without injecting credentials (credentials injected on next message from that user)
-- `injectGitCredentials()` — pushes `git config user.name/email` + credential store file into the sandbox via `runProcess`
-- `resolveGithubIdentity()` — resolves user's GitHub login/email/avatar/accessToken from their auth session
-- `buildTaskSummary()` now includes `primaryUserLogin` and `primaryUserAvatarUrl` in the summary pushed to org coordinator
-- New `changeOwner` action on task actor in `workflow/index.ts`
-- New `changeWorkspaceTaskOwner` action on org actor in `actions/tasks.ts`
-- New `TaskWorkspaceChangeOwnerInput` type in shared types (`foundry/packages/shared/src/workspace.ts`)
-- `TaskSummary` type extended with `primaryUserLogin` and `primaryUserAvatarUrl`
-
-### 12. Task owner UI
-- New "Overview" tab in right sidebar (`foundry/packages/frontend/src/components/mock-layout/right-sidebar.tsx`) — shows current owner with avatar, click to open dropdown of org members to change owner
-- `onChangeOwner` and `members` props added to `RightSidebar` component
-- Primary user login shown in green in left sidebar task items (`foundry/packages/frontend/src/components/mock-layout/sidebar.tsx`)
-- `changeWorkspaceTaskOwner` method added to backend client and workspace client interfaces
-
-### 13. Client changes for task owner
-- `changeWorkspaceTaskOwner()` added to `backend-client.ts` and all workspace client implementations (mock, remote)
-- Mock workspace client implements the owner change
-- Subscription manager test updated for new task summary shape
-
-## What to REVERT (communication pattern only)
-
-For each actor, revert from direct action calls back to queue sends with `expectQueueResponse` / fire-and-forget patterns. The reference for the queue patterns is `main` at `32f3c6c3`.
-
-### 1. Organization actor (`foundry/packages/backend/src/actors/organization/`)
-
-**`index.ts`:**
-- Revert from actions-only to `run: workflow(runOrganizationWorkflow)`
-- Keep the actions that are pure reads (getAppSnapshot, getOrganizationSummarySnapshot, etc.)
-- Mutations should go through the workflow queue command loop
-
-**`workflow.ts`:**
-- Restore `runOrganizationWorkflow` with the `ctx.loop("organization-command-loop", ...)` that dispatches queue names to mutation handlers
-- Restore `ORGANIZATION_QUEUE_NAMES` and `COMMAND_HANDLERS`
-- Restore `organizationWorkflowQueueName()` helper
-
-**`app-shell.ts`:**
-- Revert direct action calls back to queue sends: `sendOrganizationCommand(org, "organization.command.X", body)` pattern
-- Revert `githubData.syncRepos(...)` → `githubData.send(githubDataWorkflowQueueName("syncRepos"), ...)`
-- But KEEP the `getOrganizationContext` override threading fix
-
-**`actions/tasks.ts`:**
-- Keep `resolveTaskRepoId` (replacing `requireRepoExists`)
-- Keep `requireWorkspaceTask` using `getOrCreate`
-- Keep `getTask` using `getOrCreate` with `resolveTaskRepoId`
-- Keep `getTaskIndexEntry`
-- Keep `changeWorkspaceTaskOwner` (new action — delegates to task actor's `changeOwner`)
-- Revert task actor calls from direct actions to queue sends where applicable
-
-**`actions/task-mutations.ts`:**
-- Keep lazy task creation (virtual entries in org tables)
-- Revert `taskHandle.initialize(...)` → `taskHandle.send(taskWorkflowQueueName("task.command.initialize"), ...)`
-- Revert `task.pullRequestSync(...)` → `task.send(taskWorkflowQueueName("task.command.pullRequestSync"), ...)`
-- Revert `auditLog.append(...)` → `auditLog.send("auditLog.command.append", ...)`
-
-**`actions/organization.ts`:**
-- Revert direct calls to org workflow back to queue sends
-
-**`actions/github.ts`:**
-- Revert direct calls back to queue sends
-
-### 2. Task actor (`foundry/packages/backend/src/actors/task/`)
-
-**`index.ts`:**
-- Revert from actions-only to `run: workflow(runTaskWorkflow)` (or plain `run` with queue iteration)
-- Keep read actions: `get`, `getTaskSummary`, `getTaskDetail`, `getSessionDetail`
-
-**`workflow/index.ts`:**
-- Restore `taskCommandActions` as queue handlers in the workflow command loop
-- Restore `TASK_QUEUE_NAMES` and dispatch map
-- Add `changeOwner` to the queue dispatch map (new command, not in `main` — add as `task.command.changeOwner`)
-
-**`workspace.ts`:**
-- Revert sandbox/org action calls back to queue sends where they were queue-based before
-- Keep ALL task owner code: `readTaskOwner`, `upsertTaskOwner`, `maybeSwapTaskOwner`, `changeTaskOwnerManually`, `injectGitCredentials`, `resolveGithubIdentity`
-- Keep the `authSessionId` param added to `ensureSandboxRepo`
-- Keep the `maybeSwapTaskOwner` call in `sendWorkspaceMessage`
-- Keep `primaryUserLogin`/`primaryUserAvatarUrl` in `buildTaskSummary`
-
-### 3. User actor (`foundry/packages/backend/src/actors/user/`)
-
-**`index.ts`:**
-- Revert from actions-only to `run: workflow(runUserWorkflow)` (or plain run with queue iteration)
-
-**`workflow.ts`:**
-- Restore queue command loop dispatching to mutation functions
-
-### 4. GitHub-data actor (`foundry/packages/backend/src/actors/github-data/`)
-
-**`index.ts`:**
-- Revert from actions-only to having a run handler with queue iteration
-- Keep the `getOrganizationContext` override threading fix
-- Keep the `actionTimeout: 10 * 60_000` for long sync operations
-
-### 5. Audit-log actor
-- Keep as actions-only (simplified). No need to revert — it's simpler with just `append`.
-
-### 6. Callers
-
-**`foundry/packages/backend/src/services/better-auth.ts`:**
-- Revert direct user actor action calls back to queue sends
-
-**`foundry/packages/backend/src/actors/sandbox/index.ts`:**
-- Revert `organization.getTask(...)` → queue send if it was queue-based before
-- Keep the E2B timeout fix and listProcesses error handling
-
-## Step-by-step procedure
-
-1. Create a new branch from `task-owner-git-auth` (current HEAD)
-2. For each actor, open a 3-way comparison: `main` (original queues), `queues-to-actions` (current), and your working copy
-3. Restore queue/workflow run handlers and command loops from `main`
-4. Restore queue name helpers and constants from `main`
-5. Restore caller sites to use queue sends from `main`
-6. Carefully preserve all items in the "KEEP" list above
-7. Test: `cd foundry && docker compose -f compose.dev.yaml up -d`, sign in, verify GitHub sync completes, verify tasks show in sidebar, verify session creation works
-8. Nuke RivetKit data between test runs: `docker volume rm foundry_foundry_rivetkit_storage`
-
-## Verification checklist
-
-- [ ] GitHub sync completes (160 repos for rivet-dev)
-- [ ] Tasks show in sidebar (from PR sync, lazy/virtual entries)
-- [ ] No task actors spawned during sync (check RivetKit inspector — should see 0 task actors until user clicks one)
-- [ ] Clicking a task materializes the actor (lazy creation via getOrCreate)
-- [ ] Session creation works on sandbox-agent-testing repo
-- [ ] E2B sandbox provisions and connects
-- [ ] Agent responds to messages
-- [ ] No 500 errors in backend logs (except expected E2B sandbox expiry)
-- [ ] Workflow history visible in RivetKit inspector for org, task, user actors
-- [ ] CLAUDE.md constraints still documented and respected
-- [ ] Task owner shows in right sidebar "Overview" tab
-- [ ] Owner dropdown shows org members and allows switching
-- [ ] Sending a message as a different user swaps the owner
-- [ ] Primary user login shown in green on sidebar task items
-- [ ] Git credentials injected into sandbox on owner swap (check `/home/user/.git-token` exists)
diff --git a/.context/proposal-rivetkit-sandbox-resilience.md b/.context/proposal-rivetkit-sandbox-resilience.md
deleted file mode 100644
index 1c94982..0000000
--- a/.context/proposal-rivetkit-sandbox-resilience.md
+++ /dev/null
@@ -1,94 +0,0 @@
-# Proposal: RivetKit Sandbox Actor Resilience
-
-## Context
-
-The rivetkit sandbox actor (`src/sandbox/actor.ts`) does not handle the case where the underlying cloud sandbox (e.g. E2B VM) is destroyed while the actor is still alive. This causes cascading 500 errors when the actor tries to call the dead sandbox. Additionally, a UNIQUE constraint bug in event persistence crashes the host process.
-
-The sandbox-agent repo (which defines the E2B provider) will be updated separately to use `autoPause` and expose `pause()`/typed errors. This proposal covers the rivetkit-side changes needed to handle those signals.
-
-## Changes
-
-### 1. Fix `persistObservedEnvelope` UNIQUE constraint crash
-
-**File:** `insertEvent` in the sandbox actor's SQLite persistence layer
-
-The `sandbox_agent_events` table has a UNIQUE constraint on `(session_id, event_index)`. When the same event is observed twice (reconnection, replay, duplicate WebSocket delivery), the insert throws and crashes the host process as an unhandled rejection.
-
-**Fix:** Change the INSERT to `INSERT OR IGNORE` / `ON CONFLICT DO NOTHING`. Duplicate events are expected and harmless — they should be silently deduplicated at the persistence layer.
-
-### 2. Handle destroyed sandbox in `ensureAgent()`
-
-**File:** `src/sandbox/actor.ts` — `ensureAgent()` function
-
-When the provider's `start()` is called with an existing `sandboxId` and the sandbox no longer exists, the provider throws a typed `SandboxDestroyedError` (defined in the sandbox-agent provider contract).
-
- `ensureAgent()` should catch this error and check the `onSandboxExpired` config option:
-
-```typescript
-// New config option on sandboxActor()
-onSandboxExpired?: "destroy" | "recreate"; // default: "destroy"
-```
-
-**`"destroy"` (default):**
-- Set `state.sandboxDestroyed = true`
-- Emit `sandboxExpired` event to all connected clients
-- All subsequent action calls (runProcess, createSession, etc.) return a clear error: "Sandbox has expired. Create a new task to continue."
-- The sandbox actor stays alive (preserves session history, audit log) but rejects new work
-
-**`"recreate"`:**
-- Call provider `create()` to provision a fresh sandbox
-- Store new `sandboxId` in state
-- Emit `sandboxRecreated` event to connected clients with a notice that sessions are lost (new VM, no prior state)
-- Resume normal operation with the new sandbox
-
-### 3. Expose `pause` action
-
-**File:** `src/sandbox/actor.ts` — actions
-
-Add a `pause` action that delegates to the provider's `pause()` method. This is user-initiated only (e.g. user clicks "Pause sandbox" in UI to save credits). The sandbox actor should never auto-pause.
-
-```typescript
-async pause(c) {
-  await c.provider.pause();
-  state.sandboxPaused = true;
-  c.broadcast("sandboxPaused", {});
-}
-```
-
-### 4. Expose `resume` action
-
-**File:** `src/sandbox/actor.ts` — actions
-
-Add a `resume` action for explicit recovery. Calls `provider.start({ sandboxId: state.sandboxId })` which auto-resumes if paused.
-
-```typescript
-async resume(c) {
-  await ensureAgent(c); // handles reconnect internally
-  state.sandboxPaused = false;
-  c.broadcast("sandboxResumed", {});
-}
-```
-
-### 5. Keep-alive while sessions are active
-
-**File:** `src/sandbox/actor.ts`
-
-While the sandbox actor has connected WebSocket clients, periodically extend the underlying sandbox TTL to prevent it from being garbage collected mid-session.
-
-- On first client connect: start a keep-alive interval (e.g. every 2 minutes)
-- Each tick: call `provider.extendTimeout(extensionMs)` (the provider maps this to `sandbox.setTimeout()` for E2B)
-- On last client disconnect: clear the interval, let the sandbox idle toward its natural timeout
-
-This prevents the common case where a user is actively working but the sandbox expires because the E2B default timeout (5 min) is too short. The `timeoutMs` in create options is the initial TTL; keep-alive extends it dynamically.
-
-## Key invariant
-
-**Never silently fail.** Every destroyed/expired/error state must be surfaced to connected clients via events. The actor must always tell the UI what happened so the user can act on it. See CLAUDE.md "never silently catch errors" rule.
-
-## Dependencies
-
-These changes depend on the sandbox-agent provider contract exposing:
-- `pause()` method
-- `extendTimeout(ms)` method
-- Typed `SandboxDestroyedError` thrown from `start()` when sandbox is gone
-- `start()` auto-resuming paused sandboxes via `Sandbox.connect(sandboxId)`
diff --git a/.context/proposal-task-owner-git-auth.md b/.context/proposal-task-owner-git-auth.md
deleted file mode 100644
index b2a35c8..0000000
--- a/.context/proposal-task-owner-git-auth.md
+++ /dev/null
@@ -1,200 +0,0 @@
-# Proposal: Task Primary Owner & Git Authentication
-
-## Problem
-
-Sandbox git operations (commit, push, PR creation) require authentication.
-Currently, the sandbox has no user-scoped credentials. The E2B sandbox
-clones repos using the GitHub App installation token, but push operations
-need user-scoped auth so commits are attributed correctly and branch
-protection rules are enforced.
-
-## Design
-
-### Concept: Primary User per Task
-
-Each task has a **primary user** (the "owner"). This is the last user who
-sent a message on the task. Their GitHub OAuth credentials are injected
-into the sandbox for git operations. When the owner changes, the sandbox
-git config and credentials swap to the new user.
-
-### Data Model
-
-**Task actor DB** -- new `task_owner` single-row table:
-- `primaryUserId` (text) -- better-auth user ID
-- `primaryGithubLogin` (text) -- GitHub username (for `git config user.name`)
-- `primaryGithubEmail` (text) -- GitHub email (for `git config user.email`)
-- `primaryGithubAvatarUrl` (text) -- avatar for UI display
-- `updatedAt` (integer)
-
-**Org coordinator** -- add to `taskSummaries` table:
-- `primaryUserLogin` (text, nullable)
-- `primaryUserAvatarUrl` (text, nullable)
-
-### Owner Swap Flow
-
-Triggered when `sendWorkspaceMessage` is called with a different user than
-the current primary:
-
-1. `sendWorkspaceMessage(authSessionId, ...)` resolves user from auth session
-2. Look up user's GitHub identity from auth account table (`providerId = "github"`)
-3. Compare `primaryUserId` with current owner. If different:
-   a. Update `task_owner` row in task actor DB
-   b. Get user's OAuth `accessToken` from auth account
-   c. Push into sandbox via `runProcess`:
-      - `git config user.name "{login}"`
-      - `git config user.email "{email}"`
-      - Write token to `/home/user/.git-token` (or equivalent)
-   d. Push updated task summary to org coordinator (includes `primaryUserLogin`)
-   e. Broadcast `taskUpdated` to connected clients
-4. If same user, no-op (token is still valid)
-
-### Token Injection
-
-The user's GitHub OAuth token (stored in better-auth account table) has
-`repo` scope (verified -- see `better-auth.ts` line 480: `scope: ["read:org", "repo"]`).
-
-This is a standard **OAuth App** flow (not GitHub App OAuth). OAuth App
-tokens do not expire unless explicitly revoked. No refresh logic is needed.
-
-**Injection method:**
-
-On first sandbox repo setup (`ensureSandboxRepo`), configure:
-
-```bash
-# Write token file
-echo "{token}" > /home/user/.git-token
-chmod 600 /home/user/.git-token
-
-# Configure git to use it
-git config --global credential.helper 'store --file=/home/user/.git-token'
-
-# Format: https://{login}:{token}@github.com
-echo "https://{login}:{token}@github.com" > /home/user/.git-token
-```
-
-On owner swap, overwrite `/home/user/.git-token` with new user's credentials.
-
-**Important: git should never prompt for credentials.** The credential
-store file ensures all git operations are auto-authenticated. No
-`GIT_ASKPASS` prompts, no interactive auth.
-
-**Race condition (expected behavior):** If User A sends a message and the
-agent starts a long git operation, then User B sends a message and triggers
-an owner swap, the in-flight git process still has User A's credentials
-(already read from the credential store). The next git operation uses
-User B's credentials. This is expected behavior -- document in comments.
-
-### Token Validity
-
-OAuth App tokens (our flow) do not expire. They persist until the user
-revokes them or the OAuth App is deauthorized. No periodic refresh needed.
-
-If a token becomes invalid (user revokes), git operations will fail with
-a 401. The error surfaces through the standard `ensureSandboxRepo` /
-`runProcess` error path and is displayed in the UI.
-
-### User Removal
-
-When a user is removed from the organization:
-1. Org actor queries active tasks with that user as primary owner
-2. For each, clear the `task_owner` row
-3. Task actor clears the sandbox git credentials (overwrite credential file)
-4. Push updated task summaries to org coordinator
-5. Subsequent git operations fail with "No active owner -- assign an owner to enable git operations"
-
-### UI Changes
-
-**Right sidebar -- new "Overview" tab:**
-- Add as a new tab alongside "Changes" and "All Files"
-- Shows current primary user: avatar, name, login
-- Click on the user -> dropdown of all workspace users (from org member list)
-- Select a user -> triggers explicit owner swap (same flow as message-triggered)
-- Also shows task metadata: branch, repo, created date
-
-**Left sidebar -- task items:**
-- Show primary user's GitHub login in green text next to task name
-- Only shown when there is an active owner
-
-**Task detail header:**
-- Show small avatar of primary user next to task title
-
-### Org Coordinator
-
-`commandApplyTaskSummaryUpdate` already receives the full task summary
-from the task actor. Add `primaryUserLogin` and `primaryUserAvatarUrl`
-to the summary payload. The org writes it to `taskSummaries`. The sidebar
-reads it from the org snapshot.
-
-### Sandbox Architecture Note
-
-Structurally, the system supports multiple sandboxes per task, but in
-practice there is exactly one active sandbox per task. Design the owner
-injection assuming one sandbox. The token is injected into the active
-sandbox only. If multi-sandbox support is needed in the future, extend
-the injection to target specific sandbox IDs.
-
-## Security Considerations
-
-### OAuth Token Scope
-
-The user's GitHub OAuth token has `repo` scope, which grants **full control
-of all private repositories** the user has access to. When injected into
-the sandbox:
-
-- The agent can read/write ANY repo the user has access to, not just the
-  task's target repo
-- The token persists in the sandbox filesystem until overwritten
-- Any process running in the sandbox can read the credential file
-
-**Mitigations:**
-- Credential file has `chmod 600` (owner-read-only)
-- Sandbox is isolated per-task (E2B VM boundary)
-- Token is overwritten on owner swap (old user's token removed)
-- Token is cleared on user removal from org
-- Sandbox has a finite lifetime (E2B timeout + autoPause)
-
-**Accepted risk:** This is the standard trade-off for OAuth-based git
-integrations (same as GitHub Codespaces, Gitpod, etc.). The user consents
-to `repo` scope at sign-in time. Document this in user-facing terms in
-the product's security/privacy page.
-
-### Future: Fine-grained tokens
-
-GitHub supports fine-grained personal access tokens scoped to specific
-repos. A future improvement could mint per-repo tokens instead of using
-the user's full OAuth token. This requires the user to create and manage
-fine-grained tokens, which adds friction. Evaluate based on user feedback.
-
-## Implementation Order
-
-1. Add `task_owner` table to task actor schema + migration
-2. Add `primaryUserLogin` / `primaryUserAvatarUrl` to `taskSummaries` schema + migration
-3. Implement owner swap in `sendWorkspaceMessage` flow
-4. Implement credential injection in `ensureSandboxRepo`
-5. Implement credential swap via `runProcess` on owner change
-6. Implement user removal cleanup in org actor
-7. Add "Overview" tab to right sidebar
-8. Add owner display to left sidebar task items
-9. Add owner picker dropdown in Overview tab
-10. Update org coordinator to propagate owner in task summaries
-
-## Files to Modify
-
-### Backend
-- `foundry/packages/backend/src/actors/task/db/schema.ts` -- add `task_owner` table
-- `foundry/packages/backend/src/actors/task/db/migrations.ts` -- add migration
-- `foundry/packages/backend/src/actors/organization/db/schema.ts` -- add owner columns to `taskSummaries`
-- `foundry/packages/backend/src/actors/organization/db/migrations.ts` -- add migration
-- `foundry/packages/backend/src/actors/task/workspace.ts` -- owner swap logic in `sendWorkspaceMessage`, credential injection in `ensureSandboxRepo`
-- `foundry/packages/backend/src/actors/task/workflow/index.ts` -- wire owner swap action
-- `foundry/packages/backend/src/actors/organization/actions/task-mutations.ts` -- propagate owner in summaries
-- `foundry/packages/backend/src/actors/organization/actions/tasks.ts` -- `sendWorkspaceMessage` owner check
-- `foundry/packages/backend/src/services/better-auth.ts` -- expose `getAccessTokenForSession` for owner lookup
-
-### Shared
-- `foundry/packages/shared/src/types.ts` -- add `primaryUserLogin` to `TaskSummary`
-
-### Frontend
-- `foundry/packages/frontend/src/components/mock-layout/right-sidebar.tsx` -- add Overview tab
-- `foundry/packages/frontend/src/components/organization-dashboard.tsx` -- show owner in sidebar task items
-- `foundry/packages/frontend/src/components/mock-layout.tsx` -- wire Overview tab state
diff --git a/.gitignore b/.gitignore
index 7b6c859..de4d863 100644
--- a/.gitignore
+++ b/.gitignore
@@ -59,3 +59,4 @@ sdks/cli/platforms/*/bin/
 # Foundry desktop app build artifacts
 foundry/packages/desktop/frontend-dist/
 foundry/packages/desktop/src-tauri/sidecars/
+.context/
diff --git a/CLAUDE.md b/CLAUDE.md
index 4935aa5..cff74eb 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -22,19 +22,6 @@
   - `server/packages/sandbox-agent/src/cli.rs`
 - Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions`
APIs).
 
-## E2E Agent Testing
-
-- When asked to test agents e2e and you do not have the API tokens/credentials required, always stop and ask the user where to find the tokens before proceeding.
-
-## ACP Adapter Audit
-
-- `scripts/audit-acp-deps/adapters.json` is the single source of truth for ACP adapter npm packages, pinned versions, and the `@agentclientprotocol/sdk` pin.
-- The Rust fallback install path in `server/packages/agent-management/src/agents.rs` reads adapter entries from `adapters.json` at compile time via `include_str!`.
-- Run `cd scripts/audit-acp-deps && npx tsx audit.ts` to compare our pinned versions against the ACP registry and npm latest.
-- When bumping an adapter version, update `adapters.json` only — the Rust code picks it up automatically.
-- When adding a new agent, add an entry to `adapters.json` (the `_` fallback arm in `install_agent_process_fallback` handles it).
-- When updating the `@agentclientprotocol/sdk` pin, update both `adapters.json` (sdkDeps) and `sdks/acp-http-client/package.json`.
-
 ## Change Tracking
 
 - If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
@@ -43,41 +30,22 @@
 - Regenerate `docs/openapi.json` when HTTP contracts change.
 - Keep `docs/inspector.mdx` and `docs/sdks/typescript.mdx` aligned with implementation.
 - Append blockers/decisions to `research/acp/friction.md` during ACP work.
-- Each agent has its own doc page at `docs/agents/.mdx` listing models, modes, and thought levels. Update the relevant page when changing `fallback_config_options`. To regenerate capability data, run `cd scripts/agent-configs && npx tsx dump.ts`. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
+- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
 - Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.
 
-## Adding Providers
+## Docker Test Image
 
-When adding a new sandbox provider, update all of the following:
+- Docker-backed Rust and TypeScript tests build `docker/test-agent/Dockerfile` directly in-process and cache the image tag only in memory (`OnceLock` in Rust, module-level variable in TypeScript).
+- Do not add cross-process image-build scripts unless there is a concrete need for them.
 
-- `sdks/typescript/src/providers/.ts` — provider implementation
-- `sdks/typescript/package.json` — add `./` export, peerDependencies, peerDependenciesMeta, devDependencies
-- `sdks/typescript/tsup.config.ts` — add entry point and external
-- `sdks/typescript/tests/providers.test.ts` — add test entry
-- `examples//` — create example with `src/index.ts` and `tests/.test.ts`
-- `docs/deploy/.mdx` — create deploy guide
-- `docs/docs.json` — add to Deploy pages navigation
-- `docs/quickstart.mdx` — add tab in "Start the sandbox" step, add credentials entry in "Passing LLM credentials" accordion
+## Common Software Sync
 
-## Adding Agents
-
-When adding a new agent, update all of the following:
-
-- `docs/agents/.mdx` — create agent page with usage snippet and capabilities table
-- `docs/docs.json` — add to the Agents group under Agent
-- `docs/quickstart.mdx` — add tab in the "Create a session and send a prompt" CodeGroup
-
-## Persist Packages (Deprecated)
-
-- The `@sandbox-agent/persist-*` npm packages (`persist-sqlite`, `persist-postgres`, `persist-indexeddb`, `persist-rivet`) are deprecated stubs. They still publish to npm but throw a deprecation error at import time.
-- Driver implementations now live inline in examples and consuming packages:
-  - SQLite: `examples/persist-sqlite/src/persist.ts`
-  - Postgres: `examples/persist-postgres/src/persist.ts`
-  - IndexedDB: `frontend/packages/inspector/src/persist-indexeddb.ts`
-  - Rivet: inlined in `docs/multiplayer.mdx`
-  - In-memory: built into the main `sandbox-agent` SDK (`InMemorySessionPersistDriver`)
-- Docs (`docs/session-persistence.mdx`) link to the example implementations on GitHub instead of referencing the packages.
-- Do not re-add `@sandbox-agent/persist-*` as dependencies anywhere. New persist drivers should be copied into the consuming project directly.
+- These three files must stay in sync:
+  - `docs/common-software.mdx` (user-facing documentation)
+  - `docker/test-common-software/Dockerfile` (packages installed in the test image)
+  - `server/packages/sandbox-agent/tests/common_software.rs` (test assertions)
+- When adding or removing software from `docs/common-software.mdx`, also add/remove the corresponding `apt-get install` line in the Dockerfile and add/remove the test in `common_software.rs`.
+- Run `cargo test -p sandbox-agent --test common_software` to verify.
 
## Install Version References @@ -93,27 +61,20 @@ When adding a new agent, update all of the following: - `docs/sdk-overview.mdx` - `docs/react-components.mdx` - `docs/session-persistence.mdx` - - `docs/architecture.mdx` - `docs/deploy/local.mdx` - `docs/deploy/cloudflare.mdx` - `docs/deploy/vercel.mdx` - `docs/deploy/daytona.mdx` - `docs/deploy/e2b.mdx` - `docs/deploy/docker.mdx` - - `docs/deploy/boxlite.mdx` - - `docs/deploy/modal.mdx` - - `docs/deploy/computesdk.mdx` - `frontend/packages/website/src/components/GetStarted.tsx` - `.claude/commands/post-release-testing.md` - `examples/cloudflare/Dockerfile` - - `examples/boxlite/Dockerfile` - - `examples/boxlite-python/Dockerfile` - `examples/daytona/src/index.ts` - `examples/shared/src/docker.ts` - `examples/docker/src/index.ts` - `examples/e2b/src/index.ts` - `examples/vercel/src/index.ts` - - `sdks/typescript/src/providers/shared.ts` - `scripts/release/main.ts` - `scripts/release/promote-artifacts.ts` - `scripts/release/sdk.ts` diff --git a/docker/inspector-dev/Dockerfile b/docker/inspector-dev/Dockerfile new file mode 100644 index 0000000..b55923f --- /dev/null +++ b/docker/inspector-dev/Dockerfile @@ -0,0 +1,7 @@ +FROM node:22-bookworm-slim + +RUN npm install -g pnpm@10.28.2 + +WORKDIR /app + +CMD ["bash", "-lc", "pnpm install --filter @sandbox-agent/inspector... 
&& cd frontend/packages/inspector && exec pnpm vite --host 0.0.0.0 --port 5173"] diff --git a/docker/runtime/Dockerfile b/docker/runtime/Dockerfile index e0a3335..85473be 100644 --- a/docker/runtime/Dockerfile +++ b/docker/runtime/Dockerfile @@ -149,7 +149,8 @@ FROM debian:bookworm-slim RUN apt-get update && apt-get install -y \ ca-certificates \ curl \ - git && \ + git \ + ffmpeg && \ rm -rf /var/lib/apt/lists/* # Copy the binary from builder diff --git a/docker/test-agent/Dockerfile b/docker/test-agent/Dockerfile new file mode 100644 index 0000000..67888b3 --- /dev/null +++ b/docker/test-agent/Dockerfile @@ -0,0 +1,61 @@ +FROM rust:1.88.0-bookworm AS builder +WORKDIR /build + +COPY Cargo.toml Cargo.lock ./ +COPY server/ ./server/ +COPY gigacode/ ./gigacode/ +COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/ +COPY scripts/agent-configs/ ./scripts/agent-configs/ +COPY scripts/audit-acp-deps/ ./scripts/audit-acp-deps/ + +ENV SANDBOX_AGENT_SKIP_INSPECTOR=1 + +RUN --mount=type=cache,target=/usr/local/cargo/registry \ + --mount=type=cache,target=/usr/local/cargo/git \ + --mount=type=cache,target=/build/target \ + cargo build -p sandbox-agent --release && \ + cp target/release/sandbox-agent /sandbox-agent + +# Extract neko binary from the official image for WebRTC desktop streaming. +# Using neko v3 base image from GHCR which provides multi-arch support (amd64, arm64). +# Pinned by digest to prevent breaking changes from upstream. 
+# Reference client: https://github.com/demodesk/neko-client/blob/37f93eae6bd55b333c94bd009d7f2b079075a026/src/component/internal/webrtc.ts +FROM ghcr.io/m1k1o/neko/base@sha256:0c384afa56268aaa2d5570211d284763d0840dcdd1a7d9a24be3081d94d3dfce AS neko-base + +FROM node:22-bookworm-slim +RUN apt-get update -qq && \ + apt-get install -y -qq --no-install-recommends \ + ca-certificates \ + bash \ + libstdc++6 \ + xvfb \ + openbox \ + xdotool \ + imagemagick \ + ffmpeg \ + gstreamer1.0-tools \ + gstreamer1.0-plugins-base \ + gstreamer1.0-plugins-good \ + gstreamer1.0-plugins-bad \ + gstreamer1.0-plugins-ugly \ + gstreamer1.0-nice \ + gstreamer1.0-x \ + gstreamer1.0-pulseaudio \ + libxcvt0 \ + x11-xserver-utils \ + dbus-x11 \ + xauth \ + fonts-dejavu-core \ + xterm \ + > /dev/null 2>&1 && \ + rm -rf /var/lib/apt/lists/* + +COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent +COPY --from=neko-base /usr/bin/neko /usr/local/bin/neko + +EXPOSE 3000 +# Expose UDP port range for WebRTC media transport +EXPOSE 59050-59070/udp + +ENTRYPOINT ["/usr/local/bin/sandbox-agent"] +CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"] diff --git a/docker/test-common-software/Dockerfile b/docker/test-common-software/Dockerfile new file mode 100644 index 0000000..7a03abc --- /dev/null +++ b/docker/test-common-software/Dockerfile @@ -0,0 +1,37 @@ +# Extends the base test-agent image with common software pre-installed. +# Used by the common_software integration test to verify that all documented +# software in docs/common-software.mdx works correctly inside the sandbox. 
+# +# KEEP IN SYNC with docs/common-software.mdx + +ARG BASE_IMAGE=sandbox-agent-test:dev +FROM ${BASE_IMAGE} + +USER root + +RUN apt-get update -qq && \ + apt-get install -y -qq --no-install-recommends \ + # Browsers + chromium \ + firefox-esr \ + # Languages + python3 python3-pip python3-venv \ + default-jdk \ + ruby-full \ + # Databases + sqlite3 \ + redis-server \ + # Build tools + build-essential cmake pkg-config \ + # CLI tools + git jq tmux \ + # Media and graphics + imagemagick \ + poppler-utils \ + # Desktop apps + gimp \ + > /dev/null 2>&1 && \ + rm -rf /var/lib/apt/lists/* + +ENTRYPOINT ["/usr/local/bin/sandbox-agent"] +CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"] diff --git a/docs/cli.mdx b/docs/cli.mdx index 2ad3b08..362de49 100644 --- a/docs/cli.mdx +++ b/docs/cli.mdx @@ -37,6 +37,36 @@ Notes: - Set `SANDBOX_AGENT_LOG_STDOUT=1` to force stdout/stderr logging. - Use `SANDBOX_AGENT_LOG_DIR` to override log directory. +## install + +Install first-party runtime dependencies. + +### install desktop + +Install the Linux desktop runtime packages required by `/v1/desktop/*`. + +```bash +sandbox-agent install desktop [OPTIONS] +``` + +| Option | Description | +|--------|-------------| +| `--yes` | Skip the confirmation prompt | +| `--print-only` | Print the package-manager command without executing it | +| `--package-manager ` | Override package-manager detection | +| `--no-fonts` | Skip the default DejaVu font package | + +```bash +sandbox-agent install desktop --yes +sandbox-agent install desktop --print-only +``` + +Notes: + +- Supported on Linux only. +- The command detects `apt`, `dnf`, or `apk`. +- If the host is not already running as root, the command requires `sudo`. + ## install-agent Install or reinstall a single agent, or every supported agent with `--all`. 
diff --git a/docs/common-software.mdx b/docs/common-software.mdx new file mode 100644 index 0000000..7997a92 --- /dev/null +++ b/docs/common-software.mdx @@ -0,0 +1,560 @@ +--- +title: "Common Software" +description: "Install browsers, languages, databases, and other tools inside the sandbox." +sidebarTitle: "Common Software" +icon: "box-open" +--- + +The sandbox runs a Debian/Ubuntu base image. You can install software with `apt-get` via the [Process API](/processes) or by customizing your Docker image. This page covers commonly needed packages and how to install them. + +## Browsers + +### Chromium + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "chromium", "chromium-sandbox"], +}); + +// Launch headless +await sdk.runProcess({ + command: "chromium", + args: ["--headless", "--no-sandbox", "--disable-gpu", "https://example.com"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","chromium","chromium-sandbox"]}' +``` + + + +Use `--no-sandbox` when running Chromium inside a container. The container itself provides isolation. + + +### Firefox + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "firefox-esr"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","firefox-esr"]}' +``` + + +### Playwright browsers + +Playwright bundles its own browser binaries. Install the Playwright CLI and let it download browsers for you. 
+ + +```ts TypeScript +await sdk.runProcess({ + command: "npx", + args: ["playwright", "install", "--with-deps", "chromium"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"npx","args":["playwright","install","--with-deps","chromium"]}' +``` + + +--- + +## Languages and runtimes + +### Node.js + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "nodejs", "npm"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","nodejs","npm"]}' +``` + + +For a specific version, use [nvm](https://github.com/nvm-sh/nvm): + +```ts TypeScript +await sdk.runProcess({ + command: "bash", + args: ["-c", "curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash && . ~/.nvm/nvm.sh && nvm install 22"], +}); +``` + +### Python + +Python 3 is typically pre-installed. 
To add pip and common packages: + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "python3", "python3-pip", "python3-venv"], +}); + +await sdk.runProcess({ + command: "pip3", + args: ["install", "numpy", "pandas", "matplotlib"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","python3","python3-pip","python3-venv"]}' + +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"pip3","args":["install","numpy","pandas","matplotlib"]}' +``` + + +### Go + + +```ts TypeScript +await sdk.runProcess({ + command: "bash", + args: ["-c", "curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"], +}); + +// Add to PATH for subsequent commands +await sdk.runProcess({ + command: "bash", + args: ["-c", "export PATH=$PATH:/usr/local/go/bin && go version"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"bash","args":["-c","curl -fsSL https://go.dev/dl/go1.23.6.linux-amd64.tar.gz | tar -C /usr/local -xz"]}' +``` + + +### Rust + + +```ts TypeScript +await sdk.runProcess({ + command: "bash", + args: ["-c", "curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"bash","args":["-c","curl --proto =https --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y"]}' +``` + + +### Java (OpenJDK) + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "default-jdk"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","default-jdk"]}' +``` + + 
+### Ruby + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "ruby-full"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","ruby-full"]}' +``` + + +--- + +## Databases + +### PostgreSQL + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "postgresql", "postgresql-client"], +}); + +// Start the service +const proc = await sdk.createProcess({ + command: "bash", + args: ["-c", "su - postgres -c 'pg_ctlcluster 15 main start'"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","postgresql","postgresql-client"]}' +``` + + +### SQLite + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "sqlite3"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","sqlite3"]}' +``` + + +### Redis + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "redis-server"], +}); + +const proc = await sdk.createProcess({ + command: "redis-server", + args: ["--daemonize", "no"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","redis-server"]}' + +curl -X POST "http://127.0.0.1:2468/v1/processes" \ + -H "Content-Type: application/json" \ + -d '{"command":"redis-server","args":["--daemonize","no"]}' +``` + + +### MySQL / MariaDB + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "mariadb-server", "mariadb-client"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H 
"Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","mariadb-server","mariadb-client"]}' +``` + + +--- + +## Build tools + +### Essential build toolchain + +Most compiled software needs the standard build toolchain: + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "build-essential", "cmake", "pkg-config"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","build-essential","cmake","pkg-config"]}' +``` + + +This installs `gcc`, `g++`, `make`, `cmake`, and related tools. + +--- + +## Desktop applications + +These require the [Computer Use](/computer-use) desktop to be started first. + +### LibreOffice + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "libreoffice"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","libreoffice"]}' +``` + + +### GIMP + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "gimp"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","gimp"]}' +``` + + +### VLC + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "vlc"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","vlc"]}' +``` + + +### VS Code (code-server) + + +```ts TypeScript +await sdk.runProcess({ + command: "bash", + args: ["-c", "curl -fsSL https://code-server.dev/install.sh | sh"], +}); + +const proc = await sdk.createProcess({ + command: "code-server", + args: ["--bind-addr", "0.0.0.0:8080", 
"--auth", "none"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"bash","args":["-c","curl -fsSL https://code-server.dev/install.sh | sh"]}' + +curl -X POST "http://127.0.0.1:2468/v1/processes" \ + -H "Content-Type: application/json" \ + -d '{"command":"code-server","args":["--bind-addr","0.0.0.0:8080","--auth","none"]}' +``` + + +--- + +## CLI tools + +### Git + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "git"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","git"]}' +``` + + +### Docker + + +```ts TypeScript +await sdk.runProcess({ + command: "bash", + args: ["-c", "curl -fsSL https://get.docker.com | sh"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"bash","args":["-c","curl -fsSL https://get.docker.com | sh"]}' +``` + + +### jq + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "jq"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","jq"]}' +``` + + +### tmux + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "tmux"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","tmux"]}' +``` + + +--- + +## Media and graphics + +### FFmpeg + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "ffmpeg"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d 
'{"command":"apt-get","args":["install","-y","ffmpeg"]}' +``` + + +### ImageMagick + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "imagemagick"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","imagemagick"]}' +``` + + +### Poppler (PDF utilities) + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "poppler-utils"], +}); + +// Convert PDF to images +await sdk.runProcess({ + command: "pdftoppm", + args: ["-png", "document.pdf", "output"], +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","poppler-utils"]}' +``` + + +--- + +## Pre-installing in a Docker image + +For production use, install software in your Dockerfile instead of at runtime. This avoids repeated downloads and makes startup faster. + +```dockerfile +FROM ubuntu:22.04 + +RUN apt-get update && apt-get install -y \ + chromium \ + firefox-esr \ + nodejs npm \ + python3 python3-pip \ + git curl wget \ + build-essential \ + sqlite3 \ + ffmpeg \ + imagemagick \ + jq \ + && rm -rf /var/lib/apt/lists/* + +RUN pip3 install numpy pandas matplotlib +``` + +See [Docker deployment](/deploy/docker) for how to use custom images with Sandbox Agent. diff --git a/docs/computer-use.mdx b/docs/computer-use.mdx new file mode 100644 index 0000000..fc6b7d0 --- /dev/null +++ b/docs/computer-use.mdx @@ -0,0 +1,859 @@ +--- +title: "Computer Use" +description: "Control a virtual desktop inside the sandbox with mouse, keyboard, screenshots, recordings, and live streaming." +sidebarTitle: "Computer Use" +icon: "desktop" +--- + +Sandbox Agent provides a managed virtual desktop (Xvfb + openbox) that you can control programmatically. 
This is useful for browser automation, GUI testing, and AI computer-use workflows. + +## Start and stop + + +```ts TypeScript +import { SandboxAgent } from "sandbox-agent"; + +const sdk = await SandboxAgent.connect({ + baseUrl: "http://127.0.0.1:2468", +}); + +const status = await sdk.startDesktop({ + width: 1920, + height: 1080, + dpi: 96, +}); + +console.log(status.state); // "active" +console.log(status.display); // ":99" + +// When done +await sdk.stopDesktop(); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \ + -H "Content-Type: application/json" \ + -d '{"width":1920,"height":1080,"dpi":96}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/stop" +``` + + +All fields in the start request are optional. Defaults are 1440x900 at 96 DPI. + +### Start request options + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `width` | number | 1440 | Desktop width in pixels | +| `height` | number | 900 | Desktop height in pixels | +| `dpi` | number | 96 | Display DPI | +| `displayNum` | number | 99 | Starting X display number. The runtime probes from this number upward to find an available display. | +| `stateDir` | string | (auto) | Desktop state directory for home, logs, recordings | +| `streamVideoCodec` | string | `"vp8"` | WebRTC video codec (`vp8`, `vp9`, `h264`) | +| `streamAudioCodec` | string | `"opus"` | WebRTC audio codec (`opus`, `g722`) | +| `streamFrameRate` | number | 30 | Streaming frame rate (1-60) | +| `webrtcPortRange` | string | `"59050-59070"` | UDP port range for WebRTC media | +| `recordingFps` | number | 30 | Default recording FPS when not specified in `startDesktopRecording` (1-60) | + +The streaming and recording options configure defaults for the desktop session. They take effect when streaming or recording is started later. 
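The ranges and allowed values in the table above can be checked on the client before calling `startDesktop`. This is a minimal sketch under assumptions: `validateStartOptions` and the `DesktopStartOptions` interface are hypothetical names, the `low-high` port-range format is inferred from the default value, and the server's own validation may differ.

```ts
// Hypothetical client-side pre-check; the server's own validation may differ.
interface DesktopStartOptions {
  width?: number;
  height?: number;
  dpi?: number;
  streamVideoCodec?: string;
  streamAudioCodec?: string;
  streamFrameRate?: number;
  webrtcPortRange?: string;
  recordingFps?: number;
}

// Allowed values as listed in the options table.
const VIDEO_CODECS = ["vp8", "vp9", "h264"];
const AUDIO_CODECS = ["opus", "g722"];

function isBetween(value: number, min: number, max: number): boolean {
  // NaN fails both comparisons, so it is rejected as well.
  return value >= min && value <= max;
}

/** Return a list of human-readable problems; an empty list means the options look valid. */
function validateStartOptions(options: DesktopStartOptions): string[] {
  const problems: string[] = [];
  if (options.streamVideoCodec !== undefined && VIDEO_CODECS.indexOf(options.streamVideoCodec) === -1) {
    problems.push("streamVideoCodec must be one of " + VIDEO_CODECS.join(", "));
  }
  if (options.streamAudioCodec !== undefined && AUDIO_CODECS.indexOf(options.streamAudioCodec) === -1) {
    problems.push("streamAudioCodec must be one of " + AUDIO_CODECS.join(", "));
  }
  if (options.streamFrameRate !== undefined && !isBetween(options.streamFrameRate, 1, 60)) {
    problems.push("streamFrameRate must be between 1 and 60");
  }
  if (options.recordingFps !== undefined && !isBetween(options.recordingFps, 1, 60)) {
    problems.push("recordingFps must be between 1 and 60");
  }
  if (options.webrtcPortRange !== undefined) {
    // Assumed format: "<low>-<high>" with valid UDP port numbers.
    const match = /^(\d+)-(\d+)$/.exec(options.webrtcPortRange);
    const low = match ? Number(match[1]) : NaN;
    const high = match ? Number(match[2]) : NaN;
    if (!match || low < 1 || high > 65535 || low > high) {
      problems.push('webrtcPortRange must look like "59050-59070"');
    }
  }
  return problems;
}

const validProblems = validateStartOptions({
  width: 1920,
  height: 1080,
  streamVideoCodec: "h264",
  streamFrameRate: 60,
  webrtcPortRange: "59100-59120",
  recordingFps: 15,
});
const invalidProblems = validateStartOptions({ streamVideoCodec: "av1", streamFrameRate: 120 });
console.log(validProblems, invalidProblems);
```

Returning a list of problems instead of throwing on the first one lets a caller report every invalid field at once.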
+ + +```ts TypeScript +const status = await sdk.startDesktop({ + width: 1920, + height: 1080, + streamVideoCodec: "h264", + streamFrameRate: 60, + webrtcPortRange: "59100-59120", + recordingFps: 15, +}); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/start" \ + -H "Content-Type: application/json" \ + -d '{ + "width": 1920, + "height": 1080, + "streamVideoCodec": "h264", + "streamFrameRate": 60, + "webrtcPortRange": "59100-59120", + "recordingFps": 15 + }' +``` + + +## Status + + +```ts TypeScript +const status = await sdk.getDesktopStatus(); +console.log(status.state); // "inactive" | "active" | "failed" | ... +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/status" +``` + + +## Screenshots + +Capture the full desktop or a specific region. Optionally include the cursor position. + + +```ts TypeScript +// Full screenshot (PNG by default) +const png = await sdk.takeDesktopScreenshot(); + +// JPEG at 70% quality, half scale +const jpeg = await sdk.takeDesktopScreenshot({ + format: "jpeg", + quality: 70, + scale: 0.5, +}); + +// Include cursor overlay +const withCursor = await sdk.takeDesktopScreenshot({ + showCursor: true, +}); + +// Region screenshot +const region = await sdk.takeDesktopRegionScreenshot({ + x: 100, + y: 100, + width: 400, + height: 300, +}); +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/screenshot" --output screenshot.png + +curl "http://127.0.0.1:2468/v1/desktop/screenshot?format=jpeg&quality=70&scale=0.5" \ + --output screenshot.jpg + +# Include cursor overlay +curl "http://127.0.0.1:2468/v1/desktop/screenshot?show_cursor=true" \ + --output with_cursor.png + +curl "http://127.0.0.1:2468/v1/desktop/screenshot/region?x=100&y=100&width=400&height=300" \ + --output region.png +``` + + +### Screenshot options + +| Param | Type | Default | Description | +|-------|------|---------|-------------| +| `format` | string | `"png"` | Output format: `png`, `jpeg`, or `webp` | +| `quality` | number | 85 | 
Compression quality (1-100, JPEG/WebP only) | +| `scale` | number | 1.0 | Scale factor (0.1-1.0) | +| `showCursor` | boolean | `false` | Composite a crosshair at the cursor position | + +When `showCursor` is enabled, the cursor position is captured at the moment of the screenshot and a red crosshair is drawn at that location. This is useful for AI agents that need to see where the cursor is in the screenshot. + +## Mouse + + +```ts TypeScript +// Get current position +const pos = await sdk.getDesktopMousePosition(); +console.log(pos.x, pos.y); + +// Move +await sdk.moveDesktopMouse({ x: 500, y: 300 }); + +// Click (left by default) +await sdk.clickDesktop({ x: 500, y: 300 }); + +// Right click +await sdk.clickDesktop({ x: 500, y: 300, button: "right" }); + +// Double click +await sdk.clickDesktop({ x: 500, y: 300, clickCount: 2 }); + +// Drag +await sdk.dragDesktopMouse({ + startX: 100, startY: 100, + endX: 400, endY: 400, +}); + +// Scroll +await sdk.scrollDesktop({ x: 500, y: 300, deltaY: -3 }); +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/mouse/position" + +curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/click" \ + -H "Content-Type: application/json" \ + -d '{"x":500,"y":300}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/drag" \ + -H "Content-Type: application/json" \ + -d '{"startX":100,"startY":100,"endX":400,"endY":400}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/mouse/scroll" \ + -H "Content-Type: application/json" \ + -d '{"x":500,"y":300,"deltaY":-3}' +``` + + +## Keyboard + + +```ts TypeScript +// Type text +await sdk.typeDesktopText({ text: "Hello, world!" 
}); + +// Press a key with modifiers +await sdk.pressDesktopKey({ + key: "c", + modifiers: { ctrl: true }, +}); + +// Low-level key down/up +await sdk.keyDownDesktop({ key: "Shift_L" }); +await sdk.keyUpDesktop({ key: "Shift_L" }); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/type" \ + -H "Content-Type: application/json" \ + -d '{"text":"Hello, world!"}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/keyboard/press" \ + -H "Content-Type: application/json" \ + -d '{"key":"c","modifiers":{"ctrl":true}}' +``` + + +## Clipboard + +Read and write the X11 clipboard programmatically. + + +```ts TypeScript +// Read clipboard +const clipboard = await sdk.getDesktopClipboard(); +console.log(clipboard.text); + +// Read primary selection (mouse-selected text) +const primary = await sdk.getDesktopClipboard({ selection: "primary" }); + +// Write to clipboard +await sdk.setDesktopClipboard({ text: "Pasted via API" }); + +// Write to both clipboard and primary selection +await sdk.setDesktopClipboard({ + text: "Synced text", + selection: "both", +}); +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/clipboard" + +curl "http://127.0.0.1:2468/v1/desktop/clipboard?selection=primary" + +curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \ + -H "Content-Type: application/json" \ + -d '{"text":"Pasted via API"}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/clipboard" \ + -H "Content-Type: application/json" \ + -d '{"text":"Synced text","selection":"both"}' +``` + + +The `selection` parameter controls which X11 selection to read or write: + +| Value | Description | +|-------|-------------| +| `clipboard` (default) | The standard clipboard (Ctrl+C / Ctrl+V) | +| `primary` | The primary selection (text selected with the mouse) | +| `both` | Write to both clipboard and primary selection (write only) | + +## Display and windows + + +```ts TypeScript +const display = await sdk.getDesktopDisplayInfo(); +console.log(display.resolution); 
// { width: 1920, height: 1080, dpi: 96 } + +const { windows } = await sdk.listDesktopWindows(); +for (const win of windows) { + console.log(win.title, win.x, win.y, win.width, win.height); +} +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/display/info" + +curl "http://127.0.0.1:2468/v1/desktop/windows" +``` + + +The windows endpoint filters out noise automatically: window manager internals (Openbox), windows with empty titles, and tiny helper windows (under 120x80) are excluded. The currently active/focused window is always included regardless of filters. + +### Focused window + +Get the currently focused window without listing all windows. + + +```ts TypeScript +const focused = await sdk.getDesktopFocusedWindow(); +console.log(focused.title, focused.id); +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/windows/focused" +``` + + +Returns 404 if no window currently has focus. + +### Window management + +Focus, move, and resize windows by their X11 window ID. + + +```ts TypeScript +const { windows } = await sdk.listDesktopWindows(); +const win = windows[0]; + +// Bring window to foreground +await sdk.focusDesktopWindow(win.id); + +// Move window +await sdk.moveDesktopWindow(win.id, { x: 100, y: 50 }); + +// Resize window +await sdk.resizeDesktopWindow(win.id, { width: 1280, height: 720 }); +``` + +```bash cURL +# Focus a window +curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/focus" + +# Move a window +curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/move" \ + -H "Content-Type: application/json" \ + -d '{"x":100,"y":50}' + +# Resize a window +curl -X POST "http://127.0.0.1:2468/v1/desktop/windows/12345/resize" \ + -H "Content-Type: application/json" \ + -d '{"width":1280,"height":720}' +``` + + +All three endpoints return the updated window info so you can verify the operation took effect. The window manager may adjust the requested position or size. 
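Because the window manager may adjust a request, it can help to compare what was asked for with what came back. The helper below is a hypothetical sketch: `describeAdjustments` is not an SDK method, and it assumes the returned window info carries the same `x`, `y`, `width`, and `height` fields shown for `listDesktopWindows`.

```ts
// Field names follow the window objects shown above; the helper itself is hypothetical.
interface WindowGeometry {
  x: number;
  y: number;
  width: number;
  height: number;
}

/** List every requested field whose actual value differs from the request. */
function describeAdjustments(
  requested: Partial<WindowGeometry>,
  actual: WindowGeometry,
): string[] {
  const adjustments: string[] = [];
  const keys: Array<keyof WindowGeometry> = ["x", "y", "width", "height"];
  for (const key of keys) {
    const wanted = requested[key];
    // Fields that were not part of the request are ignored.
    if (wanted !== undefined && wanted !== actual[key]) {
      adjustments.push(key + ": requested " + wanted + ", got " + actual[key]);
    }
  }
  return adjustments;
}

// Sample values standing in for a resize request and the window info returned by the API.
const requestedSize = { width: 1280, height: 720 };
const returnedWindow: WindowGeometry = { x: 100, y: 50, width: 1280, height: 698 };

const windowAdjustments = describeAdjustments(requestedSize, returnedWindow);
console.log(windowAdjustments);
```

In practice the second argument would be the object returned by `moveDesktopWindow` or `resizeDesktopWindow`, assuming it has the fields above.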
+ +## App launching + +Launch applications or open files/URLs on the desktop without needing to shell out. + + +```ts TypeScript +// Launch an app by name +const result = await sdk.launchDesktopApp({ + app: "firefox", + args: ["--private"], +}); +console.log(result.processId); // "proc_7" + +// Launch and wait for the window to appear +const withWindow = await sdk.launchDesktopApp({ + app: "xterm", + wait: true, +}); +console.log(withWindow.windowId); // "12345" or null if timed out + +// Open a URL with the default handler +const opened = await sdk.openDesktopTarget({ + target: "https://example.com", +}); +console.log(opened.processId); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \ + -H "Content-Type: application/json" \ + -d '{"app":"firefox","args":["--private"]}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/launch" \ + -H "Content-Type: application/json" \ + -d '{"app":"xterm","wait":true}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/open" \ + -H "Content-Type: application/json" \ + -d '{"target":"https://example.com"}' +``` + + +The returned `processId` can be used with the [Process API](/processes) to read logs (`GET /v1/processes/{id}/logs`) or stop the application (`POST /v1/processes/{id}/stop`). + +When `wait` is `true`, the API polls for up to 5 seconds for a window to appear. If the window appears, its ID is returned in `windowId`. If it times out, `windowId` is `null` but the process is still running. + + +**Launch/Open vs the Process API:** Both `launch` and `open` are convenience wrappers around the [Process API](/processes). They create managed processes (with `owner: "desktop"`) that you can inspect, log, and stop through the same Process endpoints. The difference is that `launch` validates the binary exists in PATH first and can optionally wait for a window to appear, while `open` delegates to the system default handler (`xdg-open`). 
Use the Process API directly when you need full control over command, environment, working directory, or restart policies. + + +## Recording + +Record the desktop to MP4. + + +```ts TypeScript +const recording = await sdk.startDesktopRecording({ fps: 30 }); +console.log(recording.id); + +// ... do things ... + +const stopped = await sdk.stopDesktopRecording(); + +// List all recordings +const { recordings } = await sdk.listDesktopRecordings(); + +// Download +const mp4 = await sdk.downloadDesktopRecording(recording.id); + +// Clean up +await sdk.deleteDesktopRecording(recording.id); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/start" \ + -H "Content-Type: application/json" \ + -d '{"fps":30}' + +curl -X POST "http://127.0.0.1:2468/v1/desktop/recording/stop" + +curl "http://127.0.0.1:2468/v1/desktop/recordings" + +curl "http://127.0.0.1:2468/v1/desktop/recordings/rec_1/download" --output recording.mp4 + +curl -X DELETE "http://127.0.0.1:2468/v1/desktop/recordings/rec_1" +``` + + +## Desktop processes + +The desktop runtime manages several background processes (Xvfb, openbox, neko, ffmpeg). These are all registered with the general [Process API](/processes) under the `desktop` owner, so you can inspect logs, check status, and troubleshoot using the same tools you use for any other managed process. 
+ + +```ts TypeScript +// List all processes, including desktop-owned ones +const { processes } = await sdk.listProcesses(); + +const desktopProcs = processes.filter((p) => p.owner === "desktop"); +for (const p of desktopProcs) { + console.log(p.id, p.command, p.status); +} + +// Read logs from a specific desktop process +const logs = await sdk.getProcessLogs(desktopProcs[0].id, { tail: 50 }); +for (const entry of logs.entries) { + console.log(entry.stream, atob(entry.data)); +} +``` + +```bash cURL +# List all processes (desktop processes have owner: "desktop") +curl "http://127.0.0.1:2468/v1/processes" + +# Get logs from a specific desktop process +curl "http://127.0.0.1:2468/v1/processes/proc_1/logs?tail=50" +``` + + +The desktop status endpoint also includes a summary of running processes: + + +```ts TypeScript +const status = await sdk.getDesktopStatus(); +for (const proc of status.processes) { + console.log(proc.name, proc.pid, proc.running); +} +``` + +```bash cURL +curl "http://127.0.0.1:2468/v1/desktop/status" +# Response includes: processes: [{ name: "Xvfb", pid: 123, running: true }, ...] +``` + + +| Process | Role | Restart policy | +|---------|------|---------------| +| Xvfb | Virtual X11 framebuffer | Auto-restart while desktop is active | +| openbox | Window manager | Auto-restart while desktop is active | +| neko | WebRTC streaming server (started by `startDesktopStream`) | No auto-restart | +| ffmpeg | Screen recorder (started by `startDesktopRecording`) | No auto-restart | + +## Live streaming + +Start a WebRTC stream for real-time desktop viewing in a browser. 
+ + +```ts TypeScript +await sdk.startDesktopStream(); + +// Check stream status +const status = await sdk.getDesktopStreamStatus(); +console.log(status.active); // true +console.log(status.processId); // "proc_5" + +// Connect via the React DesktopViewer component or +// use the WebSocket signaling endpoint directly +// at ws://127.0.0.1:2468/v1/desktop/stream/signaling + +await sdk.stopDesktopStream(); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/start" + +# Check stream status +curl "http://127.0.0.1:2468/v1/desktop/stream/status" + +# Connect to ws://127.0.0.1:2468/v1/desktop/stream/signaling for WebRTC signaling + +curl -X POST "http://127.0.0.1:2468/v1/desktop/stream/stop" +``` + + +For a drop-in React component, see [React Components](/react-components). + +## API reference + +### Endpoints + +| Method | Path | Description | +|--------|------|-------------| +| `POST` | `/v1/desktop/start` | Start the desktop runtime | +| `POST` | `/v1/desktop/stop` | Stop the desktop runtime | +| `GET` | `/v1/desktop/status` | Get desktop runtime status | +| `GET` | `/v1/desktop/screenshot` | Capture full desktop screenshot | +| `GET` | `/v1/desktop/screenshot/region` | Capture a region screenshot | +| `GET` | `/v1/desktop/mouse/position` | Get current mouse position | +| `POST` | `/v1/desktop/mouse/move` | Move the mouse | +| `POST` | `/v1/desktop/mouse/click` | Click the mouse | +| `POST` | `/v1/desktop/mouse/down` | Press mouse button down | +| `POST` | `/v1/desktop/mouse/up` | Release mouse button | +| `POST` | `/v1/desktop/mouse/drag` | Drag from one point to another | +| `POST` | `/v1/desktop/mouse/scroll` | Scroll at a position | +| `POST` | `/v1/desktop/keyboard/type` | Type text | +| `POST` | `/v1/desktop/keyboard/press` | Press a key with optional modifiers | +| `POST` | `/v1/desktop/keyboard/down` | Press a key down (hold) | +| `POST` | `/v1/desktop/keyboard/up` | Release a key | +| `GET` | `/v1/desktop/display/info` | Get display 
info | +| `GET` | `/v1/desktop/windows` | List visible windows | +| `GET` | `/v1/desktop/windows/focused` | Get focused window info | +| `POST` | `/v1/desktop/windows/{id}/focus` | Focus a window | +| `POST` | `/v1/desktop/windows/{id}/move` | Move a window | +| `POST` | `/v1/desktop/windows/{id}/resize` | Resize a window | +| `GET` | `/v1/desktop/clipboard` | Read clipboard contents | +| `POST` | `/v1/desktop/clipboard` | Write to clipboard | +| `POST` | `/v1/desktop/launch` | Launch an application | +| `POST` | `/v1/desktop/open` | Open a file or URL | +| `POST` | `/v1/desktop/recording/start` | Start recording | +| `POST` | `/v1/desktop/recording/stop` | Stop recording | +| `GET` | `/v1/desktop/recordings` | List recordings | +| `GET` | `/v1/desktop/recordings/{id}` | Get recording metadata | +| `GET` | `/v1/desktop/recordings/{id}/download` | Download recording | +| `DELETE` | `/v1/desktop/recordings/{id}` | Delete recording | +| `POST` | `/v1/desktop/stream/start` | Start WebRTC streaming | +| `POST` | `/v1/desktop/stream/stop` | Stop WebRTC streaming | +| `GET` | `/v1/desktop/stream/status` | Get stream status | +| `GET` | `/v1/desktop/stream/signaling` | WebSocket for WebRTC signaling | + +### TypeScript SDK methods + +| Method | Returns | Description | +|--------|---------|-------------| +| `startDesktop(request?)` | `DesktopStatusResponse` | Start the desktop | +| `stopDesktop()` | `DesktopStatusResponse` | Stop the desktop | +| `getDesktopStatus()` | `DesktopStatusResponse` | Get desktop status | +| `takeDesktopScreenshot(query?)` | `Uint8Array` | Capture screenshot | +| `takeDesktopRegionScreenshot(query)` | `Uint8Array` | Capture region screenshot | +| `getDesktopMousePosition()` | `DesktopMousePositionResponse` | Get mouse position | +| `moveDesktopMouse(request)` | `DesktopMousePositionResponse` | Move mouse | +| `clickDesktop(request)` | `DesktopMousePositionResponse` | Click mouse | +| `mouseDownDesktop(request)` | `DesktopMousePositionResponse` | 
Mouse button down | +| `mouseUpDesktop(request)` | `DesktopMousePositionResponse` | Mouse button up | +| `dragDesktopMouse(request)` | `DesktopMousePositionResponse` | Drag mouse | +| `scrollDesktop(request)` | `DesktopMousePositionResponse` | Scroll | +| `typeDesktopText(request)` | `DesktopActionResponse` | Type text | +| `pressDesktopKey(request)` | `DesktopActionResponse` | Press key | +| `keyDownDesktop(request)` | `DesktopActionResponse` | Key down | +| `keyUpDesktop(request)` | `DesktopActionResponse` | Key up | +| `getDesktopDisplayInfo()` | `DesktopDisplayInfoResponse` | Get display info | +| `listDesktopWindows()` | `DesktopWindowListResponse` | List windows | +| `getDesktopFocusedWindow()` | `DesktopWindowInfo` | Get focused window | +| `focusDesktopWindow(id)` | `DesktopWindowInfo` | Focus a window | +| `moveDesktopWindow(id, request)` | `DesktopWindowInfo` | Move a window | +| `resizeDesktopWindow(id, request)` | `DesktopWindowInfo` | Resize a window | +| `getDesktopClipboard(query?)` | `DesktopClipboardResponse` | Read clipboard | +| `setDesktopClipboard(request)` | `DesktopActionResponse` | Write clipboard | +| `launchDesktopApp(request)` | `DesktopLaunchResponse` | Launch an app | +| `openDesktopTarget(request)` | `DesktopOpenResponse` | Open file/URL | +| `startDesktopRecording(request?)` | `DesktopRecordingInfo` | Start recording | +| `stopDesktopRecording()` | `DesktopRecordingInfo` | Stop recording | +| `listDesktopRecordings()` | `DesktopRecordingListResponse` | List recordings | +| `getDesktopRecording(id)` | `DesktopRecordingInfo` | Get recording | +| `downloadDesktopRecording(id)` | `Uint8Array` | Download recording | +| `deleteDesktopRecording(id)` | `void` | Delete recording | +| `startDesktopStream()` | `DesktopStreamStatusResponse` | Start streaming | +| `stopDesktopStream()` | `DesktopStreamStatusResponse` | Stop streaming | +| `getDesktopStreamStatus()` | `DesktopStreamStatusResponse` | Stream status | + +## Customizing the desktop 
environment + +The desktop runs inside the sandbox filesystem, so you can customize it using the [File System](/file-system) API before or after starting the desktop. The desktop HOME directory is located at `~/.local/state/sandbox-agent/desktop/home` (or `$XDG_STATE_HOME/sandbox-agent/desktop/home` if `XDG_STATE_HOME` is set). + +All configuration files below are written to paths relative to this HOME directory. + +### Window manager (openbox) + +The desktop uses [openbox](http://openbox.org/) as its window manager. You can customize its behavior, theme, and keyboard shortcuts by writing an `rc.xml` config file. + + +```ts TypeScript +const openboxConfig = ` + + + Clearlooks + NLIMC + DejaVu Sans10 + + 1 + + + + +`; + +await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" }); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" }, + openboxConfig, +); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox" + +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/rc.xml" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @rc.xml +``` + + +### Autostart programs + +Openbox runs scripts in `~/.config/openbox/autostart` on startup. Use this to launch applications, set the background, or configure the environment. 
+ + +```ts TypeScript +const autostart = `#!/bin/sh +# Set a solid background color +xsetroot -solid "#1e1e2e" & + +# Launch a terminal +xterm -geometry 120x40+50+50 & + +# Launch a browser +firefox --no-remote & +`; + +await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" }); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" }, + autostart, +); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox" + +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @autostart.sh +``` + + + +The autostart script runs when openbox starts, which happens during `startDesktop()`. Write the autostart file before calling `startDesktop()` for it to take effect. + + +### Background + +There is no wallpaper set by default (the background is the X root window default). 
You can set it using `xsetroot` in the autostart script (as shown above), or use `feh` if you need an image: + + +```ts TypeScript +// Upload a wallpaper image +import fs from "node:fs"; + +const wallpaper = await fs.promises.readFile("./wallpaper.png"); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/wallpaper.png" }, + wallpaper, +); + +// Set the autostart to apply it +const autostart = `#!/bin/sh +feh --bg-fill ~/wallpaper.png & +`; + +await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox" }); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" }, + autostart, +); +``` + +```bash cURL +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/wallpaper.png" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @wallpaper.png + +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/openbox/autostart" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @autostart.sh +``` + + + +`feh` is not installed by default. Install it via the [Process API](/processes) before starting the desktop: `await sdk.runProcess({ command: "apt-get", args: ["install", "-y", "feh"] })`. + + +### Fonts + +Only `fonts-dejavu-core` is installed by default. 
To add more fonts, install them with your system package manager or copy font files into the sandbox: + + +```ts TypeScript +// Install a font package +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "fonts-noto", "fonts-liberation"], +}); + +// Or copy a custom font file +import fs from "node:fs"; + +const font = await fs.promises.readFile("./CustomFont.ttf"); +await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts" }); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" }, + font, +); + +// Rebuild the font cache +await sdk.runProcess({ command: "fc-cache", args: ["-fv"] }); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","fonts-noto","fonts-liberation"]}' + +curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts" + +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.local/share/fonts/CustomFont.ttf" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @CustomFont.ttf + +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"fc-cache","args":["-fv"]}' +``` + + +### Cursor theme + + +```ts TypeScript +await sdk.runProcess({ + command: "apt-get", + args: ["install", "-y", "dmz-cursor-theme"], +}); + +const xresources = `Xcursor.theme: DMZ-White\nXcursor.size: 24\n`; +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.Xresources" }, + xresources, +); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/processes/run" \ + -H "Content-Type: application/json" \ + -d '{"command":"apt-get","args":["install","-y","dmz-cursor-theme"]}' + +curl -X PUT 
"http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.Xresources" \ + -H "Content-Type: application/octet-stream" \ + --data-binary $'Xcursor.theme: DMZ-White\nXcursor.size: 24\n' +``` + + + +Run `xrdb -merge ~/.Xresources` (via the autostart or process API) after writing the file for changes to take effect. + + +### Shell and terminal + +No terminal emulator or shell is launched by default. Add one to the openbox autostart: + +```sh +# In ~/.config/openbox/autostart +xterm -geometry 120x40+50+50 & +``` + +To use a different shell, set the `SHELL` environment variable in your Dockerfile or install your preferred shell and configure the terminal to use it. + +### GTK theme + +Applications using GTK will pick up settings from `~/.config/gtk-3.0/settings.ini`: + + +```ts TypeScript +const gtkSettings = `[Settings] +gtk-theme-name=Adwaita +gtk-icon-theme-name=Adwaita +gtk-font-name=DejaVu Sans 10 +gtk-cursor-theme-name=DMZ-White +gtk-cursor-theme-size=24 +`; + +await sdk.mkdirFs({ path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0" }); +await sdk.writeFsFile( + { path: "~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" }, + gtkSettings, +); +``` + +```bash cURL +curl -X POST "http://127.0.0.1:2468/v1/fs/mkdir?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0" + +curl -X PUT "http://127.0.0.1:2468/v1/fs/file?path=~/.local/state/sandbox-agent/desktop/home/.config/gtk-3.0/settings.ini" \ + -H "Content-Type: application/octet-stream" \ + --data-binary @settings.ini +``` + + +### Summary of configuration paths + +All paths are relative to the desktop HOME directory (`~/.local/state/sandbox-agent/desktop/home`).
+ +| What | Path | Notes | +|------|------|-------| +| Openbox config | `.config/openbox/rc.xml` | Window manager theme, keybindings, behavior | +| Autostart | `.config/openbox/autostart` | Shell script run on desktop start | +| Custom fonts | `.local/share/fonts/` | TTF/OTF files, run `fc-cache -fv` after | +| Cursor theme | `.Xresources` | Requires `xrdb -merge` to apply | +| GTK 3 settings | `.config/gtk-3.0/settings.ini` | Theme, icons, fonts for GTK apps | +| Wallpaper | Any path, referenced from autostart | Requires `feh` or similar tool | diff --git a/docs/deploy/docker.mdx b/docs/deploy/docker.mdx index 674c2d5..4eb1bfc 100644 --- a/docs/deploy/docker.mdx +++ b/docs/deploy/docker.mdx @@ -15,43 +15,64 @@ Run the published full image with all supported agents pre-installed: docker run --rm -p 3000:3000 \ -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \ -e OPENAI_API_KEY="$OPENAI_API_KEY" \ - rivetdev/sandbox-agent:0.4.1-rc.1-full \ + rivetdev/sandbox-agent:0.3.1-full \ server --no-token --host 0.0.0.0 --port 3000 ``` -The `0.4.1-rc.1-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image. +The `0.3.1-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image. 
-## TypeScript with the Docker provider +If you also want the desktop API inside the container, install desktop dependencies before starting the server: ```bash -npm install sandbox-agent@0.3.x dockerode get-port +docker run --rm -p 3000:3000 \ + -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \ + -e OPENAI_API_KEY="$OPENAI_API_KEY" \ + node:22-bookworm-slim sh -c "\ + apt-get update && \ + DEBIAN_FRONTEND=noninteractive apt-get install -y curl ca-certificates bash libstdc++6 && \ + rm -rf /var/lib/apt/lists/* && \ + curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh && \ + sandbox-agent install desktop --yes && \ + sandbox-agent server --no-token --host 0.0.0.0 --port 3000" ``` -```typescript -import { SandboxAgent } from "sandbox-agent"; -import { docker } from "sandbox-agent/docker"; +In a Dockerfile: -const sdk = await SandboxAgent.start({ - sandbox: docker({ - env: [ - `ANTHROPIC_API_KEY=${process.env.ANTHROPIC_API_KEY}`, - `OPENAI_API_KEY=${process.env.OPENAI_API_KEY}`, - ].filter(Boolean), - }), +```dockerfile +RUN sandbox-agent install desktop --yes +``` + +## TypeScript with dockerode + +```typescript +import Docker from "dockerode"; +import { SandboxAgent } from "sandbox-agent"; + +const docker = new Docker(); +const PORT = 3000; + +const container = await docker.createContainer({ + Image: "rivetdev/sandbox-agent:0.3.1-full", + Cmd: ["server", "--no-token", "--host", "0.0.0.0", "--port", `${PORT}`], + Env: [ + `ANTHROPIC_API_KEY=${process.env.ANTHROPIC_API_KEY}`, + `OPENAI_API_KEY=${process.env.OPENAI_API_KEY}`, + `CODEX_API_KEY=${process.env.CODEX_API_KEY}`, + ].filter(Boolean), + ExposedPorts: { [`${PORT}/tcp`]: {} }, + HostConfig: { + AutoRemove: true, + PortBindings: { [`${PORT}/tcp`]: [{ HostPort: `${PORT}` }] }, + }, }); -try { - const session = await sdk.createSession({ agent: "codex" }); - await session.prompt([{ type: "text", text: "Summarize this repository." 
}]); -} finally { - await sdk.destroySandbox(); -} -``` +await container.start(); -The `docker` provider uses the `rivetdev/sandbox-agent:0.4.1-rc.1-full` image by default. Override with `image`: +const baseUrl = `http://127.0.0.1:${PORT}`; +const sdk = await SandboxAgent.connect({ baseUrl }); -```typescript -docker({ image: "my-custom-image:latest" }) +const session = await sdk.createSession({ agent: "codex" }); +await session.prompt([{ type: "text", text: "Summarize this repository." }]); ``` ## Building a custom image with everything preinstalled diff --git a/docs/docs.json b/docs/docs.json index 16620fe..0c2b19a 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -87,7 +87,7 @@ }, { "group": "System", - "pages": ["file-system", "processes"] + "pages": ["file-system", "processes", "computer-use", "common-software"] }, { "group": "Orchestration", diff --git a/docs/inspector.mdx b/docs/inspector.mdx index cc5f3d0..1412c21 100644 --- a/docs/inspector.mdx +++ b/docs/inspector.mdx @@ -35,6 +35,7 @@ console.log(url); - Prompt testing - Request/response debugging - Interactive permission prompts (approve, always-allow, or reject tool-use requests) +- Desktop panel for status, remediation, start/stop, and screenshot refresh - Process management (create, stop, kill, delete, view logs) - Interactive PTY terminal for tty processes - One-shot command execution @@ -50,3 +51,16 @@ console.log(url); The Inspector includes an embedded Ghostty-based terminal for interactive tty processes. The UI uses the SDK's high-level `connectProcessTerminal(...)` wrapper via the shared `@sandbox-agent/react` `ProcessTerminal` component. + +## Desktop panel + +The `Desktop` panel shows the current desktop runtime state, missing dependencies, +the suggested install command, last error details, process/log paths, and the +latest captured screenshot. 
+ +Use it to: + +- Check whether desktop dependencies are installed +- Start or stop the managed desktop runtime +- Refresh desktop status +- Capture a fresh screenshot on demand diff --git a/docs/openapi.json b/docs/openapi.json index 7f42f7c..b749aa4 100644 --- a/docs/openapi.json +++ b/docs/openapi.json @@ -628,6 +628,1814 @@ } } }, + "/v1/desktop/clipboard": { + "get": { + "tags": ["v1"], + "summary": "Read the desktop clipboard.", + "description": "Returns the current text content of the X11 clipboard.", + "operationId": "get_v1_desktop_clipboard", + "parameters": [ + { + "name": "selection", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true + } + } + ], + "responses": { + "200": { + "description": "Clipboard contents", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopClipboardResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "500": { + "description": "Clipboard read failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + }, + "post": { + "tags": ["v1"], + "summary": "Write to the desktop clipboard.", + "description": "Sets the text content of the X11 clipboard.", + "operationId": "post_v1_desktop_clipboard", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopClipboardWriteRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Clipboard updated", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopActionResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "500": { 
+ "description": "Clipboard write failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/display/info": { + "get": { + "tags": ["v1"], + "summary": "Get desktop display information.", + "description": "Performs a health-gated display query against the managed desktop and\nreturns the current display identifier and resolution.", + "operationId": "get_v1_desktop_display_info", + "responses": { + "200": { + "description": "Desktop display information", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopDisplayInfoResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "503": { + "description": "Desktop runtime health or display query failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/keyboard/down": { + "post": { + "tags": ["v1"], + "summary": "Press and hold a desktop keyboard key.", + "description": "Performs a health-gated `xdotool keydown` operation against the managed\ndesktop.", + "operationId": "post_v1_desktop_keyboard_down", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopKeyboardDownRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop keyboard action result", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopActionResponse" + } + } + } + }, + "400": { + "description": "Invalid keyboard down request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + 
"schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/keyboard/press": { + "post": { + "tags": ["v1"], + "summary": "Press a desktop keyboard shortcut.", + "description": "Performs a health-gated `xdotool key` operation against the managed\ndesktop.", + "operationId": "post_v1_desktop_keyboard_press", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopKeyboardPressRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop keyboard action result", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopActionResponse" + } + } + } + }, + "400": { + "description": "Invalid keyboard press request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/keyboard/type": { + "post": { + "tags": ["v1"], + "summary": "Type desktop keyboard text.", + "description": "Performs a health-gated `xdotool type` operation against the managed\ndesktop.", + "operationId": "post_v1_desktop_keyboard_type", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopKeyboardTypeRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop keyboard action result", + "content": { + 
"application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopActionResponse" + } + } + } + }, + "400": { + "description": "Invalid keyboard type request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/keyboard/up": { + "post": { + "tags": ["v1"], + "summary": "Release a desktop keyboard key.", + "description": "Performs a health-gated `xdotool keyup` operation against the managed\ndesktop.", + "operationId": "post_v1_desktop_keyboard_up", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopKeyboardUpRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop keyboard action result", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopActionResponse" + } + } + } + }, + "400": { + "description": "Invalid keyboard up request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/launch": { + "post": { + "tags": ["v1"], + "summary": "Launch a desktop application.", + "description": "Launches an application by name 
on the managed desktop, optionally waiting\nfor its window to appear.", + "operationId": "post_v1_desktop_launch", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopLaunchRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Application launched", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopLaunchResponse" + } + } + } + }, + "404": { + "description": "Application not found", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/click": { + "post": { + "tags": ["v1"], + "summary": "Click on the desktop.", + "description": "Performs a health-gated pointer move and click against the managed desktop\nand returns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_click", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseClickRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after click", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": "Invalid mouse click request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": 
"#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/down": { + "post": { + "tags": ["v1"], + "summary": "Press and hold a desktop mouse button.", + "description": "Performs a health-gated optional pointer move followed by `xdotool mousedown`\nand returns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_down", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseDownRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after button press", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": "Invalid mouse down request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/drag": { + "post": { + "tags": ["v1"], + "summary": "Drag the desktop mouse.", + "description": "Performs a health-gated drag gesture against the managed desktop and\nreturns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_drag", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseDragRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after drag", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": 
"Invalid mouse drag request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/move": { + "post": { + "tags": ["v1"], + "summary": "Move the desktop mouse.", + "description": "Performs a health-gated absolute pointer move on the managed desktop and\nreturns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_move", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseMoveRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after move", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": "Invalid mouse move request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/position": { + "get": { + "tags": ["v1"], + "summary": "Get the current desktop mouse position.", + "description": "Performs a health-gated mouse position query against the managed desktop.", + "operationId": 
"get_v1_desktop_mouse_position", + "responses": { + "200": { + "description": "Desktop mouse position", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input check failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/scroll": { + "post": { + "tags": ["v1"], + "summary": "Scroll the desktop mouse wheel.", + "description": "Performs a health-gated scroll gesture at the requested coordinates and\nreturns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_scroll", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseScrollRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after scroll", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": "Invalid mouse scroll request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/mouse/up": { + "post": { + "tags": ["v1"], + "summary": "Release a desktop mouse button.", + "description": "Performs a health-gated 
optional pointer move followed by `xdotool mouseup`\nand returns the resulting mouse position.", + "operationId": "post_v1_desktop_mouse_up", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMouseUpRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop mouse position after button release", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopMousePositionResponse" + } + } + } + }, + "400": { + "description": "Invalid mouse up request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or input failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/open": { + "post": { + "tags": ["v1"], + "summary": "Open a file or URL with the default handler.", + "description": "Opens a file path or URL using xdg-open on the managed desktop.", + "operationId": "post_v1_desktop_open", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopOpenRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Target opened", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopOpenResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/recording/start": { + "post": { + "tags": ["v1"], + "summary": "Start desktop recording.", + "description": "Starts an ffmpeg 
x11grab recording against the managed desktop and returns\nthe created recording metadata.", + "operationId": "post_v1_desktop_recording_start", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopRecordingStartRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop recording started", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopRecordingInfo" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready or a recording is already active", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop recording failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/recording/stop": { + "post": { + "tags": ["v1"], + "summary": "Stop desktop recording.", + "description": "Stops the active desktop recording and returns the finalized recording\nmetadata.", + "operationId": "post_v1_desktop_recording_stop", + "responses": { + "200": { + "description": "Desktop recording stopped", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopRecordingInfo" + } + } + } + }, + "409": { + "description": "No active desktop recording", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop recording stop failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/recordings": { + "get": { + "tags": ["v1"], + "summary": "List desktop recordings.", + "description": "Returns the current desktop recording catalog.", + "operationId": "get_v1_desktop_recordings", + "responses": { + "200": { + "description": 
"Desktop recordings", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopRecordingListResponse" + } + } + } + }, + "502": { + "description": "Desktop recordings query failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/recordings/{id}": { + "get": { + "tags": ["v1"], + "summary": "Get desktop recording metadata.", + "description": "Returns metadata for a single desktop recording.", + "operationId": "get_v1_desktop_recording", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "Desktop recording ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "responses": { + "200": { + "description": "Desktop recording metadata", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopRecordingInfo" + } + } + } + }, + "404": { + "description": "Unknown desktop recording", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + }, + "delete": { + "tags": ["v1"], + "summary": "Delete a desktop recording.", + "description": "Removes a completed desktop recording and its file from disk.", + "operationId": "delete_v1_desktop_recording", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "Desktop recording ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "responses": { + "204": { + "description": "Desktop recording deleted" + }, + "404": { + "description": "Unknown desktop recording", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop recording is still active", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/recordings/{id}/download": { + "get": { + "tags": 
["v1"], + "summary": "Download a desktop recording.", + "description": "Serves the recorded MP4 bytes for a completed desktop recording.", + "operationId": "get_v1_desktop_recording_download", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "Desktop recording ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "responses": { + "200": { + "description": "Desktop recording as MP4 bytes" + }, + "404": { + "description": "Unknown desktop recording", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/screenshot": { + "get": { + "tags": ["v1"], + "summary": "Capture a full desktop screenshot.", + "description": "Performs a health-gated full-frame screenshot of the managed desktop and\nreturns the requested image bytes.", + "operationId": "get_v1_desktop_screenshot", + "parameters": [ + { + "name": "format", + "in": "query", + "required": false, + "schema": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopScreenshotFormat" + } + ], + "nullable": true + } + }, + { + "name": "quality", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + } + }, + { + "name": "scale", + "in": "query", + "required": false, + "schema": { + "type": "number", + "format": "float", + "nullable": true + } + }, + { + "name": "showCursor", + "in": "query", + "required": false, + "schema": { + "type": "boolean", + "nullable": true + } + } + ], + "responses": { + "200": { + "description": "Desktop screenshot as image bytes" + }, + "400": { + "description": "Invalid screenshot query", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + 
"502": { + "description": "Desktop runtime health or screenshot capture failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/screenshot/region": { + "get": { + "tags": ["v1"], + "summary": "Capture a desktop screenshot region.", + "description": "Performs a health-gated screenshot crop against the managed desktop and\nreturns the requested region image bytes.", + "operationId": "get_v1_desktop_screenshot_region", + "parameters": [ + { + "name": "x", + "in": "query", + "required": true, + "schema": { + "type": "integer", + "format": "int32" + } + }, + { + "name": "y", + "in": "query", + "required": true, + "schema": { + "type": "integer", + "format": "int32" + } + }, + { + "name": "width", + "in": "query", + "required": true, + "schema": { + "type": "integer", + "format": "int32", + "minimum": 0 + } + }, + { + "name": "height", + "in": "query", + "required": true, + "schema": { + "type": "integer", + "format": "int32", + "minimum": 0 + } + }, + { + "name": "format", + "in": "query", + "required": false, + "schema": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopScreenshotFormat" + } + ], + "nullable": true + } + }, + { + "name": "quality", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + } + }, + { + "name": "scale", + "in": "query", + "required": false, + "schema": { + "type": "number", + "format": "float", + "nullable": true + } + }, + { + "name": "showCursor", + "in": "query", + "required": false, + "schema": { + "type": "boolean", + "nullable": true + } + } + ], + "responses": { + "200": { + "description": "Desktop screenshot region as image bytes" + }, + "400": { + "description": "Invalid screenshot region", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is 
not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop runtime health or screenshot capture failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/start": { + "post": { + "tags": ["v1"], + "summary": "Start the private desktop runtime.", + "description": "Lazily launches the managed Xvfb/openbox stack, validates display health,\nand returns the resulting desktop status snapshot.", + "operationId": "post_v1_desktop_start", + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStartRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Desktop runtime status after start", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStatusResponse" + } + } + } + }, + "400": { + "description": "Invalid desktop start request", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is already transitioning", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "501": { + "description": "Desktop API unsupported on this platform", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "503": { + "description": "Desktop runtime could not be started", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/status": { + "get": { + "tags": ["v1"], + "summary": "Get desktop runtime status.", + "description": "Returns the current desktop runtime state, dependency status, active\ndisplay metadata, and supervised process 
information.", + "operationId": "get_v1_desktop_status", + "responses": { + "200": { + "description": "Desktop runtime status", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStatusResponse" + } + } + } + }, + "401": { + "description": "Authentication required", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/stop": { + "post": { + "tags": ["v1"], + "summary": "Stop the private desktop runtime.", + "description": "Terminates the managed openbox/Xvfb/dbus processes owned by the desktop\nruntime and returns the resulting status snapshot.", + "operationId": "post_v1_desktop_stop", + "responses": { + "200": { + "description": "Desktop runtime status after stop", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStatusResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is already transitioning", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/stream/signaling": { + "get": { + "tags": ["v1"], + "summary": "Open a desktop WebRTC signaling session.", + "description": "Upgrades the connection to a WebSocket used for WebRTC signaling between\nthe browser client and the desktop streaming process. 
Also accepts mouse\nand keyboard input frames as a fallback transport.", + "operationId": "get_v1_desktop_stream_ws", + "parameters": [ + { + "name": "access_token", + "in": "query", + "description": "Bearer token alternative for WS auth", + "required": false, + "schema": { + "type": "string", + "nullable": true + } + } + ], + "responses": { + "101": { + "description": "WebSocket upgraded" + }, + "409": { + "description": "Desktop runtime or streaming session is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "502": { + "description": "Desktop stream failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/stream/start": { + "post": { + "tags": ["v1"], + "summary": "Start desktop streaming.", + "description": "Enables desktop websocket streaming for the managed desktop.", + "operationId": "post_v1_desktop_stream_start", + "responses": { + "200": { + "description": "Desktop streaming started", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStreamStatusResponse" + } + } + } + } + } + } + }, + "/v1/desktop/stream/status": { + "get": { + "tags": ["v1"], + "summary": "Get desktop stream status.", + "description": "Returns the current state of the desktop WebRTC streaming session.", + "operationId": "get_v1_desktop_stream_status", + "responses": { + "200": { + "description": "Desktop stream status", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStreamStatusResponse" + } + } + } + } + } + } + }, + "/v1/desktop/stream/stop": { + "post": { + "tags": ["v1"], + "summary": "Stop desktop streaming.", + "description": "Disables desktop websocket streaming for the managed desktop.", + "operationId": "post_v1_desktop_stream_stop", + "responses": { + "200": { + "description": "Desktop streaming stopped", + 
"content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopStreamStatusResponse" + } + } + } + } + } + } + }, + "/v1/desktop/windows": { + "get": { + "tags": ["v1"], + "summary": "List visible desktop windows.", + "description": "Performs a health-gated visible-window enumeration against the managed\ndesktop and returns the current window metadata.", + "operationId": "get_v1_desktop_windows", + "responses": { + "200": { + "description": "Visible desktop windows", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowListResponse" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "503": { + "description": "Desktop runtime health or window query failed", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/windows/focused": { + "get": { + "tags": ["v1"], + "summary": "Get the currently focused desktop window.", + "description": "Returns information about the window that currently has input focus.", + "operationId": "get_v1_desktop_windows_focused", + "responses": { + "200": { + "description": "Focused window info", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowInfo" + } + } + } + }, + "404": { + "description": "No window is focused", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/windows/{id}/focus": { + "post": { + "tags": ["v1"], + "summary": "Focus a desktop window.", + "description": "Brings the specified window to the foreground and 
gives it input focus.", + "operationId": "post_v1_desktop_window_focus", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "X11 window ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "responses": { + "200": { + "description": "Window info after focus", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowInfo" + } + } + } + }, + "404": { + "description": "Window not found", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/windows/{id}/move": { + "post": { + "tags": ["v1"], + "summary": "Move a desktop window.", + "description": "Moves the specified window to the given position.", + "operationId": "post_v1_desktop_window_move", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "X11 window ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowMoveRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Window info after move", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowInfo" + } + } + } + }, + "404": { + "description": "Window not found", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, + "/v1/desktop/windows/{id}/resize": { + "post": { + "tags": ["v1"], + "summary": "Resize a desktop window.", + "description": 
"Resizes the specified window to the given dimensions.", + "operationId": "post_v1_desktop_window_resize", + "parameters": [ + { + "name": "id", + "in": "path", + "description": "X11 window ID", + "required": true, + "schema": { + "type": "string" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowResizeRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "description": "Window info after resize", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DesktopWindowInfo" + } + } + } + }, + "404": { + "description": "Window not found", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + }, + "409": { + "description": "Desktop runtime is not ready", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ProblemDetails" + } + } + } + } + } + } + }, "/v1/fs/entries": { "get": { "tags": ["v1"], @@ -911,6 +2719,21 @@ "summary": "List all managed processes.", "description": "Returns a list of all processes (running and exited) currently tracked\nby the runtime, sorted by process ID.", "operationId": "get_v1_processes", + "parameters": [ + { + "name": "owner", + "in": "query", + "required": false, + "schema": { + "allOf": [ + { + "$ref": "#/components/schemas/ProcessOwner" + } + ], + "nullable": true + } + } + ], "responses": { "200": { "description": "List processes", @@ -1934,6 +3757,769 @@ } } }, + "DesktopActionResponse": { + "type": "object", + "required": ["ok"], + "properties": { + "ok": { + "type": "boolean" + } + } + }, + "DesktopClipboardQuery": { + "type": "object", + "properties": { + "selection": { + "type": "string", + "nullable": true + } + } + }, + "DesktopClipboardResponse": { + "type": "object", + "required": ["text", "selection"], + "properties": { + "selection": { + "type": "string" + }, + "text": { + "type": "string" + } + } + }, + 
"DesktopClipboardWriteRequest": { + "type": "object", + "required": ["text"], + "properties": { + "selection": { + "type": "string", + "nullable": true + }, + "text": { + "type": "string" + } + } + }, + "DesktopDisplayInfoResponse": { + "type": "object", + "required": ["display", "resolution"], + "properties": { + "display": { + "type": "string" + }, + "resolution": { + "$ref": "#/components/schemas/DesktopResolution" + } + } + }, + "DesktopErrorInfo": { + "type": "object", + "required": ["code", "message"], + "properties": { + "code": { + "type": "string" + }, + "message": { + "type": "string" + } + } + }, + "DesktopKeyModifiers": { + "type": "object", + "properties": { + "alt": { + "type": "boolean", + "nullable": true + }, + "cmd": { + "type": "boolean", + "nullable": true + }, + "ctrl": { + "type": "boolean", + "nullable": true + }, + "shift": { + "type": "boolean", + "nullable": true + } + } + }, + "DesktopKeyboardDownRequest": { + "type": "object", + "required": ["key"], + "properties": { + "key": { + "type": "string" + } + } + }, + "DesktopKeyboardPressRequest": { + "type": "object", + "required": ["key"], + "properties": { + "key": { + "type": "string" + }, + "modifiers": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopKeyModifiers" + } + ], + "nullable": true + } + } + }, + "DesktopKeyboardTypeRequest": { + "type": "object", + "required": ["text"], + "properties": { + "delayMs": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "text": { + "type": "string" + } + } + }, + "DesktopKeyboardUpRequest": { + "type": "object", + "required": ["key"], + "properties": { + "key": { + "type": "string" + } + } + }, + "DesktopLaunchRequest": { + "type": "object", + "required": ["app"], + "properties": { + "app": { + "type": "string" + }, + "args": { + "type": "array", + "items": { + "type": "string" + }, + "nullable": true + }, + "wait": { + "type": "boolean", + "nullable": true + } + } + }, + "DesktopLaunchResponse": { + 
"type": "object", + "required": ["processId"], + "properties": { + "pid": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "processId": { + "type": "string" + }, + "windowId": { + "type": "string", + "nullable": true + } + } + }, + "DesktopMouseButton": { + "type": "string", + "enum": ["left", "middle", "right"] + }, + "DesktopMouseClickRequest": { + "type": "object", + "required": ["x", "y"], + "properties": { + "button": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopMouseButton" + } + ], + "nullable": true + }, + "clickCount": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopMouseDownRequest": { + "type": "object", + "properties": { + "button": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopMouseButton" + } + ], + "nullable": true + }, + "x": { + "type": "integer", + "format": "int32", + "nullable": true + }, + "y": { + "type": "integer", + "format": "int32", + "nullable": true + } + } + }, + "DesktopMouseDragRequest": { + "type": "object", + "required": ["startX", "startY", "endX", "endY"], + "properties": { + "button": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopMouseButton" + } + ], + "nullable": true + }, + "endX": { + "type": "integer", + "format": "int32" + }, + "endY": { + "type": "integer", + "format": "int32" + }, + "startX": { + "type": "integer", + "format": "int32" + }, + "startY": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopMouseMoveRequest": { + "type": "object", + "required": ["x", "y"], + "properties": { + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopMousePositionResponse": { + "type": "object", + "required": ["x", "y"], + "properties": { + "screen": { + "type": "integer", + "format": "int32", + "nullable": true + 
}, + "window": { + "type": "string", + "nullable": true + }, + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopMouseScrollRequest": { + "type": "object", + "required": ["x", "y"], + "properties": { + "deltaX": { + "type": "integer", + "format": "int32", + "nullable": true + }, + "deltaY": { + "type": "integer", + "format": "int32", + "nullable": true + }, + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopMouseUpRequest": { + "type": "object", + "properties": { + "button": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopMouseButton" + } + ], + "nullable": true + }, + "x": { + "type": "integer", + "format": "int32", + "nullable": true + }, + "y": { + "type": "integer", + "format": "int32", + "nullable": true + } + } + }, + "DesktopOpenRequest": { + "type": "object", + "required": ["target"], + "properties": { + "target": { + "type": "string" + } + } + }, + "DesktopOpenResponse": { + "type": "object", + "required": ["processId"], + "properties": { + "pid": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "processId": { + "type": "string" + } + } + }, + "DesktopProcessInfo": { + "type": "object", + "required": ["name", "running"], + "properties": { + "logPath": { + "type": "string", + "nullable": true + }, + "name": { + "type": "string" + }, + "pid": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "running": { + "type": "boolean" + } + } + }, + "DesktopRecordingInfo": { + "type": "object", + "required": ["id", "status", "fileName", "bytes", "startedAt"], + "properties": { + "bytes": { + "type": "integer", + "format": "int64", + "minimum": 0 + }, + "endedAt": { + "type": "string", + "nullable": true + }, + "fileName": { + "type": "string" + }, + "id": { + "type": "string" + }, + "processId": { + "type": "string", + "nullable": true + 
}, + "startedAt": { + "type": "string" + }, + "status": { + "$ref": "#/components/schemas/DesktopRecordingStatus" + } + } + }, + "DesktopRecordingListResponse": { + "type": "object", + "required": ["recordings"], + "properties": { + "recordings": { + "type": "array", + "items": { + "$ref": "#/components/schemas/DesktopRecordingInfo" + } + } + } + }, + "DesktopRecordingStartRequest": { + "type": "object", + "properties": { + "fps": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + } + } + }, + "DesktopRecordingStatus": { + "type": "string", + "enum": ["recording", "completed", "failed"] + }, + "DesktopRegionScreenshotQuery": { + "type": "object", + "required": ["x", "y", "width", "height"], + "properties": { + "format": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopScreenshotFormat" + } + ], + "nullable": true + }, + "height": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "quality": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "scale": { + "type": "number", + "format": "float", + "nullable": true + }, + "showCursor": { + "type": "boolean", + "nullable": true + }, + "width": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopResolution": { + "type": "object", + "required": ["width", "height"], + "properties": { + "dpi": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "height": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "width": { + "type": "integer", + "format": "int32", + "minimum": 0 + } + } + }, + "DesktopScreenshotFormat": { + "type": "string", + "enum": ["png", "jpeg", "webp"] + }, + "DesktopScreenshotQuery": { + "type": "object", + "properties": { + "format": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopScreenshotFormat" + } + ], + "nullable": true 
+ }, + "quality": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "scale": { + "type": "number", + "format": "float", + "nullable": true + }, + "showCursor": { + "type": "boolean", + "nullable": true + } + } + }, + "DesktopStartRequest": { + "type": "object", + "properties": { + "displayNum": { + "type": "integer", + "format": "int32", + "nullable": true + }, + "dpi": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "height": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "recordingFps": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "stateDir": { + "type": "string", + "nullable": true + }, + "streamAudioCodec": { + "type": "string", + "nullable": true + }, + "streamFrameRate": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + }, + "streamVideoCodec": { + "type": "string", + "nullable": true + }, + "webrtcPortRange": { + "type": "string", + "nullable": true + }, + "width": { + "type": "integer", + "format": "int32", + "nullable": true, + "minimum": 0 + } + } + }, + "DesktopState": { + "type": "string", + "enum": ["inactive", "install_required", "starting", "active", "stopping", "failed"] + }, + "DesktopStatusResponse": { + "type": "object", + "required": ["state"], + "properties": { + "display": { + "type": "string", + "nullable": true + }, + "installCommand": { + "type": "string", + "nullable": true + }, + "lastError": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopErrorInfo" + } + ], + "nullable": true + }, + "missingDependencies": { + "type": "array", + "items": { + "type": "string" + } + }, + "processes": { + "type": "array", + "items": { + "$ref": "#/components/schemas/DesktopProcessInfo" + } + }, + "resolution": { + "allOf": [ + { + "$ref": "#/components/schemas/DesktopResolution" + } + ], + "nullable": true + }, + "runtimeLogPath": { + "type": "string", + "nullable": 
true + }, + "startedAt": { + "type": "string", + "nullable": true + }, + "state": { + "$ref": "#/components/schemas/DesktopState" + }, + "windows": { + "type": "array", + "items": { + "$ref": "#/components/schemas/DesktopWindowInfo" + }, + "description": "Current visible windows (included when the desktop is active)." + } + } + }, + "DesktopStreamStatusResponse": { + "type": "object", + "required": ["active"], + "properties": { + "active": { + "type": "boolean" + }, + "processId": { + "type": "string", + "nullable": true + }, + "windowId": { + "type": "string", + "nullable": true + } + } + }, + "DesktopWindowInfo": { + "type": "object", + "required": ["id", "title", "x", "y", "width", "height", "isActive"], + "properties": { + "height": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "id": { + "type": "string" + }, + "isActive": { + "type": "boolean" + }, + "title": { + "type": "string" + }, + "width": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopWindowListResponse": { + "type": "object", + "required": ["windows"], + "properties": { + "windows": { + "type": "array", + "items": { + "$ref": "#/components/schemas/DesktopWindowInfo" + } + } + } + }, + "DesktopWindowMoveRequest": { + "type": "object", + "required": ["x", "y"], + "properties": { + "x": { + "type": "integer", + "format": "int32" + }, + "y": { + "type": "integer", + "format": "int32" + } + } + }, + "DesktopWindowResizeRequest": { + "type": "object", + "required": ["width", "height"], + "properties": { + "height": { + "type": "integer", + "format": "int32", + "minimum": 0 + }, + "width": { + "type": "integer", + "format": "int32", + "minimum": 0 + } + } + }, "ErrorType": { "type": "string", "enum": [ @@ -2326,7 +4912,7 @@ }, "ProcessInfo": { "type": "object", - "required": ["id", "command", "args", "tty", "interactive", "status", "createdAtMs"], 
+ "required": ["id", "command", "args", "tty", "interactive", "owner", "status", "createdAtMs"], "properties": { "args": { "type": "array", @@ -2361,6 +4947,9 @@ "interactive": { "type": "boolean" }, + "owner": { + "$ref": "#/components/schemas/ProcessOwner" + }, "pid": { "type": "integer", "format": "int32", @@ -2398,6 +4987,19 @@ } } }, + "ProcessListQuery": { + "type": "object", + "properties": { + "owner": { + "allOf": [ + { + "$ref": "#/components/schemas/ProcessOwner" + } + ], + "nullable": true + } + } + }, "ProcessListResponse": { "type": "object", "required": ["processes"], @@ -2484,6 +5086,10 @@ "type": "string", "enum": ["stdout", "stderr", "combined", "pty"] }, + "ProcessOwner": { + "type": "string", + "enum": ["user", "desktop", "system"] + }, "ProcessRunRequest": { "type": "object", "required": ["command"], diff --git a/docs/quickstart.mdx b/docs/quickstart.mdx index 33f7120..5e0af9b 100644 --- a/docs/quickstart.mdx +++ b/docs/quickstart.mdx @@ -1,370 +1,289 @@ --- title: "Quickstart" -description: "Get a coding agent running in a sandbox in under a minute." +description: "Start the server and send your first message." icon: "rocket" --- - + - + + ```bash + npx skills add rivet-dev/skills -s sandbox-agent + ``` + + + ```bash + bunx skills add rivet-dev/skills -s sandbox-agent + ``` + + + + + + Each coding agent requires API keys to connect to their respective LLM providers. + + + + ```bash + export ANTHROPIC_API_KEY="sk-ant-..." + export OPENAI_API_KEY="sk-..." 
+ ``` + + + + ```typescript + import { Sandbox } from "@e2b/code-interpreter"; + + const envs: Record<string, string> = {}; + if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY; + if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY; + + const sandbox = await Sandbox.create({ envs }); + ``` + + + + ```typescript + import { Daytona } from "@daytonaio/sdk"; + + const envVars: Record<string, string> = {}; + if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY; + if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY; + + const daytona = new Daytona(); + const sandbox = await daytona.create({ + snapshot: "sandbox-agent-ready", + envVars, + }); + ``` + + + + ```bash + docker run -p 2468:2468 \ + -e ANTHROPIC_API_KEY="sk-ant-..." \ + -e OPENAI_API_KEY="sk-..." \ + rivetdev/sandbox-agent:0.3.1-full \ + server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + + + Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from local Claude Code or Codex config files. + + + Use the `mock` agent for SDK and integration testing without provider credentials. + + + For per-tenant token tracking, budget enforcement, or usage-based billing, see [LLM Credentials](/llm-credentials) for gateway options like OpenRouter, LiteLLM, and Portkey. + + + + + + + + Install and run the binary directly. + + ```bash + curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh + sandbox-agent server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + Run without installing globally. + + ```bash + npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + Run without installing globally. + + ```bash + bunx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + Install globally, then run. 
+ + ```bash + npm install -g @sandbox-agent/cli@0.3.x + sandbox-agent server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + Install globally, then run. + + ```bash + bun add -g @sandbox-agent/cli@0.3.x + # Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()). + bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64 + sandbox-agent server --no-token --host 0.0.0.0 --port 2468 + ``` + + + + For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess. + ```bash npm install sandbox-agent@0.3.x ``` + + ```typescript + import { SandboxAgent } from "sandbox-agent"; + + const sdk = await SandboxAgent.start(); + ``` - + + + For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess. + ```bash bun add sandbox-agent@0.3.x # Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()). bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64 ``` - - - - - - `SandboxAgent.start()` provisions a sandbox, starts a lightweight [Sandbox Agent server](/architecture) inside it, and connects your SDK client. - - - - ```bash - npm install sandbox-agent@0.3.x - ``` ```typescript import { SandboxAgent } from "sandbox-agent"; - import { local } from "sandbox-agent/local"; - // Runs on your machine. Inherits process.env automatically. - const client = await SandboxAgent.start({ - sandbox: local(), - }); + const sdk = await SandboxAgent.start(); ``` - - See [Local deploy guide](/deploy/local) - + + If you're running from source instead of the installed CLI. 
+ ```bash - npm install sandbox-agent@0.3.x @e2b/code-interpreter + cargo run -p sandbox-agent -- server --no-token --host 0.0.0.0 --port 2468 ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { e2b } from "sandbox-agent/e2b"; - - // Provisions a cloud sandbox on E2B, installs the server, and connects. - const client = await SandboxAgent.start({ - sandbox: e2b(), - }); - ``` - - See [E2B deploy guide](/deploy/e2b) - - - - ```bash - npm install sandbox-agent@0.3.x @daytonaio/sdk - ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { daytona } from "sandbox-agent/daytona"; - - // Provisions a Daytona workspace with the server pre-installed. - const client = await SandboxAgent.start({ - sandbox: daytona(), - }); - ``` - - See [Daytona deploy guide](/deploy/daytona) - - - - ```bash - npm install sandbox-agent@0.3.x @vercel/sandbox - ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { vercel } from "sandbox-agent/vercel"; - - // Provisions a Vercel sandbox with the server installed on boot. - const client = await SandboxAgent.start({ - sandbox: vercel(), - }); - ``` - - See [Vercel deploy guide](/deploy/vercel) - - - - ```bash - npm install sandbox-agent@0.3.x modal - ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { modal } from "sandbox-agent/modal"; - - // Builds a container image with agents pre-installed (cached after first run), - // starts a Modal sandbox from that image, and connects. - const client = await SandboxAgent.start({ - sandbox: modal(), - }); - ``` - - See [Modal deploy guide](/deploy/modal) - - - - ```bash - npm install sandbox-agent@0.3.x @cloudflare/sandbox - ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { cloudflare } from "sandbox-agent/cloudflare"; - import { SandboxClient } from "@cloudflare/sandbox"; - - // Uses the Cloudflare Sandbox SDK to provision and connect. 
- // The Cloudflare SDK handles server lifecycle internally. - const cfSandboxClient = new SandboxClient(); - const client = await SandboxAgent.start({ - sandbox: cloudflare({ sdk: cfSandboxClient }), - }); - ``` - - See [Cloudflare deploy guide](/deploy/cloudflare) - - - - ```bash - npm install sandbox-agent@0.3.x dockerode get-port - ``` - - ```typescript - import { SandboxAgent } from "sandbox-agent"; - import { docker } from "sandbox-agent/docker"; - - // Runs a Docker container locally. Good for testing. - const client = await SandboxAgent.start({ - sandbox: docker(), - }); - ``` - - See [Docker deploy guide](/deploy/docker) -
- - **More info:** + Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely. - - Agents need API keys for their LLM provider. Each provider passes credentials differently: + + Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure networking at the infrastructure layer. - ```typescript - // Local — inherits process.env automatically + If you expose the server publicly, use `--token "$SANDBOX_TOKEN"` to require authentication: - // E2B - e2b({ create: { envs: { ANTHROPIC_API_KEY: "..." } } }) - - // Daytona - daytona({ create: { envVars: { ANTHROPIC_API_KEY: "..." } } }) - - // Vercel - vercel({ create: { env: { ANTHROPIC_API_KEY: "..." } } }) - - // Modal - modal({ create: { secrets: { ANTHROPIC_API_KEY: "..." } } }) - - // Docker - docker({ env: ["ANTHROPIC_API_KEY=..."] }) + ```bash + sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468 ``` - For multi-tenant billing, per-user keys, and gateway options, see [LLM Credentials](/llm-credentials). 
- + Then pass the token when connecting: - - Implement the `SandboxProvider` interface to use any sandbox platform: - - ```typescript - import { SandboxAgent, type SandboxProvider } from "sandbox-agent"; - - const myProvider: SandboxProvider = { - name: "my-provider", - async create() { - // Provision a sandbox, install & start the server, return an ID - return "sandbox-123"; - }, - async destroy(sandboxId) { - // Tear down the sandbox - }, - async getUrl(sandboxId) { - // Return the Sandbox Agent server URL - return `https://${sandboxId}.my-platform.dev:3000`; - }, - }; - - const client = await SandboxAgent.start({ - sandbox: myProvider, - }); - ``` - - - - If you already have a Sandbox Agent server running, connect directly: - - ```typescript - const client = await SandboxAgent.connect({ - baseUrl: "http://127.0.0.1:2468", - }); - ``` - - - + + ```typescript + import { SandboxAgent } from "sandbox-agent"; + + const sdk = await SandboxAgent.connect({ + baseUrl: "http://your-server:2468", + token: process.env.SANDBOX_TOKEN, + }); + ``` + + ```bash - curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh - sandbox-agent server --no-token --host 0.0.0.0 --port 2468 + curl "http://your-server:2468/v1/health" \ + -H "Authorization: Bearer $SANDBOX_TOKEN" ``` - + + ```bash - npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468 - ``` - - - ```bash - docker run -p 2468:2468 \ - -e ANTHROPIC_API_KEY="sk-ant-..." \ - -e OPENAI_API_KEY="sk-..." \ - rivetdev/sandbox-agent:0.4.1-rc.1-full \ - server --no-token --host 0.0.0.0 --port 2468 + sandbox-agent --token "$SANDBOX_TOKEN" api agents list \ + --endpoint http://your-server:2468 ``` + + If you're calling the server from a browser, see the [CORS configuration guide](/cors). 
+ - - + + To preinstall agents: - ```typescript Claude - const session = await client.createSession({ - agent: "claude", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." }, - ]); - - console.log(result.stopReason); - ``` - - ```typescript Codex - const session = await client.createSession({ - agent: "codex", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." }, - ]); - - console.log(result.stopReason); - ``` - - ```typescript OpenCode - const session = await client.createSession({ - agent: "opencode", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." }, - ]); - - console.log(result.stopReason); - ``` - - ```typescript Cursor - const session = await client.createSession({ - agent: "cursor", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." }, - ]); - - console.log(result.stopReason); - ``` - - ```typescript Amp - const session = await client.createSession({ - agent: "amp", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." 
}, - ]); - - console.log(result.stopReason); - ``` - - ```typescript Pi - const session = await client.createSession({ - agent: "pi", - }); - - session.onEvent((event) => { - console.log(event.sender, event.payload); - }); - - const result = await session.prompt([ - { type: "text", text: "Summarize the repository and suggest next steps." }, - ]); - - console.log(result.stopReason); - ``` - - - - See [Agent Sessions](/agent-sessions) for the full sessions API. - - - - ```typescript - await client.destroySandbox(); // provider-defined cleanup and disconnect + ```bash + sandbox-agent install-agent --all ``` - Use `client.dispose()` instead to disconnect without changing sandbox state. On E2B, `client.pauseSandbox()` pauses the sandbox and `client.killSandbox()` deletes it permanently. + If agents are not installed up front, they are lazily installed when creating a session. - - Open the Inspector at `/ui/` on your server (e.g. `http://localhost:2468/ui/`) to view sessions and events in a GUI. + + If you want to use `/v1/desktop/*`, install the desktop runtime packages first: + + ```bash + sandbox-agent install desktop --yes + ``` + + Then use `GET /v1/desktop/status` or `sdk.getDesktopStatus()` to verify the runtime is ready before calling desktop screenshot or input APIs. + + + + ```typescript + import { SandboxAgent } from "sandbox-agent"; + + const sdk = await SandboxAgent.connect({ + baseUrl: "http://127.0.0.1:2468", + }); + + const session = await sdk.createSession({ + agent: "claude", + sessionInit: { + cwd: "/", + mcpServers: [], + }, + }); + + console.log(session.id); + ``` + + + + ```typescript + const result = await session.prompt([ + { type: "text", text: "Summarize the repository and suggest next steps." 
}, + ]); + + console.log(result.stopReason); + ``` + + + + ```typescript + const off = session.onEvent((event) => { + console.log(event.sender, event.payload); + }); + + const page = await sdk.getEvents({ + sessionId: session.id, + limit: 50, + }); + + console.log(page.items.length); + off(); + ``` + + + + Open the Inspector UI at `/ui/` on your server (for example, `http://localhost:2468/ui/`) to inspect sessions and events in a GUI. Sandbox Agent Inspector @@ -372,44 +291,16 @@ icon: "rocket" -## Full example - -```typescript -import { SandboxAgent } from "sandbox-agent"; -import { e2b } from "sandbox-agent/e2b"; - -const client = await SandboxAgent.start({ - sandbox: e2b({ - create: { - envs: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY }, - }, - }), -}); - -try { - const session = await client.createSession({ agent: "claude" }); - - session.onEvent((event) => { - console.log(`[${event.sender}]`, JSON.stringify(event.payload)); - }); - - const result = await session.prompt([ - { type: "text", text: "Write a function that checks if a number is prime." }, - ]); - - console.log("Done:", result.stopReason); -} finally { - await client.destroySandbox(); -} -``` - ## Next steps - - - Full TypeScript SDK API surface. + + + Configure in-memory, Rivet Actor state, IndexedDB, SQLite, and Postgres persistence. - Deploy to E2B, Daytona, Docker, Vercel, or Cloudflare. + Deploy your agent to E2B, Daytona, Docker, Vercel, or Cloudflare. + + + Use the latest TypeScript SDK API. diff --git a/docs/sdk-overview.mdx b/docs/sdk-overview.mdx index 8e7c8f6..8e87e2e 100644 --- a/docs/sdk-overview.mdx +++ b/docs/sdk-overview.mdx @@ -196,6 +196,44 @@ const writeResult = await sdk.writeFsFile({ path: "./hello.txt" }, "hello"); console.log(health.status, agents.agents.length, entries.length, writeResult.path); ``` +## Desktop API + +The SDK also wraps the desktop host/runtime HTTP API. 
+ +Install desktop dependencies first on Linux hosts: + +```bash +sandbox-agent install desktop --yes +``` + +Then query status, surface remediation if needed, and start the runtime: + +```ts +const status = await sdk.getDesktopStatus(); + +if (status.state === "install_required") { + console.log(status.installCommand); +} + +const started = await sdk.startDesktop({ + width: 1440, + height: 900, + dpi: 96, +}); + +const screenshot = await sdk.takeDesktopScreenshot(); +const displayInfo = await sdk.getDesktopDisplayInfo(); + +await sdk.moveDesktopMouse({ x: 400, y: 300 }); +await sdk.clickDesktop({ x: 400, y: 300, button: "left", clickCount: 1 }); +await sdk.typeDesktopText({ text: "hello world", delayMs: 10 }); +await sdk.pressDesktopKey({ key: "ctrl+l" }); + +await sdk.stopDesktop(); +``` + +Screenshot helpers return `Uint8Array` PNG bytes. The SDK does not attempt to install OS packages remotely; callers should surface `missingDependencies` and `installCommand` from `getDesktopStatus()`. 
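One way a caller might surface `missingDependencies` and `installCommand` is sketched below. This is a minimal illustration, not part of the SDK: `DesktopStatusLike` is a local type that mirrors only the `DesktopStatusResponse` fields used here, and `desktopRemediation` is a hypothetical helper name.

```typescript
// Local mirror of the DesktopState enum from the OpenAPI schema.
type DesktopState = "inactive" | "install_required" | "starting" | "active" | "stopping" | "failed";

// Assumption: only the fields needed for remediation; the real SDK type may carry more.
interface DesktopStatusLike {
  state: DesktopState;
  installCommand?: string | null;
  missingDependencies?: string[];
}

// Hypothetical helper: returns a human-readable hint, or null when no install is required.
function desktopRemediation(status: DesktopStatusLike): string | null {
  if (status.state !== "install_required") return null;

  const parts: string[] = [];
  const missing = status.missingDependencies ?? [];
  if (missing.length > 0) parts.push(`Missing dependencies: ${missing.join(", ")}`);
  if (status.installCommand) parts.push(`Run: ${status.installCommand}`);

  // Fall back to a generic message when the server reports neither field.
  return parts.length > 0 ? parts.join("\n") : "Desktop runtime requires installation.";
}
```

In practice the argument would presumably be the result of `sdk.getDesktopStatus()`, with the returned hint shown to the user before any call to `startDesktop()`.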
+ ## Error handling ```ts diff --git a/frontend/packages/inspector/index.html b/frontend/packages/inspector/index.html index 5893717..6a5d064 100644 --- a/frontend/packages/inspector/index.html +++ b/frontend/packages/inspector/index.html @@ -2889,6 +2889,181 @@ gap: 20px; } + .desktop-panel { + display: flex; + flex-direction: column; + gap: 16px; + } + + .desktop-state-grid { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 12px; + margin-bottom: 12px; + } + + .desktop-start-controls { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 10px; + } + + .desktop-screenshot-controls { + display: flex; + align-items: flex-end; + gap: 10px; + flex-wrap: wrap; + margin-bottom: 12px; + } + + .desktop-checkbox-label { + display: flex; + align-items: center; + gap: 6px; + font-size: 12px; + cursor: pointer; + white-space: nowrap; + padding-bottom: 4px; + } + + .desktop-advanced-grid { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 10px; + margin-top: 8px; + } + + .desktop-input-group { + display: flex; + flex-direction: column; + gap: 4px; + } + + .desktop-chip-list { + display: flex; + flex-wrap: wrap; + gap: 8px; + } + + .desktop-command { + margin-top: 6px; + padding: 8px 10px; + border-radius: var(--radius); + border: 1px solid var(--border); + background: var(--surface); + overflow-x: auto; + } + + .desktop-diagnostic-block + .desktop-diagnostic-block { + margin-top: 14px; + } + + .desktop-process-list { + display: flex; + flex-direction: column; + gap: 10px; + margin-top: 8px; + } + + .desktop-process-item { + padding: 10px; + border-radius: var(--radius); + border: 1px solid var(--border); + background: var(--surface); + display: flex; + flex-direction: column; + gap: 4px; + } + + .desktop-clipboard-text { + margin: 4px 0 0; + padding: 8px 10px; + border-radius: var(--radius); + border: 1px solid var(--border); + background: var(--surface); + font-size: 12px; + white-space: pre-wrap; + 
word-break: break-all; + max-height: 120px; + overflow-y: auto; + } + + .desktop-window-item { + padding: 10px; + border-radius: var(--radius); + border: 1px solid var(--border); + background: var(--surface); + display: flex; + flex-direction: column; + gap: 6px; + } + + .desktop-window-focused { + border-color: var(--success); + box-shadow: inset 0 0 0 1px var(--success); + } + + .desktop-window-editor { + display: flex; + align-items: center; + gap: 6px; + margin-top: 6px; + padding-top: 6px; + border-top: 1px solid var(--border); + } + + .desktop-launch-row { + display: flex; + align-items: center; + gap: 8px; + margin-top: 6px; + flex-wrap: wrap; + } + + .desktop-mouse-pos { + display: flex; + align-items: center; + gap: 8px; + margin-top: 8px; + } + + .desktop-stream-hint { + display: flex; + align-items: center; + justify-content: space-between; + gap: 12px; + margin-bottom: 8px; + font-size: 11px; + color: var(--muted); + } + + .desktop-screenshot-empty { + padding: 18px; + border: 1px dashed var(--border); + border-radius: var(--radius); + color: var(--muted); + background: var(--surface); + text-align: center; + } + + .desktop-screenshot-frame { + border-radius: calc(var(--radius) + 2px); + overflow: hidden; + border: 1px solid var(--border); + background: + linear-gradient(135deg, rgba(15, 23, 42, 0.9), rgba(30, 41, 59, 0.92)), + radial-gradient(circle at top right, rgba(56, 189, 248, 0.12), transparent 40%); + padding: 10px; + } + + .desktop-screenshot-image { + display: block; + width: 100%; + height: auto; + border-radius: var(--radius); + background: rgba(0, 0, 0, 0.24); + } + .processes-section { display: flex; flex-direction: column; @@ -3551,6 +3726,12 @@ grid-template-columns: 1fr; } + .desktop-state-grid, + .desktop-start-controls, + .desktop-advanced-grid { + grid-template-columns: 1fr; + } + .session-sidebar { display: none; } diff --git a/frontend/packages/inspector/package.json b/frontend/packages/inspector/package.json index d17c3a0..6d86357 
100644 --- a/frontend/packages/inspector/package.json +++ b/frontend/packages/inspector/package.json @@ -18,6 +18,7 @@ "@types/react-dom": "^19.1.6", "@vitejs/plugin-react": "^4.3.1", "fake-indexeddb": "^6.2.4", + "jsdom": "^26.1.0", "typescript": "^5.7.3", "vite": "^5.4.7", "vitest": "^3.0.0" diff --git a/frontend/packages/inspector/src/components/debug/DebugPanel.tsx b/frontend/packages/inspector/src/components/debug/DebugPanel.tsx index 879ab96..9855d38 100644 --- a/frontend/packages/inspector/src/components/debug/DebugPanel.tsx +++ b/frontend/packages/inspector/src/components/debug/DebugPanel.tsx @@ -1,4 +1,4 @@ -import { ChevronLeft, ChevronRight, Cloud, Play, PlayCircle, Server, Terminal, Wrench } from "lucide-react"; +import { ChevronLeft, ChevronRight, Cloud, Monitor, Play, PlayCircle, Server, Terminal, Wrench } from "lucide-react"; import type { AgentInfo, SandboxAgent, SessionEvent } from "sandbox-agent"; type AgentModeInfo = { id: string; name: string; description: string }; @@ -9,9 +9,10 @@ import ProcessesTab from "./ProcessesTab"; import ProcessRunTab from "./ProcessRunTab"; import SkillsTab from "./SkillsTab"; import RequestLogTab from "./RequestLogTab"; +import DesktopTab from "./DesktopTab"; import type { RequestLog } from "../../types/requestLog"; -export type DebugTab = "log" | "events" | "agents" | "mcp" | "skills" | "processes" | "run-process"; +export type DebugTab = "log" | "events" | "agents" | "desktop" | "mcp" | "skills" | "processes" | "run-process"; const DebugPanel = ({ debugTab, @@ -75,6 +76,10 @@ const DebugPanel = ({ Agents + + {isActive && !liveViewActive && ( + + )} +
+ {isActive && !liveViewActive && ( +
+
+ + +
+ {screenshotFormat !== "png" && ( +
+ + setScreenshotQuality(event.target.value)} + inputMode="numeric" + style={{ maxWidth: 60 }} + /> +
+ )} +
+ + setScreenshotScale(event.target.value)} + inputMode="decimal" + style={{ maxWidth: 60 }} + /> +
+ +
+ )} + {error &&
{error}
} + {screenshotError &&
{screenshotError}
} + {/* ========== Runtime Section ========== */} +
+
+ + + Desktop Runtime + + + {status?.state ?? "unknown"} + +
+
+
+
Display
+
{status?.display ?? "Not assigned"}
+
+
+
Resolution
+
{resolutionLabel}
+
+
+
Started
+
{formatStartedAt(status?.startedAt)}
+
+
+
+
+ + setWidth(event.target.value)} inputMode="numeric" /> +
+
+ + setHeight(event.target.value)} inputMode="numeric" /> +
+
+ + setDpi(event.target.value)} inputMode="numeric" /> +
+
+ + {showAdvancedStart && ( +
+
+ + +
+
+ + +
+
+ + setStreamFrameRate(event.target.value)} + inputMode="numeric" + disabled={isActive} + /> +
+
+ + setWebrtcPortRange(event.target.value)} disabled={isActive} /> +
+
+ + setDefaultRecordingFps(event.target.value)} + inputMode="numeric" + disabled={isActive} + /> +
+
+ )} +
+ {isActive ? ( + + ) : ( + + )} +
+
+ {/* ========== Missing Dependencies ========== */} + {status?.missingDependencies && status.missingDependencies.length > 0 && ( +
+
+ Missing Dependencies +
+
+ {status.missingDependencies.map((dependency) => ( + + {dependency} + + ))} +
+ {status.installCommand && ( + <> +
+ Install command +
+
{status.installCommand}
+ + )} +
+ )} + {/* ========== Live View Section ========== */} +
+
+ + + {isActive && ( + + )} +
+ {liveViewError && ( +
+ {liveViewError} +
+ )} + {!isActive &&
Start the desktop runtime to enable live view.
} + {isActive && liveViewActive && ( + <> +
+ Right click to open window + {status?.resolution && ( + + {status.resolution.width}x{status.resolution.height} + + )} +
+ + + )} + {isActive && !liveViewActive && ( + <> + {screenshotUrl ? ( +
+ Desktop screenshot +
+ ) : ( +
Click "Start Stream" for live desktop view, or use the Screenshot button above.
+ )} + + )} + {isActive && ( +
+ + {mousePos && ( + + ({mousePos.x}, {mousePos.y}) + + )} +
+ )} +
+ {isActive && ( +
+
+ + + Clipboard + +
+ + +
+
+ {clipboardError && ( +
+ {clipboardError} +
+ )} +
+
Current contents
+
+              {clipboardText ? clipboardText : (empty)}
+            
+
+
+
Write to clipboard
+