feat: desktop computer-use APIs with neko-based streaming

Add desktop runtime management (Xvfb, openbox, dbus), screen capture,
mouse/keyboard input, and video streaming via neko binary extracted
from the m1k1o/neko container. Includes Docker test rig, TypeScript SDK
desktop support, and inspector Desktop tab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Nathan Flurry 2026-03-16 17:56:39 -07:00
parent 3895e34bdb
commit 33821d8660
66 changed files with 13190 additions and 1135 deletions

View file

@ -1,370 +1,289 @@
---
title: "Quickstart"
description: "Get a coding agent running in a sandbox in under a minute."
description: "Start the server and send your first message."
icon: "rocket"
---
<Steps>
<Step title="Install">
<Step title="Install skill (optional)">
<Tabs>
<Tab title="npm">
<Tab title="npx">
```bash
npx skills add rivet-dev/skills -s sandbox-agent
```
</Tab>
<Tab title="bunx">
```bash
bunx skills add rivet-dev/skills -s sandbox-agent
```
</Tab>
</Tabs>
</Step>
<Step title="Set environment variables">
Each coding agent requires API keys to connect to their respective LLM providers.
<Tabs>
<Tab title="Local shell">
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
```
</Tab>
<Tab title="E2B">
```typescript
import { Sandbox } from "@e2b/code-interpreter";
const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const sandbox = await Sandbox.create({ envs });
```
</Tab>
<Tab title="Daytona">
```typescript
import { Daytona } from "@daytonaio/sdk";
const envVars: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const daytona = new Daytona();
const sandbox = await daytona.create({
snapshot: "sandbox-agent-ready",
envVars,
});
```
</Tab>
<Tab title="Docker">
```bash
docker run -p 2468:2468 \
-e ANTHROPIC_API_KEY="sk-ant-..." \
-e OPENAI_API_KEY="sk-..." \
rivetdev/sandbox-agent:0.3.1-full \
server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
</Tabs>
<AccordionGroup>
<Accordion title="Extracting API keys from current machine">
Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from local Claude Code or Codex config files.
</Accordion>
<Accordion title="Testing without API keys">
Use the `mock` agent for SDK and integration testing without provider credentials.
</Accordion>
<Accordion title="Multi-tenant and per-user billing">
For per-tenant token tracking, budget enforcement, or usage-based billing, see [LLM Credentials](/llm-credentials) for gateway options like OpenRouter, LiteLLM, and Portkey.
</Accordion>
</AccordionGroup>
</Step>
<Step title="Run the server">
<Tabs>
<Tab title="curl">
Install and run the binary directly.
```bash
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="npx">
Run without installing globally.
```bash
npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="bunx">
Run without installing globally.
```bash
bunx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="npm i -g">
Install globally, then run.
```bash
npm install -g @sandbox-agent/cli@0.3.x
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="bun add -g">
Install globally, then run.
```bash
bun add -g @sandbox-agent/cli@0.3.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="Node.js (local)">
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.
```bash
npm install sandbox-agent@0.3.x
```
```typescript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.start();
```
</Tab>
<Tab title="bun">
<Tab title="Bun (local)">
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.
```bash
bun add sandbox-agent@0.3.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
</Tab>
</Tabs>
</Step>
<Step title="Start the sandbox">
`SandboxAgent.start()` provisions a sandbox, starts a lightweight [Sandbox Agent server](/architecture) inside it, and connects your SDK client.
<Tabs>
<Tab title="Local">
```bash
npm install sandbox-agent@0.3.x
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { local } from "sandbox-agent/local";
// Runs on your machine. Inherits process.env automatically.
const client = await SandboxAgent.start({
sandbox: local(),
});
const sdk = await SandboxAgent.start();
```
See [Local deploy guide](/deploy/local)
</Tab>
<Tab title="E2B">
<Tab title="Build from source">
If you're running from source instead of the installed CLI.
```bash
npm install sandbox-agent@0.3.x @e2b/code-interpreter
cargo run -p sandbox-agent -- server --no-token --host 0.0.0.0 --port 2468
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { e2b } from "sandbox-agent/e2b";
// Provisions a cloud sandbox on E2B, installs the server, and connects.
const client = await SandboxAgent.start({
sandbox: e2b(),
});
```
See [E2B deploy guide](/deploy/e2b)
</Tab>
<Tab title="Daytona">
```bash
npm install sandbox-agent@0.3.x @daytonaio/sdk
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { daytona } from "sandbox-agent/daytona";
// Provisions a Daytona workspace with the server pre-installed.
const client = await SandboxAgent.start({
sandbox: daytona(),
});
```
See [Daytona deploy guide](/deploy/daytona)
</Tab>
<Tab title="Vercel">
```bash
npm install sandbox-agent@0.3.x @vercel/sandbox
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { vercel } from "sandbox-agent/vercel";
// Provisions a Vercel sandbox with the server installed on boot.
const client = await SandboxAgent.start({
sandbox: vercel(),
});
```
See [Vercel deploy guide](/deploy/vercel)
</Tab>
<Tab title="Modal">
```bash
npm install sandbox-agent@0.3.x modal
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { modal } from "sandbox-agent/modal";
// Builds a container image with agents pre-installed (cached after first run),
// starts a Modal sandbox from that image, and connects.
const client = await SandboxAgent.start({
sandbox: modal(),
});
```
See [Modal deploy guide](/deploy/modal)
</Tab>
<Tab title="Cloudflare">
```bash
npm install sandbox-agent@0.3.x @cloudflare/sandbox
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { cloudflare } from "sandbox-agent/cloudflare";
import { SandboxClient } from "@cloudflare/sandbox";
// Uses the Cloudflare Sandbox SDK to provision and connect.
// The Cloudflare SDK handles server lifecycle internally.
const cfSandboxClient = new SandboxClient();
const client = await SandboxAgent.start({
sandbox: cloudflare({ sdk: cfSandboxClient }),
});
```
See [Cloudflare deploy guide](/deploy/cloudflare)
</Tab>
<Tab title="Docker">
```bash
npm install sandbox-agent@0.3.x dockerode get-port
```
```typescript
import { SandboxAgent } from "sandbox-agent";
import { docker } from "sandbox-agent/docker";
// Runs a Docker container locally. Good for testing.
const client = await SandboxAgent.start({
sandbox: docker(),
});
```
See [Docker deploy guide](/deploy/docker)
</Tab>
</Tabs>
<div style={{ height: "1rem" }} />
**More info:**
Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely.
<AccordionGroup>
<Accordion title="Passing LLM credentials">
Agents need API keys for their LLM provider. Each provider passes credentials differently:
<Accordion title="Configuring token">
Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure networking at the infrastructure layer.
```typescript
// Local — inherits process.env automatically
If you expose the server publicly, use `--token "$SANDBOX_TOKEN"` to require authentication:
// E2B
e2b({ create: { envs: { ANTHROPIC_API_KEY: "..." } } })
// Daytona
daytona({ create: { envVars: { ANTHROPIC_API_KEY: "..." } } })
// Vercel
vercel({ create: { env: { ANTHROPIC_API_KEY: "..." } } })
// Modal
modal({ create: { secrets: { ANTHROPIC_API_KEY: "..." } } })
// Docker
docker({ env: ["ANTHROPIC_API_KEY=..."] })
```bash
sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
```
For multi-tenant billing, per-user keys, and gateway options, see [LLM Credentials](/llm-credentials).
</Accordion>
Then pass the token when connecting:
<Accordion title="Implementing a custom provider">
Implement the `SandboxProvider` interface to use any sandbox platform:
```typescript
import { SandboxAgent, type SandboxProvider } from "sandbox-agent";
const myProvider: SandboxProvider = {
name: "my-provider",
async create() {
// Provision a sandbox, install & start the server, return an ID
return "sandbox-123";
},
async destroy(sandboxId) {
// Tear down the sandbox
},
async getUrl(sandboxId) {
// Return the Sandbox Agent server URL
return `https://${sandboxId}.my-platform.dev:3000`;
},
};
const client = await SandboxAgent.start({
sandbox: myProvider,
});
```
</Accordion>
<Accordion title="Connecting to an existing server">
If you already have a Sandbox Agent server running, connect directly:
```typescript
const client = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
```
</Accordion>
<Accordion title="Starting the server manually">
<Tabs>
<Tab title="TypeScript">
```typescript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://your-server:2468",
token: process.env.SANDBOX_TOKEN,
});
```
</Tab>
<Tab title="curl">
```bash
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
curl "http://your-server:2468/v1/health" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
</Tab>
<Tab title="npx">
<Tab title="CLI">
```bash
npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="Docker">
```bash
docker run -p 2468:2468 \
-e ANTHROPIC_API_KEY="sk-ant-..." \
-e OPENAI_API_KEY="sk-..." \
rivetdev/sandbox-agent:0.4.1-rc.1-full \
server --no-token --host 0.0.0.0 --port 2468
sandbox-agent --token "$SANDBOX_TOKEN" api agents list \
--endpoint http://your-server:2468
```
</Tab>
</Tabs>
</Accordion>
<Accordion title="CORS">
If you're calling the server from a browser, see the [CORS configuration guide](/cors).
</Accordion>
</AccordionGroup>
</Step>
<Step title="Create a session and send a prompt">
<CodeGroup>
<Step title="Install agents (optional)">
To preinstall agents:
```typescript Claude
const session = await client.createSession({
agent: "claude",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
```typescript Codex
const session = await client.createSession({
agent: "codex",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
```typescript OpenCode
const session = await client.createSession({
agent: "opencode",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
```typescript Cursor
const session = await client.createSession({
agent: "cursor",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
```typescript Amp
const session = await client.createSession({
agent: "amp",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
```typescript Pi
const session = await client.createSession({
agent: "pi",
});
session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
</CodeGroup>
See [Agent Sessions](/agent-sessions) for the full sessions API.
</Step>
<Step title="Clean up">
```typescript
await client.destroySandbox(); // provider-defined cleanup and disconnect
```bash
sandbox-agent install-agent --all
```
Use `client.dispose()` instead to disconnect without changing sandbox state. On E2B, `client.pauseSandbox()` pauses the sandbox and `client.killSandbox()` deletes it permanently.
If agents are not installed up front, they are lazily installed when creating a session.
</Step>
<Step title="Inspect with the UI">
Open the Inspector at `/ui/` on your server (e.g. `http://localhost:2468/ui/`) to view sessions and events in a GUI.
<Step title="Install desktop dependencies (optional, Linux only)">
If you want to use `/v1/desktop/*`, install the desktop runtime packages first:
```bash
sandbox-agent install desktop --yes
```
Then use `GET /v1/desktop/status` or `sdk.getDesktopStatus()` to verify the runtime is ready before calling desktop screenshot or input APIs.
</Step>
<Step title="Create a session">
```typescript
import { SandboxAgent } from "sandbox-agent";
const sdk = await SandboxAgent.connect({
baseUrl: "http://127.0.0.1:2468",
});
const session = await sdk.createSession({
agent: "claude",
sessionInit: {
cwd: "/",
mcpServers: [],
},
});
console.log(session.id);
```
</Step>
<Step title="Send a message">
```typescript
const result = await session.prompt([
{ type: "text", text: "Summarize the repository and suggest next steps." },
]);
console.log(result.stopReason);
```
</Step>
<Step title="Read events">
```typescript
const off = session.onEvent((event) => {
console.log(event.sender, event.payload);
});
const page = await sdk.getEvents({
sessionId: session.id,
limit: 50,
});
console.log(page.items.length);
off();
```
</Step>
<Step title="Test with Inspector">
Open the Inspector UI at `/ui/` on your server (for example, `http://localhost:2468/ui/`) to inspect sessions and events in a GUI.
<Frame>
<img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
@ -372,44 +291,16 @@ icon: "rocket"
</Step>
</Steps>
## Full example
```typescript
import { SandboxAgent } from "sandbox-agent";
import { e2b } from "sandbox-agent/e2b";
const client = await SandboxAgent.start({
sandbox: e2b({
create: {
envs: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY },
},
}),
});
try {
const session = await client.createSession({ agent: "claude" });
session.onEvent((event) => {
console.log(`[${event.sender}]`, JSON.stringify(event.payload));
});
const result = await session.prompt([
{ type: "text", text: "Write a function that checks if a number is prime." },
]);
console.log("Done:", result.stopReason);
} finally {
await client.destroySandbox();
}
```
## Next steps
<CardGroup cols={2}>
<Card title="SDK Overview" icon="compass" href="/sdk-overview">
Full TypeScript SDK API surface.
<CardGroup cols={3}>
<Card title="Session Persistence" icon="database" href="/session-persistence">
Configure in-memory, Rivet Actor state, IndexedDB, SQLite, and Postgres persistence.
</Card>
<Card title="Deploy to a Sandbox" icon="box" href="/deploy/local">
Deploy to E2B, Daytona, Docker, Vercel, or Cloudflare.
Deploy your agent to E2B, Daytona, Docker, Vercel, or Cloudflare.
</Card>
<Card title="SDK Overview" icon="compass" href="/sdk-overview">
Use the latest TypeScript SDK API.
</Card>
</CardGroup>