mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-15 07:04:48 +00:00
feat: add Docker Sandbox deployment support

Add example and documentation for deploying sandbox-agent inside Docker Sandbox microVMs for enhanced isolation on macOS/Windows.

- Add examples/docker-sandbox/ with TypeScript example
- Add docs/deploy/docker-sandbox.mdx with setup guide using custom templates
- Update docs navigation to include Docker Sandbox

parent cc5a9e0d73
commit 1b2e65ec7f

14 changed files with 952 additions and 2 deletions
222
docs/deploy/docker-sandbox.mdx
Normal file
|
|
@@ -0,0 +1,222 @@
|
|||
---
|
||||
title: "Docker Sandbox"
|
||||
description: "Run agents inside Docker Sandbox microVMs for enhanced isolation."
|
||||
---
|
||||
|
||||
<Warning>
|
||||
As of February 2026, Docker Sandbox microVMs are only available on macOS and Windows with Docker Desktop 4.58+. Linux users should use [regular Docker containers](/deploy/docker) or other sandbox providers. See [Docker Sandboxes Architecture](https://docs.docker.com/ai/sandboxes/architecture/) for details.
|
||||
</Warning>
|
||||
|
||||
## Overview
|
||||
|
||||
[Docker Sandboxes](https://docs.docker.com/ai/sandboxes/) provide hypervisor-level isolation using lightweight microVMs. Each sandbox gets its own kernel, a private Docker daemon, and an isolated network, which gives much stronger isolation than standard containers that share the host kernel.
|
||||
|
||||
| | Docker Container | Docker Sandbox |
|--|------------------|----------------|
| **Security** | Shared kernel (namespaces) | Separate kernel (microVM) |
| **Untrusted code** | Not safe | Safe |
| **Port exposure** | Supported (`-p 3000:3000`) | Not supported |
| **Network access** | Direct HTTP | Via `docker sandbox exec` only |
| **Volumes** | Direct mount | Bidirectional file sync |
| **Platform** | Linux, macOS, Windows | macOS, Windows only |
|
||||
|
||||
### Why sandbox-agent inside Docker Sandbox?
|
||||
|
||||
Running sandbox-agent inside a Docker Sandbox provides:
|
||||
|
||||
- **Multiple sessions**: Manage multiple concurrent agent sessions within a single sandbox
|
||||
- **Any coding agent**: Run any supported agent (Claude, Codex, OpenCode, Amp) through a unified API
|
||||
- **Session persistence**: Sessions persist across interactions without recreating the sandbox
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Docker Desktop 4.58 or later
|
||||
- macOS or Windows (experimental on Windows)
|
||||
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` for the coding agents
|
||||
|
||||
## Setup
|
||||
|
||||
<Steps>
|
||||
<Step title="Create Dockerfile">
|
||||
Create a custom template with sandbox-agent as the main process:
|
||||
|
||||
```dockerfile
|
||||
# Base template includes Ubuntu, Git, Docker CLI, Node.js, Python, Go
|
||||
FROM docker/sandbox-templates:claude-code
|
||||
|
||||
USER root
|
||||
|
||||
# Install sandbox-agent
|
||||
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
|
||||
|
||||
# Pre-install agents
|
||||
RUN sandbox-agent install-agent claude
|
||||
RUN sandbox-agent install-agent codex
|
||||
|
||||
# Create wrapper: Docker Sandbox runs `claude` but we redirect to sandbox-agent
|
||||
RUN mv /home/agent/.local/bin/claude /home/agent/.local/bin/claude-real \
|
||||
&& printf '#!/bin/bash\nexec sandbox-agent server --no-token --host 0.0.0.0\n' \
|
||||
> /home/agent/.local/bin/claude \
|
||||
&& chmod +x /home/agent/.local/bin/claude
|
||||
|
||||
USER agent
|
||||
```
|
||||
|
||||
<Note>
|
||||
Docker Sandbox requires an agent argument (`claude`, `codex`, etc.) and runs that agent as the main process. The wrapper script intercepts the `claude` command and runs sandbox-agent server instead.
|
||||
</Note>
|
||||
</Step>
|
||||
|
||||
<Step title="Build the template">
|
||||
```bash
|
||||
docker build -t sandbox-agent-template:latest .
|
||||
```
|
||||
</Step>
|
||||
|
||||
<Step title="Run sandbox">
|
||||
```bash
|
||||
docker sandbox run \
|
||||
--load-local-template \
|
||||
-t sandbox-agent-template:latest \
|
||||
--name my-sandbox \
|
||||
claude ~/my-project
|
||||
```
|
||||
|
||||
The sandbox-agent server starts automatically as the main process.
|
||||
</Step>
|
||||
|
||||
<Step title="Interact with the server">
|
||||
In another terminal, use `docker sandbox exec` to interact:
|
||||
|
||||
```bash
|
||||
# Create a session
|
||||
docker sandbox exec my-sandbox \
|
||||
sandbox-agent api sessions create my-session --agent claude
|
||||
|
||||
# Send a message (pass API key via -e)
|
||||
docker sandbox exec -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" my-sandbox \
|
||||
sandbox-agent api sessions send-message my-session \
|
||||
--message "Summarize this repository"
|
||||
|
||||
# Stream events
|
||||
docker sandbox exec my-sandbox \
|
||||
sandbox-agent api sessions events my-session
|
||||
```
|
||||
</Step>
|
||||
|
||||
<Step title="Clean up">
|
||||
```bash
|
||||
docker sandbox stop my-sandbox
|
||||
docker sandbox rm my-sandbox
|
||||
```
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## TypeScript Example
|
||||
|
||||
```typescript
|
||||
import { execSync, spawn, spawnSync } from "node:child_process";
|
||||
|
||||
const SANDBOX_NAME = "my-agent-sandbox";
|
||||
const TEMPLATE = "sandbox-agent-template:latest";
|
||||
|
||||
function exec(cmd: string): string {
|
||||
return execSync(cmd, { encoding: "utf-8" }).trim();
|
||||
}
|
||||
|
||||
function sandboxExec(cmd: string, env?: Record<string, string>): string {
|
||||
const envFlags = env
|
||||
? Object.entries(env).flatMap(([k, v]) => ["-e", `${k}=${v}`])
|
||||
: [];
|
||||
const args = ["sandbox", "exec", ...envFlags, SANDBOX_NAME, "sh", "-c", cmd];
|
||||
const result = spawnSync("docker", args, { encoding: "utf-8" });
|
||||
return result.stdout?.trim() ?? "";
|
||||
}
|
||||
|
||||
// Run sandbox with custom template (sandbox-agent starts as main process)
|
||||
spawn("docker", [
|
||||
"sandbox", "run",
|
||||
"--load-local-template", "-t", TEMPLATE,
|
||||
"--name", SANDBOX_NAME,
|
||||
"claude", process.cwd()
|
||||
], { detached: true, stdio: "ignore" }).unref();
|
||||
|
||||
// Wait for server to start
|
||||
await new Promise((r) => setTimeout(r, 5000));
|
||||
|
||||
// Create a session
|
||||
sandboxExec("sandbox-agent api sessions create my-session --agent claude");
|
||||
|
||||
// Send a message (pass API key via exec -e flag)
|
||||
sandboxExec(
|
||||
"sandbox-agent api sessions send-message my-session --message 'Hello'",
|
||||
{ ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! }
|
||||
);
|
||||
|
||||
// Get events
|
||||
const output = sandboxExec("sandbox-agent api sessions events my-session");
|
||||
console.log(output);
|
||||
|
||||
// Cleanup
|
||||
exec(`docker sandbox stop ${SANDBOX_NAME}`);
|
||||
exec(`docker sandbox rm ${SANDBOX_NAME}`);
|
||||
```
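
The fixed five-second wait in the example can be too short on a cold start and needlessly long on a warm one. A minimal sketch of a polling alternative follows. `waitFor` and its parameters are hypothetical names introduced here, not part of sandbox-agent or the Docker CLI; the readiness check is assumed to be any function that returns `true` once the server responds (for example, one that wraps `sandboxExec("sandbox-agent api sessions list")`).

```typescript
// Hypothetical helper: poll a readiness check until it passes or a deadline expires.
async function waitFor(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 30_000,
  intervalMs = 500,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      if (await check()) return;
    } catch {
      // A throwing check is treated as "not ready yet".
    }
    if (Date.now() >= deadline) {
      throw new Error(`Not ready after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In the example above, a call such as `await waitFor(() => sandboxExec("sandbox-agent api sessions list").length > 0)` could stand in for the `setTimeout` line; the exact readiness condition depends on what the server prints and is left as an assumption.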
|
||||
|
||||
## Sandbox Commands Reference
|
||||
|
||||
```bash
|
||||
# List all sandboxes
|
||||
docker sandbox ls
|
||||
|
||||
# Create a sandbox with custom template
|
||||
docker sandbox create --load-local-template -t <image> --name <name> AGENT WORKSPACE
|
||||
|
||||
# Run agent interactively (creates sandbox if needed)
|
||||
docker sandbox run claude ~/my-project
|
||||
|
||||
# Execute command in sandbox
|
||||
docker sandbox exec <name> <command>
|
||||
|
||||
# Execute with environment variables
|
||||
docker sandbox exec -e VAR=value <name> <command>
|
||||
|
||||
# Run command in background
|
||||
docker sandbox exec -d <name> <command>
|
||||
|
||||
# Interactive shell
|
||||
docker sandbox exec -it <name> bash
|
||||
|
||||
# Save sandbox as template
|
||||
docker sandbox save <name> my-template:v1
|
||||
|
||||
# Stop sandbox (without removing)
|
||||
docker sandbox stop <name>
|
||||
|
||||
# Remove sandbox
|
||||
docker sandbox rm <name>
|
||||
```
|
||||
|
||||
### Create Options
|
||||
|
||||
| Flag | Description |
|------|-------------|
| `--name` | Custom name for the sandbox |
| `-t, --template <image>` | Use custom Docker image as base |
| `--load-local-template` | Use image from local Docker daemon |
| `-q, --quiet` | Suppress verbose output |
|
||||
|
||||
### Exec Options
|
||||
|
||||
| Flag | Description |
|------|-------------|
| `-e VAR=value` | Set environment variable for this command |
| `-d, --detach` | Run command in background |
| `-i, --interactive` | Keep STDIN open |
| `-t, --tty` | Allocate a pseudo-TTY |
|
||||
|
||||
## Further Reading
|
||||
|
||||
- [Docker Sandboxes Documentation](https://docs.docker.com/ai/sandboxes/)
|
||||
- [Docker Sandboxes Templates](https://docs.docker.com/ai/sandboxes/templates/)
|
||||
- [Docker Sandboxes Architecture](https://docs.docker.com/ai/sandboxes/architecture/)
|
||||
- [Regular Docker Deployment](/deploy/docker) (for HTTP server access)
|
||||
|
|
@@ -24,4 +24,7 @@ icon: "server"
|
|||
<Card title="Docker" icon="docker" href="/deploy/docker">
|
||||
Build and run in a container (development only).
|
||||
</Card>
|
||||
<Card title="Docker Sandbox" icon="shield" href="/deploy/docker-sandbox">
|
||||
Run in Docker Sandbox microVMs for enhanced isolation (macOS/Windows).
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
|
|
|||
|
|
@@ -52,7 +52,8 @@
|
|||
"deploy/daytona",
|
||||
"deploy/vercel",
|
||||
"deploy/cloudflare",
|
||||
-"deploy/docker"
+"deploy/docker",
+"deploy/docker-sandbox"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
|
|
|||
10
examples/docker-sandbox/Dockerfile
Normal file
|
|
@@ -0,0 +1,10 @@
|
|||
FROM --platform=linux/amd64 node:22-slim
|
||||
|
||||
RUN apt-get update && apt-get install -y curl git && rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Install sandbox-agent
|
||||
RUN curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/install.sh | sh
|
||||
|
||||
# Pre-install claude agent and add to PATH
|
||||
RUN sandbox-agent install-agent claude
|
||||
ENV PATH="/root/.local/share/sandbox-agent/bin:$PATH"
|
||||
37
examples/docker-sandbox/README.md
Normal file
|
|
@@ -0,0 +1,37 @@
|
|||
# Docker Sandbox Example
|
||||
|
||||
Runs sandbox-agent inside a Docker Sandbox microVM for enhanced isolation.
|
||||
|
||||
## Requirements
|
||||
|
||||
- Docker Desktop 4.58+ (macOS or Windows)
|
||||
- `ANTHROPIC_API_KEY` environment variable
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
pnpm start
|
||||
```
|
||||
|
||||
First run builds the image and creates the VM (slow). Subsequent runs reuse the VM (fast).
|
||||
|
||||
To clean up resources:
|
||||
```bash
|
||||
pnpm cleanup
|
||||
```
|
||||
|
||||
## What it does
|
||||
|
||||
1. Checks if VM exists, creates one if not (via sandboxd daemon API)
|
||||
2. Builds and loads the template image into the VM (one-time)
|
||||
3. Starts a container with sandbox-agent server (with proxy config for network access)
|
||||
4. Creates a Claude session and opens an interactive prompt loop
5. Streams and displays Claude's response to each message you enter
|
||||
|
||||
## Notes
|
||||
|
||||
- Docker Sandbox VMs have network isolation - outbound HTTPS goes through a proxy at `host.docker.internal:3128`
|
||||
- The container is configured with `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` environment variables
|
||||
- `NODE_TLS_REJECT_UNAUTHORIZED=0` disables TLS certificate verification in Node.js so that requests pass through the intercepting proxy (for testing only)
|
||||
- `ANTHROPIC_API_KEY` is baked into the container at creation time - run `pnpm cleanup` and restart if you change the key
|
||||
- Resources are kept hot between runs for faster iteration - use `pnpm cleanup` to remove
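
The proxy settings described in these notes can be expressed as `-e` flags for `docker run`. The sketch below shows one way to build them as an argument array rather than a shell string. `proxyEnv` and `toEnvFlags` are hypothetical helper names; the proxy URL and `NO_PROXY` value are the ones this example uses. `NODE_TLS_REJECT_UNAUTHORIZED` is deliberately left out, since it is a testing-only setting.

```typescript
// Hypothetical helper: proxy variables as used by this example.
function proxyEnv(proxyUrl = "http://host.docker.internal:3128"): Record<string, string> {
  return {
    HTTP_PROXY: proxyUrl,
    HTTPS_PROXY: proxyUrl,
    NO_PROXY: "localhost,127.0.0.1",
  };
}

// Turn an env map into ["-e", "KEY=value", ...] for docker run / docker exec.
function toEnvFlags(env: Record<string, string>): string[] {
  return Object.entries(env).flatMap(([key, value]) => ["-e", `${key}=${value}`]);
}
```

Passing the resulting array to `spawnSync("docker", [...])` avoids shell quoting problems when a value (such as an API key) contains special characters.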
|
||||
17
examples/docker-sandbox/package.json
Normal file
|
|
@@ -0,0 +1,17 @@
|
|||
{
|
||||
"name": "@sandbox-agent/example-docker-sandbox",
|
||||
"private": true,
|
||||
"type": "module",
|
||||
"scripts": {
|
||||
"start": "tsx src/docker-sandbox.ts",
|
||||
"cleanup": "tsx src/cleanup.ts",
|
||||
"typecheck": "tsc --noEmit"
|
||||
},
|
||||
"dependencies": {},
|
||||
"devDependencies": {
|
||||
"@types/node": "latest",
|
||||
"tsx": "latest",
|
||||
"typescript": "latest",
|
||||
"vitest": "^3.0.0"
|
||||
}
|
||||
}
|
||||
5
examples/docker-sandbox/src/cleanup.ts
Normal file
|
|
@@ -0,0 +1,5 @@
|
|||
import { cleanup } from "./utils.js";
|
||||
|
||||
console.log("Cleaning up Docker Sandbox resources...");
|
||||
cleanup();
|
||||
console.log("Done.");
|
||||
102
examples/docker-sandbox/src/docker-sandbox.ts
Normal file
|
|
@@ -0,0 +1,102 @@
|
|||
import * as os from "node:os";
|
||||
import { exec, vmExec, sleep, runPrompt, SANDBOXD_SOCK, VM_NAME } from "./utils.js";
|
||||
|
||||
// Global error handlers
|
||||
process.on("uncaughtException", (err) => {
|
||||
console.error("Error:", err.message);
|
||||
if (err.message.includes("docker.sock")) {
|
||||
console.error("Try: pnpm cleanup && pnpm start");
|
||||
}
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
// Check prerequisites
|
||||
try {
|
||||
exec("docker sandbox --help", { silent: true });
|
||||
} catch {
|
||||
console.error(
|
||||
"Docker Sandbox not available. Requires Docker Desktop 4.58+ on macOS/Windows.",
|
||||
);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
if (!process.env.ANTHROPIC_API_KEY) {
|
||||
console.error("ANTHROPIC_API_KEY environment variable is required");
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Check if VM already exists
|
||||
const vms = JSON.parse(
|
||||
exec(`curl -s --unix-socket "${SANDBOXD_SOCK}" http://localhost/vm`, { silent: true }),
|
||||
);
|
||||
const existingVm = vms.find((v: { vm_name: string }) => v.vm_name === VM_NAME);
|
||||
|
||||
let vmSock: string;
|
||||
if (existingVm) {
|
||||
console.log(`Using existing VM: ${existingVm.vm_id}`);
|
||||
vmSock = `${os.homedir()}/.docker/sandboxes/vm/${VM_NAME}/docker.sock`;
|
||||
} else {
|
||||
// Create VM
|
||||
console.log("Creating VM (one-time setup)...");
|
||||
const payload = JSON.stringify({
|
||||
agent_name: "sandbox-agent",
|
||||
workspace_dir: process.cwd(),
|
||||
});
|
||||
const vm = JSON.parse(
|
||||
exec(
|
||||
`curl -s -X POST --unix-socket "${SANDBOXD_SOCK}" http://localhost/vm -H "Content-Type: application/json" -d '${payload}'`,
|
||||
{ silent: true },
|
||||
),
|
||||
);
|
||||
if (!vm.vm_id) throw new Error(`Failed to create VM: ${JSON.stringify(vm)}`);
|
||||
vmSock =
|
||||
vm.vm_config?.socketPath ??
|
||||
`${os.homedir()}/.docker/sandboxes/vm/${VM_NAME}/docker.sock`;
|
||||
console.log(`VM created: ${vm.vm_id}`);
|
||||
|
||||
// Build and load image (only needed once per VM)
|
||||
console.log("Building image (one-time setup)...");
|
||||
exec(`docker build -t sandbox-agent-template:latest .`);
|
||||
console.log("Loading image into VM (one-time setup)...");
|
||||
exec(`docker save sandbox-agent-template:latest | docker --host "unix://${vmSock}" load`);
|
||||
}
|
||||
|
||||
// Check if container already exists
|
||||
const containerExists = exec(
|
||||
`docker --host "unix://${vmSock}" ps -a --filter name=^sandbox$ --format "{{.Status}}"`,
|
||||
{ silent: true },
|
||||
);
|
||||
|
||||
if (containerExists.includes("Up")) {
|
||||
console.log("Container already running");
|
||||
} else if (containerExists) {
|
||||
console.log("Starting existing container...");
|
||||
exec(`docker --host "unix://${vmSock}" start sandbox`, { silent: true });
|
||||
} else {
|
||||
console.log("Creating container...");
|
||||
// Note: Docker Sandbox requires proxy for outbound HTTPS
|
||||
exec(
|
||||
`docker --host "unix://${vmSock}" run -d --name sandbox ` +
|
||||
`-e HTTP_PROXY=http://host.docker.internal:3128 ` +
|
||||
`-e HTTPS_PROXY=http://host.docker.internal:3128 ` +
|
||||
`-e NO_PROXY=localhost,127.0.0.1 ` +
|
||||
`-e NODE_TLS_REJECT_UNAUTHORIZED=0 ` +
|
||||
`-e ANTHROPIC_API_KEY="${process.env.ANTHROPIC_API_KEY}" ` +
|
||||
`-v "${process.cwd()}:${process.cwd()}" -w "${process.cwd()}" ` +
|
||||
`sandbox-agent-template:latest sandbox-agent server --no-token --host 0.0.0.0`,
|
||||
{ silent: true },
|
||||
);
|
||||
}
|
||||
|
||||
// Wait for server
|
||||
console.log("Waiting for healthy...");
|
||||
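const start = Date.now();
let healthy = false;
while (Date.now() - start < 30000) {
  try {
    if (vmExec(vmSock, "sandbox-agent api sessions list").includes("sessions")) {
      healthy = true;
      break;
    }
  } catch {}
  await sleep(500);
}
// Fail loudly instead of entering the prompt loop against a server that never came up
if (!healthy) throw new Error("sandbox-agent server did not become healthy within 30s");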
|
||||
|
||||
// Interactive prompt loop
|
||||
await runPrompt(vmSock);
|
||||
57
examples/docker-sandbox/src/utils.ts
Normal file
|
|
@@ -0,0 +1,57 @@
|
|||
import { execSync, spawnSync } from "node:child_process";
|
||||
import * as os from "node:os";
|
||||
|
||||
export const SANDBOXD_SOCK = `${os.homedir()}/.docker/sandboxes/sandboxd.sock`;
|
||||
export const VM_NAME = "sandbox-agent-vm";
|
||||
|
||||
export const exec = (cmd: string, opts?: { silent?: boolean }) =>
|
||||
execSync(cmd, { encoding: "utf-8", stdio: opts?.silent ? "pipe" : "inherit" })?.trim() ?? "";
|
||||
|
||||
export const vmExec = (vmSock: string, cmd: string, env?: Record<string, string>) => {
|
||||
const envFlags = env ? Object.entries(env).flatMap(([k, v]) => ["-e", `${k}=${v}`]) : [];
|
||||
const r = spawnSync("docker", ["--host", `unix://${vmSock}`, "exec", ...envFlags, "sandbox", "sh", "-c", cmd], { encoding: "utf-8", stdio: "pipe" });
|
||||
if (r.error) throw r.error;
|
||||
return r.stdout?.trim() ?? "";
|
||||
};
|
||||
|
||||
export const cleanup = () => {
|
||||
try { exec(`curl -s -X DELETE --unix-socket "${SANDBOXD_SOCK}" http://localhost/vm/${VM_NAME}`); } catch {}
|
||||
};
|
||||
|
||||
export const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));
|
||||
|
||||
export const runPrompt = async (vmSock: string): Promise<void> => {
|
||||
const { createInterface } = await import("node:readline/promises");
|
||||
const { spawn } = await import("node:child_process");
|
||||
|
||||
const rl = createInterface({ input: process.stdin, output: process.stdout });
|
||||
|
||||
const sessionId = `session-${Date.now()}`;
|
||||
vmExec(vmSock, `sandbox-agent api sessions create ${sessionId} --agent claude`);
|
||||
console.log(`Session: ${sessionId}\nPress Ctrl+C to quit.\n`);
|
||||
|
||||
  const sendMessage = (input: string) => new Promise<void>((resolve) => {
    // Pass the session ID and message as positional parameters ($1, $2) so the
    // shell never interprets user input ($, backticks, quotes)
    const proc = spawn("docker", [
      "--host", `unix://${vmSock}`, "exec", "sandbox", "sh", "-c",
      'sandbox-agent api sessions send-message-stream "$1" --message "$2"',
      "sh", sessionId, input,
    ]);

    // Buffer partial lines: a chunk boundary can fall in the middle of an event
    let buffer = "";
    proc.stdout.on("data", (chunk: Buffer) => {
      buffer += chunk.toString();
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? "";
      for (const line of lines) {
        if (!line.startsWith("data: ")) continue;
        const evt = JSON.parse(line.slice(6));
        if (evt.type === "item.delta" && evt.data?.delta) {
          const isUserEcho = evt.data.item_id?.includes("user") || evt.data.native_item_id?.includes("user");
          if (!isUserEcho) process.stdout.write(evt.data.delta);
        }
      }
    });

    proc.on("close", () => { console.log(); resolve(); });
  });
|
||||
|
||||
for await (const input of rl) {
|
||||
if (input.trim()) await sendMessage(input);
|
||||
process.stdout.write("> ");
|
||||
}
|
||||
};
|
||||
64
examples/docker-sandbox/tests/docker-sandbox.test.ts
Normal file
|
|
@@ -0,0 +1,64 @@
|
|||
import { describe, it, expect } from "vitest";
|
||||
import { execSync } from "node:child_process";
|
||||
|
||||
const shouldRun = process.env.RUN_DOCKER_SANDBOX_EXAMPLES === "1";
|
||||
const timeoutMs = Number.parseInt(process.env.SANDBOX_TEST_TIMEOUT_MS || "", 10) || 300_000;
|
||||
|
||||
const testFn = shouldRun ? it : it.skip;
|
||||
|
||||
function execCapture(cmd: string): string {
|
||||
return execSync(cmd, { encoding: "utf-8", stdio: "pipe" }).toString().trim();
|
||||
}
|
||||
|
||||
function isDockerSandboxAvailable(): boolean {
|
||||
try {
|
||||
execCapture("docker sandbox --help");
|
||||
return true;
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
describe("docker-sandbox example", () => {
|
||||
testFn(
|
||||
"docker sandbox CLI is available",
|
||||
async () => {
|
||||
expect(isDockerSandboxAvailable()).toBe(true);
|
||||
},
|
||||
timeoutMs
|
||||
);
|
||||
|
||||
testFn(
|
||||
"can create and remove a sandbox",
|
||||
async () => {
|
||||
if (!isDockerSandboxAvailable()) {
|
||||
console.log("Skipping: Docker Sandbox not available");
|
||||
return;
|
||||
}
|
||||
|
||||
const sandboxName = `test-sandbox-${Date.now()}`;
|
||||
const workspaceDir = process.cwd();
|
||||
|
||||
try {
|
||||
// Create sandbox
|
||||
execCapture(`docker sandbox create --name ${sandboxName} claude "${workspaceDir}"`);
|
||||
|
||||
// Verify it exists
|
||||
const list = execCapture(`docker sandbox ls --format "{{.Name}}"`);
|
||||
expect(list.split("\n")).toContain(sandboxName);
|
||||
|
||||
// Execute a command inside
|
||||
const result = execCapture(`docker sandbox exec ${sandboxName} echo "hello"`);
|
||||
expect(result).toBe("hello");
|
||||
} finally {
|
||||
// Cleanup
|
||||
try {
|
||||
execCapture(`docker sandbox rm -f ${sandboxName}`);
|
||||
} catch {
|
||||
// Ignore cleanup errors
|
||||
}
|
||||
}
|
||||
},
|
||||
timeoutMs
|
||||
);
|
||||
});
|
||||
16
examples/docker-sandbox/tsconfig.json
Normal file
|
|
@@ -0,0 +1,16 @@
|
|||
{
|
||||
"compilerOptions": {
|
||||
"target": "ES2022",
|
||||
"lib": ["ES2022", "DOM"],
|
||||
"module": "ESNext",
|
||||
"moduleResolution": "Bundler",
|
||||
"allowImportingTsExtensions": true,
|
||||
"noEmit": true,
|
||||
"esModuleInterop": true,
|
||||
"strict": true,
|
||||
"skipLibCheck": true,
|
||||
"resolveJsonModule": true
|
||||
},
|
||||
"include": ["src/**/*"],
|
||||
"exclude": ["node_modules", "**/*.test.ts"]
|
||||
}
|
||||
13
package.json
|
|
@@ -12,5 +12,16 @@
|
|||
"devDependencies": {
|
||||
"turbo": "^2.4.0",
|
||||
"vitest": "^3.0.0"
|
||||
-}
+},
+"workspaces": [
+"frontend/packages/*",
+"sdks/*",
+"sdks/cli",
+"sdks/cli/platforms/*",
+"resources/agent-schemas",
+"resources/vercel-ai-sdk-schemas",
+"scripts/release",
+"scripts/sandbox-testing",
+"examples/*"
+]
|
||||
}
|
||||
|
|
|
|||
22
pnpm-lock.yaml
generated
|
|
@@ -102,6 +102,28 @@ importers:
|
|||
specifier: ^3.0.0
|
||||
version: 3.2.4(@types/debug@4.1.12)(@types/node@25.2.0)(jiti@1.21.7)(tsx@4.21.0)(yaml@2.8.2)
|
||||
|
||||
examples/docker-sandbox:
|
||||
dependencies:
|
||||
'@sandbox-agent/example-shared':
|
||||
specifier: workspace:*
|
||||
version: link:../shared
|
||||
sandbox-agent:
|
||||
specifier: workspace:*
|
||||
version: link:../../sdks/typescript
|
||||
devDependencies:
|
||||
'@types/node':
|
||||
specifier: latest
|
||||
version: 25.2.0
|
||||
tsx:
|
||||
specifier: latest
|
||||
version: 4.21.0
|
||||
typescript:
|
||||
specifier: latest
|
||||
version: 5.9.3
|
||||
vitest:
|
||||
specifier: ^3.0.0
|
||||
version: 3.2.4(@types/debug@4.1.12)(@types/node@25.2.0)(jiti@1.21.7)(tsx@4.21.0)(yaml@2.8.2)
|
||||
|
||||
examples/e2b:
|
||||
dependencies:
|
||||
'@e2b/code-interpreter':
|
||||
|
|
|
|||
383
research/docker-sandbox.md
Normal file
|
|
@@ -0,0 +1,383 @@
|
|||
# Docker Sandbox Research
|
||||
|
||||
Research on Docker Desktop's Sandbox feature and its internal APIs.
|
||||
|
||||
## Overview
|
||||
|
||||
Docker Sandboxes (Docker Desktop 4.58+) provide hypervisor-level isolation using lightweight microVMs. Each sandbox gets its own kernel, private Docker daemon, and isolated network.
|
||||
|
||||
- **Platforms:** macOS (virtualization.framework), Windows (Hyper-V, experimental). Linux uses legacy container-based sandboxes.
|
||||
- **Isolation:** MicroVM with separate kernel, not shared like containers
|
||||
- **Networking:** Each VM has a private network namespace, with no cross-sandbox traffic and no access to the host's localhost
|
||||
- **File access:** Bidirectional file sync for specified workspace directories
|
||||
|
||||
## Official CLI
|
||||
|
||||
### Core Commands
|
||||
|
||||
```bash
|
||||
# List sandboxes
|
||||
docker sandbox ls
|
||||
|
||||
# Create and run
|
||||
docker sandbox run claude ~/my-project
|
||||
|
||||
# Create without running
|
||||
docker sandbox create --name my-sandbox claude ~/my-project
|
||||
|
||||
# Execute command in sandbox
|
||||
docker sandbox exec my-sandbox <command>
|
||||
|
||||
# Execute with environment variables
|
||||
docker sandbox exec -e API_KEY="xxx" my-sandbox <command>
|
||||
|
||||
# Stop sandbox (preserves state)
|
||||
docker sandbox stop my-sandbox
|
||||
|
||||
# Remove sandbox
|
||||
docker sandbox rm my-sandbox
|
||||
|
||||
# Inspect sandbox details
|
||||
docker sandbox inspect my-sandbox
|
||||
|
||||
# Reset all sandboxes
|
||||
docker sandbox reset
|
||||
|
||||
# Show version
|
||||
docker sandbox version
|
||||
```
|
||||
|
||||
### Run Options
|
||||
|
||||
```bash
|
||||
docker sandbox run AGENT WORKSPACE [-- AGENT_ARGS...]
|
||||
|
||||
# Options:
|
||||
# --name Custom sandbox name (default: <agent>-<workdir>)
|
||||
# -t, --template Custom container image for sandbox
|
||||
# --load-local-template Load locally built template image
|
||||
# -- Pass additional arguments to agent
|
||||
```
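
When driving the CLI from a script, the run options above can be assembled as an argument array instead of a shell string. This is a minimal sketch; `SandboxRunOptions` and `buildRunArgs` are hypothetical names introduced here, and only the flags listed above are used.

```typescript
// Hypothetical option bag mirroring the flags listed above.
interface SandboxRunOptions {
  agent: string;               // e.g. "claude"
  workspace: string;           // host directory to sync
  name?: string;               // --name
  template?: string;           // -t, --template
  loadLocalTemplate?: boolean; // --load-local-template
  agentArgs?: string[];        // forwarded to the agent after "--"
}

// Build the argv for `docker sandbox run` (without the leading "docker").
function buildRunArgs(opts: SandboxRunOptions): string[] {
  const args = ["sandbox", "run"];
  if (opts.loadLocalTemplate) args.push("--load-local-template");
  if (opts.template) args.push("-t", opts.template);
  if (opts.name) args.push("--name", opts.name);
  args.push(opts.agent, opts.workspace);
  if (opts.agentArgs?.length) args.push("--", ...opts.agentArgs);
  return args;
}
```

The result can be passed to `spawn("docker", buildRunArgs({ ... }))` from `node:child_process`, which avoids quoting issues with workspace paths that contain spaces.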
|
||||
|
||||
### Template Management
|
||||
|
||||
```bash
|
||||
# Save current sandbox state as template
|
||||
docker sandbox save my-sandbox my-template:v1
|
||||
|
||||
# Run with custom template
|
||||
docker sandbox run --template my-template:v1 claude ~/project
|
||||
|
||||
# Run with local template (not pushed to registry)
|
||||
docker sandbox run --load-local-template -t my-local-template:v1 claude ~/project
|
||||
```
|
||||
|
||||
**Agent validation:** The CLI validates `agent_name` against a whitelist (claude, codex, gemini, etc.). This can be bypassed via the raw sandboxd API.
|
||||
|
||||
## Internal sandboxd API (Undocumented)
|
||||
|
||||
The `docker sandbox` CLI communicates with a daemon via Unix socket. This API is **undocumented and subject to change**.
|
||||
|
||||
### Socket Location
|
||||
|
||||
```
|
||||
~/.docker/sandboxes/sandboxd.sock
|
||||
```
|
||||
|
||||
### Endpoints
|
||||
|
||||
#### List VMs
|
||||
|
||||
```bash
|
||||
curl -s --unix-socket ~/.docker/sandboxes/sandboxd.sock http://localhost/vm
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
[
|
||||
{
|
||||
"vm_id": "uuid",
|
||||
"vm_name": "agent-vm",
|
||||
"agent_name": "claude",
|
||||
"workspace_dir": "/path/to/workspace"
|
||||
}
|
||||
]
|
||||
```
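
A small sketch of how a script might consume this response. The `SandboxVm` shape is taken from the sample above and nothing more is assumed about the undocumented API; `parseVmList` and `findVm` are hypothetical helpers.

```typescript
// Shape inferred from the sample response; only vm_id and vm_name are validated.
interface SandboxVm {
  vm_id: string;
  vm_name: string;
  agent_name?: string;
  workspace_dir?: string;
}

// Parse the raw JSON body and keep only entries that look like VMs.
function parseVmList(json: string): SandboxVm[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) throw new Error("Expected a JSON array of VMs");
  return data.filter(
    (v): v is SandboxVm =>
      typeof v === "object" &&
      v !== null &&
      typeof (v as SandboxVm).vm_id === "string" &&
      typeof (v as SandboxVm).vm_name === "string",
  );
}

// Look up a VM by name; returns undefined when it does not exist.
function findVm(vms: SandboxVm[], name: string): SandboxVm | undefined {
  return vms.find((v) => v.vm_name === name);
}
```

Validating the shape before use matters here because the API is undocumented and may change between Docker Desktop releases.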
|
||||
|
||||
#### Create VM
|
||||
|
||||
```bash
|
||||
curl -s -X POST --unix-socket ~/.docker/sandboxes/sandboxd.sock \
|
||||
http://localhost/vm \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"agent_name": "sandbox-agent", "workspace_dir": "/path/to/workspace"}'
|
||||
```
|
||||
|
||||
**Required fields:**
|
||||
- `agent_name` - Name of the agent (no whitelist validation at API level)
|
||||
- `workspace_dir` - Host directory to sync into VM
|
||||
|
||||
Response:
|
||||
```json
|
||||
{
|
||||
"vm_id": "uuid",
|
||||
"vm_name": "sandbox-agent-vm",
|
||||
"vm_config": {
|
||||
"socketPath": "/Users/x/.docker/sandboxes/vm/sandbox-agent-vm/docker.sock",
|
||||
"fileSharingDirectories": ["/path/to/workspace"],
|
||||
"stateDir": "/Users/x/.docker/sandboxes/vm/sandbox-agent-vm",
|
||||
"assetDir": "/Users/x/.container-platform"
|
||||
},
|
||||
"ca_cert_path": "/Users/x/.docker/sandboxes/vm/sandbox-agent-vm/proxy_cacerts/proxy-ca.crt",
|
||||
"ca_cert_data": "base64..."
|
||||
}
|
||||
```
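
The same request can be issued from Node without building a shell string, which sidesteps quoting problems when the workspace path contains quotes or spaces. The sketch below only constructs the `curl` arguments; `buildCreateVmCurlArgs` is a hypothetical helper, and the socket path is the one documented above.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Socket path as documented above; pass a different one if your install differs.
const SANDBOXD_SOCKET = path.join(os.homedir(), ".docker", "sandboxes", "sandboxd.sock");

// Build the curl argv for POST /vm. JSON.stringify handles escaping of the payload.
function buildCreateVmCurlArgs(
  agentName: string,
  workspaceDir: string,
  socket: string = SANDBOXD_SOCKET,
): string[] {
  const payload = JSON.stringify({ agent_name: agentName, workspace_dir: workspaceDir });
  return [
    "-s", "-X", "POST",
    "--unix-socket", socket,
    "http://localhost/vm",
    "-H", "Content-Type: application/json",
    "-d", payload,
  ];
}
```

A caller might run `spawnSync("curl", buildCreateVmCurlArgs("sandbox-agent", process.cwd()), { encoding: "utf-8" })` and then check the parsed response for `vm_id` before continuing.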
|
||||
|
||||
#### Delete VM
|
||||
|
||||
```bash
|
||||
curl -s -X DELETE --unix-socket ~/.docker/sandboxes/sandboxd.sock \
|
||||
http://localhost/vm/{vm_name}
|
||||
```
|
||||
|
||||
### VM Docker Socket
|
||||
|
||||
Each VM exposes its own Docker daemon at `vm_config.socketPath`. Use this to interact with containers inside the VM:
|
||||
|
||||
```bash
|
||||
SOCK="/Users/x/.docker/sandboxes/vm/sandbox-agent-vm/docker.sock"
|
||||
|
||||
# List containers in VM
|
||||
docker --host "unix://$SOCK" ps
|
||||
|
||||
# Load image into VM
|
||||
docker save my-image:latest | docker --host "unix://$SOCK" load
|
||||
|
||||
# Run container in VM
|
||||
docker --host "unix://$SOCK" run -d --name my-container my-image:latest
|
||||
|
||||
# Execute command in container
|
||||
docker --host "unix://$SOCK" exec my-container <command>
|
||||
```
|
||||
|
||||
### Why Use the Raw API?
|
||||
|
||||
The `docker sandbox` CLI validates agent names against a whitelist. The raw sandboxd API bypasses this validation, allowing custom agent names like `sandbox-agent`.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
~/.docker/sandboxes/
|
||||
├── sandboxd.sock # Main daemon socket
|
||||
├── vm/
|
||||
│ └── <sandbox-name>/
|
||||
│ ├── docker.sock # Per-VM Docker daemon socket
|
||||
│ ├── proxy-config.json # Network proxy configuration
|
||||
│ └── proxy_cacerts/
|
||||
│ └── proxy-ca.crt # MITM proxy CA certificate
|
||||
└── ...
|
||||
|
||||
~/.sandboxd/
|
||||
└── proxy-config.json # Default proxy config for new sandboxes
|
||||
```
|
||||
|
||||
## File Sharing
|
||||
|
||||
The `workspace_dir` parameter sets up bidirectional file sync between host and VM:
|
||||
|
||||
1. Specify `workspace_dir` when creating VM
|
||||
2. sandboxd syncs that directory into the VM at the same absolute path
|
||||
3. Mount it into containers with `-v /path:/path`
|
||||
|
||||
Files modified in the VM are synced back to the host.
|
||||
|
||||
**Important:** This is file synchronization, not volume mounting. Files are copied between host and VM.
|
||||
|
||||
## Network Policies
|
||||
|
||||
### Proxy Architecture
|
||||
|
||||
Each sandbox includes an HTTP/HTTPS filtering proxy:
|
||||
- Runs on host, accessible at `host.docker.internal:3128` from inside sandbox
|
||||
- Enforces allow/deny policies on outbound HTTP/HTTPS traffic
|
||||
- Raw TCP/UDP connections are blocked
|
||||
|
||||
### Policy Configuration
|
||||
|
||||
```bash
|
||||
# View current policy
|
||||
docker sandbox network proxy my-sandbox
|
||||
|
||||
# Set allow policy (default) - allows all except blocked
|
||||
docker sandbox network proxy my-sandbox --policy allow
|
||||
|
||||
# Set deny policy - blocks all except allowed
|
||||
docker sandbox network proxy my-sandbox --policy deny
|
||||
|
||||
# Allow specific hosts
|
||||
docker sandbox network proxy my-sandbox --allow-host api.example.com
|
||||
docker sandbox network proxy my-sandbox --allow-host "*.github.com"
|
||||
|
||||
# Block specific hosts
|
||||
docker sandbox network proxy my-sandbox --block-host malicious.com
|
||||
|
||||
# Block CIDR ranges (these are blocked by default)
|
||||
docker sandbox network proxy my-sandbox \
|
||||
--block-cidr 10.0.0.0/8 \
|
||||
--block-cidr 172.16.0.0/12 \
|
||||
--block-cidr 192.168.0.0/16 \
|
||||
--block-cidr 127.0.0.0/8
|
||||
|
||||
# Bypass HTTPS inspection for specific hosts
|
||||
docker sandbox network proxy my-sandbox --bypass-host pinned-cert.example.com
|
||||
|
||||
# View blocked/allowed requests
|
||||
docker sandbox network log my-sandbox
|
||||
```
|
||||
|
||||
### Domain Matching Rules
|
||||
|
||||
- Exact match: `example.com` (does NOT match subdomains)
|
||||
- Port-specific: `example.com:443`
|
||||
- Wildcard: `*.example.com` (subdomains only)
|
||||
- Catch-all: `*` or `*:443`
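
The rules above can be sketched as a small matcher. This is an illustration of the rules as described in these notes, not Docker's implementation: `matchesRule` is a hypothetical function, it assumes a pattern without a port matches any port, it assumes a wildcard matches subdomains at any depth, and it does not handle IPv6 literals.

```typescript
// Split "host:port" into parts; a pattern without a port has port undefined.
function splitHostPort(value: string): { host: string; port?: string } {
  const idx = value.lastIndexOf(":");
  if (idx === -1) return { host: value.toLowerCase() };
  return { host: value.slice(0, idx).toLowerCase(), port: value.slice(idx + 1) };
}

// Return true when host:port satisfies a single rule pattern.
function matchesRule(pattern: string, host: string, port: number): boolean {
  const rule = splitHostPort(pattern);
  // A port-specific rule only applies to that port.
  if (rule.port !== undefined && rule.port !== String(port)) return false;
  const target = host.toLowerCase();
  // Catch-all: "*" or "*:443".
  if (rule.host === "*") return true;
  // Wildcard: "*.example.com" matches subdomains but not "example.com" itself.
  if (rule.host.startsWith("*.")) return target.endsWith(rule.host.slice(1));
  // Exact match: subdomains do not match.
  return target === rule.host;
}
```

Under these assumptions, allowing both a domain and its subdomains takes two rules, for example `example.com` and `*.example.com`.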
|
||||
|
||||
### Default Blocked CIDRs

- `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16` (RFC 1918)
- `127.0.0.0/8`, `169.254.0.0/16` (localhost, link-local)
- IPv6: `::1/128`, `fc00::/7`, `fe80::/10`

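To check whether a given IPv4 address falls inside one of the default blocked ranges, a sketch along these lines may help. The function names are hypothetical, only the IPv4 ranges from the list above are covered, and the proxy's real check may differ.

```typescript
// IPv4 ranges copied from the default blocked list above (IPv6 omitted).
const DEFAULT_BLOCKED_IPV4: string[] = [
  "10.0.0.0/8",
  "172.16.0.0/12",
  "192.168.0.0/16",
  "127.0.0.0/8",
  "169.254.0.0/16",
];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit number.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    throw new Error("invalid IPv4 address: " + ip);
  }
  return parts.reduce((acc, p) => acc * 256 + p, 0);
}

// True when `ip` lies inside the CIDR block `cidr` (for example "10.0.0.0/8").
function inCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  const base = cidr.slice(0, slash);
  const bits = Number(cidr.slice(slash + 1));
  const size = Math.pow(2, 32 - bits); // number of addresses in the block
  const start = Math.floor(ipv4ToInt(base) / size) * size;
  const value = ipv4ToInt(ip);
  return value >= start && value < start + size;
}

// Hypothetical helper: is this address in a default-blocked IPv4 range?
function isBlockedByDefault(ip: string): boolean {
  return DEFAULT_BLOCKED_IPV4.some((cidr) => inCidr(ip, cidr));
}
```

Note that `172.16.0.0/12` ends at `172.31.255.255`, so `172.32.0.1` is outside the default blocked set.
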
### HTTPS Interception

By default, the proxy performs MITM on HTTPS:
- Terminates TLS and re-encrypts with its own CA
- Allows policy enforcement and credential injection
- Sandbox container trusts proxy CA automatically

Use `--bypass-host` or `--bypass-cidr` for apps with certificate pinning.

### Configuration Files

Per-sandbox: `~/.docker/sandboxes/vm/<sandbox-name>/proxy-config.json`

```json
{
  "policy": "allow",
  "network": {
    "allowedDomains": ["api.example.com"],
    "blockedDomains": ["blocked.com"],
    "blockCIDR": ["10.0.0.0/8"]
  }
}
```

Default for new sandboxes: `~/.sandboxd/proxy-config.json`

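As a rough sketch, the config shown above can be modeled in TypeScript, together with the allow/deny semantics described under Policy Configuration (`allow` permits everything except blocked hosts, `deny` blocks everything except allowed hosts). The `ProxyConfig` type only mirrors the fields in the example; the real schema may contain more. `isHostAllowed` is a hypothetical helper that compares host names exactly and ignores wildcards, ports, and CIDR rules.

```typescript
// Mirrors only the fields visible in the proxy-config.json example.
// Assumption: the real file may carry additional or optional fields.
interface ProxyConfig {
  policy: "allow" | "deny";
  network: {
    allowedDomains: string[];
    blockedDomains: string[];
    blockCIDR: string[];
  };
}

// Hypothetical evaluation of the documented policy semantics.
// Simplification: exact, case-insensitive host comparison only.
function isHostAllowed(config: ProxyConfig, host: string): boolean {
  const h = host.toLowerCase();
  const inList = (list: string[]): boolean =>
    list.some((d) => d.toLowerCase() === h);
  if (config.policy === "allow") {
    // Allow policy: everything passes unless explicitly blocked.
    return !inList(config.network.blockedDomains);
  }
  // Deny policy: nothing passes unless explicitly allowed.
  return inList(config.network.allowedDomains);
}
```

With the example config, `blocked.com` is rejected and any other host passes; switching `policy` to `"deny"` would let only `api.example.com` through.
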
## Sandbox Templates

### Base Images

Official templates: `docker/sandbox-templates:<agent>`
- `docker/sandbox-templates:claude-code`
- Includes: Ubuntu base, Git, Docker CLI, Node.js, Python, Go

### Creating Custom Templates

```dockerfile
FROM docker/sandbox-templates:claude-code

USER root
# Install additional packages
RUN apt-get update && apt-get install -y postgresql-client redis-tools

# Install language-specific tools
RUN pip install pandas numpy

USER agent
```

Build and use:
```bash
docker build -t my-template:v1 .
docker sandbox run --load-local-template -t my-template:v1 claude ~/project
```

### Template Best Practices

- Always switch to `root` for system installs, back to `agent` at end
- Pin specific tool versions for reproducibility
- Don't use standard images like `python:3.11` as base (missing agent binaries)
- Use `docker sandbox save` to capture working sandbox state

## Inspect Output

```bash
docker sandbox inspect my-sandbox
```

Returns JSON with:
```json
[{
  "id": "abc123...",
  "name": "my-sandbox",
  "created_at": "2025-01-15T10:30:00Z",
  "status": "running",
  "template": "docker/sandbox-templates:claude-code",
  "labels": {
    "com.docker.sandbox.agent": "claude",
    "com.docker.sandbox.workingDirectory": "/Users/x/project",
    "com.docker.sandboxes.flavor": "microvm"
  }
}]
```

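For scripts that consume this output, a typed parser could look like the sketch below. The `SandboxInfo` interface is an assumption derived only from the sample above, so the actual output may include other fields; `parseInspect` and `agentOf` are hypothetical helper names.

```typescript
// Field names are taken from the sample inspect output above.
// Assumption: real output may include more fields than these.
interface SandboxInfo {
  id: string;
  name: string;
  created_at: string;
  status: string;
  template: string;
  labels: Record<string, string>;
}

// Parse the JSON text printed by `docker sandbox inspect`.
function parseInspect(json: string): SandboxInfo[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) {
    throw new Error("expected a JSON array");
  }
  return data as SandboxInfo[];
}

// Read the agent name from the sandbox labels, if present.
function agentOf(info: SandboxInfo): string | undefined {
  return info.labels["com.docker.sandbox.agent"];
}
```

The cast in `parseInspect` does no field validation; a stricter version would check each property before trusting it.
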
## Limitations

- **No port exposure:** Sandbox VMs don't support `-p` port mapping to host
- **No host localhost access:** Cannot reach services on host machine
- **No cross-sandbox networking:** VMs are completely isolated from each other
- **macOS/Windows only:** Linux requires legacy container-based sandboxes
- **HTTP/HTTPS only:** Raw TCP/UDP connections to external services are blocked
- **Agent whitelist:** CLI validates agent names; use raw API to bypass

## Known Issues & Feature Requests

From [docker/cli GitHub issues](https://github.com/docker/cli/issues):

- **#6766** - Support for opencode in `docker sandbox create`
- **#6734** - Add GitHub Copilot CLI support to `docker sandbox run`
- **#6731** - Support platform selection (`--platform linux/amd64`)

The agent whitelist is hardcoded in the CLI. Workarounds:
1. Use the raw sandboxd API (bypasses validation)
2. Use `--template` with a custom image (still requires valid agent name)

## Source Code

The `docker sandbox` plugin is **closed source** and distributed with Docker Desktop. The open-source [docker/cli](https://github.com/docker/cli) repo does not contain sandbox implementation.

Key observations from docker/cli:
- Sandbox is a plugin, not part of core CLI
- Uses `SandboxID` and `SandboxKey` in network settings (container isolation concept)
- No sandbox subcommand code in public repo

## References

### Official Documentation
- [Docker Sandboxes Overview](https://docs.docker.com/ai/sandboxes/)
- [Docker Sandboxes Architecture](https://docs.docker.com/ai/sandboxes/architecture/)
- [Docker Sandboxes Templates](https://docs.docker.com/ai/sandboxes/templates/)
- [Network Policies](https://docs.docker.com/ai/sandboxes/network-policies/)
- [CLI Reference](https://docs.docker.com/reference/cli/docker/sandbox/)

### Blog Posts & Tutorials
- [Docker Blog: A New Approach for Coding Agent Safety](https://www.docker.com/blog/docker-sandboxes-a-new-approach-for-coding-agent-safety/)
- [Docker Blog: Run Claude Code Safely](https://www.docker.com/blog/docker-sandboxes-run-claude-code-and-other-coding-agents-unsupervised-but-safely/)
- [Everything You Need to Know About Docker AI Sandboxes](https://blog.codeminer42.com/everything-you-need-to-know-about-docker-ai-sandboxes/)
- [Docker Sandboxes Tutorial and Cheatsheet](https://www.ajeetraina.com/docker-sandboxes-tutorial-and-cheatsheet/)

### Related Projects
- [microsandbox](https://github.com/microsandbox/microsandbox) - Self-hosted microVM sandboxes
- [arrakis](https://github.com/abshkbh/arrakis) - MicroVM sandbox with backtracking support
- [docker-sandbox-run-copilot](https://github.com/henrybravo/docker-sandbox-run-copilot) - Community Copilot template