sandbox-agent/CLAUDE.md
Nathan Flurry c91791f88d
feat: add configuration for model, mode, and thought level (#205)
* feat: add configuration for model, mode, and thought level

* docs: document Claude effort-level filesystem config

* fix: prevent panic on empty modes/thoughtLevels in parse_agent_config

Use `.first()` with safe fallback instead of direct `[0]` index access,
which would panic if the Vec is empty and no default is set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden session lifecycle and align cli.mdx example with claude.json

- destroySession: wrap session/cancel RPC in try/catch so local cleanup
  always succeeds even when the agent is unreachable
- createSession/resumeOrCreateSession: clean up the remote session if
  post-creation config calls (setMode/setModel/setThoughtLevel) fail,
  preventing leaked orphan sessions
- cli.mdx: fix example output to match current claude.json (model name,
  model order, and populated modes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden session lifecycle and align config persistence logic

- resumeOrCreateSession: Remove destroy-on-error for the resume path. Config
  errors now propagate without destroying a pre-existing session. The destroy
  pattern remains in createSession (where the session is newly created and has
  no prior state to preserve).

- setSessionMode fallback: When session/set_mode returns -32601 and the
  fallback uses session/set_config_option, now keep modes.currentModeId
  in sync with the updated currentValue. Prevents stale cached state in
  getModes() when the fallback path is used.

- persistSessionStateFromMethod: Re-read the record from persistence instead
  of using a stale pre-await snapshot. Prevents race conditions where
  concurrent session/update events (processed by persistSessionStateFromEvent)
  are silently overwritten by optimistic updates.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: correct doc examples with valid Codex modes and update stable API list

- Replace invalid Codex mode values ("plan", "build") with valid ones
  ("auto", "full-access") in agent-sessions.mdx and sdk-overview.mdx
- Update CLAUDE.md stable method enumerations to include new session
  config methods (setSessionMode, setSessionModel, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add OpenAPI annotations for process endpoints and fix config persistence race

Add summary/description to all process management endpoint specs and the
not_found error type. Fix hydrateSessionConfigOptions to re-read from
persistence after the network call, and sync mode-category configOptions
on session/update current_mode_update events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 00:24:32 -08:00

130 lines
7.4 KiB
Markdown

# Instructions
## ACP v1 Baseline
- v1 is ACP-native.
- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`).
- `/opencode/*` is disabled during ACP core phases and returns `503`.
- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`:
- `POST /v1/rpc`
- `GET /v1/rpc` (SSE)
- `DELETE /v1/rpc`
- Control-plane endpoints:
- `GET /v1/health`
- `GET /v1/agents`
- `POST /v1/agents/{agent}/install`
- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods):
- `GET /v1/fs/file`
- `PUT /v1/fs/file`
- `POST /v1/fs/upload-batch`
- Sandbox Agent ACP extension method naming:
- Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`).
- Session detach method is `_sandboxagent/session/detach`.
## API Scope
- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent.
- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP.
- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities).
- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP.
- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP:
- These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior.
- They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream.
- This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`.
- ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations.
## Naming and Ownership
- This repository/product is **Sandbox Agent**.
- **Gigacode** is a separate user-facing UI/client, not the server product name.
- Gigacode integrates with Sandbox Agent via the OpenCode-compatible surface (`/opencode/*`) when that compatibility layer is enabled.
- Canonical extension namespace/domain string is `sandboxagent.dev` (no hyphen).
- Canonical custom ACP extension method prefix is `_sandboxagent/...` (no hyphen).
## Architecture (Brief)
- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs`
- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs`
- Agent/native + ACP agent process install and lazy install: `server/packages/agent-management/`
- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/`
## TypeScript SDK Architecture
- TypeScript clients are split into:
- `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers.
- `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers.
- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`.
- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `sendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, and `getSessionModes`.
- `Session` helpers are `prompt(...)`, `send(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, and `getModes()`.
- Cleanup is `sdk.dispose()`.
### Docs Source Of Truth
- For TypeScript docs/examples, source of truth is implementation in:
- `sdks/typescript/src/client.ts`
- `sdks/typescript/src/index.ts`
- `sdks/acp-http-client/src/index.ts`
- Do not document TypeScript APIs unless they are exported and implemented in those files.
- For HTTP/CLI docs/examples, source of truth is:
- `server/packages/sandbox-agent/src/router.rs`
- `server/packages/sandbox-agent/src/cli.rs`
- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions` APIs).
## Source Documents
- `~/misc/acp-docs/schema/schema.json`
- `~/misc/acp-docs/schema/meta.json`
- `research/acp/spec.md`
- `research/acp/v1-schema-to-acp-mapping.md`
- `research/acp/friction.md`
- `research/acp/todo.md`
## Change Tracking
- Keep CLI subcommands and HTTP endpoints in sync.
- Update `docs/cli.mdx` when CLI behavior changes.
- Regenerate `docs/openapi.json` when HTTP contracts change.
- Keep `docs/inspector.mdx` and `docs/sdks/typescript.mdx` aligned with implementation.
- Append blockers/decisions to `research/acp/friction.md` during ACP work.
- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.
- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior.
- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests.
## Docker Examples (Dev Testing)
- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples.
- This triggers `examples/shared/Dockerfile.dev` which builds the server binary from local source and packages it into the Docker image.
- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`
## Install Version References
- Channel policy:
- Sandbox Agent install/version references use a pinned minor channel `0.N.x` (for curl URLs and `sandbox-agent` / `@sandbox-agent/cli` npm/bun installs).
- Gigacode install/version references use `latest` (for `@sandbox-agent/gigacode` install/run commands and `gigacode-install.*` release promotion).
- Release promotion policy: `latest` releases must still update `latest`; when a release is `latest`, Sandbox Agent must also be promoted to the matching minor channel `0.N.x`.
- Keep every install-version reference below in sync whenever versions/channels change:
- `README.md`
- `docs/acp-http-client.mdx`
- `docs/cli.mdx`
- `docs/quickstart.mdx`
- `docs/sdk-overview.mdx`
- `docs/session-persistence.mdx`
- `docs/deploy/local.mdx`
- `docs/deploy/cloudflare.mdx`
- `docs/deploy/vercel.mdx`
- `docs/deploy/daytona.mdx`
- `docs/deploy/e2b.mdx`
- `docs/deploy/docker.mdx`
- `frontend/packages/website/src/components/GetStarted.tsx`
- `.claude/commands/post-release-testing.md`
- `examples/cloudflare/Dockerfile`
- `examples/daytona/src/index.ts`
- `examples/daytona/src/daytona-with-snapshot.ts`
- `examples/docker/src/index.ts`
- `examples/e2b/src/index.ts`
- `examples/vercel/src/index.ts`
- `scripts/release/main.ts`
- `scripts/release/promote-artifacts.ts`
- `scripts/release/sdk.ts`
- `scripts/sandbox-testing/test-sandbox.ts`