From 3263d4f5e11951aa900857a2292210c61ab7b5ca Mon Sep 17 00:00:00 2001 From: Nathan Flurry Date: Sat, 14 Mar 2026 16:06:50 -0700 Subject: [PATCH] wip --- CLAUDE.md | 98 +---- foundry/CLAUDE.md | 6 + foundry/compose.dev.yaml | 4 + .../src/actors/project-branch-sync/index.ts | 2 +- .../backend/src/actors/project/actions.ts | 92 +++-- .../packages/backend/src/actors/task/index.ts | 50 ++- .../backend/src/actors/task/workbench.ts | 18 +- .../backend/src/actors/task/workflow/index.ts | 37 +- .../backend/src/actors/task/workflow/init.ts | 107 +---- .../backend/src/actors/task/workflow/queue.ts | 1 + .../backend/src/actors/workspace/actions.ts | 52 +-- foundry/packages/backend/src/driver.ts | 4 + .../backend/src/integrations/git/index.ts | 16 +- .../backend/test/helpers/test-driver.ts | 1 + .../backend/test/workbench-unread.test.ts | 29 +- .../research/specs/remove-local-git-clone.md | 381 ++++++++++++++++++ sdks/CLAUDE.md | 37 ++ server/CLAUDE.md | 71 +++- 18 files changed, 677 insertions(+), 329 deletions(-) create mode 100644 foundry/research/specs/remove-local-git-clone.md create mode 100644 sdks/CLAUDE.md diff --git a/CLAUDE.md b/CLAUDE.md index 26dfa28..f8771fb 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,40 +1,5 @@ # Instructions -## ACP v1 Baseline - -- v1 is ACP-native. -- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`). -- `/opencode/*` is disabled during ACP core phases and returns `503`. -- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`: - - `POST /v1/rpc` - - `GET /v1/rpc` (SSE) - - `DELETE /v1/rpc` -- Control-plane endpoints: - - `GET /v1/health` - - `GET /v1/agents` - - `POST /v1/agents/{agent}/install` -- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods): - - `GET /v1/fs/file` - - `PUT /v1/fs/file` - - `POST /v1/fs/upload-batch` -- Sandbox Agent ACP extension method naming: - - Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`). 
- - Session detach method is `_sandboxagent/session/detach`. - -## API Scope - -- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent. -- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP. -- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities). -- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP. - - Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent. - - ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior. -- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP: - - These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior. - - They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream. - - This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`. - - ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations. - ## Naming and Ownership - This repository/product is **Sandbox Agent**. @@ -49,66 +14,13 @@ - Never expose underlying protocol method names (e.g. `session/request_permission`, `session/create`, `_sandboxagent/session/detach`) in non-ACP docs. Describe the behavior in user-facing terms instead. - Do not describe the underlying protocol implementation in docs. Only document the SDK surface (methods, types, options). ACP protocol details belong exclusively in ACP-specific pages. 
-## Architecture (Brief) +### Docs Source Of Truth (HTTP/CLI) -- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs` -- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs` -- Agent/native + ACP agent process install and lazy install: `server/packages/agent-management/` -- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/` - -## TypeScript SDK Architecture - -- TypeScript clients are split into: - - `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers. - - `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers. -- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`. -- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`. -- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`. -- Cleanup is `sdk.dispose()`. - -### React Component Methodology - -- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector. -- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations. 
-- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks. -- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`. -- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`. - -### TypeScript SDK Naming Conventions - -- Use `respond(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent. -- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions. - -### Docs Source Of Truth - -- For TypeScript docs/examples, source of truth is implementation in: - - `sdks/typescript/src/client.ts` - - `sdks/typescript/src/index.ts` - - `sdks/acp-http-client/src/index.ts` -- Do not document TypeScript APIs unless they are exported and implemented in those files. - For HTTP/CLI docs/examples, source of truth is: - `server/packages/sandbox-agent/src/router.rs` - `server/packages/sandbox-agent/src/cli.rs` - Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions` APIs). -## ACP Protocol Compliance - -- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`. -- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`). -- Do not invent ACP properties or categories (e.g. 
`permission_mode` is not a valid ACP category — use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`). -- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it. -- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire. - -## Source Documents - -- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt` -- `~/misc/acp-docs/schema/schema.json` -- `~/misc/acp-docs/schema/meta.json` -- `research/acp/spec.md` -- `research/acp/v1-schema-to-acp-mapping.md` -- `research/acp/friction.md` -- `research/acp/todo.md` - ## Change Tracking - If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push. @@ -119,14 +31,6 @@ - Append blockers/decisions to `research/acp/friction.md` during ACP work. - `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`). - Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier. -- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior. 
-- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests. - -## Docker Examples (Dev Testing) - -- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples. -- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image. -- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start` ## Install Version References diff --git a/foundry/CLAUDE.md b/foundry/CLAUDE.md index aae89c3..7d2bb37 100644 --- a/foundry/CLAUDE.md +++ b/foundry/CLAUDE.md @@ -224,6 +224,12 @@ Examples: Never use `wait: true` for operations that depend on external readiness, sandbox I/O, agent responses, git network operations, polling loops, or long-running queue drains. Never hold an action open while waiting for an external system to become ready — that is a polling/retry loop in disguise. +### Timeout policy + +All `wait: true` sends must have an explicit `timeout`. Maximum timeout for any `wait: true` send is **10 seconds** (`10_000`). If an operation cannot reliably complete within 10 seconds, it must be restructured: write the initial record to the DB, return it to the caller, and continue the work asynchronously with `wait: false`. The client observes completion via push events. + +`wait: false` sends do not need a timeout (the enqueue is instant; the work runs in the workflow loop with its own step-level timeouts). + ### Task creation: resolve metadata before creating the actor When creating a task, all deterministic metadata (title, branch name) must be resolved synchronously in the parent actor (project) *before* the task actor is created. The task actor must never be created with null `branchName` or `title`. 
diff --git a/foundry/compose.dev.yaml b/foundry/compose.dev.yaml index 8dd9f97..d425654 100644 --- a/foundry/compose.dev.yaml +++ b/foundry/compose.dev.yaml @@ -14,6 +14,10 @@ services: HF_BACKEND_HOST: "0.0.0.0" HF_BACKEND_PORT: "7741" RIVETKIT_STORAGE_PATH: "/root/.local/share/foundry/rivetkit" + RIVET_LOG_ERROR_STACK: "${RIVET_LOG_ERROR_STACK:-1}" + RIVET_LOG_LEVEL: "${RIVET_LOG_LEVEL:-debug}" + RIVET_LOG_TIMESTAMP: "${RIVET_LOG_TIMESTAMP:-1}" + FOUNDRY_LOG_LEVEL: "${FOUNDRY_LOG_LEVEL:-debug}" # Pass through credentials needed for agent execution + PR creation in dev/e2e. # Do not hardcode secrets; set these in your environment when starting compose. ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY:-}" diff --git a/foundry/packages/backend/src/actors/project-branch-sync/index.ts b/foundry/packages/backend/src/actors/project-branch-sync/index.ts index 3b20941..1003822 100644 --- a/foundry/packages/backend/src/actors/project-branch-sync/index.ts +++ b/foundry/packages/backend/src/actors/project-branch-sync/index.ts @@ -156,7 +156,7 @@ export const projectBranchSync = actor({ async force(c): Promise { const self = selfProjectBranchSync(c); - await self.send(CONTROL.force, {}, { wait: true, timeout: 5 * 60_000 }); + await self.send(CONTROL.force, {}, { wait: true, timeout: 10_000 }); }, }, run: workflow(async (ctx) => { diff --git a/foundry/packages/backend/src/actors/project/actions.ts b/foundry/packages/backend/src/actors/project/actions.ts index 8f9090d..36355c6 100644 --- a/foundry/packages/backend/src/actors/project/actions.ts +++ b/foundry/packages/backend/src/actors/project/actions.ts @@ -11,7 +11,7 @@ import { resolveWorkspaceGithubAuth } from "../../services/github-auth.js"; import { expectQueueResponse } from "../../services/queue.js"; import { withRepoGitLock } from "../../services/repo-git-lock.js"; import { branches, taskIndex, repoActionJobs, repoMeta } from "./db/schema.js"; -import { deriveFallbackTitle } from "../../services/create-flow.js"; +import { 
deriveFallbackTitle, resolveCreateFlowDecision } from "../../services/create-flow.js"; import { normalizeBaseBranchName } from "../../integrations/git-spice/index.js"; import { sortBranchesForOverview } from "./stack-model.js"; @@ -416,37 +416,81 @@ async function hydrateTaskIndexMutation(c: any, _cmd?: HydrateTaskIndexCommand): } async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise { + const workspaceId = c.state.workspaceId; + const repoId = c.state.repoId; + const repoRemote = c.state.remoteUrl; const onBranch = cmd.onBranch?.trim() || null; - const initialBranchName = onBranch; - const initialTitle = onBranch ? deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined) : null; const taskId = randomUUID(); + let initialBranchName: string | null = null; + let initialTitle: string | null = null; if (onBranch) { + initialBranchName = onBranch; + initialTitle = deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined); + await registerTaskBranchMutation(c, { taskId, branchName: onBranch, requireExistingRemote: true, }); + } else { + const localPath = await ensureProjectReady(c); + const { driver } = getActorRuntimeContext(); + + // Read locally cached remote-tracking refs — no network fetch. + // The branch sync actor keeps these reasonably fresh. If a rare naming + // collision occurs with a very recently created remote branch, it will + // be caught lazily on push/checkout. 
+ const remoteBranches = (await driver.git.listLocalRemoteRefs(localPath)).map((branch: any) => branch.branchName); + + await ensureTaskIndexHydrated(c); + const reservedBranchRows = await c.db.select({ branchName: taskIndex.branchName }).from(taskIndex).where(isNotNull(taskIndex.branchName)).all(); + const reservedBranches = reservedBranchRows + .map((row: { branchName: string | null }) => row.branchName) + .filter((branchName): branchName is string => typeof branchName === "string" && branchName.length > 0); + + const resolved = resolveCreateFlowDecision({ + task: cmd.task, + explicitTitle: cmd.explicitTitle ?? undefined, + explicitBranchName: cmd.explicitBranchName ?? undefined, + localBranches: remoteBranches, + taskBranches: reservedBranches, + }); + + initialBranchName = resolved.branchName; + initialTitle = resolved.title; + + const now = Date.now(); + await c.db + .insert(taskIndex) + .values({ + taskId, + branchName: resolved.branchName, + createdAt: now, + updatedAt: now, + }) + .onConflictDoNothing() + .run(); } let task: Awaited>; try { - task = await getOrCreateTask(c, c.state.workspaceId, c.state.repoId, taskId, { - workspaceId: c.state.workspaceId, - repoId: c.state.repoId, + task = await getOrCreateTask(c, workspaceId, repoId, taskId, { + workspaceId, + repoId, taskId, - repoRemote: c.state.remoteUrl, + repoRemote, branchName: initialBranchName, title: initialTitle, task: cmd.task, providerId: cmd.providerId, agentType: cmd.agentType, - explicitTitle: onBranch ? null : cmd.explicitTitle, - explicitBranchName: onBranch ? 
null : cmd.explicitBranchName, + explicitTitle: null, + explicitBranchName: null, initialPrompt: cmd.initialPrompt, }); } catch (error) { - if (onBranch) { + if (initialBranchName) { await c.db .delete(taskIndex) .where(eq(taskIndex.taskId, taskId)) @@ -456,28 +500,14 @@ async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise( await self.send(projectWorkflowQueueName("project.command.ensure"), cmd, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }), ); }, @@ -929,7 +959,7 @@ export const projectActions = { return expectQueueResponse( await self.send(projectWorkflowQueueName("project.command.createTask"), cmd, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }), ); }, @@ -947,7 +977,7 @@ export const projectActions = { return expectQueueResponse<{ branchName: string; headSha: string }>( await self.send(projectWorkflowQueueName("project.command.registerTaskBranch"), cmd, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }), ); }, @@ -956,7 +986,7 @@ export const projectActions = { const self = selfProject(c); await self.send(projectWorkflowQueueName("project.command.hydrateTaskIndex"), cmd ?? 
{}, { wait: true, - timeout: 60_000, + timeout: 10_000, }); }, @@ -1225,7 +1255,7 @@ export const projectActions = { const self = selfProject(c); await self.send(projectWorkflowQueueName("project.command.applyBranchSyncResult"), body, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }); }, }; diff --git a/foundry/packages/backend/src/actors/task/index.ts b/foundry/packages/backend/src/actors/task/index.ts index cac007a..968171c 100644 --- a/foundry/packages/backend/src/actors/task/index.ts +++ b/foundry/packages/backend/src/actors/task/index.ts @@ -101,14 +101,15 @@ interface TaskWorkbenchSendMessageCommand { attachments: Array; } -interface TaskWorkbenchSendMessageActionInput extends TaskWorkbenchSendMessageInput { - waitForCompletion?: boolean; -} - interface TaskWorkbenchCreateSessionCommand { model?: string; } +interface TaskWorkbenchCreateSessionAndSendCommand { + model?: string; + text: string; +} + interface TaskWorkbenchSessionCommand { sessionId: string; } @@ -143,7 +144,7 @@ export const task = actor({ const self = selfTask(c); const result = await self.send(taskWorkflowQueueName("task.command.initialize"), cmd ?? {}, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }); return expectQueueResponse(result); }, @@ -160,7 +161,7 @@ export const task = actor({ const self = selfTask(c); const result = await self.send(taskWorkflowQueueName("task.command.attach"), cmd ?? {}, { wait: true, - timeout: 20_000, + timeout: 10_000, }); return expectQueueResponse<{ target: string; sessionId: string | null }>(result); }, @@ -172,7 +173,7 @@ export const task = actor({ {}, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); return expectQueueResponse<{ switchTarget: string }>(result); @@ -236,7 +237,7 @@ export const task = actor({ {}, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); }, @@ -263,12 +264,25 @@ export const task = actor({ { ...(input?.model ? 
{ model: input.model } : {}) } satisfies TaskWorkbenchCreateSessionCommand, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }, ); return expectQueueResponse<{ tabId: string }>(result); }, + /** + * Fire-and-forget: creates a workbench session and sends the initial message. + * Used by createWorkbenchTask so the caller doesn't block on session creation. + */ + async createWorkbenchSessionAndSend(c, input: { model?: string; text: string }): Promise { + const self = selfTask(c); + await self.send( + taskWorkflowQueueName("task.command.workbench.create_session_and_send"), + { model: input.model, text: input.text } satisfies TaskWorkbenchCreateSessionAndSendCommand, + { wait: false }, + ); + }, + async renameWorkbenchSession(c, input: TaskWorkbenchRenameSessionInput): Promise { const self = selfTask(c); await self.send( @@ -276,7 +290,7 @@ export const task = actor({ { sessionId: input.tabId, title: input.title } satisfies TaskWorkbenchSessionTitleCommand, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); }, @@ -288,7 +302,7 @@ export const task = actor({ { sessionId: input.tabId, unread: input.unread } satisfies TaskWorkbenchSessionUnreadCommand, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); }, @@ -304,7 +318,7 @@ export const task = actor({ } satisfies TaskWorkbenchUpdateDraftCommand, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); }, @@ -316,14 +330,14 @@ export const task = actor({ { sessionId: input.tabId, model: input.model } satisfies TaskWorkbenchChangeModelCommand, { wait: true, - timeout: 20_000, + timeout: 10_000, }, ); }, - async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageActionInput): Promise { + async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageInput): Promise { const self = selfTask(c); - const result = await self.send( + await self.send( taskWorkflowQueueName("task.command.workbench.send_message"), { sessionId: input.tabId, @@ -331,13 +345,9 @@ export const task = actor({ attachments: 
input.attachments,
         } satisfies TaskWorkbenchSendMessageCommand,
         {
-          wait: input.waitForCompletion === true,
-          ...(input.waitForCompletion === true ? { timeout: 10 * 60_000 } : {}),
+          wait: false,
         },
       );
-      if (input.waitForCompletion === true) {
-        expectQueueResponse(result);
-      }
     },

     async stopWorkbenchSession(c, input: TaskTabCommand): Promise {
diff --git a/foundry/packages/backend/src/actors/task/workbench.ts b/foundry/packages/backend/src/actors/task/workbench.ts
index 1da7f2f..9277152 100644
--- a/foundry/packages/backend/src/actors/task/workbench.ts
+++ b/foundry/packages/backend/src/actors/task/workbench.ts
@@ -307,22 +307,14 @@ async function requireReadySessionMeta(c: any, tabId: string): Promise {
   return meta;
 }

-async function ensureReadySessionMeta(c: any, tabId: string): Promise {
-  const meta = await readSessionMeta(c, tabId);
+export function requireSendableSessionMeta(meta: any, tabId: string): any {
   if (!meta) {
     throw new Error(`Unknown workbench tab: ${tabId}`);
   }
-
-  if (meta.status === "ready" && meta.sandboxSessionId) {
-    return meta;
+  if (meta.status !== "ready" || !meta.sandboxSessionId) {
+    throw new Error(`Session is not ready (status: ${meta.status}). Wait for session provisioning to complete.`);
   }
-
-  if (meta.status === "error") {
-    throw new Error(meta.errorMessage ?? "This workbench tab failed to prepare");
-  }
-
-  await ensureWorkbenchSession(c, tabId);
-  return await requireReadySessionMeta(c, tabId);
+  return meta;
 }

 function shellFragment(parts: string[]): string {
@@ -1204,7 +1196,7 @@ export async function changeWorkbenchModel(c: any, sessionId: string, model: str
 }

 export async function sendWorkbenchMessage(c: any, sessionId: string, text: string, attachments: Array): Promise {
-  const meta = await ensureReadySessionMeta(c, sessionId);
+  const meta = requireSendableSessionMeta(await readSessionMeta(c, sessionId), sessionId);
   const record = await ensureWorkbenchSeeded(c);
   const runtime = await getTaskSandboxRuntime(c, record);
   await ensureSandboxRepo(c, runtime.sandbox, record);
diff --git a/foundry/packages/backend/src/actors/task/workflow/index.ts b/foundry/packages/backend/src/actors/task/workflow/index.ts
index f9049a7..c14ab78 100644
--- a/foundry/packages/backend/src/actors/task/workflow/index.ts
+++ b/foundry/packages/backend/src/actors/task/workflow/index.ts
@@ -1,14 +1,7 @@
 import { Loop } from "rivetkit/workflow";
 import { logActorWarning, resolveErrorMessage } from "../../logging.js";
 import { getCurrentRecord } from "./common.js";
-import {
-  initAssertNameActivity,
-  initBootstrapDbActivity,
-  initCompleteActivity,
-  initEnqueueProvisionActivity,
-  initEnsureNameActivity,
-  initFailedActivity,
-} from "./init.js";
+import { initBootstrapDbActivity, initCompleteActivity, initEnqueueProvisionActivity, initFailedActivity } from "./init.js";
 import {
   handleArchiveActivity,
   handleAttachActivity,
@@ -67,12 +60,8 @@ const commandHandlers: Record = {
     await loopCtx.removed("init-failed", "step");
     await loopCtx.removed("init-failed-v2", "step");
     try {
-      await loopCtx.step({
-        name: "init-ensure-name",
-        timeout: 5 * 60_000,
-        run: async () => initEnsureNameActivity(loopCtx),
-      });
-      await loopCtx.step("init-assert-name", async () => initAssertNameActivity(loopCtx));
+      await loopCtx.removed("init-ensure-name", "step");
+      await loopCtx.removed("init-assert-name", "step");
       await loopCtx.removed("init-create-sandbox", "step");
       await loopCtx.removed("init-ensure-agent", "step");
       await loopCtx.removed("init-start-sandbox-instance", "step");
@@ -156,6 +145,26 @@ const commandHandlers: Record = {
     }
   },

+  "task.command.workbench.create_session_and_send": async (loopCtx, msg) => {
+    try {
+      const created = await loopCtx.step({
+        name: "workbench-create-session-for-send",
+        timeout: 5 * 60_000,
+        run: async () => createWorkbenchSession(loopCtx, msg.body?.model),
+      });
+      await loopCtx.step({
+        name: "workbench-send-initial-message",
+        timeout: 5 * 60_000,
+        run: async () => sendWorkbenchMessage(loopCtx, created.tabId, msg.body.text, []),
+      });
+    } catch (error) {
+      logActorWarning("task.workflow", "create_session_and_send failed", {
+        error: resolveErrorMessage(error),
+      });
+    }
+    await msg.complete({ ok: true });
+  },
+
   "task.command.workbench.ensure_session": async (loopCtx, msg) => {
     await loopCtx.step({
       name: "workbench-ensure-session",
diff --git a/foundry/packages/backend/src/actors/task/workflow/init.ts b/foundry/packages/backend/src/actors/task/workflow/init.ts
index ec0b699..9cfe3d3 100644
--- a/foundry/packages/backend/src/actors/task/workflow/init.ts
+++ b/foundry/packages/backend/src/actors/task/workflow/init.ts
@@ -1,10 +1,8 @@
 // @ts-nocheck
 import { eq } from "drizzle-orm";
-import { resolveCreateFlowDecision } from "../../../services/create-flow.js";
-import { resolveWorkspaceGithubAuth } from "../../../services/github-auth.js";
 import { getActorRuntimeContext } from "../../context.js";
-import { getOrCreateHistory, getOrCreateProject, selfTask } from "../../handles.js";
-import { logActorWarning, resolveErrorMessage } from "../../logging.js";
+import { getOrCreateHistory, selfTask } from "../../handles.js";
+import { resolveErrorMessage } from "../../logging.js";
 import { defaultSandboxProviderId } from "../../../sandbox-config.js";
 import { task as taskTable, taskRuntime } from "../db/schema.js";
 import { TASK_ROW_ID, appendHistory, collectErrorMessages, resolveErrorDetail, setTaskState } from "./common.js";
@@ -21,7 +19,6 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
   const { config } = getActorRuntimeContext();
   const providerId = body?.providerId ?? loopCtx.state.providerId ?? defaultSandboxProviderId(config);
   const now = Date.now();
-  const initialStatusMessage = loopCtx.state.branchName && loopCtx.state.title ? "provisioning" : "naming";

   await ensureTaskRuntimeCacheColumns(loopCtx.db);

@@ -60,7 +57,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
       activeSessionId: null,
       activeSwitchTarget: null,
       activeCwd: null,
-      statusMessage: initialStatusMessage,
+      statusMessage: "provisioning",
       gitStateJson: null,
       gitStateUpdatedAt: null,
       provisionStage: "queued",
@@ -74,7 +71,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
         activeSessionId: null,
         activeSwitchTarget: null,
         activeCwd: null,
-        statusMessage: initialStatusMessage,
+        statusMessage: "provisioning",
         provisionStage: "queued",
         provisionStageUpdatedAt: now,
         updatedAt: now,
@@ -111,102 +108,6 @@ export async function initEnqueueProvisionActivity(loopCtx: any, body: any): Pro
   }
 }

-export async function initEnsureNameActivity(loopCtx: any): Promise {
-  await setTaskState(loopCtx, "init_ensure_name", "determining title and branch");
-  const existing = await loopCtx.db
-    .select({
-      branchName: taskTable.branchName,
-      title: taskTable.title,
-    })
-    .from(taskTable)
-    .where(eq(taskTable.id, TASK_ROW_ID))
-    .get();
-
-  if (existing?.branchName && existing?.title) {
-    loopCtx.state.branchName = existing.branchName;
-    loopCtx.state.title = existing.title;
-    return;
-  }
-
-  const { driver } = getActorRuntimeContext();
-  const auth = await resolveWorkspaceGithubAuth(loopCtx, loopCtx.state.workspaceId);
-  let repoLocalPath = loopCtx.state.repoLocalPath;
-  if (!repoLocalPath) {
-    const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
-    const result = await project.ensure({ remoteUrl: loopCtx.state.repoRemote });
-    repoLocalPath = result.localPath;
-    loopCtx.state.repoLocalPath = repoLocalPath;
-  }
-
-  try {
-    await driver.git.fetch(repoLocalPath, { githubToken: auth?.githubToken ?? null });
-  } catch (error) {
-    logActorWarning("task.init", "fetch before naming failed", {
-      workspaceId: loopCtx.state.workspaceId,
-      repoId: loopCtx.state.repoId,
-      taskId: loopCtx.state.taskId,
-      error: resolveErrorMessage(error),
-    });
-  }
-
-  const remoteBranches = (await driver.git.listRemoteBranches(repoLocalPath, { githubToken: auth?.githubToken ?? null })).map(
-    (branch: any) => branch.branchName,
-  );
-  const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
-  const reservedBranches = await project.listReservedBranches({});
-  const resolved = resolveCreateFlowDecision({
-    task: loopCtx.state.task,
-    explicitTitle: loopCtx.state.explicitTitle ?? undefined,
-    explicitBranchName: loopCtx.state.explicitBranchName ?? undefined,
-    localBranches: remoteBranches,
-    taskBranches: reservedBranches,
-  });
-
-  const now = Date.now();
-  await loopCtx.db
-    .update(taskTable)
-    .set({
-      branchName: resolved.branchName,
-      title: resolved.title,
-      updatedAt: now,
-    })
-    .where(eq(taskTable.id, TASK_ROW_ID))
-    .run();
-
-  loopCtx.state.branchName = resolved.branchName;
-  loopCtx.state.title = resolved.title;
-  loopCtx.state.explicitTitle = null;
-  loopCtx.state.explicitBranchName = null;
-
-  await loopCtx.db
-    .update(taskRuntime)
-    .set({
-      statusMessage: "provisioning",
-      provisionStage: "repo_prepared",
-      provisionStageUpdatedAt: now,
-      updatedAt: now,
-    })
-    .where(eq(taskRuntime.id, TASK_ROW_ID))
-    .run();
-
-  await project.registerTaskBranch({
-    taskId: loopCtx.state.taskId,
-    branchName: resolved.branchName,
-  });
-
-  await appendHistory(loopCtx, "task.named", {
-    title: resolved.title,
-    branchName: resolved.branchName,
-  });
-}
-
-export async function initAssertNameActivity(loopCtx: any): Promise {
-  await setTaskState(loopCtx, "init_assert_name", "validating naming");
-  if (!loopCtx.state.branchName) {
-    throw new Error("task branchName is not initialized");
-  }
-}
-
 export async function initCompleteActivity(loopCtx: any, body: any): Promise {
   const now = Date.now();
   const { config } = getActorRuntimeContext();
diff --git a/foundry/packages/backend/src/actors/task/workflow/queue.ts b/foundry/packages/backend/src/actors/task/workflow/queue.ts
index 6210468..3e613e2 100644
--- a/foundry/packages/backend/src/actors/task/workflow/queue.ts
+++ b/foundry/packages/backend/src/actors/task/workflow/queue.ts
@@ -13,6 +13,7 @@ export const TASK_QUEUE_NAMES = [
   "task.command.workbench.rename_task",
   "task.command.workbench.rename_branch",
   "task.command.workbench.create_session",
+  "task.command.workbench.create_session_and_send",
   "task.command.workbench.ensure_session",
   "task.command.workbench.rename_session",
   "task.command.workbench.set_session_unread",
diff --git
a/foundry/packages/backend/src/actors/workspace/actions.ts b/foundry/packages/backend/src/actors/workspace/actions.ts index 8782a77..f4ee4db 100644 --- a/foundry/packages/backend/src/actors/workspace/actions.ts +++ b/foundry/packages/backend/src/actors/workspace/actions.ts @@ -1,5 +1,4 @@ // @ts-nocheck -import { setTimeout as delay } from "node:timers/promises"; import { desc, eq } from "drizzle-orm"; import { Loop } from "rivetkit/workflow"; import type { @@ -272,24 +271,6 @@ async function requireWorkbenchTask(c: any, taskId: string) { return getTask(c, c.state.workspaceId, repoId, taskId); } -async function waitForWorkbenchTaskReady(task: any, timeoutMs = 5 * 60_000): Promise { - const startedAt = Date.now(); - - for (;;) { - const record = await task.get(); - if (record?.branchName && record?.title) { - return record; - } - if (record?.status === "error") { - throw new Error("task initialization failed before the workbench session was ready"); - } - if (Date.now() - startedAt > timeoutMs) { - throw new Error("timed out waiting for task initialization"); - } - await delay(1_000); - } -} - /** * Reads the workspace sidebar snapshot from the workspace actor's local SQLite * plus the org-scoped GitHub actor for open PRs. Task actors still push @@ -562,7 +543,7 @@ export const workspaceActions = { return expectQueueResponse( await self.send(workspaceWorkflowQueueName("workspace.command.addRepo"), input, { wait: true, - timeout: 60_000, + timeout: 10_000, }), ); }, @@ -595,7 +576,7 @@ export const workspaceActions = { return expectQueueResponse( await self.send(workspaceWorkflowQueueName("workspace.command.createTask"), input, { wait: true, - timeout: 5 * 60_000, + timeout: 10_000, }), ); }, @@ -813,6 +794,7 @@ export const workspaceActions = { }, async createWorkbenchTask(c: any, input: TaskWorkbenchCreateTaskInput): Promise<{ taskId: string; tabId?: string }> { + // Step 1: Create the task record (wait: true — local state mutations only). 
const created = await workspaceActions.createTask(c, { workspaceId: c.state.workspaceId, repoId: input.repoId, @@ -821,26 +803,18 @@ export const workspaceActions = { ...(input.onBranch ? { onBranch: input.onBranch } : input.branch ? { explicitBranchName: input.branch } : {}), ...(input.model ? { agentType: agentTypeForModel(input.model) } : {}), }); + + // Step 2: Enqueue session creation + initial message (wait: false). + // The task workflow creates the session record and sends the message in + // the background. The client observes progress via push events on the + // task interest topic. const task = await requireWorkbenchTask(c, created.taskId); - await waitForWorkbenchTaskReady(task); - const session = await task.createWorkbenchSession({ - taskId: created.taskId, - ...(input.model ? { model: input.model } : {}), - }); - await task.sendWorkbenchMessage({ - taskId: created.taskId, - tabId: session.tabId, + await task.createWorkbenchSessionAndSend({ + model: input.model, text: input.task, - attachments: [], - waitForCompletion: true, }); - await task.getSessionDetail({ - sessionId: session.tabId, - }); - return { - taskId: created.taskId, - tabId: session.tabId, - }; + + return { taskId: created.taskId }; }, async markWorkbenchUnread(c: any, input: TaskWorkbenchSelectInput): Promise { @@ -988,7 +962,7 @@ export const workspaceActions = { const self = selfWorkspace(c); await self.send(workspaceWorkflowQueueName("workspace.command.refreshProviderProfiles"), command ?? 
{}, { wait: true, - timeout: 60_000, + timeout: 10_000, }); }, diff --git a/foundry/packages/backend/src/driver.ts b/foundry/packages/backend/src/driver.ts index e96fea8..7152592 100644 --- a/foundry/packages/backend/src/driver.ts +++ b/foundry/packages/backend/src/driver.ts @@ -5,6 +5,7 @@ import { ensureCloned, fetch, listRemoteBranches, + listLocalRemoteRefs, remoteDefaultBaseRef, revParse, ensureRemoteBranch, @@ -28,6 +29,8 @@ export interface GitDriver { ensureCloned(remoteUrl: string, targetPath: string, options?: { githubToken?: string | null }): Promise; fetch(repoPath: string, options?: { githubToken?: string | null }): Promise; listRemoteBranches(repoPath: string, options?: { githubToken?: string | null }): Promise; + /** Read remote-tracking refs from the local clone without fetching. */ + listLocalRemoteRefs(repoPath: string): Promise; remoteDefaultBaseRef(repoPath: string): Promise; revParse(repoPath: string, ref: string): Promise; ensureRemoteBranch(repoPath: string, branchName: string, options?: { githubToken?: string | null }): Promise; @@ -81,6 +84,7 @@ export function createDefaultDriver(): BackendDriver { ensureCloned, fetch, listRemoteBranches, + listLocalRemoteRefs, remoteDefaultBaseRef, revParse, ensureRemoteBranch, diff --git a/foundry/packages/backend/src/integrations/git/index.ts b/foundry/packages/backend/src/integrations/git/index.ts index 880e0f5..728239e 100644 --- a/foundry/packages/backend/src/integrations/git/index.ts +++ b/foundry/packages/backend/src/integrations/git/index.ts @@ -208,11 +208,25 @@ export async function remoteDefaultBaseRef(repoPath: string): Promise { return "origin/main"; } +/** + * Fetch from origin, then read remote-tracking refs. + * Use when you need guaranteed-fresh branch data and can tolerate network I/O. 
+ */ export async function listRemoteBranches(repoPath: string, options?: GitAuthOptions): Promise { await fetch(repoPath, options); + return listLocalRemoteRefs(repoPath); +} + +/** + * Read remote-tracking refs (`refs/remotes/origin/*`) from the local clone + * without fetching. The data is only as fresh as the last fetch — use this + * when the branch sync actor keeps refs current and you want to avoid + * blocking on network I/O. + */ +export async function listLocalRemoteRefs(repoPath: string): Promise { const { stdout } = await execFileAsync("git", ["-C", repoPath, "for-each-ref", "--format=%(refname:short) %(objectname)", "refs/remotes/origin"], { maxBuffer: 1024 * 1024, - env: gitEnv(options), + env: gitEnv(), }); return stdout diff --git a/foundry/packages/backend/test/helpers/test-driver.ts b/foundry/packages/backend/test/helpers/test-driver.ts index 505bcc4..c370d87 100644 --- a/foundry/packages/backend/test/helpers/test-driver.ts +++ b/foundry/packages/backend/test/helpers/test-driver.ts @@ -15,6 +15,7 @@ export function createTestGitDriver(overrides?: Partial): GitDriver { ensureCloned: async () => {}, fetch: async () => {}, listRemoteBranches: async () => [], + listLocalRemoteRefs: async () => [], remoteDefaultBaseRef: async () => "origin/main", revParse: async () => "abc1234567890", ensureRemoteBranch: async () => {}, diff --git a/foundry/packages/backend/test/workbench-unread.test.ts b/foundry/packages/backend/test/workbench-unread.test.ts index aafc178..4972c64 100644 --- a/foundry/packages/backend/test/workbench-unread.test.ts +++ b/foundry/packages/backend/test/workbench-unread.test.ts @@ -1,5 +1,5 @@ import { describe, expect, it } from "vitest"; -import { shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js"; +import { requireSendableSessionMeta, shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js"; describe("workbench unread 
status transitions", () => { it("marks unread when a running session first becomes idle", () => { @@ -57,3 +57,30 @@ describe("workbench model changes", () => { ).toBe(false); }); }); + +describe("workbench send readiness", () => { + it("rejects unknown tabs", () => { + expect(() => requireSendableSessionMeta(null, "tab-1")).toThrow("Unknown workbench tab: tab-1"); + }); + + it("rejects pending sessions", () => { + expect(() => + requireSendableSessionMeta( + { + status: "pending_session_create", + sandboxSessionId: null, + }, + "tab-2", + ), + ).toThrow("Session is not ready (status: pending_session_create). Wait for session provisioning to complete."); + }); + + it("accepts ready sessions with a sandbox session id", () => { + const meta = { + status: "ready", + sandboxSessionId: "session-1", + }; + + expect(requireSendableSessionMeta(meta, "tab-3")).toBe(meta); + }); +}); diff --git a/foundry/research/specs/remove-local-git-clone.md b/foundry/research/specs/remove-local-git-clone.md new file mode 100644 index 0000000..261ffc2 --- /dev/null +++ b/foundry/research/specs/remove-local-git-clone.md @@ -0,0 +1,381 @@ +# Remove Local Git Clone from Backend + +## Goal + +The Foundry backend stores zero git state. No clones, no refs, no working trees, no git-spice. All git operations execute inside sandboxes. Repo metadata (branches, default branch, PRs) comes from GitHub API/webhooks which we already have. + +## Terminology renames + +Rename Foundry domain terms across the entire `foundry/` directory. All changes are breaking — no backwards compatibility needed. Execute as separate atomic commits in this order. `pnpm -w typecheck && pnpm -w build && pnpm -w test` must pass between each. 
+ +| New name | Old name (current code) | +|---|---| +| **Organization** | Workspace | +| **Repository** | Project | +| **Session** (not "tab") | Tab / Session (mixed) | +| **Subscription** | Interest | +| **SandboxProviderId** | ProviderId | + +### Rename 1: `interest` → `subscription` + +The realtime pub/sub system in `client/src/interest/`. Rename the directory, all types (`InterestManager` → `SubscriptionManager`, `MockInterestManager` → `MockSubscriptionManager`, `RemoteInterestManager` → `RemoteSubscriptionManager`, `DebugInterestTopic` → `DebugSubscriptionTopic`), the `useInterest` hook → `useSubscription`, and all imports in client + frontend. Rename `frontend/src/lib/interest.ts` → `subscription.ts`. Rename test file `client/test/interest-manager.test.ts` → `subscription-manager.test.ts`. + +### Rename 2: `tab` → `session` + +The UI "tab" concept is really a session. Rename `TabStrip` → `SessionStrip`, `tabId` → `sessionId`, `closeTab` → `closeSession`, `addTab` → `addSession`, `WorkbenchAgentTab` → `WorkbenchAgentSession`, `TaskWorkbenchTabInput` → `TaskWorkbenchSessionInput`, `TaskWorkbenchAddTabResponse` → `TaskWorkbenchAddSessionResponse`, and all related props/DOM attrs (`activeTabId` → `activeSessionId`, `onSwitchTab` → `onSwitchSession`, `onCloseTab` → `onCloseSession`, `data-tab` → `data-session`, `editingSessionTabId` → `editingSessionId`). Rename file `tab-strip.tsx` → `session-strip.tsx`. **Leave "diff tabs" alone** (`isDiffTab`, `diffTabId`) — those are file viewer panes, a different concept. + +### Rename 3: `ProviderId` → `SandboxProviderId` + +The `ProviderId` type (`"e2b" | "local"`) is specifically a sandbox provider. 
Rename the type (`ProviderId` → `SandboxProviderId`), schema (`ProviderIdSchema` → `SandboxProviderIdSchema`), and all `providerId` fields that refer to sandbox hosting (`CreateTaskInput`, `TaskRecord`, `SwitchResult`, `WorkbenchSandboxSummary`, task DB schema `task.provider_id` → `sandbox_provider_id`, `task_sandboxes.provider_id` → `sandbox_provider_id`, topic params). Rename config key `providers` → `sandboxProviders`. DB column renames need Drizzle migrations. + +**Do NOT rename**: `model.provider` (AI model provider), `auth_account_index.provider_id` (auth provider), `providerAgent()` (model→agent mapping), `WorkbenchModelGroup.provider`. + +Also **delete the `providerProfiles` table entirely** — it's written but never read (dead code). Remove the table definition from the organization actor DB schema, all writes in organization actions, and the `refreshProviderProfiles` queue command/handler/interface. + +### Rename 4: `project` → `repository` + +The "project" actor/entity is a git repository. Rename: +- Actor directory `actors/project/` → `actors/repository/` +- Actor directory `actors/project-branch-sync/` → `actors/repository-branch-sync/` +- Actor registry keys `project` → `repository`, `projectBranchSync` → `repositoryBranchSync` +- Actor name string `"Project"` → `"Repository"` +- All functions: `projectKey` → `repositoryKey`, `getOrCreateProject` → `getOrCreateRepository`, `getProject` → `getRepository`, `selfProject` → `selfRepository`, `projectBranchSyncKey` → `repositoryBranchSyncKey`, `projectPrSyncKey` → `repositoryPrSyncKey`, `projectWorkflowQueueName` → `repositoryWorkflowQueueName` +- Types: `ProjectInput` → `RepositoryInput`, `WorkbenchProjectSection` → `WorkbenchRepositorySection`, `PROJECT_QUEUE_NAMES` → `REPOSITORY_QUEUE_NAMES` +- Queue names: `"project.command.*"` → `"repository.command.*"` +- Actor key strings: change `"project"` to `"repository"` in key arrays (e.g. 
`["ws", id, "project", repoId]` → `["org", id, "repository", repoId]`) +- Frontend: `projects` → `repositories`, `collapsedProjects` → `collapsedRepositories`, `hoveredProjectId` → `hoveredRepositoryId`, `PROJECT_COLORS` → `REPOSITORY_COLORS`, `data-project-*` → `data-repository-*`, `groupWorkbenchProjects` → `groupWorkbenchRepositories` +- Client keys: `projectKey()` → `repositoryKey()`, `projectBranchSyncKey()` → `repositoryBranchSyncKey()`, `projectPrSyncKey()` → `repositoryPrSyncKey()` + +### Rename 5: `workspace` → `organization` + +The "workspace" is really an organization. Rename: +- Actor directory `actors/workspace/` → `actors/organization/` +- Actor registry key `workspace` → `organization` +- Actor name string `"Workspace"` → `"Organization"` +- All types: `WorkspaceIdSchema` → `OrganizationIdSchema`, `WorkspaceId` → `OrganizationId`, `WorkspaceEvent` → `OrganizationEvent`, `WorkspaceSummarySnapshot` → `OrganizationSummarySnapshot`, `WorkspaceUseInputSchema` → `OrganizationUseInputSchema`, `WorkspaceHandle` → `OrganizationHandle`, `WorkspaceTopicParams` → `OrganizationTopicParams` +- All `workspaceId` fields/params → `organizationId` (~20+ schemas in contracts.ts, plus topic params, task snapshot, etc.) +- `FoundryOrganization.workspaceId` → `FoundryOrganization.organizationId` (or just `id`) +- All functions: `workspaceKey` → `organizationKey`, `getOrCreateWorkspace` → `getOrCreateOrganization`, `selfWorkspace` → `selfOrganization`, `resolveWorkspaceId` → `resolveOrganizationId`, `defaultWorkspace` → `defaultOrganization`, `workspaceWorkflowQueueName` → `organizationWorkflowQueueName`, `WORKSPACE_QUEUE_NAMES` → `ORGANIZATION_QUEUE_NAMES` +- Actor key strings: change `"ws"` to `"org"` in key arrays (e.g. 
`["ws", id]` → `["org", id]`) +- Queue names: `"workspace.command.*"` → `"organization.command.*"` +- Topic keys: `"workspace:${id}"` → `"organization:${id}"`, event `"workspaceUpdated"` → `"organizationUpdated"` +- Methods: `connectWorkspace` → `connectOrganization`, `getWorkspaceSummary` → `getOrganizationSummary`, `useWorkspace` → `useOrganization` +- Files: `shared/src/workspace.ts` → `organization.ts`, `backend/src/config/workspace.ts` → `organization.ts` +- Config keys: `config.workspace.default` → `config.organization.default` +- URL paths: `/workspaces/$workspaceId` → `/organizations/$organizationId` +- UI strings: `"Loading workspace..."` → `"Loading organization..."` +- Tests: rename `workspace-*.test.ts` files, update `workspaceSnapshot()` → `organizationSnapshot()`, `workspaceId: "ws-1"` → `organizationId: "org-1"` + +### After all renames: update CLAUDE.md files + +Update `foundry/CLAUDE.md` and `foundry/packages/backend/CLAUDE.md` to use new terminology throughout (organization instead of workspace, repository instead of project, etc.). The rest of this spec already uses the new names. 
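The actor key changes described in Rename 4 and Rename 5 can be sketched as below. This is a minimal illustration, not the code in `actors/keys.ts`: the `ActorKey` type, the parameter names, and the return types are assumptions made for the example. Only the key-array shapes (`["org", id]` and `["org", id, "repository", repoId]`) come from the spec.

```typescript
// Sketch only: `ActorKey` and the parameter names are hypothetical and may not
// match the real helper signatures in the Foundry codebase.
type ActorKey = readonly string[];

// Old shape: ["ws", id]  ->  new shape: ["org", id]
function organizationKey(organizationId: string): ActorKey {
  return ["org", organizationId];
}

// Old shape: ["ws", id, "project", repoId]
// New shape: ["org", id, "repository", repoId]
function repositoryKey(organizationId: string, repoId: string): ActorKey {
  return [...organizationKey(organizationId), "repository", repoId];
}
```

Building the repository key on top of the organization key keeps the `"org"` prefix defined in one place, so a later prefix change would not need to touch every helper.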
+ +## What gets deleted + +### Entire directories/files + +| Path (relative to `packages/backend/src/`) | Reason | +|---|---| +| `integrations/git/index.ts` | All local git operations | +| `integrations/git-spice/index.ts` | Stack management via git-spice | +| `actors/repository-branch-sync/` (currently `project-branch-sync/`) | Polling actor that fetches + reads local clone every 5s | +| `actors/project-pr-sync/` | Empty directory, already dead | +| `actors/repository/stack-model.ts` (currently `project/stack-model.ts`) | Stack parent/sort model (git-spice dependent) | +| `test/git-spice.test.ts` | Tests for deleted git-spice integration | +| `test/git-validate-remote.test.ts` | Tests for deleted git validation | +| `test/stack-model.test.ts` | Tests for deleted stack model | + +### Driver interfaces removed from `driver.ts` + +- `GitDriver` — entire interface deleted +- `StackDriver` — entire interface deleted +- `BackendDriver.git` — removed +- `BackendDriver.stack` — removed +- All imports from `integrations/git/` and `integrations/git-spice/` + +`BackendDriver` keeps only `github` and `tmux`. 
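As a rough sketch of the resulting driver shape, assuming placeholder definitions for the two remaining drivers: the `GithubDriver` and `TmuxDriver` bodies and the `createSketchDriver` factory below are hypothetical stand-ins, not the real interfaces in `driver.ts`, and exist only to show that `git` and `stack` no longer appear on `BackendDriver`.

```typescript
// Placeholder shapes: the real GithubDriver and TmuxDriver expose their own
// methods, which are intentionally not reproduced (or guessed at) here.
interface GithubDriver {
  readonly kind: "github";
}

interface TmuxDriver {
  readonly kind: "tmux";
}

// After the removal, BackendDriver carries no `git` or `stack` member.
interface BackendDriver {
  github: GithubDriver;
  tmux: TmuxDriver;
}

// Hypothetical factory used only for this illustration.
function createSketchDriver(): BackendDriver {
  return {
    github: { kind: "github" },
    tmux: { kind: "tmux" },
  };
}
```

With this shape, any leftover `driver.git.*` or `driver.stack.*` call site fails typechecking, which is one way `pnpm -w typecheck` would surface missed references.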
+ +### Test driver cleanup (`test/helpers/test-driver.ts`) + +- Delete `createTestGitDriver()` +- Delete `createTestStackDriver()` +- Remove `git` and `stack` from `createTestDriver()` + +### Docker volume removed (`compose.dev.yaml`, `compose.preview.yaml`) + +- Remove `foundry_git_repos` volume and its mount at `/root/.local/share/foundry/repos` +- Remove the CLAUDE.md note about the repos volume + +### Actor registry cleanup (`actors/index.ts`, `actors/keys.ts`, `actors/handles.ts`) + +- Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) registration +- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) +- Remove branch sync handle helpers + +### Client key cleanup (`packages/client/src/keys.ts`, `packages/client/test/keys.test.ts`) + +- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) if exported + +### Dead code removal: `providerProfiles` table + +The `providerProfiles` table in the organization actor (currently workspace actor) DB is written but never read. Delete: + +- Table definition in `actors/organization/db/schema.ts` (currently `workspace/db/schema.ts`) +- All writes in `actors/organization/actions.ts` (currently `workspace/actions.ts`) +- The `refreshProviderProfiles` queue command and handler +- The `RefreshProviderProfilesCommand` interface +- Add a DB migration to drop the `provider_profiles` table + +### Ensure pattern cleanup (`actors/repository/actions.ts`, currently `project/actions.ts`) + +Delete all `ensure*` functions that block action handlers on external I/O or cross-actor fan-out: + +- **`ensureLocalClone()`** — Delete (git clone removal). +- **`ensureProjectReady()`** / **`ensureRepositoryReady()`** — Delete (wrapper around `ensureLocalClone` + sync actors). +- **`ensureProjectReadyForRead()`** / **`ensureRepositoryReadyForRead()`** — Delete (dispatches ensure with 10s wait on read path). 
+- **`ensureProjectSyncActors()`** / **`ensureRepositorySyncActors()`** — Delete (spawns branch sync actor which is being removed). +- **`forceProjectSync()`** / **`forceRepositorySync()`** — Delete (triggers branch sync actor). +- **`ensureTaskIndexHydrated()`** — Delete. This is the migration path from `HistoryActor` → `task_index` table. Since we assume fresh repositories, no migration needed. The task index is populated on write (`createTask` inserts the row). +- **`ensureTaskIndexHydratedForRead()`** — Delete (wrapper that dispatches `hydrateTaskIndex`). +- **`taskIndexHydrated` state flag** — Delete from repository actor state. + +The `ensureAskpassScript()` is fine — it's a fast local operation. + +### Dead schema tables and helpers (`actors/repository/db/schema.ts`, `actors/repository/actions.ts`) + +With the branch sync actor and git-spice stack operations deleted, these tables have no writer and should be removed: + +- **`branches` table** — populated by `RepositoryBranchSyncActor` from the local clone. Delete the table, its schema definition, and all reads from it (including `enrichTaskRecord` which reads `diffStat`, `hasUnpushed`, `conflictsWithMain`, `parentBranch` from this table). +- **`repoActionJobs` table** — populated by `runRepoStackAction()` for git-spice stack operations. Delete the table, its schema definition, and all helpers: `ensureRepoActionJobsTable()`, `writeRepoActionJob()`, `listRepoActionJobRows()`. + +## What gets modified + +### `actors/repository/actions.ts` (currently `project/actions.ts`) + +This is the biggest change. Current git operations in this file: + +1. **`createTaskMutation()`** — Currently calls `listLocalRemoteRefs` to check branch name conflicts against remote branches. Replace: branch conflict checking uses only the repository actor's `task_index` table (which branches are already taken by tasks). 
We don't need to check against remote branches — if the branch already exists on the remote, `git push` in the sandbox will handle it. +2. **`registerTaskBranch()`** — Currently does `fetch` + `remoteDefaultBaseRef` + `revParse` + git-spice stack tracking. Replace: default base branch comes from GitHub repo metadata (already stored from webhook/API at repo add time). SHA resolution is not needed at task creation — the sandbox handles it. Delete all git-spice stack tracking. +3. **`getRepoOverview()`** — Currently calls `listLocalRemoteRefs` + `remoteDefaultBaseRef` + `stack.available` + `stack.listStack`. Replace: branch data comes from GitHub API data we already store from webhooks (push/create/delete events feed branch state). Stack data is deleted. The overview returns branches from stored GitHub webhook data. +4. **`runRepoStackAction()`** — Delete entirely (all git-spice stack operations). +5. **All `normalizeBaseBranchName` imports from git-spice** — Inline or move to a simple utility if still needed. +6. **All `ensureTaskIndexHydrated*` / `ensureRepositoryReady*` call sites** — Remove. Read actions query the `task_index` table directly; if it's empty, it's empty. Write actions populate it on create. + +### `actors/repository/index.ts` (currently `project/index.ts`) + +- Remove local clone path from state/initialization +- Remove branch sync actor spawning +- Remove any `ensureLocalClone` calls in lifecycle + +### `actors/task/workbench.ts` + +- **`ensureSandboxRepo()` line 405**: Currently calls `driver.git.remoteDefaultBaseRef()` on the local clone. Replace: read default branch from repository actor state (which gets it from GitHub API/webhook data at repo add time). + +### `actors/organization/actions.ts` (currently `workspace/actions.ts`) + +- **`addRemote()` line 320**: Currently calls `driver.git.validateRemote()` which runs `git ls-remote`. Replace: validate via GitHub API — `GET /repos/{owner}/{repo}` returns 404 for invalid repos. 
We already parse the remote URL into owner/repo for GitHub operations. + +### `actors/keys.ts` / `actors/handles.ts` + +- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) export +- Remove branch sync handle creation + +## What stays the same + +- `driver.github.*` — already uses GitHub API, no changes +- `driver.tmux.*` — unrelated, no changes +- `integrations/github/index.ts` — already GitHub API based, keeps working +- All sandbox execution (`executeInSandbox()`) — already correct pattern +- Webhook handlers for push/create/delete events — already feed GitHub data into backend + +## CLAUDE.md updates + +### `foundry/packages/backend/CLAUDE.md` + +Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) from the actor hierarchy tree: + +```text +OrganizationActor +├─ HistoryActor(organization-scoped global feed) +├─ GithubDataActor +├─ RepositoryActor(repo) +│ └─ TaskActor(task) +│ ├─ TaskSessionActor(session) x N +│ │ └─ SessionStatusSyncActor(session) x 0..1 +│ └─ Task-local workbench state +└─ SandboxInstanceActor(sandboxProviderId, sandboxId) x N +``` + +Add to Ownership Rules: + +> - The backend stores no local git state. No clones, no refs, no working trees, no git-spice. Repo metadata (branches, default branch) comes from GitHub API and webhook events. All git operations that require a working tree execute inside sandboxes via `executeInSandbox()`. + +### `foundry/CLAUDE.md` + +Add a new section: + +```markdown +## Git State Policy + +- The backend stores **zero git state**. No local clones, no refs, no working trees, no git-spice. +- Repo metadata (branches, default branch, PRs) comes from GitHub API and webhook events already flowing into the system. +- All git operations that require a working tree (diff, push, conflict check, rev-parse) execute inside the task's sandbox via `executeInSandbox()`. +- Do not add local git clone paths, `git fetch`, `git for-each-ref`, or any direct git CLI calls to the backend. 
If you need git data, either read it from stored GitHub webhook/API data or run it in a sandbox. +- The `BackendDriver` has no `GitDriver` or `StackDriver`. Only `GithubDriver` and `TmuxDriver` remain. +- git-spice is not used anywhere in the system. +``` + +Remove from CLAUDE.md: + +> - Docker dev: `compose.dev.yaml` mounts a named volume at `/root/.local/share/foundry/repos` to persist backend-managed git clones across restarts. Code must still work if this volume is not present (create directories as needed). + +## Concerns + +1. **Concurrent agent work**: Another agent is currently modifying `workspace/actions.ts`, `project/actions.ts`, `task/workbench.ts`, `task/workflow/init.ts`, `task/workflow/queue.ts`, `driver.ts`, and `project-branch-sync/index.ts`. Those changes are adding `listLocalRemoteRefs` to the driver and removing polling loops/timeouts. The git clone removal work will **delete** the code the other agent is modifying. Coordinate: let the other agent's changes land first, then this spec deletes the git integration entirely. + +2. **Rename ordering**: The rename spec (workspace→organization, project→repository, etc.) should ideally land **before** this spec is executed, so the file paths and identifiers match. If not, the implementing agent should map old names → new names using the table above. + +3. **`project-pr-sync/` directory**: This is already an empty directory. Delete it as part of cleanup. + +4. **`ensureRepoActionJobsTable()`**: The current spec mentions this should stay but the `repoActionJobs` table is being deleted. Updating: both the table and the ensure function should be deleted. + +## Validation + +After implementation, run: + +```bash +pnpm -w typecheck +pnpm -w build +pnpm -w test +``` + +Then restart the dev stack and run the main user flow end-to-end: + +```bash +just foundry-dev-down && just foundry-dev +``` + +Verify: +1. Add a repo to an organization +2. Create a task (should return immediately with taskId) +3. 
Task appears in sidebar with pending status +4. Task provisions and transitions to ready +5. Session is created and initial message is sent +6. Agent responds in the session transcript + +This must work against a real GitHub repo (`rivet-dev/sandbox-agent-testing`) with the dev environment credentials. + +### Codebase grep validation + +After implementation, verify no local git operations or git-spice references remain in the backend: + +```bash +# No local git CLI calls (excludes integrations/github which is GitHub API, not local git) +rg -l 'execFileAsync\("git"' foundry/packages/backend/src/ && echo "FAIL: local git CLI calls found" || echo "PASS" + +# No git-spice references +rg -l 'git.spice|gitSpice|git_spice' foundry/packages/backend/src/ && echo "FAIL: git-spice references found" || echo "PASS" + +# No GitDriver or StackDriver references +rg -l 'GitDriver|StackDriver' foundry/packages/backend/src/ && echo "FAIL: deleted driver interfaces still referenced" || echo "PASS" + +# No local clone path references +rg -l 'localPath|ensureCloned|ensureLocalClone|foundryRepoClonePath' foundry/packages/backend/src/ && echo "FAIL: local clone references found" || echo "PASS" + +# No branch sync actor references +rg -l 'BranchSync|branchSync|branch.sync' foundry/packages/backend/src/ && echo "FAIL: branch sync references found" || echo "PASS" + +# No deleted ensure patterns +rg -l 'ensureProjectReady|ensureTaskIndexHydrated|taskIndexHydrated' foundry/packages/backend/src/ && echo "FAIL: deleted ensure patterns found" || echo "PASS" + +# integrations/git/ and integrations/git-spice/ directories should not exist +ls foundry/packages/backend/src/integrations/git/index.ts 2>/dev/null && echo "FAIL: git integration not deleted" || echo "PASS" +ls foundry/packages/backend/src/integrations/git-spice/index.ts 2>/dev/null && echo "FAIL: git-spice integration not deleted" || echo "PASS" +``` + +All checks must pass before the change is considered complete. 
+ +### Rename verification + +After the rename spec has landed, verify no old names remain anywhere in `foundry/`: + +```bash +# --- workspace → organization --- +# No "WorkspaceActor", "WorkspaceEvent", "WorkspaceId", "WorkspaceSummary", etc. (exclude pnpm-workspace.yaml, node_modules, .turbo) +rg -l 'WorkspaceActor|WorkspaceEvent|WorkspaceId|WorkspaceSummary|WorkspaceHandle|WorkspaceUseInput|WorkspaceTopicParams' foundry/packages/ && echo "FAIL: workspace type references remain" || echo "PASS" + +# No workspaceId in domain code (exclude pnpm-workspace, node_modules, .turbo, this spec file) +rg -l 'workspaceId' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspaceId references remain" || echo "PASS" + +# No workspace actor directory +ls foundry/packages/backend/src/actors/workspace/ 2>/dev/null && echo "FAIL: workspace actor directory not renamed" || echo "PASS" + +# No workspaceKey function +rg 'workspaceKey|selfWorkspace|getOrCreateWorkspace|resolveWorkspaceId|defaultWorkspace' foundry/packages/ --glob '!node_modules' && echo "FAIL: workspace function references remain" || echo "PASS" + +# No "ws" actor key string (the old key prefix) +rg '"\\"ws\\""|\["ws"' foundry/packages/ --glob '!node_modules' && echo "FAIL: old 'ws' actor key strings remain" || echo "PASS" + +# No workspace queue names +rg 'workspace\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspace queue names remain" || echo "PASS" + +# No /workspaces/ URL paths +rg '/workspaces/' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: /workspaces/ URL paths remain" || echo "PASS" + +# No config.workspace +rg 'config\.workspace' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: config.workspace references remain" || echo "PASS" + +# --- project → repository --- +# No ProjectActor, ProjectInput, ProjectSection, etc. 
+rg -l 'ProjectActor|ProjectInput|ProjectSection|PROJECT_QUEUE|PROJECT_COLORS' foundry/packages/ --glob '!node_modules' && echo "FAIL: project type references remain" || echo "PASS" + +# No project actor directory +ls foundry/packages/backend/src/actors/project/ 2>/dev/null && echo "FAIL: project actor directory not renamed" || echo "PASS" + +# No projectKey, selfProject, getOrCreateProject, etc. +rg 'projectKey|selfProject|getOrCreateProject|getProject\b|projectBranchSync|projectPrSync|projectWorkflow' foundry/packages/ --glob '!node_modules' && echo "FAIL: project function references remain" || echo "PASS" + +# No "project" actor key string +rg '"\\"project\\""|\[".*"project"' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: old project actor key strings remain" || echo "PASS" + +# No project.command.* queue names +rg 'project\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: project queue names remain" || echo "PASS" + +# --- tab → session --- +# No WorkbenchAgentTab, TaskWorkbenchTabInput, TabStrip, tabId (in workbench context) +rg -l 'WorkbenchAgentTab|TaskWorkbenchTabInput|TaskWorkbenchAddTabResponse|TabStrip' foundry/packages/ --glob '!node_modules' && echo "FAIL: tab type references remain" || echo "PASS" + +# No tabId (should be sessionId now) +rg '\btabId\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: tabId references remain" || echo "PASS" + +# No tab-strip.tsx file +ls foundry/packages/frontend/src/components/mock-layout/tab-strip.tsx 2>/dev/null && echo "FAIL: tab-strip.tsx not renamed" || echo "PASS" + +# No closeTab/addTab (should be closeSession/addSession) +rg '\bcloseTab\b|\baddTab\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: closeTab/addTab references remain" || echo "PASS" + +# --- interest → subscription --- +# No InterestManager, useInterest, etc. 
+rg -l 'InterestManager|useInterest|DebugInterestTopic' foundry/packages/ --glob '!node_modules' && echo "FAIL: interest type references remain" || echo "PASS" + +# No interest/ directory +ls foundry/packages/client/src/interest/ 2>/dev/null && echo "FAIL: interest directory not renamed" || echo "PASS" + +# --- ProviderId → SandboxProviderId --- +# No bare ProviderId/ProviderIdSchema (but allow sandboxProviderId, model.provider, auth provider_id) +rg '\bProviderIdSchema\b|\bProviderId\b' foundry/packages/shared/src/contracts.ts && echo "FAIL: bare ProviderId in contracts.ts" || echo "PASS" + +# No bare providerId for sandbox context (check task schema) +rg '\bproviderId\b' foundry/packages/backend/src/actors/task/db/schema.ts && echo "FAIL: bare providerId in task schema" || echo "PASS" + +# No providerProfiles table (dead code, should be deleted) +rg 'providerProfiles|provider_profiles|refreshProviderProfiles' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: providerProfiles references remain" || echo "PASS" + +# --- Verify new names exist --- +rg -l 'OrganizationActor|OrganizationEvent|OrganizationId' foundry/packages/ --glob '!node_modules' | head -3 || echo "WARN: new organization names not found" +rg -l 'RepositoryActor|RepositoryInput|RepositorySection' foundry/packages/ --glob '!node_modules' | head -3 || echo "WARN: new repository names not found" +rg -l 'SubscriptionManager|useSubscription' foundry/packages/ --glob '!node_modules' | head -3 || echo "WARN: new subscription names not found" +rg -l 'SandboxProviderIdSchema|SandboxProviderId' foundry/packages/ --glob '!node_modules' | head -3 || echo "WARN: new sandbox provider names not found" +``` + +All checks must pass. False positives from markdown files, comments referencing old names in migration context, or `node_modules` should be excluded via the globs above. 
diff --git a/sdks/CLAUDE.md b/sdks/CLAUDE.md new file mode 100644 index 0000000..a71eac4 --- /dev/null +++ b/sdks/CLAUDE.md @@ -0,0 +1,37 @@ +# SDK Instructions + +## TypeScript SDK Architecture + +- TypeScript clients are split into: + - `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers. + - `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers. +- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`. +- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`. +- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`. +- Cleanup is `sdk.dispose()`. + +### React Component Methodology + +- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector. +- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations. +- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks. +- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`. 
+- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`. + +### TypeScript SDK Naming Conventions + +- Use `respond(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent. +- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions. + +### Docs Source Of Truth + +- For TypeScript docs/examples, source of truth is implementation in: + - `sdks/typescript/src/client.ts` + - `sdks/typescript/src/index.ts` + - `sdks/acp-http-client/src/index.ts` +- Do not document TypeScript APIs unless they are exported and implemented in those files. + +## Tests + +- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior. +- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests. diff --git a/server/CLAUDE.md b/server/CLAUDE.md index b56223c..88f4f0a 100644 --- a/server/CLAUDE.md +++ b/server/CLAUDE.md @@ -1,18 +1,47 @@ # Server Instructions -## Architecture +## ACP v1 Baseline -- Public API routes are defined in `server/packages/sandbox-agent/src/router.rs`. -- ACP proxy runtime is in `server/packages/sandbox-agent/src/acp_proxy_runtime.rs`. -- All API endpoints are under `/v1`. -- Keep binary filesystem transfer endpoints as dedicated HTTP APIs: +- v1 is ACP-native. +- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`). +- `/opencode/*` is disabled during ACP core phases and returns `503`. 
+- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`: + - `POST /v1/rpc` + - `GET /v1/rpc` (SSE) + - `DELETE /v1/rpc` +- Control-plane endpoints: + - `GET /v1/health` + - `GET /v1/agents` + - `POST /v1/agents/{agent}/install` +- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods): - `GET /v1/fs/file` - `PUT /v1/fs/file` - `POST /v1/fs/upload-batch` - - Rationale: host-owned cross-agent-consistent behavior and large binary transfer needs that ACP JSON-RPC is not suited to stream efficiently. - - Maintain ACP variants in parallel only when they share the same underlying filesystem implementation; SDK defaults should still prefer HTTP for large/binary transfers. -- `/opencode/*` stays disabled (`503`) until Phase 7. -- Agent install logic (native + ACP agent process + lazy install) is handled by `server/packages/agent-management/`. +- Sandbox Agent ACP extension method naming: + - Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`). + - Session detach method is `_sandboxagent/session/detach`. + +## API Scope + +- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent. +- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP. +- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities). +- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP. + - Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent. + - ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior. 
+- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP: + - These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior. + - They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream. + - This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`. + - ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations. + +## Architecture + +- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs` +- ACP proxy runtime: `server/packages/sandbox-agent/src/acp_proxy_runtime.rs` +- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs` +- Agent install logic (native + ACP agent process + lazy install): `server/packages/agent-management/` +- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/` ## API Contract Rules @@ -21,6 +50,24 @@ - Regenerate `docs/openapi.json` after endpoint contract changes. - Keep CLI and HTTP endpoint behavior aligned (`docs/cli.mdx`). +## ACP Protocol Compliance + +- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`. +- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`). +- Do not invent ACP properties or categories (e.g. `permission_mode` is not a valid ACP category — use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`). +- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it. 
+- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire. + +## Source Documents + +- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt` +- `~/misc/acp-docs/schema/schema.json` +- `~/misc/acp-docs/schema/meta.json` +- `research/acp/spec.md` +- `research/acp/v1-schema-to-acp-mapping.md` +- `research/acp/friction.md` +- `research/acp/todo.md` + ## Tests Primary v1 integration coverage: @@ -38,3 +85,9 @@ cargo test -p sandbox-agent --test v1_agent_process_matrix - Keep `research/acp/spec.md` as the source spec. - Update `research/acp/todo.md` when scope/status changes. - Log blockers/decisions in `research/acp/friction.md`. + +## Docker Examples (Dev Testing) + +- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples. +- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image. +- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`
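
The category rule from the ACP Protocol Compliance section (standard values `mode`, `model`, `thought_level`, `other`, plus custom categories prefixed with `_`) can be sketched as a small guard. `isStandardCategory` and `isValidCategory` are hypothetical names used only for illustration. They are not SDK exports, and the ACP spec remains the source of truth for the list of standard values.

```typescript
// Hypothetical helpers for illustration only; not part of the Sandbox Agent SDK.
// The standard values below are copied from the ACP Protocol Compliance rules.
function isStandardCategory(value: string): boolean {
  switch (value) {
    case "mode":
    case "model":
    case "thought_level":
    case "other":
      return true;
    default:
      return false;
  }
}

// Accepts standard categories and custom categories prefixed with `_`.
function isValidCategory(value: string): boolean {
  return isStandardCategory(value) || value.charAt(0) === "_";
}

console.log(isValidCategory("thought_level"));    // true
console.log(isValidCategory("_permission_mode")); // true
console.log(isValidCategory("permission_mode"));  // false
```

A guard of this kind could run before a config option is sent over the wire, so an invented category such as `permission_mode` is rejected locally instead of reaching the agent.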