Nathan Flurry 2026-03-14 16:06:50 -07:00
parent 400f9a214e
commit 3263d4f5e1
18 changed files with 677 additions and 329 deletions

@ -1,40 +1,5 @@
# Instructions
## ACP v1 Baseline
- v1 is ACP-native.
- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`).
- `/opencode/*` is disabled during ACP core phases and returns `503`.
- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`:
- `POST /v1/rpc`
- `GET /v1/rpc` (SSE)
- `DELETE /v1/rpc`
- Control-plane endpoints:
- `GET /v1/health`
- `GET /v1/agents`
- `POST /v1/agents/{agent}/install`
- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods):
- `GET /v1/fs/file`
- `PUT /v1/fs/file`
- `POST /v1/fs/upload-batch`
- Sandbox Agent ACP extension method naming:
- Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`).
- Session detach method is `_sandboxagent/session/detach`.
## API Scope
- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent.
- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP.
- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities).
- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP.
- Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent.
- ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior.
- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP:
- These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior.
- They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream.
- This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`.
- ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations.
## Naming and Ownership
- This repository/product is **Sandbox Agent**.
@ -49,66 +14,13 @@
- Never expose underlying protocol method names (e.g. `session/request_permission`, `session/create`, `_sandboxagent/session/detach`) in non-ACP docs. Describe the behavior in user-facing terms instead.
- Do not describe the underlying protocol implementation in docs. Only document the SDK surface (methods, types, options). ACP protocol details belong exclusively in ACP-specific pages.
## Architecture (Brief)
### Docs Source Of Truth (HTTP/CLI)
- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs`
- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs`
- Agent/native + ACP agent process install and lazy install: `server/packages/agent-management/`
- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/`
## TypeScript SDK Architecture
- TypeScript clients are split into:
- `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers.
- `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers.
- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`.
- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`.
- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`.
- Cleanup is `sdk.dispose()`.
### React Component Methodology
- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector.
- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations.
- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks.
- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`.
- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`.
### TypeScript SDK Naming Conventions
- Use `respond<Thing>(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent.
- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions.
### Docs Source Of Truth
- For TypeScript docs/examples, source of truth is implementation in:
- `sdks/typescript/src/client.ts`
- `sdks/typescript/src/index.ts`
- `sdks/acp-http-client/src/index.ts`
- Do not document TypeScript APIs unless they are exported and implemented in those files.
- For HTTP/CLI docs/examples, source of truth is:
- `server/packages/sandbox-agent/src/router.rs`
- `server/packages/sandbox-agent/src/cli.rs`
- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions` APIs).
## ACP Protocol Compliance
- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`.
- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`).
- Do not invent ACP properties or categories (e.g. `permission_mode` is not a valid ACP category — use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`).
- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it.
- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire.
## Source Documents
- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt`
- `~/misc/acp-docs/schema/schema.json`
- `~/misc/acp-docs/schema/meta.json`
- `research/acp/spec.md`
- `research/acp/v1-schema-to-acp-mapping.md`
- `research/acp/friction.md`
- `research/acp/todo.md`
## Change Tracking
- If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
@ -119,14 +31,6 @@
- Append blockers/decisions to `research/acp/friction.md` during ACP work.
- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.
- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior.
- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests.
## Docker Examples (Dev Testing)
- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples.
- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image.
- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`
## Install Version References

@ -224,6 +224,12 @@ Examples:
Never use `wait: true` for operations that depend on external readiness, sandbox I/O, agent responses, git network operations, polling loops, or long-running queue drains. Never hold an action open while waiting for an external system to become ready — that is a polling/retry loop in disguise.
### Timeout policy
All `wait: true` sends must have an explicit `timeout`. Maximum timeout for any `wait: true` send is **10 seconds** (`10_000`). If an operation cannot reliably complete within 10 seconds, it must be restructured: write the initial record to the DB, return it to the caller, and continue the work asynchronously with `wait: false`. The client observes completion via push events.
`wait: false` sends do not need a timeout (the enqueue is instant; the work runs in the workflow loop with its own step-level timeouts).
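A minimal sketch of how the timeout policy could be enforced at call sites. The helper names (`waitOptions`, `fireAndForgetOptions`) and the constant are hypothetical and do not exist in the codebase; only the `wait`/`timeout` option shape and the 10-second ceiling come from the policy above.

```typescript
// Hypothetical helpers (not part of the codebase) that encode the policy:
// every `wait: true` send carries an explicit timeout of at most 10 seconds.
const MAX_WAIT_TIMEOUT_MS = 10_000;

type WaitSendOptions = { wait: true; timeout: number };
type AsyncSendOptions = { wait: false };

// Build options for a blocking send, rejecting timeouts above the ceiling.
function waitOptions(timeoutMs: number = MAX_WAIT_TIMEOUT_MS): WaitSendOptions {
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new Error(`timeout must be a positive number, got ${timeoutMs}`);
  }
  if (timeoutMs > MAX_WAIT_TIMEOUT_MS) {
    throw new Error(
      `wait: true timeout of ${timeoutMs}ms exceeds the ${MAX_WAIT_TIMEOUT_MS}ms maximum; ` +
        `write the initial record, return it, and continue with wait: false`,
    );
  }
  return { wait: true, timeout: timeoutMs };
}

// Build options for an enqueue-only send; no timeout is needed because the
// enqueue returns immediately and the workflow step has its own timeout.
function fireAndForgetOptions(): AsyncSendOptions {
  return { wait: false };
}
```

Under these assumptions, a call site would read `await self.send(queueName, cmd, waitOptions())`, and a `5 * 60_000` timeout would fail loudly instead of silently holding the action open.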
### Task creation: resolve metadata before creating the actor
When creating a task, all deterministic metadata (title, branch name) must be resolved synchronously in the parent actor (project) *before* the task actor is created. The task actor must never be created with null `branchName` or `title`.
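As an illustration of what "resolved synchronously in the parent actor" can look like, the sketch below derives a title and a collision-free branch name from in-memory data only. It is a simplified, hypothetical stand-in rather than the project's actual resolver: `resolveNames`, `slugify`, the 40-character cap, and the `-2`, `-3` suffix scheme are all assumptions made for the example.

```typescript
interface ResolvedNames {
  title: string;
  branchName: string;
}

// Hypothetical slug rule: lowercase, non-alphanumeric runs become "-",
// capped at 40 characters.
function slugify(text: string): string {
  const slug = text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug.slice(0, 40).replace(/-+$/g, "") || "task";
}

// Resolve both names without any I/O so the result is deterministic for a
// given task description and set of already-taken branch names.
function resolveNames(task: string, takenBranches: Iterable<string>): ResolvedNames {
  const taken = new Set(takenBranches);
  const title = (task.trim().split("\n")[0] ?? "").trim() || "Untitled task";
  const base = slugify(title);
  let branchName = base;
  // Append -2, -3, ... until the name is free.
  for (let suffix = 2; taken.has(branchName); suffix++) {
    branchName = `${base}-${suffix}`;
  }
  return { title, branchName };
}
```

Because the function never returns a null `title` or `branchName`, the parent can pass both values straight into task actor creation, which is the invariant the rule above requires.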

@ -14,6 +14,10 @@ services:
HF_BACKEND_HOST: "0.0.0.0"
HF_BACKEND_PORT: "7741"
RIVETKIT_STORAGE_PATH: "/root/.local/share/foundry/rivetkit"
RIVET_LOG_ERROR_STACK: "${RIVET_LOG_ERROR_STACK:-1}"
RIVET_LOG_LEVEL: "${RIVET_LOG_LEVEL:-debug}"
RIVET_LOG_TIMESTAMP: "${RIVET_LOG_TIMESTAMP:-1}"
FOUNDRY_LOG_LEVEL: "${FOUNDRY_LOG_LEVEL:-debug}"
# Pass through credentials needed for agent execution + PR creation in dev/e2e.
# Do not hardcode secrets; set these in your environment when starting compose.
ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY:-}"

@ -156,7 +156,7 @@ export const projectBranchSync = actor({
async force(c): Promise<void> {
const self = selfProjectBranchSync(c);
await self.send(CONTROL.force, {}, { wait: true, timeout: 5 * 60_000 });
await self.send(CONTROL.force, {}, { wait: true, timeout: 10_000 });
},
},
run: workflow(async (ctx) => {

@ -11,7 +11,7 @@ import { resolveWorkspaceGithubAuth } from "../../services/github-auth.js";
import { expectQueueResponse } from "../../services/queue.js";
import { withRepoGitLock } from "../../services/repo-git-lock.js";
import { branches, taskIndex, repoActionJobs, repoMeta } from "./db/schema.js";
import { deriveFallbackTitle } from "../../services/create-flow.js";
import { deriveFallbackTitle, resolveCreateFlowDecision } from "../../services/create-flow.js";
import { normalizeBaseBranchName } from "../../integrations/git-spice/index.js";
import { sortBranchesForOverview } from "./stack-model.js";
@ -416,37 +416,81 @@ async function hydrateTaskIndexMutation(c: any, _cmd?: HydrateTaskIndexCommand):
}
async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise<TaskRecord> {
const workspaceId = c.state.workspaceId;
const repoId = c.state.repoId;
const repoRemote = c.state.remoteUrl;
const onBranch = cmd.onBranch?.trim() || null;
const initialBranchName = onBranch;
const initialTitle = onBranch ? deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined) : null;
const taskId = randomUUID();
let initialBranchName: string | null = null;
let initialTitle: string | null = null;
if (onBranch) {
initialBranchName = onBranch;
initialTitle = deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined);
await registerTaskBranchMutation(c, {
taskId,
branchName: onBranch,
requireExistingRemote: true,
});
} else {
const localPath = await ensureProjectReady(c);
const { driver } = getActorRuntimeContext();
// Read locally cached remote-tracking refs — no network fetch.
// The branch sync actor keeps these reasonably fresh. If a rare naming
// collision occurs with a very recently created remote branch, it will
// be caught lazily on push/checkout.
const remoteBranches = (await driver.git.listLocalRemoteRefs(localPath)).map((branch: any) => branch.branchName);
await ensureTaskIndexHydrated(c);
const reservedBranchRows = await c.db.select({ branchName: taskIndex.branchName }).from(taskIndex).where(isNotNull(taskIndex.branchName)).all();
const reservedBranches = reservedBranchRows
.map((row: { branchName: string | null }) => row.branchName)
.filter((branchName): branchName is string => typeof branchName === "string" && branchName.length > 0);
const resolved = resolveCreateFlowDecision({
task: cmd.task,
explicitTitle: cmd.explicitTitle ?? undefined,
explicitBranchName: cmd.explicitBranchName ?? undefined,
localBranches: remoteBranches,
taskBranches: reservedBranches,
});
initialBranchName = resolved.branchName;
initialTitle = resolved.title;
const now = Date.now();
await c.db
.insert(taskIndex)
.values({
taskId,
branchName: resolved.branchName,
createdAt: now,
updatedAt: now,
})
.onConflictDoNothing()
.run();
}
let task: Awaited<ReturnType<typeof getOrCreateTask>>;
try {
task = await getOrCreateTask(c, c.state.workspaceId, c.state.repoId, taskId, {
workspaceId: c.state.workspaceId,
repoId: c.state.repoId,
task = await getOrCreateTask(c, workspaceId, repoId, taskId, {
workspaceId,
repoId,
taskId,
repoRemote: c.state.remoteUrl,
repoRemote,
branchName: initialBranchName,
title: initialTitle,
task: cmd.task,
providerId: cmd.providerId,
agentType: cmd.agentType,
explicitTitle: onBranch ? null : cmd.explicitTitle,
explicitBranchName: onBranch ? null : cmd.explicitBranchName,
explicitTitle: null,
explicitBranchName: null,
initialPrompt: cmd.initialPrompt,
});
} catch (error) {
if (onBranch) {
if (initialBranchName) {
await c.db
.delete(taskIndex)
.where(eq(taskIndex.taskId, taskId))
@ -456,28 +500,14 @@ async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise<TaskR
throw error;
}
if (!onBranch) {
const now = Date.now();
await c.db
.insert(taskIndex)
.values({
taskId,
branchName: initialBranchName,
createdAt: now,
updatedAt: now,
})
.onConflictDoNothing()
.run();
}
const created = await task.initialize({ providerId: cmd.providerId });
const history = await getOrCreateHistory(c, c.state.workspaceId, c.state.repoId);
const history = await getOrCreateHistory(c, workspaceId, repoId);
await history.append({
kind: "task.created",
taskId,
payload: {
repoId: c.state.repoId,
repoId,
providerId: cmd.providerId,
},
});
@ -919,7 +949,7 @@ export const projectActions = {
return expectQueueResponse<EnsureProjectResult>(
await self.send(projectWorkflowQueueName("project.command.ensure"), cmd, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
}),
);
},
@ -929,7 +959,7 @@ export const projectActions = {
return expectQueueResponse<TaskRecord>(
await self.send(projectWorkflowQueueName("project.command.createTask"), cmd, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
}),
);
},
@ -947,7 +977,7 @@ export const projectActions = {
return expectQueueResponse<{ branchName: string; headSha: string }>(
await self.send(projectWorkflowQueueName("project.command.registerTaskBranch"), cmd, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
}),
);
},
@ -956,7 +986,7 @@ export const projectActions = {
const self = selfProject(c);
await self.send(projectWorkflowQueueName("project.command.hydrateTaskIndex"), cmd ?? {}, {
wait: true,
timeout: 60_000,
timeout: 10_000,
});
},
@ -1225,7 +1255,7 @@ export const projectActions = {
const self = selfProject(c);
await self.send(projectWorkflowQueueName("project.command.applyBranchSyncResult"), body, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
});
},
};

@ -101,14 +101,15 @@ interface TaskWorkbenchSendMessageCommand {
attachments: Array<any>;
}
interface TaskWorkbenchSendMessageActionInput extends TaskWorkbenchSendMessageInput {
waitForCompletion?: boolean;
}
interface TaskWorkbenchCreateSessionCommand {
model?: string;
}
interface TaskWorkbenchCreateSessionAndSendCommand {
model?: string;
text: string;
}
interface TaskWorkbenchSessionCommand {
sessionId: string;
}
@ -143,7 +144,7 @@ export const task = actor({
const self = selfTask(c);
const result = await self.send(taskWorkflowQueueName("task.command.initialize"), cmd ?? {}, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
});
return expectQueueResponse<TaskRecord>(result);
},
@ -160,7 +161,7 @@ export const task = actor({
const self = selfTask(c);
const result = await self.send(taskWorkflowQueueName("task.command.attach"), cmd ?? {}, {
wait: true,
timeout: 20_000,
timeout: 10_000,
});
return expectQueueResponse<{ target: string; sessionId: string | null }>(result);
},
@ -172,7 +173,7 @@ export const task = actor({
{},
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
return expectQueueResponse<{ switchTarget: string }>(result);
@ -236,7 +237,7 @@ export const task = actor({
{},
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
},
@ -263,12 +264,25 @@ export const task = actor({
{ ...(input?.model ? { model: input.model } : {}) } satisfies TaskWorkbenchCreateSessionCommand,
{
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
},
);
return expectQueueResponse<{ tabId: string }>(result);
},
/**
* Fire-and-forget: creates a workbench session and sends the initial message.
* Used by createWorkbenchTask so the caller doesn't block on session creation.
*/
async createWorkbenchSessionAndSend(c, input: { model?: string; text: string }): Promise<void> {
const self = selfTask(c);
await self.send(
taskWorkflowQueueName("task.command.workbench.create_session_and_send"),
{ model: input.model, text: input.text } satisfies TaskWorkbenchCreateSessionAndSendCommand,
{ wait: false },
);
},
async renameWorkbenchSession(c, input: TaskWorkbenchRenameSessionInput): Promise<void> {
const self = selfTask(c);
await self.send(
@ -276,7 +290,7 @@ export const task = actor({
{ sessionId: input.tabId, title: input.title } satisfies TaskWorkbenchSessionTitleCommand,
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
},
@ -288,7 +302,7 @@ export const task = actor({
{ sessionId: input.tabId, unread: input.unread } satisfies TaskWorkbenchSessionUnreadCommand,
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
},
@ -304,7 +318,7 @@ export const task = actor({
} satisfies TaskWorkbenchUpdateDraftCommand,
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
},
@ -316,14 +330,14 @@ export const task = actor({
{ sessionId: input.tabId, model: input.model } satisfies TaskWorkbenchChangeModelCommand,
{
wait: true,
timeout: 20_000,
timeout: 10_000,
},
);
},
async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageActionInput): Promise<void> {
async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageInput): Promise<void> {
const self = selfTask(c);
const result = await self.send(
await self.send(
taskWorkflowQueueName("task.command.workbench.send_message"),
{
sessionId: input.tabId,
@ -331,13 +345,9 @@ export const task = actor({
attachments: input.attachments,
} satisfies TaskWorkbenchSendMessageCommand,
{
wait: input.waitForCompletion === true,
...(input.waitForCompletion === true ? { timeout: 10 * 60_000 } : {}),
wait: false,
},
);
if (input.waitForCompletion === true) {
expectQueueResponse(result);
}
},
async stopWorkbenchSession(c, input: TaskTabCommand): Promise<void> {

@ -307,22 +307,14 @@ async function requireReadySessionMeta(c: any, tabId: string): Promise<any> {
return meta;
}
async function ensureReadySessionMeta(c: any, tabId: string): Promise<any> {
const meta = await readSessionMeta(c, tabId);
export function requireSendableSessionMeta(meta: any, tabId: string): any {
if (!meta) {
throw new Error(`Unknown workbench tab: ${tabId}`);
}
if (meta.status === "ready" && meta.sandboxSessionId) {
return meta;
if (meta.status !== "ready" || !meta.sandboxSessionId) {
throw new Error(`Session is not ready (status: ${meta.status}). Wait for session provisioning to complete.`);
}
if (meta.status === "error") {
throw new Error(meta.errorMessage ?? "This workbench tab failed to prepare");
}
await ensureWorkbenchSession(c, tabId);
return await requireReadySessionMeta(c, tabId);
return meta;
}
function shellFragment(parts: string[]): string {
@ -1204,7 +1196,7 @@ export async function changeWorkbenchModel(c: any, sessionId: string, model: str
}
export async function sendWorkbenchMessage(c: any, sessionId: string, text: string, attachments: Array<any>): Promise<void> {
const meta = await ensureReadySessionMeta(c, sessionId);
const meta = requireSendableSessionMeta(await readSessionMeta(c, sessionId), sessionId);
const record = await ensureWorkbenchSeeded(c);
const runtime = await getTaskSandboxRuntime(c, record);
await ensureSandboxRepo(c, runtime.sandbox, record);

@ -1,14 +1,7 @@
import { Loop } from "rivetkit/workflow";
import { logActorWarning, resolveErrorMessage } from "../../logging.js";
import { getCurrentRecord } from "./common.js";
import {
initAssertNameActivity,
initBootstrapDbActivity,
initCompleteActivity,
initEnqueueProvisionActivity,
initEnsureNameActivity,
initFailedActivity,
} from "./init.js";
import { initBootstrapDbActivity, initCompleteActivity, initEnqueueProvisionActivity, initFailedActivity } from "./init.js";
import {
handleArchiveActivity,
handleAttachActivity,
@ -67,12 +60,8 @@ const commandHandlers: Record<TaskQueueName, WorkflowHandler> = {
await loopCtx.removed("init-failed", "step");
await loopCtx.removed("init-failed-v2", "step");
try {
await loopCtx.step({
name: "init-ensure-name",
timeout: 5 * 60_000,
run: async () => initEnsureNameActivity(loopCtx),
});
await loopCtx.step("init-assert-name", async () => initAssertNameActivity(loopCtx));
await loopCtx.removed("init-ensure-name", "step");
await loopCtx.removed("init-assert-name", "step");
await loopCtx.removed("init-create-sandbox", "step");
await loopCtx.removed("init-ensure-agent", "step");
await loopCtx.removed("init-start-sandbox-instance", "step");
@ -156,6 +145,26 @@ const commandHandlers: Record<TaskQueueName, WorkflowHandler> = {
}
},
"task.command.workbench.create_session_and_send": async (loopCtx, msg) => {
try {
const created = await loopCtx.step({
name: "workbench-create-session-for-send",
timeout: 5 * 60_000,
run: async () => createWorkbenchSession(loopCtx, msg.body?.model),
});
await loopCtx.step({
name: "workbench-send-initial-message",
timeout: 5 * 60_000,
run: async () => sendWorkbenchMessage(loopCtx, created.tabId, msg.body.text, []),
});
} catch (error) {
logActorWarning("task.workflow", "create_session_and_send failed", {
error: resolveErrorMessage(error),
});
}
await msg.complete({ ok: true });
},
"task.command.workbench.ensure_session": async (loopCtx, msg) => {
await loopCtx.step({
name: "workbench-ensure-session",

@ -1,10 +1,8 @@
// @ts-nocheck
import { eq } from "drizzle-orm";
import { resolveCreateFlowDecision } from "../../../services/create-flow.js";
import { resolveWorkspaceGithubAuth } from "../../../services/github-auth.js";
import { getActorRuntimeContext } from "../../context.js";
import { getOrCreateHistory, getOrCreateProject, selfTask } from "../../handles.js";
import { logActorWarning, resolveErrorMessage } from "../../logging.js";
import { getOrCreateHistory, selfTask } from "../../handles.js";
import { resolveErrorMessage } from "../../logging.js";
import { defaultSandboxProviderId } from "../../../sandbox-config.js";
import { task as taskTable, taskRuntime } from "../db/schema.js";
import { TASK_ROW_ID, appendHistory, collectErrorMessages, resolveErrorDetail, setTaskState } from "./common.js";
@ -21,7 +19,6 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
const { config } = getActorRuntimeContext();
const providerId = body?.providerId ?? loopCtx.state.providerId ?? defaultSandboxProviderId(config);
const now = Date.now();
const initialStatusMessage = loopCtx.state.branchName && loopCtx.state.title ? "provisioning" : "naming";
await ensureTaskRuntimeCacheColumns(loopCtx.db);
@ -60,7 +57,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
activeSessionId: null,
activeSwitchTarget: null,
activeCwd: null,
statusMessage: initialStatusMessage,
statusMessage: "provisioning",
gitStateJson: null,
gitStateUpdatedAt: null,
provisionStage: "queued",
@ -74,7 +71,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
activeSessionId: null,
activeSwitchTarget: null,
activeCwd: null,
statusMessage: initialStatusMessage,
statusMessage: "provisioning",
provisionStage: "queued",
provisionStageUpdatedAt: now,
updatedAt: now,
@ -111,102 +108,6 @@ export async function initEnqueueProvisionActivity(loopCtx: any, body: any): Pro
}
}
export async function initEnsureNameActivity(loopCtx: any): Promise<void> {
await setTaskState(loopCtx, "init_ensure_name", "determining title and branch");
const existing = await loopCtx.db
.select({
branchName: taskTable.branchName,
title: taskTable.title,
})
.from(taskTable)
.where(eq(taskTable.id, TASK_ROW_ID))
.get();
if (existing?.branchName && existing?.title) {
loopCtx.state.branchName = existing.branchName;
loopCtx.state.title = existing.title;
return;
}
const { driver } = getActorRuntimeContext();
const auth = await resolveWorkspaceGithubAuth(loopCtx, loopCtx.state.workspaceId);
let repoLocalPath = loopCtx.state.repoLocalPath;
if (!repoLocalPath) {
const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
const result = await project.ensure({ remoteUrl: loopCtx.state.repoRemote });
repoLocalPath = result.localPath;
loopCtx.state.repoLocalPath = repoLocalPath;
}
try {
await driver.git.fetch(repoLocalPath, { githubToken: auth?.githubToken ?? null });
} catch (error) {
logActorWarning("task.init", "fetch before naming failed", {
workspaceId: loopCtx.state.workspaceId,
repoId: loopCtx.state.repoId,
taskId: loopCtx.state.taskId,
error: resolveErrorMessage(error),
});
}
const remoteBranches = (await driver.git.listRemoteBranches(repoLocalPath, { githubToken: auth?.githubToken ?? null })).map(
(branch: any) => branch.branchName,
);
const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
const reservedBranches = await project.listReservedBranches({});
const resolved = resolveCreateFlowDecision({
task: loopCtx.state.task,
explicitTitle: loopCtx.state.explicitTitle ?? undefined,
explicitBranchName: loopCtx.state.explicitBranchName ?? undefined,
localBranches: remoteBranches,
taskBranches: reservedBranches,
});
const now = Date.now();
await loopCtx.db
.update(taskTable)
.set({
branchName: resolved.branchName,
title: resolved.title,
updatedAt: now,
})
.where(eq(taskTable.id, TASK_ROW_ID))
.run();
loopCtx.state.branchName = resolved.branchName;
loopCtx.state.title = resolved.title;
loopCtx.state.explicitTitle = null;
loopCtx.state.explicitBranchName = null;
await loopCtx.db
.update(taskRuntime)
.set({
statusMessage: "provisioning",
provisionStage: "repo_prepared",
provisionStageUpdatedAt: now,
updatedAt: now,
})
.where(eq(taskRuntime.id, TASK_ROW_ID))
.run();
await project.registerTaskBranch({
taskId: loopCtx.state.taskId,
branchName: resolved.branchName,
});
await appendHistory(loopCtx, "task.named", {
title: resolved.title,
branchName: resolved.branchName,
});
}
export async function initAssertNameActivity(loopCtx: any): Promise<void> {
await setTaskState(loopCtx, "init_assert_name", "validating naming");
if (!loopCtx.state.branchName) {
throw new Error("task branchName is not initialized");
}
}
export async function initCompleteActivity(loopCtx: any, body: any): Promise<void> {
const now = Date.now();
const { config } = getActorRuntimeContext();

@ -13,6 +13,7 @@ export const TASK_QUEUE_NAMES = [
"task.command.workbench.rename_task",
"task.command.workbench.rename_branch",
"task.command.workbench.create_session",
"task.command.workbench.create_session_and_send",
"task.command.workbench.ensure_session",
"task.command.workbench.rename_session",
"task.command.workbench.set_session_unread",

@ -1,5 +1,4 @@
// @ts-nocheck
import { setTimeout as delay } from "node:timers/promises";
import { desc, eq } from "drizzle-orm";
import { Loop } from "rivetkit/workflow";
import type {
@ -272,24 +271,6 @@ async function requireWorkbenchTask(c: any, taskId: string) {
return getTask(c, c.state.workspaceId, repoId, taskId);
}
async function waitForWorkbenchTaskReady(task: any, timeoutMs = 5 * 60_000): Promise<any> {
const startedAt = Date.now();
for (;;) {
const record = await task.get();
if (record?.branchName && record?.title) {
return record;
}
if (record?.status === "error") {
throw new Error("task initialization failed before the workbench session was ready");
}
if (Date.now() - startedAt > timeoutMs) {
throw new Error("timed out waiting for task initialization");
}
await delay(1_000);
}
}
/**
* Reads the workspace sidebar snapshot from the workspace actor's local SQLite
* plus the org-scoped GitHub actor for open PRs. Task actors still push
@ -562,7 +543,7 @@ export const workspaceActions = {
return expectQueueResponse<RepoRecord>(
await self.send(workspaceWorkflowQueueName("workspace.command.addRepo"), input, {
wait: true,
timeout: 60_000,
timeout: 10_000,
}),
);
},
@ -595,7 +576,7 @@ export const workspaceActions = {
return expectQueueResponse<TaskRecord>(
await self.send(workspaceWorkflowQueueName("workspace.command.createTask"), input, {
wait: true,
timeout: 5 * 60_000,
timeout: 10_000,
}),
);
},
@ -813,6 +794,7 @@ export const workspaceActions = {
},
async createWorkbenchTask(c: any, input: TaskWorkbenchCreateTaskInput): Promise<{ taskId: string; tabId?: string }> {
// Step 1: Create the task record (wait: true — local state mutations only).
const created = await workspaceActions.createTask(c, {
workspaceId: c.state.workspaceId,
repoId: input.repoId,
@ -821,26 +803,18 @@ export const workspaceActions = {
...(input.onBranch ? { onBranch: input.onBranch } : input.branch ? { explicitBranchName: input.branch } : {}),
...(input.model ? { agentType: agentTypeForModel(input.model) } : {}),
});
// Step 2: Enqueue session creation + initial message (wait: false).
// The task workflow creates the session record and sends the message in
// the background. The client observes progress via push events on the
// task interest topic.
const task = await requireWorkbenchTask(c, created.taskId);
await waitForWorkbenchTaskReady(task);
const session = await task.createWorkbenchSession({
taskId: created.taskId,
...(input.model ? { model: input.model } : {}),
});
await task.sendWorkbenchMessage({
taskId: created.taskId,
tabId: session.tabId,
await task.createWorkbenchSessionAndSend({
model: input.model,
text: input.task,
attachments: [],
waitForCompletion: true,
});
await task.getSessionDetail({
sessionId: session.tabId,
});
return {
taskId: created.taskId,
tabId: session.tabId,
};
return { taskId: created.taskId };
},
async markWorkbenchUnread(c: any, input: TaskWorkbenchSelectInput): Promise<void> {
@ -988,7 +962,7 @@ export const workspaceActions = {
const self = selfWorkspace(c);
await self.send(workspaceWorkflowQueueName("workspace.command.refreshProviderProfiles"), command ?? {}, {
wait: true,
timeout: 60_000,
timeout: 10_000,
});
},

View file

@ -5,6 +5,7 @@ import {
ensureCloned,
fetch,
listRemoteBranches,
listLocalRemoteRefs,
remoteDefaultBaseRef,
revParse,
ensureRemoteBranch,
@ -28,6 +29,8 @@ export interface GitDriver {
ensureCloned(remoteUrl: string, targetPath: string, options?: { githubToken?: string | null }): Promise<void>;
fetch(repoPath: string, options?: { githubToken?: string | null }): Promise<void>;
listRemoteBranches(repoPath: string, options?: { githubToken?: string | null }): Promise<BranchSnapshot[]>;
/** Read remote-tracking refs from the local clone without fetching. */
listLocalRemoteRefs(repoPath: string): Promise<BranchSnapshot[]>;
remoteDefaultBaseRef(repoPath: string): Promise<string>;
revParse(repoPath: string, ref: string): Promise<string>;
ensureRemoteBranch(repoPath: string, branchName: string, options?: { githubToken?: string | null }): Promise<void>;
@ -81,6 +84,7 @@ export function createDefaultDriver(): BackendDriver {
ensureCloned,
fetch,
listRemoteBranches,
listLocalRemoteRefs,
remoteDefaultBaseRef,
revParse,
ensureRemoteBranch,

View file

@ -208,11 +208,25 @@ export async function remoteDefaultBaseRef(repoPath: string): Promise<string> {
return "origin/main";
}
/**
* Fetch from origin, then read remote-tracking refs.
* Use when you need guaranteed-fresh branch data and can tolerate network I/O.
*/
export async function listRemoteBranches(repoPath: string, options?: GitAuthOptions): Promise<BranchSnapshot[]> {
await fetch(repoPath, options);
return listLocalRemoteRefs(repoPath);
}
/**
* Read remote-tracking refs (`refs/remotes/origin/*`) from the local clone
* without fetching. The data is only as fresh as the last fetch; use this
* when the branch sync actor keeps refs current and you want to avoid
* blocking on network I/O.
*/
export async function listLocalRemoteRefs(repoPath: string): Promise<BranchSnapshot[]> {
const { stdout } = await execFileAsync("git", ["-C", repoPath, "for-each-ref", "--format=%(refname:short) %(objectname)", "refs/remotes/origin"], {
maxBuffer: 1024 * 1024,
env: gitEnv(options),
env: gitEnv(),
});
return stdout

View file

@ -15,6 +15,7 @@ export function createTestGitDriver(overrides?: Partial<GitDriver>): GitDriver {
ensureCloned: async () => {},
fetch: async () => {},
listRemoteBranches: async () => [],
listLocalRemoteRefs: async () => [],
remoteDefaultBaseRef: async () => "origin/main",
revParse: async () => "abc1234567890",
ensureRemoteBranch: async () => {},

View file

@ -1,5 +1,5 @@
import { describe, expect, it } from "vitest";
import { shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js";
import { requireSendableSessionMeta, shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js";
describe("workbench unread status transitions", () => {
it("marks unread when a running session first becomes idle", () => {
@ -57,3 +57,30 @@ describe("workbench model changes", () => {
).toBe(false);
});
});
describe("workbench send readiness", () => {
it("rejects unknown tabs", () => {
expect(() => requireSendableSessionMeta(null, "tab-1")).toThrow("Unknown workbench tab: tab-1");
});
it("rejects pending sessions", () => {
expect(() =>
requireSendableSessionMeta(
{
status: "pending_session_create",
sandboxSessionId: null,
},
"tab-2",
),
).toThrow("Session is not ready (status: pending_session_create). Wait for session provisioning to complete.");
});
it("accepts ready sessions with a sandbox session id", () => {
const meta = {
status: "ready",
sandboxSessionId: "session-1",
};
expect(requireSendableSessionMeta(meta, "tab-3")).toBe(meta);
});
});

View file

@ -0,0 +1,381 @@
# Remove Local Git Clone from Backend
## Goal
The Foundry backend stores zero git state. No clones, no refs, no working trees, no git-spice. All git operations execute inside sandboxes. Repo metadata (branches, default branch, PRs) comes from the GitHub API and webhook data that the backend already receives.
## Terminology renames
Rename Foundry domain terms across the entire `foundry/` directory. All changes are breaking; no backwards compatibility is needed. Execute each rename as a separate atomic commit, in the numbered order below. `pnpm -w typecheck && pnpm -w build && pnpm -w test` must pass after each commit.
| New name | Old name (current code) |
|---|---|
| **Organization** | Workspace |
| **Repository** | Project |
| **Session** (not "tab") | Tab / Session (mixed) |
| **Subscription** | Interest |
| **SandboxProviderId** | ProviderId |
### Rename 1: `interest` → `subscription`
The realtime pub/sub system in `client/src/interest/`. Rename the directory, all types (`InterestManager` → `SubscriptionManager`, `MockInterestManager` → `MockSubscriptionManager`, `RemoteInterestManager` → `RemoteSubscriptionManager`, `DebugInterestTopic` → `DebugSubscriptionTopic`), the `useInterest` hook → `useSubscription`, and all imports in client + frontend. Rename `frontend/src/lib/interest.ts` → `subscription.ts`. Rename test file `client/test/interest-manager.test.ts` → `subscription-manager.test.ts`.
### Rename 2: `tab` → `session`
The UI "tab" concept is really a session. Rename `TabStrip` → `SessionStrip`, `tabId` → `sessionId`, `closeTab` → `closeSession`, `addTab` → `addSession`, `WorkbenchAgentTab` → `WorkbenchAgentSession`, `TaskWorkbenchTabInput` → `TaskWorkbenchSessionInput`, `TaskWorkbenchAddTabResponse` → `TaskWorkbenchAddSessionResponse`, and all related props/DOM attrs (`activeTabId` → `activeSessionId`, `onSwitchTab` → `onSwitchSession`, `onCloseTab` → `onCloseSession`, `data-tab` → `data-session`, `editingSessionTabId` → `editingSessionId`). Rename file `tab-strip.tsx` → `session-strip.tsx`. **Leave "diff tabs" alone** (`isDiffTab`, `diffTabId`) — those are file viewer panes, a different concept.
### Rename 3: `ProviderId` → `SandboxProviderId`
The `ProviderId` type (`"e2b" | "local"`) is specifically a sandbox provider. Rename the type (`ProviderId` → `SandboxProviderId`), schema (`ProviderIdSchema` → `SandboxProviderIdSchema`), and all `providerId` fields that refer to sandbox hosting (`CreateTaskInput`, `TaskRecord`, `SwitchResult`, `WorkbenchSandboxSummary`, task DB schema `task.provider_id` → `sandbox_provider_id`, `task_sandboxes.provider_id` → `sandbox_provider_id`, topic params). Rename config key `providers` → `sandboxProviders`. DB column renames need Drizzle migrations.
**Do NOT rename**: `model.provider` (AI model provider), `auth_account_index.provider_id` (auth provider), `providerAgent()` (model→agent mapping), `WorkbenchModelGroup.provider`.
Also **delete the `providerProfiles` table entirely** — it's written but never read (dead code). Remove the table definition from the organization actor DB schema, all writes in organization actions, and the `refreshProviderProfiles` queue command/handler/interface.
### Rename 4: `project` → `repository`
The "project" actor/entity is a git repository. Rename:
- Actor directory `actors/project/` → `actors/repository/`
- Actor directory `actors/project-branch-sync/` → `actors/repository-branch-sync/`
- Actor registry keys `project` → `repository`, `projectBranchSync` → `repositoryBranchSync`
- Actor name string `"Project"` → `"Repository"`
- All functions: `projectKey` → `repositoryKey`, `getOrCreateProject` → `getOrCreateRepository`, `getProject` → `getRepository`, `selfProject` → `selfRepository`, `projectBranchSyncKey` → `repositoryBranchSyncKey`, `projectPrSyncKey` → `repositoryPrSyncKey`, `projectWorkflowQueueName` → `repositoryWorkflowQueueName`
- Types: `ProjectInput` → `RepositoryInput`, `WorkbenchProjectSection` → `WorkbenchRepositorySection`, `PROJECT_QUEUE_NAMES` → `REPOSITORY_QUEUE_NAMES`
- Queue names: `"project.command.*"` → `"repository.command.*"`
- Actor key strings: change `"project"` to `"repository"` in key arrays (e.g. `["ws", id, "project", repoId]` → `["org", id, "repository", repoId]`)
- Frontend: `projects` → `repositories`, `collapsedProjects` → `collapsedRepositories`, `hoveredProjectId` → `hoveredRepositoryId`, `PROJECT_COLORS` → `REPOSITORY_COLORS`, `data-project-*` → `data-repository-*`, `groupWorkbenchProjects` → `groupWorkbenchRepositories`
- Client keys: `projectKey()` → `repositoryKey()`, `projectBranchSyncKey()` → `repositoryBranchSyncKey()`, `projectPrSyncKey()` → `repositoryPrSyncKey()`
### Rename 5: `workspace` → `organization`
The "workspace" is really an organization. Rename:
- Actor directory `actors/workspace/` → `actors/organization/`
- Actor registry key `workspace` → `organization`
- Actor name string `"Workspace"` → `"Organization"`
- All types: `WorkspaceIdSchema` → `OrganizationIdSchema`, `WorkspaceId` → `OrganizationId`, `WorkspaceEvent` → `OrganizationEvent`, `WorkspaceSummarySnapshot` → `OrganizationSummarySnapshot`, `WorkspaceUseInputSchema` → `OrganizationUseInputSchema`, `WorkspaceHandle` → `OrganizationHandle`, `WorkspaceTopicParams` → `OrganizationTopicParams`
- All `workspaceId` fields/params → `organizationId` (~20+ schemas in contracts.ts, plus topic params, task snapshot, etc.)
- `FoundryOrganization.workspaceId` → `FoundryOrganization.organizationId` (or just `id`)
- All functions: `workspaceKey` → `organizationKey`, `getOrCreateWorkspace` → `getOrCreateOrganization`, `selfWorkspace` → `selfOrganization`, `resolveWorkspaceId` → `resolveOrganizationId`, `defaultWorkspace` → `defaultOrganization`, `workspaceWorkflowQueueName` → `organizationWorkflowQueueName`, `WORKSPACE_QUEUE_NAMES` → `ORGANIZATION_QUEUE_NAMES`
- Actor key strings: change `"ws"` to `"org"` in key arrays (e.g. `["ws", id]` → `["org", id]`)
- Queue names: `"workspace.command.*"` → `"organization.command.*"`
- Topic keys: `"workspace:${id}"` → `"organization:${id}"`, event `"workspaceUpdated"` → `"organizationUpdated"`
- Methods: `connectWorkspace` → `connectOrganization`, `getWorkspaceSummary` → `getOrganizationSummary`, `useWorkspace` → `useOrganization`
- Files: `shared/src/workspace.ts` → `organization.ts`, `backend/src/config/workspace.ts` → `organization.ts`
- Config keys: `config.workspace.default` → `config.organization.default`
- URL paths: `/workspaces/$workspaceId` → `/organizations/$organizationId`
- UI strings: `"Loading workspace..."` → `"Loading organization..."`
- Tests: rename `workspace-*.test.ts` files, update `workspaceSnapshot()` → `organizationSnapshot()`, `workspaceId: "ws-1"` → `organizationId: "org-1"`
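The key-array changes in Rename 4 and Rename 5 can be sketched as follows. This is a minimal illustration only: the function names come from the rename lists above, but the signatures and the `ActorKey` type are assumptions, not the actual contents of `actors/keys.ts`.

```typescript
// Assumed shape: an actor key is an ordered array of string segments.
type ActorKey = readonly string[];

// After Rename 5 the organization prefix is "org" instead of "ws".
function organizationKey(organizationId: string): ActorKey {
  return ["org", organizationId];
}

// After Rename 4 the segment "project" becomes "repository".
function repositoryKey(organizationId: string, repoId: string): ActorKey {
  return [...organizationKey(organizationId), "repository", repoId];
}
```

Building `repositoryKey` on top of `organizationKey` keeps the `"org"` prefix in one place, so a later prefix change touches a single function.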
### After all renames: update CLAUDE.md files
Update `foundry/CLAUDE.md` and `foundry/packages/backend/CLAUDE.md` to use new terminology throughout (organization instead of workspace, repository instead of project, etc.). The rest of this spec already uses the new names.
## What gets deleted
### Entire directories/files
| Path (relative to `packages/backend/src/`) | Reason |
|---|---|
| `integrations/git/index.ts` | All local git operations |
| `integrations/git-spice/index.ts` | Stack management via git-spice |
| `actors/repository-branch-sync/` (currently `project-branch-sync/`) | Polling actor that fetches + reads local clone every 5s |
| `actors/project-pr-sync/` | Empty directory, already dead |
| `actors/repository/stack-model.ts` (currently `project/stack-model.ts`) | Stack parent/sort model (git-spice dependent) |
| `test/git-spice.test.ts` | Tests for deleted git-spice integration |
| `test/git-validate-remote.test.ts` | Tests for deleted git validation |
| `test/stack-model.test.ts` | Tests for deleted stack model |
### Driver interfaces removed from `driver.ts`
- `GitDriver` — entire interface deleted
- `StackDriver` — entire interface deleted
- `BackendDriver.git` — removed
- `BackendDriver.stack` — removed
- All imports from `integrations/git/` and `integrations/git-spice/`
`BackendDriver` keeps only `github` and `tmux`.
### Test driver cleanup (`test/helpers/test-driver.ts`)
- Delete `createTestGitDriver()`
- Delete `createTestStackDriver()`
- Remove `git` and `stack` from `createTestDriver()`
### Docker volume removed (`compose.dev.yaml`, `compose.preview.yaml`)
- Remove `foundry_git_repos` volume and its mount at `/root/.local/share/foundry/repos`
- Remove the CLAUDE.md note about the repos volume
### Actor registry cleanup (`actors/index.ts`, `actors/keys.ts`, `actors/handles.ts`)
- Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) registration
- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`)
- Remove branch sync handle helpers
### Client key cleanup (`packages/client/src/keys.ts`, `packages/client/test/keys.test.ts`)
- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) if exported
### Dead code removal: `providerProfiles` table
The `providerProfiles` table in the organization actor (currently workspace actor) DB is written but never read. Delete:
- Table definition in `actors/organization/db/schema.ts` (currently `workspace/db/schema.ts`)
- All writes in `actors/organization/actions.ts` (currently `workspace/actions.ts`)
- The `refreshProviderProfiles` queue command and handler
- The `RefreshProviderProfilesCommand` interface
- Add a DB migration to drop the `provider_profiles` table
### Ensure pattern cleanup (`actors/repository/actions.ts`, currently `project/actions.ts`)
Delete all `ensure*` functions that block action handlers on external I/O or cross-actor fan-out:
- **`ensureLocalClone()`** — Delete (git clone removal).
- **`ensureProjectReady()`** / **`ensureRepositoryReady()`** — Delete (wrapper around `ensureLocalClone` + sync actors).
- **`ensureProjectReadyForRead()`** / **`ensureRepositoryReadyForRead()`** — Delete (dispatches ensure with 10s wait on read path).
- **`ensureProjectSyncActors()`** / **`ensureRepositorySyncActors()`** — Delete (spawns branch sync actor which is being removed).
- **`forceProjectSync()`** / **`forceRepositorySync()`** — Delete (triggers branch sync actor).
- **`ensureTaskIndexHydrated()`** — Delete. This is the migration path from `HistoryActor` → `task_index` table. Since we assume fresh repositories, no migration needed. The task index is populated on write (`createTask` inserts the row).
- **`ensureTaskIndexHydratedForRead()`** — Delete (wrapper that dispatches `hydrateTaskIndex`).
- **`taskIndexHydrated` state flag** — Delete from repository actor state.
`ensureAskpassScript()` stays as-is because it is a fast local operation.
### Dead schema tables and helpers (`actors/repository/db/schema.ts`, `actors/repository/actions.ts`)
With the branch sync actor and git-spice stack operations deleted, these tables have no writer and should be removed:
- **`branches` table** — populated by `RepositoryBranchSyncActor` from the local clone. Delete the table, its schema definition, and all reads from it (including `enrichTaskRecord` which reads `diffStat`, `hasUnpushed`, `conflictsWithMain`, `parentBranch` from this table).
- **`repoActionJobs` table** — populated by `runRepoStackAction()` for git-spice stack operations. Delete the table, its schema definition, and all helpers: `ensureRepoActionJobsTable()`, `writeRepoActionJob()`, `listRepoActionJobRows()`.
## What gets modified
### `actors/repository/actions.ts` (currently `project/actions.ts`)
This is the biggest change. Current git operations in this file:
1. **`createTaskMutation()`** — Currently calls `listLocalRemoteRefs` to check branch name conflicts against remote branches. Replace: branch conflict checking uses only the repository actor's `task_index` table (which branches are already taken by tasks). We don't need to check against remote branches — if the branch already exists on the remote, `git push` in the sandbox will handle it.
2. **`registerTaskBranch()`** — Currently does `fetch` + `remoteDefaultBaseRef` + `revParse` + git-spice stack tracking. Replace: default base branch comes from GitHub repo metadata (already stored from webhook/API at repo add time). SHA resolution is not needed at task creation — the sandbox handles it. Delete all git-spice stack tracking.
3. **`getRepoOverview()`** — Currently calls `listLocalRemoteRefs` + `remoteDefaultBaseRef` + `stack.available` + `stack.listStack`. Replace: branch data comes from GitHub API data we already store from webhooks (push/create/delete events feed branch state). Stack data is deleted. The overview returns branches from stored GitHub webhook data.
4. **`runRepoStackAction()`** — Delete entirely (all git-spice stack operations).
5. **All `normalizeBaseBranchName` imports from git-spice** — Inline or move to a simple utility if still needed.
6. **All `ensureTaskIndexHydrated*` / `ensureRepositoryReady*` call sites** — Remove. Read actions query the `task_index` table directly; if it's empty, it's empty. Write actions populate it on create.
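The branch conflict check described in item 1 might look roughly like this once remote refs are out of the picture. It is a sketch under stated assumptions: `TaskIndexRow` and `assertBranchAvailable` are hypothetical names, and the rows are assumed to have already been read from the `task_index` table.

```typescript
// Hypothetical projection of a task_index row; the real columns may differ.
interface TaskIndexRow {
  taskId: string;
  branchName: string | null;
}

// Throws if another task already owns the branch name.
// Remote branches are deliberately not consulted.
function assertBranchAvailable(rows: readonly TaskIndexRow[], branchName: string): void {
  const owner = rows.find((row) => row.branchName === branchName);
  if (owner) {
    throw new Error(`Branch "${branchName}" is already used by task ${owner.taskId}`);
  }
}
```

The check is purely local state, so it involves no network I/O and no fetch.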
### `actors/repository/index.ts` (currently `project/index.ts`)
- Remove local clone path from state/initialization
- Remove branch sync actor spawning
- Remove any `ensureLocalClone` calls in lifecycle
### `actors/task/workbench.ts`
- **`ensureSandboxRepo()` line 405**: Currently calls `driver.git.remoteDefaultBaseRef()` on the local clone. Replace: read default branch from repository actor state (which gets it from GitHub API/webhook data at repo add time).
### `actors/organization/actions.ts` (currently `workspace/actions.ts`)
- **`addRemote()` line 320**: Currently calls `driver.git.validateRemote()` which runs `git ls-remote`. Replace: validate via GitHub API — `GET /repos/{owner}/{repo}` returns 404 for invalid repos. We already parse the remote URL into owner/repo for GitHub operations.
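A minimal sketch of the API-based validation, assuming the remote is a GitHub URL. `parseGithubRemote`, `validateGithubRemote`, and the injected `FetchLike` parameter are hypothetical names for illustration; only the `GET /repos/{owner}/{repo}` endpoint and its 404 behavior come from the text above.

```typescript
interface RepoRef {
  owner: string;
  repo: string;
}

// Accepts https://github.com/owner/repo(.git) and git@github.com:owner/repo(.git).
function parseGithubRemote(remoteUrl: string): RepoRef | null {
  const match = /github\.com[:/]([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/.exec(remoteUrl.trim());
  if (!match || !match[1] || !match[2]) return null;
  return { owner: match[1], repo: match[2] };
}

// Narrow fetch-like type so a fake can be injected in tests (an assumption).
type FetchLike = (url: string, init?: { headers?: Record<string, string> }) => Promise<{ status: number }>;

async function validateGithubRemote(remoteUrl: string, token: string | null, fetchImpl: FetchLike): Promise<RepoRef> {
  const ref = parseGithubRemote(remoteUrl);
  if (!ref) throw new Error(`Not a GitHub remote: ${remoteUrl}`);
  const headers: Record<string, string> = { Accept: "application/vnd.github+json" };
  if (token) headers.Authorization = `Bearer ${token}`;
  const res = await fetchImpl(`https://api.github.com/repos/${ref.owner}/${ref.repo}`, { headers });
  if (res.status === 404) throw new Error(`Repository not found: ${ref.owner}/${ref.repo}`);
  if (res.status !== 200) throw new Error(`Unexpected GitHub API status: ${res.status}`);
  return ref;
}
```

Note that GitHub also answers 404 for private repositories the token cannot see, so "not found" and "no access" are indistinguishable at this layer.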
### `actors/keys.ts` / `actors/handles.ts`
- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) export
- Remove branch sync handle creation
## What stays the same
- `driver.github.*` — already uses GitHub API, no changes
- `driver.tmux.*` — unrelated, no changes
- `integrations/github/index.ts` — already GitHub API based, keeps working
- All sandbox execution (`executeInSandbox()`) — already correct pattern
- Webhook handlers for push/create/delete events — already feed GitHub data into backend
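To illustrate how push/create/delete events can stand in for the local clone as the source of branch state, here is a hedged sketch. `BranchEvent` is a hypothetical normalized shape, not the raw GitHub webhook payload, and `applyBranchEvent` is not an existing function in the backend.

```typescript
// Hypothetical normalized events derived from GitHub webhooks.
type BranchEvent =
  | { kind: "create"; branch: string }
  | { kind: "push"; branch: string; headSha: string }
  | { kind: "delete"; branch: string };

// Branch name mapped to head SHA, or null when no push has been seen yet.
type BranchState = ReadonlyMap<string, string | null>;

// Pure reducer: returns a new map and leaves the input untouched.
function applyBranchEvent(state: BranchState, event: BranchEvent): Map<string, string | null> {
  const next = new Map(state);
  switch (event.kind) {
    case "create":
      if (!next.has(event.branch)) next.set(event.branch, null);
      break;
    case "push":
      next.set(event.branch, event.headSha);
      break;
    case "delete":
      next.delete(event.branch);
      break;
  }
  return next;
}
```

A pure reducer like this is easy to replay against a stored event log, which is one way the repository overview could be served without any fetch.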
## CLAUDE.md updates
### `foundry/packages/backend/CLAUDE.md`
Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) from the actor hierarchy tree:
```text
OrganizationActor
├─ HistoryActor(organization-scoped global feed)
├─ GithubDataActor
├─ RepositoryActor(repo)
│ └─ TaskActor(task)
│ ├─ TaskSessionActor(session) x N
│ │ └─ SessionStatusSyncActor(session) x 0..1
│ └─ Task-local workbench state
└─ SandboxInstanceActor(sandboxProviderId, sandboxId) x N
```
Add to Ownership Rules:
> - The backend stores no local git state. No clones, no refs, no working trees, no git-spice. Repo metadata (branches, default branch) comes from GitHub API and webhook events. All git operations that require a working tree execute inside sandboxes via `executeInSandbox()`.
### `foundry/CLAUDE.md`
Add a new section:
```markdown
## Git State Policy
- The backend stores **zero git state**. No local clones, no refs, no working trees, no git-spice.
- Repo metadata (branches, default branch, PRs) comes from GitHub API and webhook events already flowing into the system.
- All git operations that require a working tree (diff, push, conflict check, rev-parse) execute inside the task's sandbox via `executeInSandbox()`.
- Do not add local git clone paths, `git fetch`, `git for-each-ref`, or any direct git CLI calls to the backend. If you need git data, either read it from stored GitHub webhook/API data or run it in a sandbox.
- The `BackendDriver` has no `GitDriver` or `StackDriver`. Only `GithubDriver` and `TmuxDriver` remain.
- git-spice is not used anywhere in the system.
```
Remove from CLAUDE.md:
> - Docker dev: `compose.dev.yaml` mounts a named volume at `/root/.local/share/foundry/repos` to persist backend-managed git clones across restarts. Code must still work if this volume is not present (create directories as needed).
## Concerns
1. **Concurrent agent work**: Another agent is currently modifying `workspace/actions.ts`, `project/actions.ts`, `task/workbench.ts`, `task/workflow/init.ts`, `task/workflow/queue.ts`, `driver.ts`, and `project-branch-sync/index.ts`. Those changes are adding `listLocalRemoteRefs` to the driver and removing polling loops/timeouts. The git clone removal work will **delete** the code the other agent is modifying. Coordinate: let the other agent's changes land first, then this spec deletes the git integration entirely.
2. **Rename ordering**: The rename spec (workspace→organization, project→repository, etc.) should ideally land **before** this spec is executed, so the file paths and identifiers match. If not, the implementing agent should map old names → new names using the table above.
3. **`project-pr-sync/` directory**: This is already an empty directory. Delete it as part of cleanup.
4. **`ensureRepoActionJobsTable()`**: An earlier draft of this spec said this function should stay, but the `repoActionJobs` table is being deleted. Both the table and the ensure function should be deleted.
## Validation
After implementation, run:
```bash
pnpm -w typecheck
pnpm -w build
pnpm -w test
```
Then restart the dev stack and run the main user flow end-to-end:
```bash
just foundry-dev-down && just foundry-dev
```
Verify:
1. Add a repo to an organization
2. Create a task (should return immediately with taskId)
3. Task appears in sidebar with pending status
4. Task provisions and transitions to ready
5. Session is created and initial message is sent
6. Agent responds in the session transcript
This must work against a real GitHub repo (`rivet-dev/sandbox-agent-testing`) with the dev environment credentials.
### Codebase grep validation
After implementation, verify no local git operations or git-spice references remain in the backend:
```bash
# No local git CLI calls (excludes integrations/github which is GitHub API, not local git)
rg -l 'execFileAsync\("git"' foundry/packages/backend/src/ && echo "FAIL: local git CLI calls found" || echo "PASS"
# No git-spice references
rg -l 'git.spice|gitSpice|git_spice' foundry/packages/backend/src/ && echo "FAIL: git-spice references found" || echo "PASS"
# No GitDriver or StackDriver references
rg -l 'GitDriver|StackDriver' foundry/packages/backend/src/ && echo "FAIL: deleted driver interfaces still referenced" || echo "PASS"
# No local clone path references
rg -l 'localPath|ensureCloned|ensureLocalClone|foundryRepoClonePath' foundry/packages/backend/src/ && echo "FAIL: local clone references found" || echo "PASS"
# No branch sync actor references
rg -l 'BranchSync|branchSync|branch.sync' foundry/packages/backend/src/ && echo "FAIL: branch sync references found" || echo "PASS"
# No deleted ensure patterns
rg -l 'ensureProjectReady|ensureTaskIndexHydrated|taskIndexHydrated' foundry/packages/backend/src/ && echo "FAIL: deleted ensure patterns found" || echo "PASS"
# integrations/git/ and integrations/git-spice/ directories should not exist
ls foundry/packages/backend/src/integrations/git/index.ts 2>/dev/null && echo "FAIL: git integration not deleted" || echo "PASS"
ls foundry/packages/backend/src/integrations/git-spice/index.ts 2>/dev/null && echo "FAIL: git-spice integration not deleted" || echo "PASS"
```
All checks must pass before the change is considered complete.
### Rename verification
After the rename spec has landed, verify no old names remain anywhere in `foundry/`:
```bash
# --- workspace → organization ---
# No "WorkspaceActor", "WorkspaceEvent", "WorkspaceId", "WorkspaceSummary", etc. (exclude pnpm-workspace.yaml, node_modules, .turbo)
rg -l 'WorkspaceActor|WorkspaceEvent|WorkspaceId|WorkspaceSummary|WorkspaceHandle|WorkspaceUseInput|WorkspaceTopicParams' foundry/packages/ && echo "FAIL: workspace type references remain" || echo "PASS"
# No workspaceId in domain code (exclude pnpm-workspace, node_modules, .turbo, this spec file)
rg -l 'workspaceId' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspaceId references remain" || echo "PASS"
# No workspace actor directory
ls foundry/packages/backend/src/actors/workspace/ 2>/dev/null && echo "FAIL: workspace actor directory not renamed" || echo "PASS"
# No workspaceKey function
rg 'workspaceKey|selfWorkspace|getOrCreateWorkspace|resolveWorkspaceId|defaultWorkspace' foundry/packages/ --glob '!node_modules' && echo "FAIL: workspace function references remain" || echo "PASS"
# No "ws" actor key string (the old key prefix)
rg '"\\"ws\\""|\["ws"' foundry/packages/ --glob '!node_modules' && echo "FAIL: old 'ws' actor key strings remain" || echo "PASS"
# No workspace queue names
rg 'workspace\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspace queue names remain" || echo "PASS"
# No /workspaces/ URL paths
rg '/workspaces/' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: /workspaces/ URL paths remain" || echo "PASS"
# No config.workspace
rg 'config\.workspace' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: config.workspace references remain" || echo "PASS"
# --- project → repository ---
# No ProjectActor, ProjectInput, ProjectSection, etc.
rg -l 'ProjectActor|ProjectInput|ProjectSection|PROJECT_QUEUE|PROJECT_COLORS' foundry/packages/ --glob '!node_modules' && echo "FAIL: project type references remain" || echo "PASS"
# No project actor directory
ls foundry/packages/backend/src/actors/project/ 2>/dev/null && echo "FAIL: project actor directory not renamed" || echo "PASS"
# No projectKey, selfProject, getOrCreateProject, etc.
rg 'projectKey|selfProject|getOrCreateProject|getProject\b|projectBranchSync|projectPrSync|projectWorkflow' foundry/packages/ --glob '!node_modules' && echo "FAIL: project function references remain" || echo "PASS"
# No "project" actor key string
rg '"\\"project\\""|\[".*"project"' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: old project actor key strings remain" || echo "PASS"
# No project.command.* queue names
rg 'project\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: project queue names remain" || echo "PASS"
# --- tab → session ---
# No WorkbenchAgentTab, TaskWorkbenchTabInput, TabStrip, tabId (in workbench context)
rg -l 'WorkbenchAgentTab|TaskWorkbenchTabInput|TaskWorkbenchAddTabResponse|TabStrip' foundry/packages/ --glob '!node_modules' && echo "FAIL: tab type references remain" || echo "PASS"
# No tabId (should be sessionId now)
rg '\btabId\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: tabId references remain" || echo "PASS"
# No tab-strip.tsx file
ls foundry/packages/frontend/src/components/mock-layout/tab-strip.tsx 2>/dev/null && echo "FAIL: tab-strip.tsx not renamed" || echo "PASS"
# No closeTab/addTab (should be closeSession/addSession)
rg '\bcloseTab\b|\baddTab\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: closeTab/addTab references remain" || echo "PASS"
# --- interest → subscription ---
# No InterestManager, useInterest, etc.
rg -l 'InterestManager|useInterest|DebugInterestTopic' foundry/packages/ --glob '!node_modules' && echo "FAIL: interest type references remain" || echo "PASS"
# No interest/ directory
ls foundry/packages/client/src/interest/ 2>/dev/null && echo "FAIL: interest directory not renamed" || echo "PASS"
# --- ProviderId → SandboxProviderId ---
# No bare ProviderId/ProviderIdSchema (but allow sandboxProviderId, model.provider, auth provider_id)
rg '\bProviderIdSchema\b|\bProviderId\b' foundry/packages/shared/src/contracts.ts && echo "FAIL: bare ProviderId in contracts.ts" || echo "PASS"
# No bare providerId for sandbox context (check task schema)
rg '\bproviderId\b' foundry/packages/backend/src/actors/task/db/schema.ts && echo "FAIL: bare providerId in task schema" || echo "PASS"
# No providerProfiles table (dead code, should be deleted)
rg 'providerProfiles|provider_profiles|refreshProviderProfiles' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: providerProfiles references remain" || echo "PASS"
# --- Verify new names exist ---
# `grep .` fails when rg prints nothing, so the WARN fallback can actually fire (the exit status of `head` alone is always 0)
rg -l 'OrganizationActor|OrganizationEvent|OrganizationId' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new organization names not found"
rg -l 'RepositoryActor|RepositoryInput|RepositorySection' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new repository names not found"
rg -l 'SubscriptionManager|useSubscription' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new subscription names not found"
rg -l 'SandboxProviderIdSchema|SandboxProviderId' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new sandbox provider names not found"
```
All checks must pass. False positives from markdown files, comments referencing old names in migration context, or `node_modules` should be excluded via the globs above.

37
sdks/CLAUDE.md Normal file
View file

@ -0,0 +1,37 @@
# SDK Instructions
## TypeScript SDK Architecture
- TypeScript clients are split into:
- `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers.
- `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers.
- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`.
- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`.
- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`.
- Cleanup is `sdk.dispose()`.
### React Component Methodology
- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector.
- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations.
- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks.
- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`.
- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`.
### TypeScript SDK Naming Conventions
- Use `respond<Thing>(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent.
- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions.
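The two naming rules can be illustrated with a toy client. This is a sketch only: `ExampleClient`, `PermissionReply`, and the `{ outcome }` payload are invented for illustration and are not the real `sandbox-agent` SDK types or the ACP wire format.

```typescript
// Hypothetical reply type; the real SDK's permission reply type may differ.
type PermissionReply = "allow" | "deny";

class ExampleClient {
  // Records outgoing responses so the example stays self-contained.
  readonly sent: Array<{ id: string; result: unknown }> = [];

  // raw* escape hatch: takes a protocol-level payload as-is.
  rawRespondPermission(id: string, result: unknown): void {
    this.sent.push({ id, result });
  }

  // respond<Thing>(id, reply): typed wrapper that builds the payload.
  respondPermission(id: string, reply: PermissionReply): void {
    this.rawRespondPermission(id, { outcome: reply });
  }
}
```

Layering the typed method on top of the raw one keeps a single send path, so both variants behave identically on the wire.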
### Docs Source Of Truth
- For TypeScript docs/examples, source of truth is implementation in:
- `sdks/typescript/src/client.ts`
- `sdks/typescript/src/index.ts`
- `sdks/acp-http-client/src/index.ts`
- Do not document TypeScript APIs unless they are exported and implemented in those files.
## Tests
- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior.
- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests.

View file

@ -1,18 +1,47 @@
# Server Instructions
## Architecture
## ACP v1 Baseline
- Public API routes are defined in `server/packages/sandbox-agent/src/router.rs`.
- ACP proxy runtime is in `server/packages/sandbox-agent/src/acp_proxy_runtime.rs`.
- All API endpoints are under `/v1`.
- Keep binary filesystem transfer endpoints as dedicated HTTP APIs:
- v1 is ACP-native.
- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`).
- `/opencode/*` is disabled during ACP core phases and returns `503`.
- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`:
- `POST /v1/rpc`
- `GET /v1/rpc` (SSE)
- `DELETE /v1/rpc`
- Control-plane endpoints:
- `GET /v1/health`
- `GET /v1/agents`
- `POST /v1/agents/{agent}/install`
- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods):
- `GET /v1/fs/file`
- `PUT /v1/fs/file`
- `POST /v1/fs/upload-batch`
- Rationale: host-owned cross-agent-consistent behavior and large binary transfer needs that ACP JSON-RPC is not suited to stream efficiently.
- Maintain ACP variants in parallel only when they share the same underlying filesystem implementation; SDK defaults should still prefer HTTP for large/binary transfers.
- `/opencode/*` stays disabled (`503`) until Phase 7.
- Agent install logic (native + ACP agent process + lazy install) is handled by `server/packages/agent-management/`.
- Sandbox Agent ACP extension method naming:
- Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`).
- Session detach method is `_sandboxagent/session/detach`.
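
As a rough sketch of the baseline above, the snippet below checks the extension-method naming rule and builds a JSON-RPC envelope for `POST /v1/rpc`. The helper names, the `sessionId` parameter, and the minimal header set are assumptions made for illustration; only the route, the `_sandboxagent/...` prefix rule, and the detach method name come from the rules above.

```typescript
// Prefix for Sandbox Agent custom ACP methods, per the naming rule above.
const EXTENSION_PREFIX = "_sandboxagent/";

// True when a method name uses `_sandboxagent/...` and does not embed a
// version segment such as `_sandboxagent/v1/...`.
function isValidExtensionMethod(method: string): boolean {
  if (method.indexOf(EXTENSION_PREFIX) !== 0) return false;
  const rest = method.slice(EXTENSION_PREFIX.length);
  return rest.length > 0 && !/^v\d+\//.test(rest);
}

interface RpcHttpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the HTTP request without sending it.
function buildDetachRequest(baseUrl: string, id: number, sessionId: string): RpcHttpRequest {
  return {
    method: "POST",
    url: baseUrl.replace(/\/+$/, "") + "/v1/rpc",
    // Minimal headers; the real transport may require more.
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: id,
      method: "_sandboxagent/session/detach",
      params: { sessionId: sessionId }, // assumed params shape
    }),
  };
}
```

The resulting descriptor could be handed to any HTTP client; keeping construction separate from sending makes the envelope easy to inspect.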
## API Scope
- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent.
- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP.
- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities).
- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP.
- Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent.
- ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior.
- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP:
- These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior.
- They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream.
- This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`.
- ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations.
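
To make the HTTP preference concrete, here is a small sketch of how a client might describe the three binary transfer requests. The `path` query parameter and the helper names are assumptions, not confirmed parts of the contract; only the methods and routes are taken from the list above.

```typescript
interface FsHttpRequest {
  method: "GET" | "PUT" | "POST";
  url: string;
}

// Strip trailing slashes so route concatenation stays predictable.
function trimBase(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// Assumes the file is addressed by a `path` query parameter (hypothetical).
function fsFileRequest(baseUrl: string, method: "GET" | "PUT", path: string): FsHttpRequest {
  return {
    method: method,
    url: trimBase(baseUrl) + "/v1/fs/file?path=" + encodeURIComponent(path),
  };
}

// Batch upload uses its own route; the payload format is not shown here.
function fsUploadBatchRequest(baseUrl: string): FsHttpRequest {
  return { method: "POST", url: trimBase(baseUrl) + "/v1/fs/upload-batch" };
}
```

Request bodies are deliberately left out: binary content would be streamed over plain HTTP rather than wrapped in a JSON-RPC envelope.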
## Architecture
- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs`
- ACP proxy runtime: `server/packages/sandbox-agent/src/acp_proxy_runtime.rs`
- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs`
- Agent install logic (native + ACP agent process + lazy install): `server/packages/agent-management/`
- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/`
## API Contract Rules
@ -21,6 +50,24 @@
- Regenerate `docs/openapi.json` after endpoint contract changes.
- Keep CLI and HTTP endpoint behavior aligned (`docs/cli.mdx`).
## ACP Protocol Compliance
- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`.
- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`).
- Do not invent ACP properties or categories (e.g. `permission_mode` is not a valid ACP category; use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`).
- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it.
- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire.
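
The category rule above lends itself to a small guard. The sketch below is one possible check, with `isValidConfigOptionCategory` as a hypothetical helper name; it encodes only the values listed above and is not a substitute for verifying against the ACP spec.

```typescript
// Standard category values listed in the rules above.
const STANDARD_CATEGORIES: readonly string[] = ["mode", "model", "thought_level", "other"];

// Accepts the standard values, or a custom category prefixed with `_`.
function isValidConfigOptionCategory(category: string): boolean {
  if (STANDARD_CATEGORIES.indexOf(category) !== -1) return true;
  return category.length > 1 && category.charAt(0) === "_";
}
```

With this check, `permission_mode` is rejected while `_permission_mode` passes, matching the example in the rules.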
## Source Documents
- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt`
- `~/misc/acp-docs/schema/schema.json`
- `~/misc/acp-docs/schema/meta.json`
- `research/acp/spec.md`
- `research/acp/v1-schema-to-acp-mapping.md`
- `research/acp/friction.md`
- `research/acp/todo.md`
## Tests
Primary v1 integration coverage:
@ -38,3 +85,9 @@ cargo test -p sandbox-agent --test v1_agent_process_matrix
- Keep `research/acp/spec.md` as the source spec.
- Update `research/acp/todo.md` when scope/status changes.
- Log blockers/decisions in `research/acp/friction.md`.
## Docker Examples (Dev Testing)
- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples.
- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image.
- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`