Mirror of https://github.com/harivansh-afk/sandbox-agent.git (synced 2026-04-15 19:05:18 +00:00)
Commit 3263d4f5e1 (parent 400f9a214e): wip
18 changed files with 677 additions and 329 deletions
98
CLAUDE.md
@ -1,40 +1,5 @@
# Instructions

## ACP v1 Baseline

- v1 is ACP-native.
- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`).
- `/opencode/*` is disabled during ACP core phases and returns `503`.
- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`:
  - `POST /v1/rpc`
  - `GET /v1/rpc` (SSE)
  - `DELETE /v1/rpc`
- Control-plane endpoints:
  - `GET /v1/health`
  - `GET /v1/agents`
  - `POST /v1/agents/{agent}/install`
- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods):
  - `GET /v1/fs/file`
  - `PUT /v1/fs/file`
  - `POST /v1/fs/upload-batch`
- Sandbox Agent ACP extension method naming:
  - Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`).
  - Session detach method is `_sandboxagent/session/detach`.

## API Scope

- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent.
- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP.
- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities).
- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP.
- Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent.
- ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior.
- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP:
  - These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior.
  - They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream.
  - This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`.
  - ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations.

## Naming and Ownership

- This repository/product is **Sandbox Agent**.
|
||||
|
|
@ -49,66 +14,13 @@
- Never expose underlying protocol method names (e.g. `session/request_permission`, `session/create`, `_sandboxagent/session/detach`) in non-ACP docs. Describe the behavior in user-facing terms instead.
- Do not describe the underlying protocol implementation in docs. Only document the SDK surface (methods, types, options). ACP protocol details belong exclusively in ACP-specific pages.

## Architecture (Brief)
### Docs Source Of Truth (HTTP/CLI)

- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs`
- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs`
- Agent/native + ACP agent process install and lazy install: `server/packages/agent-management/`
- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/`

## TypeScript SDK Architecture

- TypeScript clients are split into:
  - `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers.
  - `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers.
- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`.
- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`.
- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`.
- Cleanup is `sdk.dispose()`.

### React Component Methodology

- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector.
- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations.
- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks.
- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`.
- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`.

### TypeScript SDK Naming Conventions

- Use `respond<Thing>(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent.
- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions.

### Docs Source Of Truth

- For TypeScript docs/examples, source of truth is implementation in:
  - `sdks/typescript/src/client.ts`
  - `sdks/typescript/src/index.ts`
  - `sdks/acp-http-client/src/index.ts`
- Do not document TypeScript APIs unless they are exported and implemented in those files.
- For HTTP/CLI docs/examples, source of truth is:
  - `server/packages/sandbox-agent/src/router.rs`
  - `server/packages/sandbox-agent/src/cli.rs`
- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions` APIs).

## ACP Protocol Compliance

- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`.
- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`).
- Do not invent ACP properties or categories (e.g. `permission_mode` is not a valid ACP category; use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`).
- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it.
- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire.
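
The category rule in the list above can be expressed as a small guard. This is a minimal sketch, not part of the SDK: `STANDARD_CATEGORIES` and `isValidConfigOptionCategory` are hypothetical names, and requiring at least one character after the `_` prefix is an assumption that the text above does not state.

```typescript
// Hypothetical helper (not an SDK export): checks a category string against
// the rule stated above.
const STANDARD_CATEGORIES: readonly string[] = ["mode", "model", "thought_level", "other"];

function isValidConfigOptionCategory(category: string): boolean {
  if (STANDARD_CATEGORIES.indexOf(category) !== -1) {
    return true;
  }
  // Custom categories carry a leading underscore. Assumption: a bare "_"
  // with nothing after it is rejected.
  return category.charAt(0) === "_" && category.length > 1;
}
```

Under this sketch, `isValidConfigOptionCategory("_permission_mode")` is accepted while `isValidConfigOptionCategory("permission_mode")` is rejected, matching the example in the list.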

## Source Documents

- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt`
- `~/misc/acp-docs/schema/schema.json`
- `~/misc/acp-docs/schema/meta.json`
- `research/acp/spec.md`
- `research/acp/v1-schema-to-acp-mapping.md`
- `research/acp/friction.md`
- `research/acp/todo.md`

## Change Tracking

- If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
|
||||
|
|
@ -119,14 +31,6 @@
- Append blockers/decisions to `research/acp/friction.md` during ACP work.
- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.
- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior.
- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests.

## Docker Examples (Dev Testing)

- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples.
- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image.
- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`

## Install Version References
|
||||
|
||||
|
|
|
|||
|
|
@ -224,6 +224,12 @@ Examples:

Never use `wait: true` for operations that depend on external readiness, sandbox I/O, agent responses, git network operations, polling loops, or long-running queue drains. Never hold an action open while waiting for an external system to become ready; that is a polling/retry loop in disguise.

### Timeout policy

All `wait: true` sends must have an explicit `timeout`. Maximum timeout for any `wait: true` send is **10 seconds** (`10_000`). If an operation cannot reliably complete within 10 seconds, it must be restructured: write the initial record to the DB, return it to the caller, and continue the work asynchronously with `wait: false`. The client observes completion via push events.

`wait: false` sends do not need a timeout (the enqueue is instant; the work runs in the workflow loop with its own step-level timeouts).
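
The timeout policy can be checked mechanically. This is a minimal sketch under stated assumptions: `SendOptions`, `MAX_WAIT_TIMEOUT_MS`, and `assertSendPolicy` are hypothetical names introduced for illustration, and the option shape only mirrors the `{ wait, timeout }` objects passed to `send` in this codebase.

```typescript
// Hypothetical option shape mirroring the `{ wait, timeout }` objects passed to `send`.
interface SendOptions {
  wait: boolean;
  timeout?: number; // milliseconds
}

// The 10-second ceiling from the policy text.
const MAX_WAIT_TIMEOUT_MS = 10_000;

// Returns the options unchanged when they satisfy the policy, otherwise throws.
function assertSendPolicy(options: SendOptions): SendOptions {
  if (!options.wait) {
    // Fire-and-forget sends need no timeout: the enqueue itself is instant.
    return options;
  }
  if (options.timeout === undefined) {
    throw new Error("wait: true requires an explicit timeout");
  }
  if (options.timeout > MAX_WAIT_TIMEOUT_MS) {
    throw new Error(`timeout ${options.timeout}ms exceeds the ${MAX_WAIT_TIMEOUT_MS}ms limit`);
  }
  return options;
}
```

With this guard, `assertSendPolicy({ wait: true, timeout: 5 * 60_000 })` throws, which signals that the operation should be restructured to use `wait: false` rather than given a longer timeout.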

### Task creation: resolve metadata before creating the actor

When creating a task, all deterministic metadata (title, branch name) must be resolved synchronously in the parent actor (project) *before* the task actor is created. The task actor must never be created with null `branchName` or `title`.
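
As an illustration of resolving metadata up front, the sketch below derives a title and a collision-free branch name from inputs that are already available locally. It is not the helper used in this codebase: `resolveNaming`, `slugify`, `NamingInput`, and the numeric-suffix collision strategy are all assumptions made for this example.

```typescript
// Hypothetical input: the task text, optional explicit names, and the branch
// names already known locally (cached remote refs plus reserved task branches).
interface NamingInput {
  task: string;
  explicitTitle?: string;
  explicitBranchName?: string;
  takenBranches: string[];
}

interface Naming {
  title: string;
  branchName: string;
}

// Lowercases the text and collapses every run of other characters into "-".
function slugify(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Pure and synchronous: no network access, so the parent can call it before
// creating the task actor and always pass non-null values.
function resolveNaming(input: NamingInput): Naming {
  const title = input.explicitTitle?.trim() || input.task.trim() || "Untitled task";
  const base = input.explicitBranchName?.trim() || slugify(title) || "task";
  let branchName = base;
  // Assumed collision strategy: append -2, -3, ... until the name is free.
  for (let n = 2; input.takenBranches.indexOf(branchName) !== -1; n++) {
    branchName = `${base}-${n}`;
  }
  return { title, branchName };
}
```

Because the function always returns non-empty strings, a caller that resolves naming this way never has to create the task actor with a null `branchName` or `title`.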
|
||||
|
|
|
|||
|
|
@ -14,6 +14,10 @@ services:
|
|||
HF_BACKEND_HOST: "0.0.0.0"
|
||||
HF_BACKEND_PORT: "7741"
|
||||
RIVETKIT_STORAGE_PATH: "/root/.local/share/foundry/rivetkit"
|
||||
RIVET_LOG_ERROR_STACK: "${RIVET_LOG_ERROR_STACK:-1}"
|
||||
RIVET_LOG_LEVEL: "${RIVET_LOG_LEVEL:-debug}"
|
||||
RIVET_LOG_TIMESTAMP: "${RIVET_LOG_TIMESTAMP:-1}"
|
||||
FOUNDRY_LOG_LEVEL: "${FOUNDRY_LOG_LEVEL:-debug}"
|
||||
# Pass through credentials needed for agent execution + PR creation in dev/e2e.
|
||||
# Do not hardcode secrets; set these in your environment when starting compose.
|
||||
ANTHROPIC_API_KEY: "${ANTHROPIC_API_KEY:-}"
|
||||
|
|
|
|||
|
|
@ -156,7 +156,7 @@ export const projectBranchSync = actor({
     async force(c): Promise<void> {
       const self = selfProjectBranchSync(c);
-      await self.send(CONTROL.force, {}, { wait: true, timeout: 5 * 60_000 });
+      await self.send(CONTROL.force, {}, { wait: true, timeout: 10_000 });
     },
   },
   run: workflow(async (ctx) => {
|
||||
|
|
|
|||
|
|
@ -11,7 +11,7 @@ import { resolveWorkspaceGithubAuth } from "../../services/github-auth.js";
|
|||
import { expectQueueResponse } from "../../services/queue.js";
|
||||
import { withRepoGitLock } from "../../services/repo-git-lock.js";
|
||||
import { branches, taskIndex, repoActionJobs, repoMeta } from "./db/schema.js";
|
||||
import { deriveFallbackTitle } from "../../services/create-flow.js";
|
||||
import { deriveFallbackTitle, resolveCreateFlowDecision } from "../../services/create-flow.js";
|
||||
import { normalizeBaseBranchName } from "../../integrations/git-spice/index.js";
|
||||
import { sortBranchesForOverview } from "./stack-model.js";
|
||||
|
||||
|
|
@ -416,37 +416,81 @@ async function hydrateTaskIndexMutation(c: any, _cmd?: HydrateTaskIndexCommand):
|
|||
}
|
||||
|
||||
async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise<TaskRecord> {
|
||||
const workspaceId = c.state.workspaceId;
|
||||
const repoId = c.state.repoId;
|
||||
const repoRemote = c.state.remoteUrl;
|
||||
const onBranch = cmd.onBranch?.trim() || null;
|
||||
const initialBranchName = onBranch;
|
||||
const initialTitle = onBranch ? deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined) : null;
|
||||
const taskId = randomUUID();
|
||||
let initialBranchName: string | null = null;
|
||||
let initialTitle: string | null = null;
|
||||
|
||||
if (onBranch) {
|
||||
initialBranchName = onBranch;
|
||||
initialTitle = deriveFallbackTitle(cmd.task, cmd.explicitTitle ?? undefined);
|
||||
|
||||
await registerTaskBranchMutation(c, {
|
||||
taskId,
|
||||
branchName: onBranch,
|
||||
requireExistingRemote: true,
|
||||
});
|
||||
} else {
|
||||
const localPath = await ensureProjectReady(c);
|
||||
const { driver } = getActorRuntimeContext();
|
||||
|
||||
// Read locally cached remote-tracking refs — no network fetch.
|
||||
// The branch sync actor keeps these reasonably fresh. If a rare naming
|
||||
// collision occurs with a very recently created remote branch, it will
|
||||
// be caught lazily on push/checkout.
|
||||
const remoteBranches = (await driver.git.listLocalRemoteRefs(localPath)).map((branch: any) => branch.branchName);
|
||||
|
||||
await ensureTaskIndexHydrated(c);
|
||||
const reservedBranchRows = await c.db.select({ branchName: taskIndex.branchName }).from(taskIndex).where(isNotNull(taskIndex.branchName)).all();
|
||||
const reservedBranches = reservedBranchRows
|
||||
.map((row: { branchName: string | null }) => row.branchName)
|
||||
.filter((branchName): branchName is string => typeof branchName === "string" && branchName.length > 0);
|
||||
|
||||
const resolved = resolveCreateFlowDecision({
|
||||
task: cmd.task,
|
||||
explicitTitle: cmd.explicitTitle ?? undefined,
|
||||
explicitBranchName: cmd.explicitBranchName ?? undefined,
|
||||
localBranches: remoteBranches,
|
||||
taskBranches: reservedBranches,
|
||||
});
|
||||
|
||||
initialBranchName = resolved.branchName;
|
||||
initialTitle = resolved.title;
|
||||
|
||||
const now = Date.now();
|
||||
await c.db
|
||||
.insert(taskIndex)
|
||||
.values({
|
||||
taskId,
|
||||
branchName: resolved.branchName,
|
||||
createdAt: now,
|
||||
updatedAt: now,
|
||||
})
|
||||
.onConflictDoNothing()
|
||||
.run();
|
||||
}
|
||||
|
||||
let task: Awaited<ReturnType<typeof getOrCreateTask>>;
|
||||
try {
|
||||
task = await getOrCreateTask(c, c.state.workspaceId, c.state.repoId, taskId, {
|
||||
workspaceId: c.state.workspaceId,
|
||||
repoId: c.state.repoId,
|
||||
task = await getOrCreateTask(c, workspaceId, repoId, taskId, {
|
||||
workspaceId,
|
||||
repoId,
|
||||
taskId,
|
||||
repoRemote: c.state.remoteUrl,
|
||||
repoRemote,
|
||||
branchName: initialBranchName,
|
||||
title: initialTitle,
|
||||
task: cmd.task,
|
||||
providerId: cmd.providerId,
|
||||
agentType: cmd.agentType,
|
||||
explicitTitle: onBranch ? null : cmd.explicitTitle,
|
||||
explicitBranchName: onBranch ? null : cmd.explicitBranchName,
|
||||
explicitTitle: null,
|
||||
explicitBranchName: null,
|
||||
initialPrompt: cmd.initialPrompt,
|
||||
});
|
||||
} catch (error) {
|
||||
if (onBranch) {
|
||||
if (initialBranchName) {
|
||||
await c.db
|
||||
.delete(taskIndex)
|
||||
.where(eq(taskIndex.taskId, taskId))
|
||||
|
|
@ -456,28 +500,14 @@ async function createTaskMutation(c: any, cmd: CreateTaskCommand): Promise<TaskR
|
|||
throw error;
|
||||
}
|
||||
|
||||
if (!onBranch) {
|
||||
const now = Date.now();
|
||||
await c.db
|
||||
.insert(taskIndex)
|
||||
.values({
|
||||
taskId,
|
||||
branchName: initialBranchName,
|
||||
createdAt: now,
|
||||
updatedAt: now,
|
||||
})
|
||||
.onConflictDoNothing()
|
||||
.run();
|
||||
}
|
||||
|
||||
const created = await task.initialize({ providerId: cmd.providerId });
|
||||
|
||||
const history = await getOrCreateHistory(c, c.state.workspaceId, c.state.repoId);
|
||||
const history = await getOrCreateHistory(c, workspaceId, repoId);
|
||||
await history.append({
|
||||
kind: "task.created",
|
||||
taskId,
|
||||
payload: {
|
||||
repoId: c.state.repoId,
|
||||
repoId,
|
||||
providerId: cmd.providerId,
|
||||
},
|
||||
});
|
||||
|
|
@ -919,7 +949,7 @@ export const projectActions = {
|
|||
return expectQueueResponse<EnsureProjectResult>(
|
||||
await self.send(projectWorkflowQueueName("project.command.ensure"), cmd, {
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
}),
|
||||
);
|
||||
},
|
||||
|
|
@ -929,7 +959,7 @@ export const projectActions = {
|
|||
return expectQueueResponse<TaskRecord>(
|
||||
await self.send(projectWorkflowQueueName("project.command.createTask"), cmd, {
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
}),
|
||||
);
|
||||
},
|
||||
|
|
@ -947,7 +977,7 @@ export const projectActions = {
|
|||
return expectQueueResponse<{ branchName: string; headSha: string }>(
|
||||
await self.send(projectWorkflowQueueName("project.command.registerTaskBranch"), cmd, {
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
}),
|
||||
);
|
||||
},
|
||||
|
|
@ -956,7 +986,7 @@ export const projectActions = {
|
|||
const self = selfProject(c);
|
||||
await self.send(projectWorkflowQueueName("project.command.hydrateTaskIndex"), cmd ?? {}, {
|
||||
wait: true,
|
||||
timeout: 60_000,
|
||||
timeout: 10_000,
|
||||
});
|
||||
},
|
||||
|
||||
|
|
@ -1225,7 +1255,7 @@ export const projectActions = {
|
|||
const self = selfProject(c);
|
||||
await self.send(projectWorkflowQueueName("project.command.applyBranchSyncResult"), body, {
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
});
|
||||
},
|
||||
};
|
||||
|
|
|
|||
|
|
@ -101,14 +101,15 @@ interface TaskWorkbenchSendMessageCommand {
|
|||
attachments: Array<any>;
|
||||
}
|
||||
|
||||
interface TaskWorkbenchSendMessageActionInput extends TaskWorkbenchSendMessageInput {
|
||||
waitForCompletion?: boolean;
|
||||
}
|
||||
|
||||
interface TaskWorkbenchCreateSessionCommand {
|
||||
model?: string;
|
||||
}
|
||||
|
||||
interface TaskWorkbenchCreateSessionAndSendCommand {
|
||||
model?: string;
|
||||
text: string;
|
||||
}
|
||||
|
||||
interface TaskWorkbenchSessionCommand {
|
||||
sessionId: string;
|
||||
}
|
||||
|
|
@ -143,7 +144,7 @@ export const task = actor({
|
|||
const self = selfTask(c);
|
||||
const result = await self.send(taskWorkflowQueueName("task.command.initialize"), cmd ?? {}, {
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
});
|
||||
return expectQueueResponse<TaskRecord>(result);
|
||||
},
|
||||
|
|
@ -160,7 +161,7 @@ export const task = actor({
|
|||
const self = selfTask(c);
|
||||
const result = await self.send(taskWorkflowQueueName("task.command.attach"), cmd ?? {}, {
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
});
|
||||
return expectQueueResponse<{ target: string; sessionId: string | null }>(result);
|
||||
},
|
||||
|
|
@ -172,7 +173,7 @@ export const task = actor({
|
|||
{},
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
return expectQueueResponse<{ switchTarget: string }>(result);
|
||||
|
|
@ -236,7 +237,7 @@ export const task = actor({
|
|||
{},
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
},
|
||||
|
|
@ -263,12 +264,25 @@ export const task = actor({
|
|||
{ ...(input?.model ? { model: input.model } : {}) } satisfies TaskWorkbenchCreateSessionCommand,
|
||||
{
|
||||
wait: true,
|
||||
timeout: 5 * 60_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
return expectQueueResponse<{ tabId: string }>(result);
|
||||
},
|
||||
|
||||
/**
|
||||
* Fire-and-forget: creates a workbench session and sends the initial message.
|
||||
* Used by createWorkbenchTask so the caller doesn't block on session creation.
|
||||
*/
|
||||
async createWorkbenchSessionAndSend(c, input: { model?: string; text: string }): Promise<void> {
|
||||
const self = selfTask(c);
|
||||
await self.send(
|
||||
taskWorkflowQueueName("task.command.workbench.create_session_and_send"),
|
||||
{ model: input.model, text: input.text } satisfies TaskWorkbenchCreateSessionAndSendCommand,
|
||||
{ wait: false },
|
||||
);
|
||||
},
|
||||
|
||||
async renameWorkbenchSession(c, input: TaskWorkbenchRenameSessionInput): Promise<void> {
|
||||
const self = selfTask(c);
|
||||
await self.send(
|
||||
|
|
@ -276,7 +290,7 @@ export const task = actor({
|
|||
{ sessionId: input.tabId, title: input.title } satisfies TaskWorkbenchSessionTitleCommand,
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
},
|
||||
|
|
@ -288,7 +302,7 @@ export const task = actor({
|
|||
{ sessionId: input.tabId, unread: input.unread } satisfies TaskWorkbenchSessionUnreadCommand,
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
},
|
||||
|
|
@ -304,7 +318,7 @@ export const task = actor({
|
|||
} satisfies TaskWorkbenchUpdateDraftCommand,
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
},
|
||||
|
|
@ -316,14 +330,14 @@ export const task = actor({
|
|||
{ sessionId: input.tabId, model: input.model } satisfies TaskWorkbenchChangeModelCommand,
|
||||
{
|
||||
wait: true,
|
||||
timeout: 20_000,
|
||||
timeout: 10_000,
|
||||
},
|
||||
);
|
||||
},
|
||||
|
||||
async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageActionInput): Promise<void> {
|
||||
async sendWorkbenchMessage(c, input: TaskWorkbenchSendMessageInput): Promise<void> {
|
||||
const self = selfTask(c);
|
||||
const result = await self.send(
|
||||
await self.send(
|
||||
taskWorkflowQueueName("task.command.workbench.send_message"),
|
||||
{
|
||||
sessionId: input.tabId,
|
||||
|
|
@ -331,13 +345,9 @@ export const task = actor({
|
|||
attachments: input.attachments,
|
||||
} satisfies TaskWorkbenchSendMessageCommand,
|
||||
{
|
||||
wait: input.waitForCompletion === true,
|
||||
...(input.waitForCompletion === true ? { timeout: 10 * 60_000 } : {}),
|
||||
wait: false,
|
||||
},
|
||||
);
|
||||
if (input.waitForCompletion === true) {
|
||||
expectQueueResponse(result);
|
||||
}
|
||||
},
|
||||
|
||||
async stopWorkbenchSession(c, input: TaskTabCommand): Promise<void> {
|
||||
|
|
|
|||
|
|
@ -307,22 +307,14 @@ async function requireReadySessionMeta(c: any, tabId: string): Promise<any> {
   return meta;
 }

-async function ensureReadySessionMeta(c: any, tabId: string): Promise<any> {
-  const meta = await readSessionMeta(c, tabId);
+export function requireSendableSessionMeta(meta: any, tabId: string): any {
   if (!meta) {
     throw new Error(`Unknown workbench tab: ${tabId}`);
   }
-
-  if (meta.status === "ready" && meta.sandboxSessionId) {
-    return meta;
+  if (meta.status !== "ready" || !meta.sandboxSessionId) {
+    throw new Error(`Session is not ready (status: ${meta.status}). Wait for session provisioning to complete.`);
   }
-
-  if (meta.status === "error") {
-    throw new Error(meta.errorMessage ?? "This workbench tab failed to prepare");
-  }
-
-  await ensureWorkbenchSession(c, tabId);
-  return await requireReadySessionMeta(c, tabId);
+  return meta;
 }

 function shellFragment(parts: string[]): string {
|
||||
|
|
@ -1204,7 +1196,7 @@ export async function changeWorkbenchModel(c: any, sessionId: string, model: str
|
|||
}
|
||||
|
||||
export async function sendWorkbenchMessage(c: any, sessionId: string, text: string, attachments: Array<any>): Promise<void> {
|
||||
const meta = await ensureReadySessionMeta(c, sessionId);
|
||||
const meta = requireSendableSessionMeta(await readSessionMeta(c, sessionId), sessionId);
|
||||
const record = await ensureWorkbenchSeeded(c);
|
||||
const runtime = await getTaskSandboxRuntime(c, record);
|
||||
await ensureSandboxRepo(c, runtime.sandbox, record);
|
||||
|
|
|
|||
|
|
@ -1,14 +1,7 @@
|
|||
import { Loop } from "rivetkit/workflow";
|
||||
import { logActorWarning, resolveErrorMessage } from "../../logging.js";
|
||||
import { getCurrentRecord } from "./common.js";
|
||||
import {
|
||||
initAssertNameActivity,
|
||||
initBootstrapDbActivity,
|
||||
initCompleteActivity,
|
||||
initEnqueueProvisionActivity,
|
||||
initEnsureNameActivity,
|
||||
initFailedActivity,
|
||||
} from "./init.js";
|
||||
import { initBootstrapDbActivity, initCompleteActivity, initEnqueueProvisionActivity, initFailedActivity } from "./init.js";
|
||||
import {
|
||||
handleArchiveActivity,
|
||||
handleAttachActivity,
|
||||
|
|
@ -67,12 +60,8 @@ const commandHandlers: Record<TaskQueueName, WorkflowHandler> = {
|
|||
await loopCtx.removed("init-failed", "step");
|
||||
await loopCtx.removed("init-failed-v2", "step");
|
||||
try {
|
||||
await loopCtx.step({
|
||||
name: "init-ensure-name",
|
||||
timeout: 5 * 60_000,
|
||||
run: async () => initEnsureNameActivity(loopCtx),
|
||||
});
|
||||
await loopCtx.step("init-assert-name", async () => initAssertNameActivity(loopCtx));
|
||||
await loopCtx.removed("init-ensure-name", "step");
|
||||
await loopCtx.removed("init-assert-name", "step");
|
||||
await loopCtx.removed("init-create-sandbox", "step");
|
||||
await loopCtx.removed("init-ensure-agent", "step");
|
||||
await loopCtx.removed("init-start-sandbox-instance", "step");
|
||||
|
|
@ -156,6 +145,26 @@ const commandHandlers: Record<TaskQueueName, WorkflowHandler> = {
|
|||
}
|
||||
},
|
||||
|
||||
"task.command.workbench.create_session_and_send": async (loopCtx, msg) => {
|
||||
try {
|
||||
const created = await loopCtx.step({
|
||||
name: "workbench-create-session-for-send",
|
||||
timeout: 5 * 60_000,
|
||||
run: async () => createWorkbenchSession(loopCtx, msg.body?.model),
|
||||
});
|
||||
await loopCtx.step({
|
||||
name: "workbench-send-initial-message",
|
||||
timeout: 5 * 60_000,
|
||||
run: async () => sendWorkbenchMessage(loopCtx, created.tabId, msg.body.text, []),
|
||||
});
|
||||
} catch (error) {
|
||||
logActorWarning("task.workflow", "create_session_and_send failed", {
|
||||
error: resolveErrorMessage(error),
|
||||
});
|
||||
}
|
||||
await msg.complete({ ok: true });
|
||||
},
|
||||
|
||||
"task.command.workbench.ensure_session": async (loopCtx, msg) => {
|
||||
await loopCtx.step({
|
||||
name: "workbench-ensure-session",
|
||||
|
|
|
|||
|
|
@ -1,10 +1,8 @@
|
|||
// @ts-nocheck
|
||||
import { eq } from "drizzle-orm";
|
||||
import { resolveCreateFlowDecision } from "../../../services/create-flow.js";
|
||||
import { resolveWorkspaceGithubAuth } from "../../../services/github-auth.js";
|
||||
import { getActorRuntimeContext } from "../../context.js";
|
||||
import { getOrCreateHistory, getOrCreateProject, selfTask } from "../../handles.js";
|
||||
import { logActorWarning, resolveErrorMessage } from "../../logging.js";
|
||||
import { getOrCreateHistory, selfTask } from "../../handles.js";
|
||||
import { resolveErrorMessage } from "../../logging.js";
|
||||
import { defaultSandboxProviderId } from "../../../sandbox-config.js";
|
||||
import { task as taskTable, taskRuntime } from "../db/schema.js";
|
||||
import { TASK_ROW_ID, appendHistory, collectErrorMessages, resolveErrorDetail, setTaskState } from "./common.js";
|
||||
|
|
@ -21,7 +19,6 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
|
|||
const { config } = getActorRuntimeContext();
|
||||
const providerId = body?.providerId ?? loopCtx.state.providerId ?? defaultSandboxProviderId(config);
|
||||
const now = Date.now();
|
||||
const initialStatusMessage = loopCtx.state.branchName && loopCtx.state.title ? "provisioning" : "naming";
|
||||
|
||||
await ensureTaskRuntimeCacheColumns(loopCtx.db);
|
||||
|
||||
|
|
@ -60,7 +57,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
|
|||
activeSessionId: null,
|
||||
activeSwitchTarget: null,
|
||||
activeCwd: null,
|
||||
statusMessage: initialStatusMessage,
|
||||
statusMessage: "provisioning",
|
||||
gitStateJson: null,
|
||||
gitStateUpdatedAt: null,
|
||||
provisionStage: "queued",
|
||||
|
|
@ -74,7 +71,7 @@ export async function initBootstrapDbActivity(loopCtx: any, body: any): Promise<
|
|||
activeSessionId: null,
|
||||
activeSwitchTarget: null,
|
||||
activeCwd: null,
|
||||
statusMessage: initialStatusMessage,
|
||||
statusMessage: "provisioning",
|
||||
provisionStage: "queued",
|
||||
provisionStageUpdatedAt: now,
|
||||
updatedAt: now,
|
||||
|
|
@ -111,102 +108,6 @@ export async function initEnqueueProvisionActivity(loopCtx: any, body: any): Pro
|
|||
}
|
||||
}
|
||||
|
||||
export async function initEnsureNameActivity(loopCtx: any): Promise<void> {
|
||||
  await setTaskState(loopCtx, "init_ensure_name", "determining title and branch");
  const existing = await loopCtx.db
    .select({
      branchName: taskTable.branchName,
      title: taskTable.title,
    })
    .from(taskTable)
    .where(eq(taskTable.id, TASK_ROW_ID))
    .get();

  if (existing?.branchName && existing?.title) {
    loopCtx.state.branchName = existing.branchName;
    loopCtx.state.title = existing.title;
    return;
  }

  const { driver } = getActorRuntimeContext();
  const auth = await resolveWorkspaceGithubAuth(loopCtx, loopCtx.state.workspaceId);
  let repoLocalPath = loopCtx.state.repoLocalPath;
  if (!repoLocalPath) {
    const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
    const result = await project.ensure({ remoteUrl: loopCtx.state.repoRemote });
    repoLocalPath = result.localPath;
    loopCtx.state.repoLocalPath = repoLocalPath;
  }

  try {
    await driver.git.fetch(repoLocalPath, { githubToken: auth?.githubToken ?? null });
  } catch (error) {
    logActorWarning("task.init", "fetch before naming failed", {
      workspaceId: loopCtx.state.workspaceId,
      repoId: loopCtx.state.repoId,
      taskId: loopCtx.state.taskId,
      error: resolveErrorMessage(error),
    });
  }

  const remoteBranches = (await driver.git.listRemoteBranches(repoLocalPath, { githubToken: auth?.githubToken ?? null })).map(
    (branch: any) => branch.branchName,
  );
  const project = await getOrCreateProject(loopCtx, loopCtx.state.workspaceId, loopCtx.state.repoId, loopCtx.state.repoRemote);
  const reservedBranches = await project.listReservedBranches({});
  const resolved = resolveCreateFlowDecision({
    task: loopCtx.state.task,
    explicitTitle: loopCtx.state.explicitTitle ?? undefined,
    explicitBranchName: loopCtx.state.explicitBranchName ?? undefined,
    localBranches: remoteBranches,
    taskBranches: reservedBranches,
  });

  const now = Date.now();
  await loopCtx.db
    .update(taskTable)
    .set({
      branchName: resolved.branchName,
      title: resolved.title,
      updatedAt: now,
    })
    .where(eq(taskTable.id, TASK_ROW_ID))
    .run();

  loopCtx.state.branchName = resolved.branchName;
  loopCtx.state.title = resolved.title;
  loopCtx.state.explicitTitle = null;
  loopCtx.state.explicitBranchName = null;

  await loopCtx.db
    .update(taskRuntime)
    .set({
      statusMessage: "provisioning",
      provisionStage: "repo_prepared",
      provisionStageUpdatedAt: now,
      updatedAt: now,
    })
    .where(eq(taskRuntime.id, TASK_ROW_ID))
    .run();

  await project.registerTaskBranch({
    taskId: loopCtx.state.taskId,
    branchName: resolved.branchName,
  });

  await appendHistory(loopCtx, "task.named", {
    title: resolved.title,
    branchName: resolved.branchName,
  });
}

export async function initAssertNameActivity(loopCtx: any): Promise<void> {
  await setTaskState(loopCtx, "init_assert_name", "validating naming");
  if (!loopCtx.state.branchName) {
    throw new Error("task branchName is not initialized");
  }
}

export async function initCompleteActivity(loopCtx: any, body: any): Promise<void> {
  const now = Date.now();
  const { config } = getActorRuntimeContext();
@@ -13,6 +13,7 @@ export const TASK_QUEUE_NAMES = [
   "task.command.workbench.rename_task",
   "task.command.workbench.rename_branch",
   "task.command.workbench.create_session",
+  "task.command.workbench.create_session_and_send",
   "task.command.workbench.ensure_session",
   "task.command.workbench.rename_session",
   "task.command.workbench.set_session_unread",
@@ -1,5 +1,4 @@
 // @ts-nocheck
-import { setTimeout as delay } from "node:timers/promises";
 import { desc, eq } from "drizzle-orm";
 import { Loop } from "rivetkit/workflow";
 import type {
@@ -272,24 +271,6 @@ async function requireWorkbenchTask(c: any, taskId: string) {
   return getTask(c, c.state.workspaceId, repoId, taskId);
 }

-async function waitForWorkbenchTaskReady(task: any, timeoutMs = 5 * 60_000): Promise<any> {
-  const startedAt = Date.now();
-
-  for (;;) {
-    const record = await task.get();
-    if (record?.branchName && record?.title) {
-      return record;
-    }
-    if (record?.status === "error") {
-      throw new Error("task initialization failed before the workbench session was ready");
-    }
-    if (Date.now() - startedAt > timeoutMs) {
-      throw new Error("timed out waiting for task initialization");
-    }
-    await delay(1_000);
-  }
-}
-
 /**
  * Reads the workspace sidebar snapshot from the workspace actor's local SQLite
  * plus the org-scoped GitHub actor for open PRs. Task actors still push
@@ -562,7 +543,7 @@ export const workspaceActions = {
     return expectQueueResponse<RepoRecord>(
       await self.send(workspaceWorkflowQueueName("workspace.command.addRepo"), input, {
         wait: true,
-        timeout: 60_000,
+        timeout: 10_000,
       }),
     );
   },
@@ -595,7 +576,7 @@ export const workspaceActions = {
     return expectQueueResponse<TaskRecord>(
       await self.send(workspaceWorkflowQueueName("workspace.command.createTask"), input, {
         wait: true,
-        timeout: 5 * 60_000,
+        timeout: 10_000,
       }),
     );
   },
@@ -813,6 +794,7 @@ export const workspaceActions = {
   },

   async createWorkbenchTask(c: any, input: TaskWorkbenchCreateTaskInput): Promise<{ taskId: string; tabId?: string }> {
+    // Step 1: Create the task record (wait: true — local state mutations only).
     const created = await workspaceActions.createTask(c, {
       workspaceId: c.state.workspaceId,
       repoId: input.repoId,
@@ -821,26 +803,18 @@ export const workspaceActions = {
       ...(input.onBranch ? { onBranch: input.onBranch } : input.branch ? { explicitBranchName: input.branch } : {}),
       ...(input.model ? { agentType: agentTypeForModel(input.model) } : {}),
     });

+    // Step 2: Enqueue session creation + initial message (wait: false).
+    // The task workflow creates the session record and sends the message in
+    // the background. The client observes progress via push events on the
+    // task interest topic.
     const task = await requireWorkbenchTask(c, created.taskId);
-    await waitForWorkbenchTaskReady(task);
-    const session = await task.createWorkbenchSession({
-      taskId: created.taskId,
-      ...(input.model ? { model: input.model } : {}),
-    });
-    await task.sendWorkbenchMessage({
-      taskId: created.taskId,
-      tabId: session.tabId,
+    await task.createWorkbenchSessionAndSend({
+      model: input.model,
       text: input.task,
       attachments: [],
-      waitForCompletion: true,
     });
-    await task.getSessionDetail({
-      sessionId: session.tabId,
-    });
-    return {
-      taskId: created.taskId,
-      tabId: session.tabId,
-    };
+
+    return { taskId: created.taskId };
   },

   async markWorkbenchUnread(c: any, input: TaskWorkbenchSelectInput): Promise<void> {
@@ -988,7 +962,7 @@ export const workspaceActions = {
     const self = selfWorkspace(c);
     await self.send(workspaceWorkflowQueueName("workspace.command.refreshProviderProfiles"), command ?? {}, {
       wait: true,
-      timeout: 60_000,
+      timeout: 10_000,
     });
   },

@@ -5,6 +5,7 @@ import {
   ensureCloned,
   fetch,
   listRemoteBranches,
+  listLocalRemoteRefs,
   remoteDefaultBaseRef,
   revParse,
   ensureRemoteBranch,
@@ -28,6 +29,8 @@ export interface GitDriver {
   ensureCloned(remoteUrl: string, targetPath: string, options?: { githubToken?: string | null }): Promise<void>;
   fetch(repoPath: string, options?: { githubToken?: string | null }): Promise<void>;
   listRemoteBranches(repoPath: string, options?: { githubToken?: string | null }): Promise<BranchSnapshot[]>;
+  /** Read remote-tracking refs from the local clone without fetching. */
+  listLocalRemoteRefs(repoPath: string): Promise<BranchSnapshot[]>;
   remoteDefaultBaseRef(repoPath: string): Promise<string>;
   revParse(repoPath: string, ref: string): Promise<string>;
   ensureRemoteBranch(repoPath: string, branchName: string, options?: { githubToken?: string | null }): Promise<void>;
@@ -81,6 +84,7 @@ export function createDefaultDriver(): BackendDriver {
       ensureCloned,
       fetch,
       listRemoteBranches,
+      listLocalRemoteRefs,
       remoteDefaultBaseRef,
       revParse,
       ensureRemoteBranch,
@@ -208,11 +208,25 @@ export async function remoteDefaultBaseRef(repoPath: string): Promise<string> {
   return "origin/main";
 }

+/**
+ * Fetch from origin, then read remote-tracking refs.
+ * Use when you need guaranteed-fresh branch data and can tolerate network I/O.
+ */
 export async function listRemoteBranches(repoPath: string, options?: GitAuthOptions): Promise<BranchSnapshot[]> {
   await fetch(repoPath, options);
+  return listLocalRemoteRefs(repoPath);
+}
+
+/**
+ * Read remote-tracking refs (`refs/remotes/origin/*`) from the local clone
+ * without fetching. The data is only as fresh as the last fetch — use this
+ * when the branch sync actor keeps refs current and you want to avoid
+ * blocking on network I/O.
+ */
+export async function listLocalRemoteRefs(repoPath: string): Promise<BranchSnapshot[]> {
   const { stdout } = await execFileAsync("git", ["-C", repoPath, "for-each-ref", "--format=%(refname:short) %(objectname)", "refs/remotes/origin"], {
     maxBuffer: 1024 * 1024,
-    env: gitEnv(options),
+    env: gitEnv(),
   });

   return stdout
@@ -15,6 +15,7 @@ export function createTestGitDriver(overrides?: Partial<GitDriver>): GitDriver {
     ensureCloned: async () => {},
     fetch: async () => {},
     listRemoteBranches: async () => [],
+    listLocalRemoteRefs: async () => [],
     remoteDefaultBaseRef: async () => "origin/main",
     revParse: async () => "abc1234567890",
     ensureRemoteBranch: async () => {},
@@ -1,5 +1,5 @@
 import { describe, expect, it } from "vitest";
-import { shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js";
+import { requireSendableSessionMeta, shouldMarkSessionUnreadForStatus, shouldRecreateSessionForModelChange } from "../src/actors/task/workbench.js";

 describe("workbench unread status transitions", () => {
   it("marks unread when a running session first becomes idle", () => {
@@ -57,3 +57,30 @@ describe("workbench model changes", () => {
     ).toBe(false);
   });
 });
+
+describe("workbench send readiness", () => {
+  it("rejects unknown tabs", () => {
+    expect(() => requireSendableSessionMeta(null, "tab-1")).toThrow("Unknown workbench tab: tab-1");
+  });
+
+  it("rejects pending sessions", () => {
+    expect(() =>
+      requireSendableSessionMeta(
+        {
+          status: "pending_session_create",
+          sandboxSessionId: null,
+        },
+        "tab-2",
+      ),
+    ).toThrow("Session is not ready (status: pending_session_create). Wait for session provisioning to complete.");
+  });
+
+  it("accepts ready sessions with a sandbox session id", () => {
+    const meta = {
+      status: "ready",
+      sandboxSessionId: "session-1",
+    };
+
+    expect(requireSendableSessionMeta(meta, "tab-3")).toBe(meta);
+  });
+});
381
foundry/research/specs/remove-local-git-clone.md
Normal file
@@ -0,0 +1,381 @@
# Remove Local Git Clone from Backend

## Goal

The Foundry backend stores zero git state: no clones, no refs, no working trees, no git-spice. All git operations execute inside sandboxes. Repo metadata (branches, default branch, PRs) comes from the GitHub API and webhooks, which we already have.

## Terminology renames

Rename Foundry domain terms across the entire `foundry/` directory. All changes are breaking; no backwards compatibility is needed. Execute as separate atomic commits in this order. `pnpm -w typecheck && pnpm -w build && pnpm -w test` must pass between each.

| New name | Old name (current code) |
|---|---|
| **Organization** | Workspace |
| **Repository** | Project |
| **Session** (not "tab") | Tab / Session (mixed) |
| **Subscription** | Interest |
| **SandboxProviderId** | ProviderId |

### Rename 1: `interest` → `subscription`

This is the realtime pub/sub system in `client/src/interest/`. Rename the directory, all types (`InterestManager` → `SubscriptionManager`, `MockInterestManager` → `MockSubscriptionManager`, `RemoteInterestManager` → `RemoteSubscriptionManager`, `DebugInterestTopic` → `DebugSubscriptionTopic`), the `useInterest` hook → `useSubscription`, and all imports in client + frontend. Rename `frontend/src/lib/interest.ts` → `subscription.ts`. Rename test file `client/test/interest-manager.test.ts` → `subscription-manager.test.ts`.

### Rename 2: `tab` → `session`

The UI "tab" concept is really a session. Rename `TabStrip` → `SessionStrip`, `tabId` → `sessionId`, `closeTab` → `closeSession`, `addTab` → `addSession`, `WorkbenchAgentTab` → `WorkbenchAgentSession`, `TaskWorkbenchTabInput` → `TaskWorkbenchSessionInput`, `TaskWorkbenchAddTabResponse` → `TaskWorkbenchAddSessionResponse`, and all related props/DOM attrs (`activeTabId` → `activeSessionId`, `onSwitchTab` → `onSwitchSession`, `onCloseTab` → `onCloseSession`, `data-tab` → `data-session`, `editingSessionTabId` → `editingSessionId`). Rename file `tab-strip.tsx` → `session-strip.tsx`. **Leave "diff tabs" alone** (`isDiffTab`, `diffTabId`): those are file viewer panes, a different concept.

### Rename 3: `ProviderId` → `SandboxProviderId`

The `ProviderId` type (`"e2b" | "local"`) is specifically a sandbox provider. Rename the type (`ProviderId` → `SandboxProviderId`), schema (`ProviderIdSchema` → `SandboxProviderIdSchema`), and all `providerId` fields that refer to sandbox hosting (`CreateTaskInput`, `TaskRecord`, `SwitchResult`, `WorkbenchSandboxSummary`, task DB schema `task.provider_id` → `sandbox_provider_id`, `task_sandboxes.provider_id` → `sandbox_provider_id`, topic params). Rename config key `providers` → `sandboxProviders`. DB column renames need Drizzle migrations.

**Do NOT rename**: `model.provider` (AI model provider), `auth_account_index.provider_id` (auth provider), `providerAgent()` (model→agent mapping), `WorkbenchModelGroup.provider`.

Also **delete the `providerProfiles` table entirely**: it is written but never read (dead code). Remove the table definition from the organization actor DB schema, all writes in organization actions, and the `refreshProviderProfiles` queue command/handler/interface.

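As a rough illustration of Rename 3, the sketch below shows what the renamed type and a runtime guard might look like. Only the type name and the union members (`"e2b" | "local"`) come from this spec; `SANDBOX_PROVIDER_IDS`, `parseSandboxProviderId`, and `TaskRecordSketch` are hypothetical names, and the real codebase may express the schema with a validation library instead.

```typescript
// Hypothetical sketch of the renamed sandbox provider type.
// Only the union members and the name SandboxProviderId come from the spec.
export const SANDBOX_PROVIDER_IDS = ["e2b", "local"] as const;

export type SandboxProviderId = (typeof SANDBOX_PROVIDER_IDS)[number];

// Runtime guard standing in for SandboxProviderIdSchema (assumed shape).
export function parseSandboxProviderId(value: unknown): SandboxProviderId {
  if (typeof value === "string" && (SANDBOX_PROVIDER_IDS as readonly string[]).includes(value)) {
    return value as SandboxProviderId;
  }
  throw new Error(`Unknown sandbox provider: ${String(value)}`);
}

// Fields that refer to sandbox hosting take the new name, while the
// AI model provider field keeps its existing name (it is not renamed).
export interface TaskRecordSketch {
  sandboxProviderId: SandboxProviderId;
  model: { provider: string };
}
```

Keeping the sandbox provider and the model provider as visibly different field names is the point of the rename: a reader can tell from the name alone which kind of provider a value refers to.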
### Rename 4: `project` → `repository`

The "project" actor/entity is a git repository. Rename:

- Actor directory `actors/project/` → `actors/repository/`
- Actor directory `actors/project-branch-sync/` → `actors/repository-branch-sync/`
- Actor registry keys `project` → `repository`, `projectBranchSync` → `repositoryBranchSync`
- Actor name string `"Project"` → `"Repository"`
- All functions: `projectKey` → `repositoryKey`, `getOrCreateProject` → `getOrCreateRepository`, `getProject` → `getRepository`, `selfProject` → `selfRepository`, `projectBranchSyncKey` → `repositoryBranchSyncKey`, `projectPrSyncKey` → `repositoryPrSyncKey`, `projectWorkflowQueueName` → `repositoryWorkflowQueueName`
- Types: `ProjectInput` → `RepositoryInput`, `WorkbenchProjectSection` → `WorkbenchRepositorySection`, `PROJECT_QUEUE_NAMES` → `REPOSITORY_QUEUE_NAMES`
- Queue names: `"project.command.*"` → `"repository.command.*"`
- Actor key strings: change `"project"` to `"repository"` in key arrays (e.g. `["ws", id, "project", repoId]` → `["org", id, "repository", repoId]`)
- Frontend: `projects` → `repositories`, `collapsedProjects` → `collapsedRepositories`, `hoveredProjectId` → `hoveredRepositoryId`, `PROJECT_COLORS` → `REPOSITORY_COLORS`, `data-project-*` → `data-repository-*`, `groupWorkbenchProjects` → `groupWorkbenchRepositories`
- Client keys: `projectKey()` → `repositoryKey()`, `projectBranchSyncKey()` → `repositoryBranchSyncKey()`, `projectPrSyncKey()` → `repositoryPrSyncKey()`

### Rename 5: `workspace` → `organization`

The "workspace" is really an organization. Rename:

- Actor directory `actors/workspace/` → `actors/organization/`
- Actor registry key `workspace` → `organization`
- Actor name string `"Workspace"` → `"Organization"`
- All types: `WorkspaceIdSchema` → `OrganizationIdSchema`, `WorkspaceId` → `OrganizationId`, `WorkspaceEvent` → `OrganizationEvent`, `WorkspaceSummarySnapshot` → `OrganizationSummarySnapshot`, `WorkspaceUseInputSchema` → `OrganizationUseInputSchema`, `WorkspaceHandle` → `OrganizationHandle`, `WorkspaceTopicParams` → `OrganizationTopicParams`
- All `workspaceId` fields/params → `organizationId` (~20+ schemas in contracts.ts, plus topic params, task snapshot, etc.)
- `FoundryOrganization.workspaceId` → `FoundryOrganization.organizationId` (or just `id`)
- All functions: `workspaceKey` → `organizationKey`, `getOrCreateWorkspace` → `getOrCreateOrganization`, `selfWorkspace` → `selfOrganization`, `resolveWorkspaceId` → `resolveOrganizationId`, `defaultWorkspace` → `defaultOrganization`, `workspaceWorkflowQueueName` → `organizationWorkflowQueueName`, `WORKSPACE_QUEUE_NAMES` → `ORGANIZATION_QUEUE_NAMES`
- Actor key strings: change `"ws"` to `"org"` in key arrays (e.g. `["ws", id]` → `["org", id]`)
- Queue names: `"workspace.command.*"` → `"organization.command.*"`
- Topic keys: `"workspace:${id}"` → `"organization:${id}"`, event `"workspaceUpdated"` → `"organizationUpdated"`
- Methods: `connectWorkspace` → `connectOrganization`, `getWorkspaceSummary` → `getOrganizationSummary`, `useWorkspace` → `useOrganization`
- Files: `shared/src/workspace.ts` → `organization.ts`, `backend/src/config/workspace.ts` → `organization.ts`
- Config keys: `config.workspace.default` → `config.organization.default`
- URL paths: `/workspaces/$workspaceId` → `/organizations/$organizationId`
- UI strings: `"Loading workspace..."` → `"Loading organization..."`
- Tests: rename `workspace-*.test.ts` files, update `workspaceSnapshot()` → `organizationSnapshot()`, `workspaceId: "ws-1"` → `organizationId: "org-1"`

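To make the key-string changes in Rename 4 and Rename 5 concrete, here is a minimal sketch of the key helpers after both renames. The function names `organizationKey` and `repositoryKey` and the key shapes come from the bullets above; the function bodies, the `ActorKey` type, and the `organizationTopicKey` helper are assumptions for illustration only.

```typescript
// Sketch of actor key helpers after Rename 4 and Rename 5.
// Key shapes follow the examples in the spec; the bodies are assumed.
export type ActorKey = readonly string[];

export function organizationKey(organizationId: string): ActorKey {
  // Previously ["ws", workspaceId].
  return ["org", organizationId];
}

export function repositoryKey(organizationId: string, repoId: string): ActorKey {
  // Previously ["ws", workspaceId, "project", repoId].
  return ["org", organizationId, "repository", repoId];
}

// Hypothetical helper for the realtime topic key described under "Topic keys".
export function organizationTopicKey(organizationId: string): string {
  // Previously `workspace:${id}`.
  return `organization:${organizationId}`;
}
```

Because actor keys and topic keys are plain strings, the type checker will not catch a missed `"ws"` or `"project"` literal, which is presumably why the rename verification greps for those strings explicitly.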
### After all renames: update CLAUDE.md files

Update `foundry/CLAUDE.md` and `foundry/packages/backend/CLAUDE.md` to use new terminology throughout (organization instead of workspace, repository instead of project, etc.). The rest of this spec already uses the new names.

## What gets deleted

### Entire directories/files

| Path (relative to `packages/backend/src/`) | Reason |
|---|---|
| `integrations/git/index.ts` | All local git operations |
| `integrations/git-spice/index.ts` | Stack management via git-spice |
| `actors/repository-branch-sync/` (currently `project-branch-sync/`) | Polling actor that fetches + reads local clone every 5s |
| `actors/project-pr-sync/` | Empty directory, already dead |
| `actors/repository/stack-model.ts` (currently `project/stack-model.ts`) | Stack parent/sort model (git-spice dependent) |
| `test/git-spice.test.ts` | Tests for deleted git-spice integration |
| `test/git-validate-remote.test.ts` | Tests for deleted git validation |
| `test/stack-model.test.ts` | Tests for deleted stack model |

### Driver interfaces removed from `driver.ts`

- `GitDriver`: entire interface deleted
- `StackDriver`: entire interface deleted
- `BackendDriver.git`: removed
- `BackendDriver.stack`: removed
- All imports from `integrations/git/` and `integrations/git-spice/`

`BackendDriver` keeps only `github` and `tmux`.

### Test driver cleanup (`test/helpers/test-driver.ts`)

- Delete `createTestGitDriver()`
- Delete `createTestStackDriver()`
- Remove `git` and `stack` from `createTestDriver()`

### Docker volume removed (`compose.dev.yaml`, `compose.preview.yaml`)

- Remove `foundry_git_repos` volume and its mount at `/root/.local/share/foundry/repos`
- Remove the CLAUDE.md note about the repos volume

### Actor registry cleanup (`actors/index.ts`, `actors/keys.ts`, `actors/handles.ts`)

- Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) registration
- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`)
- Remove branch sync handle helpers

### Client key cleanup (`packages/client/src/keys.ts`, `packages/client/test/keys.test.ts`)

- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) if exported

### Dead code removal: `providerProfiles` table

The `providerProfiles` table in the organization actor (currently workspace actor) DB is written but never read. Delete:

- Table definition in `actors/organization/db/schema.ts` (currently `workspace/db/schema.ts`)
- All writes in `actors/organization/actions.ts` (currently `workspace/actions.ts`)
- The `refreshProviderProfiles` queue command and handler
- The `RefreshProviderProfilesCommand` interface
- Add a DB migration to drop the `provider_profiles` table

### Ensure pattern cleanup (`actors/repository/actions.ts`, currently `project/actions.ts`)

Delete all `ensure*` functions that block action handlers on external I/O or cross-actor fan-out:

- **`ensureLocalClone()`**: Delete (git clone removal).
- **`ensureProjectReady()`** / **`ensureRepositoryReady()`**: Delete (wrapper around `ensureLocalClone` + sync actors).
- **`ensureProjectReadyForRead()`** / **`ensureRepositoryReadyForRead()`**: Delete (dispatches ensure with 10s wait on read path).
- **`ensureProjectSyncActors()`** / **`ensureRepositorySyncActors()`**: Delete (spawns the branch sync actor, which is being removed).
- **`forceProjectSync()`** / **`forceRepositorySync()`**: Delete (triggers the branch sync actor).
- **`ensureTaskIndexHydrated()`**: Delete. This is the migration path from `HistoryActor` → `task_index` table. Since we assume fresh repositories, no migration is needed. The task index is populated on write (`createTask` inserts the row).
- **`ensureTaskIndexHydratedForRead()`**: Delete (wrapper that dispatches `hydrateTaskIndex`).
- **`taskIndexHydrated` state flag**: Delete from repository actor state.

`ensureAskpassScript()` can stay: it is a fast local operation.

### Dead schema tables and helpers (`actors/repository/db/schema.ts`, `actors/repository/actions.ts`)

With the branch sync actor and git-spice stack operations deleted, these tables have no writer and should be removed:

- **`branches` table**: populated by `RepositoryBranchSyncActor` from the local clone. Delete the table, its schema definition, and all reads from it (including `enrichTaskRecord`, which reads `diffStat`, `hasUnpushed`, `conflictsWithMain`, and `parentBranch` from this table).
- **`repoActionJobs` table**: populated by `runRepoStackAction()` for git-spice stack operations. Delete the table, its schema definition, and all helpers: `ensureRepoActionJobsTable()`, `writeRepoActionJob()`, `listRepoActionJobRows()`.

## What gets modified

### `actors/repository/actions.ts` (currently `project/actions.ts`)

This is the biggest change. Current git operations in this file:

1. **`createTaskMutation()`**: Currently calls `listLocalRemoteRefs` to check branch name conflicts against remote branches. Replace: branch conflict checking uses only the repository actor's `task_index` table (which branches are already taken by tasks). We don't need to check against remote branches; if the branch already exists on the remote, `git push` in the sandbox will handle it.
2. **`registerTaskBranch()`**: Currently does `fetch` + `remoteDefaultBaseRef` + `revParse` + git-spice stack tracking. Replace: default base branch comes from GitHub repo metadata (already stored from webhook/API at repo add time). SHA resolution is not needed at task creation; the sandbox handles it. Delete all git-spice stack tracking.
3. **`getRepoOverview()`**: Currently calls `listLocalRemoteRefs` + `remoteDefaultBaseRef` + `stack.available` + `stack.listStack`. Replace: branch data comes from GitHub API data we already store from webhooks (push/create/delete events feed branch state). Stack data is deleted. The overview returns branches from stored GitHub webhook data.
4. **`runRepoStackAction()`**: Delete entirely (all git-spice stack operations).
5. **All `normalizeBaseBranchName` imports from git-spice**: Inline or move to a simple utility if still needed.
6. **All `ensureTaskIndexHydrated*` / `ensureRepositoryReady*` call sites**: Remove. Read actions query the `task_index` table directly; if it's empty, it's empty. Write actions populate it on create.

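The replacement described for `createTaskMutation()` can be sketched as a pure check over `task_index` rows. This is an illustrative sketch only: the `TaskIndexRow` shape and the helper names `findBranchOwner` and `assertBranchAvailable` are hypothetical, and the actual table columns may differ.

```typescript
// Hypothetical shape of a task_index row; only the branch name matters here.
export interface TaskIndexRow {
  taskId: string;
  branchName: string | null;
}

// Returns the task that has already reserved `branchName`, if any.
export function findBranchOwner(rows: readonly TaskIndexRow[], branchName: string): TaskIndexRow | undefined {
  return rows.find((row) => row.branchName === branchName);
}

// Throws when a different task has reserved the branch. Remote branches are
// deliberately not consulted: a clash on the remote surfaces later, when the
// sandbox runs `git push`.
export function assertBranchAvailable(rows: readonly TaskIndexRow[], branchName: string, taskId: string): void {
  const owner = findBranchOwner(rows, branchName);
  if (owner && owner.taskId !== taskId) {
    throw new Error(`Branch "${branchName}" is already reserved by task ${owner.taskId}`);
  }
}
```

Keeping the check local to the actor's own table means task creation never waits on network I/O, which matches the goal of removing blocking `ensure*` calls from action handlers.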
### `actors/repository/index.ts` (currently `project/index.ts`)

- Remove local clone path from state/initialization
- Remove branch sync actor spawning
- Remove any `ensureLocalClone` calls in lifecycle

### `actors/task/workbench.ts`

- **`ensureSandboxRepo()` line 405**: Currently calls `driver.git.remoteDefaultBaseRef()` on the local clone. Replace: read default branch from repository actor state (which gets it from GitHub API/webhook data at repo add time).

### `actors/organization/actions.ts` (currently `workspace/actions.ts`)

- **`addRemote()` line 320**: Currently calls `driver.git.validateRemote()` which runs `git ls-remote`. Replace: validate via GitHub API. `GET /repos/{owner}/{repo}` returns 404 for invalid repos. We already parse the remote URL into owner/repo for GitHub operations.

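A minimal sketch of the `addRemote()` replacement, assuming the check is a single `GET /repos/{owner}/{repo}` request against the GitHub REST API. The helper names `parseGithubRemote` and `validateGithubRemote`, the `FetchLike` type, and the decision to throw on statuses other than 200 and 404 are assumptions; the existing owner/repo parser in the codebase should be used instead of the regex here.

```typescript
// Sketch: validate a remote by asking the GitHub REST API whether the repo exists.
// All names in this block are hypothetical.
export interface GithubRepoRef {
  owner: string;
  repo: string;
}

// Accepts https://github.com/owner/repo(.git) and git@github.com:owner/repo(.git).
export function parseGithubRemote(remoteUrl: string): GithubRepoRef | null {
  const match = /github\.com[:/]([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/.exec(remoteUrl.trim());
  const owner = match?.[1];
  const repo = match?.[2];
  return owner && repo ? { owner, repo } : null;
}

// Minimal fetch signature so the HTTP client can be injected (and stubbed in tests).
export type FetchLike = (url: string, init?: { headers?: Record<string, string> }) => Promise<{ status: number }>;

export async function validateGithubRemote(remoteUrl: string, githubToken: string | null, fetchImpl: FetchLike): Promise<boolean> {
  const ref = parseGithubRemote(remoteUrl);
  if (!ref) return false;

  const headers: Record<string, string> = { Accept: "application/vnd.github+json" };
  if (githubToken) headers.Authorization = `Bearer ${githubToken}`;

  const response = await fetchImpl(`https://api.github.com/repos/${ref.owner}/${ref.repo}`, { headers });
  if (response.status === 200) return true;
  if (response.status === 404) return false; // unknown repo, or not visible to this token
  throw new Error(`Unexpected GitHub API status ${response.status} while validating ${remoteUrl}`);
}
```

In a Node 18+ runtime the global `fetch` can be passed as `fetchImpl`. Treating other statuses (rate limiting, server errors) as errors rather than "invalid" is a design choice in this sketch, so that a transient API failure is not reported to the user as a bad remote.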
### `actors/keys.ts` / `actors/handles.ts`

- Remove `repositoryBranchSyncKey` (currently `projectBranchSyncKey`) export
- Remove branch sync handle creation

## What stays the same

- `driver.github.*`: already uses GitHub API, no changes
- `driver.tmux.*`: unrelated, no changes
- `integrations/github/index.ts`: already GitHub API based, keeps working
- All sandbox execution (`executeInSandbox()`): already the correct pattern
- Webhook handlers for push/create/delete events: already feed GitHub data into backend

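For intuition on how push/create/delete events can stand in for a local clone, the sketch below folds simplified webhook payloads into a branch map. It is not the existing handler code: the `BranchWebhookEvent` type is a reduced subset of the GitHub payload fields, the `event` discriminator stands in for the `X-GitHub-Event` header, and `applyBranchEvent` and `BranchState` are hypothetical names.

```typescript
// Simplified subsets of GitHub webhook payloads (only the fields used here).
// `event` stands in for the X-GitHub-Event header value.
export type BranchWebhookEvent =
  | { event: "create"; ref: string; ref_type: "branch" | "tag" }
  | { event: "delete"; ref: string; ref_type: "branch" | "tag" }
  | { event: "push"; ref: string; after: string; deleted: boolean };

// Branch name -> head commit SHA (null until a push reports one).
export type BranchState = Map<string, string | null>;

const HEADS_PREFIX = "refs/heads/";

// Returns a new map with the event applied; the input map is not mutated.
export function applyBranchEvent(state: BranchState, payload: BranchWebhookEvent): BranchState {
  const next: BranchState = new Map(state);
  switch (payload.event) {
    case "create":
      // Create/delete events carry the short ref name plus a ref_type.
      if (payload.ref_type === "branch" && !next.has(payload.ref)) next.set(payload.ref, null);
      break;
    case "delete":
      if (payload.ref_type === "branch") next.delete(payload.ref);
      break;
    case "push": {
      // Push events carry the full ref, e.g. "refs/heads/main"; ignore tags.
      if (!payload.ref.startsWith(HEADS_PREFIX)) break;
      const branch = payload.ref.slice(HEADS_PREFIX.length);
      if (payload.deleted) next.delete(branch);
      else next.set(branch, payload.after);
      break;
    }
  }
  return next;
}
```

A reducer like this only reflects events received after the repo was added, so the initial branch list would still need to come from the GitHub API at repo add time, as the spec describes for the default branch.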
## CLAUDE.md updates

### `foundry/packages/backend/CLAUDE.md`

Remove `RepositoryBranchSyncActor` (currently `ProjectBranchSyncActor`) from the actor hierarchy tree:

```text
OrganizationActor
├─ HistoryActor(organization-scoped global feed)
├─ GithubDataActor
├─ RepositoryActor(repo)
│  └─ TaskActor(task)
│     ├─ TaskSessionActor(session) x N
│     │  └─ SessionStatusSyncActor(session) x 0..1
│     └─ Task-local workbench state
└─ SandboxInstanceActor(sandboxProviderId, sandboxId) x N
```

Add to Ownership Rules:

> - The backend stores no local git state. No clones, no refs, no working trees, no git-spice. Repo metadata (branches, default branch) comes from GitHub API and webhook events. All git operations that require a working tree execute inside sandboxes via `executeInSandbox()`.

### `foundry/CLAUDE.md`

Add a new section:

```markdown
## Git State Policy

- The backend stores **zero git state**. No local clones, no refs, no working trees, no git-spice.
- Repo metadata (branches, default branch, PRs) comes from GitHub API and webhook events already flowing into the system.
- All git operations that require a working tree (diff, push, conflict check, rev-parse) execute inside the task's sandbox via `executeInSandbox()`.
- Do not add local git clone paths, `git fetch`, `git for-each-ref`, or any direct git CLI calls to the backend. If you need git data, either read it from stored GitHub webhook/API data or run it in a sandbox.
- The `BackendDriver` has no `GitDriver` or `StackDriver`. Only `GithubDriver` and `TmuxDriver` remain.
- git-spice is not used anywhere in the system.
```

Remove from CLAUDE.md:

> - Docker dev: `compose.dev.yaml` mounts a named volume at `/root/.local/share/foundry/repos` to persist backend-managed git clones across restarts. Code must still work if this volume is not present (create directories as needed).

## Concerns

1. **Concurrent agent work**: Another agent is currently modifying `workspace/actions.ts`, `project/actions.ts`, `task/workbench.ts`, `task/workflow/init.ts`, `task/workflow/queue.ts`, `driver.ts`, and `project-branch-sync/index.ts`. Those changes add `listLocalRemoteRefs` to the driver and remove polling loops/timeouts. The git clone removal work will **delete** the code the other agent is modifying. Coordinate: let the other agent's changes land first, then this spec deletes the git integration entirely.

2. **Rename ordering**: The rename spec (workspace→organization, project→repository, etc.) should ideally land **before** this spec is executed, so the file paths and identifiers match. If not, the implementing agent should map old names → new names using the table above.

3. **`project-pr-sync/` directory**: This is already an empty directory. Delete it as part of cleanup.

4. **`ensureRepoActionJobsTable()`**: The spec previously said this function should stay, but the `repoActionJobs` table is being deleted. Updated decision: delete both the table and the ensure function.

## Validation

After implementation, run:

```bash
pnpm -w typecheck
pnpm -w build
pnpm -w test
```

Then restart the dev stack and run the main user flow end-to-end:

```bash
just foundry-dev-down && just foundry-dev
```

Verify:

1. Add a repo to an organization
2. Create a task (should return immediately with taskId)
3. Task appears in sidebar with pending status
4. Task provisions and transitions to ready
5. Session is created and initial message is sent
6. Agent responds in the session transcript

This must work against a real GitHub repo (`rivet-dev/sandbox-agent-testing`) with the dev environment credentials.

### Codebase grep validation
|
||||
|
||||
After implementation, verify no local git operations or git-spice references remain in the backend:
|
||||
|
||||
```bash
# No local git CLI calls (excludes integrations/github which is GitHub API, not local git)
rg -l 'execFileAsync\("git"' foundry/packages/backend/src/ && echo "FAIL: local git CLI calls found" || echo "PASS"

# No git-spice references
rg -l 'git.spice|gitSpice|git_spice' foundry/packages/backend/src/ && echo "FAIL: git-spice references found" || echo "PASS"

# No GitDriver or StackDriver references
rg -l 'GitDriver|StackDriver' foundry/packages/backend/src/ && echo "FAIL: deleted driver interfaces still referenced" || echo "PASS"

# No local clone path references
rg -l 'localPath|ensureCloned|ensureLocalClone|foundryRepoClonePath' foundry/packages/backend/src/ && echo "FAIL: local clone references found" || echo "PASS"

# No branch sync actor references
rg -l 'BranchSync|branchSync|branch.sync' foundry/packages/backend/src/ && echo "FAIL: branch sync references found" || echo "PASS"

# No deleted ensure patterns
rg -l 'ensureProjectReady|ensureTaskIndexHydrated|taskIndexHydrated' foundry/packages/backend/src/ && echo "FAIL: deleted ensure patterns found" || echo "PASS"

# integrations/git/ and integrations/git-spice/ directories should not exist
ls foundry/packages/backend/src/integrations/git/index.ts 2>/dev/null && echo "FAIL: git integration not deleted" || echo "PASS"
ls foundry/packages/backend/src/integrations/git-spice/index.ts 2>/dev/null && echo "FAIL: git-spice integration not deleted" || echo "PASS"
```

All checks must pass before the change is considered complete.

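The one-liners above only print `PASS` or `FAIL`; each line exits 0 either way, so a script or CI job cannot gate on them. One way to make the result machine-checkable is a small wrapper that counts failures. This is a sketch, not part of the repository: `check_absent`, `report`, and the `failures` counter are hypothetical names introduced here.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the repo). A check "fails" when the wrapped
# search command exits 0, i.e. when it found something that should be gone.
failures=0

check_absent() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "FAIL: $label"
    failures=$((failures + 1))
  else
    echo "PASS: $label"
  fi
}

# Returns non-zero when at least one check failed, so callers can gate on it.
report() {
  if [ "$failures" -gt 0 ]; then
    echo "$failures check(s) failed"
    return 1
  fi
  echo "all checks passed"
}
```

A check would then read `check_absent "local git CLI calls" rg -l 'execFileAsync\("git"' foundry/packages/backend/src/`, with a single `report` call at the end. As with the original one-liners, a search that errors out (for example, on a missing directory) is still reported as `PASS`, so the paths should be verified separately.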
### Rename verification

After the rename spec has landed, verify no old names remain anywhere in `foundry/`:

```bash
# --- workspace → organization ---
# No "WorkspaceActor", "WorkspaceEvent", "WorkspaceId", "WorkspaceSummary", etc. (exclude pnpm-workspace.yaml, node_modules, .turbo)
rg -l 'WorkspaceActor|WorkspaceEvent|WorkspaceId|WorkspaceSummary|WorkspaceHandle|WorkspaceUseInput|WorkspaceTopicParams' foundry/packages/ --glob '!node_modules' && echo "FAIL: workspace type references remain" || echo "PASS"

# No workspaceId in domain code (exclude pnpm-workspace, node_modules, .turbo, this spec file)
rg -l 'workspaceId' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspaceId references remain" || echo "PASS"

# No workspace actor directory
ls foundry/packages/backend/src/actors/workspace/ 2>/dev/null && echo "FAIL: workspace actor directory not renamed" || echo "PASS"

# No workspaceKey function
rg 'workspaceKey|selfWorkspace|getOrCreateWorkspace|resolveWorkspaceId|defaultWorkspace' foundry/packages/ --glob '!node_modules' && echo "FAIL: workspace function references remain" || echo "PASS"

# No "ws" actor key string (the old key prefix)
rg '"\\"ws\\""|\["ws"' foundry/packages/ --glob '!node_modules' && echo "FAIL: old 'ws' actor key strings remain" || echo "PASS"

# No workspace queue names
rg 'workspace\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: workspace queue names remain" || echo "PASS"

# No /workspaces/ URL paths
rg '/workspaces/' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: /workspaces/ URL paths remain" || echo "PASS"

# No config.workspace
rg 'config\.workspace' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: config.workspace references remain" || echo "PASS"

# --- project → repository ---
# No ProjectActor, ProjectInput, ProjectSection, etc.
rg -l 'ProjectActor|ProjectInput|ProjectSection|PROJECT_QUEUE|PROJECT_COLORS' foundry/packages/ --glob '!node_modules' && echo "FAIL: project type references remain" || echo "PASS"

# No project actor directory
ls foundry/packages/backend/src/actors/project/ 2>/dev/null && echo "FAIL: project actor directory not renamed" || echo "PASS"

# No projectKey, selfProject, getOrCreateProject, etc.
rg 'projectKey|selfProject|getOrCreateProject|getProject\b|projectBranchSync|projectPrSync|projectWorkflow' foundry/packages/ --glob '!node_modules' && echo "FAIL: project function references remain" || echo "PASS"

# No "project" actor key string
rg '"\\"project\\""|\[".*"project"' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: old project actor key strings remain" || echo "PASS"

# No project.command.* queue names
rg 'project\.command\.' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: project queue names remain" || echo "PASS"

# --- tab → session ---
# No WorkbenchAgentTab, TaskWorkbenchTabInput, TabStrip, tabId (in workbench context)
rg -l 'WorkbenchAgentTab|TaskWorkbenchTabInput|TaskWorkbenchAddTabResponse|TabStrip' foundry/packages/ --glob '!node_modules' && echo "FAIL: tab type references remain" || echo "PASS"

# No tabId (should be sessionId now)
rg '\btabId\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: tabId references remain" || echo "PASS"

# No tab-strip.tsx file
ls foundry/packages/frontend/src/components/mock-layout/tab-strip.tsx 2>/dev/null && echo "FAIL: tab-strip.tsx not renamed" || echo "PASS"

# No closeTab/addTab (should be closeSession/addSession)
rg '\bcloseTab\b|\baddTab\b' foundry/packages/ --glob '!node_modules' && echo "FAIL: closeTab/addTab references remain" || echo "PASS"

# --- interest → subscription ---
# No InterestManager, useInterest, etc.
rg -l 'InterestManager|useInterest|DebugInterestTopic' foundry/packages/ --glob '!node_modules' && echo "FAIL: interest type references remain" || echo "PASS"

# No interest/ directory
ls foundry/packages/client/src/interest/ 2>/dev/null && echo "FAIL: interest directory not renamed" || echo "PASS"

# --- ProviderId → SandboxProviderId ---
# No bare ProviderId/ProviderIdSchema (but allow sandboxProviderId, model.provider, auth provider_id)
rg '\bProviderIdSchema\b|\bProviderId\b' foundry/packages/shared/src/contracts.ts && echo "FAIL: bare ProviderId in contracts.ts" || echo "PASS"

# No bare providerId for sandbox context (check task schema)
rg '\bproviderId\b' foundry/packages/backend/src/actors/task/db/schema.ts && echo "FAIL: bare providerId in task schema" || echo "PASS"

# No providerProfiles table (dead code, should be deleted)
rg 'providerProfiles|provider_profiles|refreshProviderProfiles' foundry/packages/ --glob '!node_modules' --glob '!*.md' && echo "FAIL: providerProfiles references remain" || echo "PASS"

# --- Verify new names exist ---
# `head` exits 0 even on empty input, so `grep .` is appended: it echoes the
# listed files and fails when there are none, which is what triggers the WARN.
rg -l 'OrganizationActor|OrganizationEvent|OrganizationId' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new organization names not found"
rg -l 'RepositoryActor|RepositoryInput|RepositorySection' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new repository names not found"
rg -l 'SubscriptionManager|useSubscription' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new subscription names not found"
rg -l 'SandboxProviderIdSchema|SandboxProviderId' foundry/packages/ --glob '!node_modules' | head -3 | grep . || echo "WARN: new sandbox provider names not found"
```

All checks must pass. False positives from markdown files, comments referencing old names in migration context, or `node_modules` should be excluded via the globs above.
37
sdks/CLAUDE.md
Normal file
@ -0,0 +1,37 @@
# SDK Instructions

## TypeScript SDK Architecture

- TypeScript clients are split into:
  - `acp-http-client`: protocol-pure ACP-over-HTTP (`/v1/acp`) with no Sandbox-specific HTTP helpers.
  - `sandbox-agent`: `SandboxAgent` SDK wrapper that combines ACP session operations with Sandbox control-plane and filesystem helpers.
- `SandboxAgent` entry points are `SandboxAgent.connect(...)` and `SandboxAgent.start(...)`.
- Stable Sandbox session methods are `createSession`, `resumeSession`, `resumeOrCreateSession`, `destroySession`, `rawSendSessionMethod`, `onSessionEvent`, `setSessionMode`, `setSessionModel`, `setSessionThoughtLevel`, `setSessionConfigOption`, `getSessionConfigOptions`, `getSessionModes`, `respondPermission`, `rawRespondPermission`, and `onPermissionRequest`.
- `Session` helpers are `prompt(...)`, `rawSend(...)`, `onEvent(...)`, `setMode(...)`, `setModel(...)`, `setThoughtLevel(...)`, `setConfigOption(...)`, `getConfigOptions()`, `getModes()`, `respondPermission(...)`, `rawRespondPermission(...)`, and `onPermissionRequest(...)`.
- Cleanup is `sdk.dispose()`.

### React Component Methodology

- Shared React UI belongs in `sdks/react` only when it is reusable outside the Inspector.
- If the same UI pattern is shared between the Sandbox Agent Inspector and Foundry, prefer extracting it into `sdks/react` instead of maintaining parallel implementations.
- Keep shared components unstyled by default: behavior in the package, styling in the consumer via `className`, slot-level `classNames`, render overrides, and `data-*` hooks.
- Prefer extracting reusable pieces such as transcript, composer, and conversation surfaces. Keep Inspector-specific shells such as session selection, session headers, and control-plane actions in `frontend/packages/inspector/`.
- Document all shared React components in `docs/react-components.mdx`, and keep that page aligned with the exported surface in `sdks/react/src/index.ts`.

### TypeScript SDK Naming Conventions

- Use `respond<Thing>(id, reply)` for SDK methods that reply to an agent-initiated request (e.g. `respondPermission`). This is the standard pattern for answering any inbound JSON-RPC request from the agent.
- Prefix raw/low-level escape hatches with `raw` (e.g. `rawRespondPermission`, `rawSend`). These accept protocol-level types directly and bypass SDK abstractions.

### Docs Source Of Truth

- For TypeScript docs/examples, the source of truth is the implementation in:
  - `sdks/typescript/src/client.ts`
  - `sdks/typescript/src/index.ts`
  - `sdks/acp-http-client/src/index.ts`
- Do not document TypeScript APIs unless they are exported and implemented in those files.

## Tests

- TypeScript SDK tests should run against a real running server/runtime over real `/v1` HTTP APIs, typically using the real `mock` agent for deterministic behavior.
- Do not use Vitest fetch/transport mocks to simulate server functionality in TypeScript SDK tests.
@ -1,18 +1,47 @@
# Server Instructions

## Architecture
## ACP v1 Baseline

- Public API routes are defined in `server/packages/sandbox-agent/src/router.rs`.
- ACP proxy runtime is in `server/packages/sandbox-agent/src/acp_proxy_runtime.rs`.
- All API endpoints are under `/v1`.
- Keep binary filesystem transfer endpoints as dedicated HTTP APIs:
- v1 is ACP-native.
- `/v1/*` is removed and returns `410 Gone` (`application/problem+json`).
- `/opencode/*` is disabled during ACP core phases and returns `503`.
- Prompt/session traffic is ACP JSON-RPC over streamable HTTP on `/v1/rpc`:
  - `POST /v1/rpc`
  - `GET /v1/rpc` (SSE)
  - `DELETE /v1/rpc`
- Control-plane endpoints:
  - `GET /v1/health`
  - `GET /v1/agents`
  - `POST /v1/agents/{agent}/install`
- Binary filesystem transfer endpoints (intentionally HTTP, not ACP extension methods):
  - `GET /v1/fs/file`
  - `PUT /v1/fs/file`
  - `POST /v1/fs/upload-batch`
  - Rationale: host-owned cross-agent-consistent behavior and large binary transfer needs that ACP JSON-RPC is not suited to stream efficiently.
  - Maintain ACP variants in parallel only when they share the same underlying filesystem implementation; SDK defaults should still prefer HTTP for large/binary transfers.
- `/opencode/*` stays disabled (`503`) until Phase 7.
- Agent install logic (native + ACP agent process + lazy install) is handled by `server/packages/agent-management/`.
- Sandbox Agent ACP extension method naming:
  - Custom ACP methods use `_sandboxagent/...` (not `_sandboxagent/v1/...`).
  - Session detach method is `_sandboxagent/session/detach`.

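The control-plane endpoints listed above can be exercised with plain `curl`. The following is a minimal sketch under stated assumptions rather than anything taken from the repository: the `SANDBOX_AGENT_URL` variable, its default address, and the helper names are placeholders, and whether the install endpoint expects a request body is not stated here, so none is sent.

```bash
#!/usr/bin/env bash
# Assumption: the caller supplies the server address. The variable name and
# the default below are placeholders, not values documented by the project.
BASE_URL="${SANDBOX_AGENT_URL:-http://127.0.0.1:8080}"

# Build a full URL for a path under /v1 (hypothetical helper).
v1_url() {
  printf '%s/v1/%s\n' "${BASE_URL%/}" "${1#/}"
}

# GET /v1/health; -f makes curl exit non-zero on HTTP error statuses.
health() {
  curl -fsS "$(v1_url health)"
}

# GET /v1/agents
list_agents() {
  curl -fsS "$(v1_url agents)"
}

# POST /v1/agents/{agent}/install; the agent name is passed by the caller.
install_agent() {
  curl -fsS -X POST "$(v1_url "agents/$1/install")"
}
```

Keeping the URL construction in one helper means a change of host or prefix touches a single line, and the functions stay inert until called.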
## API Scope

- ACP is the primary protocol for agent/session behavior and all functionality that talks directly to the agent.
- ACP extensions may be used for gaps (for example `skills`, `models`, and related metadata), but the default is that agent-facing behavior is implemented by the agent through ACP.
- Custom HTTP APIs are for non-agent/session platform services (for example filesystem, terminals, and other host/runtime capabilities).
- Filesystem and terminal APIs remain Sandbox Agent-specific HTTP contracts and are not ACP.
- Do not make Sandbox Agent core flows depend on ACP client implementations of `fs/*` or `terminal/*`; in practice those client-side capabilities are often incomplete or inconsistent.
- ACP-native filesystem and terminal methods are also too limited for Sandbox Agent host/runtime needs, so prefer the native HTTP APIs for richer behavior.
- Keep `GET /v1/fs/file`, `PUT /v1/fs/file`, and `POST /v1/fs/upload-batch` on HTTP:
  - These are Sandbox Agent host/runtime operations with cross-agent-consistent behavior.
  - They may involve very large binary transfers that ACP JSON-RPC envelopes are not suited to stream.
  - This is intentionally separate from ACP native `fs/read_text_file` and `fs/write_text_file`.
  - ACP extension variants may exist in parallel, but SDK defaults should prefer HTTP for these binary transfer operations.

## Architecture

- HTTP contract and problem/error mapping: `server/packages/sandbox-agent/src/router.rs`
- ACP proxy runtime: `server/packages/sandbox-agent/src/acp_proxy_runtime.rs`
- ACP client runtime and agent process bridge: `server/packages/sandbox-agent/src/acp_runtime/mod.rs`
- Agent install logic (native + ACP agent process + lazy install): `server/packages/agent-management/`
- Inspector UI served at `/ui/` and bound to ACP over HTTP from `frontend/packages/inspector/`

## API Contract Rules

@ -21,6 +50,24 @@
- Regenerate `docs/openapi.json` after endpoint contract changes.
- Keep CLI and HTTP endpoint behavior aligned (`docs/cli.mdx`).

## ACP Protocol Compliance

- Before adding any new ACP method, property, or config option category to the SDK, verify it exists in the ACP spec at `https://agentclientprotocol.com/llms-full.txt`.
- Valid `SessionConfigOptionCategory` values are: `mode`, `model`, `thought_level`, `other`, or custom categories prefixed with `_` (e.g. `_permission_mode`).
- Do not invent ACP properties or categories (e.g. `permission_mode` is not a valid ACP category; use `_permission_mode` if it's a custom extension, or use existing ACP mechanisms like `session/set_mode`).
- `NewSessionRequest` only has `_meta`, `cwd`, and `mcpServers`. Do not add non-ACP fields to it.
- Sandbox Agent SDK abstractions (like `SessionCreateRequest`) may add convenience properties, but must clearly map to real ACP methods internally and not send fabricated fields over the wire.

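The category rule above is mechanical enough to check in a script, for example as a quick lint before review. This is a sketch only: `is_valid_config_category` is a hypothetical helper, the accepted names are copied from the rule as written here rather than from the ACP schema, and treating a bare `_` as invalid is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: exits 0 when the argument is an allowed
# SessionConfigOptionCategory according to the rule stated above.
is_valid_config_category() {
  case "$1" in
    mode|model|thought_level|other) return 0 ;;  # categories named in the rule
    _?*) return 0 ;;  # custom category: "_" followed by at least one character
    *) return 1 ;;    # anything else, including a bare "_" (assumption)
  esac
}
```

For instance, `is_valid_config_category _permission_mode` succeeds while `is_valid_config_category permission_mode` fails, mirroring the example in the rule. The spec remains the authority; the list here has to be updated by hand if the spec changes.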
## Source Documents

- ACP protocol specification (full LLM-readable reference): `https://agentclientprotocol.com/llms-full.txt`
- `~/misc/acp-docs/schema/schema.json`
- `~/misc/acp-docs/schema/meta.json`
- `research/acp/spec.md`
- `research/acp/v1-schema-to-acp-mapping.md`
- `research/acp/friction.md`
- `research/acp/todo.md`

## Tests

Primary v1 integration coverage:
@ -38,3 +85,9 @@ cargo test -p sandbox-agent --test v1_agent_process_matrix
- Keep `research/acp/spec.md` as the source spec.
- Update `research/acp/todo.md` when scope/status changes.
- Log blockers/decisions in `research/acp/friction.md`.

## Docker Examples (Dev Testing)

- When manually testing bleeding-edge (unreleased) versions of sandbox-agent in `examples/`, use `SANDBOX_AGENT_DEV=1` with the Docker-based examples.
- This triggers a local build of `docker/runtime/Dockerfile.full` which builds the server binary from local source and packages it into the Docker image.
- Example: `SANDBOX_AGENT_DEV=1 pnpm --filter @sandbox-agent/example-mcp start`