* feat(foundry): checkpoint actor and workspace refactor
* docs(foundry): add agent handoff context
* wip(foundry): continue actor refactor
* wip(foundry): capture remaining local changes
* Complete Foundry refactor checklist
* Fix Foundry validation fallout
* wip
* wip: convert all actors from workflow to plain run handlers
Workaround for RivetKit bug where c.queue.iter() never yields messages
for actors created via getOrCreate from another actor's context. The
queue accepts messages (visible in inspector) but the iterator hangs.
Sleep/wake fixes it, but actors with active connections never sleep.
Converted organization, github-data, task, and user actors from
run: workflow(...) to plain run: async (c) => { for await ... }.
Also fixes:
- Missing auth tables in org migration (auth_verification etc)
- default_model NOT NULL constraint on org profile upsert
- Nested workflow step in github-data (HistoryDivergedError)
- Removed --force from frontend Dockerfile pnpm install
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Convert all actors from queues/workflows to direct actions, lazy task creation
Major refactor replacing all queue-based workflow communication with direct
RivetKit action calls across all actors. This works around a RivetKit bug
where c.queue.iter() deadlocks for actors created from another actor's context.
Key changes:
- All actors (organization, task, user, audit-log, github-data) converted
from run: workflow(...) to actions-only (no run handler, no queues)
- PR sync creates virtual task entries in org local DB instead of spawning
task actors — prevents OOM from 200+ actors created simultaneously
- Task actors created lazily on first user interaction via getOrCreate,
self-initialize from org's getTaskIndexEntry data
- Removed requireRepoExists cross-actor call (caused 500s), replaced with
local resolveTaskRepoId from org's taskIndex table
- Fixed getOrganizationContext to thread overrides through all sync phases
- Fixed sandbox repo path (/home/user/repo for E2B compatibility)
- Fixed buildSessionDetail to skip transcript fetch for pending sessions
- Added process crash protection (uncaughtException/unhandledRejection)
- Fixed React infinite render loop in mock-layout useEffect dependencies
- Added sandbox listProcesses error handling for expired E2B sandboxes
- Set E2B sandbox timeout to 1 hour (was 5 min default)
- Updated CLAUDE.md with lazy task creation rules, no-silent-catch policy,
React hook dependency safety rules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix E2B sandbox timeout comment, frontend stability, and create-flow improvements
- Add TEMPORARY comment on E2B timeoutMs with pointer to rivetkit sandbox
resilience proposal for when autoPause lands
- Fix React useEffect dependency stability in mock-layout and
organization-dashboard to prevent infinite re-render loops
- Fix terminal-pane ref handling
- Improve create-flow service and tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Foundry Agent Handoff
Baseline
- Repo: rivet-dev/sandbox-agent
- Branch: columbus-v2
- Last pushed commit: 3174fe73 (feat(foundry): checkpoint actor and workspace refactor)
- Progress/spec tracker: FOUNDRY-CHANGES.md
What is already landed
These spec slices are already implemented and pushed:
- Item 1: backend actor rename auth-user -> user
- Item 2: Better Auth mapping comments
- Item 5: task raw SQL cleanup into migrations
- Item 6: history -> audit-log
- Item 7: default model moved to user-scoped app state
- Item 20: admin action prefixing
- Item 23: dead getTaskEnriched / enrichTaskRecord removal
- Item 25: Workbench -> Workspace rename across backend/shared/client/frontend
- Item 26: branch rename deleted
- Organization realtime was already collapsed to full-snapshot organizationUpdated
- Task realtime was already aligned to taskUpdated
Known blocker
Spec item 3 is only partially done. The singleton constraint for the Better Auth user table is still blocked.
- File: foundry/packages/backend/src/actors/user/db/schema.ts
- Reason: Better Auth still depends on external string user.id, so a literal singleton CHECK (id = 1) on that table is not a safe mechanical change.
Important current state
There are uncommitted edits on top of the pushed checkpoint. Another agent should start from the current worktree, not just origin/columbus-v2.
Current dirty files:
- foundry/packages/backend/src/actors/github-data/index.ts
- foundry/packages/backend/src/actors/organization/actions.ts
- foundry/packages/backend/src/actors/repository/actions.ts
- foundry/packages/backend/src/actors/task/workspace.ts
- foundry/packages/client/src/mock/backend-client.ts
These files are the current hot path for the unfinished structural work.
What is partially in place but not finished
User-owned task UI state
The user actor already has the schema and CRUD surface for per-user task/session UI state:
- foundry/packages/backend/src/actors/user/db/schema.ts: user_task_state
- foundry/packages/backend/src/actors/user/index.ts: getTaskState, upsertTaskState, deleteTaskState
But the task actor and UI are still reading/writing the old task-global fields:
- foundry/packages/backend/src/actors/task/db/schema.ts still contains task_runtime.active_session_id and session unread / draft_*
- foundry/packages/backend/src/actors/task/workspace.ts still derives unread/draft/active-session from task-local rows
- foundry/packages/frontend/src/components/mock-layout.tsx still treats activeSessionId as frontend-local and uses task-level unread/draft state
So items 21, 22, 24, and part of 19 are only half-done.
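The target shape, per-user rather than task-global UI state, can be sketched with an in-memory stand-in. This is only an illustration: the method names getTaskState, upsertTaskState, and deleteTaskState come from the user actor, but the row fields (activeSessionId, unread, draftText) and the UserTaskStateStore class are assumptions, not the real schema or actor API.

```typescript
// Hypothetical row shape for user_task_state; the real columns may differ.
interface UserTaskState {
  taskId: string;
  activeSessionId: string | null;
  unread: boolean;
  draftText: string;
}

// In-memory stand-in for one user actor's user_task_state table.
class UserTaskStateStore {
  private rows = new Map<string, UserTaskState>();

  getTaskState(taskId: string): UserTaskState | undefined {
    return this.rows.get(taskId);
  }

  // Merge a partial patch into the row, creating it with defaults if absent.
  upsertTaskState(
    taskId: string,
    patch: Partial<Omit<UserTaskState, "taskId">>,
  ): UserTaskState {
    const current: UserTaskState = this.rows.get(taskId) ?? {
      taskId,
      activeSessionId: null,
      unread: false,
      draftText: "",
    };
    const next: UserTaskState = { ...current, ...patch, taskId };
    this.rows.set(taskId, next);
    return next;
  }

  deleteTaskState(taskId: string): boolean {
    return this.rows.delete(taskId);
  }
}

// Two users viewing the same task keep independent UI state.
const aliceState = new UserTaskStateStore();
const bobState = new UserTaskStateStore();
aliceState.upsertTaskState("task-1", { activeSessionId: "session-a", draftText: "wip" });
bobState.upsertTaskState("task-1", { unread: true });
```

With this split, the task actor no longer needs active_session_id or per-session unread/draft columns; readers ask the user actor instead.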
Coordinator ownership
The current architecture still violates the intended coordinator pattern:
- Organization still owns taskLookup and taskSummaries
- Organization still resolves taskId -> repoId
- Task still pushes summary updates to organization instead of repository
- Repository still does not own a tasks projection table yet
So items 9, 13, and 15 are still open.
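A rough sketch of the intended ownership, assuming the coordinator pattern means each repository owns the projection of its own tasks and the organization only knows its repositories. All class, method, and field names here are hypothetical and do not mirror the actual actor definitions.

```typescript
// Hypothetical summary shape; field names are illustrative only.
interface TaskSummary {
  taskId: string;
  repoId: string;
  title: string;
}

// Repository coordinator: owns the tasks projection for its own repo.
class RepositoryCoordinator {
  readonly repoId: string;
  private tasks = new Map<string, TaskSummary>();

  constructor(repoId: string) {
    this.repoId = repoId;
  }

  // Called by a task when its summary changes.
  applyTaskSummary(summary: TaskSummary): void {
    if (summary.repoId !== this.repoId) {
      throw new Error(`task ${summary.taskId} does not belong to repo ${this.repoId}`);
    }
    this.tasks.set(summary.taskId, summary);
  }

  listTasks(): TaskSummary[] {
    return Array.from(this.tasks.values());
  }
}

// Organization coordinator: tracks repositories, not individual tasks.
class OrganizationCoordinator {
  private repos = new Map<string, RepositoryCoordinator>();

  getOrCreateRepository(repoId: string): RepositoryCoordinator {
    let repo = this.repos.get(repoId);
    if (repo === undefined) {
      repo = new RepositoryCoordinator(repoId);
      this.repos.set(repoId, repo);
    }
    return repo;
  }

  // Aggregate view built by asking each repository for its projection.
  listAllTasks(): TaskSummary[] {
    const all: TaskSummary[] = [];
    for (const repo of Array.from(this.repos.values())) {
      all.push(...repo.listTasks());
    }
    return all;
  }
}

const orgSketch = new OrganizationCoordinator();
orgSketch
  .getOrCreateRepository("repo-a")
  .applyTaskSummary({ taskId: "task-1", repoId: "repo-a", title: "Fix sync" });
```

In this arrangement the task needs to know only its repoId, and the organization never has to resolve taskId -> repoId itself.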
Queue-only mutations
Task actor workspace commands already go through queue sends. Other actors still do not fully follow the queue-only mutation rule:
- foundry/packages/backend/src/actors/user/index.ts
- foundry/packages/backend/src/actors/github-data/index.ts
- foundry/packages/backend/src/actors/organization/actions.ts
- foundry/packages/backend/src/actors/organization/app-shell.ts
So items 4, 10, and 11 are still open.
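One reading of the queue-only mutation rule is that callers never change actor state directly: they enqueue a command, and a single handler applies commands in order. The sketch below illustrates that idea in plain TypeScript. It is not RivetKit's queue API; QueueOnlyTask, TaskCommand, and the state fields are hypothetical.

```typescript
// Hypothetical command and state shapes for illustration.
type TaskCommand =
  | { type: "rename"; title: string }
  | { type: "markDone" };

interface TaskState {
  title: string;
  done: boolean;
}

class QueueOnlyTask {
  private pending: TaskCommand[] = [];
  private state: TaskState = { title: "untitled", done: false };

  // The only way for callers to request a mutation.
  send(command: TaskCommand): void {
    this.pending.push(command);
  }

  // Single place where state changes; commands apply in FIFO order.
  drain(): number {
    let applied = 0;
    let command = this.pending.shift();
    while (command !== undefined) {
      this.apply(command);
      applied++;
      command = this.pending.shift();
    }
    return applied;
  }

  private apply(command: TaskCommand): void {
    switch (command.type) {
      case "rename":
        this.state = { ...this.state, title: command.title };
        break;
      case "markDone":
        this.state = { ...this.state, done: true };
        break;
    }
  }

  // Reads return a copy so callers cannot mutate state by reference.
  getSnapshot(): TaskState {
    return { ...this.state };
  }
}

const queuedTask = new QueueOnlyTask();
queuedTask.send({ type: "rename", title: "Chunk GitHub sync" });
queuedTask.send({ type: "markDone" });
```

Until drain() runs, getSnapshot() still returns the old state, which is the property that keeps mutations serialized.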
Dynamic model/agent data
The frontend/client still hardcode model groups:
- foundry/packages/frontend/src/components/mock-layout/view-model.ts
- foundry/packages/client/src/workspace-model.ts
- foundry/packages/shared/src/workspace.ts: WorkspaceModelId is still a hardcoded union
The repo already has the API source of truth available through the TypeScript SDK:
- sdks/typescript/src/client.ts: SandboxAgent.listAgents({ config: true })
- server/packages/sandbox-agent/src/router.rs: /v1/agents
- server/packages/sandbox-agent/src/router/support.rs: fallback_config_options
So item 8 is still open.
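As a sketch of what replacing the hardcoded groups could look like, the function below derives model groups from an agent list. The input shape (AgentInfoLike with an optional models array) is an assumption made for illustration; the real payload returned by SandboxAgent.listAgents({ config: true }) should be checked in sdks/typescript/src/client.ts before relying on any field name here.

```typescript
// ASSUMED response shape; the real listAgents payload may differ.
interface AgentInfoLike {
  id: string;
  models?: { id: string; name?: string }[];
}

// Hypothetical view-model shape for a model picker.
interface ModelGroup {
  agentId: string;
  models: { id: string; label: string }[];
}

// Build picker groups from API data instead of a hardcoded union.
function buildModelGroups(agents: AgentInfoLike[]): ModelGroup[] {
  return agents
    .filter((agent) => (agent.models?.length ?? 0) > 0)
    .map((agent) => ({
      agentId: agent.id,
      // Fall back to the model id when no display name is provided.
      models: (agent.models ?? []).map((m) => ({ id: m.id, label: m.name ?? m.id })),
    }));
}

const sampleGroups = buildModelGroups([
  { id: "agent-a", models: [{ id: "model-1", name: "Model One" }, { id: "model-2" }] },
  { id: "agent-b" }, // no models reported, so no group is emitted
]);
```

Under this approach WorkspaceModelId would likely become a plain string validated against the fetched list rather than a compile-time union.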
GitHub sync chunking/progress
GitHub data sync is still a delete-and-replace flow:
- foundry/packages/backend/src/actors/github-data/index.ts: replaceRepositories, replaceBranches, replaceMembers, replacePullRequests, and full-sync flow
- foundry/packages/backend/src/actors/github-data/db/schema.ts: no generation/progress columns yet
- foundry/packages/shared/src/app-shell.ts: no structured sync progress field yet
So item 16 is still open.
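One common alternative to delete-and-replace is a generation-stamped upsert: write incoming rows in chunks tagged with a new generation number, report progress after each chunk, and only then sweep rows left over from older generations. The sketch below shows that technique with an in-memory table. GenerationTable, syncInChunks, and SyncProgress are hypothetical names, not part of the current github-data actor.

```typescript
// Hypothetical progress record that a snapshot could expose to the UI.
interface SyncProgress {
  generation: number;
  processed: number;
  total: number;
  done: boolean;
}

// In-memory stand-in for a table with a generation column.
class GenerationTable<T extends { id: string }> {
  private rows = new Map<string, { value: T; generation: number }>();

  upsertChunk(chunk: T[], generation: number): void {
    for (const value of chunk) {
      this.rows.set(value.id, { value, generation });
    }
  }

  // Remove rows not touched by the given generation; returns how many were removed.
  sweep(generation: number): number {
    let removed = 0;
    for (const [id, row] of Array.from(this.rows.entries())) {
      if (row.generation !== generation) {
        this.rows.delete(id);
        removed++;
      }
    }
    return removed;
  }

  ids(): string[] {
    return Array.from(this.rows.keys()).sort();
  }
}

// Upsert in chunks, yielding progress after each one; sweep stale rows last.
function* syncInChunks<T extends { id: string }>(
  table: GenerationTable<T>,
  incoming: T[],
  generation: number,
  chunkSize: number,
): Generator<SyncProgress> {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  let processed = 0;
  for (let start = 0; start < incoming.length; start += chunkSize) {
    const chunk = incoming.slice(start, start + chunkSize);
    table.upsertChunk(chunk, generation);
    processed += chunk.length;
    yield { generation, processed, total: incoming.length, done: false };
  }
  table.sweep(generation);
  yield { generation, processed, total: incoming.length, done: true };
}

const repoTable = new GenerationTable<{ id: string }>();
repoTable.upsertChunk([{ id: "a" }, { id: "b" }, { id: "c" }], 1);
const progressLog = Array.from(
  syncInChunks(repoTable, [{ id: "b" }, { id: "c" }, { id: "d" }], 2, 2),
);
```

Because the sweep runs only after the last chunk, readers never see an empty table mid-sync, and an interrupted sync leaves the previous generation's rows in place.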
Recommended next order
If another agent picks this up, this is the safest order:
- Finish items 21, 22, 24, 19 together. Reason: user-owned task UI state is already half-wired, and task schema cleanup depends on the same files.
- Finish items 9, 13, 15 together. Reason: coordinator ownership, repo-owned task projections, and PR/task unification are the same refactor seam.
- Finish item 16. Reason: GitHub sync chunking is mostly isolated to github-data plus app-shell/shared snapshot wiring.
- Finish item 8. Reason: dynamic model/agent data is largely independent once user default model is already user-scoped.
- Finish items 4, 10, 11, 12, 18, final event audit.
- Do item 17 last.
Concrete file hotspots for the next agent
Backend:
- foundry/packages/backend/src/actors/task/workspace.ts
- foundry/packages/backend/src/actors/task/db/schema.ts
- foundry/packages/backend/src/actors/task/workflow/common.ts
- foundry/packages/backend/src/actors/task/workflow/commands.ts
- foundry/packages/backend/src/actors/task/workflow/init.ts
- foundry/packages/backend/src/actors/repository/actions.ts
- foundry/packages/backend/src/actors/repository/db/schema.ts
- foundry/packages/backend/src/actors/organization/actions.ts
- foundry/packages/backend/src/actors/github-data/index.ts
- foundry/packages/backend/src/actors/user/index.ts
Shared/client/frontend:
- foundry/packages/shared/src/workspace.ts
- foundry/packages/shared/src/contracts.ts
- foundry/packages/shared/src/app-shell.ts
- foundry/packages/client/src/backend-client.ts
- foundry/packages/client/src/workspace-model.ts
- foundry/packages/frontend/src/components/mock-layout.tsx
- foundry/packages/frontend/src/components/mock-layout/view-model.ts
- foundry/packages/frontend/src/features/tasks/status.ts
Notes that matter
- The pushed checkpoint is useful, but it is not the full current state. There are uncommitted edits in the hot-path backend files listed above.
- The current tree already contains a partially added user_task_state path. Do not duplicate that work; finish the migration by removing the old task-owned fields and rewiring readers/writers.
- The current task actor still reads mutable fields from c.state such as repoRemote, branchName, title, task, sandboxProviderId, and agentType. That is part of item 19.
- The current frontend still synthesizes PR-only rows into fake tasks. That should go away as part of repo-owned task projection / PR unification.