- proposal-revert-actions-to-queues.md: Detailed plan for reverting the actions-only pattern back to queues/workflows now that the RivetKit queue.iter() bug is fixed. Lists what to keep (lazy tasks, resolveTaskRepoId, sync override threading, E2B fixes, frontend fixes) vs what to revert (communication pattern only). - proposal-rivetkit-sandbox-resilience.md: Rivetkit sandbox actor changes for handling destroyed/paused sandboxes, keep-alive, and the UNIQUE constraint crash fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8.5 KiB
Proposal: Revert Actions-Only Pattern Back to Queues/Workflows
Background
We converted all actors from queue/workflow-based communication to direct actions as a workaround for a RivetKit bug where c.queue.iter() deadlocked for actors created from another actor's context. That bug has since been fixed in RivetKit. We want to revert to queues/workflows because they provide better observability (workflow history in the inspector), replay/recovery semantics, and are the idiomatic RivetKit pattern.
Reference branches
mainat commit32f3c6c3— the original queue/workflow code BEFORE the actions refactorqueues-to-actions— the current refactored code using direct actionstask-owner-git-authat commitf45a4674— the merged PR #262 that introduced the actions pattern
Use main as the reference for the queue/workflow communication patterns. Use queues-to-actions as the reference for bug fixes and new features that MUST be preserved.
What to KEEP (do NOT revert these)
These are bug fixes and improvements made during the actions refactor that are independent of the communication pattern:
1. Lazy task actor creation
- Virtual task entries in org's
taskIndex+taskSummariestables (no actor fan-out during PR sync) refreshTaskSummaryForBranchMutationwrites directly to org tables instead of spawning task actors- Task actors self-initialize in
getCurrentRecord()fromgetTaskIndexEntrywhen lazily created getTaskIndexEntryaction on org actor- See CLAUDE.md "Lazy Task Actor Creation" section
2. resolveTaskRepoId replacing requireRepoExists
requireRepoExistswas removed — it did a cross-actor call from org to github-data that was fragile- Replaced with
resolveTaskRepoIdwhich reads from the org's localtaskIndextable getTaskaction resolvesrepoIdfrom task index when not provided (sandbox actor only has taskId)
3. getOrganizationContext overrides threaded through sync phases
fullSyncBranchBatch,fullSyncMembers,fullSyncPullRequestBatchnow passconnectedAccount,installationStatus,installationIdoverrides fromFullSyncConfig- Without this, phases 2-4 fail with "Organization not initialized" when the org profile doesn't exist yet (webhook-triggered sync before user sign-in)
4. E2B sandbox fixes
timeoutMs: 60 * 60 * 1000in E2B create options (TEMPORARY until rivetkit autoPause lands)- Sandbox repo path uses
/home/user/repofor E2B compatibility listProcesseserror handling for expired E2B sandboxes
5. Frontend fixes
- React
useEffectdependency stability inmock-layout.tsxandorganization-dashboard.tsx(prevents infinite re-render loops) - Terminal pane ref handling
6. Process crash protection
process.on("uncaughtException")andprocess.on("unhandledRejection")handlers infoundry/packages/backend/src/index.ts
7. CLAUDE.md updates
- All new sections: lazy task creation rules, no-silent-catch policy, React hook dependency safety, dev workflow instructions, debugging section
8. requireWorkspaceTask uses getOrCreate
- User-initiated actions (createSession, sendMessage, etc.) use
getOrCreateto lazily materialize virtual tasks - The
getOrCreatecall passes{ organizationId, repoId, taskId }ascreateWithInput
9. getTask uses getOrCreate with resolveTaskRepoId
- When
repoIdis not provided (sandbox actor), resolves from task index - Uses
getOrCreatesince the task may be virtual
10. Audit log deleted workflow file
foundry/packages/backend/src/actors/audit-log/workflow.tswas deleted- The audit-log actor was simplified to a single
appendaction - Keep this simplification — audit-log doesn't need a workflow
What to REVERT (communication pattern only)
For each actor, revert from direct action calls back to queue sends with expectQueueResponse / fire-and-forget patterns. The reference for the queue patterns is main at 32f3c6c3.
1. Organization actor (foundry/packages/backend/src/actors/organization/)
index.ts:
- Revert from actions-only to
run: workflow(runOrganizationWorkflow) - Keep the actions that are pure reads (getAppSnapshot, getOrganizationSummarySnapshot, etc.)
- Mutations should go through the workflow queue command loop
workflow.ts:
- Restore
runOrganizationWorkflowwith thectx.loop("organization-command-loop", ...)that dispatches queue names to mutation handlers - Restore
ORGANIZATION_QUEUE_NAMESandCOMMAND_HANDLERS - Restore
organizationWorkflowQueueName()helper
app-shell.ts:
- Revert direct action calls back to queue sends:
sendOrganizationCommand(org, "organization.command.X", body)pattern - Revert
githubData.syncRepos(...)→githubData.send(githubDataWorkflowQueueName("syncRepos"), ...) - But KEEP the
getOrganizationContextoverride threading fix
actions/tasks.ts:
- Keep
resolveTaskRepoId(replacingrequireRepoExists) - Keep
requireWorkspaceTaskusinggetOrCreate - Keep
getTaskusinggetOrCreatewithresolveTaskRepoId - Keep
getTaskIndexEntry - Revert task actor calls from direct actions to queue sends where applicable
actions/task-mutations.ts:
- Keep lazy task creation (virtual entries in org tables)
- Revert
taskHandle.initialize(...)→taskHandle.send(taskWorkflowQueueName("task.command.initialize"), ...) - Revert
task.pullRequestSync(...)→task.send(taskWorkflowQueueName("task.command.pullRequestSync"), ...) - Revert
auditLog.append(...)→auditLog.send("auditLog.command.append", ...)
actions/organization.ts:
- Revert direct calls to org workflow back to queue sends
actions/github.ts:
- Revert direct calls back to queue sends
2. Task actor (foundry/packages/backend/src/actors/task/)
index.ts:
- Revert from actions-only to
run: workflow(runTaskWorkflow)(or plainrunwith queue iteration) - Keep read actions:
get,getTaskSummary,getTaskDetail,getSessionDetail
workflow/index.ts:
- Restore
taskCommandActionsas queue handlers in the workflow command loop - Restore
TASK_QUEUE_NAMESand dispatch map
workspace.ts:
- Revert sandbox/org action calls back to queue sends where they were queue-based before
3. User actor (foundry/packages/backend/src/actors/user/)
index.ts:
- Revert from actions-only to
run: workflow(runUserWorkflow)(or plain run with queue iteration)
workflow.ts:
- Restore queue command loop dispatching to mutation functions
4. GitHub-data actor (foundry/packages/backend/src/actors/github-data/)
index.ts:
- Revert from actions-only to having a run handler with queue iteration
- Keep the
getOrganizationContextoverride threading fix - Keep the
actionTimeout: 10 * 60_000for long sync operations
5. Audit-log actor
- Keep as actions-only (simplified). No need to revert — it's simpler with just
append.
6. Callers
foundry/packages/backend/src/services/better-auth.ts:
- Revert direct user actor action calls back to queue sends
foundry/packages/backend/src/actors/sandbox/index.ts:
- Revert
organization.getTask(...)→ queue send if it was queue-based before - Keep the E2B timeout fix and listProcesses error handling
Step-by-step procedure
- Create a new branch from
task-owner-git-auth(current HEAD) - For each actor, open a 3-way comparison:
main(original queues),queues-to-actions(current), and your working copy - Restore queue/workflow run handlers and command loops from
main - Restore queue name helpers and constants from
main - Restore caller sites to use queue sends from
main - Carefully preserve all items in the "KEEP" list above
- Test:
cd foundry && docker compose -f compose.dev.yaml up -d, sign in, verify GitHub sync completes, verify tasks show in sidebar, verify session creation works - Nuke RivetKit data between test runs:
docker volume rm foundry_foundry_rivetkit_storage
Verification checklist
- GitHub sync completes (160 repos for rivet-dev)
- Tasks show in sidebar (from PR sync, lazy/virtual entries)
- No task actors spawned during sync (check RivetKit inspector — should see 0 task actors until user clicks one)
- Clicking a task materializes the actor (lazy creation via getOrCreate)
- Session creation works on sandbox-agent-testing repo
- E2B sandbox provisions and connects
- Agent responds to messages
- No 500 errors in backend logs (except expected E2B sandbox expiry)
- Workflow history visible in RivetKit inspector for org, task, user actors
- CLAUDE.md constraints still documented and respected