mirror of https://github.com/harivansh-afk/sandbox-agent.git synced 2026-04-15 05:02:11 +00:00

sandbox-agent/.context/proposal-revert-actions-to-queues.md

Nathan Flurry a171956298 feat(foundry): revert actions to queue/workflow pattern with direct sends

Revert actor communication from direct action calls to queue/workflow-based
patterns for better observability (workflow history in RivetKit inspector),
replay/recovery semantics, and idiomatic RivetKit usage.

- Add queue/workflow infrastructure to all actors: organization, task, user,
  github-data, sandbox, and audit-log
- Mutations route through named queues processed by workflow command loops
  with ctx.step() wrapping for c.state/c.db access and observability
- Remove command action wrappers (~460 lines) — callers use .send() directly
  to queue names with expectQueueResponse() for wait:true results
- Keep sendPrompt and runProcess as direct sandbox actions (long-running /
  large responses that would block the workflow loop or exceed 128KB limit)
- Fix workspace fire-and-forget calls (enqueueWorkspaceEnsureSession,
  enqueueWorkspaceRefresh) to self-send to task queue instead of calling
  directly outside workflow step context

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-16 18:46:53 -07:00

Proposal: Revert Actions-Only Pattern Back to Queues/Workflows

Background

We converted all actors from queue/workflow-based communication to direct actions as a workaround for a RivetKit bug where c.queue.iter() deadlocked for actors created from another actor's context. That bug has since been fixed in RivetKit. We want to revert to queues/workflows because they provide better observability (workflow history in the inspector) and replay/recovery semantics, and because they are the idiomatic RivetKit pattern.

Reference branches

main at commit 32f3c6c3 — the original queue/workflow code BEFORE the actions refactor
queues-to-actions — the actions refactor code with bug fixes (E2B, lazy tasks, etc.)
task-owner-git-auth at commit 3684e2e5 — the CURRENT branch with all work including task owner system, lazy tasks, and actions refactor

Use main as the reference for the queue/workflow communication patterns. Use task-owner-git-auth (current HEAD) as the authoritative source for ALL features and bug fixes that MUST be preserved — it has everything from queues-to-actions plus the task owner system.

What to KEEP (do NOT revert these)

These are bug fixes and improvements made during the actions refactor that are independent of the communication pattern:

1. Lazy task actor creation

Virtual task entries in org's taskIndex + taskSummaries tables (no actor fan-out during PR sync)
refreshTaskSummaryForBranchMutation writes directly to org tables instead of spawning task actors
Task actors self-initialize in getCurrentRecord() from getTaskIndexEntry when lazily created
getTaskIndexEntry action on org actor
See CLAUDE.md "Lazy Task Actor Creation" section

2. `resolveTaskRepoId` replacing `requireRepoExists`

requireRepoExists was removed — it did a cross-actor call from org to github-data that was fragile
Replaced with resolveTaskRepoId which reads from the org's local taskIndex table
getTask action resolves repoId from task index when not provided (sandbox actor only has taskId)
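
To make the local lookup concrete, here is a minimal sketch of resolving a repoId from an in-memory stand-in for the org's taskIndex table. The names TaskIndexEntrySketch and resolveTaskRepoIdSketch, the Map-based table, and the error message are all assumptions for illustration; the real helper reads from the actor's database and may differ.

```typescript
// Hypothetical stand-in for one row of the org actor's local taskIndex table.
interface TaskIndexEntrySketch {
  taskId: string;
  repoId: string;
}

// Resolve a task's repoId from local data only; no cross-actor call is made.
function resolveTaskRepoIdSketch(
  taskIndex: ReadonlyMap<string, TaskIndexEntrySketch>,
  taskId: string,
  repoId?: string,
): string {
  // A caller-supplied repoId is used as-is (the sandbox actor only has taskId).
  if (repoId !== undefined) {
    return repoId;
  }
  const entry = taskIndex.get(taskId);
  if (entry === undefined) {
    // Assumed failure mode: an unknown task is reported instead of guessed.
    throw new Error(`Task not found in task index: ${taskId}`);
  }
  return entry.repoId;
}
```

Because the lookup touches only the org's own table, it avoids the org to github-data round trip that made requireRepoExists fragile.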

3. `getOrganizationContext` overrides threaded through sync phases

fullSyncBranchBatch, fullSyncMembers, fullSyncPullRequestBatch now pass connectedAccount, installationStatus, installationId overrides from FullSyncConfig
Without this, phases 2-4 fail with "Organization not initialized" when the org profile doesn't exist yet (webhook-triggered sync before user sign-in)
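
The effect of threading the overrides can be sketched as follows. This is an illustration under assumptions, not the actual getOrganizationContext: the function name, the field types, the nullable profile argument, and the rule that overrides take precedence over the stored profile are all hypothetical.

```typescript
// Hypothetical shape of the values a sync phase needs.
interface OrganizationContextSketch {
  connectedAccount: string;
  installationStatus: string;
  installationId: number; // the real type is an assumption
}

// Build the context from overrides first, then fall back to the stored profile.
function getOrganizationContextSketch(
  profile: OrganizationContextSketch | null,
  overrides: Partial<OrganizationContextSketch> = {},
): OrganizationContextSketch {
  const connectedAccount = overrides.connectedAccount ?? profile?.connectedAccount;
  const installationStatus = overrides.installationStatus ?? profile?.installationStatus;
  const installationId = overrides.installationId ?? profile?.installationId;
  if (
    connectedAccount === undefined ||
    installationStatus === undefined ||
    installationId === undefined
  ) {
    // No profile and no overrides: the failure seen before the fix.
    throw new Error("Organization not initialized");
  }
  return { connectedAccount, installationStatus, installationId };
}
```

With a null profile (webhook-triggered sync before user sign-in), the call succeeds only when every field arrives through the overrides, which is why each phase has to pass them along.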

4. E2B sandbox fixes

timeoutMs: 60 * 60 * 1000 in E2B create options (TEMPORARY until rivetkit autoPause lands)
Sandbox repo path uses /home/user/repo for E2B compatibility
listProcesses error handling for expired E2B sandboxes

5. Frontend fixes

React useEffect dependency stability in mock-layout.tsx and organization-dashboard.tsx (prevents infinite re-render loops)
Terminal pane ref handling

6. Process crash protection

process.on("uncaughtException") and process.on("unhandledRejection") handlers in foundry/packages/backend/src/index.ts
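
A minimal sketch of such handlers is shown below. It takes the process object as a parameter so the example stays self-contained; ProcessLikeSketch, installCrashGuardsSketch, and the log-only policy are assumptions, and the handlers in index.ts may do more (or less) than log.

```typescript
// Minimal shape of the event API used here; Node's global `process` exposes
// an `on` method that can be called this way.
interface ProcessLikeSketch {
  on(event: string, listener: (detail: unknown) => void): unknown;
}

// Register last-resort handlers so stray errors are logged rather than lost.
// Logging without exiting is an assumed policy, not taken from the source.
function installCrashGuardsSketch(
  proc: ProcessLikeSketch,
  log: (message: string, detail: unknown) => void,
): void {
  proc.on("uncaughtException", (error) => {
    log("uncaughtException", error);
  });
  proc.on("unhandledRejection", (reason) => {
    log("unhandledRejection", reason);
  });
}
```

In a Node backend this would be called once at startup, for example as installCrashGuardsSketch(process, console.error).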

7. CLAUDE.md updates

All new sections: lazy task creation rules, no-silent-catch policy, React hook dependency safety, dev workflow instructions, debugging section

8. `requireWorkspaceTask` uses `getOrCreate`

User-initiated actions (createSession, sendMessage, etc.) use getOrCreate to lazily materialize virtual tasks
The getOrCreate call passes { organizationId, repoId, taskId } as createWithInput
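
The lazy materialization idea can be illustrated with an in-memory registry. This is only a sketch: TaskRegistrySketch and TaskActorSketch are hypothetical, and the real getOrCreate is provided by RivetKit and creates an actor rather than a plain object.

```typescript
// Hypothetical stand-in for a task actor's creation input and identity.
interface TaskActorSketch {
  organizationId: string;
  repoId: string;
  taskId: string;
}

class TaskRegistrySketch {
  private readonly actors = new Map<string, TaskActorSketch>();

  // Return the existing entry for taskId, or create it from the given input.
  getOrCreate(input: TaskActorSketch): TaskActorSketch {
    const existing = this.actors.get(input.taskId);
    if (existing !== undefined) {
      return existing;
    }
    const created: TaskActorSketch = { ...input };
    this.actors.set(input.taskId, created);
    return created;
  }

  // Number of materialized entries; stays at zero until someone asks for one.
  get size(): number {
    return this.actors.size;
  }
}
```

Nothing exists for a virtual task until the first user-initiated call, and repeated calls for the same taskId reuse the entry created by the first one.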

9. `getTask` uses `getOrCreate` with `resolveTaskRepoId`

When repoId is not provided (sandbox actor), resolves from task index
Uses getOrCreate since the task may be virtual

10. Audit-log workflow file deleted

foundry/packages/backend/src/actors/audit-log/workflow.ts was deleted
The audit-log actor was simplified to a single append action
Keep this simplification — audit-log doesn't need a workflow

11. Task owner (primary user) system

New task_owner single-row table in task actor DB schema (foundry/packages/backend/src/actors/task/db/schema.ts) — stores primaryUserId, primaryGithubLogin, primaryGithubEmail, primaryGithubAvatarUrl
New migration in foundry/packages/backend/src/actors/task/db/migrations.ts creating the task_owner table
primaryUserLogin and primaryUserAvatarUrl columns added to org's taskSummaries table (foundry/packages/backend/src/actors/organization/db/schema.ts) + corresponding migration
readTaskOwner(), upsertTaskOwner() helpers in workspace.ts
maybeSwapTaskOwner() — called from sendWorkspaceMessage(), checks if a different user is sending and swaps owner + injects git credentials into sandbox
changeTaskOwnerManually() — called from the new changeOwner action on the task actor, updates owner without injecting credentials (credentials injected on next message from that user)
injectGitCredentials() — pushes git config user.name/email + credential store file into the sandbox via runProcess
resolveGithubIdentity() — resolves user's GitHub login/email/avatar/accessToken from their auth session
buildTaskSummary() now includes primaryUserLogin and primaryUserAvatarUrl in the summary pushed to org coordinator
New changeOwner action on task actor in workflow/index.ts
New changeWorkspaceTaskOwner action on org actor in actions/tasks.ts
New TaskWorkspaceChangeOwnerInput type in shared types (foundry/packages/shared/src/workspace.ts)
TaskSummary type extended with primaryUserLogin and primaryUserAvatarUrl

12. Task owner UI

New "Overview" tab in right sidebar (foundry/packages/frontend/src/components/mock-layout/right-sidebar.tsx) — shows current owner with avatar, click to open dropdown of org members to change owner
onChangeOwner and members props added to RightSidebar component
Primary user login shown in green in left sidebar task items (foundry/packages/frontend/src/components/mock-layout/sidebar.tsx)
changeWorkspaceTaskOwner method added to backend client and workspace client interfaces

13. Client changes for task owner

changeWorkspaceTaskOwner() added to backend-client.ts and all workspace client implementations (mock, remote)
Mock workspace client implements the owner change
Subscription manager test updated for new task summary shape

What to REVERT (communication pattern only)

For each actor, revert from direct action calls back to queue sends with expectQueueResponse / fire-and-forget patterns. The reference for the queue patterns is main at 32f3c6c3.

1. Organization actor (`foundry/packages/backend/src/actors/organization/`)

index.ts:

Revert from actions-only to run: workflow(runOrganizationWorkflow)
Keep the actions that are pure reads (getAppSnapshot, getOrganizationSummarySnapshot, etc.)
Mutations should go through the workflow queue command loop

workflow.ts:

Restore runOrganizationWorkflow with the ctx.loop("organization-command-loop", ...) that dispatches queue names to mutation handlers
Restore ORGANIZATION_QUEUE_NAMES and COMMAND_HANDLERS
Restore organizationWorkflowQueueName() helper
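
The shape of a command loop that dispatches queue names to mutation handlers can be sketched in plain TypeScript. This does not use the RivetKit API (no ctx.loop or queue iteration); the state shape, the two command names, and every identifier ending in Sketch are hypothetical, chosen only to follow the "organization.command.X" naming pattern.

```typescript
// Hypothetical organization state, for illustration only.
interface OrgStateSketch {
  name: string;
  members: string[];
}

interface QueueMessageSketch {
  name: string; // queue name, e.g. "organization.command.X"
  body: Record<string, string>;
}

type CommandHandlerSketch = (state: OrgStateSketch, body: Record<string, string>) => string;

// Dispatch map from queue name to mutation handler.
const COMMAND_HANDLERS_SKETCH = new Map<string, CommandHandlerSketch>([
  [
    "organization.command.rename",
    (state, body) => {
      state.name = body.name;
      return `renamed:${state.name}`;
    },
  ],
  [
    "organization.command.addMember",
    (state, body) => {
      state.members.push(body.login);
      return `members:${state.members.length}`;
    },
  ],
]);

// Process messages strictly in order, one at a time, collecting responses.
function runCommandLoopSketch(state: OrgStateSketch, queue: QueueMessageSketch[]): string[] {
  const responses: string[] = [];
  for (const message of queue) {
    const handler = COMMAND_HANDLERS_SKETCH.get(message.name);
    if (handler === undefined) {
      throw new Error(`No handler for queue: ${message.name}`);
    }
    responses.push(handler(state, message.body));
  }
  return responses;
}
```

Keeping every mutation behind a single dispatch map means each state change passes through one place, which is where the real workflow records history for the inspector.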

app-shell.ts:

Revert direct action calls back to queue sends: sendOrganizationCommand(org, "organization.command.X", body) pattern
Revert githubData.syncRepos(...) → githubData.send(githubDataWorkflowQueueName("syncRepos"), ...)
But KEEP the getOrganizationContext override threading fix

actions/tasks.ts:

Keep resolveTaskRepoId (replacing requireRepoExists)
Keep requireWorkspaceTask using getOrCreate
Keep getTask using getOrCreate with resolveTaskRepoId
Keep getTaskIndexEntry
Keep changeWorkspaceTaskOwner (new action — delegates to task actor's changeOwner)
Revert task actor calls from direct actions to queue sends where applicable

actions/task-mutations.ts:

Keep lazy task creation (virtual entries in org tables)
Revert taskHandle.initialize(...) → taskHandle.send(taskWorkflowQueueName("task.command.initialize"), ...)
Revert task.pullRequestSync(...) → task.send(taskWorkflowQueueName("task.command.pullRequestSync"), ...)
Revert auditLog.append(...) → auditLog.send("auditLog.command.append", ...)

actions/organization.ts:

Revert direct calls to org workflow back to queue sends

actions/github.ts:

Revert direct calls back to queue sends

2. Task actor (`foundry/packages/backend/src/actors/task/`)

index.ts:

Revert from actions-only to run: workflow(runTaskWorkflow) (or plain run with queue iteration)
Keep read actions: get, getTaskSummary, getTaskDetail, getSessionDetail

workflow/index.ts:

Restore taskCommandActions as queue handlers in the workflow command loop
Restore TASK_QUEUE_NAMES and dispatch map
Add changeOwner to the queue dispatch map (new command, not in main — add as task.command.changeOwner)

workspace.ts:

Revert sandbox/org action calls back to queue sends where they were queue-based before
Keep ALL task owner code: readTaskOwner, upsertTaskOwner, maybeSwapTaskOwner, changeTaskOwnerManually, injectGitCredentials, resolveGithubIdentity
Keep the authSessionId param added to ensureSandboxRepo
Keep the maybeSwapTaskOwner call in sendWorkspaceMessage
Keep primaryUserLogin/primaryUserAvatarUrl in buildTaskSummary

3. User actor (`foundry/packages/backend/src/actors/user/`)

index.ts:

Revert from actions-only to run: workflow(runUserWorkflow) (or plain run with queue iteration)

workflow.ts:

Restore queue command loop dispatching to mutation functions

4. GitHub-data actor (`foundry/packages/backend/src/actors/github-data/`)

index.ts:

Revert from actions-only to having a run handler with queue iteration
Keep the getOrganizationContext override threading fix
Keep the actionTimeout: 10 * 60_000 for long sync operations

5. Audit-log actor

Keep as actions-only (simplified). No need to revert — it's simpler with just append.

6. Callers

foundry/packages/backend/src/services/better-auth.ts:

Revert direct user actor action calls back to queue sends

foundry/packages/backend/src/actors/sandbox/index.ts:

Revert organization.getTask(...) → queue send if it was queue-based before
Keep the E2B timeout fix and listProcesses error handling

Step-by-step procedure

Create a new branch from task-owner-git-auth (current HEAD)
For each actor, open a 3-way comparison: main (original queues), queues-to-actions (the actions refactor), and your working copy
Restore queue/workflow run handlers and command loops from main
Restore queue name helpers and constants from main
Restore caller sites to use queue sends from main
Carefully preserve all items in the "KEEP" list above
Test: cd foundry && docker compose -f compose.dev.yaml up -d, sign in, verify GitHub sync completes, verify tasks show in sidebar, verify session creation works
Nuke RivetKit data between test runs: docker volume rm foundry_foundry_rivetkit_storage

Verification checklist

GitHub sync completes (160 repos for rivet-dev)
Tasks show in sidebar (from PR sync, lazy/virtual entries)
No task actors spawned during sync (check RivetKit inspector — should see 0 task actors until user clicks one)
Clicking a task materializes the actor (lazy creation via getOrCreate)
Session creation works on sandbox-agent-testing repo
E2B sandbox provisions and connects
Agent responds to messages
No 500 errors in backend logs (except expected E2B sandbox expiry)
Workflow history visible in RivetKit inspector for org, task, user actors
CLAUDE.md constraints still documented and respected
Task owner shows in right sidebar "Overview" tab
Owner dropdown shows org members and allows switching
Sending a message as a different user swaps the owner
Primary user login shown in green on sidebar task items
Git credentials injected into sandbox on owner swap (check /home/user/.git-token exists)
