mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-15 22:03:48 +00:00
Project Instructions
Language Policy
Use TypeScript for all source code.
- Never add raw JavaScript source files (.js, .mjs, .cjs).
- Prefer .ts/.tsx for runtime code, scripts, tests, and tooling.
- If touching old JavaScript, migrate it to TypeScript instead of extending it.
Monorepo + Tooling
Use pnpm workspaces and Turborepo.
- Workspace root uses pnpm-workspace.yaml and turbo.json.
- Packages live in packages/*.
- core is renamed to shared.
- packages/cli is disabled and excluded from active workspace validation.
- Integrations and providers live under packages/backend/src/{integrations,providers}.
CLI Status
- packages/cli is fully disabled for active development.
- Do not implement new behavior in packages/cli unless explicitly requested.
- Frontend is the primary product surface; prioritize packages/frontend plus supporting packages/client and packages/backend.
- Workspace build, typecheck, and test intentionally exclude @sandbox-agent/foundry-cli.
- pnpm-workspace.yaml excludes packages/cli from workspace package resolution.
Common Commands
- Foundry is the canonical name for this product tree. Do not introduce or preserve legacy pre-Foundry naming in code, docs, commands, or runtime paths.
- Install deps: pnpm install
- Full active-workspace validation: pnpm -w typecheck, pnpm -w build, pnpm -w test
- Start the full dev stack: just foundry-dev
- Start the local production-build preview stack: just foundry-preview
- Start only the backend locally: just foundry-backend-start
- Start only the frontend locally: pnpm --filter @sandbox-agent/foundry-frontend dev
- Start the frontend against the mock workbench client: FOUNDRY_FRONTEND_CLIENT_MODE=mock pnpm --filter @sandbox-agent/foundry-frontend dev
- Stop the compose dev stack: just foundry-dev-down
- Tail compose logs: just foundry-dev-logs
- Stop the preview stack: just foundry-preview-down
- Tail preview logs: just foundry-preview-logs
Railway Logs
- Production Foundry Railway logs can be read from a linked workspace with railway logs --deployment --lines 200 or railway logs <deployment-id> --deployment --lines 200.
- Production deploys should go through git push to the deployment branch/workflow. Do not use railway up for Foundry deploys.
- If Railway logs fail because the workspace is not linked to the correct project/service/environment, run:
  railway link --project 33e3e2df-32c5-41c5-a4af-dca8654acb1d --environment cf387142-61fd-4668-8cf7-b3559e0983cb --service 91c7e450-d6d2-481a-b2a4-0a916f4160fc
- That links this directory to the sandbox-agent project, production environment, and foundry-api service.
- Production proxy chain: api.sandboxagent.dev routes through Cloudflare → Fastly/Varnish → Railway. When debugging request duplication, timeouts, or retry behavior, check headers like cf-ray, x-varnish, x-railway-edge, and cdn-loop to identify which layer is involved.
Frontend + Client Boundary
- Keep a browser-friendly GUI implementation aligned with the TUI interaction model wherever possible.
- Do not import rivetkit directly in CLI or GUI packages. RivetKit client access must stay isolated inside packages/client.
- All backend interaction (actor calls, metadata/health checks, backend HTTP endpoint access) must go through the dedicated client library in packages/client.
- Outside packages/client, do not call backend endpoints directly (for example fetch(.../v1/rivet...)), except in black-box E2E tests that intentionally exercise raw transport behavior.
- GUI state should update in realtime (no manual refresh buttons). Prefer RivetKit push reactivity and actor-driven events; do not add polling/refetch for normal product flows.
- Keep the mock workbench types and mock client in packages/shared + packages/client up to date with the frontend contract. The mock is the UI testing reference implementation while backend functionality catches up.
- Keep frontend route/state coverage current in code and tests; there is no separate page-inventory doc to maintain.
- If Foundry uses a shared component from @sandbox-agent/react, make changes in sdks/react instead of copying or forking that component into Foundry.
- When changing shared React components in sdks/react for Foundry, verify they still work in the Sandbox Agent Inspector before finishing.
- When making UI changes, verify the live flow with agent-browser, take screenshots of the updated UI, and offer to open those screenshots in Preview when you finish.
- When asked for screenshots, capture all relevant affected screens and modal states, not just a single viewport. Include empty, populated, success, and blocked/error states when they are part of the changed flow.
- If a screenshot catches a transition frame, blank modal, or otherwise misleading state, retake it before reporting it.
Runtime Policy
- Runtime is Bun-native.
- Use Bun for CLI/backend execution paths and process spawning.
- Do not add Node compatibility fallbacks for OpenTUI/runtime execution.
Defensive Error Handling
- Write code defensively: validate assumptions at boundaries and state transitions.
- If the system reaches an unexpected state, raise an explicit error with actionable context.
- Do not fail silently, swallow errors, or auto-ignore inconsistent data.
- Prefer fail-fast behavior over hidden degradation when correctness is uncertain.
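The rules above can be illustrated with a minimal sketch. The InvariantError class, the TaskStatus values, and the transition table are hypothetical names invented for this example, not taken from the Foundry codebase; the point is only that an unexpected state produces an explicit error that carries enough context to act on.

```typescript
// Hypothetical error type that carries structured context (not a Foundry API).
class InvariantError extends Error {
  context: Record<string, unknown>;

  constructor(message: string, context: Record<string, unknown>) {
    super(`${message} (context: ${JSON.stringify(context)})`);
    this.name = "InvariantError";
    this.context = context;
  }
}

// Illustrative status values; the real task model may differ.
type TaskStatus = "queued" | "running" | "archived";

// Assumed transition table, used only to demonstrate boundary validation.
const ALLOWED_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  queued: ["running", "archived"],
  running: ["archived"],
  archived: [],
};

// Validate the state transition up front and fail fast instead of
// silently accepting an inconsistent state.
function transitionTask(taskId: string, from: TaskStatus, to: TaskStatus): TaskStatus {
  const allowed = ALLOWED_TRANSITIONS[from];
  if (!allowed.includes(to)) {
    throw new InvariantError("Illegal task status transition", { taskId, from, to, allowed });
  }
  return to;
}
```

Calling transitionTask("t1", "archived", "running") throws rather than returning a value, and the error message includes the task id and the attempted transition so the caller can see what went wrong.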
RivetKit Dependency Policy
For all Rivet/RivetKit implementation:
- Use SQLite + Drizzle for persistent state.
- SQLite is per actor instance (per actor key), not a shared backend-global database:
  - Each actor instance gets its own SQLite DB.
  - Schema design should assume a single actor instance owns the entire DB.
  - Do not add workspaceId/repoId/taskId columns just to "namespace" rows for a given actor instance; use actor state and/or the actor key instead.
  - Example: the task actor instance already represents (workspaceId, repoId, taskId), so its SQLite tables should not need those columns for primary keys.
- Do not use backend-global SQLite singletons; database access must go through actor db providers (c.db).
- The default dependency source for RivetKit is the published rivetkit package so workspace installs and CI remain self-contained.
- When working on coordinated RivetKit changes, you may temporarily relink to a local checkout instead of the published package.
  - Dedicated local checkout for this workspace: /Users/nathan/conductor/workspaces/task/rivet-checkout
  - Preferred local link target: ../rivet-checkout/rivetkit-typescript/packages/rivetkit
  - Sub-packages (@rivetkit/sqlite-vfs, etc.) resolve transitively from the RivetKit workspace when using the local checkout.
- Before using a local checkout, build RivetKit in the rivet repo:
  cd ../rivet-checkout/rivetkit-typescript
  pnpm install
  pnpm build -F rivetkit
Rivet Routing
- Mount RivetKit directly on /v1/rivet via registry.handler(c.req.raw).
- Do not add an extra proxy or manager-specific route layer in the backend.
- Let RivetKit own metadata/public endpoint behavior for /v1/rivet.
Workspace + Actor Rules
- Everything is scoped to a workspace.
- Workspace resolution order: --workspace flag -> config default -> "default".
- ControlPlaneActor is replaced by WorkspaceActor (workspace coordinator).
- Every actor key must be prefixed with workspace namespace (["ws", workspaceId, ...]).
- CLI/TUI/GUI must use @sandbox-agent/foundry-client (packages/client) for backend access; rivetkit/client imports are only allowed inside packages/client.
- Do not add custom backend REST endpoints (no /v1/* shim layer).
/v1/*shim layer). - We own the sandbox-agent project; treat sandbox-agent defects as first-party bugs and fix them instead of working around them.
- Keep strict single-writer ownership: each table/row has exactly one actor writer.
- Parent actors (workspace, project, task, history, sandbox-instance) use command-only loops with no timeout.
- Periodic syncing lives in dedicated child actors with one timeout cadence each.
- Do not build blocking flows that wait on external systems to become ready or complete. Prefer push-based progression driven by actor messages, events, webhooks, or queue/workflow state changes.
- Use workflows/background commands for any repo sync, sandbox provisioning, agent install, branch restack/rebase, or other multi-step external work. Do not keep user-facing actions/requests open while that work runs.
- send policy: always await the send(...) call itself so enqueue failures surface immediately, but default to wait: false.
- Only use send(..., { wait: true }) for short, bounded mutations that should finish quickly and do not depend on external readiness, polling actors, provider setup, repo/network I/O, or long-running queue drains.
- Request/action contract: wait only until the minimum resource needed for the client's next step exists. Example: task creation may wait for task actor creation/identity, but not for sandbox provisioning or session bootstrap.
- Read paths must not force refresh/sync work inline. Serve the latest cached projection, mark staleness explicitly, and trigger background refresh separately when needed.
- If a workflow needs to resume after some external work completes, model that as workflow state plus follow-up messages/events instead of holding the original request open.
- No retries: never add retry loops (withRetries, setTimeout retry, exponential backoff) anywhere in the codebase. If an operation fails, surface the error immediately. If a dependency is not ready yet, model that explicitly with workflow state and resume from a push/event instead of polling or retry loops.
- Actor handle policy:
  - Prefer explicit get or explicit create based on workflow intent; do not default to getOrCreate.
  - Use get/getForId when the actor is expected to already exist; if missing, surface an explicit Actor not found error with recovery context.
  - Use create semantics only on explicit provisioning/create paths where creating a new actor instance is intended.
  - getOrCreate is a last resort for create paths when an explicit create API is unavailable; never use it in read/command paths.
  - For long-lived cross-actor links (for example sandbox/session runtime access), persist actor identity (actorId) and keep a fallback lookup path by actor id.
- Docker dev: compose.dev.yaml mounts a named volume at /root/.local/share/foundry/repos to persist backend-managed git clones across restarts. Code must still work if this volume is not present (create directories as needed).
- RivetKit actor c.state is durable, but in Docker it is stored under /root/.local/share/rivetkit. If that path is not persisted, actor state-derived indexes (for example, in project actor state) can be lost after container recreation even when other data still exists.
- Workflow history divergence policy:
  - Production: never auto-delete actor state to resolve HistoryDivergedError; ship explicit workflow migrations (ctx.removed(...), step compatibility).
  - Development: manual local state reset is allowed as an operator recovery path when migrations are not yet available.
- Storage rule of thumb:
  - Put simple metadata in c.state (KV state): small scalars and identifiers like { taskId }, { repoId }, booleans, counters, timestamps, status strings.
  - If it grows beyond trivial (arrays, maps, histories, query/filter needs, relational consistency), use SQLite + Drizzle in c.db.
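The workspace resolution order and the actor key prefix rule from the list above can be sketched as two small helpers. The function names resolveWorkspaceId and workspaceKey are hypothetical, as are the key segments after the workspace id; only the resolution order and the ["ws", workspaceId, ...] prefix come from the rules themselves. Rejecting empty values is an assumption made here to stay in line with the fail-fast policy.

```typescript
// Hypothetical helper: resolves --workspace flag -> config default -> "default".
function resolveWorkspaceId(flag: string | undefined, configDefault: string | undefined): string {
  const resolved = flag ?? configDefault ?? "default";
  // Assumption: an empty workspace id is treated as a caller error, not as "unset".
  if (resolved.trim() === "") {
    throw new Error(
      `resolveWorkspaceId: empty workspace id (flag=${JSON.stringify(flag)}, configDefault=${JSON.stringify(configDefault)})`,
    );
  }
  return resolved;
}

// Hypothetical helper: builds an actor key that always starts with ["ws", workspaceId].
function workspaceKey(workspaceId: string, ...rest: string[]): string[] {
  if (workspaceId.length === 0) {
    throw new Error("workspaceKey: workspaceId must be non-empty");
  }
  return ["ws", workspaceId, ...rest];
}

// Usage: the trailing segments ("task", "t1") are illustrative only.
const workspaceId = resolveWorkspaceId(undefined, "acme");
const taskKey = workspaceKey(workspaceId, "task", "t1");
console.log(taskKey); // [ 'ws', 'acme', 'task', 't1' ]
```

Keeping the prefix in one helper means no call site can build an unscoped key by accident.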
Testing Policy
- Never use vitest mocks (vi.mock, vi.spyOn, vi.fn). Instead, define driver interfaces for external I/O and pass test implementations via the actor runtime context.
- All external service calls (git CLI, GitHub CLI, sandbox-agent HTTP, tmux) must go through the BackendDriver interface on the runtime context.
- Integration tests use setupTest() from rivetkit/test and are gated behind HF_ENABLE_ACTOR_INTEGRATION_TESTS=1.
- End-to-end testing must run against the dev backend started via docker compose -f compose.dev.yaml up (host -> container). Do not run E2E against an in-process test runtime.
  - E2E tests should talk to the backend over HTTP (default http://127.0.0.1:7741/v1/rivet) and use real GitHub repos/PRs.
  - For Foundry live verification, use rivet-dev/sandbox-agent-testing as the default testing repo unless the task explicitly says otherwise.
  - Secrets (e.g. OPENAI_API_KEY, GITHUB_TOKEN/GH_TOKEN) must be provided via environment variables, never hardcoded in the repo.
  - ~/misc/env.txt and ~/misc/the-foundry.env contain the expected local OpenAI + GitHub OAuth/App config for dev.
  - Do not assume gh auth token is sufficient for Foundry task provisioning against private repos. Sandbox/bootstrap git clone, push, and PR flows require a repo-capable GITHUB_TOKEN/GH_TOKEN in the backend container.
  - Preferred product behavior for org workspaces is to mint a GitHub App installation token from the workspace installation and inject it into backend/sandbox git operations. Do not rely on an operator's ambient CLI auth as the long-term solution.
- Treat client E2E tests in packages/client/test as the primary end-to-end source of truth for product behavior.
- Keep backend tests small and targeted. Only retain backend-only tests for invariants or persistence rules that are not well-covered through client E2E.
- Do not keep large browser E2E suites around in a broken state. If a frontend browser E2E is not maintained and producing signal, remove it until it can be replaced with a reliable test.
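As a rough illustration of the driver-interface approach that replaces vitest mocks, the sketch below passes a hand-written test implementation through a runtime context. BackendDriver is the interface name used in the policy above, but its single method, the RuntimeContext shape, describeRepo, and FakeBackendDriver are all hypothetical and exist only for this example.

```typescript
// BackendDriver is named in the policy; this method is a hypothetical stand-in
// for a real external call such as a git CLI invocation.
interface BackendDriver {
  gitCurrentBranch(repoPath: string): Promise<string>;
}

// Hypothetical runtime context shape carrying the driver.
interface RuntimeContext {
  driver: BackendDriver;
}

// Code under test: all external I/O goes through ctx.driver.
async function describeRepo(ctx: RuntimeContext, repoPath: string): Promise<string> {
  const branch = await ctx.driver.gitCurrentBranch(repoPath);
  if (branch.length === 0) {
    throw new Error(`describeRepo: driver returned an empty branch name for ${repoPath}`);
  }
  return `${repoPath}@${branch}`;
}

// Test implementation: a plain class that records calls, with no vi.mock/vi.fn.
class FakeBackendDriver implements BackendDriver {
  calls: string[] = [];
  private branches: Map<string, string>;

  constructor(branches: Record<string, string>) {
    this.branches = new Map(Object.entries(branches));
  }

  async gitCurrentBranch(repoPath: string): Promise<string> {
    this.calls.push(repoPath);
    const branch = this.branches.get(repoPath);
    if (branch === undefined) {
      throw new Error(`FakeBackendDriver: no branch configured for ${repoPath}`);
    }
    return branch;
  }
}
```

A test would construct new FakeBackendDriver({ "/repos/demo": "main" }), call describeRepo({ driver: fake }, "/repos/demo"), and assert on both the returned string and fake.calls. Because the fake is an ordinary object, it fails loudly on unconfigured inputs instead of returning a silent default.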
Config
- Keep config path at ~/.config/foundry/config.toml.
- Evolve properties in place; do not move config location.
Project Guidance
Project-specific guidance lives in README.md, CONTRIBUTING.md, and the relevant files under research/.
Keep those updated when:
- Commands change
- Configuration options change
- Architecture changes
- Plugins/providers change
- Actor ownership changes
Friction Logs
Track friction at:
- research/friction/rivet.mdx
- research/friction/sandbox-agent.mdx
- research/friction/sandboxes.mdx
- research/friction/general.mdx
Category mapping:
- rivet: Rivet/RivetKit runtime, actor model, queues, keys
- sandbox-agent: sandbox-agent SDK/API behavior
- sandboxes: provider implementations (worktree/daytona/etc)
- general: everything else
Each entry must include:
- Date (YYYY-MM-DD)
- Commit SHA (or uncommitted)
- What you were implementing
- Friction/issue
- Attempted fix/workaround and outcome
History Events
Log notable workflow changes to events so hf history remains complete:
- create
- attach
- push/sync/merge
- archive/kill
- status transitions
- PR state transitions
Validation After Changes
Always run and fix failures:
pnpm -w typecheck
pnpm -w build
pnpm -w test
After making code changes, always update the dev server before declaring the work complete. If the dev stack is running through Docker Compose, restart or recreate the relevant dev services so the running app reflects the latest code.