# Sandbox Agent Friction Log

## 2026-02-17 - uncommitted

### What I Was Working On

Stabilizing Daytona-backed Codex task initialization (`init_create_session`) and diagnosing repeated sandbox-agent `session/new` failures.

### Friction / Issue

Two issues compounded each other:

1. The backend added a local `45s` Promise timeout around `sandbox-agent` SDK `createSession()`, but the underlying ACP call is not abortable. Timed-out calls kept running in the background while retries started new session creates, causing overlapping ACP requests and noisy failures.
2. Daytona sandboxes were missing `node`/`npm`/`npx`, while the installed Codex ACP launcher is `npx @zed-industries/codex-acp`. Session initialization could hang or time out because the launcher dependency chain was incomplete.

### Attempted Fix / Workaround

1. Removed the local `45s` timeout wrapper around `SandboxAgent.createSession()` in the backend integration.
2. Updated sandbox-instance retry classification to avoid immediate retries for timeout/504 failures, while still retrying quick transient transport failures (502/503/connection reset/refused).
3. Kept Daytona on published `sandbox-agent 0.2.0` and set `SANDBOX_AGENT_ACP_REQUEST_TIMEOUT_MS` via a backend env override (`HF_SANDBOX_AGENT_ACP_REQUEST_TIMEOUT_MS`, default `120000`).
4. Updated Daytona bootstrap to install `nodejs` + `npm` (and validate `npx` availability) so the `codex-acp` launcher can run.

### Outcome

- `createSession` no longer races itself due to a local timeout.
- Timeout errors are surfaced directly instead of being hidden behind repeated local timeout retries.
- Daytona sandboxes keep the published sandbox-agent bootstrap, with compatible runtime prerequisites for Codex ACP launch.

## 2026-02-08 - uncommitted

### What I Was Working On

Wiring task initialization to create/poll sandbox-agent sessions through provider-resolved endpoints.
### Friction / Issue

Local test runs cannot assume a live sandbox-agent backend, so session bootstrap is inherently optional in tests and on clean machines.

### Attempted Fix / Workaround

1. Wrapped session creation in guarded error handling during task initialization.
2. Persisted task state as `queued` when session creation fails, while keeping sandbox metadata written.
3. Continued status tracking through runtime messages when a session is available.

### Outcome

- Task creation remains deterministic without hard dependency on a running sandbox-agent process.
- Behavior is testable in CI/local environments that do not run sandbox-agent.

## 2026-02-12 - uncommitted

### What I Was Working On

Upgrading backend integration from legacy sandbox-agent session endpoints to `sandbox-agent@0.2.0` and validating Daytona-backed execution.

### Friction / Issue

`0.2.0` no longer exposes `/v1/sessions` endpoints used by the backend integration; direct session create/status polling via legacy REST paths returns `404`.

### Attempted Fix / Workaround

1. Switched backend integration to `sandbox-agent` SDK (`SandboxAgent.connect`, `createSession`, `getSession`, `getEvents`).
2. Added status inference from SDK state/events for compatibility with existing task status sync actor.
3. Upgraded Daytona provider to install/start `sandbox-agent 0.2.0` in sandboxes and expose a preview endpoint for SDK calls.

### Outcome

- Backend no longer depends on removed `/v1/sessions` endpoints.
- Daytona flow is aligned with `sandbox-agent 0.2.0` runtime and SDK usage.
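
For reference, the retry classification described in the 2026-02-17 entry could look roughly like the sketch below. The `SessionCreateFailure` shape, the `classifyFailure` name, and the `"retry"`/`"surface"` labels are hypothetical, chosen for illustration; the actual backend error types are not recorded in this log. The error codes assume Node.js-style `ECONNRESET`/`ECONNREFUSED` values.

```typescript
// Hypothetical failure shape; not the real backend or SDK error type.
interface SessionCreateFailure {
  httpStatus?: number; // HTTP status returned by the sandbox-agent endpoint, if any
  code?: string; // Node.js-style transport error code, e.g. "ECONNRESET"
  timedOut?: boolean; // true when the request timeout elapsed
}

type RetryDecision = "retry" | "surface";

// Quick transient transport failures that are considered safe to retry.
const TRANSIENT_STATUSES = new Set<number>([502, 503]);
const TRANSIENT_CODES = new Set<string>(["ECONNRESET", "ECONNREFUSED"]);

function classifyFailure(failure: SessionCreateFailure): RetryDecision {
  // Timeouts and 504s are not retried immediately: the underlying call may
  // still be in flight, so a retry could overlap with it.
  if (failure.timedOut === true || failure.httpStatus === 504) {
    return "surface";
  }
  if (failure.httpStatus !== undefined && TRANSIENT_STATUSES.has(failure.httpStatus)) {
    return "retry";
  }
  if (failure.code !== undefined && TRANSIENT_CODES.has(failure.code)) {
    return "retry";
  }
  // Anything unrecognized is surfaced rather than retried blindly.
  return "surface";
}
```

Checking the timeout/504 case first means a failure that carries both a timeout flag and a transient code is still surfaced, which matches the intent of not starting a second session create while the first may be running.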