chore: remove .context/ from git and add to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nathan Flurry 2026-03-16 17:56:50 -07:00
parent 33821d8660
commit 4252c705df
25 changed files with 1 addition and 1666 deletions

Binary file not shown.

@@ -1,19 +0,0 @@
The user likes the current state of the code.
There are 27 uncommitted changes.
The current branch is desktop-use.
The target branch is origin/main.
There is no upstream branch yet.
The user requested a PR.
Follow these steps to create a PR:
- If you have any skills related to creating PRs, invoke them now. Instructions there should take precedence over these instructions.
- Run `git diff` to review uncommitted changes
- Commit them. Follow any instructions the user gave you about writing commit messages.
- Push to origin.
- Use `git diff origin/main...` to review the PR diff
- Use `gh pr create --base main` to create a PR onto the target branch. Keep the title under 80 characters. Keep the description under five sentences, unless the user instructed you otherwise. Describe not just changes made in this session but ALL changes in the workspace diff.
If any of these steps fail, ask the user for help.


@@ -1,101 +0,0 @@
## Code Review Instructions
1. Launch a haiku agent to return a list of file paths (not their contents) for all relevant CLAUDE.md files including:
- The root CLAUDE.md file, if it exists
- Any CLAUDE.md files in directories containing files modified by the workspace diff (use mcp__conductor__GetWorkspaceDiff with stat option)
2. If this workspace has an associated PR, read the title and description (but not the changes). This will be helpful context.
3. In parallel with step 2, launch a sonnet agent to view the changes via mcp__conductor__GetWorkspaceDiff and return a summary of the changes.
4. Launch 4 agents in parallel to independently review the changes using mcp__conductor__GetWorkspaceDiff. Each agent should return the list of issues, where each issue includes a description and the reason it was flagged (e.g. "CLAUDE.md adherence", "bug"). The agents should do the following:
Agents 1 + 2: CLAUDE.md or AGENTS.md compliance sonnet agents
Audit changes for CLAUDE.md or AGENTS.md compliance in parallel. Note: when evaluating compliance for a file, only consider CLAUDE.md or AGENTS.md files located in the same directory as that file or in one of its parent directories.
Agent 3: Opus bug agent
Scan for obvious bugs. Focus only on the diff itself without reading extra context. Flag only significant bugs; ignore nitpicks and likely false positives. Do not flag issues that you cannot validate without looking at context outside of the git diff.
Agent 4: Opus bug agent
Look for problems that exist in the introduced code. This could be security issues, incorrect logic, etc. Only look for issues that fall within the changed code.
**CRITICAL: We only want HIGH SIGNAL issues.** This means:
- Objective bugs that will cause incorrect behavior at runtime
- Clear, unambiguous CLAUDE.md violations where you can quote the exact rule being broken
We do NOT want:
- Subjective concerns or "suggestions"
- Style preferences not explicitly required by CLAUDE.md
- Potential issues that "might" be problems
- Anything requiring interpretation or judgment calls
If you are not certain an issue is real, do not flag it. False positives erode trust and waste reviewer time.
In addition to the above, each subagent should be told the PR title and description. This will help provide context regarding the author's intent.
5. For each issue found in the previous step, launch parallel subagents to validate the issue. These subagents should get the PR title and description along with a description of the issue. The agent's job is to review the issue to validate that the stated issue is truly an issue with high confidence. For example, if an issue such as "variable is not defined" was flagged, the subagent's job would be to validate that is actually true in the code. Another example would be CLAUDE.md issues. The agent should validate that the CLAUDE.md rule that was violated is scoped for this file and is actually violated. Use Opus subagents for bugs and logic issues, and sonnet agents for CLAUDE.md violations.
6. Filter out any issues that were not validated in step 5. This step will give us our list of high signal issues for our review.
7. Post inline comments for each issue using mcp__conductor__DiffComment:
**IMPORTANT: Only post ONE comment per unique issue.**
8. Write out a list of issues found, along with the location of the comment. For example:
<example>
### **#1 Empty input causes crash**
If the input field is empty when page loads, the app will crash.
File: src/ui/Input.tsx
### **#2 Dead code**
The getUserData function is now unused. It should be deleted.
File: src/core/UserData.ts
</example>
When evaluating issues in Steps 5 and 6, treat the following as false positives (do NOT flag):
- Pre-existing issues
- Something that appears to be a bug but is actually correct
- Pedantic nitpicks that a senior engineer would not flag
- Issues that a linter will catch (do not run the linter to verify)
- General code quality concerns (e.g., lack of test coverage, general security issues) unless explicitly required in CLAUDE.md or AGENTS.md
- Issues mentioned in CLAUDE.md or AGENTS.md but explicitly silenced in the code (e.g., via a lint ignore comment)
Notes:
- All subagents should be explicitly instructed not to post comments themselves. Only you, the main agent, should post comments.
- Do not use the AskUserQuestion tool. Your goal should be to complete the entire review without user intervention.
- Use gh CLI to interact with GitHub (e.g., fetch pull requests, create comments). Do not use web fetch.
- You must cite and link each issue in inline comments (e.g., if referring to a CLAUDE.md or AGENTS.md rule, include a link to it).
## Fallback: if you don't have access to subagents
If you don't have subagents, perform all the steps above yourself sequentially instead of launching agents. Do each review axis (CLAUDE.md compliance, bug scan, introduced problems) yourself, and validate each issue yourself.
## Fallback: if you don't have access to the workspace diff tool
If you don't have access to the mcp__conductor__GetWorkspaceDiff tool, use the following git commands to get the diff:
```bash
# Get the merge base between this branch and the target
MERGE_BASE=$(git merge-base origin/main HEAD)
# Get the committed diff against the merge base
git diff "$MERGE_BASE" HEAD
# Get any uncommitted changes (staged and unstaged)
git diff HEAD
```
Review the combination of both outputs: the first shows all committed changes on this branch relative to the target, and the second shows any uncommitted work in progress.
No need to mention in your report whether or not you used one of the fallback strategies; it's usually irrelevant.



@@ -1,215 +0,0 @@
# Desktop Computer Use API Enhancements
## Context
Competitive analysis of Daytona, Cloudflare Sandbox SDK, and CUA revealed significant gaps in our desktop computer use API. Both Daytona and Cloudflare have or are building screenshot compression, hotkey combos, mouseDown/mouseUp, keyDown/keyUp, per-component process health, and live desktop streaming. CUA additionally has window management and accessibility trees. We have none of these. This plan closes the most impactful gaps across 7 tasks.
## Execution Order
```
Sprint 1 (parallel, no dependencies): Tasks 1, 2, 3, 4
Sprint 2 (foundational refactor): Task 5
Sprint 3 (parallel, depend on #5): Tasks 6, 7
```
---
## Task 1: Unify keyboard press with object modifiers
**What**: Change `DesktopKeyboardPressRequest` to accept a `modifiers` object instead of requiring DSL strings like `"ctrl+c"`.
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopKeyModifiers { ctrl, shift, alt, cmd }` struct (all `Option<bool>`). Add `modifiers: Option<DesktopKeyModifiers>` to `DesktopKeyboardPressRequest`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Modify `press_key_args()` (~line 1349) to build xdotool key string from modifiers object. If modifiers present, construct `"ctrl+shift+a"` style string. `cmd` maps to `super`.
- `server/packages/sandbox-agent/src/router.rs` — Add `DesktopKeyModifiers` to OpenAPI schemas list.
- `docs/openapi.json` — Regenerate.
**Backward compatible**: Old `{"key": "ctrl+a"}` still works. New form: `{"key": "a", "modifiers": {"ctrl": true}}`.
**Test**: Unit test that `press_key_args("a", Some({ctrl: true, shift: true}))` produces `["key", "--", "ctrl+shift+a"]`. Integration test with both old and new request shapes.
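The modifier-to-string mapping can be sketched as follows; the struct and helper names come from the plan, but the exact signature and field handling here are assumptions:

```rust
// Sketch of the xdotool key-string construction for Task 1. All fields are
// Option<bool> as specified; `cmd` maps to the X11 "super" key.
#[derive(Default)]
struct DesktopKeyModifiers {
    ctrl: Option<bool>,
    shift: Option<bool>,
    alt: Option<bool>,
    cmd: Option<bool>,
}

/// Build the xdotool argument list for a key press. With no modifiers this
/// degrades to the old behavior, so `{"key": "ctrl+a"}` keeps working.
fn press_key_args(key: &str, modifiers: Option<&DesktopKeyModifiers>) -> Vec<String> {
    let mut combo: Vec<&str> = Vec::new();
    if let Some(m) = modifiers {
        if m.ctrl.unwrap_or(false) { combo.push("ctrl"); }
        if m.shift.unwrap_or(false) { combo.push("shift"); }
        if m.alt.unwrap_or(false) { combo.push("alt"); }
        if m.cmd.unwrap_or(false) { combo.push("super"); } // cmd -> super
    }
    combo.push(key);
    vec!["key".into(), "--".into(), combo.join("+")]
}
```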
---
## Task 2: Add mouseDown/mouseUp and keyDown/keyUp endpoints
**What**: 4 new endpoints for low-level press/release control.
**Endpoints**:
- `POST /v1/desktop/mouse/down` → `xdotool mousedown BUTTON` (optional x,y moves first)
- `POST /v1/desktop/mouse/up` → `xdotool mouseup BUTTON`
- `POST /v1/desktop/keyboard/down` → `xdotool keydown KEY`
- `POST /v1/desktop/keyboard/up` → `xdotool keyup KEY`
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopMouseDownRequest`, `DesktopMouseUpRequest` (x/y optional, button optional), `DesktopKeyboardDownRequest`, `DesktopKeyboardUpRequest` (key: String).
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add 4 public methods following existing `click_mouse()` / `press_key()` patterns.
- `server/packages/sandbox-agent/src/router.rs` — Add 4 routes, 4 handlers with utoipa annotations.
- `sdks/typescript/src/client.ts` — Add `mouseDownDesktop()`, `mouseUpDesktop()`, `keyDownDesktop()`, `keyUpDesktop()`.
- `docs/openapi.json` — Regenerate.
**Test**: Integration test: mouseDown → mousemove → mouseUp sequence. keyDown → keyUp sequence.
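A hypothetical shape for the mouse-down command construction (the move-then-press ordering follows the endpoint description above; the helper itself is an assumption):

```rust
// Build the xdotool invocations for POST /v1/desktop/mouse/down. Buttons use
// the xdotool convention (1 = left, 2 = middle, 3 = right); left is the default.
fn mouse_down_args(x: Option<i32>, y: Option<i32>, button: Option<u8>) -> Vec<Vec<String>> {
    let mut cmds: Vec<Vec<String>> = Vec::new();
    // If coordinates were supplied, move the pointer before pressing.
    if let (Some(x), Some(y)) = (x, y) {
        cmds.push(vec!["mousemove".into(), x.to_string(), y.to_string()]);
    }
    cmds.push(vec!["mousedown".into(), button.unwrap_or(1).to_string()]);
    cmds
}
```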
---
## Task 3: Screenshot compression
**What**: Add format, quality, and scale query params to screenshot endpoints.
**Params**: `format` (png|jpeg|webp, default png), `quality` (1-100, default 85), `scale` (0.1-1.0, default 1.0).
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopScreenshotFormat` enum. Add `format`, `quality`, `scale` fields to `DesktopScreenshotQuery` and `DesktopRegionScreenshotQuery`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — After capturing PNG via `import`, pipe through ImageMagick `convert` if format != png or scale != 1.0: `convert png:- -resize {scale*100}% -quality {quality} {format}:-`. Add a `run_command_with_stdin()` helper (or modify existing `run_command_output`) to pipe bytes into a command's stdin.
- `server/packages/sandbox-agent/src/router.rs` — Modify screenshot handlers to pass format/quality/scale, return dynamic `Content-Type` header.
- `sdks/typescript/src/client.ts` — Update `takeDesktopScreenshot()` to accept format/quality/scale.
- `docs/openapi.json` — Regenerate.
**Dependencies**: ImageMagick `convert` already installed in Docker. Verify WebP delegate availability.
**Test**: Integration tests: request `?format=jpeg&quality=50`, verify `Content-Type: image/jpeg` and JPEG magic bytes. Verify default still returns PNG. Verify `?scale=0.5` returns a smaller image.
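The post-capture pipeline can be sketched as an argument builder for `convert`; the parameter names match the plan, but the helper itself is hypothetical:

```rust
// Build the ImageMagick `convert` args that re-encode the captured PNG piped
// in on stdin. Only invoked when format != png or scale != 1.0.
fn convert_args(format: &str, quality: u8, scale: f64) -> Vec<String> {
    vec![
        "png:-".into(),                // read the raw PNG capture from stdin
        "-resize".into(),
        format!("{}%", scale * 100.0), // e.g. scale 0.5 -> "50%"
        "-quality".into(),
        quality.to_string(),
        format!("{}:-", format),       // e.g. "jpeg:-": write JPEG to stdout
    ]
}
```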
---
## Task 4: Window listing API
**What**: New endpoint to list open windows.
**Endpoint**: `GET /v1/desktop/windows`
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopWindowInfo { id, title, x, y, width, height, is_active }` and `DesktopWindowListResponse`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add `list_windows()` method using xdotool (already installed):
1. `xdotool search --onlyvisible --name ""` → window IDs
2. `xdotool getwindowname {id}` + `xdotool getwindowgeometry {id}` per window
3. `xdotool getactivewindow` → is_active flag
4. Add `parse_window_geometry()` helper.
- `server/packages/sandbox-agent/src/router.rs` — Add route, handler, OpenAPI annotations.
- `sdks/typescript/src/client.ts` — Add `listDesktopWindows()`.
- `docs/openapi.json` — Regenerate.
**No new Docker dependencies** — xdotool already installed.
**Test**: Integration test: start desktop, verify `GET /v1/desktop/windows` returns 200 with a list (may be empty if no GUI apps open, which is fine).
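`parse_window_geometry()` might look like the following; the sample shape of `xdotool getwindowgeometry` output assumed here should be verified against the installed xdotool version:

```rust
/// Parse `xdotool getwindowgeometry` output of the assumed form:
///   Window 12345
///     Position: 100,200 (screen: 0)
///     Geometry: 800x600
/// Returns (x, y, width, height), or None if either line is missing.
fn parse_window_geometry(out: &str) -> Option<(i32, i32, u32, u32)> {
    let mut pos = None;
    let mut size = None;
    for line in out.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("Position: ") {
            let coords = rest.split_whitespace().next()?; // drop "(screen: 0)"
            let (x, y) = coords.split_once(',')?;
            pos = Some((x.parse().ok()?, y.parse().ok()?));
        } else if let Some(rest) = line.strip_prefix("Geometry: ") {
            let (w, h) = rest.split_once('x')?;
            size = Some((w.parse().ok()?, h.parse().ok()?));
        }
    }
    let ((x, y), (w, h)) = (pos?, size?);
    Some((x, y, w, h))
}
```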
---
## Task 5: Unify desktop processes into process runtime with owner flag
**What**: Desktop processes (Xvfb, openbox, dbus) get registered in the general process runtime with an `owner` field, gaining log streaming, SSE, and unified lifecycle for free.
**Files**:
- `server/packages/sandbox-agent/src/process_runtime.rs`:
- Add `ProcessOwner` enum: `User`, `Desktop`, `System`.
- Add `RestartPolicy` enum: `Never`, `Always`, `OnFailure`.
- Add `owner: ProcessOwner` and `restart_policy: Option<RestartPolicy>` to `ProcessStartSpec`, `ManagedProcess`, and `ProcessSnapshot`.
- Modify `list_processes()` to accept optional owner filter.
- Add auto-restart logic in `watch_exit()`: if restart_policy is Always (or OnFailure and exit code != 0), re-spawn the process using stored spec. Need to store the original `ProcessStartSpec` on `ManagedProcess`.
- `server/packages/sandbox-agent/src/router/types.rs`:
- Add `owner` to `ProcessInfo` response.
- Add `ProcessListQuery { owner: Option<ProcessOwner> }`.
- `server/packages/sandbox-agent/src/router.rs`:
- Modify `get_v1_processes` to accept `Query<ProcessListQuery>` and filter.
- Pass `ProcessRuntime` into `DesktopRuntime::new()`.
- Add `ProcessOwner`, `RestartPolicy` to OpenAPI schemas.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — **Major refactor**:
- Remove `ManagedDesktopChild` struct.
- `DesktopRuntime` takes `ProcessRuntime` as constructor param.
- `start_xvfb_locked()` and `start_openbox_locked()` call `process_runtime.start_process(ProcessStartSpec { owner: Desktop, restart_policy: Some(Always), ... })` instead of spawning directly.
- Store returned process IDs in state instead of `Child` handles.
- `stop` calls `process_runtime.stop_process()` / `kill_process()`.
- `processes_locked()` queries process runtime for desktop-owned processes.
- dbus-launch remains a direct one-shot spawn (it's not a long-running process, just produces env vars).
- `sdks/typescript/src/client.ts` — Add `owner` filter option to `listProcesses()`.
- `docs/openapi.json` — Regenerate.
**Risks**:
- Lock ordering: desktop runtime holds Mutex, process runtime uses RwLock. Release desktop Mutex before calling process runtime, or restructure.
- `log_path` field in `DesktopProcessInfo` no longer applies (logs are in-memory now). Remove or deprecate.
**Test**: Integration: start desktop, `GET /v1/processes?owner=desktop` returns Xvfb+openbox. `GET /v1/processes?owner=user` excludes them. Desktop process logs are streamable via `GET /v1/processes/{id}/logs?follow=true`. Existing desktop lifecycle tests still pass.
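The auto-restart decision in `watch_exit()` reduces to a small pure function; the enum is named in the plan, the helper is an assumption:

```rust
// RestartPolicy from the plan (serde/utoipa derives omitted in this sketch).
#[derive(Clone, Copy, PartialEq)]
enum RestartPolicy { Never, Always, OnFailure }

/// Called when a managed process exits: decide whether to re-spawn it from
/// its stored `ProcessStartSpec`.
fn should_restart(policy: Option<RestartPolicy>, exit_code: i32) -> bool {
    match policy {
        Some(RestartPolicy::Always) => true,
        Some(RestartPolicy::OnFailure) => exit_code != 0,
        Some(RestartPolicy::Never) | None => false,
    }
}
```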
---
## Task 6: Screen recording API (ffmpeg x11grab)
**What**: 6 endpoints for recording the desktop to MP4.
**Endpoints**:
- `POST /v1/desktop/recording/start` — Start ffmpeg recording
- `POST /v1/desktop/recording/stop` — Stop recording (SIGTERM → wait → SIGKILL)
- `GET /v1/desktop/recordings` — List recordings
- `GET /v1/desktop/recordings/{id}` — Get recording metadata
- `GET /v1/desktop/recordings/{id}/download` — Serve MP4 file
- `DELETE /v1/desktop/recordings/{id}` — Delete recording
**Files**:
- **New**: `server/packages/sandbox-agent/src/desktop_recording.rs` — Recording state, ffmpeg process management. `start_recording()` spawns ffmpeg via process runtime (owner=Desktop): `ffmpeg -f x11grab -video_size WxH -i :99 -c:v libx264 -preset ultrafast -r 30 {path}`. Recordings stored in `{state_dir}/recordings/`.
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add recording request/response types.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Wire recording manager, expose through desktop runtime.
- `server/packages/sandbox-agent/src/router.rs` — Add 6 routes + handlers.
- `server/packages/sandbox-agent/src/desktop_install.rs` — Add `ffmpeg` to dependency detection (soft: only error when recording is requested).
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add `ffmpeg` to apt-get.
- `sdks/typescript/src/client.ts` — Add 6 recording methods.
- `docs/openapi.json` — Regenerate.
**Depends on**: Task 5 (ffmpeg runs as desktop-owned process).
**Test**: Integration: start desktop → start recording → wait 2s → stop → list → download (verify MP4 magic bytes) → delete.
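The "verify MP4 magic bytes" step in the download test can be a one-liner: MP4 files begin with a box whose four-byte type at offset 4 is `ftyp`. A sketch the integration test could use:

```rust
// Returns true when the buffer starts with an MP4 `ftyp` box header.
fn looks_like_mp4(bytes: &[u8]) -> bool {
    bytes.len() >= 8 && bytes[4..8] == *b"ftyp"
}
```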
---
## Task 7: Neko WebRTC desktop streaming + React component
**What**: Integrate neko for WebRTC desktop streaming, mirroring the ProcessTerminal + Ghostty pattern.
### Server side
- **New**: `server/packages/sandbox-agent/src/desktop_streaming.rs` — Manages neko process via process runtime (owner=Desktop). Neko connects to existing Xvfb display, runs GStreamer pipeline for H.264 encoding.
- `server/packages/sandbox-agent/src/router.rs`:
- `GET /v1/desktop/stream/ws` — WebSocket proxy to neko's internal WebSocket. Upgrade request, bridge bidirectionally.
- `POST /v1/desktop/stream/start` / `POST /v1/desktop/stream/stop` — Lifecycle control.
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add neko binary + GStreamer packages (`gstreamer1.0-plugins-base`, `gstreamer1.0-plugins-good`, `gstreamer1.0-x`, `libgstreamer1.0-0`). Consider making this an optional Docker stage to avoid bloating the base image.
### TypeScript SDK
- **New**: `sdks/typescript/src/desktop-stream.ts` — `DesktopStreamSession` class ported from neko's `base.ts` (~500 lines):
- WebSocket for signaling (SDP offer/answer, ICE candidates)
- `RTCPeerConnection` for video stream
- `RTCDataChannel` for binary input (mouse: 7 bytes, keyboard: 11 bytes)
- Events: `onTrack(stream)`, `onConnect()`, `onDisconnect()`, `onError()`
- `sdks/typescript/src/client.ts` — Add `connectDesktopStream()` returning `DesktopStreamSession`, `buildDesktopStreamWebSocketUrl()`, `startDesktopStream()`, `stopDesktopStream()`.
- `sdks/typescript/src/index.ts` — Export `DesktopStreamSession`.
### React SDK
- **New**: `sdks/react/src/DesktopViewer.tsx` — Following `ProcessTerminal.tsx` pattern:
```
Props: client (Pick<SandboxAgent, 'connectDesktopStream'>), height, className, style, onConnect, onDisconnect, onError
```
- `useEffect` → `client.connectDesktopStream()` → wire `onTrack` to `<video>.srcObject`
- Capture mouse events on video element → scale coordinates to desktop resolution → send via DataChannel
- Capture keyboard events → send via DataChannel
- Connection state indicator
- Cleanup: close RTCPeerConnection, close WebSocket
- `sdks/react/src/index.ts` — Export `DesktopViewer`.
**Depends on**: Task 5 (neko runs as desktop-owned process).
**Test**: Server integration: start stream, connect WebSocket, verify signaling messages flow. React: component mounts/unmounts without errors. Full E2E requires browser (manual initially).
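The mouse-capture step above needs to map a click position on the sized `<video>` element back to desktop pixels before encoding it for the DataChannel. The scaling is pure arithmetic; a sketch (function name illustrative):

```rust
// Maps element-local coordinates to desktop coordinates, clamping to the
// desktop bounds so rounding at the right/bottom edge cannot overshoot.
fn scale_to_desktop(
    elem_x: f64, elem_y: f64,
    elem_w: f64, elem_h: f64,
    desk_w: u32, desk_h: u32,
) -> (u32, u32) {
    let x = (elem_x / elem_w * desk_w as f64).round() as u32;
    let y = (elem_y / elem_h * desk_h as f64).round() as u32;
    (x.min(desk_w.saturating_sub(1)), y.min(desk_h.saturating_sub(1)))
}
```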
---
## Verification
After all tasks:
1. `cargo test` — All Rust unit tests pass
2. `cargo test --test v1_api` — All integration tests pass (requires Docker)
3. Regenerate `docs/openapi.json` and verify it reflects all new endpoints
4. Build TypeScript SDK: `cd sdks/typescript && pnpm build`
5. Build React SDK: `cd sdks/react && pnpm build`
6. Manual: start desktop, take JPEG screenshot, list windows, record 5s video, stream desktop via DesktopViewer component


@ -1,15 +0,0 @@
#!/usr/bin/env sh
set -eu
display="${1:-:191}"
number="${display#:}"
socket="/tmp/.X11-unix/X${number}"
mkdir -p /tmp/.X11-unix
touch "$socket"
cleanup() {
rm -f "$socket"
exit 0
}
trap cleanup INT TERM EXIT
while :; do
sleep 1
done


@ -1,4 +0,0 @@
#!/usr/bin/env sh
set -eu
echo "DBUS_SESSION_BUS_ADDRESS=unix:path=/tmp/sandbox-agent-test-bus"
echo "DBUS_SESSION_BUS_PID=$$"


@ -1,3 +0,0 @@
#!/usr/bin/env sh
set -eu
printf '\211PNG\r\n\032\n\000\000\000\rIHDR\000\000\000\001\000\000\000\001\010\006\000\000\000\037\025\304\211\000\000\000\013IDATx\234c\000\001\000\000\005\000\001\r\n-\264\000\000\000\000IEND\256B`\202'


@ -1,6 +0,0 @@
#!/usr/bin/env sh
set -eu
trap 'exit 0' INT TERM
while :; do
sleep 1
done


@ -1,57 +0,0 @@
#!/usr/bin/env sh
set -eu
state_dir="${SANDBOX_AGENT_DESKTOP_FAKE_STATE_DIR:?missing fake state dir}"
state_file="${state_dir}/mouse"
mkdir -p "$state_dir"
if [ ! -f "$state_file" ]; then
printf '0 0\n' > "$state_file"
fi
read_state() {
read -r x y < "$state_file"
}
write_state() {
printf '%s %s\n' "$1" "$2" > "$state_file"
}
command="${1:-}"
case "$command" in
getmouselocation)
read_state
printf 'X=%s\nY=%s\nSCREEN=0\nWINDOW=0\n' "$x" "$y"
;;
mousemove)
shift
x="${1:-0}"
y="${2:-0}"
shift 2 || true
while [ "$#" -gt 0 ]; do
token="$1"
shift
case "$token" in
mousemove)
x="${1:-0}"
y="${2:-0}"
shift 2 || true
;;
mousedown|mouseup)
shift 1 || true
;;
click)
if [ "${1:-}" = "--repeat" ]; then
shift 2 || true
fi
shift 1 || true
;;
esac
done
write_state "$x" "$y"
;;
type|key)
exit 0
;;
*)
exit 0
;;
esac


@ -1,5 +0,0 @@
#!/usr/bin/env sh
set -eu
cat <<'EOF'
Screen 0: minimum 1 x 1, current 1440 x 900, maximum 32767 x 32767
EOF


@ -1,111 +0,0 @@
#!/usr/bin/env node
const { createInterface } = require("node:readline");
let nextSession = 0;
function emit(value) {
process.stdout.write(JSON.stringify(value) + "\n");
}
function firstText(prompt) {
if (!Array.isArray(prompt)) {
return "";
}
for (const block of prompt) {
if (block && block.type === "text" && typeof block.text === "string") {
return block.text;
}
}
return "";
}
const rl = createInterface({
input: process.stdin,
crlfDelay: Infinity,
});
rl.on("line", (line) => {
let msg;
try {
msg = JSON.parse(line);
} catch {
return;
}
const hasMethod = typeof msg?.method === "string";
const hasId = Object.prototype.hasOwnProperty.call(msg, "id");
const method = hasMethod ? msg.method : undefined;
if (method === "session/prompt") {
const sessionId = typeof msg?.params?.sessionId === "string" ? msg.params.sessionId : "";
const text = firstText(msg?.params?.prompt);
emit({
jsonrpc: "2.0",
method: "session/update",
params: {
sessionId,
update: {
sessionUpdate: "agent_message_chunk",
content: {
type: "text",
text: "mock: " + text,
},
},
},
});
}
if (!hasMethod || !hasId) {
return;
}
if (method === "initialize") {
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
protocolVersion: 1,
capabilities: {},
serverInfo: {
name: "mock-acp-agent",
version: "0.0.1",
},
},
});
return;
}
if (method === "session/new") {
nextSession += 1;
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
sessionId: "mock-session-" + nextSession,
},
});
return;
}
if (method === "session/prompt") {
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
stopReason: "end_turn",
},
});
return;
}
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
ok: true,
echoedMethod: method,
},
});
});


@ -1,111 +0,0 @@
#!/usr/bin/env node
const { createInterface } = require("node:readline");
let nextSession = 0;
function emit(value) {
process.stdout.write(JSON.stringify(value) + "\n");
}
function firstText(prompt) {
if (!Array.isArray(prompt)) {
return "";
}
for (const block of prompt) {
if (block && block.type === "text" && typeof block.text === "string") {
return block.text;
}
}
return "";
}
const rl = createInterface({
input: process.stdin,
crlfDelay: Infinity,
});
rl.on("line", (line) => {
let msg;
try {
msg = JSON.parse(line);
} catch {
return;
}
const hasMethod = typeof msg?.method === "string";
const hasId = Object.prototype.hasOwnProperty.call(msg, "id");
const method = hasMethod ? msg.method : undefined;
if (method === "session/prompt") {
const sessionId = typeof msg?.params?.sessionId === "string" ? msg.params.sessionId : "";
const text = firstText(msg?.params?.prompt);
emit({
jsonrpc: "2.0",
method: "session/update",
params: {
sessionId,
update: {
sessionUpdate: "agent_message_chunk",
content: {
type: "text",
text: "mock: " + text,
},
},
},
});
}
if (!hasMethod || !hasId) {
return;
}
if (method === "initialize") {
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
protocolVersion: 1,
capabilities: {},
serverInfo: {
name: "mock-acp-agent",
version: "0.0.1",
},
},
});
return;
}
if (method === "session/new") {
nextSession += 1;
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
sessionId: "mock-session-" + nextSession,
},
});
return;
}
if (method === "session/prompt") {
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
stopReason: "end_turn",
},
});
return;
}
emit({
jsonrpc: "2.0",
id: msg.id,
result: {
ok: true,
echoedMethod: method,
},
});
});


@ -1,4 +0,0 @@
ts=2026-03-08T07:57:29.140584296Z level=info target=sandbox_agent::telemetry message="anonymous telemetry is enabled, disable with --no-telemetry"
ts=2026-03-08T07:57:29.141203296Z level=info target=sandbox_agent::cli message="server listening" addr=0.0.0.0:3000
ts=2026-03-08T07:57:29.298687421Z level=info target=sandbox_agent::router span=http.request span_path=http.request message=request method=GET uri=/v1/health
ts=2026-03-08T07:57:29.302092338Z level=info target=sandbox_agent::router span=http.request span_path=http.request status="200 OK" latency_ms=3 method=GET uri=/v1/health


@ -1 +0,0 @@
5a1927c6af3d83586f34112f58e0c8d6



@ -1,215 +0,0 @@
# Desktop Computer Use API Enhancements
## Context
Competitive analysis of Daytona, Cloudflare Sandbox SDK, and CUA revealed significant gaps in our desktop computer use API. Both Daytona and Cloudflare have or are building screenshot compression, hotkey combos, mouseDown/mouseUp, keyDown/keyUp, per-component process health, and live desktop streaming. CUA additionally has window management and accessibility trees. We have none of these. This plan closes the most impactful gaps across 7 tasks.
## Execution Order
```
Sprint 1 (parallel, no dependencies): Tasks 1, 2, 3, 4
Sprint 2 (foundational refactor): Task 5
Sprint 3 (parallel, depend on #5): Tasks 6, 7
```
---
## Task 1: Unify keyboard press with object modifiers
**What**: Change `DesktopKeyboardPressRequest` to accept a `modifiers` object instead of requiring DSL strings like `"ctrl+c"`.
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopKeyModifiers { ctrl, shift, alt, cmd }` struct (all `Option<bool>`). Add `modifiers: Option<DesktopKeyModifiers>` to `DesktopKeyboardPressRequest`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Modify `press_key_args()` (~line 1349) to build xdotool key string from modifiers object. If modifiers present, construct `"ctrl+shift+a"` style string. `cmd` maps to `super`.
- `server/packages/sandbox-agent/src/router.rs` — Add `DesktopKeyModifiers` to OpenAPI schemas list.
- `docs/openapi.json` — Regenerate.
**Backward compatible**: Old `{"key": "ctrl+a"}` still works. New form: `{"key": "a", "modifiers": {"ctrl": true}}`.
**Test**: Unit test that `press_key_args("a", Some({ctrl: true, shift: true}))` produces `["key", "--", "ctrl+shift+a"]`. Integration test with both old and new request shapes.
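The modifier-to-string mapping described above can be sketched as follows. Struct and function names follow the plan's proposal, not existing code, and the real `press_key_args()` in `desktop_runtime.rs` may differ:

```rust
// Hypothetical request shape from the plan: all modifiers optional.
#[derive(Default)]
struct DesktopKeyModifiers {
    ctrl: Option<bool>,
    shift: Option<bool>,
    alt: Option<bool>,
    cmd: Option<bool>,
}

// Builds the xdotool argv. With no modifiers, the key passes through
// unchanged, which preserves the old DSL form like "ctrl+a".
fn press_key_args(key: &str, modifiers: Option<&DesktopKeyModifiers>) -> Vec<String> {
    let mut parts: Vec<&str> = Vec::new();
    if let Some(m) = modifiers {
        if m.ctrl.unwrap_or(false) { parts.push("ctrl"); }
        if m.shift.unwrap_or(false) { parts.push("shift"); }
        if m.alt.unwrap_or(false) { parts.push("alt"); }
        // `cmd` maps to X11's `super` modifier.
        if m.cmd.unwrap_or(false) { parts.push("super"); }
    }
    parts.push(key);
    vec!["key".to_string(), "--".to_string(), parts.join("+")]
}
```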
---
## Task 2: Add mouseDown/mouseUp and keyDown/keyUp endpoints
**What**: 4 new endpoints for low-level press/release control.
**Endpoints**:
- `POST /v1/desktop/mouse/down` — `xdotool mousedown BUTTON` (optional x,y moves first)
- `POST /v1/desktop/mouse/up` — `xdotool mouseup BUTTON`
- `POST /v1/desktop/keyboard/down` — `xdotool keydown KEY`
- `POST /v1/desktop/keyboard/up` — `xdotool keyup KEY`
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopMouseDownRequest`, `DesktopMouseUpRequest` (x/y optional, button optional), `DesktopKeyboardDownRequest`, `DesktopKeyboardUpRequest` (key: String).
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add 4 public methods following existing `click_mouse()` / `press_key()` patterns.
- `server/packages/sandbox-agent/src/router.rs` — Add 4 routes, 4 handlers with utoipa annotations.
- `sdks/typescript/src/client.ts` — Add `mouseDownDesktop()`, `mouseUpDesktop()`, `keyDownDesktop()`, `keyUpDesktop()`.
- `docs/openapi.json` — Regenerate.
**Test**: Integration test: mouseDown → mousemove → mouseUp sequence. keyDown → keyUp sequence.
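The "optional x,y moves first" behavior chains two xdotool commands in a single invocation. A minimal argv-builder sketch (names assumed, defaulting to button 1 like xdotool does):

```rust
// Builds the xdotool argv for POST /v1/desktop/mouse/down. When coordinates
// are supplied, the pointer is moved first in the same invocation.
fn mouse_down_args(x: Option<i32>, y: Option<i32>, button: Option<u8>) -> Vec<String> {
    let mut args: Vec<String> = Vec::new();
    if let (Some(x), Some(y)) = (x, y) {
        args.push("mousemove".into());
        args.push(x.to_string());
        args.push(y.to_string());
    }
    args.push("mousedown".into());
    args.push(button.unwrap_or(1).to_string());
    args
}
```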
---
## Task 3: Screenshot compression
**What**: Add format, quality, and scale query params to screenshot endpoints.
**Params**: `format` (png|jpeg|webp, default png), `quality` (1-100, default 85), `scale` (0.1-1.0, default 1.0).
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopScreenshotFormat` enum. Add `format`, `quality`, `scale` fields to `DesktopScreenshotQuery` and `DesktopRegionScreenshotQuery`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — After capturing PNG via `import`, pipe through ImageMagick `convert` if format != png or scale != 1.0: `convert png:- -resize {scale*100}% -quality {quality} {format}:-`. Add a `run_command_with_stdin()` helper (or modify existing `run_command_output`) to pipe bytes into a command's stdin.
- `server/packages/sandbox-agent/src/router.rs` — Modify screenshot handlers to pass format/quality/scale, return dynamic `Content-Type` header.
- `sdks/typescript/src/client.ts` — Update `takeDesktopScreenshot()` to accept format/quality/scale.
- `docs/openapi.json` — Regenerate.
**Dependencies**: ImageMagick `convert` already installed in Docker. Verify WebP delegate availability.
**Test**: Integration tests: request `?format=jpeg&quality=50`, verify `Content-Type: image/jpeg` and JPEG magic bytes. Verify default still returns PNG. Verify `?scale=0.5` returns a smaller image.
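The pipe-through-`convert` step can be sketched as an argv builder that returns `None` when no post-processing is needed (the defaults short-circuit straight to the raw PNG). Function name and the skip condition are assumptions from the plan:

```rust
// Builds the ImageMagick `convert` argv for the screenshot pipeline:
// read PNG from stdin, optionally resize, re-encode to `format` on stdout.
fn convert_args(format: &str, quality: u8, scale: f64) -> Option<Vec<String>> {
    // Defaults (png at full scale) need no conversion at all.
    if format == "png" && (scale - 1.0).abs() < f64::EPSILON {
        return None;
    }
    Some(vec![
        "png:-".to_string(),
        "-resize".to_string(),
        format!("{}%", (scale * 100.0).round() as u32),
        "-quality".to_string(),
        quality.to_string(),
        format!("{}:-", format),
    ])
}
```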
---
## Task 4: Window listing API
**What**: New endpoint to list open windows.
**Endpoint**: `GET /v1/desktop/windows`
**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopWindowInfo { id, title, x, y, width, height, is_active }` and `DesktopWindowListResponse`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add `list_windows()` method using xdotool (already installed):
1. `xdotool search --onlyvisible --name ""` → window IDs
2. `xdotool getwindowname {id}` + `xdotool getwindowgeometry {id}` per window
3. `xdotool getactivewindow` → is_active flag
4. Add `parse_window_geometry()` helper.
- `server/packages/sandbox-agent/src/router.rs` — Add route, handler, OpenAPI annotations.
- `sdks/typescript/src/client.ts` — Add `listDesktopWindows()`.
- `docs/openapi.json` — Regenerate.
**No new Docker dependencies** — xdotool already installed.
**Test**: Integration test: start desktop, verify `GET /v1/desktop/windows` returns 200 with a list (may be empty if no GUI apps open, which is fine).
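The `parse_window_geometry()` helper from step 4 has to pick `Position:` and `Geometry:` out of xdotool's human-readable output. A sketch, assuming xdotool's default output shape (`Position: X,Y (screen: N)` / `Geometry: WxH`):

```rust
// Parses `xdotool getwindowgeometry` output into (x, y, width, height).
// Returns None when either line is missing or malformed.
fn parse_window_geometry(output: &str) -> Option<(i32, i32, u32, u32)> {
    let mut pos: Option<(i32, i32)> = None;
    let mut size: Option<(u32, u32)> = None;
    for line in output.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("Position: ") {
            // "100,200 (screen: 0)" -> take the "100,200" token.
            let coords = rest.split_whitespace().next()?;
            let (x, y) = coords.split_once(',')?;
            pos = Some((x.parse().ok()?, y.parse().ok()?));
        } else if let Some(rest) = line.strip_prefix("Geometry: ") {
            let (w, h) = rest.split_once('x')?;
            size = Some((w.parse().ok()?, h.parse().ok()?));
        }
    }
    let ((x, y), (w, h)) = (pos?, size?);
    Some((x, y, w, h))
}
```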
---
## Task 5: Unify desktop processes into process runtime with owner flag
**What**: Desktop processes (Xvfb, openbox, dbus) get registered in the general process runtime with an `owner` field, gaining log streaming, SSE, and unified lifecycle for free.
**Files**:
- `server/packages/sandbox-agent/src/process_runtime.rs`:
- Add `ProcessOwner` enum: `User`, `Desktop`, `System`.
- Add `RestartPolicy` enum: `Never`, `Always`, `OnFailure`.
- Add `owner: ProcessOwner` and `restart_policy: Option<RestartPolicy>` to `ProcessStartSpec`, `ManagedProcess`, and `ProcessSnapshot`.
- Modify `list_processes()` to accept optional owner filter.
- Add auto-restart logic in `watch_exit()`: if restart_policy is Always (or OnFailure and exit code != 0), re-spawn the process using stored spec. Need to store the original `ProcessStartSpec` on `ManagedProcess`.
- `server/packages/sandbox-agent/src/router/types.rs`:
- Add `owner` to `ProcessInfo` response.
- Add `ProcessListQuery { owner: Option<ProcessOwner> }`.
- `server/packages/sandbox-agent/src/router.rs`:
- Modify `get_v1_processes` to accept `Query<ProcessListQuery>` and filter.
- Pass `ProcessRuntime` into `DesktopRuntime::new()`.
- Add `ProcessOwner`, `RestartPolicy` to OpenAPI schemas.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — **Major refactor**:
- Remove `ManagedDesktopChild` struct.
- `DesktopRuntime` takes `ProcessRuntime` as constructor param.
- `start_xvfb_locked()` and `start_openbox_locked()` call `process_runtime.start_process(ProcessStartSpec { owner: Desktop, restart_policy: Some(Always), ... })` instead of spawning directly.
- Store returned process IDs in state instead of `Child` handles.
- `stop` calls `process_runtime.stop_process()` / `kill_process()`.
- `processes_locked()` queries process runtime for desktop-owned processes.
- dbus-launch remains a direct one-shot spawn (it's not a long-running process, just produces env vars).
- `sdks/typescript/src/client.ts` — Add `owner` filter option to `listProcesses()`.
- `docs/openapi.json` — Regenerate.
**Risks**:
- Lock ordering: desktop runtime holds Mutex, process runtime uses RwLock. Release desktop Mutex before calling process runtime, or restructure.
- `log_path` field in `DesktopProcessInfo` no longer applies (logs are in-memory now). Remove or deprecate.
**Test**: Integration: start desktop, `GET /v1/processes?owner=desktop` returns Xvfb+openbox. `GET /v1/processes?owner=user` excludes them. Desktop process logs are streamable via `GET /v1/processes/{id}/logs?follow=true`. Existing desktop lifecycle tests still pass.
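The restart decision in `watch_exit()` reduces to a small pure function, which makes it easy to unit-test separately from process spawning. Enum and variant names are taken from the plan, not from existing code:

```rust
// Restart policy proposed for ProcessStartSpec.
#[derive(Clone, Copy, PartialEq, Debug)]
enum RestartPolicy {
    Never,
    Always,
    OnFailure,
}

// Decides whether watch_exit() should re-spawn from the stored spec.
// Absent policy behaves like Never, so user processes are unaffected.
fn should_restart(policy: Option<RestartPolicy>, exit_code: i32) -> bool {
    match policy {
        Some(RestartPolicy::Always) => true,
        Some(RestartPolicy::OnFailure) => exit_code != 0,
        Some(RestartPolicy::Never) | None => false,
    }
}
```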
---
## Task 6: Screen recording API (ffmpeg x11grab)
**What**: 6 endpoints for recording the desktop to MP4.
**Endpoints**:
- `POST /v1/desktop/recording/start` — Start ffmpeg recording
- `POST /v1/desktop/recording/stop` — Stop recording (SIGTERM → wait → SIGKILL)
- `GET /v1/desktop/recordings` — List recordings
- `GET /v1/desktop/recordings/{id}` — Get recording metadata
- `GET /v1/desktop/recordings/{id}/download` — Serve MP4 file
- `DELETE /v1/desktop/recordings/{id}` — Delete recording
**Files**:
- **New**: `server/packages/sandbox-agent/src/desktop_recording.rs` — Recording state, ffmpeg process management. `start_recording()` spawns ffmpeg via process runtime (owner=Desktop): `ffmpeg -f x11grab -video_size WxH -i :99 -c:v libx264 -preset ultrafast -r 30 {path}`. Recordings stored in `{state_dir}/recordings/`.
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add recording request/response types.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Wire recording manager, expose through desktop runtime.
- `server/packages/sandbox-agent/src/router.rs` — Add 6 routes + handlers.
- `server/packages/sandbox-agent/src/desktop_install.rs` — Add `ffmpeg` to dependency detection (soft: only error when recording is requested).
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add `ffmpeg` to apt-get.
- `sdks/typescript/src/client.ts` — Add 6 recording methods.
- `docs/openapi.json` — Regenerate.
**Depends on**: Task 5 (ffmpeg runs as desktop-owned process).
**Test**: Integration: start desktop → start recording → wait 2s → stop → list → download (verify MP4 magic bytes) → delete.
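The ffmpeg command quoted above can be assembled from the recording parameters; a sketch, with the display and output path parameterized rather than hard-coded (`:99` in the plan) and the function name assumed:

```rust
// Builds the argv for the x11grab recording command from the plan:
// ffmpeg -f x11grab -video_size WxH -i DISPLAY -c:v libx264 -preset ultrafast -r 30 PATH
fn ffmpeg_record_args(width: u32, height: u32, display: &str, path: &str) -> Vec<String> {
    let mut args: Vec<String> = vec!["-f".into(), "x11grab".into(), "-video_size".into()];
    args.push(format!("{}x{}", width, height));
    for s in ["-i", display, "-c:v", "libx264", "-preset", "ultrafast", "-r", "30", path] {
        args.push(s.to_string());
    }
    args
}
```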
---
## Task 7: Neko WebRTC desktop streaming + React component
**What**: Integrate neko for WebRTC desktop streaming, mirroring the ProcessTerminal + Ghostty pattern.
### Server side
- **New**: `server/packages/sandbox-agent/src/desktop_streaming.rs` — Manages neko process via process runtime (owner=Desktop). Neko connects to existing Xvfb display, runs GStreamer pipeline for H.264 encoding.
- `server/packages/sandbox-agent/src/router.rs`:
- `GET /v1/desktop/stream/ws` — WebSocket proxy to neko's internal WebSocket. Upgrade request, bridge bidirectionally.
- `POST /v1/desktop/stream/start` / `POST /v1/desktop/stream/stop` — Lifecycle control.
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add neko binary + GStreamer packages (`gstreamer1.0-plugins-base`, `gstreamer1.0-plugins-good`, `gstreamer1.0-x`, `libgstreamer1.0-0`). Consider making this an optional Docker stage to avoid bloating the base image.
### TypeScript SDK
- **New**: `sdks/typescript/src/desktop-stream.ts` — `DesktopStreamSession` class ported from neko's `base.ts` (~500 lines):
- WebSocket for signaling (SDP offer/answer, ICE candidates)
- `RTCPeerConnection` for video stream
- `RTCDataChannel` for binary input (mouse: 7 bytes, keyboard: 11 bytes)
- Events: `onTrack(stream)`, `onConnect()`, `onDisconnect()`, `onError()`
- `sdks/typescript/src/client.ts` — Add `connectDesktopStream()` returning `DesktopStreamSession`, `buildDesktopStreamWebSocketUrl()`, `startDesktopStream()`, `stopDesktopStream()`.
- `sdks/typescript/src/index.ts` — Export `DesktopStreamSession`.
### React SDK
- **New**: `sdks/react/src/DesktopViewer.tsx` — Following `ProcessTerminal.tsx` pattern:
```
Props: client (Pick<SandboxAgent, 'connectDesktopStream'>), height, className, style, onConnect, onDisconnect, onError
```
- `useEffect` → `client.connectDesktopStream()` → wire `onTrack` to `<video>.srcObject`
- Capture mouse events on video element → scale coordinates to desktop resolution → send via DataChannel
- Capture keyboard events → send via DataChannel
- Connection state indicator
- Cleanup: close RTCPeerConnection, close WebSocket
- `sdks/react/src/index.ts` — Export `DesktopViewer`.
**Depends on**: Task 5 (neko runs as desktop-owned process).
**Test**: Server integration: start stream, connect WebSocket, verify signaling messages flow. React: component mounts/unmounts without errors. Full E2E requires browser (manual initially).
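The fixed-size binary frames (mouse: 7 bytes, keyboard: 11 bytes) are defined by neko's wire protocol, which this plan does not spell out. Purely to illustrate the encoding style, here is a hypothetical 7-byte mouse frame (u16 opcode + u16 x + u16 y + u8 button, big-endian) — the real field layout must be taken from neko's `base.ts`:

```rust
// Illustrative only: packs a mouse event into a fixed 7-byte frame.
// The actual opcode values and field order are neko's, not shown here.
fn encode_mouse_frame(opcode: u16, x: u16, y: u16, button: u8) -> [u8; 7] {
    let mut buf = [0u8; 7];
    buf[0..2].copy_from_slice(&opcode.to_be_bytes());
    buf[2..4].copy_from_slice(&x.to_be_bytes());
    buf[4..6].copy_from_slice(&y.to_be_bytes());
    buf[6] = button;
    buf
}
```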
---
## Verification
After all tasks:
1. `cargo test` — All Rust unit tests pass
2. `cargo test --test v1_api` — All integration tests pass (requires Docker)
3. Regenerate `docs/openapi.json` and verify it reflects all new endpoints
4. Build TypeScript SDK: `cd sdks/typescript && pnpm build`
5. Build React SDK: `cd sdks/react && pnpm build`
6. Manual: start desktop, take JPEG screenshot, list windows, record 5s video, stream desktop via DesktopViewer component


@ -1,202 +0,0 @@
# Proposal: Revert Actions-Only Pattern Back to Queues/Workflows
## Background
We converted all actors from queue/workflow-based communication to direct actions as a workaround for a RivetKit bug where `c.queue.iter()` deadlocked for actors created from another actor's context. That bug has since been fixed in RivetKit. We want to revert to queues/workflows because they provide better observability (workflow history in the inspector), replay/recovery semantics, and are the idiomatic RivetKit pattern.
## Reference branches
- **`main`** at commit `32f3c6c3` — the original queue/workflow code BEFORE the actions refactor
- **`queues-to-actions`** — the actions refactor code with bug fixes (E2B, lazy tasks, etc.)
- **`task-owner-git-auth`** at commit `3684e2e5` — the CURRENT branch with all work including task owner system, lazy tasks, and actions refactor
Use `main` as the reference for the queue/workflow communication patterns. Use `task-owner-git-auth` (current HEAD) as the authoritative source for ALL features and bug fixes that MUST be preserved — it has everything from `queues-to-actions` plus the task owner system.
## What to KEEP (do NOT revert these)
These are bug fixes and improvements made during the actions refactor that are independent of the communication pattern:
### 1. Lazy task actor creation
- Virtual task entries in org's `taskIndex` + `taskSummaries` tables (no actor fan-out during PR sync)
- `refreshTaskSummaryForBranchMutation` writes directly to org tables instead of spawning task actors
- Task actors self-initialize in `getCurrentRecord()` from `getTaskIndexEntry` when lazily created
- `getTaskIndexEntry` action on org actor
- See CLAUDE.md "Lazy Task Actor Creation" section
### 2. `resolveTaskRepoId` replacing `requireRepoExists`
- `requireRepoExists` was removed — it did a cross-actor call from org to github-data that was fragile
- Replaced with `resolveTaskRepoId` which reads from the org's local `taskIndex` table
- `getTask` action resolves `repoId` from task index when not provided (sandbox actor only has taskId)
### 3. `getOrganizationContext` overrides threaded through sync phases
- `fullSyncBranchBatch`, `fullSyncMembers`, `fullSyncPullRequestBatch` now pass `connectedAccount`, `installationStatus`, `installationId` overrides from `FullSyncConfig`
- Without this, phases 2-4 fail with "Organization not initialized" when the org profile doesn't exist yet (webhook-triggered sync before user sign-in)
### 4. E2B sandbox fixes
- `timeoutMs: 60 * 60 * 1000` in E2B create options (TEMPORARY until rivetkit autoPause lands)
- Sandbox repo path uses `/home/user/repo` for E2B compatibility
- `listProcesses` error handling for expired E2B sandboxes
### 5. Frontend fixes
- React `useEffect` dependency stability in `mock-layout.tsx` and `organization-dashboard.tsx` (prevents infinite re-render loops)
- Terminal pane ref handling
### 6. Process crash protection
- `process.on("uncaughtException")` and `process.on("unhandledRejection")` handlers in `foundry/packages/backend/src/index.ts`
### 7. CLAUDE.md updates
- All new sections: lazy task creation rules, no-silent-catch policy, React hook dependency safety, dev workflow instructions, debugging section
### 8. `requireWorkspaceTask` uses `getOrCreate`
- User-initiated actions (createSession, sendMessage, etc.) use `getOrCreate` to lazily materialize virtual tasks
- The `getOrCreate` call passes `{ organizationId, repoId, taskId }` as `createWithInput`
### 9. `getTask` uses `getOrCreate` with `resolveTaskRepoId`
- When `repoId` is not provided (sandbox actor), resolves from task index
- Uses `getOrCreate` since the task may be virtual
### 10. Audit log deleted workflow file
- `foundry/packages/backend/src/actors/audit-log/workflow.ts` was deleted
- The audit-log actor was simplified to a single `append` action
- Keep this simplification — audit-log doesn't need a workflow
### 11. Task owner (primary user) system
- New `task_owner` single-row table in task actor DB schema (`foundry/packages/backend/src/actors/task/db/schema.ts`) — stores `primaryUserId`, `primaryGithubLogin`, `primaryGithubEmail`, `primaryGithubAvatarUrl`
- New migration in `foundry/packages/backend/src/actors/task/db/migrations.ts` creating the `task_owner` table
- `primaryUserLogin` and `primaryUserAvatarUrl` columns added to org's `taskSummaries` table (`foundry/packages/backend/src/actors/organization/db/schema.ts`) + corresponding migration
- `readTaskOwner()`, `upsertTaskOwner()` helpers in `workspace.ts`
- `maybeSwapTaskOwner()` — called from `sendWorkspaceMessage()`, checks if a different user is sending and swaps owner + injects git credentials into sandbox
- `changeTaskOwnerManually()` — called from the new `changeOwner` action on the task actor, updates owner without injecting credentials (credentials injected on next message from that user)
- `injectGitCredentials()` — pushes `git config user.name/email` + credential store file into the sandbox via `runProcess`
- `resolveGithubIdentity()` — resolves user's GitHub login/email/avatar/accessToken from their auth session
- `buildTaskSummary()` now includes `primaryUserLogin` and `primaryUserAvatarUrl` in the summary pushed to org coordinator
- New `changeOwner` action on task actor in `workflow/index.ts`
- New `changeWorkspaceTaskOwner` action on org actor in `actions/tasks.ts`
- New `TaskWorkspaceChangeOwnerInput` type in shared types (`foundry/packages/shared/src/workspace.ts`)
- `TaskSummary` type extended with `primaryUserLogin` and `primaryUserAvatarUrl`
### 12. Task owner UI
- New "Overview" tab in right sidebar (`foundry/packages/frontend/src/components/mock-layout/right-sidebar.tsx`) — shows current owner with avatar, click to open dropdown of org members to change owner
- `onChangeOwner` and `members` props added to `RightSidebar` component
- Primary user login shown in green in left sidebar task items (`foundry/packages/frontend/src/components/mock-layout/sidebar.tsx`)
- `changeWorkspaceTaskOwner` method added to backend client and workspace client interfaces
### 13. Client changes for task owner
- `changeWorkspaceTaskOwner()` added to `backend-client.ts` and all workspace client implementations (mock, remote)
- Mock workspace client implements the owner change
- Subscription manager test updated for new task summary shape
## What to REVERT (communication pattern only)
For each actor, revert from direct action calls back to queue sends with `expectQueueResponse` / fire-and-forget patterns. The reference for the queue patterns is `main` at `32f3c6c3`.
### 1. Organization actor (`foundry/packages/backend/src/actors/organization/`)
**`index.ts`:**
- Revert from actions-only to `run: workflow(runOrganizationWorkflow)`
- Keep the actions that are pure reads (getAppSnapshot, getOrganizationSummarySnapshot, etc.)
- Mutations should go through the workflow queue command loop
**`workflow.ts`:**
- Restore `runOrganizationWorkflow` with the `ctx.loop("organization-command-loop", ...)` that dispatches queue names to mutation handlers
- Restore `ORGANIZATION_QUEUE_NAMES` and `COMMAND_HANDLERS`
- Restore `organizationWorkflowQueueName()` helper
**`app-shell.ts`:**
- Revert direct action calls back to queue sends: `sendOrganizationCommand(org, "organization.command.X", body)` pattern
- Revert `githubData.syncRepos(...)` → `githubData.send(githubDataWorkflowQueueName("syncRepos"), ...)`
- But KEEP the `getOrganizationContext` override threading fix
**`actions/tasks.ts`:**
- Keep `resolveTaskRepoId` (replacing `requireRepoExists`)
- Keep `requireWorkspaceTask` using `getOrCreate`
- Keep `getTask` using `getOrCreate` with `resolveTaskRepoId`
- Keep `getTaskIndexEntry`
- Keep `changeWorkspaceTaskOwner` (new action — delegates to task actor's `changeOwner`)
- Revert task actor calls from direct actions to queue sends where applicable
**`actions/task-mutations.ts`:**
- Keep lazy task creation (virtual entries in org tables)
- Revert `taskHandle.initialize(...)` → `taskHandle.send(taskWorkflowQueueName("task.command.initialize"), ...)`
- Revert `task.pullRequestSync(...)` → `task.send(taskWorkflowQueueName("task.command.pullRequestSync"), ...)`
- Revert `auditLog.append(...)` → `auditLog.send("auditLog.command.append", ...)`
**`actions/organization.ts`:**
- Revert direct calls to org workflow back to queue sends
**`actions/github.ts`:**
- Revert direct calls back to queue sends
### 2. Task actor (`foundry/packages/backend/src/actors/task/`)
**`index.ts`:**
- Revert from actions-only to `run: workflow(runTaskWorkflow)` (or plain `run` with queue iteration)
- Keep read actions: `get`, `getTaskSummary`, `getTaskDetail`, `getSessionDetail`
**`workflow/index.ts`:**
- Restore `taskCommandActions` as queue handlers in the workflow command loop
- Restore `TASK_QUEUE_NAMES` and dispatch map
- Add `changeOwner` to the queue dispatch map (new command, not in `main` — add as `task.command.changeOwner`)
**`workspace.ts`:**
- Revert sandbox/org action calls back to queue sends where they were queue-based before
- Keep ALL task owner code: `readTaskOwner`, `upsertTaskOwner`, `maybeSwapTaskOwner`, `changeTaskOwnerManually`, `injectGitCredentials`, `resolveGithubIdentity`
- Keep the `authSessionId` param added to `ensureSandboxRepo`
- Keep the `maybeSwapTaskOwner` call in `sendWorkspaceMessage`
- Keep `primaryUserLogin`/`primaryUserAvatarUrl` in `buildTaskSummary`
### 3. User actor (`foundry/packages/backend/src/actors/user/`)
**`index.ts`:**
- Revert from actions-only to `run: workflow(runUserWorkflow)` (or plain `run` with queue iteration)
**`workflow.ts`:**
- Restore queue command loop dispatching to mutation functions
### 4. GitHub-data actor (`foundry/packages/backend/src/actors/github-data/`)
**`index.ts`:**
- Revert from actions-only to having a run handler with queue iteration
- Keep the `getOrganizationContext` override threading fix
- Keep the `actionTimeout: 10 * 60_000` for long sync operations
### 5. Audit-log actor
- Keep as actions-only (simplified). No need to revert — it's simpler with just `append`.
### 6. Callers
**`foundry/packages/backend/src/services/better-auth.ts`:**
- Revert direct user actor action calls back to queue sends
**`foundry/packages/backend/src/actors/sandbox/index.ts`:**
- Revert `organization.getTask(...)` → queue send if it was queue-based before
- Keep the E2B timeout fix and listProcesses error handling
## Step-by-step procedure
1. Create a new branch from `task-owner-git-auth` (current HEAD)
2. For each actor, open a 3-way comparison: `main` (original queues), `queues-to-actions` (current), and your working copy
3. Restore queue/workflow run handlers and command loops from `main`
4. Restore queue name helpers and constants from `main`
5. Restore caller sites to use queue sends from `main`
6. Carefully preserve all items in the "KEEP" list above
7. Test: `cd foundry && docker compose -f compose.dev.yaml up -d`, sign in, verify GitHub sync completes, verify tasks show in sidebar, verify session creation works
8. Nuke RivetKit data between test runs: `docker volume rm foundry_foundry_rivetkit_storage`
## Verification checklist
- [ ] GitHub sync completes (160 repos for rivet-dev)
- [ ] Tasks show in sidebar (from PR sync, lazy/virtual entries)
- [ ] No task actors spawned during sync (check RivetKit inspector — should see 0 task actors until user clicks one)
- [ ] Clicking a task materializes the actor (lazy creation via getOrCreate)
- [ ] Session creation works on sandbox-agent-testing repo
- [ ] E2B sandbox provisions and connects
- [ ] Agent responds to messages
- [ ] No 500 errors in backend logs (except expected E2B sandbox expiry)
- [ ] Workflow history visible in RivetKit inspector for org, task, user actors
- [ ] CLAUDE.md constraints still documented and respected
- [ ] Task owner shows in right sidebar "Overview" tab
- [ ] Owner dropdown shows org members and allows switching
- [ ] Sending a message as a different user swaps the owner
- [ ] Primary user login shown in green on sidebar task items
- [ ] Git credentials injected into sandbox on owner swap (check `/home/user/.git-token` exists)

# Proposal: RivetKit Sandbox Actor Resilience
## Context
The RivetKit sandbox actor (`src/sandbox/actor.ts`) does not handle the case where the underlying cloud sandbox (e.g. an E2B VM) is destroyed while the actor is still alive. This causes cascading 500 errors when the actor tries to call the dead sandbox. Additionally, a UNIQUE constraint bug in event persistence crashes the host process.
The sandbox-agent repo (which defines the E2B provider) will be updated separately to use `autoPause` and expose `pause()`/typed errors. This proposal covers the rivetkit-side changes needed to handle those signals.
## Changes
### 1. Fix `persistObservedEnvelope` UNIQUE constraint crash
**File:** `insertEvent` in the sandbox actor's SQLite persistence layer
The `sandbox_agent_events` table has a UNIQUE constraint on `(session_id, event_index)`. When the same event is observed twice (reconnection, replay, duplicate WebSocket delivery), the insert throws and crashes the host process as an unhandled rejection.
**Fix:** Change the INSERT to `INSERT OR IGNORE` / `ON CONFLICT DO NOTHING`. Duplicate events are expected and harmless — they should be silently deduplicated at the persistence layer.
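The intended dedupe semantics, modeled in a few lines of TypeScript. The class and method names here are illustrative, not the actual persistence API; the real fix is the one-line SQL change to `INSERT OR IGNORE INTO sandbox_agent_events ...`:

```typescript
// In-memory model of INSERT OR IGNORE on UNIQUE(session_id, event_index).
type EventRow = { sessionId: string; eventIndex: number; payload: string };

class EventStore {
  private rows = new Map<string, EventRow>();

  // Returns true if the row was inserted, false if it was a duplicate.
  insertEvent(row: EventRow): boolean {
    const key = `${row.sessionId}:${row.eventIndex}`;
    if (this.rows.has(key)) return false; // duplicate: silently deduplicated
    this.rows.set(key, row);
    return true;
  }

  count(): number {
    return this.rows.size;
  }
}
```

Replaying the same `(session_id, event_index)` pair is a no-op rather than an unhandled rejection, which is exactly what reconnection and duplicate WebSocket delivery require.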
### 2. Handle destroyed sandbox in `ensureAgent()`
**File:** `src/sandbox/actor.ts``ensureAgent()` function
When the provider's `start()` is called with an existing `sandboxId` and the sandbox no longer exists, the provider throws a typed `SandboxDestroyedError` (defined in the sandbox-agent provider contract).
`ensureAgent()` should catch this error and check the `onSandboxExpired` config option:
```typescript
// New config option on sandboxActor()
onSandboxExpired?: "destroy" | "recreate"; // default: "destroy"
```
**`"destroy"` (default):**
- Set `state.sandboxDestroyed = true`
- Emit `sandboxExpired` event to all connected clients
- All subsequent action calls (runProcess, createSession, etc.) return a clear error: "Sandbox has expired. Create a new task to continue."
- The sandbox actor stays alive (preserves session history, audit log) but rejects new work
**`"recreate"`:**
- Call provider `create()` to provision a fresh sandbox
- Store new `sandboxId` in state
- Emit `sandboxRecreated` event to connected clients with a notice that sessions are lost (new VM, no prior state)
- Resume normal operation with the new sandbox
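The two branches above can be sketched as follows. `SandboxDestroyedError` and the provider shape come from the not-yet-published sandbox-agent contract, so every name here is an assumption:

```typescript
// Sketch of the ensureAgent() expiry path under both policies.
class SandboxDestroyedError extends Error {}

interface Provider {
  start(opts: { sandboxId: string }): Promise<void>;
  create(): Promise<string>; // returns the new sandboxId
}

interface SandboxState {
  sandboxId: string;
  sandboxDestroyed: boolean;
}

async function ensureAgent(
  provider: Provider,
  state: SandboxState,
  onSandboxExpired: "destroy" | "recreate",
  broadcast: (event: string) => void,
): Promise<void> {
  try {
    await provider.start({ sandboxId: state.sandboxId });
  } catch (err) {
    if (!(err instanceof SandboxDestroyedError)) throw err;
    if (onSandboxExpired === "destroy") {
      // Actor stays alive (history preserved) but rejects new work.
      state.sandboxDestroyed = true;
      broadcast("sandboxExpired");
      return;
    }
    // "recreate": fresh VM, prior sessions are lost.
    state.sandboxId = await provider.create();
    broadcast("sandboxRecreated");
  }
}
```

Any other error type is rethrown untouched, per the "never silently fail" invariant below.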
### 3. Expose `pause` action
**File:** `src/sandbox/actor.ts` — actions
Add a `pause` action that delegates to the provider's `pause()` method. This is user-initiated only (e.g. user clicks "Pause sandbox" in UI to save credits). The sandbox actor should never auto-pause.
```typescript
async pause(c) {
  await c.provider.pause();
  c.state.sandboxPaused = true;
  c.broadcast("sandboxPaused", {});
}
```
### 4. Expose `resume` action
**File:** `src/sandbox/actor.ts` — actions
Add a `resume` action for explicit recovery. Calls `provider.start({ sandboxId: state.sandboxId })` which auto-resumes if paused.
```typescript
async resume(c) {
  await ensureAgent(c); // handles reconnect internally
  c.state.sandboxPaused = false;
  c.broadcast("sandboxResumed", {});
}
```
### 5. Keep-alive while sessions are active
**File:** `src/sandbox/actor.ts`
While the sandbox actor has connected WebSocket clients, periodically extend the underlying sandbox TTL to prevent it from being garbage collected mid-session.
- On first client connect: start a keep-alive interval (e.g. every 2 minutes)
- Each tick: call `provider.extendTimeout(extensionMs)` (the provider maps this to `sandbox.setTimeout()` for E2B)
- On last client disconnect: clear the interval, let the sandbox idle toward its natural timeout
This prevents the common case where a user is actively working but the sandbox expires because the E2B default timeout (5 min) is too short. The `timeoutMs` in create options is the initial TTL; keep-alive extends it dynamically.
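The connect/disconnect bookkeeping can be isolated into a small helper. A sketch with illustrative names, assuming the provider exposes `extendTimeout(ms)` as described above:

```typescript
// Keep-alive: extend sandbox TTL while at least one client is connected.
class KeepAlive {
  private clients = 0;
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(
    private extend: (ms: number) => void,
    private intervalMs = 2 * 60_000, // tick every 2 minutes
    private extensionMs = 10 * 60_000, // extend TTL by 10 minutes per tick
  ) {}

  onConnect(): void {
    this.clients++;
    if (this.clients === 1 && this.timer === null) {
      // First client: start extending the sandbox TTL periodically.
      this.timer = setInterval(() => this.extend(this.extensionMs), this.intervalMs);
    }
  }

  onDisconnect(): void {
    this.clients = Math.max(0, this.clients - 1);
    if (this.clients === 0 && this.timer !== null) {
      // Last client gone: let the sandbox idle toward its natural timeout.
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  get active(): boolean {
    return this.timer !== null;
  }
}
```

The interval and extension values are placeholders; the only invariant is that the timer runs iff at least one client is connected.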
## Key invariant
**Never silently fail.** Every destroyed/expired/error state must be surfaced to connected clients via events. The actor must always tell the UI what happened so the user can act on it. See CLAUDE.md "never silently catch errors" rule.
## Dependencies
These changes depend on the sandbox-agent provider contract exposing:
- `pause()` method
- `extendTimeout(ms)` method
- Typed `SandboxDestroyedError` thrown from `start()` when sandbox is gone
- `start()` auto-resuming paused sandboxes via `Sandbox.connect(sandboxId)`

# Proposal: Task Primary Owner & Git Authentication
## Problem
Sandbox git operations (commit, push, PR creation) require authentication.
Currently, the sandbox has no user-scoped credentials. The E2B sandbox
clones repos using the GitHub App installation token, but push operations
need user-scoped auth so commits are attributed correctly and branch
protection rules are enforced.
## Design
### Concept: Primary User per Task
Each task has a **primary user** (the "owner"). This is the last user who
sent a message on the task. Their GitHub OAuth credentials are injected
into the sandbox for git operations. When the owner changes, the sandbox
git config and credentials swap to the new user.
### Data Model
**Task actor DB** -- new `task_owner` single-row table:
- `primaryUserId` (text) -- better-auth user ID
- `primaryGithubLogin` (text) -- GitHub username (for `git config user.name`)
- `primaryGithubEmail` (text) -- GitHub email (for `git config user.email`)
- `primaryGithubAvatarUrl` (text) -- avatar for UI display
- `updatedAt` (integer)
**Org coordinator** -- add to `taskSummaries` table:
- `primaryUserLogin` (text, nullable)
- `primaryUserAvatarUrl` (text, nullable)
### Owner Swap Flow
Triggered when `sendWorkspaceMessage` is called with a different user than
the current primary:
1. `sendWorkspaceMessage(authSessionId, ...)` resolves user from auth session
2. Look up user's GitHub identity from auth account table (`providerId = "github"`)
3. Compare `primaryUserId` with current owner. If different:
a. Update `task_owner` row in task actor DB
b. Get user's OAuth `accessToken` from auth account
c. Push into sandbox via `runProcess`:
- `git config user.name "{login}"`
- `git config user.email "{email}"`
- Write token to `/home/user/.git-token` (or equivalent)
d. Push updated task summary to org coordinator (includes `primaryUserLogin`)
e. Broadcast `taskUpdated` to connected clients
4. If same user, no-op (token is still valid)
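The swap decision above can be sketched as a single function. All names here are illustrative stand-ins for the real logic in `workspace.ts`:

```typescript
// Sketch of the owner-swap check in the sendWorkspaceMessage flow.
interface TaskOwner {
  primaryUserId: string;
  primaryGithubLogin: string;
}

interface SwapDeps {
  readOwner(): TaskOwner | null;
  writeOwner(owner: TaskOwner): void;
  injectGitCredentials(owner: TaskOwner): Promise<void>;
  pushSummaryUpdate(owner: TaskOwner): Promise<void>;
}

// Returns true if an owner swap happened, false if it was a no-op.
async function maybeSwapTaskOwner(
  next: TaskOwner,
  deps: SwapDeps,
): Promise<boolean> {
  const current = deps.readOwner();
  if (current?.primaryUserId === next.primaryUserId) {
    return false; // same user: token is still valid, nothing to do
  }
  deps.writeOwner(next); // step 3a: update task_owner row
  await deps.injectGitCredentials(next); // step 3c: git config + credential file
  await deps.pushSummaryUpdate(next); // step 3d: org coordinator sees new owner
  return true;
}
```

Injecting the dependencies keeps the decision testable without a real sandbox or auth table.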
### Token Injection
The user's GitHub OAuth token (stored in better-auth account table) has
`repo` scope (verified -- see `better-auth.ts` line 480: `scope: ["read:org", "repo"]`).
This is a standard **OAuth App** flow (not GitHub App OAuth). OAuth App
tokens do not expire unless explicitly revoked. No refresh logic is needed.
**Injection method:**
On first sandbox repo setup (`ensureSandboxRepo`), configure:
```bash
# Write the credential in git credential-store format:
#   https://{login}:{token}@github.com
echo "https://{login}:{token}@github.com" > /home/user/.git-token
chmod 600 /home/user/.git-token
# Configure git to read credentials from that file
git config --global credential.helper 'store --file=/home/user/.git-token'
```
On owner swap, overwrite `/home/user/.git-token` with new user's credentials.
**Important: git should never prompt for credentials.** The credential
store file ensures all git operations are auto-authenticated. No
`GIT_ASKPASS` prompts, no interactive auth.
**Race condition (expected behavior):** If User A sends a message and the
agent starts a long git operation, then User B sends a message and triggers
an owner swap, the in-flight git process still has User A's credentials
(already read from the credential store). The next git operation uses
User B's credentials. This is expected behavior -- document in comments.
### Token Validity
OAuth App tokens (our flow) do not expire. They persist until the user
revokes them or the OAuth App is deauthorized. No periodic refresh needed.
If a token becomes invalid (user revokes), git operations will fail with
a 401. The error surfaces through the standard `ensureSandboxRepo` /
`runProcess` error path and is displayed in the UI.
### User Removal
When a user is removed from the organization:
1. Org actor queries active tasks with that user as primary owner
2. For each, clear the `task_owner` row
3. Task actor clears the sandbox git credentials (overwrite credential file)
4. Push updated task summaries to org coordinator
5. Subsequent git operations fail with "No active owner -- assign an owner to enable git operations"
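Steps 1-4 amount to a cleanup pass over the removed user's tasks. A sketch with hypothetical names (the real logic spans the org and task actors):

```typescript
// Sketch: clear ownership and credentials for every task the removed
// user owned. Returns the affected task IDs.
interface TaskRef {
  taskId: string;
  primaryUserId: string | null;
}

interface CleanupDeps {
  clearOwnerRow(taskId: string): Promise<void>;
  clearSandboxCredentials(taskId: string): Promise<void>; // overwrite credential file
  pushSummaryUpdate(taskId: string): Promise<void>;
}

async function handleUserRemoved(
  removedUserId: string,
  tasks: TaskRef[],
  deps: CleanupDeps,
): Promise<string[]> {
  const affected = tasks.filter((t) => t.primaryUserId === removedUserId);
  for (const task of affected) {
    await deps.clearOwnerRow(task.taskId);
    await deps.clearSandboxCredentials(task.taskId);
    await deps.pushSummaryUpdate(task.taskId);
  }
  return affected.map((t) => t.taskId);
}
```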
### UI Changes
**Right sidebar -- new "Overview" tab:**
- Add as a new tab alongside "Changes" and "All Files"
- Shows current primary user: avatar, name, login
- Click on the user -> dropdown of all workspace users (from org member list)
- Select a user -> triggers explicit owner swap (same flow as message-triggered)
- Also shows task metadata: branch, repo, created date
**Left sidebar -- task items:**
- Show primary user's GitHub login in green text next to task name
- Only shown when there is an active owner
**Task detail header:**
- Show small avatar of primary user next to task title
### Org Coordinator
`commandApplyTaskSummaryUpdate` already receives the full task summary
from the task actor. Add `primaryUserLogin` and `primaryUserAvatarUrl`
to the summary payload. The org writes it to `taskSummaries`. The sidebar
reads it from the org snapshot.
### Sandbox Architecture Note
Structurally, the system supports multiple sandboxes per task, but in
practice there is exactly one active sandbox per task. Design the owner
injection assuming one sandbox. The token is injected into the active
sandbox only. If multi-sandbox support is needed in the future, extend
the injection to target specific sandbox IDs.
## Security Considerations
### OAuth Token Scope
The user's GitHub OAuth token has `repo` scope, which grants **full control
of all private repositories** the user has access to. When injected into
the sandbox:
- The agent can read/write ANY repo the user has access to, not just the
task's target repo
- The token persists in the sandbox filesystem until overwritten
- Any process running in the sandbox can read the credential file
**Mitigations:**
- Credential file has `chmod 600` (owner-read-only)
- Sandbox is isolated per-task (E2B VM boundary)
- Token is overwritten on owner swap (old user's token removed)
- Token is cleared on user removal from org
- Sandbox has a finite lifetime (E2B timeout + autoPause)
**Accepted risk:** This is the standard trade-off for OAuth-based git
integrations (same as GitHub Codespaces, Gitpod, etc.). The user consents
to `repo` scope at sign-in time. Document this in user-facing terms in
the product's security/privacy page.
### Future: Fine-grained tokens
GitHub supports fine-grained personal access tokens scoped to specific
repos. A future improvement could mint per-repo tokens instead of using
the user's full OAuth token. This requires the user to create and manage
fine-grained tokens, which adds friction. Evaluate based on user feedback.
## Implementation Order
1. Add `task_owner` table to task actor schema + migration
2. Add `primaryUserLogin` / `primaryUserAvatarUrl` to `taskSummaries` schema + migration
3. Implement owner swap in `sendWorkspaceMessage` flow
4. Implement credential injection in `ensureSandboxRepo`
5. Implement credential swap via `runProcess` on owner change
6. Implement user removal cleanup in org actor
7. Add "Overview" tab to right sidebar
8. Add owner display to left sidebar task items
9. Add owner picker dropdown in Overview tab
10. Update org coordinator to propagate owner in task summaries
## Files to Modify
### Backend
- `foundry/packages/backend/src/actors/task/db/schema.ts` -- add `task_owner` table
- `foundry/packages/backend/src/actors/task/db/migrations.ts` -- add migration
- `foundry/packages/backend/src/actors/organization/db/schema.ts` -- add owner columns to `taskSummaries`
- `foundry/packages/backend/src/actors/organization/db/migrations.ts` -- add migration
- `foundry/packages/backend/src/actors/task/workspace.ts` -- owner swap logic in `sendWorkspaceMessage`, credential injection in `ensureSandboxRepo`
- `foundry/packages/backend/src/actors/task/workflow/index.ts` -- wire owner swap action
- `foundry/packages/backend/src/actors/organization/actions/task-mutations.ts` -- propagate owner in summaries
- `foundry/packages/backend/src/actors/organization/actions/tasks.ts` -- `sendWorkspaceMessage` owner check
- `foundry/packages/backend/src/services/better-auth.ts` -- expose `getAccessTokenForSession` for owner lookup
### Shared
- `foundry/packages/shared/src/types.ts` -- add `primaryUserLogin` to `TaskSummary`
### Frontend
- `foundry/packages/frontend/src/components/mock-layout/right-sidebar.tsx` -- add Overview tab
- `foundry/packages/frontend/src/components/organization-dashboard.tsx` -- show owner in sidebar task items
- `foundry/packages/frontend/src/components/mock-layout.tsx` -- wire Overview tab state

.gitignore vendored
@ -59,3 +59,4 @@ sdks/cli/platforms/*/bin/
# Foundry desktop app build artifacts
foundry/packages/desktop/frontend-dist/
foundry/packages/desktop/src-tauri/sidecars/
.context/