mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-15 03:00:48 +00:00
feat: desktop computer-use APIs with neko-based streaming
Add desktop runtime management (Xvfb, openbox, dbus), screen capture, mouse/keyboard input, and video streaming via neko binary extracted from the m1k1o/neko container. Includes Docker test rig, TypeScript SDK desktop support, and inspector Desktop tab. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent 3895e34bdb
commit 33821d8660

66 changed files with 13190 additions and 1135 deletions
BIN
.context/attachments/CleanShot 2026-03-08 at 18.53.28@2x.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 110 KiB |
19
.context/attachments/PR instructions.md
Normal file
@@ -0,0 +1,19 @@
The user likes the current state of the code.

There are 27 uncommitted changes.

The current branch is desktop-use.

The target branch is origin/main.

There is no upstream branch yet.

The user requested a PR.

Follow these steps to create a PR:

- If you have any skills related to creating PRs, invoke them now. Instructions there should take precedence over these instructions.
- Run `git diff` to review uncommitted changes.
- Commit them. Follow any instructions the user gave you about writing commit messages.
- Push to origin.
- Use `git diff origin/main...` to review the PR diff.
- Use `gh pr create --base main` to create a PR onto the target branch. Keep the title under 80 characters. Keep the description under five sentences, unless the user instructed you otherwise. Describe not just the changes made in this session but ALL changes in the workspace diff.

If any of these steps fail, ask the user for help.
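The steps above can be condensed into a sketch like the following. This is illustrative only: the placeholder commit message and PR body are assumptions, and the title shown is borrowed from this commit; the real values come from the session.

```shell
# Condensed sketch of the PR steps; create_pr is a hypothetical helper, not run here.
create_pr() {
  git diff                          # review uncommitted changes
  git add -A
  git commit -m "$1"
  git push -u origin desktop-use    # no upstream branch yet, so -u sets one
  git diff origin/main...           # review the full PR diff
  gh pr create --base main --title "$2" --body "$3"
}

# The 80-character title limit can be checked mechanically:
title="feat: desktop computer-use APIs with neko-based streaming"
[ "${#title}" -le 80 ] && echo "title ok"   # -> title ok
```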
101
.context/attachments/Review request-v1.md
Normal file
@@ -0,0 +1,101 @@
## Code Review Instructions

1. Launch a haiku agent to return a list of file paths (not their contents) for all relevant CLAUDE.md files, including:

   - The root CLAUDE.md file, if it exists
   - Any CLAUDE.md files in directories containing files modified by the workspace diff (use mcp__conductor__GetWorkspaceDiff with the stat option)

2. If this workspace has an associated PR, read the title and description (but not the changes). This will be helpful context.

3. In parallel with step 2, launch a sonnet agent to view the changes using mcp__conductor__GetWorkspaceDiff and return a summary of the changes.

4. Launch 4 agents in parallel to independently review the changes using mcp__conductor__GetWorkspaceDiff. Each agent should return a list of issues, where each issue includes a description and the reason it was flagged (e.g. "CLAUDE.md adherence", "bug"). The agents should do the following:

   Agents 1 + 2: CLAUDE.md or AGENTS.md compliance sonnet agents
   Audit changes for CLAUDE.md or AGENTS.md compliance in parallel. Note: when evaluating CLAUDE.md or AGENTS.md compliance for a file, only consider CLAUDE.md or AGENTS.md files that share a file path with the file or its parents.

   Agent 3: Opus bug agent
   Scan for obvious bugs. Focus only on the diff itself without reading extra context. Flag only significant bugs; ignore nitpicks and likely false positives. Do not flag issues that you cannot validate without looking at context outside of the git diff.

   Agent 4: Opus bug agent
   Look for problems that exist in the introduced code. This could be security issues, incorrect logic, etc. Only look for issues that fall within the changed code.

   **CRITICAL: We only want HIGH SIGNAL issues.** This means:

   - Objective bugs that will cause incorrect behavior at runtime
   - Clear, unambiguous CLAUDE.md violations where you can quote the exact rule being broken

   We do NOT want:

   - Subjective concerns or "suggestions"
   - Style preferences not explicitly required by CLAUDE.md
   - Potential issues that "might" be problems
   - Anything requiring interpretation or judgment calls

   If you are not certain an issue is real, do not flag it. False positives erode trust and waste reviewer time.

   In addition to the above, each subagent should be told the PR title and description. This will help provide context regarding the author's intent.

5. For each issue found in the previous step, launch parallel subagents to validate the issue. These subagents should get the PR title and description along with a description of the issue. The agent's job is to confirm with high confidence that the stated issue is truly an issue. For example, if an issue such as "variable is not defined" was flagged, the subagent's job is to validate that this is actually true in the code. Another example is CLAUDE.md issues: the agent should validate that the CLAUDE.md rule in question is scoped to this file and is actually violated. Use Opus subagents for bugs and logic issues, and sonnet agents for CLAUDE.md violations.

6. Filter out any issues that were not validated in step 5. This step yields the list of high-signal issues for our review.

7. Post inline comments for each issue using mcp__conductor__DiffComment:

   **IMPORTANT: Only post ONE comment per unique issue.**

8. Write out a list of issues found, along with the location of each comment. For example:

<example>
### **#1 Empty input causes crash**

If the input field is empty when the page loads, the app will crash.

File: src/ui/Input.tsx

### **#2 Dead code**

The getUserData function is now unused. It should be deleted.

File: src/core/UserData.ts
</example>

Use this list when evaluating issues in steps 5 and 6 (these are false positives; do NOT flag):

- Pre-existing issues
- Something that appears to be a bug but is actually correct
- Pedantic nitpicks that a senior engineer would not flag
- Issues that a linter will catch (do not run the linter to verify)
- General code quality concerns (e.g., lack of test coverage, general security issues) unless explicitly required in CLAUDE.md or AGENTS.md
- Issues mentioned in CLAUDE.md or AGENTS.md but explicitly silenced in the code (e.g., via a lint ignore comment)

Notes:

- All subagents should be explicitly instructed not to post comments themselves. Only you, the main agent, should post comments.
- Do not use the AskUserQuestion tool. Your goal should be to complete the entire review without user intervention.
- Use the gh CLI to interact with GitHub (e.g., fetch pull requests, create comments). Do not use web fetch.
- You must cite and link each issue in inline comments (e.g., if referring to a CLAUDE.md or AGENTS.md rule, include a link to it).

## Fallback: if you don't have access to subagents

If you don't have subagents, perform all the steps above yourself sequentially instead of launching agents. Do each review axis (CLAUDE.md compliance, bug scan, introduced problems) yourself, and validate each issue yourself.

## Fallback: if you don't have access to the workspace diff tool

If you don't have access to the mcp__conductor__GetWorkspaceDiff tool, use the following git commands to get the diff:

```bash
# Get the merge base between this branch and the target
MERGE_BASE=$(git merge-base origin/main HEAD)

# Get the committed diff against the merge base
git diff $MERGE_BASE HEAD

# Get any uncommitted changes (staged and unstaged)
git diff HEAD
```

Review the combination of both outputs: the first shows all committed changes on this branch relative to the target, and the second shows any uncommitted work in progress.

No need to mention in your report whether or not you used one of the fallback strategies; it's usually irrelevant.
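As an aside, the two diffs in the fallback can also be obtained in a single command: diffing the merge base directly against the working tree (no second argument) covers both the committed changes and any uncommitted work. A minimal sketch, assuming `origin/main` exists:

```shell
# Hypothetical helper: one diff spanning committed and uncommitted changes.
# `git diff <commit>` with no second argument compares against the working tree.
combined_diff() {
  git diff "$(git merge-base origin/main HEAD)"
}
```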
101
.context/attachments/Review request-v2.md
Normal file
101
.context/attachments/Review request-v3.md
Normal file
101
.context/attachments/Review request.md
Normal file
215
.context/attachments/plan.md
Normal file
@@ -0,0 +1,215 @@
# Desktop Computer Use API Enhancements

## Context

Competitive analysis of Daytona, Cloudflare Sandbox SDK, and CUA revealed significant gaps in our desktop computer use API. Both Daytona and Cloudflare have, or are building, screenshot compression, hotkey combos, mouseDown/mouseUp, keyDown/keyUp, per-component process health, and live desktop streaming. CUA additionally has window management and accessibility trees. We have none of these. This plan closes the most impactful gaps across 7 tasks.

## Execution Order

```
Sprint 1 (parallel, no dependencies): Tasks 1, 2, 3, 4
Sprint 2 (foundational refactor): Task 5
Sprint 3 (parallel, depend on #5): Tasks 6, 7
```

---
## Task 1: Unify keyboard press with object modifiers

**What**: Change `DesktopKeyboardPressRequest` to accept a `modifiers` object instead of requiring DSL strings like `"ctrl+c"`.

**Files**:

- `server/packages/sandbox-agent/src/desktop_types.rs` — Add a `DesktopKeyModifiers { ctrl, shift, alt, cmd }` struct (all `Option<bool>`). Add `modifiers: Option<DesktopKeyModifiers>` to `DesktopKeyboardPressRequest`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Modify `press_key_args()` (~line 1349) to build the xdotool key string from the modifiers object. If modifiers are present, construct a `"ctrl+shift+a"` style string. `cmd` maps to `super`.
- `server/packages/sandbox-agent/src/router.rs` — Add `DesktopKeyModifiers` to the OpenAPI schemas list.
- `docs/openapi.json` — Regenerate.

**Backward compatible**: The old `{"key": "ctrl+a"}` form still works. New form: `{"key": "a", "modifiers": {"ctrl": true}}`.

**Test**: Unit test that `press_key_args("a", Some({ctrl: true, shift: true}))` produces `["key", "--", "ctrl+shift+a"]`. Integration test with both old and new request shapes.
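The modifier-to-string mapping can be prototyped in shell before porting it to `press_key_args()`. The function name and flag order below are illustrative, not the actual Rust signature:

```shell
# Sketch: build an xdotool key string from boolean modifier flags.
# Mirrors the proposed DesktopKeyModifiers fields; cmd maps to xdotool's "super".
build_key_string() {
  local key=$1 ctrl=$2 shift_=$3 alt=$4 cmd=$5
  local out=""
  [ "$ctrl" = true ]   && out="${out}ctrl+"
  [ "$shift_" = true ] && out="${out}shift+"
  [ "$alt" = true ]    && out="${out}alt+"
  [ "$cmd" = true ]    && out="${out}super+"
  printf '%s%s\n' "$out" "$key"
}

build_key_string a true true false false   # -> ctrl+shift+a
```

The same ordering (ctrl, shift, alt, super) keeps output deterministic, which makes the planned unit test a simple string comparison.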
---

## Task 2: Add mouseDown/mouseUp and keyDown/keyUp endpoints

**What**: Four new endpoints for low-level press/release control.

**Endpoints**:

- `POST /v1/desktop/mouse/down` — `xdotool mousedown BUTTON` (optional x,y moves first)
- `POST /v1/desktop/mouse/up` — `xdotool mouseup BUTTON`
- `POST /v1/desktop/keyboard/down` — `xdotool keydown KEY`
- `POST /v1/desktop/keyboard/up` — `xdotool keyup KEY`

**Files**:

- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopMouseDownRequest` and `DesktopMouseUpRequest` (x/y optional, button optional), plus `DesktopKeyboardDownRequest` and `DesktopKeyboardUpRequest` (key: String).
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add four public methods following the existing `click_mouse()` / `press_key()` patterns.
- `server/packages/sandbox-agent/src/router.rs` — Add four routes and four handlers with utoipa annotations.
- `sdks/typescript/src/client.ts` — Add `mouseDownDesktop()`, `mouseUpDesktop()`, `keyDownDesktop()`, `keyUpDesktop()`.
- `docs/openapi.json` — Regenerate.

**Test**: Integration test: a mouseDown → mousemove → mouseUp sequence, and a keyDown → keyUp sequence.
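The press/release split is what enables drags, which a single click endpoint cannot express. As a sketch of the underlying xdotool sequence a client could drive once these endpoints exist (`drag_plan` is a hypothetical helper that emits the commands rather than running them, so no X server is needed):

```shell
# Sketch: the low-level xdotool call sequence for a drag from (x1,y1) to (x2,y2).
drag_plan() {
  local x1=$1 y1=$2 x2=$3 y2=$4
  echo "xdotool mousemove $x1 $y1"
  echo "xdotool mousedown 1"   # press and hold left button
  echo "xdotool mousemove $x2 $y2"
  echo "xdotool mouseup 1"     # release to complete the drag
}

drag_plan 100 100 400 300
# In a live X session: drag_plan 100 100 400 300 | sh
```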
---
|
||||
|
||||
## Task 3: Screenshot compression

**What**: Add format, quality, and scale query params to screenshot endpoints.

**Params**: `format` (png|jpeg|webp, default png), `quality` (1-100, default 85), `scale` (0.1-1.0, default 1.0).

**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopScreenshotFormat` enum. Add `format`, `quality`, `scale` fields to `DesktopScreenshotQuery` and `DesktopRegionScreenshotQuery`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — After capturing PNG via `import`, pipe through ImageMagick `convert` if format != png or scale != 1.0: `convert png:- -resize {scale*100}% -quality {quality} {format}:-`. Add a `run_command_with_stdin()` helper (or modify existing `run_command_output`) to pipe bytes into a command's stdin.
- `server/packages/sandbox-agent/src/router.rs` — Modify screenshot handlers to pass format/quality/scale, return a dynamic `Content-Type` header.
- `sdks/typescript/src/client.ts` — Update `takeDesktopScreenshot()` to accept format/quality/scale.
- `docs/openapi.json` — Regenerate.

**Dependencies**: ImageMagick `convert` already installed in Docker. Verify WebP delegate availability.

**Test**: Integration tests: request `?format=jpeg&quality=50`, verify `Content-Type: image/jpeg` and JPEG magic bytes. Verify default still returns PNG. Verify `?scale=0.5` returns a smaller image.
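The skip condition and argv construction can be sketched as a pure function; the planned Rust code would mirror this (`convertArgs` is an illustrative name, and `-resize` is only emitted when actually scaling):

```typescript
type ScreenshotFormat = "png" | "jpeg" | "webp";

// Decide whether re-encoding is needed and build the ImageMagick argv.
// stdin would carry the PNG from `import`; stdout carries the converted result.
// Returns null when the capture can be served untouched (png at full scale).
function convertArgs(format: ScreenshotFormat, quality: number, scale: number): string[] | null {
  if (format === "png" && scale === 1.0) return null;
  const args = ["convert", "png:-"];
  if (scale !== 1.0) args.push("-resize", `${scale * 100}%`);
  args.push("-quality", String(quality), `${format}:-`);
  return args;
}
```

Note `-quality` is still passed for png output, where ImageMagick interprets it as compression/filter settings rather than lossy quality.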
---
## Task 4: Window listing API

**What**: New endpoint to list open windows.

**Endpoint**: `GET /v1/desktop/windows`

**Files**:
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopWindowInfo { id, title, x, y, width, height, is_active }` and `DesktopWindowListResponse`.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add `list_windows()` method using xdotool (already installed):
  1. `xdotool search --onlyvisible --name ""` → window IDs
  2. `xdotool getwindowname {id}` + `xdotool getwindowgeometry {id}` per window
  3. `xdotool getactivewindow` → is_active flag
  4. Add `parse_window_geometry()` helper.
- `server/packages/sandbox-agent/src/router.rs` — Add route, handler, OpenAPI annotations.
- `sdks/typescript/src/client.ts` — Add `listDesktopWindows()`.
- `docs/openapi.json` — Regenerate.

**No new Docker dependencies** — xdotool already installed.

**Test**: Integration test: start desktop, verify `GET /v1/desktop/windows` returns 200 with a list (may be empty if no GUI apps open, which is fine).
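The `parse_window_geometry()` helper amounts to two regex pulls over the xdotool output. A TypeScript sketch, assuming the usual `getwindowgeometry` output shape shown in the comment (worth verifying against the installed xdotool version):

```typescript
// Parse `xdotool getwindowgeometry` output into position and size.
// Assumed output shape:
//   Window 62914561
//     Position: 100,200 (screen: 0)
//     Geometry: 640x480
function parseWindowGeometry(
  out: string,
): { x: number; y: number; width: number; height: number } | null {
  const pos = /Position:\s*(-?\d+),(-?\d+)/.exec(out); // positions can be negative
  const geo = /Geometry:\s*(\d+)x(\d+)/.exec(out);
  if (!pos || !geo) return null;
  return { x: Number(pos[1]), y: Number(pos[2]), width: Number(geo[1]), height: Number(geo[2]) };
}
```

Returning null on unparseable output lets `list_windows()` skip windows that vanished between the search and the geometry call.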
---
## Task 5: Unify desktop processes into process runtime with owner flag

**What**: Desktop processes (Xvfb, openbox, dbus) get registered in the general process runtime with an `owner` field, gaining log streaming, SSE, and unified lifecycle for free.

**Files**:

- `server/packages/sandbox-agent/src/process_runtime.rs`:
  - Add `ProcessOwner` enum: `User`, `Desktop`, `System`.
  - Add `RestartPolicy` enum: `Never`, `Always`, `OnFailure`.
  - Add `owner: ProcessOwner` and `restart_policy: Option<RestartPolicy>` to `ProcessStartSpec`, `ManagedProcess`, and `ProcessSnapshot`.
  - Modify `list_processes()` to accept an optional owner filter.
  - Add auto-restart logic in `watch_exit()`: if restart_policy is Always (or OnFailure and exit code != 0), re-spawn the process using the stored spec. Need to store the original `ProcessStartSpec` on `ManagedProcess`.

- `server/packages/sandbox-agent/src/router/types.rs`:
  - Add `owner` to `ProcessInfo` response.
  - Add `ProcessListQuery { owner: Option<ProcessOwner> }`.

- `server/packages/sandbox-agent/src/router.rs`:
  - Modify `get_v1_processes` to accept `Query<ProcessListQuery>` and filter.
  - Pass `ProcessRuntime` into `DesktopRuntime::new()`.
  - Add `ProcessOwner`, `RestartPolicy` to OpenAPI schemas.

- `server/packages/sandbox-agent/src/desktop_runtime.rs` — **Major refactor**:
  - Remove `ManagedDesktopChild` struct.
  - `DesktopRuntime` takes `ProcessRuntime` as a constructor param.
  - `start_xvfb_locked()` and `start_openbox_locked()` call `process_runtime.start_process(ProcessStartSpec { owner: Desktop, restart_policy: Some(Always), ... })` instead of spawning directly.
  - Store returned process IDs in state instead of `Child` handles.
  - `stop` calls `process_runtime.stop_process()` / `kill_process()`.
  - `processes_locked()` queries the process runtime for desktop-owned processes.
  - dbus-launch remains a direct one-shot spawn (it's not a long-running process, just produces env vars).

- `sdks/typescript/src/client.ts` — Add an `owner` filter option to `listProcesses()`.
- `docs/openapi.json` — Regenerate.

**Risks**:
- Lock ordering: desktop runtime holds a Mutex, process runtime uses an RwLock. Release the desktop Mutex before calling the process runtime, or restructure.
- The `log_path` field in `DesktopProcessInfo` no longer applies (logs are in-memory now). Remove or deprecate.

**Test**: Integration: start desktop, `GET /v1/processes?owner=desktop` returns Xvfb+openbox. `GET /v1/processes?owner=user` excludes them. Desktop process logs are streamable via `GET /v1/processes/{id}/logs?follow=true`. Existing desktop lifecycle tests still pass.
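The restart decision in `watch_exit()` reduces to a pure function over policy and exit status. A sketch (treating signal death — no exit code — as failure is an assumption to confirm in the Rust implementation):

```typescript
type RestartPolicy = "never" | "always" | "on_failure";

// Decide whether watch_exit() should re-spawn from the stored ProcessStartSpec.
// exitCode is null when the process died from a signal rather than exiting.
function shouldRestart(policy: RestartPolicy | undefined, exitCode: number | null): boolean {
  switch (policy) {
    case "always":
      return true;
    case "on_failure":
      return exitCode !== 0; // non-zero exit, or signal death (assumption)
    default:
      return false; // no policy or "never"
  }
}
```

Keeping this branch-free of runtime state is what makes the auto-restart logic unit-testable without spawning anything.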
---
## Task 6: Screen recording API (ffmpeg x11grab)

**What**: 6 endpoints for recording the desktop to MP4.

**Endpoints**:
- `POST /v1/desktop/recording/start` — Start ffmpeg recording
- `POST /v1/desktop/recording/stop` — Stop recording (SIGTERM → wait → SIGKILL)
- `GET /v1/desktop/recordings` — List recordings
- `GET /v1/desktop/recordings/{id}` — Get recording metadata
- `GET /v1/desktop/recordings/{id}/download` — Serve MP4 file
- `DELETE /v1/desktop/recordings/{id}` — Delete recording

**Files**:
- **New**: `server/packages/sandbox-agent/src/desktop_recording.rs` — Recording state, ffmpeg process management. `start_recording()` spawns ffmpeg via the process runtime (owner=Desktop): `ffmpeg -f x11grab -video_size WxH -i :99 -c:v libx264 -preset ultrafast -r 30 {path}`. Recordings stored in `{state_dir}/recordings/`.
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add recording request/response types.
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Wire in the recording manager, expose it through the desktop runtime.
- `server/packages/sandbox-agent/src/router.rs` — Add 6 routes + handlers.
- `server/packages/sandbox-agent/src/desktop_install.rs` — Add `ffmpeg` to dependency detection (soft: only error when recording is requested).
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add `ffmpeg` to apt-get.
- `sdks/typescript/src/client.ts` — Add 6 recording methods.
- `docs/openapi.json` — Regenerate.

**Depends on**: Task 5 (ffmpeg runs as a desktop-owned process).

**Test**: Integration: start desktop → start recording → wait 2s → stop → list → download (verify MP4 magic bytes) → delete.
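The ffmpeg invocation from the plan, as an argv builder. The display should come from the running Xvfb state rather than the hardcoded `:99` in the example above (`recordingArgs` is an illustrative name):

```typescript
// Build the x11grab argv for start_recording(). Width/height come from the
// desktop's configured resolution; outPath lives under {state_dir}/recordings/.
function recordingArgs(display: string, width: number, height: number, outPath: string): string[] {
  return [
    "ffmpeg",
    "-f", "x11grab",                     // capture an X11 display
    "-video_size", `${width}x${height}`, // must match the Xvfb geometry
    "-i", display,
    "-c:v", "libx264",
    "-preset", "ultrafast",              // favor low CPU over compression ratio
    "-r", "30",
    outPath,
  ];
}
```

SIGTERM on stop lets ffmpeg finalize the MP4 moov atom; SIGKILL as in the stop endpoint's fallback would leave an unplayable file, which is why the stop sequence waits first.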
---
## Task 7: Neko WebRTC desktop streaming + React component

**What**: Integrate neko for WebRTC desktop streaming, mirroring the ProcessTerminal + Ghostty pattern.

### Server side

- **New**: `server/packages/sandbox-agent/src/desktop_streaming.rs` — Manages the neko process via the process runtime (owner=Desktop). Neko connects to the existing Xvfb display, runs a GStreamer pipeline for H.264 encoding.
- `server/packages/sandbox-agent/src/router.rs`:
  - `GET /v1/desktop/stream/ws` — WebSocket proxy to neko's internal WebSocket. Upgrade the request, bridge bidirectionally.
  - `POST /v1/desktop/stream/start` / `POST /v1/desktop/stream/stop` — Lifecycle control.
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add the neko binary + GStreamer packages (`gstreamer1.0-plugins-base`, `gstreamer1.0-plugins-good`, `gstreamer1.0-x`, `libgstreamer1.0-0`). Consider making this an optional Docker stage to avoid bloating the base image.

### TypeScript SDK

- **New**: `sdks/typescript/src/desktop-stream.ts` — `DesktopStreamSession` class ported from neko's `base.ts` (~500 lines):
  - WebSocket for signaling (SDP offer/answer, ICE candidates)
  - `RTCPeerConnection` for the video stream
  - `RTCDataChannel` for binary input (mouse: 7 bytes, keyboard: 11 bytes)
  - Events: `onTrack(stream)`, `onConnect()`, `onDisconnect()`, `onError()`
- `sdks/typescript/src/client.ts` — Add `connectDesktopStream()` returning `DesktopStreamSession`, `buildDesktopStreamWebSocketUrl()`, `startDesktopStream()`, `stopDesktopStream()`.
- `sdks/typescript/src/index.ts` — Export `DesktopStreamSession`.

### React SDK

- **New**: `sdks/react/src/DesktopViewer.tsx` — Following the `ProcessTerminal.tsx` pattern:

  ```
  Props: client (Pick<SandboxAgent, 'connectDesktopStream'>), height, className, style, onConnect, onDisconnect, onError
  ```

  - `useEffect` → `client.connectDesktopStream()` → wire `onTrack` to `<video>.srcObject`
  - Capture mouse events on the video element → scale coordinates to the desktop resolution → send via DataChannel
  - Capture keyboard events → send via DataChannel
  - Connection state indicator
  - Cleanup: close the RTCPeerConnection, close the WebSocket
- `sdks/react/src/index.ts` — Export `DesktopViewer`.

**Depends on**: Task 5 (neko runs as a desktop-owned process).

**Test**: Server integration: start stream, connect WebSocket, verify signaling messages flow. React: component mounts/unmounts without errors. Full E2E requires a browser (manual initially).
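The coordinate-scaling step in DesktopViewer is easy to get wrong when the `<video>` element is CSS-scaled. A sketch of the mapping (`toDesktopCoords` is an illustrative name; `rect` is the element's `getBoundingClientRect()`):

```typescript
// Map a mouse event on the <video> element to desktop-resolution coordinates
// before encoding it for the DataChannel. Client coords are rescaled by
// desktopSize / renderedSize and clamped so drags leaving the element stay in range.
function toDesktopCoords(
  clientX: number, clientY: number,
  rect: { left: number; top: number; width: number; height: number },
  desktop: { width: number; height: number },
): { x: number; y: number } {
  const x = Math.round(((clientX - rect.left) / rect.width) * desktop.width);
  const y = Math.round(((clientY - rect.top) / rect.height) * desktop.height);
  return {
    x: Math.min(Math.max(x, 0), desktop.width - 1),
    y: Math.min(Math.max(y, 0), desktop.height - 1),
  };
}
```

The binary payload layouts (7-byte mouse, 11-byte keyboard) should be taken verbatim from neko's `base.ts` during the port rather than re-derived.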
---
## Verification

After all tasks:
1. `cargo test` — All Rust unit tests pass
2. `cargo test --test v1_api` — All integration tests pass (requires Docker)
3. Regenerate `docs/openapi.json` and verify it reflects all new endpoints
4. Build the TypeScript SDK: `cd sdks/typescript && pnpm build`
5. Build the React SDK: `cd sdks/react && pnpm build`
6. Manual: start desktop, take a JPEG screenshot, list windows, record a 5s video, stream the desktop via the DesktopViewer component
`.context/docker-test-image.stamp` — new file (0 lines)

`.context/docker-test-zgvGyf/bin/Xvfb` — new executable file (15 lines)

@ -0,0 +1,15 @@
#!/usr/bin/env sh
set -eu
display="${1:-:191}"
number="${display#:}"
socket="/tmp/.X11-unix/X${number}"
mkdir -p /tmp/.X11-unix
touch "$socket"
cleanup() {
  rm -f "$socket"
  exit 0
}
trap cleanup INT TERM EXIT
while :; do
  sleep 1
done
`.context/docker-test-zgvGyf/bin/dbus-launch` — new executable file (4 lines)

@ -0,0 +1,4 @@
#!/usr/bin/env sh
set -eu
echo "DBUS_SESSION_BUS_ADDRESS=unix:path=/tmp/sandbox-agent-test-bus"
echo "DBUS_SESSION_BUS_PID=$$"
`.context/docker-test-zgvGyf/bin/import` — new executable file (3 lines)

@ -0,0 +1,3 @@
#!/usr/bin/env sh
set -eu
printf '\211PNG\r\n\032\n\000\000\000\rIHDR\000\000\000\001\000\000\000\001\010\006\000\000\000\037\025\304\211\000\000\000\013IDATx\234c\000\001\000\000\005\000\001\r\n-\264\000\000\000\000IEND\256B`\202'
`.context/docker-test-zgvGyf/bin/openbox` — new executable file (6 lines)

@ -0,0 +1,6 @@
#!/usr/bin/env sh
set -eu
trap 'exit 0' INT TERM
while :; do
  sleep 1
done
`.context/docker-test-zgvGyf/bin/xdotool` — new executable file (57 lines)

@ -0,0 +1,57 @@
#!/usr/bin/env sh
set -eu
state_dir="${SANDBOX_AGENT_DESKTOP_FAKE_STATE_DIR:?missing fake state dir}"
state_file="${state_dir}/mouse"
mkdir -p "$state_dir"
if [ ! -f "$state_file" ]; then
  printf '0 0\n' > "$state_file"
fi

read_state() {
  read -r x y < "$state_file"
}

write_state() {
  printf '%s %s\n' "$1" "$2" > "$state_file"
}

command="${1:-}"
case "$command" in
  getmouselocation)
    read_state
    printf 'X=%s\nY=%s\nSCREEN=0\nWINDOW=0\n' "$x" "$y"
    ;;
  mousemove)
    shift
    x="${1:-0}"
    y="${2:-0}"
    shift 2 || true
    while [ "$#" -gt 0 ]; do
      token="$1"
      shift
      case "$token" in
        mousemove)
          x="${1:-0}"
          y="${2:-0}"
          shift 2 || true
          ;;
        mousedown|mouseup)
          shift 1 || true
          ;;
        click)
          if [ "${1:-}" = "--repeat" ]; then
            shift 2 || true
          fi
          shift 1 || true
          ;;
      esac
    done
    write_state "$x" "$y"
    ;;
  type|key)
    exit 0
    ;;
  *)
    exit 0
    ;;
esac
`.context/docker-test-zgvGyf/bin/xrandr` — new executable file (5 lines)

@ -0,0 +1,5 @@
#!/usr/bin/env sh
set -eu
cat <<'EOF'
Screen 0: minimum 1 x 1, current 1440 x 900, maximum 32767 x 32767
EOF
`.context/docker-test-zgvGyf/xdg-data/sandbox-agent/bin/agent_processes/mock-acp` — new executable file (111 lines)

@ -0,0 +1,111 @@
#!/usr/bin/env node
const { createInterface } = require("node:readline");

let nextSession = 0;

function emit(value) {
  process.stdout.write(JSON.stringify(value) + "\n");
}

function firstText(prompt) {
  if (!Array.isArray(prompt)) {
    return "";
  }

  for (const block of prompt) {
    if (block && block.type === "text" && typeof block.text === "string") {
      return block.text;
    }
  }

  return "";
}

const rl = createInterface({
  input: process.stdin,
  crlfDelay: Infinity,
});

rl.on("line", (line) => {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return;
  }

  const hasMethod = typeof msg?.method === "string";
  const hasId = Object.prototype.hasOwnProperty.call(msg, "id");
  const method = hasMethod ? msg.method : undefined;

  if (method === "session/prompt") {
    const sessionId = typeof msg?.params?.sessionId === "string" ? msg.params.sessionId : "";
    const text = firstText(msg?.params?.prompt);
    emit({
      jsonrpc: "2.0",
      method: "session/update",
      params: {
        sessionId,
        update: {
          sessionUpdate: "agent_message_chunk",
          content: {
            type: "text",
            text: "mock: " + text,
          },
        },
      },
    });
  }

  if (!hasMethod || !hasId) {
    return;
  }

  if (method === "initialize") {
    emit({
      jsonrpc: "2.0",
      id: msg.id,
      result: {
        protocolVersion: 1,
        capabilities: {},
        serverInfo: {
          name: "mock-acp-agent",
          version: "0.0.1",
        },
      },
    });
    return;
  }

  if (method === "session/new") {
    nextSession += 1;
    emit({
      jsonrpc: "2.0",
      id: msg.id,
      result: {
        sessionId: "mock-session-" + nextSession,
      },
    });
    return;
  }

  if (method === "session/prompt") {
    emit({
      jsonrpc: "2.0",
      id: msg.id,
      result: {
        stopReason: "end_turn",
      },
    });
    return;
  }

  emit({
    jsonrpc: "2.0",
    id: msg.id,
    result: {
      ok: true,
      echoedMethod: method,
    },
  });
});
@ -0,0 +1,4 @@
ts=2026-03-08T07:57:29.140584296Z level=info target=sandbox_agent::telemetry message="anonymous telemetry is enabled, disable with --no-telemetry"
ts=2026-03-08T07:57:29.141203296Z level=info target=sandbox_agent::cli message="server listening" addr=0.0.0.0:3000
ts=2026-03-08T07:57:29.298687421Z level=info target=sandbox_agent::router span=http.request span_path=http.request message=request method=GET uri=/v1/health
ts=2026-03-08T07:57:29.302092338Z level=info target=sandbox_agent::router span=http.request span_path=http.request status="200 OK" latency_ms=3 method=GET uri=/v1/health

@ -0,0 +1 @@
5a1927c6af3d83586f34112f58e0c8d6
`.context/notes.md` — new file (0 lines)
`.context/plans/desktop-computer-use-api-enhancements.md` — new file (215 lines)
|
|
@ -0,0 +1,215 @@
|
|||
# Desktop Computer Use API Enhancements
|
||||
|
||||
## Context
|
||||
|
||||
Competitive analysis of Daytona, Cloudflare Sandbox SDK, and CUA revealed significant gaps in our desktop computer use API. Both Daytona and Cloudflare have or are building screenshot compression, hotkey combos, mouseDown/mouseUp, keyDown/keyUp, per-component process health, and live desktop streaming. CUA additionally has window management and accessibility trees. We have none of these. This plan closes the most impactful gaps across 7 tasks.
|
||||
|
||||
## Execution Order
|
||||
|
||||
```
|
||||
Sprint 1 (parallel, no dependencies): Tasks 1, 2, 3, 4
|
||||
Sprint 2 (foundational refactor): Task 5
|
||||
Sprint 3 (parallel, depend on #5): Tasks 6, 7
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Unify keyboard press with object modifiers
|
||||
|
||||
**What**: Change `DesktopKeyboardPressRequest` to accept a `modifiers` object instead of requiring DSL strings like `"ctrl+c"`.
|
||||
|
||||
**Files**:
|
||||
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopKeyModifiers { ctrl, shift, alt, cmd }` struct (all `Option<bool>`). Add `modifiers: Option<DesktopKeyModifiers>` to `DesktopKeyboardPressRequest`.
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Modify `press_key_args()` (~line 1349) to build xdotool key string from modifiers object. If modifiers present, construct `"ctrl+shift+a"` style string. `cmd` maps to `super`.
|
||||
- `server/packages/sandbox-agent/src/router.rs` — Add `DesktopKeyModifiers` to OpenAPI schemas list.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**Backward compatible**: Old `{"key": "ctrl+a"}` still works. New form: `{"key": "a", "modifiers": {"ctrl": true}}`.
|
||||
|
||||
**Test**: Unit test that `press_key_args("a", Some({ctrl: true, shift: true}))` produces `["key", "--", "ctrl+shift+a"]`. Integration test with both old and new request shapes.
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add mouseDown/mouseUp and keyDown/keyUp endpoints
|
||||
|
||||
**What**: 4 new endpoints for low-level press/release control.
|
||||
|
||||
**Endpoints**:
|
||||
- `POST /v1/desktop/mouse/down` — `xdotool mousedown BUTTON` (optional x,y moves first)
|
||||
- `POST /v1/desktop/mouse/up` — `xdotool mouseup BUTTON`
|
||||
- `POST /v1/desktop/keyboard/down` — `xdotool keydown KEY`
|
||||
- `POST /v1/desktop/keyboard/up` — `xdotool keyup KEY`
|
||||
|
||||
**Files**:
|
||||
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopMouseDownRequest`, `DesktopMouseUpRequest` (x/y optional, button optional), `DesktopKeyboardDownRequest`, `DesktopKeyboardUpRequest` (key: String).
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add 4 public methods following existing `click_mouse()` / `press_key()` patterns.
|
||||
- `server/packages/sandbox-agent/src/router.rs` — Add 4 routes, 4 handlers with utoipa annotations.
|
||||
- `sdks/typescript/src/client.ts` — Add `mouseDownDesktop()`, `mouseUpDesktop()`, `keyDownDesktop()`, `keyUpDesktop()`.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**Test**: Integration test: mouseDown → mousemove → mouseUp sequence. keyDown → keyUp sequence.
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Screenshot compression
|
||||
|
||||
**What**: Add format, quality, and scale query params to screenshot endpoints.
|
||||
|
||||
**Params**: `format` (png|jpeg|webp, default png), `quality` (1-100, default 85), `scale` (0.1-1.0, default 1.0).
|
||||
|
||||
**Files**:
|
||||
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopScreenshotFormat` enum. Add `format`, `quality`, `scale` fields to `DesktopScreenshotQuery` and `DesktopRegionScreenshotQuery`.
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — After capturing PNG via `import`, pipe through ImageMagick `convert` if format != png or scale != 1.0: `convert png:- -resize {scale*100}% -quality {quality} {format}:-`. Add a `run_command_with_stdin()` helper (or modify existing `run_command_output`) to pipe bytes into a command's stdin.
|
||||
- `server/packages/sandbox-agent/src/router.rs` — Modify screenshot handlers to pass format/quality/scale, return dynamic `Content-Type` header.
|
||||
- `sdks/typescript/src/client.ts` — Update `takeDesktopScreenshot()` to accept format/quality/scale.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**Dependencies**: ImageMagick `convert` already installed in Docker. Verify WebP delegate availability.
|
||||
|
||||
**Test**: Integration tests: request `?format=jpeg&quality=50`, verify `Content-Type: image/jpeg` and JPEG magic bytes. Verify default still returns PNG. Verify `?scale=0.5` returns a smaller image.
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Window listing API
|
||||
|
||||
**What**: New endpoint to list open windows.
|
||||
|
||||
**Endpoint**: `GET /v1/desktop/windows`
|
||||
|
||||
**Files**:
|
||||
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add `DesktopWindowInfo { id, title, x, y, width, height, is_active }` and `DesktopWindowListResponse`.
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Add `list_windows()` method using xdotool (already installed):
|
||||
1. `xdotool search --onlyvisible --name ""` → window IDs
|
||||
2. `xdotool getwindowname {id}` + `xdotool getwindowgeometry {id}` per window
|
||||
3. `xdotool getactivewindow` → is_active flag
|
||||
4. Add `parse_window_geometry()` helper.
|
||||
- `server/packages/sandbox-agent/src/router.rs` — Add route, handler, OpenAPI annotations.
|
||||
- `sdks/typescript/src/client.ts` — Add `listDesktopWindows()`.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**No new Docker dependencies** — xdotool already installed.
|
||||
|
||||
**Test**: Integration test: start desktop, verify `GET /v1/desktop/windows` returns 200 with a list (may be empty if no GUI apps open, which is fine).
|
||||
|
||||
---
|
||||
|
||||
## Task 5: Unify desktop processes into process runtime with owner flag
|
||||
|
||||
**What**: Desktop processes (Xvfb, openbox, dbus) get registered in the general process runtime with an `owner` field, gaining log streaming, SSE, and unified lifecycle for free.
|
||||
|
||||
**Files**:
|
||||
|
||||
- `server/packages/sandbox-agent/src/process_runtime.rs`:
|
||||
- Add `ProcessOwner` enum: `User`, `Desktop`, `System`.
|
||||
- Add `RestartPolicy` enum: `Never`, `Always`, `OnFailure`.
|
||||
- Add `owner: ProcessOwner` and `restart_policy: Option<RestartPolicy>` to `ProcessStartSpec`, `ManagedProcess`, and `ProcessSnapshot`.
|
||||
- Modify `list_processes()` to accept optional owner filter.
|
||||
- Add auto-restart logic in `watch_exit()`: if restart_policy is Always (or OnFailure and exit code != 0), re-spawn the process using stored spec. Need to store the original `ProcessStartSpec` on `ManagedProcess`.
|
||||
|
||||
- `server/packages/sandbox-agent/src/router/types.rs`:
|
||||
- Add `owner` to `ProcessInfo` response.
|
||||
- Add `ProcessListQuery { owner: Option<ProcessOwner> }`.
|
||||
|
||||
- `server/packages/sandbox-agent/src/router.rs`:
|
||||
- Modify `get_v1_processes` to accept `Query<ProcessListQuery>` and filter.
|
||||
- Pass `ProcessRuntime` into `DesktopRuntime::new()`.
|
||||
- Add `ProcessOwner`, `RestartPolicy` to OpenAPI schemas.
|
||||
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — **Major refactor**:
|
||||
- Remove `ManagedDesktopChild` struct.
|
||||
- `DesktopRuntime` takes `ProcessRuntime` as constructor param.
|
||||
- `start_xvfb_locked()` and `start_openbox_locked()` call `process_runtime.start_process(ProcessStartSpec { owner: Desktop, restart_policy: Some(Always), ... })` instead of spawning directly.
|
||||
- Store returned process IDs in state instead of `Child` handles.
|
||||
- `stop` calls `process_runtime.stop_process()` / `kill_process()`.
|
||||
- `processes_locked()` queries process runtime for desktop-owned processes.
|
||||
- dbus-launch remains a direct one-shot spawn (it's not a long-running process, just produces env vars).
|
||||
|
||||
- `sdks/typescript/src/client.ts` — Add `owner` filter option to `listProcesses()`.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**Risks**:
|
||||
- Lock ordering: desktop runtime holds Mutex, process runtime uses RwLock. Release desktop Mutex before calling process runtime, or restructure.
|
||||
- `log_path` field in `DesktopProcessInfo` no longer applies (logs are in-memory now). Remove or deprecate.
|
||||
|
||||
**Test**: Integration: start desktop, `GET /v1/processes?owner=desktop` returns Xvfb+openbox. `GET /v1/processes?owner=user` excludes them. Desktop process logs are streamable via `GET /v1/processes/{id}/logs?follow=true`. Existing desktop lifecycle tests still pass.
|
||||
|
||||
---
|
||||
|
||||
## Task 6: Screen recording API (ffmpeg x11grab)
|
||||
|
||||
**What**: 6 endpoints for recording the desktop to MP4.
|
||||
|
||||
**Endpoints**:
|
||||
- `POST /v1/desktop/recording/start` — Start ffmpeg recording
|
||||
- `POST /v1/desktop/recording/stop` — Stop recording (SIGTERM → wait → SIGKILL)
|
||||
- `GET /v1/desktop/recordings` — List recordings
|
||||
- `GET /v1/desktop/recordings/{id}` — Get recording metadata
|
||||
- `GET /v1/desktop/recordings/{id}/download` — Serve MP4 file
|
||||
- `DELETE /v1/desktop/recordings/{id}` — Delete recording
|
||||
|
||||
**Files**:
|
||||
- **New**: `server/packages/sandbox-agent/src/desktop_recording.rs` — Recording state, ffmpeg process management. `start_recording()` spawns ffmpeg via process runtime (owner=Desktop): `ffmpeg -f x11grab -video_size WxH -i :99 -c:v libx264 -preset ultrafast -r 30 {path}`. Recordings stored in `{state_dir}/recordings/`.
|
||||
- `server/packages/sandbox-agent/src/desktop_types.rs` — Add recording request/response types.
|
||||
- `server/packages/sandbox-agent/src/desktop_runtime.rs` — Wire recording manager, expose through desktop runtime.
|
||||
- `server/packages/sandbox-agent/src/router.rs` — Add 6 routes + handlers.
|
||||
- `server/packages/sandbox-agent/src/desktop_install.rs` — Add `ffmpeg` to dependency detection (soft: only error when recording is requested).
|
||||
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add `ffmpeg` to apt-get.
|
||||
- `sdks/typescript/src/client.ts` — Add 6 recording methods.
|
||||
- `docs/openapi.json` — Regenerate.
|
||||
|
||||
**Depends on**: Task 5 (ffmpeg runs as desktop-owned process).
|
||||
|
||||
**Test**: Integration: start desktop → start recording → wait 2s → stop → list → download (verify MP4 magic bytes) → delete.
|
||||
|
||||
---
|
||||
|
||||
## Task 7: Neko WebRTC desktop streaming + React component
|
||||
|
||||
**What**: Integrate neko for WebRTC desktop streaming, mirroring the ProcessTerminal + Ghostty pattern.
|
||||
|
||||
### Server side
|
||||
|
||||
- **New**: `server/packages/sandbox-agent/src/desktop_streaming.rs` — Manages neko process via process runtime (owner=Desktop). Neko connects to existing Xvfb display, runs GStreamer pipeline for H.264 encoding.
|
||||
- `server/packages/sandbox-agent/src/router.rs`:
|
||||
- `GET /v1/desktop/stream/ws` — WebSocket proxy to neko's internal WebSocket. Upgrade request, bridge bidirectionally.
|
||||
- `POST /v1/desktop/stream/start` / `POST /v1/desktop/stream/stop` — Lifecycle control.
|
||||
- `docker/runtime/Dockerfile` and `docker/test-agent/Dockerfile` — Add neko binary + GStreamer packages (`gstreamer1.0-plugins-base`, `gstreamer1.0-plugins-good`, `gstreamer1.0-x`, `libgstreamer1.0-0`). Consider making this an optional Docker stage to avoid bloating the base image.
|
||||
|
||||
### TypeScript SDK
|
||||
|
||||
- **New**: `sdks/typescript/src/desktop-stream.ts` — `DesktopStreamSession` class ported from neko's `base.ts` (~500 lines):
|
||||
- WebSocket for signaling (SDP offer/answer, ICE candidates)
|
||||
- `RTCPeerConnection` for video stream
|
||||
- `RTCDataChannel` for binary input (mouse: 7 bytes, keyboard: 11 bytes)
|
||||
- Events: `onTrack(stream)`, `onConnect()`, `onDisconnect()`, `onError()`
|
||||
- `sdks/typescript/src/client.ts` — Add `connectDesktopStream()` returning `DesktopStreamSession`, `buildDesktopStreamWebSocketUrl()`, `startDesktopStream()`, `stopDesktopStream()`.
|
||||
- `sdks/typescript/src/index.ts` — Export `DesktopStreamSession`.

### React SDK

- **New**: `sdks/react/src/DesktopViewer.tsx` — Follows the `ProcessTerminal.tsx` pattern:

  ```
  Props: client (Pick<SandboxAgent, 'connectDesktopStream'>), height, className, style, onConnect, onDisconnect, onError
  ```

  - `useEffect` → `client.connectDesktopStream()` → wire `onTrack` to `<video>.srcObject`
  - Capture mouse events on the video element → scale coordinates to the desktop resolution → send via the DataChannel
  - Capture keyboard events → send via the DataChannel
  - Connection state indicator
  - Cleanup: close the RTCPeerConnection and the WebSocket
- `sdks/react/src/index.ts` — Export `DesktopViewer`.
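The coordinate-scaling step above is a simple ratio between the rendered element size and the remote desktop resolution. A minimal sketch (the helper name, rounding, and clamping choices are ours, not the SDK's):

```typescript
interface Size { width: number; height: number; }

// Map a pointer position on the rendered <video> element to remote desktop
// coordinates, clamping to the desktop bounds.
function scaleToDesktop(
  clientX: number, clientY: number, // position relative to the element
  element: Size,                    // rendered element size (CSS pixels)
  desktop: Size,                    // remote desktop resolution
): { x: number; y: number } {
  const x = Math.round((clientX / element.width) * desktop.width);
  const y = Math.round((clientY / element.height) * desktop.height);
  return {
    x: Math.min(Math.max(x, 0), desktop.width - 1),
    y: Math.min(Math.max(y, 0), desktop.height - 1),
  };
}
```

In the component, `element` would come from the video element's bounding rect and `desktop` from the stream's reported resolution.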

**Depends on**: Task 5 (neko runs as a desktop-owned process).

**Test**: Server integration: start the stream, connect the WebSocket, and verify that signaling messages flow. React: the component mounts and unmounts without errors. Full E2E requires a browser (manual initially).

---

## Verification

After all tasks:

1. `cargo test` — all Rust unit tests pass
2. `cargo test --test v1_api` — all integration tests pass (requires Docker)
3. Regenerate `docs/openapi.json` and verify it reflects all new endpoints
4. Build the TypeScript SDK: `cd sdks/typescript && pnpm build`
5. Build the React SDK: `cd sdks/react && pnpm build`
6. Manual: start the desktop, take a JPEG screenshot, list windows, record a 5-second video, and stream the desktop via the DesktopViewer component
.context/todos.md — 0 changes (new file)
CLAUDE.md — 56 changes
@@ -22,19 +22,6 @@
- `server/packages/sandbox-agent/src/cli.rs`
- Keep docs aligned to implemented endpoints/commands only (for example ACP under `/v1/acp`, not legacy `/v1/sessions` APIs).

## E2E Agent Testing

- When asked to test agents e2e and you do not have the API tokens/credentials required, always stop and ask the user where to find the tokens before proceeding.

## ACP Adapter Audit

- `scripts/audit-acp-deps/adapters.json` is the single source of truth for ACP adapter npm packages, pinned versions, and the `@agentclientprotocol/sdk` pin.
- The Rust fallback install path in `server/packages/agent-management/src/agents.rs` reads adapter entries from `adapters.json` at compile time via `include_str!`.
- Run `cd scripts/audit-acp-deps && npx tsx audit.ts` to compare our pinned versions against the ACP registry and npm latest.
- When bumping an adapter version, update `adapters.json` only — the Rust code picks it up automatically.
- When adding a new agent, add an entry to `adapters.json` (the `_` fallback arm in `install_agent_process_fallback` handles it).
- When updating the `@agentclientprotocol/sdk` pin, update both `adapters.json` (sdkDeps) and `sdks/acp-http-client/package.json`.

## Change Tracking

- If the user asks to "push" changes, treat that as permission to commit and push all current workspace changes, not a hand-picked subset, unless the user explicitly scopes the push.
@@ -43,41 +30,13 @@
- Regenerate `docs/openapi.json` when HTTP contracts change.
- Keep `docs/inspector.mdx` and `docs/sdks/typescript.mdx` aligned with implementation.
- Append blockers/decisions to `research/acp/friction.md` during ACP work.
- Each agent has its own doc page at `docs/agents/<name>.mdx` listing models, modes, and thought levels. Update the relevant page when changing `fallback_config_options`. To regenerate capability data, run `cd scripts/agent-configs && npx tsx dump.ts`. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- `docs/agent-capabilities.mdx` lists models/modes/thought levels per agent. Update it when adding a new agent or changing `fallback_config_options`. If its "Last updated" date is >2 weeks old, re-run `cd scripts/agent-configs && npx tsx dump.ts` and update the doc to match. Source data: `scripts/agent-configs/resources/*.json` and hardcoded entries in `server/packages/sandbox-agent/src/router/support.rs` (`fallback_config_options`).
- Some agent models are gated by subscription (e.g. Claude `opus`). The live report only shows models available to the current credentials. The static doc and JSON resource files should list all known models regardless of subscription tier.

## Adding Providers

## Docker Test Image

When adding a new sandbox provider, update all of the following:

- `sdks/typescript/src/providers/<name>.ts` — provider implementation
- `sdks/typescript/package.json` — add the `./<name>` export, peerDependencies, peerDependenciesMeta, devDependencies
- `sdks/typescript/tsup.config.ts` — add entry point and external
- `sdks/typescript/tests/providers.test.ts` — add test entry
- `examples/<name>/` — create an example with `src/index.ts` and `tests/<name>.test.ts`
- `docs/deploy/<name>.mdx` — create a deploy guide
- `docs/docs.json` — add to the Deploy pages navigation
- `docs/quickstart.mdx` — add a tab in the "Start the sandbox" step, and a credentials entry in the "Passing LLM credentials" accordion

## Adding Agents

When adding a new agent, update all of the following:

- `docs/agents/<name>.mdx` — create the agent page with a usage snippet and capabilities table
- `docs/docs.json` — add to the Agents group under Agent
- `docs/quickstart.mdx` — add a tab in the "Create a session and send a prompt" CodeGroup

## Persist Packages (Deprecated)

- The `@sandbox-agent/persist-*` npm packages (`persist-sqlite`, `persist-postgres`, `persist-indexeddb`, `persist-rivet`) are deprecated stubs. They still publish to npm but throw a deprecation error at import time.
- Driver implementations now live inline in examples and consuming packages:
  - SQLite: `examples/persist-sqlite/src/persist.ts`
  - Postgres: `examples/persist-postgres/src/persist.ts`
  - IndexedDB: `frontend/packages/inspector/src/persist-indexeddb.ts`
  - Rivet: inlined in `docs/multiplayer.mdx`
  - In-memory: built into the main `sandbox-agent` SDK (`InMemorySessionPersistDriver`)
- Docs (`docs/session-persistence.mdx`) link to the example implementations on GitHub instead of referencing the packages.
- Do not re-add `@sandbox-agent/persist-*` as dependencies anywhere. New persist drivers should be copied into the consuming project directly.
- Docker-backed Rust and TypeScript tests build `docker/test-agent/Dockerfile` directly in-process and cache the image tag only in memory (`OnceLock` in Rust, a module-level variable in TypeScript).
- Do not add cross-process image-build scripts unless there is a concrete need for them.

## Install Version References
@@ -93,27 +52,20 @@ When adding a new agent, update all of the following:
- `docs/sdk-overview.mdx`
- `docs/react-components.mdx`
- `docs/session-persistence.mdx`
- `docs/architecture.mdx`
- `docs/deploy/local.mdx`
- `docs/deploy/cloudflare.mdx`
- `docs/deploy/vercel.mdx`
- `docs/deploy/daytona.mdx`
- `docs/deploy/e2b.mdx`
- `docs/deploy/docker.mdx`
- `docs/deploy/boxlite.mdx`
- `docs/deploy/modal.mdx`
- `docs/deploy/computesdk.mdx`
- `frontend/packages/website/src/components/GetStarted.tsx`
- `.claude/commands/post-release-testing.md`
- `examples/cloudflare/Dockerfile`
- `examples/boxlite/Dockerfile`
- `examples/boxlite-python/Dockerfile`
- `examples/daytona/src/index.ts`
- `examples/shared/src/docker.ts`
- `examples/docker/src/index.ts`
- `examples/e2b/src/index.ts`
- `examples/vercel/src/index.ts`
- `sdks/typescript/src/providers/shared.ts`
- `scripts/release/main.ts`
- `scripts/release/promote-artifacts.ts`
- `scripts/release/sdk.ts`
@@ -149,7 +149,8 @@ FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
    git && \
    git \
    ffmpeg && \
    rm -rf /var/lib/apt/lists/*

# Copy the binary from builder
docker/test-agent/Dockerfile — 42 changes (new file)

@@ -0,0 +1,42 @@
FROM rust:1.88.0-bookworm AS builder
WORKDIR /build

COPY Cargo.toml Cargo.lock ./
COPY server/ ./server/
COPY gigacode/ ./gigacode/
COPY resources/agent-schemas/artifacts/ ./resources/agent-schemas/artifacts/
COPY scripts/agent-configs/ ./scripts/agent-configs/

ENV SANDBOX_AGENT_SKIP_INSPECTOR=1

RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
    cargo build -p sandbox-agent --release && \
    cp target/release/sandbox-agent /sandbox-agent

FROM node:22-bookworm-slim
RUN apt-get update -qq && \
    apt-get install -y -qq --no-install-recommends \
    ca-certificates \
    bash \
    libstdc++6 \
    xvfb \
    openbox \
    xdotool \
    imagemagick \
    ffmpeg \
    x11-xserver-utils \
    dbus-x11 \
    xauth \
    fonts-dejavu-core \
    xterm \
    > /dev/null 2>&1 && \
    rm -rf /var/lib/apt/lists/*

COPY --from=builder /sandbox-agent /usr/local/bin/sandbox-agent

EXPOSE 3000

ENTRYPOINT ["/usr/local/bin/sandbox-agent"]
CMD ["server", "--host", "0.0.0.0", "--port", "3000", "--no-token"]
docs/cli.mdx — 30 changes

@@ -37,6 +37,36 @@ Notes:
- Set `SANDBOX_AGENT_LOG_STDOUT=1` to force stdout/stderr logging.
- Use `SANDBOX_AGENT_LOG_DIR` to override the log directory.

## install

Install first-party runtime dependencies.

### install desktop

Install the Linux desktop runtime packages required by `/v1/desktop/*`.

```bash
sandbox-agent install desktop [OPTIONS]
```

| Option | Description |
|--------|-------------|
| `--yes` | Skip the confirmation prompt |
| `--print-only` | Print the package-manager command without executing it |
| `--package-manager <apt\|dnf\|apk>` | Override package-manager detection |
| `--no-fonts` | Skip the default DejaVu font package |

```bash
sandbox-agent install desktop --yes
sandbox-agent install desktop --print-only
```

Notes:

- Supported on Linux only.
- The command detects `apt`, `dnf`, or `apk`.
- If the host is not already running as root, the command requires `sudo`.
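The detection order can be sketched as a simple `command -v` probe — this is an illustrative reimplementation of the documented apt → dnf → apk order, not the CLI's actual code:

```shell
# Pick the first available package manager, mirroring the documented
# apt -> dnf -> apk detection order.
detect_pkg_manager() {
  if command -v apt-get >/dev/null 2>&1; then
    echo apt
  elif command -v dnf >/dev/null 2>&1; then
    echo dnf
  elif command -v apk >/dev/null 2>&1; then
    echo apk
  else
    echo unsupported
    return 1
  fi
}

# Print the detected manager (or "unsupported" on other distros).
detect_pkg_manager || true
```

`--package-manager` exists precisely for hosts where this kind of auto-detection picks the wrong tool.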

## install-agent

Install or reinstall a single agent, or every supported agent with `--all`.
@@ -15,43 +15,64 @@ Run the published full image with all supported agents pre-installed:
docker run --rm -p 3000:3000 \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  rivetdev/sandbox-agent:0.4.1-rc.1-full \
  rivetdev/sandbox-agent:0.3.1-full \
  server --no-token --host 0.0.0.0 --port 3000
```

The `0.4.1-rc.1-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image.
The `0.3.1-full` tag pins the exact version. The moving `full` tag is also published for contributors who want the latest full image.

## TypeScript with the Docker provider
If you also want the desktop API inside the container, install desktop dependencies before starting the server:

```bash
npm install sandbox-agent@0.3.x dockerode get-port
docker run --rm -p 3000:3000 \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  node:22-bookworm-slim sh -c "\
    apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y curl ca-certificates bash libstdc++6 && \
    rm -rf /var/lib/apt/lists/* && \
    curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh && \
    sandbox-agent install desktop --yes && \
    sandbox-agent server --no-token --host 0.0.0.0 --port 3000"
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { docker } from "sandbox-agent/docker";
In a Dockerfile:

const sdk = await SandboxAgent.start({
  sandbox: docker({
    env: [
      `ANTHROPIC_API_KEY=${process.env.ANTHROPIC_API_KEY}`,
      `OPENAI_API_KEY=${process.env.OPENAI_API_KEY}`,
    ].filter(Boolean),
  }),
```dockerfile
RUN sandbox-agent install desktop --yes
```

## TypeScript with dockerode

```typescript
import Docker from "dockerode";
import { SandboxAgent } from "sandbox-agent";

const docker = new Docker();
const PORT = 3000;

const container = await docker.createContainer({
  Image: "rivetdev/sandbox-agent:0.3.1-full",
  Cmd: ["server", "--no-token", "--host", "0.0.0.0", "--port", `${PORT}`],
  Env: [
    `ANTHROPIC_API_KEY=${process.env.ANTHROPIC_API_KEY}`,
    `OPENAI_API_KEY=${process.env.OPENAI_API_KEY}`,
    `CODEX_API_KEY=${process.env.CODEX_API_KEY}`,
  ].filter(Boolean),
  ExposedPorts: { [`${PORT}/tcp`]: {} },
  HostConfig: {
    AutoRemove: true,
    PortBindings: { [`${PORT}/tcp`]: [{ HostPort: `${PORT}` }] },
  },
});

try {
  const session = await sdk.createSession({ agent: "codex" });
  await session.prompt([{ type: "text", text: "Summarize this repository." }]);
} finally {
  await sdk.destroySandbox();
}
```
await container.start();

The `docker` provider uses the `rivetdev/sandbox-agent:0.4.1-rc.1-full` image by default. Override with `image`:
const baseUrl = `http://127.0.0.1:${PORT}`;
const sdk = await SandboxAgent.connect({ baseUrl });

```typescript
docker({ image: "my-custom-image:latest" })
const session = await sdk.createSession({ agent: "codex" });
await session.prompt([{ type: "text", text: "Summarize this repository." }]);
```
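`container.start()` returns before the server inside the container is accepting connections, so it helps to poll the health endpoint before `SandboxAgent.connect(...)`. A small generic retry helper — a sketch; the `/v1/health` path comes from these docs, but the helper itself is ours, not part of the SDK:

```typescript
// Poll an async probe until it succeeds or the deadline passes.
async function waitUntilReady(
  probe: () => Promise<boolean>,
  { timeoutMs = 10_000, intervalMs = 250 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      if (await probe()) return; // server answered: we're done
    } catch {
      // server not up yet; fall through to the retry sleep
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`server did not become ready within ${timeoutMs}ms`);
}

// Usage against the container started above:
// await waitUntilReady(async () => (await fetch(`${baseUrl}/v1/health`)).ok);
```

This avoids the race where the first `createSession` call lands while the binary is still booting.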

## Building a custom image with everything preinstalled
@@ -35,6 +35,7 @@ console.log(url);
- Prompt testing
- Request/response debugging
- Interactive permission prompts (approve, always-allow, or reject tool-use requests)
- Desktop panel for status, remediation, start/stop, and screenshot refresh
- Process management (create, stop, kill, delete, view logs)
- Interactive PTY terminal for tty processes
- One-shot command execution

@@ -50,3 +51,16 @@ console.log(url);
The Inspector includes an embedded Ghostty-based terminal for interactive tty
processes. The UI uses the SDK's high-level `connectProcessTerminal(...)`
wrapper via the shared `@sandbox-agent/react` `ProcessTerminal` component.

## Desktop panel

The `Desktop` panel shows the current desktop runtime state, missing dependencies,
the suggested install command, last error details, process/log paths, and the
latest captured screenshot.

Use it to:

- Check whether desktop dependencies are installed
- Start or stop the managed desktop runtime
- Refresh desktop status
- Capture a fresh screenshot on demand

docs/openapi.json — 1985 changes (diff suppressed because it is too large)
@@ -1,370 +1,289 @@
---
title: "Quickstart"
description: "Get a coding agent running in a sandbox in under a minute."
description: "Start the server and send your first message."
icon: "rocket"
---

<Steps>
<Step title="Install">
<Step title="Install skill (optional)">
<Tabs>
<Tab title="npm">
<Tab title="npx">
```bash
npx skills add rivet-dev/skills -s sandbox-agent
```
</Tab>
<Tab title="bunx">
```bash
bunx skills add rivet-dev/skills -s sandbox-agent
```
</Tab>
</Tabs>
</Step>

<Step title="Set environment variables">
Each coding agent requires API keys to connect to their respective LLM providers.

<Tabs>
<Tab title="Local shell">
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
```
</Tab>

<Tab title="E2B">
```typescript
import { Sandbox } from "@e2b/code-interpreter";

const envs: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envs.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envs.OPENAI_API_KEY = process.env.OPENAI_API_KEY;

const sandbox = await Sandbox.create({ envs });
```
</Tab>

<Tab title="Daytona">
```typescript
import { Daytona } from "@daytonaio/sdk";

const envVars: Record<string, string> = {};
if (process.env.ANTHROPIC_API_KEY) envVars.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY;
if (process.env.OPENAI_API_KEY) envVars.OPENAI_API_KEY = process.env.OPENAI_API_KEY;

const daytona = new Daytona();
const sandbox = await daytona.create({
  snapshot: "sandbox-agent-ready",
  envVars,
});
```
</Tab>

<Tab title="Docker">
```bash
docker run -p 2468:2468 \
  -e ANTHROPIC_API_KEY="sk-ant-..." \
  -e OPENAI_API_KEY="sk-..." \
  rivetdev/sandbox-agent:0.3.1-full \
  server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
</Tabs>

<AccordionGroup>
<Accordion title="Extracting API keys from current machine">
Use `sandbox-agent credentials extract-env --export` to extract your existing API keys (Anthropic, OpenAI, etc.) from local Claude Code or Codex config files.
</Accordion>
<Accordion title="Testing without API keys">
Use the `mock` agent for SDK and integration testing without provider credentials.
</Accordion>
<Accordion title="Multi-tenant and per-user billing">
For per-tenant token tracking, budget enforcement, or usage-based billing, see [LLM Credentials](/llm-credentials) for gateway options like OpenRouter, LiteLLM, and Portkey.
</Accordion>
</AccordionGroup>
</Step>
<Step title="Run the server">
<Tabs>
<Tab title="curl">
Install and run the binary directly.

```bash
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>

<Tab title="npx">
Run without installing globally.

```bash
npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>

<Tab title="bunx">
Run without installing globally.

```bash
bunx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>

<Tab title="npm i -g">
Install globally, then run.

```bash
npm install -g @sandbox-agent/cli@0.3.x
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>

<Tab title="bun add -g">
Install globally, then run.

```bash
bun add -g @sandbox-agent/cli@0.3.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm -g trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
```
</Tab>

<Tab title="Node.js (local)">
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.

```bash
npm install sandbox-agent@0.3.x
```

```typescript
import { SandboxAgent } from "sandbox-agent";

const sdk = await SandboxAgent.start();
```
</Tab>
<Tab title="bun">

<Tab title="Bun (local)">
For local development, use `SandboxAgent.start()` to spawn and manage the server as a subprocess.

```bash
bun add sandbox-agent@0.3.x
# Allow Bun to run postinstall scripts for native binaries (required for SandboxAgent.start()).
bun pm trust @sandbox-agent/cli-linux-x64 @sandbox-agent/cli-linux-arm64 @sandbox-agent/cli-darwin-arm64 @sandbox-agent/cli-darwin-x64 @sandbox-agent/cli-win32-x64
```
</Tab>
</Tabs>
</Step>
<Step title="Start the sandbox">
`SandboxAgent.start()` provisions a sandbox, starts a lightweight [Sandbox Agent server](/architecture) inside it, and connects your SDK client.

<Tabs>
<Tab title="Local">
```bash
npm install sandbox-agent@0.3.x
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { local } from "sandbox-agent/local";

// Runs on your machine. Inherits process.env automatically.
const client = await SandboxAgent.start({
  sandbox: local(),
});
const sdk = await SandboxAgent.start();
```

See [Local deploy guide](/deploy/local)
</Tab>

<Tab title="E2B">
<Tab title="Build from source">
If you're running from source instead of the installed CLI.

```bash
npm install sandbox-agent@0.3.x @e2b/code-interpreter
cargo run -p sandbox-agent -- server --no-token --host 0.0.0.0 --port 2468
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { e2b } from "sandbox-agent/e2b";

// Provisions a cloud sandbox on E2B, installs the server, and connects.
const client = await SandboxAgent.start({
  sandbox: e2b(),
});
```

See [E2B deploy guide](/deploy/e2b)
</Tab>

<Tab title="Daytona">
```bash
npm install sandbox-agent@0.3.x @daytonaio/sdk
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { daytona } from "sandbox-agent/daytona";

// Provisions a Daytona workspace with the server pre-installed.
const client = await SandboxAgent.start({
  sandbox: daytona(),
});
```

See [Daytona deploy guide](/deploy/daytona)
</Tab>

<Tab title="Vercel">
```bash
npm install sandbox-agent@0.3.x @vercel/sandbox
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { vercel } from "sandbox-agent/vercel";

// Provisions a Vercel sandbox with the server installed on boot.
const client = await SandboxAgent.start({
  sandbox: vercel(),
});
```

See [Vercel deploy guide](/deploy/vercel)
</Tab>

<Tab title="Modal">
```bash
npm install sandbox-agent@0.3.x modal
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { modal } from "sandbox-agent/modal";

// Builds a container image with agents pre-installed (cached after first run),
// starts a Modal sandbox from that image, and connects.
const client = await SandboxAgent.start({
  sandbox: modal(),
});
```

See [Modal deploy guide](/deploy/modal)
</Tab>

<Tab title="Cloudflare">
```bash
npm install sandbox-agent@0.3.x @cloudflare/sandbox
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { cloudflare } from "sandbox-agent/cloudflare";
import { SandboxClient } from "@cloudflare/sandbox";

// Uses the Cloudflare Sandbox SDK to provision and connect.
// The Cloudflare SDK handles server lifecycle internally.
const cfSandboxClient = new SandboxClient();
const client = await SandboxAgent.start({
  sandbox: cloudflare({ sdk: cfSandboxClient }),
});
```

See [Cloudflare deploy guide](/deploy/cloudflare)
</Tab>

<Tab title="Docker">
```bash
npm install sandbox-agent@0.3.x dockerode get-port
```

```typescript
import { SandboxAgent } from "sandbox-agent";
import { docker } from "sandbox-agent/docker";

// Runs a Docker container locally. Good for testing.
const client = await SandboxAgent.start({
  sandbox: docker(),
});
```

See [Docker deploy guide](/deploy/docker)
</Tab>
</Tabs>

<div style={{ height: "1rem" }} />

**More info:**
Binding to `0.0.0.0` allows the server to accept connections from any network interface, which is required when running inside a sandbox where clients connect remotely.
<AccordionGroup>
<Accordion title="Passing LLM credentials">
Agents need API keys for their LLM provider. Each provider passes credentials differently:
<Accordion title="Configuring token">
Tokens are usually not required. Most sandbox providers (E2B, Daytona, etc.) already secure networking at the infrastructure layer.

```typescript
// Local — inherits process.env automatically
If you expose the server publicly, use `--token "$SANDBOX_TOKEN"` to require authentication:

// E2B
e2b({ create: { envs: { ANTHROPIC_API_KEY: "..." } } })

// Daytona
daytona({ create: { envVars: { ANTHROPIC_API_KEY: "..." } } })

// Vercel
vercel({ create: { env: { ANTHROPIC_API_KEY: "..." } } })

// Modal
modal({ create: { secrets: { ANTHROPIC_API_KEY: "..." } } })

// Docker
docker({ env: ["ANTHROPIC_API_KEY=..."] })
```bash
sandbox-agent server --token "$SANDBOX_TOKEN" --host 0.0.0.0 --port 2468
```

For multi-tenant billing, per-user keys, and gateway options, see [LLM Credentials](/llm-credentials).
</Accordion>
Then pass the token when connecting:

<Accordion title="Implementing a custom provider">
Implement the `SandboxProvider` interface to use any sandbox platform:

```typescript
import { SandboxAgent, type SandboxProvider } from "sandbox-agent";

const myProvider: SandboxProvider = {
  name: "my-provider",
  async create() {
    // Provision a sandbox, install & start the server, return an ID
    return "sandbox-123";
  },
  async destroy(sandboxId) {
    // Tear down the sandbox
  },
  async getUrl(sandboxId) {
    // Return the Sandbox Agent server URL
    return `https://${sandboxId}.my-platform.dev:3000`;
  },
};

const client = await SandboxAgent.start({
  sandbox: myProvider,
});
```
</Accordion>

<Accordion title="Connecting to an existing server">
If you already have a Sandbox Agent server running, connect directly:

```typescript
const client = await SandboxAgent.connect({
  baseUrl: "http://127.0.0.1:2468",
});
```
</Accordion>

<Accordion title="Starting the server manually">
<Tabs>
<Tab title="TypeScript">
```typescript
import { SandboxAgent } from "sandbox-agent";

const sdk = await SandboxAgent.connect({
  baseUrl: "http://your-server:2468",
  token: process.env.SANDBOX_TOKEN,
});
```
</Tab>

<Tab title="curl">
```bash
curl -fsSL https://releases.rivet.dev/sandbox-agent/0.3.x/install.sh | sh
sandbox-agent server --no-token --host 0.0.0.0 --port 2468
curl "http://your-server:2468/v1/health" \
  -H "Authorization: Bearer $SANDBOX_TOKEN"
```
</Tab>
<Tab title="npx">

<Tab title="CLI">
```bash
npx @sandbox-agent/cli@0.3.x server --no-token --host 0.0.0.0 --port 2468
```
</Tab>
<Tab title="Docker">
```bash
docker run -p 2468:2468 \
  -e ANTHROPIC_API_KEY="sk-ant-..." \
  -e OPENAI_API_KEY="sk-..." \
  rivetdev/sandbox-agent:0.4.1-rc.1-full \
  server --no-token --host 0.0.0.0 --port 2468
sandbox-agent --token "$SANDBOX_TOKEN" api agents list \
  --endpoint http://your-server:2468
```
</Tab>
</Tabs>
</Accordion>
<Accordion title="CORS">
If you're calling the server from a browser, see the [CORS configuration guide](/cors).
</Accordion>
</AccordionGroup>
</Step>
<Step title="Create a session and send a prompt">
|
||||
<CodeGroup>
|
||||
<Step title="Install agents (optional)">
|
||||
To preinstall agents:
|
||||
|
||||
```typescript Claude
|
||||
const session = await client.createSession({
|
||||
agent: "claude",
|
||||
});
|
||||
|
||||
session.onEvent((event) => {
|
||||
console.log(event.sender, event.payload);
|
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

```typescript Codex
const session = await client.createSession({
  agent: "codex",
});

session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

```typescript OpenCode
const session = await client.createSession({
  agent: "opencode",
});

session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

```typescript Cursor
const session = await client.createSession({
  agent: "cursor",
});

session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

```typescript Amp
const session = await client.createSession({
  agent: "amp",
});

session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

```typescript Pi
const session = await client.createSession({
  agent: "pi",
});

session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```

</CodeGroup>

See [Agent Sessions](/agent-sessions) for the full sessions API.
</Step>

<Step title="Clean up">
```typescript
await client.destroySandbox(); // provider-defined cleanup and disconnect
```

Use `client.dispose()` instead to disconnect without changing sandbox state. On E2B, `client.pauseSandbox()` pauses the sandbox and `client.killSandbox()` deletes it permanently.

To preinstall all agents ahead of time, run:

```bash
sandbox-agent install-agent --all
```

If agents are not installed up front, they are lazily installed when creating a session.
</Step>

<Step title="Inspect with the UI">
Open the Inspector at `/ui/` on your server (e.g. `http://localhost:2468/ui/`) to view sessions and events in a GUI.
</Step>

<Step title="Install desktop dependencies (optional, Linux only)">
If you want to use `/v1/desktop/*`, install the desktop runtime packages first:

```bash
sandbox-agent install desktop --yes
```

Then use `GET /v1/desktop/status` or `sdk.getDesktopStatus()` to verify the runtime is ready before calling desktop screenshot or input APIs.
</Step>

<Step title="Create a session">
```typescript
import { SandboxAgent } from "sandbox-agent";

const sdk = await SandboxAgent.connect({
  baseUrl: "http://127.0.0.1:2468",
});

const session = await sdk.createSession({
  agent: "claude",
  sessionInit: {
    cwd: "/",
    mcpServers: [],
  },
});

console.log(session.id);
```
</Step>

<Step title="Send a message">
```typescript
const result = await session.prompt([
  { type: "text", text: "Summarize the repository and suggest next steps." },
]);

console.log(result.stopReason);
```
</Step>

<Step title="Read events">
```typescript
const off = session.onEvent((event) => {
  console.log(event.sender, event.payload);
});

const page = await sdk.getEvents({
  sessionId: session.id,
  limit: 50,
});

console.log(page.items.length);
off();
```
</Step>

<Step title="Test with Inspector">
Open the Inspector UI at `/ui/` on your server (for example, `http://localhost:2468/ui/`) to inspect sessions and events in a GUI.

<Frame>
  <img src="/images/inspector.png" alt="Sandbox Agent Inspector" />
</Frame>

@@ -372,44 +291,16 @@ icon: "rocket"
</Step>
</Steps>

## Full example

```typescript
import { SandboxAgent } from "sandbox-agent";
import { e2b } from "sandbox-agent/e2b";

const client = await SandboxAgent.start({
  sandbox: e2b({
    create: {
      envs: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY },
    },
  }),
});

try {
  const session = await client.createSession({ agent: "claude" });

  session.onEvent((event) => {
    console.log(`[${event.sender}]`, JSON.stringify(event.payload));
  });

  const result = await session.prompt([
    { type: "text", text: "Write a function that checks if a number is prime." },
  ]);

  console.log("Done:", result.stopReason);
} finally {
  await client.destroySandbox();
}
```

## Next steps

<CardGroup cols={3}>
  <Card title="Session Persistence" icon="database" href="/session-persistence">
    Configure in-memory, Rivet Actor state, IndexedDB, SQLite, and Postgres persistence.
  </Card>
  <Card title="Deploy to a Sandbox" icon="box" href="/deploy/local">
    Deploy your agent to E2B, Daytona, Docker, Vercel, or Cloudflare.
  </Card>
  <Card title="SDK Overview" icon="compass" href="/sdk-overview">
    Full TypeScript SDK API surface.
  </Card>
</CardGroup>

@@ -196,6 +196,44 @@ const writeResult = await sdk.writeFsFile({ path: "./hello.txt" }, "hello");
console.log(health.status, agents.agents.length, entries.length, writeResult.path);
```

## Desktop API

The SDK also wraps the desktop host/runtime HTTP API.

Install desktop dependencies first on Linux hosts:

```bash
sandbox-agent install desktop --yes
```

Then query status, surface remediation if needed, and start the runtime:

```ts
const status = await sdk.getDesktopStatus();

if (status.state === "install_required") {
  console.log(status.installCommand);
}

const started = await sdk.startDesktop({
  width: 1440,
  height: 900,
  dpi: 96,
});

const screenshot = await sdk.takeDesktopScreenshot();
const displayInfo = await sdk.getDesktopDisplayInfo();

await sdk.moveDesktopMouse({ x: 400, y: 300 });
await sdk.clickDesktop({ x: 400, y: 300, button: "left", clickCount: 1 });
await sdk.typeDesktopText({ text: "hello world", delayMs: 10 });
await sdk.pressDesktopKey({ key: "ctrl+l" });

await sdk.stopDesktop();
```

Screenshot helpers return `Uint8Array` PNG bytes. The SDK does not attempt to install OS packages remotely; callers should surface `missingDependencies` and `installCommand` from `getDesktopStatus()`.
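Since the screenshot helpers hand back raw PNG bytes, persisting a capture is a one-liner in Node. A minimal sketch — `saveDesktopScreenshot` is a hypothetical helper, not part of the SDK, and it accepts any client object exposing `takeDesktopScreenshot()`:

```typescript
import { writeFile } from "node:fs/promises";

// Hypothetical helper: persist a captured desktop screenshot to disk.
// Assumes a Node.js runtime and a client whose takeDesktopScreenshot()
// resolves to Uint8Array PNG bytes (as described above).
async function saveDesktopScreenshot(
  client: { takeDesktopScreenshot(): Promise<Uint8Array> },
  path: string,
): Promise<number> {
  const bytes = await client.takeDesktopScreenshot();
  await writeFile(path, bytes); // PNG bytes can be written as-is
  return bytes.byteLength; // size written, handy for logging
}
```

The returned byte count makes it easy to log or sanity-check that a non-empty frame was captured.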

## Error handling

```ts
@@ -2889,6 +2889,94 @@
  gap: 20px;
}

.desktop-panel {
  display: flex;
  flex-direction: column;
  gap: 16px;
}

.desktop-state-grid {
  display: grid;
  grid-template-columns: repeat(3, minmax(0, 1fr));
  gap: 12px;
  margin-bottom: 12px;
}

.desktop-start-controls {
  display: grid;
  grid-template-columns: repeat(3, minmax(0, 1fr));
  gap: 10px;
}

.desktop-input-group {
  display: flex;
  flex-direction: column;
  gap: 4px;
}

.desktop-chip-list {
  display: flex;
  flex-wrap: wrap;
  gap: 8px;
}

.desktop-command {
  margin-top: 6px;
  padding: 8px 10px;
  border-radius: var(--radius);
  border: 1px solid var(--border);
  background: var(--surface);
  overflow-x: auto;
}

.desktop-diagnostic-block + .desktop-diagnostic-block {
  margin-top: 14px;
}

.desktop-process-list {
  display: flex;
  flex-direction: column;
  gap: 10px;
  margin-top: 8px;
}

.desktop-process-item {
  padding: 10px;
  border-radius: var(--radius);
  border: 1px solid var(--border);
  background: var(--surface);
  display: flex;
  flex-direction: column;
  gap: 4px;
}

.desktop-screenshot-empty {
  padding: 18px;
  border: 1px dashed var(--border);
  border-radius: var(--radius);
  color: var(--muted);
  background: var(--surface);
  text-align: center;
}

.desktop-screenshot-frame {
  border-radius: calc(var(--radius) + 2px);
  overflow: hidden;
  border: 1px solid var(--border);
  background:
    linear-gradient(135deg, rgba(15, 23, 42, 0.9), rgba(30, 41, 59, 0.92)),
    radial-gradient(circle at top right, rgba(56, 189, 248, 0.12), transparent 40%);
  padding: 10px;
}

.desktop-screenshot-image {
  display: block;
  width: 100%;
  height: auto;
  border-radius: var(--radius);
  background: rgba(0, 0, 0, 0.24);
}

.processes-section {
  display: flex;
  flex-direction: column;

@@ -3551,6 +3639,11 @@
  grid-template-columns: 1fr;
}

.desktop-state-grid,
.desktop-start-controls {
  grid-template-columns: 1fr;
}

.session-sidebar {
  display: none;
}
@@ -18,6 +18,7 @@
"@types/react-dom": "^19.1.6",
"@vitejs/plugin-react": "^4.3.1",
"fake-indexeddb": "^6.2.4",
"jsdom": "^26.1.0",
"typescript": "^5.7.3",
"vite": "^5.4.7",
"vitest": "^3.0.0"
@@ -1,4 +1,4 @@
import { ChevronLeft, ChevronRight, Cloud, Monitor, Play, PlayCircle, Server, Terminal, Wrench } from "lucide-react";
import type { AgentInfo, SandboxAgent, SessionEvent } from "sandbox-agent";

type AgentModeInfo = { id: string; name: string; description: string };

@@ -9,9 +9,10 @@ import ProcessesTab from "./ProcessesTab";
import ProcessRunTab from "./ProcessRunTab";
import SkillsTab from "./SkillsTab";
import RequestLogTab from "./RequestLogTab";
import DesktopTab from "./DesktopTab";
import type { RequestLog } from "../../types/requestLog";

export type DebugTab = "log" | "events" | "agents" | "desktop" | "mcp" | "skills" | "processes" | "run-process";

const DebugPanel = ({
  debugTab,

@@ -75,6 +76,10 @@ const DebugPanel = ({
  <Cloud className="button-icon" style={{ marginRight: 4, width: 12, height: 12 }} />
  Agents
</button>
<button className={`debug-tab ${debugTab === "desktop" ? "active" : ""}`} onClick={() => onDebugTabChange("desktop")}>
  <Monitor className="button-icon" style={{ marginRight: 4, width: 12, height: 12 }} />
  Desktop
</button>
<button className={`debug-tab ${debugTab === "mcp" ? "active" : ""}`} onClick={() => onDebugTabChange("mcp")}>
  <Server className="button-icon" style={{ marginRight: 4, width: 12, height: 12 }} />
  MCP

@@ -112,6 +117,8 @@ const DebugPanel = ({
  />
)}

{debugTab === "desktop" && <DesktopTab getClient={getClient} />}

{debugTab === "mcp" && <McpTab getClient={getClient} />}

{debugTab === "processes" && <ProcessesTab getClient={getClient} />}
@@ -0,0 +1,142 @@
// @vitest-environment jsdom

import { act } from "react";
import { createRoot, type Root } from "react-dom/client";
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import { SandboxAgent } from "sandbox-agent";
import {
  createDockerTestLayout,
  disposeDockerTestLayout,
  startDockerSandboxAgent,
  type DockerSandboxAgentHandle,
} from "../../../../../../sdks/typescript/tests/helpers/docker.ts";
import DesktopTab from "./DesktopTab";

type DockerTestLayout = ReturnType<typeof createDockerTestLayout>;

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function waitFor<T>(fn: () => T | undefined | null, timeoutMs = 20_000, stepMs = 50): Promise<T> {
  const started = Date.now();
  while (Date.now() - started < timeoutMs) {
    const value = fn();
    if (value !== undefined && value !== null) {
      return value;
    }
    await sleep(stepMs);
  }
  throw new Error("timed out waiting for condition");
}

function findButton(container: HTMLElement, label: string): HTMLButtonElement | undefined {
  return Array.from(container.querySelectorAll("button")).find((button) => button.textContent?.includes(label)) as HTMLButtonElement | undefined;
}

describe.sequential("DesktopTab", () => {
  let container: HTMLDivElement;
  let root: Root;
  let layout: DockerTestLayout | undefined;
  let handle: DockerSandboxAgentHandle | undefined;
  let client: SandboxAgent | undefined;

  beforeEach(() => {
    (globalThis as { IS_REACT_ACT_ENVIRONMENT?: boolean }).IS_REACT_ACT_ENVIRONMENT = true;
    container = document.createElement("div");
    document.body.appendChild(container);
    root = createRoot(container);
  });

  afterEach(async () => {
    await act(async () => {
      root.unmount();
    });
    if (client) {
      await client.stopDesktop().catch(() => {});
      await client.dispose().catch(() => {});
    }
    if (handle) {
      await handle.dispose();
    }
    if (layout) {
      disposeDockerTestLayout(layout);
    }
    container.remove();
    delete (globalThis as { IS_REACT_ACT_ENVIRONMENT?: boolean }).IS_REACT_ACT_ENVIRONMENT;
    client = undefined;
    handle = undefined;
    layout = undefined;
  });

  async function connectDesktopClient(options?: { pathMode?: "merge" | "replace" }): Promise<SandboxAgent> {
    layout = createDockerTestLayout();
    handle = await startDockerSandboxAgent(layout, {
      timeoutMs: 30_000,
      pathMode: options?.pathMode,
      env: options?.pathMode === "replace" ? { PATH: layout.rootDir } : undefined,
    });
    client = await SandboxAgent.connect({
      baseUrl: handle.baseUrl,
      token: handle.token,
    });
    return client;
  }

  it("renders install remediation when desktop deps are missing", async () => {
    const connectedClient = await connectDesktopClient({ pathMode: "replace" });

    await act(async () => {
      root.render(<DesktopTab getClient={() => connectedClient} />);
    });

    await waitFor(() => {
      const text = container.textContent ?? "";
      return text.includes("install_required") ? text : undefined;
    });

    expect(container.textContent).toContain("install_required");
    expect(container.textContent).toContain("sandbox-agent install desktop --yes");
    expect(container.textContent).toContain("Xvfb");
  });

  it("starts desktop, refreshes screenshot, and stops desktop", async () => {
    const connectedClient = await connectDesktopClient();

    await act(async () => {
      root.render(<DesktopTab getClient={() => connectedClient} />);
    });

    await waitFor(() => {
      const text = container.textContent ?? "";
      return text.includes("inactive") ? true : undefined;
    });

    const startButton = await waitFor(() => findButton(container, "Start Desktop"));
    await act(async () => {
      startButton.dispatchEvent(new MouseEvent("click", { bubbles: true }));
    });

    await waitFor(() => {
      const screenshot = container.querySelector("img[alt='Desktop screenshot']") as HTMLImageElement | null;
      return screenshot?.src ? screenshot : undefined;
    });

    const screenshot = container.querySelector("img[alt='Desktop screenshot']") as HTMLImageElement | null;
    expect(screenshot).toBeTruthy();
    expect(screenshot?.src.startsWith("blob:") || screenshot?.src.startsWith("data:image/png")).toBe(true);
    expect(container.textContent).toContain("active");

    const stopButton = await waitFor(() => findButton(container, "Stop Desktop"));
    await act(async () => {
      stopButton.dispatchEvent(new MouseEvent("click", { bubbles: true }));
    });

    await waitFor(() => {
      const text = container.textContent ?? "";
      return text.includes("inactive") ? true : undefined;
    });

    expect(container.textContent).toContain("inactive");
  });
});
340 frontend/packages/inspector/src/components/debug/DesktopTab.tsx (new file)

@@ -0,0 +1,340 @@
import { Loader2, Monitor, Play, RefreshCw, Square, Camera } from "lucide-react";
import { useCallback, useEffect, useMemo, useState } from "react";
import { SandboxAgentError } from "sandbox-agent";
import type { DesktopStatusResponse, SandboxAgent } from "sandbox-agent";

const MIN_SPIN_MS = 350;

const extractErrorMessage = (error: unknown, fallback: string): string => {
  if (error instanceof SandboxAgentError && error.problem?.detail) return error.problem.detail;
  if (error instanceof Error) return error.message;
  return fallback;
};

const formatStartedAt = (value: string | null | undefined): string => {
  if (!value) {
    return "Not started";
  }
  const parsed = new Date(value);
  return Number.isNaN(parsed.getTime()) ? value : parsed.toLocaleString();
};

const createScreenshotUrl = async (bytes: Uint8Array): Promise<string> => {
  const payload = new Uint8Array(bytes.byteLength);
  payload.set(bytes);
  const blob = new Blob([payload.buffer], { type: "image/png" });

  if (typeof URL.createObjectURL === "function") {
    return URL.createObjectURL(blob);
  }

  return await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error ?? new Error("Unable to read screenshot blob."));
    reader.onload = () => {
      if (typeof reader.result === "string") {
        resolve(reader.result);
      } else {
        reject(new Error("Unable to read screenshot blob."));
      }
    };
    reader.readAsDataURL(blob);
  });
};

const DesktopTab = ({ getClient }: { getClient: () => SandboxAgent }) => {
  const [status, setStatus] = useState<DesktopStatusResponse | null>(null);
  const [loading, setLoading] = useState(false);
  const [refreshing, setRefreshing] = useState(false);
  const [acting, setActing] = useState<"start" | "stop" | null>(null);
  const [error, setError] = useState<string | null>(null);

  const [width, setWidth] = useState("1440");
  const [height, setHeight] = useState("900");
  const [dpi, setDpi] = useState("96");

  const [screenshotUrl, setScreenshotUrl] = useState<string | null>(null);
  const [screenshotLoading, setScreenshotLoading] = useState(false);
  const [screenshotError, setScreenshotError] = useState<string | null>(null);

  const revokeScreenshotUrl = useCallback(() => {
    setScreenshotUrl((current) => {
      if (current?.startsWith("blob:") && typeof URL.revokeObjectURL === "function") {
        URL.revokeObjectURL(current);
      }
      return null;
    });
  }, []);

  const loadStatus = useCallback(
    async (mode: "initial" | "refresh" = "initial") => {
      if (mode === "initial") {
        setLoading(true);
      } else {
        setRefreshing(true);
      }
      setError(null);
      try {
        const next = await getClient().getDesktopStatus();
        setStatus(next);
        return next;
      } catch (loadError) {
        setError(extractErrorMessage(loadError, "Unable to load desktop status."));
        return null;
      } finally {
        setLoading(false);
        setRefreshing(false);
      }
    },
    [getClient],
  );

  const refreshScreenshot = useCallback(async () => {
    setScreenshotLoading(true);
    setScreenshotError(null);
    try {
      const bytes = await getClient().takeDesktopScreenshot();
      revokeScreenshotUrl();
      setScreenshotUrl(await createScreenshotUrl(bytes));
    } catch (captureError) {
      revokeScreenshotUrl();
      setScreenshotError(extractErrorMessage(captureError, "Unable to capture desktop screenshot."));
    } finally {
      setScreenshotLoading(false);
    }
  }, [getClient, revokeScreenshotUrl]);

  useEffect(() => {
    void loadStatus();
  }, [loadStatus]);

  useEffect(() => {
    if (status?.state === "active") {
      void refreshScreenshot();
    } else {
      revokeScreenshotUrl();
    }
  }, [refreshScreenshot, revokeScreenshotUrl, status?.state]);

  useEffect(() => {
    return () => {
      revokeScreenshotUrl();
    };
  }, [revokeScreenshotUrl]);

  const handleStart = async () => {
    const parsedWidth = Number.parseInt(width, 10);
    const parsedHeight = Number.parseInt(height, 10);
    const parsedDpi = Number.parseInt(dpi, 10);
    setActing("start");
    setError(null);
    const startedAt = Date.now();
    try {
      const next = await getClient().startDesktop({
        width: Number.isFinite(parsedWidth) ? parsedWidth : undefined,
        height: Number.isFinite(parsedHeight) ? parsedHeight : undefined,
        dpi: Number.isFinite(parsedDpi) ? parsedDpi : undefined,
      });
      setStatus(next);
      if (next.state === "active") {
        await refreshScreenshot();
      }
    } catch (startError) {
      setError(extractErrorMessage(startError, "Unable to start desktop runtime."));
      await loadStatus("refresh");
    } finally {
      const elapsedMs = Date.now() - startedAt;
      if (elapsedMs < MIN_SPIN_MS) {
        await new Promise((resolve) => window.setTimeout(resolve, MIN_SPIN_MS - elapsedMs));
      }
      setActing(null);
    }
  };

  const handleStop = async () => {
    setActing("stop");
    setError(null);
    const startedAt = Date.now();
    try {
      const next = await getClient().stopDesktop();
      setStatus(next);
      revokeScreenshotUrl();
    } catch (stopError) {
      setError(extractErrorMessage(stopError, "Unable to stop desktop runtime."));
      await loadStatus("refresh");
    } finally {
      const elapsedMs = Date.now() - startedAt;
      if (elapsedMs < MIN_SPIN_MS) {
        await new Promise((resolve) => window.setTimeout(resolve, MIN_SPIN_MS - elapsedMs));
      }
      setActing(null);
    }
  };

  const canRefreshScreenshot = status?.state === "active";
  const resolutionLabel = useMemo(() => {
    const resolution = status?.resolution;
    if (!resolution) return "Unknown";
    const dpiLabel = resolution.dpi ? ` @ ${resolution.dpi} DPI` : "";
    return `${resolution.width} x ${resolution.height}${dpiLabel}`;
  }, [status?.resolution]);

  return (
    <div className="desktop-panel">
      <div className="inline-row" style={{ marginBottom: 16 }}>
        <button className="button secondary small" onClick={() => void loadStatus("refresh")} disabled={loading || refreshing}>
          <RefreshCw className={`button-icon ${loading || refreshing ? "spinner-icon" : ""}`} />
          Refresh Status
        </button>
        <button className="button secondary small" onClick={() => void refreshScreenshot()} disabled={!canRefreshScreenshot || screenshotLoading}>
          {screenshotLoading ? <Loader2 className="button-icon spinner-icon" /> : <Camera className="button-icon" />}
          Refresh Screenshot
        </button>
      </div>

      {error && <div className="banner error">{error}</div>}
      {screenshotError && <div className="banner error">{screenshotError}</div>}

      <div className="card">
        <div className="card-header">
          <span className="card-title">
            <Monitor size={14} style={{ marginRight: 6 }} />
            Desktop Runtime
          </span>
          <span
            className={`pill ${
              status?.state === "active" ? "success" : status?.state === "install_required" ? "warning" : status?.state === "failed" ? "danger" : ""
            }`}
          >
            {status?.state ?? "unknown"}
          </span>
        </div>

        <div className="desktop-state-grid">
          <div>
            <div className="card-meta">Display</div>
            <div className="mono">{status?.display ?? "Not assigned"}</div>
          </div>
          <div>
            <div className="card-meta">Resolution</div>
            <div className="mono">{resolutionLabel}</div>
          </div>
          <div>
            <div className="card-meta">Started</div>
            <div>{formatStartedAt(status?.startedAt)}</div>
          </div>
        </div>

        <div className="desktop-start-controls">
          <div className="desktop-input-group">
            <label className="label">Width</label>
            <input className="setup-input mono" value={width} onChange={(event) => setWidth(event.target.value)} inputMode="numeric" />
          </div>
          <div className="desktop-input-group">
            <label className="label">Height</label>
            <input className="setup-input mono" value={height} onChange={(event) => setHeight(event.target.value)} inputMode="numeric" />
          </div>
          <div className="desktop-input-group">
            <label className="label">DPI</label>
            <input className="setup-input mono" value={dpi} onChange={(event) => setDpi(event.target.value)} inputMode="numeric" />
          </div>
        </div>

        <div className="card-actions">
          <button className="button success small" onClick={() => void handleStart()} disabled={acting === "start"}>
            {acting === "start" ? <Loader2 className="button-icon spinner-icon" /> : <Play className="button-icon" />}
            Start Desktop
          </button>
          <button className="button danger small" onClick={() => void handleStop()} disabled={acting === "stop"}>
            {acting === "stop" ? <Loader2 className="button-icon spinner-icon" /> : <Square className="button-icon" />}
            Stop Desktop
          </button>
        </div>
      </div>

      {status?.missingDependencies && status.missingDependencies.length > 0 && (
        <div className="card">
          <div className="card-header">
            <span className="card-title">Missing Dependencies</span>
          </div>
          <div className="desktop-chip-list">
            {status.missingDependencies.map((dependency) => (
              <span key={dependency} className="pill warning">
                {dependency}
              </span>
            ))}
          </div>
          {status.installCommand && (
            <>
              <div className="card-meta" style={{ marginTop: 12 }}>
                Install command
              </div>
              <div className="mono desktop-command">{status.installCommand}</div>
            </>
          )}
        </div>
      )}

      {(status?.lastError || status?.runtimeLogPath || (status?.processes?.length ?? 0) > 0) && (
        <div className="card">
          <div className="card-header">
            <span className="card-title">Diagnostics</span>
          </div>
          {status?.lastError && (
            <div className="desktop-diagnostic-block">
              <div className="card-meta">Last error</div>
              <div className="mono">{status.lastError.code}</div>
              <div>{status.lastError.message}</div>
            </div>
          )}
          {status?.runtimeLogPath && (
            <div className="desktop-diagnostic-block">
              <div className="card-meta">Runtime log</div>
              <div className="mono">{status.runtimeLogPath}</div>
            </div>
          )}
          {status?.processes && status.processes.length > 0 && (
            <div className="desktop-diagnostic-block">
              <div className="card-meta">Processes</div>
              <div className="desktop-process-list">
                {status.processes.map((process) => (
                  <div key={`${process.name}-${process.pid ?? "none"}`} className="desktop-process-item">
                    <div>
                      <strong>{process.name}</strong>
                      <span className={`pill ${process.running ? "success" : "danger"}`} style={{ marginLeft: 8 }}>
                        {process.running ? "running" : "stopped"}
                      </span>
                    </div>
                    <div className="mono">{process.pid ? `pid ${process.pid}` : "no pid"}</div>
                    {process.logPath && <div className="mono">{process.logPath}</div>}
                  </div>
                ))}
              </div>
            </div>
          )}
        </div>
      )}

      <div className="card">
        <div className="card-header">
          <span className="card-title">Latest Screenshot</span>
          {status?.state === "active" ? <span className="card-meta">Manual refresh only</span> : null}
        </div>

        {loading ? <div className="card-meta">Loading...</div> : null}
        {!loading && !screenshotUrl && (
          <div className="desktop-screenshot-empty">
            {status?.state === "active" ? "No screenshot loaded yet." : "Start the desktop runtime to capture a screenshot."}
          </div>
        )}
        {screenshotUrl && (
          <div className="desktop-screenshot-frame">
            <img src={screenshotUrl} alt="Desktop screenshot" className="desktop-screenshot-image" />
          </div>
        )}
      </div>
    </div>
  );
};

export default DesktopTab;
699 pnpm-lock.yaml (generated) — file diff suppressed because it is too large
@@ -277,3 +277,13 @@ Update this file continuously during the migration.
- Owner: Unassigned.
- Status: resolved
- Links: `sdks/acp-http-client/src/index.ts`, `sdks/acp-http-client/tests/smoke.test.ts`, `sdks/typescript/tests/integration.test.ts`

- Date: 2026-03-07
- Area: Desktop host/runtime API boundary
- Issue: Desktop automation needed screenshot/input/file-transfer-like host capabilities, but routing it through ACP would have mixed agent protocol semantics with host-owned runtime control and binary payloads.
- Impact: A desktop feature built as ACP methods would blur the division between agent/session behavior and Sandbox Agent host/runtime APIs, and would complicate binary screenshot transport.
- Proposed direction: Ship desktop as first-party HTTP endpoints under `/v1/desktop/*`, keep health/install/remediation in the server runtime, and expose the feature through the SDK and inspector without ACP extension methods.
- Decision: Accepted and implemented for phase one.
- Owner: Unassigned.
- Status: resolved
- Links: `server/packages/sandbox-agent/src/router.rs`, `server/packages/sandbox-agent/src/desktop_runtime.rs`, `sdks/typescript/src/client.ts`, `frontend/packages/inspector/src/components/debug/DesktopTab.tsx`
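Because the decision keeps desktop control as plain host HTTP endpoints rather than ACP methods, any HTTP client can reach them without an agent protocol stack. A rough sketch of that boundary, assuming the `/v1/desktop/status` route described above and treating the bearer-token auth, base URL, and response shape as placeholders:

```typescript
// Sketch only: call the host-owned desktop status endpoint directly over HTTP.
// `baseUrl` and `token` are placeholders; the exact auth scheme and response
// shape are defined by the server, not by this snippet.
async function fetchDesktopStatus(baseUrl: string, token: string): Promise<unknown> {
  const response = await fetch(`${baseUrl}/v1/desktop/status`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!response.ok) {
    throw new Error(`desktop status request failed: ${response.status}`);
  }
  return response.json(); // JSON status document (state, missingDependencies, ...)
}
```

This is exactly the kind of call the ACP-based alternative would have complicated, since binary screenshots and runtime control would have had to ride inside agent protocol messages.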
257 sdks/react/src/DesktopViewer.tsx (new file)

@@ -0,0 +1,257 @@
"use client";

import type { CSSProperties, MouseEvent, WheelEvent } from "react";
import { useEffect, useRef, useState } from "react";
import type { DesktopMouseButton, DesktopStreamErrorStatus, DesktopStreamReadyStatus, SandboxAgent } from "sandbox-agent";

type ConnectionState = "connecting" | "ready" | "closed" | "error";

export type DesktopViewerClient = Pick<SandboxAgent, "startDesktopStream" | "stopDesktopStream" | "connectDesktopStream">;

export interface DesktopViewerProps {
  client: DesktopViewerClient;
  className?: string;
  style?: CSSProperties;
  imageStyle?: CSSProperties;
  height?: number | string;
  onConnect?: (status: DesktopStreamReadyStatus) => void;
  onDisconnect?: () => void;
  onError?: (error: DesktopStreamErrorStatus | Error) => void;
}

const shellStyle: CSSProperties = {
  display: "flex",
  flexDirection: "column",
  overflow: "hidden",
  border: "1px solid rgba(15, 23, 42, 0.14)",
  borderRadius: 14,
  background: "linear-gradient(180deg, rgba(248, 250, 252, 0.96) 0%, rgba(226, 232, 240, 0.92) 100%)",
  boxShadow: "0 20px 40px rgba(15, 23, 42, 0.08)",
};

const statusBarStyle: CSSProperties = {
  display: "flex",
  alignItems: "center",
  justifyContent: "space-between",
  gap: 12,
  padding: "10px 14px",
  borderBottom: "1px solid rgba(15, 23, 42, 0.08)",
  background: "rgba(255, 255, 255, 0.78)",
  color: "#0f172a",
  fontSize: 12,
  lineHeight: 1.4,
};

const viewportStyle: CSSProperties = {
  position: "relative",
  display: "flex",
  alignItems: "center",
  justifyContent: "center",
  overflow: "hidden",
  background: "radial-gradient(circle at top, rgba(14, 165, 233, 0.18), transparent 45%), linear-gradient(180deg, #0f172a 0%, #111827 100%)",
};

const imageBaseStyle: CSSProperties = {
  display: "block",
  width: "100%",
  height: "100%",
  objectFit: "contain",
  userSelect: "none",
};

const hintStyle: CSSProperties = {
  opacity: 0.66,
};

const getStatusColor = (state: ConnectionState): string => {
  switch (state) {
    case "ready":
      return "#15803d";
    case "error":
      return "#b91c1c";
    case "closed":
      return "#b45309";
    default:
      return "#475569";
  }
};

export const DesktopViewer = ({ client, className, style, imageStyle, height = 480, onConnect, onDisconnect, onError }: DesktopViewerProps) => {
  const wrapperRef = useRef<HTMLDivElement | null>(null);
  const sessionRef = useRef<ReturnType<DesktopViewerClient["connectDesktopStream"]> | null>(null);
|
||||
const [connectionState, setConnectionState] = useState<ConnectionState>("connecting");
|
||||
const [statusMessage, setStatusMessage] = useState("Starting desktop stream...");
|
||||
const [frameUrl, setFrameUrl] = useState<string | null>(null);
|
||||
const [resolution, setResolution] = useState<{ width: number; height: number } | null>(null);
|
||||
|
||||
useEffect(() => {
|
||||
let cancelled = false;
|
||||
let lastObjectUrl: string | null = null;
|
||||
let session: ReturnType<DesktopViewerClient["connectDesktopStream"]> | null = null;
|
||||
|
||||
setConnectionState("connecting");
|
||||
setStatusMessage("Starting desktop stream...");
|
||||
setResolution(null);
|
||||
|
||||
const connect = async () => {
|
||||
try {
|
||||
await client.startDesktopStream();
|
||||
if (cancelled) {
|
||||
return;
|
||||
}
|
||||
|
||||
session = client.connectDesktopStream();
|
||||
sessionRef.current = session;
|
||||
session.onReady((status) => {
|
||||
if (cancelled) {
|
||||
return;
|
||||
}
|
||||
setConnectionState("ready");
|
||||
setStatusMessage("Desktop stream connected.");
|
||||
setResolution({ width: status.width, height: status.height });
|
||||
onConnect?.(status);
|
||||
});
|
||||
        session.onFrame((frame) => {
          if (cancelled) {
            return;
          }
          const nextUrl = URL.createObjectURL(new Blob([frame.slice().buffer], { type: "image/jpeg" }));
          setFrameUrl((current) => {
            if (current) {
              URL.revokeObjectURL(current);
            }
            return nextUrl;
          });
          // Track only the newest URL for the effect cleanup; the previous
          // frame's URL is already revoked inside the updater above, so
          // revoking lastObjectUrl here as well would revoke it twice.
          lastObjectUrl = nextUrl;
        });
        session.onError((error) => {
          if (cancelled) {
            return;
          }
          setConnectionState("error");
          // Both DesktopStreamErrorStatus and Error carry a message, so no
          // instanceof branch is needed here.
          setStatusMessage(error.message);
          onError?.(error);
        });
        session.onClose(() => {
          if (cancelled) {
            return;
          }
          setConnectionState((current) => (current === "error" ? current : "closed"));
          setStatusMessage((current) => (current === "Desktop stream connected." ? "Desktop stream disconnected." : current));
          onDisconnect?.();
        });
      } catch (error) {
        if (cancelled) {
          return;
        }
        const nextError = error instanceof Error ? error : new Error("Failed to initialize desktop stream.");
        setConnectionState("error");
        setStatusMessage(nextError.message);
        onError?.(nextError);
      }
    };

    void connect();

    return () => {
      cancelled = true;
      session?.close();
      sessionRef.current = null;
      void client.stopDesktopStream().catch(() => undefined);
      // State updates are no-ops after unmount, so revoke the newest object
      // URL via the tracked variable instead of inside a setFrameUrl updater
      // (which would also double-revoke the URL the updater already released).
      setFrameUrl(null);
      if (lastObjectUrl) {
        URL.revokeObjectURL(lastObjectUrl);
        lastObjectUrl = null;
      }
    };
  }, [client, onConnect, onDisconnect, onError]);

  const scalePoint = (clientX: number, clientY: number) => {
    const wrapper = wrapperRef.current;
    if (!wrapper || !resolution) {
      return null;
    }
    const rect = wrapper.getBoundingClientRect();
    if (rect.width === 0 || rect.height === 0) {
      return null;
    }
    const x = Math.max(0, Math.min(resolution.width, ((clientX - rect.left) / rect.width) * resolution.width));
    const y = Math.max(0, Math.min(resolution.height, ((clientY - rect.top) / rect.height) * resolution.height));
    return {
      x: Math.round(x),
      y: Math.round(y),
    };
  };

  const buttonFromMouseEvent = (event: MouseEvent<HTMLDivElement>): DesktopMouseButton => {
    switch (event.button) {
      case 1:
        return "middle";
      case 2:
        return "right";
      default:
        return "left";
    }
  };

  const withSession = (callback: (session: NonNullable<ReturnType<DesktopViewerClient["connectDesktopStream"]>>) => void) => {
    const session = sessionRef.current;
    if (session) {
      callback(session);
    }
  };

  return (
    <div className={className} style={{ ...shellStyle, ...style }}>
      <div style={statusBarStyle}>
        <span style={{ color: getStatusColor(connectionState) }}>{statusMessage}</span>
        <span style={hintStyle}>{resolution ? `${resolution.width}×${resolution.height}` : "Awaiting frames"}</span>
      </div>
      <div
        ref={wrapperRef}
        role="button"
        tabIndex={0}
        style={{ ...viewportStyle, height }}
        onMouseMove={(event) => {
          const point = scalePoint(event.clientX, event.clientY);
          if (!point) {
            return;
          }
          withSession((session) => session.moveMouse(point.x, point.y));
        }}
        onMouseDown={(event) => {
          event.preventDefault();
          const point = scalePoint(event.clientX, event.clientY);
          withSession((session) => session.mouseDown(buttonFromMouseEvent(event), point?.x, point?.y));
        }}
        onMouseUp={(event) => {
          const point = scalePoint(event.clientX, event.clientY);
          withSession((session) => session.mouseUp(buttonFromMouseEvent(event), point?.x, point?.y));
        }}
        onWheel={(event: WheelEvent<HTMLDivElement>) => {
          event.preventDefault();
          const point = scalePoint(event.clientX, event.clientY);
          if (!point) {
            return;
          }
          withSession((session) => session.scroll(point.x, point.y, Math.round(event.deltaX), Math.round(event.deltaY)));
        }}
        onKeyDown={(event) => {
          withSession((session) => session.keyDown(event.key));
        }}
        onKeyUp={(event) => {
          withSession((session) => session.keyUp(event.key));
        }}
      >
        {frameUrl ? <img alt="Desktop stream" draggable={false} src={frameUrl} style={{ ...imageBaseStyle, ...imageStyle }} /> : null}
      </div>
    </div>
  );
};
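The viewer's `scalePoint` maps wrapper-relative pointer coordinates into the remote desktop's resolution, clamping to the display bounds. The same arithmetic can be exercised standalone; the function below is a pure reimplementation for illustration, not the component's code:

```typescript
// Pure sketch of the pointer-to-desktop coordinate mapping used by the
// viewer: normalize against the wrapper rect, scale into the remote
// resolution, clamp to the display bounds, and round to integer pixels.
interface Rect { left: number; top: number; width: number; height: number; }
interface Resolution { width: number; height: number; }

function scalePoint(clientX: number, clientY: number, rect: Rect, res: Resolution) {
  if (rect.width === 0 || rect.height === 0) {
    return null; // degenerate layout: nothing sensible to map to
  }
  const x = Math.max(0, Math.min(res.width, ((clientX - rect.left) / rect.width) * res.width));
  const y = Math.max(0, Math.min(res.height, ((clientY - rect.top) / rect.height) * res.height));
  return { x: Math.round(x), y: Math.round(y) };
}

// A pointer at the center of a 400x300 wrapper maps to the center of a 1280x720 desktop.
console.log(scalePoint(200, 150, { left: 0, top: 0, width: 400, height: 300 }, { width: 1280, height: 720 }));
// → { x: 640, y: 360 }
```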
@@ -1,6 +1,7 @@
export { AgentConversation } from "./AgentConversation.tsx";
export { AgentTranscript } from "./AgentTranscript.tsx";
export { ChatComposer } from "./ChatComposer.tsx";
export { DesktopViewer } from "./DesktopViewer.tsx";
export { ProcessTerminal } from "./ProcessTerminal.tsx";
export { useTranscriptVirtualizer } from "./useTranscriptVirtualizer.ts";

@@ -23,6 +24,11 @@ export type {
  ChatComposerProps,
} from "./ChatComposer.tsx";

export type {
  DesktopViewerClient,
  DesktopViewerProps,
} from "./DesktopViewer.tsx";

export type {
  ProcessTerminalClient,
  ProcessTerminalProps,
@@ -23,12 +23,35 @@ import {
  type SetSessionModeRequest,
} from "acp-http-client";
import type { SandboxProvider } from "./providers/types.ts";
import { DesktopStreamSession, type DesktopStreamConnectOptions } from "./desktop-stream.ts";
import {
  type AcpServerListResponse,
  type AgentInfo,
  type AgentInstallRequest,
  type AgentInstallResponse,
  type AgentListResponse,
  type DesktopActionResponse,
  type DesktopDisplayInfoResponse,
  type DesktopKeyboardDownRequest,
  type DesktopKeyboardPressRequest,
  type DesktopKeyboardTypeRequest,
  type DesktopMouseClickRequest,
  type DesktopMouseDownRequest,
  type DesktopMouseDragRequest,
  type DesktopMouseMoveRequest,
  type DesktopMousePositionResponse,
  type DesktopMouseScrollRequest,
  type DesktopMouseUpRequest,
  type DesktopKeyboardUpRequest,
  type DesktopRecordingInfo,
  type DesktopRecordingListResponse,
  type DesktopRecordingStartRequest,
  type DesktopRegionScreenshotQuery,
  type DesktopScreenshotQuery,
  type DesktopStartRequest,
  type DesktopStatusResponse,
  type DesktopStreamStatusResponse,
  type DesktopWindowListResponse,
  type FsActionResponse,
  type FsDeleteQuery,
  type FsEntriesQuery,

@@ -53,7 +76,9 @@ import {
  type ProcessInfo,
  type ProcessInputRequest,
  type ProcessInputResponse,
  type ProcessListQuery,
  type ProcessListResponse,
  type ProcessOwner,
  type ProcessLogEntry,
  type ProcessLogsQuery,
  type ProcessLogsResponse,

@@ -201,6 +226,7 @@ export interface ProcessTerminalConnectOptions extends ProcessTerminalWebSocketU
}

export type ProcessTerminalSessionOptions = ProcessTerminalConnectOptions;
export type DesktopStreamSessionOptions = DesktopStreamConnectOptions;

export class SandboxAgentError extends Error {
  readonly status: number;
@@ -1533,6 +1559,148 @@ export class SandboxAgent {
    return this.requestHealth();
  }

  async startDesktop(request: DesktopStartRequest = {}): Promise<DesktopStatusResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/start`, {
      body: request,
    });
  }

  async stopDesktop(): Promise<DesktopStatusResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/stop`);
  }

  async getDesktopStatus(): Promise<DesktopStatusResponse> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/status`);
  }

  async getDesktopDisplayInfo(): Promise<DesktopDisplayInfoResponse> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/display/info`);
  }

  async takeDesktopScreenshot(query: DesktopScreenshotQuery = {}): Promise<Uint8Array> {
    const response = await this.requestRaw("GET", `${API_PREFIX}/desktop/screenshot`, {
      query,
      accept: "image/*",
    });
    const buffer = await response.arrayBuffer();
    return new Uint8Array(buffer);
  }

  async takeDesktopRegionScreenshot(query: DesktopRegionScreenshotQuery): Promise<Uint8Array> {
    const response = await this.requestRaw("GET", `${API_PREFIX}/desktop/screenshot/region`, {
      query,
      accept: "image/*",
    });
    const buffer = await response.arrayBuffer();
    return new Uint8Array(buffer);
  }

  async getDesktopMousePosition(): Promise<DesktopMousePositionResponse> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/mouse/position`);
  }

  async moveDesktopMouse(request: DesktopMouseMoveRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/move`, {
      body: request,
    });
  }

  async clickDesktop(request: DesktopMouseClickRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/click`, {
      body: request,
    });
  }

  async mouseDownDesktop(request: DesktopMouseDownRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/down`, {
      body: request,
    });
  }

  async mouseUpDesktop(request: DesktopMouseUpRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/up`, {
      body: request,
    });
  }

  async dragDesktopMouse(request: DesktopMouseDragRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/drag`, {
      body: request,
    });
  }

  async scrollDesktop(request: DesktopMouseScrollRequest): Promise<DesktopMousePositionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/mouse/scroll`, {
      body: request,
    });
  }

  async typeDesktopText(request: DesktopKeyboardTypeRequest): Promise<DesktopActionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/keyboard/type`, {
      body: request,
    });
  }

  async pressDesktopKey(request: DesktopKeyboardPressRequest): Promise<DesktopActionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/keyboard/press`, {
      body: request,
    });
  }

  async keyDownDesktop(request: DesktopKeyboardDownRequest): Promise<DesktopActionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/keyboard/down`, {
      body: request,
    });
  }

  async keyUpDesktop(request: DesktopKeyboardUpRequest): Promise<DesktopActionResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/keyboard/up`, {
      body: request,
    });
  }

  async listDesktopWindows(): Promise<DesktopWindowListResponse> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/windows`);
  }

  async startDesktopRecording(request: DesktopRecordingStartRequest = {}): Promise<DesktopRecordingInfo> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/recording/start`, {
      body: request,
    });
  }

  async stopDesktopRecording(): Promise<DesktopRecordingInfo> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/recording/stop`);
  }

  async listDesktopRecordings(): Promise<DesktopRecordingListResponse> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/recordings`);
  }

  async getDesktopRecording(id: string): Promise<DesktopRecordingInfo> {
    return this.requestJson("GET", `${API_PREFIX}/desktop/recordings/${encodeURIComponent(id)}`);
  }

  async downloadDesktopRecording(id: string): Promise<Uint8Array> {
    const response = await this.requestRaw("GET", `${API_PREFIX}/desktop/recordings/${encodeURIComponent(id)}/download`, {
      accept: "video/mp4",
    });
    const buffer = await response.arrayBuffer();
    return new Uint8Array(buffer);
  }

  async deleteDesktopRecording(id: string): Promise<void> {
    await this.requestRaw("DELETE", `${API_PREFIX}/desktop/recordings/${encodeURIComponent(id)}`);
  }

  async startDesktopStream(): Promise<DesktopStreamStatusResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/stream/start`);
  }

  async stopDesktopStream(): Promise<DesktopStreamStatusResponse> {
    return this.requestJson("POST", `${API_PREFIX}/desktop/stream/stop`);
  }

  async listAgents(options?: AgentQueryOptions): Promise<AgentListResponse> {
    return this.requestJson("GET", `${API_PREFIX}/agents`, {
      query: toAgentQuery(options),
@@ -1665,8 +1833,10 @@ export class SandboxAgent {
    });
  }

-  async listProcesses(): Promise<ProcessListResponse> {
-    return this.requestJson("GET", `${API_PREFIX}/processes`);
+  async listProcesses(query?: ProcessListQuery): Promise<ProcessListResponse> {
+    return this.requestJson("GET", `${API_PREFIX}/processes`, {
+      query,
+    });
  }

  async getProcess(id: string): Promise<ProcessInfo> {
@@ -1754,6 +1924,32 @@ export class SandboxAgent {
    return new ProcessTerminalSession(this.connectProcessTerminalWebSocket(id, options));
  }

  buildDesktopStreamWebSocketUrl(options: ProcessTerminalWebSocketUrlOptions = {}): string {
    return toWebSocketUrl(
      this.buildUrl(`${API_PREFIX}/desktop/stream/ws`, {
        access_token: options.accessToken ?? this.token,
      }),
    );
  }

  connectDesktopStreamWebSocket(options: DesktopStreamConnectOptions = {}): WebSocket {
    const WebSocketCtor = options.WebSocket ?? globalThis.WebSocket;
    if (!WebSocketCtor) {
      throw new Error("WebSocket API is not available; provide a WebSocket implementation.");
    }

    return new WebSocketCtor(
      this.buildDesktopStreamWebSocketUrl({
        accessToken: options.accessToken,
      }),
      options.protocols,
    );
  }

  connectDesktopStream(options: DesktopStreamSessionOptions = {}): DesktopStreamSession {
    return new DesktopStreamSession(this.connectDesktopStreamWebSocket(options));
  }

  private async getLiveConnection(agent: string): Promise<LiveAcpConnection> {
    await this.awaitHealthy();
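`buildDesktopStreamWebSocketUrl` derives a WebSocket URL from the HTTP base and appends the access token as a query parameter. A self-contained sketch of that conversion (the function below is an assumption for illustration; the SDK's actual `toWebSocketUrl` and `buildUrl` helpers are not shown in this diff):

```typescript
// Illustrative sketch: build the desktop-stream WebSocket URL from an HTTP
// base URL, mirroring the shape of buildDesktopStreamWebSocketUrl above.
function buildDesktopStreamWsUrl(baseUrl: string, accessToken?: string): string {
  const url = new URL("/v1/desktop/stream/ws", baseUrl);
  if (accessToken) {
    url.searchParams.set("access_token", accessToken);
  }
  // http → ws, https → wss (both are "special" schemes, so this swap is valid).
  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
  return url.toString();
}

console.log(buildDesktopStreamWsUrl("http://localhost:3000", "tok"));
// → ws://localhost:3000/v1/desktop/stream/ws?access_token=tok
```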
sdks/typescript/src/desktop-stream.ts (new file, 236 lines)

@@ -0,0 +1,236 @@
import type { DesktopMouseButton } from "./types.ts";

const WS_READY_STATE_CONNECTING = 0;
const WS_READY_STATE_OPEN = 1;
const WS_READY_STATE_CLOSED = 3;

export interface DesktopStreamReadyStatus {
  type: "ready";
  width: number;
  height: number;
}

export interface DesktopStreamErrorStatus {
  type: "error";
  message: string;
}

export type DesktopStreamStatusMessage = DesktopStreamReadyStatus | DesktopStreamErrorStatus;

export interface DesktopStreamConnectOptions {
  accessToken?: string;
  WebSocket?: typeof WebSocket;
  protocols?: string | string[];
}

type DesktopStreamClientFrame =
  | {
      type: "moveMouse";
      x: number;
      y: number;
    }
  | {
      type: "mouseDown" | "mouseUp";
      x?: number;
      y?: number;
      button?: DesktopMouseButton;
    }
  | {
      type: "scroll";
      x: number;
      y: number;
      deltaX?: number;
      deltaY?: number;
    }
  | {
      type: "keyDown" | "keyUp";
      key: string;
    }
  | {
      type: "close";
    };

export class DesktopStreamSession {
  readonly socket: WebSocket;
  readonly closed: Promise<void>;

  private readonly readyListeners = new Set<(status: DesktopStreamReadyStatus) => void>();
  private readonly frameListeners = new Set<(frame: Uint8Array) => void>();
  private readonly errorListeners = new Set<(error: DesktopStreamErrorStatus | Error) => void>();
  private readonly closeListeners = new Set<() => void>();

  private closeSignalSent = false;
  private closedResolve!: () => void;

  constructor(socket: WebSocket) {
    this.socket = socket;
    this.socket.binaryType = "arraybuffer";
    this.closed = new Promise<void>((resolve) => {
      this.closedResolve = resolve;
    });

    this.socket.addEventListener("message", (event) => {
      void this.handleMessage(event.data);
    });
    this.socket.addEventListener("error", () => {
      this.emitError(new Error("Desktop stream websocket connection failed."));
    });
    this.socket.addEventListener("close", () => {
      this.closedResolve();
      for (const listener of this.closeListeners) {
        listener();
      }
    });
  }

  onReady(listener: (status: DesktopStreamReadyStatus) => void): () => void {
    this.readyListeners.add(listener);
    return () => {
      this.readyListeners.delete(listener);
    };
  }

  onFrame(listener: (frame: Uint8Array) => void): () => void {
    this.frameListeners.add(listener);
    return () => {
      this.frameListeners.delete(listener);
    };
  }

  onError(listener: (error: DesktopStreamErrorStatus | Error) => void): () => void {
    this.errorListeners.add(listener);
    return () => {
      this.errorListeners.delete(listener);
    };
  }

  onClose(listener: () => void): () => void {
    this.closeListeners.add(listener);
    return () => {
      this.closeListeners.delete(listener);
    };
  }

  moveMouse(x: number, y: number): void {
    this.sendFrame({ type: "moveMouse", x, y });
  }

  mouseDown(button?: DesktopMouseButton, x?: number, y?: number): void {
    this.sendFrame({ type: "mouseDown", button, x, y });
  }

  mouseUp(button?: DesktopMouseButton, x?: number, y?: number): void {
    this.sendFrame({ type: "mouseUp", button, x, y });
  }

  scroll(x: number, y: number, deltaX?: number, deltaY?: number): void {
    this.sendFrame({ type: "scroll", x, y, deltaX, deltaY });
  }

  keyDown(key: string): void {
    this.sendFrame({ type: "keyDown", key });
  }

  keyUp(key: string): void {
    this.sendFrame({ type: "keyUp", key });
  }

  close(): void {
    if (this.socket.readyState === WS_READY_STATE_CONNECTING) {
      this.socket.addEventListener(
        "open",
        () => {
          this.close();
        },
        { once: true },
      );
      return;
    }

    if (this.socket.readyState === WS_READY_STATE_OPEN) {
      if (!this.closeSignalSent) {
        this.closeSignalSent = true;
        this.sendFrame({ type: "close" });
      }
      this.socket.close();
      return;
    }

    if (this.socket.readyState !== WS_READY_STATE_CLOSED) {
      this.socket.close();
    }
  }

  private async handleMessage(data: unknown): Promise<void> {
    try {
      if (typeof data === "string") {
        const frame = parseStatusFrame(data);
        if (!frame) {
          this.emitError(new Error("Received invalid desktop stream control frame."));
          return;
        }

        if (frame.type === "ready") {
          for (const listener of this.readyListeners) {
            listener(frame);
          }
          return;
        }

        this.emitError(frame);
        return;
      }

      const bytes = await decodeBinaryFrame(data);
      for (const listener of this.frameListeners) {
        listener(bytes);
      }
    } catch (error) {
      this.emitError(error instanceof Error ? error : new Error(String(error)));
    }
  }

  private sendFrame(frame: DesktopStreamClientFrame): void {
    if (this.socket.readyState !== WS_READY_STATE_OPEN) {
      return;
    }
    this.socket.send(JSON.stringify(frame));
  }

  private emitError(error: DesktopStreamErrorStatus | Error): void {
    for (const listener of this.errorListeners) {
      listener(error);
    }
  }
}

function parseStatusFrame(payload: string): DesktopStreamStatusMessage | null {
  const value = JSON.parse(payload) as Record<string, unknown>;
  if (value.type === "ready" && typeof value.width === "number" && typeof value.height === "number") {
    return {
      type: "ready",
      width: value.width,
      height: value.height,
    };
  }
  if (value.type === "error" && typeof value.message === "string") {
    return {
      type: "error",
      message: value.message,
    };
  }
  return null;
}

async function decodeBinaryFrame(data: unknown): Promise<Uint8Array> {
  if (data instanceof ArrayBuffer) {
    return new Uint8Array(data);
  }
  if (ArrayBuffer.isView(data)) {
    return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  }
  if (typeof Blob !== "undefined" && data instanceof Blob) {
    return new Uint8Array(await data.arrayBuffer());
  }
  throw new Error("Unsupported desktop stream binary frame type.");
}
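The stream protocol above is two-channel: text frames carry JSON control messages (`ready` with the display size, or `error` with a message), while binary frames carry encoded image data. The text-frame contract can be exercised standalone; the parser below reimplements that contract for illustration and is not the module's export:

```typescript
// Illustrative reimplementation of the desktop-stream text-frame contract:
// "ready" frames carry the display size, "error" frames carry a message,
// and anything else is rejected as an invalid control frame.
type StatusFrame =
  | { type: "ready"; width: number; height: number }
  | { type: "error"; message: string };

function parseStatusFrame(payload: string): StatusFrame | null {
  const value = JSON.parse(payload) as Record<string, unknown>;
  if (value.type === "ready" && typeof value.width === "number" && typeof value.height === "number") {
    return { type: "ready", width: value.width, height: value.height };
  }
  if (value.type === "error" && typeof value.message === "string") {
    return { type: "error", message: value.message };
  }
  return null;
}

console.log(parseStatusFrame('{"type":"ready","width":1280,"height":720}'));
// → { type: 'ready', width: 1280, height: 720 }
console.log(parseStatusFrame('{"type":"resize"}'));
// → null
```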
File diff suppressed because it is too large
@@ -14,10 +14,18 @@ export {
export { AcpRpcError } from "acp-http-client";

export { buildInspectorUrl } from "./inspector.ts";
export { DesktopStreamSession } from "./desktop-stream.ts";
export type {
  DesktopStreamConnectOptions,
  DesktopStreamErrorStatus,
  DesktopStreamReadyStatus,
  DesktopStreamStatusMessage,
} from "./desktop-stream.ts";

export type {
  SandboxAgentHealthWaitOptions,
  AgentQueryOptions,
  DesktopStreamSessionOptions,
  ProcessLogFollowQuery,
  ProcessLogListener,
  ProcessLogSubscription,

@@ -50,6 +58,37 @@ export type {
  AgentInstallRequest,
  AgentInstallResponse,
  AgentListResponse,
  DesktopActionResponse,
  DesktopDisplayInfoResponse,
  DesktopErrorInfo,
  DesktopKeyboardDownRequest,
  DesktopKeyboardUpRequest,
  DesktopKeyModifiers,
  DesktopKeyboardPressRequest,
  DesktopKeyboardTypeRequest,
  DesktopMouseButton,
  DesktopMouseClickRequest,
  DesktopMouseDownRequest,
  DesktopMouseDragRequest,
  DesktopMouseMoveRequest,
  DesktopMousePositionResponse,
  DesktopMouseScrollRequest,
  DesktopMouseUpRequest,
  DesktopProcessInfo,
  DesktopRecordingInfo,
  DesktopRecordingListResponse,
  DesktopRecordingStartRequest,
  DesktopRecordingStatus,
  DesktopRegionScreenshotQuery,
  DesktopResolution,
  DesktopScreenshotFormat,
  DesktopScreenshotQuery,
  DesktopStartRequest,
  DesktopState,
  DesktopStatusResponse,
  DesktopStreamStatusResponse,
  DesktopWindowInfo,
  DesktopWindowListResponse,
  FsActionResponse,
  FsDeleteQuery,
  FsEntriesQuery,

@@ -74,10 +113,12 @@ export type {
  ProcessInfo,
  ProcessInputRequest,
  ProcessInputResponse,
  ProcessListQuery,
  ProcessListResponse,
  ProcessLogEntry,
  ProcessLogsQuery,
  ProcessLogsResponse,
  ProcessOwner,
  ProcessLogsStream,
  ProcessRunRequest,
  ProcessRunResponse,
@@ -4,6 +4,38 @@ import type { components, operations } from "./generated/openapi.ts";
export type ProblemDetails = components["schemas"]["ProblemDetails"];

export type HealthResponse = JsonResponse<operations["get_v1_health"], 200>;
export type DesktopState = components["schemas"]["DesktopState"];
export type DesktopResolution = components["schemas"]["DesktopResolution"];
export type DesktopErrorInfo = components["schemas"]["DesktopErrorInfo"];
export type DesktopProcessInfo = components["schemas"]["DesktopProcessInfo"];
export type DesktopStatusResponse = JsonResponse<operations["get_v1_desktop_status"], 200>;
export type DesktopStartRequest = JsonRequestBody<operations["post_v1_desktop_start"]>;
export type DesktopScreenshotFormat = components["schemas"]["DesktopScreenshotFormat"];
export type DesktopScreenshotQuery =
  QueryParams<operations["get_v1_desktop_screenshot"]> extends never ? Record<string, never> : QueryParams<operations["get_v1_desktop_screenshot"]>;
export type DesktopRegionScreenshotQuery = QueryParams<operations["get_v1_desktop_screenshot_region"]>;
export type DesktopMousePositionResponse = JsonResponse<operations["get_v1_desktop_mouse_position"], 200>;
export type DesktopMouseButton = components["schemas"]["DesktopMouseButton"];
export type DesktopMouseMoveRequest = JsonRequestBody<operations["post_v1_desktop_mouse_move"]>;
export type DesktopMouseClickRequest = JsonRequestBody<operations["post_v1_desktop_mouse_click"]>;
export type DesktopMouseDownRequest = JsonRequestBody<operations["post_v1_desktop_mouse_down"]>;
export type DesktopMouseUpRequest = JsonRequestBody<operations["post_v1_desktop_mouse_up"]>;
export type DesktopMouseDragRequest = JsonRequestBody<operations["post_v1_desktop_mouse_drag"]>;
export type DesktopMouseScrollRequest = JsonRequestBody<operations["post_v1_desktop_mouse_scroll"]>;
export type DesktopKeyboardTypeRequest = JsonRequestBody<operations["post_v1_desktop_keyboard_type"]>;
export type DesktopKeyModifiers = components["schemas"]["DesktopKeyModifiers"];
export type DesktopKeyboardPressRequest = JsonRequestBody<operations["post_v1_desktop_keyboard_press"]>;
export type DesktopKeyboardDownRequest = JsonRequestBody<operations["post_v1_desktop_keyboard_down"]>;
export type DesktopKeyboardUpRequest = JsonRequestBody<operations["post_v1_desktop_keyboard_up"]>;
export type DesktopActionResponse = JsonResponse<operations["post_v1_desktop_keyboard_type"], 200>;
export type DesktopDisplayInfoResponse = JsonResponse<operations["get_v1_desktop_display_info"], 200>;
export type DesktopWindowInfo = components["schemas"]["DesktopWindowInfo"];
export type DesktopWindowListResponse = JsonResponse<operations["get_v1_desktop_windows"], 200>;
export type DesktopRecordingStartRequest = JsonRequestBody<operations["post_v1_desktop_recording_start"]>;
export type DesktopRecordingStatus = components["schemas"]["DesktopRecordingStatus"];
export type DesktopRecordingInfo = JsonResponse<operations["post_v1_desktop_recording_start"], 200>;
export type DesktopRecordingListResponse = JsonResponse<operations["get_v1_desktop_recordings"], 200>;
export type DesktopStreamStatusResponse = JsonResponse<operations["post_v1_desktop_stream_start"], 200>;
export type AgentListResponse = JsonResponse<operations["get_v1_agents"], 200>;
export type AgentInfo = components["schemas"]["AgentInfo"];
export type AgentQuery = QueryParams<operations["get_v1_agents"]>;

@@ -37,11 +69,13 @@ export type ProcessCreateRequest = JsonRequestBody<operations["post_v1_processes
export type ProcessInfo = components["schemas"]["ProcessInfo"];
export type ProcessInputRequest = JsonRequestBody<operations["post_v1_process_input"]>;
export type ProcessInputResponse = JsonResponse<operations["post_v1_process_input"], 200>;
export type ProcessListQuery = QueryParams<operations["get_v1_processes"]>;
export type ProcessListResponse = JsonResponse<operations["get_v1_processes"], 200>;
export type ProcessLogEntry = components["schemas"]["ProcessLogEntry"];
export type ProcessLogsQuery = QueryParams<operations["get_v1_process_logs"]>;
export type ProcessLogsResponse = JsonResponse<operations["get_v1_process_logs"], 200>;
export type ProcessLogsStream = components["schemas"]["ProcessLogsStream"];
export type ProcessOwner = components["schemas"]["ProcessOwner"];
export type ProcessRunRequest = JsonRequestBody<operations["post_v1_processes_run"]>;
export type ProcessRunResponse = JsonResponse<operations["post_v1_processes_run"], 200>;
export type ProcessSignalQuery = QueryParams<operations["post_v1_process_stop"]>;
244 sdks/typescript/tests/helpers/docker.ts Normal file
@@ -0,0 +1,244 @@
import { execFileSync } from "node:child_process";
import { mkdtempSync, mkdirSync, rmSync } from "node:fs";
import { dirname, join, resolve } from "node:path";
import { fileURLToPath } from "node:url";

const __dirname = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolve(__dirname, "../../../..");
const CONTAINER_PORT = 3000;
const DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";
const DEFAULT_IMAGE_TAG = "sandbox-agent-test:dev";
const STANDARD_PATHS = new Set(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]);

let cachedImage: string | undefined;
let containerCounter = 0;

export type DockerSandboxAgentHandle = {
  baseUrl: string;
  token: string;
  dispose: () => Promise<void>;
};

export type DockerSandboxAgentOptions = {
  env?: Record<string, string>;
  pathMode?: "merge" | "replace";
  timeoutMs?: number;
};

type TestLayout = {
  rootDir: string;
  homeDir: string;
  xdgDataHome: string;
  xdgStateHome: string;
  appDataDir: string;
  localAppDataDir: string;
  installDir: string;
};

export function createDockerTestLayout(): TestLayout {
  const tempRoot = join(REPO_ROOT, ".context", "docker-test-");
  mkdirSync(resolve(REPO_ROOT, ".context"), { recursive: true });
  const rootDir = mkdtempSync(tempRoot);
  const homeDir = join(rootDir, "home");
  const xdgDataHome = join(rootDir, "xdg-data");
  const xdgStateHome = join(rootDir, "xdg-state");
  const appDataDir = join(rootDir, "appdata", "Roaming");
  const localAppDataDir = join(rootDir, "appdata", "Local");
  const installDir = join(xdgDataHome, "sandbox-agent", "bin");

  for (const dir of [homeDir, xdgDataHome, xdgStateHome, appDataDir, localAppDataDir, installDir]) {
    mkdirSync(dir, { recursive: true });
  }

  return {
    rootDir,
    homeDir,
    xdgDataHome,
    xdgStateHome,
    appDataDir,
    localAppDataDir,
    installDir,
  };
}

export function disposeDockerTestLayout(layout: TestLayout): void {
  try {
    rmSync(layout.rootDir, { recursive: true, force: true });
  } catch (error) {
    if (typeof process.getuid === "function" && typeof process.getgid === "function") {
      try {
        execFileSync(
          "docker",
          [
            "run",
            "--rm",
            "--user",
            "0:0",
            "--entrypoint",
            "sh",
            "-v",
            `${layout.rootDir}:${layout.rootDir}`,
            ensureImage(),
            "-c",
            `chown -R ${process.getuid()}:${process.getgid()} '${layout.rootDir}'`,
          ],
          { stdio: "pipe" },
        );
        rmSync(layout.rootDir, { recursive: true, force: true });
        return;
      } catch {}
    }
    throw error;
  }
}

export async function startDockerSandboxAgent(layout: TestLayout, options: DockerSandboxAgentOptions = {}): Promise<DockerSandboxAgentHandle> {
  const image = ensureImage();
  const containerId = uniqueContainerId();
  const env = buildEnv(layout, options.env ?? {}, options.pathMode ?? "merge");
  const mounts = buildMounts(layout.rootDir, env);

  const args = ["run", "-d", "--rm", "--name", containerId, "-p", `127.0.0.1::${CONTAINER_PORT}`];

  if (typeof process.getuid === "function" && typeof process.getgid === "function") {
    args.push("--user", `${process.getuid()}:${process.getgid()}`);
  }

  if (process.platform === "linux") {
    args.push("--add-host", "host.docker.internal:host-gateway");
  }

  for (const mount of mounts) {
    args.push("-v", `${mount}:${mount}`);
  }

  for (const [key, value] of Object.entries(env)) {
    args.push("-e", `${key}=${value}`);
  }

  args.push(image, "server", "--host", "0.0.0.0", "--port", String(CONTAINER_PORT), "--no-token");

  execFileSync("docker", args, { stdio: "pipe" });

  try {
    const mapping = execFileSync("docker", ["port", containerId, `${CONTAINER_PORT}/tcp`], {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "pipe"],
    }).trim();
    const mappingParts = mapping.split(":");
    const hostPort = mappingParts[mappingParts.length - 1]?.trim();
    if (!hostPort) {
      throw new Error(`missing mapped host port in ${mapping}`);
    }
    const baseUrl = `http://127.0.0.1:${hostPort}`;
    await waitForHealth(baseUrl, options.timeoutMs ?? 30_000);

    return {
      baseUrl,
      token: "",
      dispose: async () => {
        try {
          execFileSync("docker", ["rm", "-f", containerId], { stdio: "pipe" });
        } catch {}
      },
    };
  } catch (error) {
    try {
      execFileSync("docker", ["rm", "-f", containerId], { stdio: "pipe" });
    } catch {}
    throw error;
  }
}

function ensureImage(): string {
  if (cachedImage) {
    return cachedImage;
  }

  cachedImage = process.env.SANDBOX_AGENT_TEST_IMAGE ?? DEFAULT_IMAGE_TAG;
  execFileSync("docker", ["build", "--tag", cachedImage, "--file", resolve(REPO_ROOT, "docker/test-agent/Dockerfile"), REPO_ROOT], {
    cwd: REPO_ROOT,
    stdio: ["ignore", "ignore", "pipe"],
  });
  return cachedImage;
}

function buildEnv(layout: TestLayout, extraEnv: Record<string, string>, pathMode: "merge" | "replace"): Record<string, string> {
  const env: Record<string, string> = {
    HOME: layout.homeDir,
    USERPROFILE: layout.homeDir,
    XDG_DATA_HOME: layout.xdgDataHome,
    XDG_STATE_HOME: layout.xdgStateHome,
    APPDATA: layout.appDataDir,
    LOCALAPPDATA: layout.localAppDataDir,
    PATH: DEFAULT_PATH,
  };

  const customPathEntries = new Set<string>();
  for (const entry of (extraEnv.PATH ?? "").split(":")) {
    if (!entry || entry === DEFAULT_PATH || !entry.startsWith("/")) continue;
    if (entry.startsWith(layout.rootDir)) {
      customPathEntries.add(entry);
    }
  }
  if (pathMode === "replace") {
    env.PATH = extraEnv.PATH ?? "";
  } else if (customPathEntries.size > 0) {
    env.PATH = `${Array.from(customPathEntries).join(":")}:${DEFAULT_PATH}`;
  }

  for (const [key, value] of Object.entries(extraEnv)) {
    if (key === "PATH") {
      continue;
    }
    env[key] = rewriteLocalhostUrl(key, value);
  }

  return env;
}

function buildMounts(rootDir: string, env: Record<string, string>): string[] {
  const mounts = new Set<string>([rootDir]);

  for (const key of ["HOME", "USERPROFILE", "XDG_DATA_HOME", "XDG_STATE_HOME", "APPDATA", "LOCALAPPDATA", "SANDBOX_AGENT_DESKTOP_FAKE_STATE_DIR"]) {
    const value = env[key];
    if (value?.startsWith("/")) {
      mounts.add(value);
    }
  }

  for (const entry of (env.PATH ?? "").split(":")) {
    if (entry.startsWith("/") && !STANDARD_PATHS.has(entry)) {
      mounts.add(entry);
    }
  }

  return Array.from(mounts);
}

async function waitForHealth(baseUrl: string, timeoutMs: number): Promise<void> {
  const started = Date.now();
  while (Date.now() - started < timeoutMs) {
    try {
      const response = await fetch(`${baseUrl}/v1/health`);
      if (response.ok) {
        return;
      }
    } catch {}
    await new Promise((resolve) => setTimeout(resolve, 200));
  }

  throw new Error(`timed out waiting for sandbox-agent health at ${baseUrl}`);
}

function uniqueContainerId(): string {
  containerCounter += 1;
  return `sandbox-agent-ts-${process.pid}-${Date.now().toString(36)}-${containerCounter.toString(36)}`;
}

function rewriteLocalhostUrl(key: string, value: string): string {
  if (key.endsWith("_URL") || key.endsWith("_URI")) {
    return value.replace("http://127.0.0.1", "http://host.docker.internal").replace("http://localhost", "http://host.docker.internal");
  }
  return value;
}
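The localhost-rewrite rule in the helper above is small enough to exercise on its own. This sketch re-derives that one function outside the test harness (no Docker needed) so its behavior is easy to see: only env vars whose names end in `_URL` or `_URI` are rewritten to `host.docker.internal`, so container processes can reach services bound to the host loopback.

```typescript
// Standalone re-derivation of the helper's localhost-URL rewrite rule.
// Keys not ending in _URL/_URI are passed through untouched.
function rewriteLocalhostUrl(key: string, value: string): string {
  if (key.endsWith("_URL") || key.endsWith("_URI")) {
    return value
      .replace("http://127.0.0.1", "http://host.docker.internal")
      .replace("http://localhost", "http://host.docker.internal");
  }
  return value;
}

console.log(rewriteLocalhostUrl("UPSTREAM_URL", "http://127.0.0.1:8080/v1"));
// → http://host.docker.internal:8080/v1
console.log(rewriteLocalhostUrl("UPSTREAM_NAME", "http://localhost:8080"));
// → http://localhost:8080 (key does not end in _URL/_URI)
```

Note this is a string-prefix replace, not URL parsing; values that merely mention the loopback address elsewhere in the string would also be rewritten, which is acceptable for a test-only helper.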
@@ -1,9 +1,6 @@
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { existsSync } from "node:fs";
import { mkdtempSync, rmSync } from "node:fs";
import { dirname, resolve } from "node:path";
import { describe, it, expect, beforeEach, afterEach } from "vitest";
import { mkdirSync, mkdtempSync, rmSync } from "node:fs";
import { join } from "node:path";
import { fileURLToPath } from "node:url";
import { tmpdir } from "node:os";
import {
  InMemorySessionPersistDriver,

@@ -14,36 +11,11 @@ import {
  type SessionPersistDriver,
  type SessionRecord,
} from "../src/index.ts";
import { spawnSandboxAgent, isNodeRuntime, type SandboxAgentSpawnHandle } from "../src/spawn.ts";
import { isNodeRuntime } from "../src/spawn.ts";
import { createDockerTestLayout, disposeDockerTestLayout, startDockerSandboxAgent, type DockerSandboxAgentHandle } from "./helpers/docker.ts";
import { prepareMockAgentDataHome } from "./helpers/mock-agent.ts";
import WebSocket from "ws";

const __dirname = dirname(fileURLToPath(import.meta.url));

function findBinary(): string | null {
  if (process.env.SANDBOX_AGENT_BIN) {
    return process.env.SANDBOX_AGENT_BIN;
  }

  const cargoPaths = [resolve(__dirname, "../../../target/debug/sandbox-agent"), resolve(__dirname, "../../../target/release/sandbox-agent")];

  for (const p of cargoPaths) {
    if (existsSync(p)) {
      return p;
    }
  }

  return null;
}

const BINARY_PATH = findBinary();
if (!BINARY_PATH) {
  throw new Error("sandbox-agent binary not found. Build it (cargo build -p sandbox-agent) or set SANDBOX_AGENT_BIN.");
}
if (!process.env.SANDBOX_AGENT_BIN) {
  process.env.SANDBOX_AGENT_BIN = BINARY_PATH;
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

@@ -110,6 +82,15 @@ async function waitForAsync<T>(fn: () => Promise<T | undefined | null>, timeoutM
  throw new Error("timed out waiting for condition");
}

async function withTimeout<T>(promise: Promise<T>, label: string, timeoutMs = 15_000): Promise<T> {
  return await Promise.race([
    promise,
    sleep(timeoutMs).then(() => {
      throw new Error(`${label} timed out after ${timeoutMs}ms`);
    }),
  ]);
}

function buildTarArchive(entries: Array<{ name: string; content: string }>): Uint8Array {
  const blocks: Buffer[] = [];

@@ -174,34 +155,77 @@ function decodeProcessLogData(data: string, encoding: string): string {

function nodeCommand(source: string): { command: string; args: string[] } {
  return {
    command: process.execPath,
    command: "node",
    args: ["-e", source],
  };
}

function forwardRequest(defaultFetch: typeof fetch, baseUrl: string, outgoing: Request, parsed: URL): Promise<Response> {
  const forwardedInit: RequestInit & { duplex?: "half" } = {
    method: outgoing.method,
    headers: new Headers(outgoing.headers),
    signal: outgoing.signal,
  };

  if (outgoing.method !== "GET" && outgoing.method !== "HEAD") {
    forwardedInit.body = outgoing.body;
    forwardedInit.duplex = "half";
  }

  const forwardedUrl = new URL(`${parsed.pathname}${parsed.search}`, baseUrl);
  return defaultFetch(forwardedUrl, forwardedInit);
}

async function launchDesktopFocusWindow(sdk: SandboxAgent, display: string): Promise<string> {
  const windowProcess = await sdk.createProcess({
    command: "xterm",
    args: ["-geometry", "80x24+40+40", "-title", "Sandbox Desktop Test", "-e", "sh", "-lc", "sleep 60"],
    env: { DISPLAY: display },
  });

  await waitForAsync(
    async () => {
      const result = await sdk.runProcess({
        command: "sh",
        args: [
          "-lc",
          'wid="$(xdotool search --onlyvisible --name \'Sandbox Desktop Test\' 2>/dev/null | head -n 1 || true)"; if [ -z "$wid" ]; then exit 3; fi; xdotool windowactivate "$wid"',
        ],
        env: { DISPLAY: display },
        timeoutMs: 5_000,
      });

      return result.exitCode === 0 ? true : undefined;
    },
    10_000,
    200,
  );

  return windowProcess.id;
}

describe("Integration: TypeScript SDK flat session API", () => {
  let handle: SandboxAgentSpawnHandle;
  let handle: DockerSandboxAgentHandle;
  let baseUrl: string;
  let token: string;
  let dataHome: string;
  let layout: ReturnType<typeof createDockerTestLayout>;

  beforeAll(async () => {
    dataHome = mkdtempSync(join(tmpdir(), "sdk-integration-"));
    const agentEnv = prepareMockAgentDataHome(dataHome);
  beforeEach(async () => {
    layout = createDockerTestLayout();
    prepareMockAgentDataHome(layout.xdgDataHome);

    handle = await spawnSandboxAgent({
      enabled: true,
      log: "silent",
    handle = await startDockerSandboxAgent(layout, {
      timeoutMs: 30000,
      env: agentEnv,
    });
    baseUrl = handle.baseUrl;
    token = handle.token;
  });

  afterAll(async () => {
    await handle.dispose();
    rmSync(dataHome, { recursive: true, force: true });
  afterEach(async () => {
    await handle?.dispose?.();
    if (layout) {
      disposeDockerTestLayout(layout);
    }
  });

  it("detects Node.js runtime", () => {

@@ -280,11 +304,12 @@ describe("Integration: TypeScript SDK flat session API", () => {
      token,
    });

    const directory = mkdtempSync(join(tmpdir(), "sdk-fs-"));
    const directory = join(layout.rootDir, "fs-test");
    const nestedDir = join(directory, "nested");
    const filePath = join(directory, "notes.txt");
    const movedPath = join(directory, "notes-moved.txt");
    const uploadDir = join(directory, "uploaded");
    mkdirSync(directory, { recursive: true });

    try {
      const listedAgents = await sdk.listAgents({ config: true, noCache: true });

@@ -341,25 +366,30 @@ describe("Integration: TypeScript SDK flat session API", () => {
      const parsed = new URL(outgoing.url);
      seenPaths.push(parsed.pathname);

      const forwardedUrl = new URL(`${parsed.pathname}${parsed.search}`, baseUrl);
      const forwarded = new Request(forwardedUrl.toString(), outgoing);
      return defaultFetch(forwarded);
      return forwardRequest(defaultFetch, baseUrl, outgoing, parsed);
    };

    const sdk = await SandboxAgent.connect({
      token,
      fetch: customFetch,
    });
    let sessionId: string | undefined;

    await sdk.getHealth();
    const session = await sdk.createSession({ agent: "mock" });
    const prompt = await session.prompt([{ type: "text", text: "custom fetch integration test" }]);
    expect(prompt.stopReason).toBe("end_turn");
    try {
      await withTimeout(sdk.getHealth(), "custom fetch getHealth");
      const session = await withTimeout(sdk.createSession({ agent: "mock" }), "custom fetch createSession");
      sessionId = session.id;
      expect(session.agent).toBe("mock");
      await withTimeout(sdk.destroySession(session.id), "custom fetch destroySession");

      expect(seenPaths).toContain("/v1/health");
      expect(seenPaths.some((path) => path.startsWith("/v1/acp/"))).toBe(true);

      await sdk.dispose();
      expect(seenPaths).toContain("/v1/health");
      expect(seenPaths.some((path) => path.startsWith("/v1/acp/"))).toBe(true);
    } finally {
      if (sessionId) {
        await sdk.destroySession(sessionId).catch(() => {});
      }
      await withTimeout(sdk.dispose(), "custom fetch dispose");
    }
  }, 60_000);

  it("requires baseUrl when fetch is not provided", async () => {

@@ -386,9 +416,7 @@ describe("Integration: TypeScript SDK flat session API", () => {
      }
    }

    const forwardedUrl = new URL(`${parsed.pathname}${parsed.search}`, baseUrl);
    const forwarded = new Request(forwardedUrl.toString(), outgoing);
    return defaultFetch(forwarded);
    return forwardRequest(defaultFetch, baseUrl, outgoing, parsed);
  };

  const sdk = await SandboxAgent.connect({

@@ -710,7 +738,9 @@ describe("Integration: TypeScript SDK flat session API", () => {
      token,
    });

    const directory = mkdtempSync(join(tmpdir(), "sdk-config-"));
    const directory = join(layout.rootDir, "config-test");

    mkdirSync(directory, { recursive: true });

    const mcpConfig = {
      type: "local" as const,

@@ -957,4 +987,98 @@ describe("Integration: TypeScript SDK flat session API", () => {
      await sdk.dispose();
    }
  });

  it("covers desktop status, screenshot, display, mouse, and keyboard helpers", async () => {
    const sdk = await SandboxAgent.connect({
      baseUrl,
      token,
    });
    let focusWindowProcessId: string | undefined;

    try {
      const initialStatus = await sdk.getDesktopStatus();
      expect(initialStatus.state).toBe("inactive");

      const started = await sdk.startDesktop({
        width: 1440,
        height: 900,
        dpi: 96,
      });
      expect(started.state).toBe("active");
      expect(started.display?.startsWith(":")).toBe(true);
      expect(started.missingDependencies).toEqual([]);

      const displayInfo = await sdk.getDesktopDisplayInfo();
      expect(displayInfo.display).toBe(started.display);
      expect(displayInfo.resolution.width).toBe(1440);
      expect(displayInfo.resolution.height).toBe(900);

      const screenshot = await sdk.takeDesktopScreenshot();
      expect(Buffer.from(screenshot.subarray(0, 8)).equals(Buffer.from("\x89PNG\r\n\x1a\n", "binary"))).toBe(true);

      const region = await sdk.takeDesktopRegionScreenshot({
        x: 10,
        y: 20,
        width: 40,
        height: 50,
      });
      expect(Buffer.from(region.subarray(0, 8)).equals(Buffer.from("\x89PNG\r\n\x1a\n", "binary"))).toBe(true);

      const moved = await sdk.moveDesktopMouse({ x: 40, y: 50 });
      expect(moved.x).toBe(40);
      expect(moved.y).toBe(50);

      const dragged = await sdk.dragDesktopMouse({
        startX: 40,
        startY: 50,
        endX: 80,
        endY: 90,
        button: "left",
      });
      expect(dragged.x).toBe(80);
      expect(dragged.y).toBe(90);

      const clicked = await sdk.clickDesktop({
        x: 80,
        y: 90,
        button: "left",
        clickCount: 1,
      });
      expect(clicked.x).toBe(80);
      expect(clicked.y).toBe(90);

      const scrolled = await sdk.scrollDesktop({
        x: 80,
        y: 90,
        deltaY: -2,
      });
      expect(scrolled.x).toBe(80);
      expect(scrolled.y).toBe(90);

      const position = await sdk.getDesktopMousePosition();
      expect(position.x).toBe(80);
      expect(position.y).toBe(90);

      focusWindowProcessId = await launchDesktopFocusWindow(sdk, started.display!);

      const typed = await sdk.typeDesktopText({
        text: "hello desktop",
        delayMs: 5,
      });
      expect(typed.ok).toBe(true);

      const pressed = await sdk.pressDesktopKey({ key: "ctrl+l" });
      expect(pressed.ok).toBe(true);

      const stopped = await sdk.stopDesktop();
      expect(stopped.state).toBe("inactive");
    } finally {
      if (focusWindowProcessId) {
        await sdk.killProcess(focusWindowProcessId, { waitMs: 5_000 }).catch(() => {});
        await sdk.deleteProcess(focusWindowProcessId).catch(() => {});
      }
      await sdk.stopDesktop().catch(() => {});
      await sdk.dispose();
    }
  });
});
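The screenshot assertions in the desktop test above hinge on one fact: every PNG file begins with the same fixed 8-byte signature. This standalone sketch (plain Node, no SDK or server required) isolates that check so it can be read and run on its own:

```typescript
// Standalone sketch of the PNG check used by the desktop screenshot
// assertions: a PNG always starts with 89 50 4E 47 0D 0A 1A 0A.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function looksLikePng(bytes: Uint8Array): boolean {
  return Buffer.from(bytes.subarray(0, 8)).equals(PNG_SIGNATURE);
}

// Fabricated header for illustration: signature followed by arbitrary bytes.
const fakePng = Buffer.concat([PNG_SIGNATURE, Buffer.from([0, 0, 0, 13])]);
console.log(looksLikePng(fakePng)); // true
console.log(looksLikePng(Buffer.from("not a png"))); // false
```

This only validates the magic bytes, not that the file is a decodable image, which is exactly the level of guarantee the integration test relies on.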
@@ -4,7 +4,6 @@ export default defineConfig({
  test: {
    include: ["tests/**/*.test.ts"],
    testTimeout: 30000,
    teardownTimeout: 10000,
    pool: "forks",
    hookTimeout: 120000,
  },
});
@@ -11,6 +11,7 @@ mod build_version {
    include!(concat!(env!("OUT_DIR"), "/version.rs"));
}

use crate::desktop_install::{install_desktop, DesktopInstallRequest, DesktopPackageManager};
use crate::router::{
    build_router_with_state, shutdown_servers, AppState, AuthConfig, BrandingMode,
};

@@ -75,6 +76,8 @@ pub enum Command {
    Server(ServerArgs),
    /// Call the HTTP API without writing client code.
    Api(ApiArgs),
    /// Install first-party runtime dependencies.
    Install(InstallArgs),
    /// EXPERIMENTAL: OpenCode compatibility layer (disabled until ACP Phase 7).
    Opencode(OpencodeArgs),
    /// Manage the sandbox-agent background daemon.

@@ -118,6 +121,12 @@ pub struct ApiArgs {
    command: ApiCommand,
}

#[derive(Args, Debug)]
pub struct InstallArgs {
    #[command(subcommand)]
    command: InstallCommand,
}

#[derive(Args, Debug)]
pub struct OpencodeArgs {
    #[arg(long, short = 'H', default_value = DEFAULT_HOST)]

@@ -156,6 +165,12 @@ pub struct DaemonArgs {
    command: DaemonCommand,
}

#[derive(Subcommand, Debug)]
pub enum InstallCommand {
    /// Install desktop runtime dependencies.
    Desktop(InstallDesktopArgs),
}

#[derive(Subcommand, Debug)]
pub enum DaemonCommand {
    /// Start the daemon in the background.

@@ -310,6 +325,18 @@ pub struct InstallAgentArgs {
    agent_process_version: Option<String>,
}

#[derive(Args, Debug)]
pub struct InstallDesktopArgs {
    #[arg(long, default_value_t = false)]
    yes: bool,
    #[arg(long, default_value_t = false)]
    print_only: bool,
    #[arg(long, value_enum)]
    package_manager: Option<DesktopPackageManager>,
    #[arg(long, default_value_t = false)]
    no_fonts: bool,
}

#[derive(Args, Debug)]
pub struct CredentialsExtractArgs {
    #[arg(long, short = 'a', value_enum)]

@@ -405,6 +432,7 @@ pub fn run_command(command: &Command, cli: &CliConfig) -> Result<(), CliError> {
    match command {
        Command::Server(args) => run_server(cli, args),
        Command::Api(subcommand) => run_api(&subcommand.command, cli),
        Command::Install(subcommand) => run_install(&subcommand.command),
        Command::Opencode(args) => run_opencode(cli, args),
        Command::Daemon(subcommand) => run_daemon(&subcommand.command, cli),
        Command::InstallAgent(args) => install_agent_local(args),

@@ -413,6 +441,12 @@ pub fn run_command(command: &Command, cli: &CliConfig) -> Result<(), CliError> {
    }
}

fn run_install(command: &InstallCommand) -> Result<(), CliError> {
    match command {
        InstallCommand::Desktop(args) => install_desktop_local(args),
    }
}

fn run_server(cli: &CliConfig, server: &ServerArgs) -> Result<(), CliError> {
    let auth = if let Some(token) = cli.token.clone() {
        AuthConfig::with_token(token)

@@ -477,6 +511,17 @@ fn run_api(command: &ApiCommand, cli: &CliConfig) -> Result<(), CliError> {
    }
}

fn install_desktop_local(args: &InstallDesktopArgs) -> Result<(), CliError> {
    install_desktop(DesktopInstallRequest {
        yes: args.yes,
        print_only: args.print_only,
        package_manager: args.package_manager,
        no_fonts: args.no_fonts,
    })
    .map(|_| ())
    .map_err(CliError::Server)
}

fn run_agents(command: &AgentsCommand, cli: &CliConfig) -> Result<(), CliError> {
    match command {
        AgentsCommand::List(args) => {
217 server/packages/sandbox-agent/src/desktop_errors.rs Normal file
@@ -0,0 +1,217 @@
use sandbox_agent_error::ProblemDetails;
use serde_json::{json, Map, Value};

use crate::desktop_types::{DesktopErrorInfo, DesktopProcessInfo};

#[derive(Debug, Clone)]
pub struct DesktopProblem {
    status: u16,
    title: &'static str,
    code: &'static str,
    message: String,
    missing_dependencies: Vec<String>,
    install_command: Option<String>,
    processes: Vec<DesktopProcessInfo>,
}

impl DesktopProblem {
    pub fn unsupported_platform(message: impl Into<String>) -> Self {
        Self::new(
            501,
            "Desktop Unsupported",
            "desktop_unsupported_platform",
            message,
        )
    }

    pub fn dependencies_missing(
        missing_dependencies: Vec<String>,
        install_command: Option<String>,
        processes: Vec<DesktopProcessInfo>,
    ) -> Self {
        let mut message = if missing_dependencies.is_empty() {
            "Desktop dependencies are not installed".to_string()
        } else {
            format!(
                "Desktop dependencies are not installed: {}",
                missing_dependencies.join(", ")
            )
        };
        if let Some(command) = install_command.as_ref() {
            message.push_str(&format!(
                ". Run `{command}` to install them, or install the required tools manually."
            ));
        }
        Self::new(
            503,
            "Desktop Dependencies Missing",
            "desktop_dependencies_missing",
            message,
        )
        .with_missing_dependencies(missing_dependencies)
        .with_install_command(install_command)
        .with_processes(processes)
    }

    pub fn runtime_inactive(message: impl Into<String>) -> Self {
        Self::new(
            409,
            "Desktop Runtime Inactive",
            "desktop_runtime_inactive",
            message,
        )
    }

    pub fn runtime_starting(message: impl Into<String>) -> Self {
        Self::new(
            409,
            "Desktop Runtime Starting",
            "desktop_runtime_starting",
            message,
        )
    }

    pub fn runtime_failed(
        message: impl Into<String>,
        install_command: Option<String>,
        processes: Vec<DesktopProcessInfo>,
    ) -> Self {
        Self::new(
            503,
            "Desktop Runtime Failed",
            "desktop_runtime_failed",
            message,
        )
        .with_install_command(install_command)
        .with_processes(processes)
    }

    pub fn invalid_action(message: impl Into<String>) -> Self {
        Self::new(
            400,
            "Desktop Invalid Action",
            "desktop_invalid_action",
            message,
        )
    }

    pub fn screenshot_failed(
        message: impl Into<String>,
        processes: Vec<DesktopProcessInfo>,
    ) -> Self {
        Self::new(
            502,
            "Desktop Screenshot Failed",
            "desktop_screenshot_failed",
            message,
        )
        .with_processes(processes)
    }

    pub fn input_failed(message: impl Into<String>, processes: Vec<DesktopProcessInfo>) -> Self {
        Self::new(502, "Desktop Input Failed", "desktop_input_failed", message)
            .with_processes(processes)
    }

    pub fn to_problem_details(&self) -> ProblemDetails {
        let mut extensions = Map::new();
        extensions.insert("code".to_string(), Value::String(self.code.to_string()));
        if !self.missing_dependencies.is_empty() {
            extensions.insert(
                "missingDependencies".to_string(),
                Value::Array(
                    self.missing_dependencies
                        .iter()
                        .cloned()
                        .map(Value::String)
                        .collect(),
                ),
            );
        }
        if let Some(install_command) = self.install_command.as_ref() {
            extensions.insert(
                "installCommand".to_string(),
                Value::String(install_command.clone()),
            );
        }
        if !self.processes.is_empty() {
            extensions.insert("processes".to_string(), json!(self.processes));
        }

        ProblemDetails {
            type_: format!("urn:sandbox-agent:error:{}", self.code),
            title: self.title.to_string(),
            status: self.status,
            detail: Some(self.message.clone()),
            instance: None,
            extensions,
        }
    }

    pub fn to_error_info(&self) -> DesktopErrorInfo {
        DesktopErrorInfo {
            code: self.code.to_string(),
            message: self.message.clone(),
        }
    }

    pub fn code(&self) -> &'static str {
        self.code
    }

    fn new(
        status: u16,
        title: &'static str,
        code: &'static str,
        message: impl Into<String>,
    ) -> Self {
        Self {
            status,
            title,
            code,
            message: message.into(),
            missing_dependencies: Vec::new(),
            install_command: None,
            processes: Vec::new(),
        }
    }

    fn with_missing_dependencies(mut self, missing_dependencies: Vec<String>) -> Self {
        self.missing_dependencies = missing_dependencies;
        self
    }

    fn with_install_command(mut self, install_command: Option<String>) -> Self {
        self.install_command = install_command;
        self
    }

    fn with_processes(mut self, processes: Vec<DesktopProcessInfo>) -> Self {
        self.processes = processes;
        self
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn dependencies_missing_detail_includes_install_command() {
        let problem = DesktopProblem::dependencies_missing(
            vec!["Xvfb".to_string(), "openbox".to_string()],
            Some("sandbox-agent install desktop --yes".to_string()),
            Vec::new(),
        );
        let details = problem.to_problem_details();
        let detail = details.detail.expect("detail");
        assert!(detail.contains("Desktop dependencies are not installed: Xvfb, openbox"));
        assert!(detail.contains("sandbox-agent install desktop --yes"));
        assert_eq!(
            details.extensions.get("installCommand"),
            Some(&Value::String(
                "sandbox-agent install desktop --yes".to_string()
            ))
        );
    }
}
324 server/packages/sandbox-agent/src/desktop_install.rs (Normal file)
@@ -0,0 +1,324 @@
use std::fmt;
use std::io::{self, Write};
use std::path::PathBuf;
use std::process::Command as ProcessCommand;

use clap::ValueEnum;

const AUTOMATIC_INSTALL_SUPPORTED_DISTROS: &str =
    "Automatic desktop dependency installation is supported on Debian/Ubuntu (apt), Fedora/RHEL (dnf), and Alpine (apk).";
const AUTOMATIC_INSTALL_UNSUPPORTED_ENVS: &str =
    "Automatic installation is not supported on macOS, Windows, or Linux distributions without apt, dnf, or apk.";

#[derive(Debug, Clone, Copy, PartialEq, Eq, ValueEnum)]
pub enum DesktopPackageManager {
    Apt,
    Dnf,
    Apk,
}

#[derive(Debug, Clone)]
pub struct DesktopInstallRequest {
    pub yes: bool,
    pub print_only: bool,
    pub package_manager: Option<DesktopPackageManager>,
    pub no_fonts: bool,
}

pub(crate) fn desktop_platform_support_message() -> String {
    format!("Desktop APIs are only supported on Linux. {AUTOMATIC_INSTALL_SUPPORTED_DISTROS}")
}

fn linux_install_support_message() -> String {
    format!("{AUTOMATIC_INSTALL_SUPPORTED_DISTROS} {AUTOMATIC_INSTALL_UNSUPPORTED_ENVS}")
}

pub fn install_desktop(request: DesktopInstallRequest) -> Result<(), String> {
    if std::env::consts::OS != "linux" {
        return Err(format!(
            "desktop installation is only supported on Linux. {}",
            linux_install_support_message()
        ));
    }

    let package_manager = match request.package_manager {
        Some(value) => value,
        None => detect_package_manager().ok_or_else(|| {
            format!(
                "could not detect a supported package manager. {} Install the desktop dependencies manually on this distribution.",
                linux_install_support_message()
            )
        })?,
    };

    let packages = desktop_packages(package_manager, request.no_fonts);
    let used_sudo = !running_as_root() && find_binary("sudo").is_some();
    if !running_as_root() && !used_sudo {
        return Err(
            "desktop installation requires root or sudo access; rerun as root or install dependencies manually"
                .to_string(),
        );
    }

    println!("Desktop package manager: {}", package_manager);
    println!("Desktop packages:");
    for package in &packages {
        println!("  - {package}");
    }
    println!("Install command:");
    println!(
        "  {}",
        render_install_command(package_manager, used_sudo, &packages)
    );

    if request.print_only {
        return Ok(());
    }

    if !request.yes && !prompt_yes_no("Proceed with desktop dependency installation? [y/N] ")? {
        return Err("installation cancelled".to_string());
    }

    run_install_commands(package_manager, used_sudo, &packages)?;

    println!("Desktop dependencies installed.");
    Ok(())
}

fn detect_package_manager() -> Option<DesktopPackageManager> {
    if find_binary("apt-get").is_some() {
        return Some(DesktopPackageManager::Apt);
    }
    if find_binary("dnf").is_some() {
        return Some(DesktopPackageManager::Dnf);
    }
    if find_binary("apk").is_some() {
        return Some(DesktopPackageManager::Apk);
    }
    None
}

fn desktop_packages(package_manager: DesktopPackageManager, no_fonts: bool) -> Vec<String> {
    let mut packages = match package_manager {
        DesktopPackageManager::Apt => vec![
            "xvfb",
            "openbox",
            "xdotool",
            "imagemagick",
            "ffmpeg",
            "x11-xserver-utils",
            "dbus-x11",
            "xauth",
            "fonts-dejavu-core",
        ],
        DesktopPackageManager::Dnf => vec![
            "xorg-x11-server-Xvfb",
            "openbox",
            "xdotool",
            "ImageMagick",
            "ffmpeg",
            "xrandr",
            "dbus-x11",
            "xauth",
            "dejavu-sans-fonts",
        ],
        DesktopPackageManager::Apk => vec![
            "xvfb",
            "openbox",
            "xdotool",
            "imagemagick",
            "ffmpeg",
            "xrandr",
            "dbus",
            "xauth",
            "ttf-dejavu",
        ],
    }
    .into_iter()
    .map(str::to_string)
    .collect::<Vec<_>>();

    if no_fonts {
        packages.retain(|package| {
            package != "fonts-dejavu-core"
                && package != "dejavu-sans-fonts"
                && package != "ttf-dejavu"
        });
    }

    packages
}

fn render_install_command(
    package_manager: DesktopPackageManager,
    used_sudo: bool,
    packages: &[String],
) -> String {
    let sudo = if used_sudo { "sudo " } else { "" };
    match package_manager {
        DesktopPackageManager::Apt => format!(
            "{sudo}apt-get update && {sudo}env DEBIAN_FRONTEND=noninteractive apt-get install -y {}",
            packages.join(" ")
        ),
        DesktopPackageManager::Dnf => {
            format!("{sudo}dnf install -y {}", packages.join(" "))
        }
        DesktopPackageManager::Apk => {
            format!("{sudo}apk add --no-cache {}", packages.join(" "))
        }
    }
}

fn run_install_commands(
    package_manager: DesktopPackageManager,
    used_sudo: bool,
    packages: &[String],
) -> Result<(), String> {
    match package_manager {
        DesktopPackageManager::Apt => {
            run_command(command_with_privilege(
                used_sudo,
                "apt-get",
                vec!["update".to_string()],
            ))?;
            let mut args = vec![
                "DEBIAN_FRONTEND=noninteractive".to_string(),
                "apt-get".to_string(),
                "install".to_string(),
                "-y".to_string(),
            ];
            args.extend(packages.iter().cloned());
            run_command(command_with_privilege(used_sudo, "env", args))?;
        }
        DesktopPackageManager::Dnf => {
            let mut args = vec!["install".to_string(), "-y".to_string()];
            args.extend(packages.iter().cloned());
            run_command(command_with_privilege(used_sudo, "dnf", args))?;
        }
        DesktopPackageManager::Apk => {
            let mut args = vec!["add".to_string(), "--no-cache".to_string()];
            args.extend(packages.iter().cloned());
            run_command(command_with_privilege(used_sudo, "apk", args))?;
        }
    }
    Ok(())
}

fn command_with_privilege(
    used_sudo: bool,
    program: &str,
    args: Vec<String>,
) -> (String, Vec<String>) {
    if used_sudo {
        let mut sudo_args = vec![program.to_string()];
        sudo_args.extend(args);
        ("sudo".to_string(), sudo_args)
    } else {
        (program.to_string(), args)
    }
}

fn run_command((program, args): (String, Vec<String>)) -> Result<(), String> {
    let status = ProcessCommand::new(&program)
        .args(&args)
        .status()
        .map_err(|err| format!("failed to run `{program}`: {err}"))?;
    if !status.success() {
        return Err(format!(
            "command `{}` exited with status {}",
            format_command(&program, &args),
            status
        ));
    }
    Ok(())
}

fn prompt_yes_no(prompt: &str) -> Result<bool, String> {
    print!("{prompt}");
    io::stdout()
        .flush()
        .map_err(|err| format!("failed to flush prompt: {err}"))?;
    let mut input = String::new();
    io::stdin()
        .read_line(&mut input)
        .map_err(|err| format!("failed to read confirmation: {err}"))?;
    let normalized = input.trim().to_ascii_lowercase();
    Ok(matches!(normalized.as_str(), "y" | "yes"))
}

fn running_as_root() -> bool {
    #[cfg(unix)]
    unsafe {
        return libc::geteuid() == 0;
    }
    #[cfg(not(unix))]
    {
        false
    }
}

fn find_binary(name: &str) -> Option<PathBuf> {
    let path_env = std::env::var_os("PATH")?;
    for path in std::env::split_paths(&path_env) {
        let candidate = path.join(name);
        if candidate.is_file() {
            return Some(candidate);
        }
    }
    None
}

fn format_command(program: &str, args: &[String]) -> String {
    let mut parts = vec![program.to_string()];
    parts.extend(args.iter().cloned());
    parts.join(" ")
}

impl fmt::Display for DesktopPackageManager {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DesktopPackageManager::Apt => write!(f, "apt"),
            DesktopPackageManager::Dnf => write!(f, "dnf"),
            DesktopPackageManager::Apk => write!(f, "apk"),
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn desktop_platform_support_message_mentions_linux_and_supported_distros() {
        let message = desktop_platform_support_message();
        assert!(message.contains("only supported on Linux"));
        assert!(message.contains("Debian/Ubuntu (apt)"));
        assert!(message.contains("Fedora/RHEL (dnf)"));
        assert!(message.contains("Alpine (apk)"));
    }

    #[test]
    fn linux_install_support_message_mentions_unsupported_environments() {
        let message = linux_install_support_message();
        assert!(message.contains("Debian/Ubuntu (apt)"));
        assert!(message.contains("Fedora/RHEL (dnf)"));
        assert!(message.contains("Alpine (apk)"));
        assert!(message.contains("macOS"));
        assert!(message.contains("Windows"));
        assert!(message.contains("without apt, dnf, or apk"));
    }

    #[test]
    fn desktop_packages_support_no_fonts() {
        let packages = desktop_packages(DesktopPackageManager::Apt, true);
        assert!(!packages.iter().any(|value| value == "fonts-dejavu-core"));
        assert!(packages.iter().any(|value| value == "xvfb"));
    }

    #[test]
    fn render_install_command_matches_package_manager() {
        let packages = vec!["xvfb".to_string(), "openbox".to_string()];
        let command = render_install_command(DesktopPackageManager::Apk, false, &packages);
        assert_eq!(command, "apk add --no-cache xvfb openbox");
    }
}
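For reference, the shape of the strings `render_install_command` produces can be shown with a standalone sketch. The `render` helper below is a hypothetical simplification written for this note, not part of the crate; it mirrors the sudo prefixing and per-manager formats above:

```rust
// Standalone sketch of the install-command rendering above.
// `render` keys on a plain string instead of the DesktopPackageManager enum,
// and prefixes "sudo " only when elevation is needed, matching the real code.
fn render(use_sudo: bool, pm: &str, packages: &[&str]) -> String {
    let sudo = if use_sudo { "sudo " } else { "" };
    match pm {
        "apt" => format!(
            "{sudo}apt-get update && {sudo}env DEBIAN_FRONTEND=noninteractive apt-get install -y {}",
            packages.join(" ")
        ),
        "dnf" => format!("{sudo}dnf install -y {}", packages.join(" ")),
        // apk case: install without keeping the local package index cache.
        _ => format!("{sudo}apk add --no-cache {}", packages.join(" ")),
    }
}

fn main() {
    println!("{}", render(false, "apk", &["xvfb", "openbox"]));
    println!("{}", render(true, "dnf", &["openbox"]));
}
```

Note the apt path runs two chained commands (`update`, then a non-interactive `install`), which is why `run_install_commands` issues two separate `run_command` calls rather than one.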
329 server/packages/sandbox-agent/src/desktop_recording.rs (Normal file)
@@ -0,0 +1,329 @@
use std::collections::BTreeMap;
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::Arc;

use tokio::sync::Mutex;

use sandbox_agent_error::SandboxError;

use crate::desktop_types::{
    DesktopRecordingInfo, DesktopRecordingListResponse, DesktopRecordingStartRequest,
    DesktopRecordingStatus, DesktopResolution,
};
use crate::process_runtime::{
    ProcessOwner, ProcessRuntime, ProcessStartSpec, ProcessStatus, RestartPolicy,
};

#[derive(Debug, Clone)]
pub struct DesktopRecordingContext {
    pub display: String,
    pub environment: std::collections::HashMap<String, String>,
    pub resolution: DesktopResolution,
}

#[derive(Debug, Clone)]
pub struct DesktopRecordingManager {
    process_runtime: Arc<ProcessRuntime>,
    recordings_dir: PathBuf,
    inner: Arc<Mutex<DesktopRecordingState>>,
}

#[derive(Debug, Default)]
struct DesktopRecordingState {
    next_id: u64,
    current_id: Option<String>,
    recordings: BTreeMap<String, RecordingEntry>,
}

#[derive(Debug, Clone)]
struct RecordingEntry {
    info: DesktopRecordingInfo,
    path: PathBuf,
}

impl DesktopRecordingManager {
    pub fn new(process_runtime: Arc<ProcessRuntime>, state_dir: PathBuf) -> Self {
        Self {
            process_runtime,
            recordings_dir: state_dir.join("recordings"),
            inner: Arc::new(Mutex::new(DesktopRecordingState::default())),
        }
    }

    pub async fn start(
        &self,
        context: DesktopRecordingContext,
        request: DesktopRecordingStartRequest,
    ) -> Result<DesktopRecordingInfo, SandboxError> {
        if find_binary("ffmpeg").is_none() {
            return Err(SandboxError::Conflict {
                message: "ffmpeg is required for desktop recording".to_string(),
            });
        }

        self.ensure_recordings_dir()?;

        {
            let mut state = self.inner.lock().await;
            self.refresh_locked(&mut state).await?;
            if state.current_id.is_some() {
                return Err(SandboxError::Conflict {
                    message: "a desktop recording is already active".to_string(),
                });
            }
        }

        let mut state = self.inner.lock().await;
        let id_num = state.next_id + 1;
        state.next_id = id_num;
        let id = format!("rec_{id_num}");
        let file_name = format!("{id}.mp4");
        let path = self.recordings_dir.join(&file_name);
        let fps = request.fps.unwrap_or(30).clamp(1, 60);
        let args = vec![
            "-y".to_string(),
            "-video_size".to_string(),
            format!("{}x{}", context.resolution.width, context.resolution.height),
            "-framerate".to_string(),
            fps.to_string(),
            "-f".to_string(),
            "x11grab".to_string(),
            "-i".to_string(),
            context.display,
            "-c:v".to_string(),
            "libx264".to_string(),
            "-preset".to_string(),
            "ultrafast".to_string(),
            "-pix_fmt".to_string(),
            "yuv420p".to_string(),
            path.to_string_lossy().to_string(),
        ];
        let snapshot = self
            .process_runtime
            .start_process(ProcessStartSpec {
                command: "ffmpeg".to_string(),
                args,
                cwd: None,
                env: context.environment,
                tty: false,
                interactive: false,
                owner: ProcessOwner::Desktop,
                restart_policy: Some(RestartPolicy::Never),
            })
            .await?;

        let info = DesktopRecordingInfo {
            id: id.clone(),
            status: DesktopRecordingStatus::Recording,
            process_id: Some(snapshot.id),
            file_name,
            bytes: 0,
            started_at: chrono::Utc::now().to_rfc3339(),
            ended_at: None,
        };
        state.current_id = Some(id.clone());
        state.recordings.insert(
            id,
            RecordingEntry {
                info: info.clone(),
                path,
            },
        );
        Ok(info)
    }

    pub async fn stop(&self) -> Result<DesktopRecordingInfo, SandboxError> {
        let (recording_id, process_id) = {
            let mut state = self.inner.lock().await;
            self.refresh_locked(&mut state).await?;
            let recording_id = state
                .current_id
                .clone()
                .ok_or_else(|| SandboxError::Conflict {
                    message: "no desktop recording is active".to_string(),
                })?;
            let process_id = state
                .recordings
                .get(&recording_id)
                .and_then(|entry| entry.info.process_id.clone());
            (recording_id, process_id)
        };

        if let Some(process_id) = process_id {
            let snapshot = self
                .process_runtime
                .stop_process(&process_id, Some(5_000))
                .await?;
            if snapshot.status == ProcessStatus::Running {
                let _ = self
                    .process_runtime
                    .kill_process(&process_id, Some(1_000))
                    .await;
            }
        }

        let mut state = self.inner.lock().await;
        self.refresh_locked(&mut state).await?;
        let entry = state
            .recordings
            .get(&recording_id)
            .ok_or_else(|| SandboxError::NotFound {
                resource: "desktop_recording".to_string(),
                id: recording_id.clone(),
            })?;
        Ok(entry.info.clone())
    }

    pub async fn list(&self) -> Result<DesktopRecordingListResponse, SandboxError> {
        let mut state = self.inner.lock().await;
        self.refresh_locked(&mut state).await?;
        Ok(DesktopRecordingListResponse {
            recordings: state
                .recordings
                .values()
                .map(|entry| entry.info.clone())
                .collect(),
        })
    }

    pub async fn get(&self, id: &str) -> Result<DesktopRecordingInfo, SandboxError> {
        let mut state = self.inner.lock().await;
        self.refresh_locked(&mut state).await?;
        state
            .recordings
            .get(id)
            .map(|entry| entry.info.clone())
            .ok_or_else(|| SandboxError::NotFound {
                resource: "desktop_recording".to_string(),
                id: id.to_string(),
            })
    }

    pub async fn download_path(&self, id: &str) -> Result<PathBuf, SandboxError> {
        let mut state = self.inner.lock().await;
        self.refresh_locked(&mut state).await?;
        let entry = state
            .recordings
            .get(id)
            .ok_or_else(|| SandboxError::NotFound {
                resource: "desktop_recording".to_string(),
                id: id.to_string(),
            })?;
        if !entry.path.is_file() {
            return Err(SandboxError::NotFound {
                resource: "desktop_recording_file".to_string(),
                id: id.to_string(),
            });
        }
        Ok(entry.path.clone())
    }

    pub async fn delete(&self, id: &str) -> Result<(), SandboxError> {
        let mut state = self.inner.lock().await;
        self.refresh_locked(&mut state).await?;
        if state.current_id.as_deref() == Some(id) {
            return Err(SandboxError::Conflict {
                message: "stop the active desktop recording before deleting it".to_string(),
            });
        }
        let entry = state
            .recordings
            .remove(id)
            .ok_or_else(|| SandboxError::NotFound {
                resource: "desktop_recording".to_string(),
                id: id.to_string(),
            })?;
        if entry.path.exists() {
            fs::remove_file(&entry.path).map_err(|err| SandboxError::StreamError {
                message: format!(
                    "failed to delete desktop recording {}: {err}",
                    entry.path.display()
                ),
            })?;
        }
        Ok(())
    }

    fn ensure_recordings_dir(&self) -> Result<(), SandboxError> {
        fs::create_dir_all(&self.recordings_dir).map_err(|err| SandboxError::StreamError {
            message: format!(
                "failed to create desktop recordings dir {}: {err}",
                self.recordings_dir.display()
            ),
        })
    }

    async fn refresh_locked(&self, state: &mut DesktopRecordingState) -> Result<(), SandboxError> {
        let ids: Vec<String> = state.recordings.keys().cloned().collect();
        for id in ids {
            let should_clear_current = {
                let Some(entry) = state.recordings.get_mut(&id) else {
                    continue;
                };
                let Some(process_id) = entry.info.process_id.clone() else {
                    Self::refresh_bytes(entry);
                    continue;
                };

                let snapshot = match self.process_runtime.snapshot(&process_id).await {
                    Ok(snapshot) => snapshot,
                    Err(SandboxError::NotFound { .. }) => {
                        Self::finalize_entry(entry, false);
                        continue;
                    }
                    Err(err) => return Err(err),
                };

                if snapshot.status == ProcessStatus::Running {
                    Self::refresh_bytes(entry);
                    false
                } else {
                    Self::finalize_entry(entry, snapshot.exit_code == Some(0));
                    true
                }
            };

            if should_clear_current && state.current_id.as_deref() == Some(id.as_str()) {
                state.current_id = None;
            }
        }

        Ok(())
    }

    fn refresh_bytes(entry: &mut RecordingEntry) {
        entry.info.bytes = file_size(&entry.path);
    }

    fn finalize_entry(entry: &mut RecordingEntry, success: bool) {
        let bytes = file_size(&entry.path);
        entry.info.status = if success || (entry.path.is_file() && bytes > 0) {
            DesktopRecordingStatus::Completed
        } else {
            DesktopRecordingStatus::Failed
        };
        entry
            .info
            .ended_at
            .get_or_insert_with(|| chrono::Utc::now().to_rfc3339());
        entry.info.bytes = bytes;
    }
}

fn find_binary(name: &str) -> Option<PathBuf> {
    let path_env = std::env::var_os("PATH")?;
    for path in std::env::split_paths(&path_env) {
        let candidate = path.join(name);
        if candidate.is_file() {
            return Some(candidate);
        }
    }
    None
}

fn file_size(path: &Path) -> u64 {
    fs::metadata(path)
        .map(|metadata| metadata.len())
        .unwrap_or(0)
}
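The ffmpeg invocation that `start()` assembles above can be sketched in isolation. `build_args` below is a hypothetical helper written for this note; it mirrors the argument order used by the manager (x11grab input on the given display, libx264 with the ultrafast preset, yuv420p pixel format), with `fps` already clamped to 1..=60 as in the real code:

```rust
// Sketch of the ffmpeg x11grab argument vector built by DesktopRecordingManager::start.
// build_args, its parameters, and the example display/path are illustrative only.
fn build_args(width: u32, height: u32, fps: u32, display: &str, out: &str) -> Vec<String> {
    vec![
        "-y".into(),                          // overwrite the output file if present
        "-video_size".into(),
        format!("{width}x{height}"),          // capture geometry from the desktop resolution
        "-framerate".into(),
        fps.to_string(),
        "-f".into(),
        "x11grab".into(),                     // read frames from the X11 display
        "-i".into(),
        display.into(),
        "-c:v".into(),
        "libx264".into(),
        "-preset".into(),
        "ultrafast".into(),                   // favor low CPU over compression ratio
        "-pix_fmt".into(),
        "yuv420p".into(),                     // broadly compatible pixel format for mp4
        out.into(),
    ]
}

fn main() {
    let args = build_args(1280, 720, 30, ":99", "/tmp/rec_1.mp4");
    println!("ffmpeg {}", args.join(" "));
}
```

Running the real command is delegated to `ProcessRuntime::start_process`, so the recording inherits process lifecycle tracking (the `refresh_locked` pass polls that process to decide Completed vs Failed).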
2215 server/packages/sandbox-agent/src/desktop_runtime.rs (Normal file)
File diff suppressed because it is too large.
47 server/packages/sandbox-agent/src/desktop_streaming.rs (Normal file)
@@ -0,0 +1,47 @@
use std::sync::Arc;

use tokio::sync::Mutex;

use sandbox_agent_error::SandboxError;

use crate::desktop_types::DesktopStreamStatusResponse;

#[derive(Debug, Clone)]
pub struct DesktopStreamingManager {
    inner: Arc<Mutex<DesktopStreamingState>>,
}

#[derive(Debug, Default)]
struct DesktopStreamingState {
    active: bool,
}

impl DesktopStreamingManager {
    pub fn new() -> Self {
        Self {
            inner: Arc::new(Mutex::new(DesktopStreamingState::default())),
        }
    }

    pub async fn start(&self) -> DesktopStreamStatusResponse {
        let mut state = self.inner.lock().await;
        state.active = true;
        DesktopStreamStatusResponse { active: true }
    }

    pub async fn stop(&self) -> DesktopStreamStatusResponse {
        let mut state = self.inner.lock().await;
        state.active = false;
        DesktopStreamStatusResponse { active: false }
    }

    pub async fn ensure_active(&self) -> Result<(), SandboxError> {
        if self.inner.lock().await.active {
            Ok(())
        } else {
            Err(SandboxError::Conflict {
                message: "desktop streaming is not active".to_string(),
            })
        }
    }
}
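The streaming manager above is a small gate: a mutex-guarded `active` flag that `ensure_active` checks before stream endpoints proceed. A synchronous sketch of the same semantics (the real manager uses `tokio::sync::Mutex` and async methods; `Streaming` and its error string here are illustrative):

```rust
use std::sync::Mutex;

// Synchronous stand-in for DesktopStreamingManager's active flag.
struct Streaming {
    active: Mutex<bool>,
}

impl Streaming {
    fn new() -> Self {
        Self { active: Mutex::new(false) }
    }

    fn start(&self) {
        *self.active.lock().unwrap() = true;
    }

    fn stop(&self) {
        *self.active.lock().unwrap() = false;
    }

    // Mirrors ensure_active: Ok when streaming, a conflict-style error otherwise.
    fn ensure_active(&self) -> Result<(), String> {
        if *self.active.lock().unwrap() {
            Ok(())
        } else {
            Err("desktop streaming is not active".to_string())
        }
    }
}

fn main() {
    let s = Streaming::new();
    assert!(s.ensure_active().is_err()); // inactive until started
    s.start();
    assert!(s.ensure_active().is_ok());
    s.stop();
    assert!(s.ensure_active().is_err());
}
```

Mapping the error to `SandboxError::Conflict` lets callers surface it as an HTTP 409 rather than a 500, consistent with the recording manager's conflicts.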
302 server/packages/sandbox-agent/src/desktop_types.rs (Normal file)
@@ -0,0 +1,302 @@
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use utoipa::{IntoParams, ToSchema};

#[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum DesktopState {
    Inactive,
    InstallRequired,
    Starting,
    Active,
    Stopping,
    Failed,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopResolution {
    pub width: u32,
    pub height: u32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub dpi: Option<u32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopErrorInfo {
    pub code: String,
    pub message: String,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopProcessInfo {
    pub name: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub pid: Option<u32>,
    pub running: bool,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub log_path: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopStatusResponse {
    pub state: DesktopState,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub display: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub resolution: Option<DesktopResolution>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub started_at: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub last_error: Option<DesktopErrorInfo>,
    #[serde(default)]
    pub missing_dependencies: Vec<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub install_command: Option<String>,
    #[serde(default)]
    pub processes: Vec<DesktopProcessInfo>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub runtime_log_path: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, IntoParams, Default)]
#[serde(rename_all = "camelCase")]
pub struct DesktopStartRequest {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub width: Option<u32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub height: Option<u32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub dpi: Option<u32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, IntoParams, Default)]
#[serde(rename_all = "camelCase")]
pub struct DesktopScreenshotQuery {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub format: Option<DesktopScreenshotFormat>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub quality: Option<u8>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub scale: Option<f32>,
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "lowercase")]
pub enum DesktopScreenshotFormat {
    Png,
    Jpeg,
    Webp,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, IntoParams)]
#[serde(rename_all = "camelCase")]
pub struct DesktopRegionScreenshotQuery {
    pub x: i32,
    pub y: i32,
    pub width: u32,
    pub height: u32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub format: Option<DesktopScreenshotFormat>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub quality: Option<u8>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub scale: Option<f32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMousePositionResponse {
    pub x: i32,
    pub y: i32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub screen: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub window: Option<String>,
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "lowercase")]
pub enum DesktopMouseButton {
    Left,
    Middle,
    Right,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseMoveRequest {
    pub x: i32,
    pub y: i32,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseClickRequest {
    pub x: i32,
    pub y: i32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub button: Option<DesktopMouseButton>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub click_count: Option<u32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseDownRequest {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub x: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub y: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub button: Option<DesktopMouseButton>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseUpRequest {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub x: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub y: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub button: Option<DesktopMouseButton>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseDragRequest {
    pub start_x: i32,
    pub start_y: i32,
    pub end_x: i32,
    pub end_y: i32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub button: Option<DesktopMouseButton>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopMouseScrollRequest {
    pub x: i32,
    pub y: i32,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub delta_x: Option<i32>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub delta_y: Option<i32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopKeyboardTypeRequest {
    pub text: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub delay_ms: Option<u32>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopKeyboardPressRequest {
    pub key: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub modifiers: Option<DesktopKeyModifiers>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq, Default)]
#[serde(rename_all = "camelCase")]
pub struct DesktopKeyModifiers {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub ctrl: Option<bool>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub shift: Option<bool>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub alt: Option<bool>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub cmd: Option<bool>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopKeyboardDownRequest {
    pub key: String,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct DesktopKeyboardUpRequest {
    pub key: String,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopActionResponse {
    pub ok: bool,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopDisplayInfoResponse {
    pub display: String,
    pub resolution: DesktopResolution,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopWindowInfo {
    pub id: String,
    pub title: String,
    pub x: i32,
    pub y: i32,
    pub width: u32,
    pub height: u32,
    pub is_active: bool,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopWindowListResponse {
    pub windows: Vec<DesktopWindowInfo>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, Default)]
#[serde(rename_all = "camelCase")]
pub struct DesktopRecordingStartRequest {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub fps: Option<u32>,
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "lowercase")]
pub enum DesktopRecordingStatus {
    Recording,
    Completed,
    Failed,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopRecordingInfo {
    pub id: String,
    pub status: DesktopRecordingStatus,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub process_id: Option<String>,
    pub file_name: String,
    pub bytes: u64,
    pub started_at: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub ended_at: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopRecordingListResponse {
    pub recordings: Vec<DesktopRecordingInfo>,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
pub struct DesktopStreamStatusResponse {
|
||||
pub active: bool,
|
||||
}
|
||||
|
|
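The desktop request types above serialize as camelCase JSON, and every `Option` field carries `skip_serializing_if = "Option::is_none"`, so unset fields are omitted entirely rather than sent as `null`. A minimal client-side sketch (illustrative Python, not part of the SDK; the helper name is invented) of the body a caller would POST for `DesktopKeyboardPressRequest`:

```python
import json

def keyboard_press_payload(key, ctrl=None, shift=None, alt=None, cmd=None):
    """Build the camelCase JSON body for DesktopKeyboardPressRequest.

    Mirrors the serde attributes: modifiers that are None are dropped,
    and an all-empty modifiers object is omitted entirely.
    """
    modifiers = {k: v for k, v in
                 {"ctrl": ctrl, "shift": shift, "alt": alt, "cmd": cmd}.items()
                 if v is not None}
    payload = {"key": key}
    if modifiers:
        payload["modifiers"] = modifiers
    return json.dumps(payload)

print(keyboard_press_payload("a", ctrl=True))
# {"key": "a", "modifiers": {"ctrl": true}}
```

The same omit-when-`None` convention applies to the other optional fields here (`delayMs`, `fps`, `processId`, `endedAt`).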
@@ -3,6 +3,12 @@
 mod acp_proxy_runtime;
 pub mod cli;
 pub mod daemon;
+mod desktop_errors;
+mod desktop_install;
+mod desktop_recording;
+mod desktop_runtime;
+mod desktop_streaming;
+pub mod desktop_types;
 mod process_runtime;
 pub mod router;
 pub mod server_logs;
@@ -1,5 +1,5 @@
 use std::collections::{HashMap, VecDeque};
-use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
 use std::sync::Arc;
 use std::time::Instant;
@@ -27,6 +27,22 @@ pub enum ProcessStream {
     Pty,
 }
 
+#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
+#[serde(rename_all = "lowercase")]
+pub enum ProcessOwner {
+    User,
+    Desktop,
+    System,
+}
+
+#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
+#[serde(rename_all = "snake_case")]
+pub enum RestartPolicy {
+    Never,
+    Always,
+    OnFailure,
+}
+
 #[derive(Debug, Clone)]
 pub struct ProcessStartSpec {
     pub command: String,
@@ -35,6 +51,8 @@ pub struct ProcessStartSpec {
     pub env: HashMap<String, String>,
     pub tty: bool,
     pub interactive: bool,
+    pub owner: ProcessOwner,
+    pub restart_policy: Option<RestartPolicy>,
 }
 
 #[derive(Debug, Clone)]
@@ -78,6 +96,7 @@ pub struct ProcessSnapshot {
     pub cwd: Option<String>,
     pub tty: bool,
     pub interactive: bool,
+    pub owner: ProcessOwner,
     pub status: ProcessStatus,
     pub pid: Option<u32>,
     pub exit_code: Option<i32>,
@@ -129,17 +148,27 @@ struct ManagedProcess {
     cwd: Option<String>,
     tty: bool,
     interactive: bool,
+    owner: ProcessOwner,
+    #[allow(dead_code)]
+    restart_policy: RestartPolicy,
+    spec: ProcessStartSpec,
     created_at_ms: i64,
-    pid: Option<u32>,
     max_log_bytes: usize,
-    stdin: Mutex<Option<ProcessStdin>>,
-    #[cfg(unix)]
-    pty_resize_fd: Mutex<Option<std::fs::File>>,
+    runtime: Mutex<ManagedRuntime>,
     status: RwLock<ManagedStatus>,
     sequence: AtomicU64,
     logs: Mutex<VecDeque<StoredLog>>,
     total_log_bytes: Mutex<usize>,
     log_tx: broadcast::Sender<ProcessLogLine>,
+    stop_requested: AtomicBool,
 }
 
+#[derive(Debug)]
+struct ManagedRuntime {
+    pid: Option<u32>,
+    stdin: Option<ProcessStdin>,
+    #[cfg(unix)]
+    pty_resize_fd: Option<std::fs::File>,
+}
+
 #[derive(Debug)]
@@ -162,17 +191,17 @@ struct ManagedStatus {
 }
 
 struct SpawnedPipeProcess {
-    process: Arc<ManagedProcess>,
     child: Child,
     stdout: tokio::process::ChildStdout,
     stderr: tokio::process::ChildStderr,
+    runtime: ManagedRuntime,
 }
 
 #[cfg(unix)]
 struct SpawnedTtyProcess {
-    process: Arc<ManagedProcess>,
     child: Child,
     reader: tokio::fs::File,
+    runtime: ManagedRuntime,
 }
 
 impl ProcessRuntime {
@@ -224,21 +253,14 @@ impl ProcessRuntime {
         &self,
         spec: ProcessStartSpec,
     ) -> Result<ProcessSnapshot, SandboxError> {
-        let config = self.get_config().await;
-
-        let process_refs = {
-            let processes = self.inner.processes.read().await;
-            processes.values().cloned().collect::<Vec<_>>()
-        };
-
-        let mut running_count = 0usize;
-        for process in process_refs {
-            if process.status.read().await.status == ProcessStatus::Running {
-                running_count += 1;
-            }
+        if spec.command.trim().is_empty() {
+            return Err(SandboxError::InvalidRequest {
+                message: "command must not be empty".to_string(),
+            });
         }
 
-        if running_count >= config.max_concurrent_processes {
+        let config = self.get_config().await;
+        if self.running_process_count().await >= config.max_concurrent_processes {
             return Err(SandboxError::Conflict {
                 message: format!(
                     "max concurrent process limit reached ({})",
@@ -247,73 +269,44 @@ impl ProcessRuntime {
             });
         }
 
-        if spec.command.trim().is_empty() {
-            return Err(SandboxError::InvalidRequest {
-                message: "command must not be empty".to_string(),
-            });
-        }
-
         let id_num = self.inner.next_id.fetch_add(1, Ordering::Relaxed);
         let id = format!("proc_{id_num}");
+        let process = Arc::new(ManagedProcess {
+            id: id.clone(),
+            command: spec.command.clone(),
+            args: spec.args.clone(),
+            cwd: spec.cwd.clone(),
+            tty: spec.tty,
+            interactive: spec.interactive,
+            owner: spec.owner,
+            restart_policy: spec.restart_policy.unwrap_or(RestartPolicy::Never),
+            spec,
+            created_at_ms: now_ms(),
+            max_log_bytes: config.max_log_bytes_per_process,
+            runtime: Mutex::new(ManagedRuntime {
+                pid: None,
+                stdin: None,
+                #[cfg(unix)]
+                pty_resize_fd: None,
+            }),
+            status: RwLock::new(ManagedStatus {
+                status: ProcessStatus::Running,
+                exit_code: None,
+                exited_at_ms: None,
+            }),
+            sequence: AtomicU64::new(1),
+            logs: Mutex::new(VecDeque::new()),
+            total_log_bytes: Mutex::new(0),
+            log_tx: broadcast::channel(512).0,
+            stop_requested: AtomicBool::new(false),
+        });
 
-        if spec.tty {
-            #[cfg(unix)]
-            {
-                let spawned = self
-                    .spawn_tty_process(id.clone(), spec, config.max_log_bytes_per_process)
-                    .await?;
-                let process = spawned.process.clone();
-                self.inner
-                    .processes
-                    .write()
-                    .await
-                    .insert(id, process.clone());
-
-                let p = process.clone();
-                tokio::spawn(async move {
-                    pump_output(p, spawned.reader, ProcessStream::Pty).await;
-                });
-
-                let p = process.clone();
-                tokio::spawn(async move {
-                    watch_exit(p, spawned.child).await;
-                });
-
-                return Ok(process.snapshot().await);
-            }
-            #[cfg(not(unix))]
-            {
-                return Err(SandboxError::StreamError {
-                    message: "tty process mode is not supported on this platform".to_string(),
-                });
-            }
-        }
-
-        let spawned = self
-            .spawn_pipe_process(id.clone(), spec, config.max_log_bytes_per_process)
-            .await?;
-        let process = spawned.process.clone();
+        self.spawn_existing_process(process.clone()).await?;
         self.inner
             .processes
             .write()
             .await
             .insert(id, process.clone());
 
-        let p = process.clone();
-        tokio::spawn(async move {
-            pump_output(p, spawned.stdout, ProcessStream::Stdout).await;
-        });
-
-        let p = process.clone();
-        tokio::spawn(async move {
-            pump_output(p, spawned.stderr, ProcessStream::Stderr).await;
-        });
-
-        let p = process.clone();
-        tokio::spawn(async move {
-            watch_exit(p, spawned.child).await;
-        });
-
         Ok(process.snapshot().await)
     }
@@ -412,11 +405,13 @@ impl ProcessRuntime {
         })
     }
 
-    pub async fn list_processes(&self) -> Vec<ProcessSnapshot> {
+    pub async fn list_processes(&self, owner: Option<ProcessOwner>) -> Vec<ProcessSnapshot> {
         let processes = self.inner.processes.read().await;
        let mut items = Vec::with_capacity(processes.len());
         for process in processes.values() {
-            items.push(process.snapshot().await);
+            if owner.is_none_or(|expected| process.owner == expected) {
+                items.push(process.snapshot().await);
+            }
         }
         items.sort_by(|a, b| a.id.cmp(&b.id));
         items
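The new `owner` parameter above uses `Option::is_none_or`: no filter means every process is returned, otherwise only snapshots whose owner matches. An illustrative Python model of that filter-and-sort behavior (not the server code; the dict shape is an assumption for the sketch):

```python
def list_processes(processes, owner=None):
    """Return processes sorted by id; when owner is given, keep only
    matching owners (the Python analogue of Option::is_none_or)."""
    items = [p for p in processes if owner is None or p["owner"] == owner]
    return sorted(items, key=lambda p: p["id"])

procs = [
    {"id": "proc_2", "owner": "desktop"},
    {"id": "proc_1", "owner": "user"},
]
assert [p["id"] for p in list_processes(procs)] == ["proc_1", "proc_2"]
assert list_processes(procs, owner="desktop") == [{"id": "proc_2", "owner": "desktop"}]
```

This is what lets the inspector hide desktop-infrastructure processes (Xvfb, openbox, neko) from the user-facing process list while still tracking them in the same runtime.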
@@ -453,6 +448,7 @@ impl ProcessRuntime {
         wait_ms: Option<u64>,
     ) -> Result<ProcessSnapshot, SandboxError> {
         let process = self.lookup_process(id).await?;
+        process.stop_requested.store(true, Ordering::SeqCst);
         process.send_signal(SIGTERM).await?;
         maybe_wait_for_exit(process.clone(), wait_ms.unwrap_or(2_000)).await;
         Ok(process.snapshot().await)
@@ -464,6 +460,7 @@ impl ProcessRuntime {
         wait_ms: Option<u64>,
     ) -> Result<ProcessSnapshot, SandboxError> {
         let process = self.lookup_process(id).await?;
+        process.stop_requested.store(true, Ordering::SeqCst);
         process.send_signal(SIGKILL).await?;
         maybe_wait_for_exit(process.clone(), wait_ms.unwrap_or(1_000)).await;
         Ok(process.snapshot().await)
@@ -506,6 +503,17 @@ impl ProcessRuntime {
         Ok(process.log_tx.subscribe())
     }
 
+    async fn running_process_count(&self) -> usize {
+        let processes = self.inner.processes.read().await;
+        let mut running = 0usize;
+        for process in processes.values() {
+            if process.status.read().await.status == ProcessStatus::Running {
+                running += 1;
+            }
+        }
+        running
+    }
+
     async fn lookup_process(&self, id: &str) -> Result<Arc<ManagedProcess>, SandboxError> {
         let process = self.inner.processes.read().await.get(id).cloned();
         process.ok_or_else(|| SandboxError::NotFound {
@@ -514,11 +522,83 @@ impl ProcessRuntime {
         })
     }
 
-    async fn spawn_pipe_process(
+    async fn spawn_existing_process(
         &self,
-        id: String,
-        spec: ProcessStartSpec,
-        max_log_bytes: usize,
+        process: Arc<ManagedProcess>,
+    ) -> Result<(), SandboxError> {
+        process.stop_requested.store(false, Ordering::SeqCst);
+        let mut runtime_guard = process.runtime.lock().await;
+        let mut status_guard = process.status.write().await;
+
+        if process.tty {
+            #[cfg(unix)]
+            {
+                let SpawnedTtyProcess {
+                    child,
+                    reader,
+                    runtime,
+                } = self.spawn_tty_process(&process.spec)?;
+                *runtime_guard = runtime;
+                status_guard.status = ProcessStatus::Running;
+                status_guard.exit_code = None;
+                status_guard.exited_at_ms = None;
+                drop(status_guard);
+                drop(runtime_guard);
+
+                let process_for_output = process.clone();
+                tokio::spawn(async move {
+                    pump_output(process_for_output, reader, ProcessStream::Pty).await;
+                });
+
+                let runtime = self.clone();
+                tokio::spawn(async move {
+                    watch_exit(runtime, process, child).await;
+                });
+
+                return Ok(());
+            }
+            #[cfg(not(unix))]
+            {
+                return Err(SandboxError::StreamError {
+                    message: "tty process mode is not supported on this platform".to_string(),
+                });
+            }
+        }
+
+        let SpawnedPipeProcess {
+            child,
+            stdout,
+            stderr,
+            runtime,
+        } = self.spawn_pipe_process(&process.spec)?;
+        *runtime_guard = runtime;
+        status_guard.status = ProcessStatus::Running;
+        status_guard.exit_code = None;
+        status_guard.exited_at_ms = None;
+        drop(status_guard);
+        drop(runtime_guard);
+
+        let process_for_stdout = process.clone();
+        tokio::spawn(async move {
+            pump_output(process_for_stdout, stdout, ProcessStream::Stdout).await;
+        });
+
+        let process_for_stderr = process.clone();
+        tokio::spawn(async move {
+            pump_output(process_for_stderr, stderr, ProcessStream::Stderr).await;
+        });
+
+        let runtime = self.clone();
+        tokio::spawn(async move {
+            watch_exit(runtime, process, child).await;
+        });
+
+        Ok(())
+    }
+
+    fn spawn_pipe_process(
+        &self,
+        spec: &ProcessStartSpec,
     ) -> Result<SpawnedPipeProcess, SandboxError> {
         let mut cmd = Command::new(&spec.command);
         cmd.args(&spec.args)
@@ -551,35 +631,14 @@ impl ProcessRuntime {
             .ok_or_else(|| SandboxError::StreamError {
                 message: "failed to capture stderr".to_string(),
             })?;
-        let pid = child.id();
-
-        let (tx, _rx) = broadcast::channel(512);
-        let process = Arc::new(ManagedProcess {
-            id,
-            command: spec.command,
-            args: spec.args,
-            cwd: spec.cwd,
-            tty: false,
-            interactive: spec.interactive,
-            created_at_ms: now_ms(),
-            pid,
-            max_log_bytes,
-            stdin: Mutex::new(stdin.map(ProcessStdin::Pipe)),
-            #[cfg(unix)]
-            pty_resize_fd: Mutex::new(None),
-            status: RwLock::new(ManagedStatus {
-                status: ProcessStatus::Running,
-                exit_code: None,
-                exited_at_ms: None,
-            }),
-            sequence: AtomicU64::new(1),
-            logs: Mutex::new(VecDeque::new()),
-            total_log_bytes: Mutex::new(0),
-            log_tx: tx,
-        });
 
         Ok(SpawnedPipeProcess {
-            process,
+            runtime: ManagedRuntime {
+                pid: child.id(),
+                stdin: stdin.map(ProcessStdin::Pipe),
+                #[cfg(unix)]
+                pty_resize_fd: None,
+            },
             child,
             stdout,
             stderr,
@@ -587,11 +646,9 @@ impl ProcessRuntime {
     }
 
     #[cfg(unix)]
-    async fn spawn_tty_process(
+    fn spawn_tty_process(
         &self,
-        id: String,
-        spec: ProcessStartSpec,
-        max_log_bytes: usize,
+        spec: &ProcessStartSpec,
     ) -> Result<SpawnedTtyProcess, SandboxError> {
         use std::os::fd::AsRawFd;
         use std::process::Stdio;
@@ -632,8 +689,8 @@ impl ProcessRuntime {
         let child = cmd.spawn().map_err(|err| SandboxError::StreamError {
             message: format!("failed to spawn tty process: {err}"),
         })?;
-
         let pid = child.id();
+
         drop(slave_fd);
 
         let master_raw = master_fd.as_raw_fd();
@@ -644,32 +701,12 @@ impl ProcessRuntime {
         let writer_file = tokio::fs::File::from_std(std::fs::File::from(writer_fd));
         let resize_file = std::fs::File::from(resize_fd);
 
-        let (tx, _rx) = broadcast::channel(512);
-        let process = Arc::new(ManagedProcess {
-            id,
-            command: spec.command,
-            args: spec.args,
-            cwd: spec.cwd,
-            tty: true,
-            interactive: spec.interactive,
-            created_at_ms: now_ms(),
-            pid,
-            max_log_bytes,
-            stdin: Mutex::new(Some(ProcessStdin::Pty(writer_file))),
-            pty_resize_fd: Mutex::new(Some(resize_file)),
-            status: RwLock::new(ManagedStatus {
-                status: ProcessStatus::Running,
-                exit_code: None,
-                exited_at_ms: None,
-            }),
-            sequence: AtomicU64::new(1),
-            logs: Mutex::new(VecDeque::new()),
-            total_log_bytes: Mutex::new(0),
-            log_tx: tx,
-        });
-
         Ok(SpawnedTtyProcess {
-            process,
+            runtime: ManagedRuntime {
+                pid,
+                stdin: Some(ProcessStdin::Pty(writer_file)),
+                pty_resize_fd: Some(resize_file),
+            },
             child,
             reader: reader_file,
         })
@@ -694,6 +731,7 @@ pub struct ProcessLogFilter {
 impl ManagedProcess {
     async fn snapshot(&self) -> ProcessSnapshot {
         let status = self.status.read().await.clone();
+        let pid = self.runtime.lock().await.pid;
         ProcessSnapshot {
             id: self.id.clone(),
             command: self.command.clone(),
@@ -701,8 +739,9 @@ impl ManagedProcess {
             cwd: self.cwd.clone(),
             tty: self.tty,
             interactive: self.interactive,
+            owner: self.owner,
             status: status.status,
-            pid: self.pid,
+            pid,
             exit_code: status.exit_code,
             created_at_ms: self.created_at_ms,
             exited_at_ms: status.exited_at_ms,
@@ -752,10 +791,13 @@ impl ManagedProcess {
             });
         }
 
-        let mut guard = self.stdin.lock().await;
-        let stdin = guard.as_mut().ok_or_else(|| SandboxError::Conflict {
-            message: "process does not accept stdin".to_string(),
-        })?;
+        let mut runtime = self.runtime.lock().await;
+        let stdin = runtime
+            .stdin
+            .as_mut()
+            .ok_or_else(|| SandboxError::Conflict {
+                message: "process does not accept stdin".to_string(),
+            })?;
 
         match stdin {
             ProcessStdin::Pipe(pipe) => {
@@ -825,7 +867,7 @@ impl ManagedProcess {
         if self.status.read().await.status != ProcessStatus::Running {
             return Ok(());
         }
-        let Some(pid) = self.pid else {
+        let Some(pid) = self.runtime.lock().await.pid else {
            return Ok(());
        };
@@ -840,8 +882,9 @@ impl ManagedProcess {
         #[cfg(unix)]
         {
             use std::os::fd::AsRawFd;
-            let guard = self.pty_resize_fd.lock().await;
-            let Some(fd) = guard.as_ref() else {
+
+            let runtime = self.runtime.lock().await;
+            let Some(fd) = runtime.pty_resize_fd.as_ref() else {
                 return Err(SandboxError::Conflict {
                     message: "PTY resize handle unavailable".to_string(),
                 });
@@ -857,6 +900,32 @@ impl ManagedProcess {
 
         Ok(())
     }
+
+    #[allow(dead_code)]
+    fn should_restart(&self, exit_code: Option<i32>) -> bool {
+        match self.restart_policy {
+            RestartPolicy::Never => false,
+            RestartPolicy::Always => true,
+            RestartPolicy::OnFailure => exit_code.unwrap_or(1) != 0,
+        }
+    }
+
+    async fn mark_exited(&self, exit_code: Option<i32>, exited_at_ms: Option<i64>) {
+        {
+            let mut status = self.status.write().await;
+            status.status = ProcessStatus::Exited;
+            status.exit_code = exit_code;
+            status.exited_at_ms = exited_at_ms;
+        }
+
+        let mut runtime = self.runtime.lock().await;
+        runtime.pid = None;
+        let _ = runtime.stdin.take();
+        #[cfg(unix)]
+        {
+            let _ = runtime.pty_resize_fd.take();
+        }
+    }
 }
 
 fn stream_matches(stream: ProcessStream, filter: ProcessLogFilterStream) -> bool {
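The `should_restart` decision above treats a missing exit code as a failure (`exit_code.unwrap_or(1)`). A small Python model of the same decision table (illustrative only; the string policy names mirror the `snake_case` serde rename):

```python
def should_restart(policy, exit_code):
    """Mirror of ManagedProcess::should_restart: 'never' and 'always'
    are unconditional; 'on_failure' restarts on a non-zero or missing
    exit code (a missing code is treated as 1, i.e. failure)."""
    if policy == "never":
        return False
    if policy == "always":
        return True
    if policy == "on_failure":
        return (1 if exit_code is None else exit_code) != 0
    raise ValueError(f"unknown restart policy: {policy}")
```

Treating a missing code as failure matches the `watch_exit` path, where `child.wait()` errors (or signal deaths) yield no exit code but should still count against `on_failure`.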
@@ -909,21 +978,16 @@ where
     }
 }
 
-async fn watch_exit(process: Arc<ManagedProcess>, mut child: Child) {
+async fn watch_exit(runtime: ProcessRuntime, process: Arc<ManagedProcess>, mut child: Child) {
+    let _ = runtime;
     let wait = child.wait().await;
     let (exit_code, exited_at_ms) = match wait {
         Ok(status) => (status.code(), Some(now_ms())),
         Err(_) => (None, Some(now_ms())),
     };
 
-    {
-        let mut state = process.status.write().await;
-        state.status = ProcessStatus::Exited;
-        state.exit_code = exit_code;
-        state.exited_at_ms = exited_at_ms;
-    }
-
-    let _ = process.stdin.lock().await.take();
+    let _ = process.stop_requested.swap(false, Ordering::SeqCst);
+    process.mark_exited(exit_code, exited_at_ms).await;
 }
 
 async fn capture_output<R>(mut reader: R, max_bytes: usize) -> std::io::Result<(Vec<u8>, bool)>
File diff suppressed because it is too large
@@ -33,7 +33,8 @@ pub(super) async fn require_token(
         .and_then(|value| value.to_str().ok())
         .and_then(|value| value.strip_prefix("Bearer "));
 
-    let allow_query_token = request.uri().path().ends_with("/terminal/ws");
+    let allow_query_token = request.uri().path().ends_with("/terminal/ws")
+        || request.uri().path().ends_with("/stream/ws");
     let query_token = if allow_query_token {
         request
             .uri()
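The hunk above widens the query-token exception from the terminal websocket to the new `/stream/ws` desktop-stream endpoint. A hedged Python sketch of the flow (not the server code; treating the Bearer header as taking precedence over the query parameter is an assumption of this sketch, and the concrete request paths below are made up):

```python
def extract_token(path, headers, query):
    """Sketch of require_token's token sources: a Bearer authorization
    header is always considered; a ?token=... query parameter is only
    considered on the websocket endpoints, which cannot set headers
    from a browser WebSocket constructor."""
    auth = headers.get("authorization", "")
    bearer = auth[len("Bearer "):] if auth.startswith("Bearer ") else None
    allow_query = path.endswith("/terminal/ws") or path.endswith("/stream/ws")
    query_token = query.get("token") if allow_query else None
    # assumption: header wins when both are present
    return bearer if bearer is not None else query_token
```

The design point is the same in either precedence order: ordinary HTTP routes never accept `?token=`, so tokens stay out of access logs except on the two endpoints that have no other option.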
@@ -425,6 +425,14 @@ pub enum ProcessState {
     Exited,
 }
 
+#[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
+#[serde(rename_all = "lowercase")]
+pub enum ProcessOwner {
+    User,
+    Desktop,
+    System,
+}
+
 #[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
 #[serde(rename_all = "camelCase")]
 pub struct ProcessInfo {
@@ -435,6 +443,7 @@ pub struct ProcessInfo {
     pub cwd: Option<String>,
     pub tty: bool,
     pub interactive: bool,
+    pub owner: ProcessOwner,
     pub status: ProcessState,
     #[serde(default, skip_serializing_if = "Option::is_none")]
     pub pid: Option<u32>,
@@ -451,6 +460,13 @@ pub struct ProcessListResponse {
     pub processes: Vec<ProcessInfo>,
 }
 
+#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, IntoParams)]
+#[serde(rename_all = "camelCase")]
+pub struct ProcessListQuery {
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub owner: Option<ProcessOwner>,
+}
+
 #[derive(Debug, Clone, Copy, Serialize, Deserialize, JsonSchema, ToSchema, PartialEq, Eq)]
 #[serde(rename_all = "lowercase")]
 pub enum ProcessLogsStream {
593 server/packages/sandbox-agent/tests/support/docker.rs Normal file

@@ -0,0 +1,593 @@
+use std::collections::{BTreeMap, BTreeSet};
+use std::fs;
+use std::io::{Read, Write};
+use std::net::TcpStream;
+use std::path::{Path, PathBuf};
+use std::process::Command;
+use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::OnceLock;
+use std::thread;
+use std::time::{Duration, SystemTime, UNIX_EPOCH};
+
+use sandbox_agent::router::AuthConfig;
+use serial_test::serial;
+use tempfile::TempDir;
+
+const CONTAINER_PORT: u16 = 3000;
+const DEFAULT_PATH: &str = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";
+const DEFAULT_IMAGE_TAG: &str = "sandbox-agent-test:dev";
+const STANDARD_PATHS: &[&str] = &[
+    "/usr/local/sbin",
+    "/usr/local/bin",
+    "/usr/sbin",
+    "/usr/bin",
+    "/sbin",
+    "/bin",
+];
+
+static IMAGE_TAG: OnceLock<String> = OnceLock::new();
+static DOCKER_BIN: OnceLock<PathBuf> = OnceLock::new();
+static CONTAINER_COUNTER: AtomicU64 = AtomicU64::new(0);
+
+#[derive(Clone)]
+pub struct DockerApp {
+    base_url: String,
+}
+
+impl DockerApp {
+    pub fn http_url(&self, path: &str) -> String {
+        format!("{}{}", self.base_url, path)
+    }
+
+    pub fn ws_url(&self, path: &str) -> String {
+        let suffix = self
+            .base_url
+            .strip_prefix("http://")
+            .unwrap_or(&self.base_url);
+        format!("ws://{suffix}{path}")
+    }
+}
+
+pub struct TestApp {
+    pub app: DockerApp,
+    install_dir: PathBuf,
+    _root: TempDir,
+    container_id: String,
+}
+
+#[derive(Default)]
+pub struct TestAppOptions {
+    pub env: BTreeMap<String, String>,
+    pub extra_paths: Vec<PathBuf>,
+    pub replace_path: bool,
+}
+
+impl TestApp {
+    pub fn new(auth: AuthConfig) -> Self {
+        Self::with_setup(auth, |_| {})
+    }
+
+    pub fn with_setup<F>(auth: AuthConfig, setup: F) -> Self
+    where
+        F: FnOnce(&Path),
+    {
+        Self::with_options(auth, TestAppOptions::default(), setup)
+    }
+
+    pub fn with_options<F>(auth: AuthConfig, options: TestAppOptions, setup: F) -> Self
+    where
+        F: FnOnce(&Path),
+    {
+        let root = tempfile::tempdir().expect("create docker test root");
+        let layout = TestLayout::new(root.path());
+        layout.create();
+        setup(&layout.install_dir);
+
+        let container_id = unique_container_id();
+        let image = ensure_test_image();
+        let env = build_env(&layout, &auth, &options);
+        let mounts = build_mounts(root.path(), &env);
+        let base_url = run_container(&container_id, &image, &mounts, &env, &auth);
+
+        Self {
+            app: DockerApp { base_url },
+            install_dir: layout.install_dir,
+            _root: root,
+            container_id,
+        }
+    }
+
+    pub fn install_path(&self) -> &Path {
+        &self.install_dir
+    }
+
+    pub fn root_path(&self) -> &Path {
+        self._root.path()
+    }
+}
+
+impl Drop for TestApp {
+    fn drop(&mut self) {
+        let _ = Command::new(docker_bin())
+            .args(["rm", "-f", &self.container_id])
+            .output();
+    }
+}
+
+pub struct LiveServer {
+    base_url: String,
+}
+
+impl LiveServer {
+    pub async fn spawn(app: DockerApp) -> Self {
+        Self {
+            base_url: app.base_url,
+        }
+    }
+
+    pub fn http_url(&self, path: &str) -> String {
+        format!("{}{}", self.base_url, path)
+    }
+
+    pub fn ws_url(&self, path: &str) -> String {
+        let suffix = self
+            .base_url
+            .strip_prefix("http://")
+            .unwrap_or(&self.base_url);
+        format!("ws://{suffix}{path}")
+    }
+
+    pub async fn shutdown(self) {}
+}
+
+struct TestLayout {
+    home: PathBuf,
+    xdg_data_home: PathBuf,
+    xdg_state_home: PathBuf,
+    appdata: PathBuf,
+    local_appdata: PathBuf,
+    install_dir: PathBuf,
+}
+
+impl TestLayout {
+    fn new(root: &Path) -> Self {
+        let home = root.join("home");
+        let xdg_data_home = root.join("xdg-data");
+        let xdg_state_home = root.join("xdg-state");
+        let appdata = root.join("appdata").join("Roaming");
+        let local_appdata = root.join("appdata").join("Local");
+        let install_dir = xdg_data_home.join("sandbox-agent").join("bin");
+        Self {
+            home,
+            xdg_data_home,
+            xdg_state_home,
+            appdata,
+            local_appdata,
+            install_dir,
+        }
+    }
+
+    fn create(&self) {
+        for dir in [
+            &self.home,
+            &self.xdg_data_home,
+            &self.xdg_state_home,
+            &self.appdata,
+            &self.local_appdata,
+            &self.install_dir,
+        ] {
+            fs::create_dir_all(dir).expect("create docker test dir");
+        }
+    }
+}
+
+fn ensure_test_image() -> String {
+    IMAGE_TAG
+        .get_or_init(|| {
+            let repo_root = repo_root();
+            let image_tag = std::env::var("SANDBOX_AGENT_TEST_IMAGE")
+                .unwrap_or_else(|_| DEFAULT_IMAGE_TAG.to_string());
+            let output = Command::new(docker_bin())
+                .args(["build", "--tag", &image_tag, "--file"])
+                .arg(
+                    repo_root
+                        .join("docker")
+                        .join("test-agent")
+                        .join("Dockerfile"),
+                )
+                .arg(&repo_root)
+                .output()
+                .expect("build sandbox-agent test image");
+            if !output.status.success() {
+                panic!(
+                    "failed to build sandbox-agent test image: {}",
+                    String::from_utf8_lossy(&output.stderr)
+                );
+            }
+            image_tag
+        })
+        .clone()
+}
+
+fn build_env(
+    layout: &TestLayout,
+    auth: &AuthConfig,
+    options: &TestAppOptions,
+) -> BTreeMap<String, String> {
+    let mut env = BTreeMap::new();
+    env.insert(
+        "HOME".to_string(),
+        layout.home.to_string_lossy().to_string(),
+    );
+    env.insert(
+        "USERPROFILE".to_string(),
+        layout.home.to_string_lossy().to_string(),
+    );
+    env.insert(
+        "XDG_DATA_HOME".to_string(),
+        layout.xdg_data_home.to_string_lossy().to_string(),
+    );
+    env.insert(
+        "XDG_STATE_HOME".to_string(),
+        layout.xdg_state_home.to_string_lossy().to_string(),
+    );
+    env.insert(
+        "APPDATA".to_string(),
+        layout.appdata.to_string_lossy().to_string(),
+    );
+    env.insert(
+        "LOCALAPPDATA".to_string(),
+        layout.local_appdata.to_string_lossy().to_string(),
+    );
+
+    for (key, value) in std::env::vars() {
+        if key == "PATH" {
+            continue;
+        }
+        if key == "XDG_STATE_HOME" || key == "HOME" || key == "USERPROFILE" {
+            continue;
+        }
+        if key.starts_with("SANDBOX_AGENT_") || key.starts_with("OPENCODE_COMPAT_") {
+            env.insert(key.clone(), rewrite_localhost_url(&key, &value));
+        }
+    }
+
+    if let Some(token) = auth.token.as_ref() {
+        env.insert("SANDBOX_AGENT_TEST_AUTH_TOKEN".to_string(), token.clone());
+    }
+
+    if options.replace_path {
+        env.insert(
+            "PATH".to_string(),
+            options.env.get("PATH").cloned().unwrap_or_default(),
+        );
+    } else {
+        let mut custom_path_entries =
+            custom_path_entries(layout.install_dir.parent().expect("install base"));
+        custom_path_entries.extend(explicit_path_entries());
+        custom_path_entries.extend(
+            options
+                .extra_paths
+                .iter()
+                .filter(|path| path.is_absolute() && path.exists())
+                .cloned(),
+        );
+        custom_path_entries.sort();
+        custom_path_entries.dedup();
+
+        if custom_path_entries.is_empty() {
+            env.insert("PATH".to_string(), DEFAULT_PATH.to_string());
+        } else {
+            let joined = custom_path_entries
+                .iter()
+                .map(|path| path.to_string_lossy().to_string())
+                .collect::<Vec<_>>()
+                .join(":");
+            env.insert("PATH".to_string(), format!("{joined}:{DEFAULT_PATH}"));
+        }
+    }
+
+    for (key, value) in &options.env {
+        if key == "PATH" {
+            continue;
+        }
+        env.insert(key.clone(), rewrite_localhost_url(key, value));
+    }
+
+    env
+}
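`build_env`'s PATH branch above sorts the custom entries, dedupes them, then appends the container's stock PATH so standard tools still resolve. A small Python model of that composition (illustrative, not the test harness itself):

```python
DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

def compose_path(custom_entries):
    """Mirror of the PATH branch: sort + dedupe custom entries, then
    append DEFAULT_PATH; with no custom entries, fall back entirely."""
    entries = sorted(set(custom_entries))
    if not entries:
        return DEFAULT_PATH
    return ":".join(entries) + ":" + DEFAULT_PATH
```

Putting the custom entries first means a test-provided binary (e.g. a fake `Xvfb` in the install dir) shadows any same-named binary baked into the image, which is what the desktop fake-state tests rely on.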
+
+fn build_mounts(root: &Path, env: &BTreeMap<String, String>) -> Vec<PathBuf> {
+    let mut mounts = BTreeSet::new();
+    mounts.insert(root.to_path_buf());
+
+    for key in [
+        "HOME",
+        "USERPROFILE",
+        "XDG_DATA_HOME",
+        "XDG_STATE_HOME",
+        "APPDATA",
+        "LOCALAPPDATA",
+        "SANDBOX_AGENT_DESKTOP_FAKE_STATE_DIR",
+    ] {
+        if let Some(value) = env.get(key) {
+            let path = PathBuf::from(value);
+            if path.is_absolute() {
+                mounts.insert(path);
+            }
+        }
+    }
+
+    if let Some(path_value) = env.get("PATH") {
+        for entry in path_value.split(':') {
+            if entry.is_empty() || STANDARD_PATHS.contains(&entry) {
+                continue;
+            }
+            let path = PathBuf::from(entry);
+            if path.is_absolute() && path.exists() {
+                mounts.insert(path);
+            }
+        }
+    }
+
+    mounts.into_iter().collect()
+}
+
+fn run_container(
+    container_id: &str,
+    image: &str,
+    mounts: &[PathBuf],
+    env: &BTreeMap<String, String>,
+    auth: &AuthConfig,
+) -> String {
+    let mut args = vec![
+        "run".to_string(),
+        "-d".to_string(),
+        "--rm".to_string(),
+        "--name".to_string(),
+        container_id.to_string(),
+        "-p".to_string(),
+        format!("127.0.0.1::{CONTAINER_PORT}"),
+    ];
+
+    #[cfg(unix)]
+    {
+        args.push("--user".to_string());
+        args.push(format!("{}:{}", unsafe { libc::geteuid() }, unsafe {
+            libc::getegid()
+        }));
+    }
+
+    if cfg!(target_os = "linux") {
+        args.push("--add-host".to_string());
+        args.push("host.docker.internal:host-gateway".to_string());
+    }
+
+    for mount in mounts {
+        args.push("-v".to_string());
+        args.push(format!("{}:{}", mount.display(), mount.display()));
+    }
+
+    for (key, value) in env {
+        args.push("-e".to_string());
+        args.push(format!("{key}={value}"));
+    }
+
+    args.push(image.to_string());
+    args.push("server".to_string());
+    args.push("--host".to_string());
+    args.push("0.0.0.0".to_string());
+    args.push("--port".to_string());
+    args.push(CONTAINER_PORT.to_string());
+    match auth.token.as_ref() {
+        Some(token) => {
+            args.push("--token".to_string());
+            args.push(token.clone());
+        }
+        None => args.push("--no-token".to_string()),
+    }
+
+    let output = Command::new(docker_bin())
+        .args(&args)
+        .output()
+        .expect("start docker test container");
|
||||
if !output.status.success() {
|
||||
panic!(
|
||||
"failed to start docker test container: {}",
|
||||
String::from_utf8_lossy(&output.stderr)
|
||||
);
|
||||
}
|
||||
|
||||
let port_output = Command::new(docker_bin())
|
||||
.args(["port", container_id, &format!("{CONTAINER_PORT}/tcp")])
|
||||
.output()
|
||||
.expect("resolve mapped docker port");
|
||||
if !port_output.status.success() {
|
||||
panic!(
|
||||
"failed to resolve docker test port: {}",
|
||||
String::from_utf8_lossy(&port_output.stderr)
|
||||
);
|
||||
}
|
||||
|
||||
let mapping = String::from_utf8(port_output.stdout)
|
||||
.expect("docker port utf8")
|
||||
.trim()
|
||||
.to_string();
|
||||
let host_port = mapping.rsplit(':').next().expect("mapped host port").trim();
|
||||
let base_url = format!("http://127.0.0.1:{host_port}");
|
||||
wait_for_health(&base_url, auth.token.as_deref());
|
||||
base_url
|
||||
}
|
||||
|
||||
fn wait_for_health(base_url: &str, token: Option<&str>) {
|
||||
let started = SystemTime::now();
|
||||
loop {
|
||||
if probe_health(base_url, token) {
|
||||
return;
|
||||
}
|
||||
|
||||
if started
|
||||
.elapsed()
|
||||
.unwrap_or_else(|_| Duration::from_secs(0))
|
||||
.gt(&Duration::from_secs(30))
|
||||
{
|
||||
panic!("timed out waiting for sandbox-agent docker test server");
|
||||
}
|
||||
thread::sleep(Duration::from_millis(200));
|
||||
}
|
||||
}
|
||||
|
||||
fn probe_health(base_url: &str, token: Option<&str>) -> bool {
|
||||
let address = base_url.strip_prefix("http://").unwrap_or(base_url);
|
||||
let mut stream = match TcpStream::connect(address) {
|
||||
Ok(stream) => stream,
|
||||
Err(_) => return false,
|
||||
};
|
||||
let _ = stream.set_read_timeout(Some(Duration::from_secs(2)));
|
||||
let _ = stream.set_write_timeout(Some(Duration::from_secs(2)));
|
||||
|
||||
let mut request =
|
||||
format!("GET /v1/health HTTP/1.1\r\nHost: {address}\r\nConnection: close\r\n");
|
||||
if let Some(token) = token {
|
||||
request.push_str(&format!("Authorization: Bearer {token}\r\n"));
|
||||
}
|
||||
request.push_str("\r\n");
|
||||
|
||||
if stream.write_all(request.as_bytes()).is_err() {
|
||||
return false;
|
||||
}
|
||||
|
||||
let mut response = String::new();
|
||||
if stream.read_to_string(&mut response).is_err() {
|
||||
return false;
|
||||
}
|
||||
|
||||
response.starts_with("HTTP/1.1 200") || response.starts_with("HTTP/1.0 200")
|
||||
}
|
||||
|
||||
fn custom_path_entries(root: &Path) -> Vec<PathBuf> {
|
||||
let mut entries = Vec::new();
|
||||
if let Some(value) = std::env::var_os("PATH") {
|
||||
for entry in std::env::split_paths(&value) {
|
||||
if !entry.exists() {
|
||||
continue;
|
||||
}
|
||||
if entry.starts_with(root) || entry.starts_with(std::env::temp_dir()) {
|
||||
entries.push(entry);
|
||||
}
|
||||
}
|
||||
}
|
||||
entries.sort();
|
||||
entries.dedup();
|
||||
entries
|
||||
}
|
||||
|
||||
fn explicit_path_entries() -> Vec<PathBuf> {
|
||||
let mut entries = Vec::new();
|
||||
if let Some(value) = std::env::var_os("SANDBOX_AGENT_TEST_EXTRA_PATHS") {
|
||||
for entry in std::env::split_paths(&value) {
|
||||
if entry.is_absolute() && entry.exists() {
|
||||
entries.push(entry);
|
||||
}
|
||||
}
|
||||
}
|
||||
entries
|
||||
}
|
||||
|
||||
fn rewrite_localhost_url(key: &str, value: &str) -> String {
|
||||
if key.ends_with("_URL") || key.ends_with("_URI") {
|
||||
return value
|
||||
.replace("http://127.0.0.1", "http://host.docker.internal")
|
||||
.replace("http://localhost", "http://host.docker.internal");
|
||||
}
|
||||
value.to_string()
|
||||
}
|
||||
|
||||
fn unique_container_id() -> String {
|
||||
let millis = SystemTime::now()
|
||||
.duration_since(UNIX_EPOCH)
|
||||
.map(|value| value.as_millis())
|
||||
.unwrap_or(0);
|
||||
let counter = CONTAINER_COUNTER.fetch_add(1, Ordering::Relaxed);
|
||||
format!(
|
||||
"sandbox-agent-test-{}-{millis}-{counter}",
|
||||
std::process::id()
|
||||
)
|
||||
}
|
||||
|
||||
fn repo_root() -> PathBuf {
|
||||
PathBuf::from(env!("CARGO_MANIFEST_DIR"))
|
||||
.join("../../..")
|
||||
.canonicalize()
|
||||
.expect("repo root")
|
||||
}
|
||||
|
||||
fn docker_bin() -> &'static Path {
|
||||
DOCKER_BIN
|
||||
.get_or_init(|| {
|
||||
if let Some(value) = std::env::var_os("SANDBOX_AGENT_TEST_DOCKER_BIN") {
|
||||
let path = PathBuf::from(value);
|
||||
if path.exists() {
|
||||
return path;
|
||||
}
|
||||
}
|
||||
|
||||
for candidate in [
|
||||
"/usr/local/bin/docker",
|
||||
"/opt/homebrew/bin/docker",
|
||||
"/usr/bin/docker",
|
||||
] {
|
||||
let path = PathBuf::from(candidate);
|
||||
if path.exists() {
|
||||
return path;
|
||||
}
|
||||
}
|
||||
|
||||
PathBuf::from("docker")
|
||||
})
|
||||
.as_path()
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
struct EnvVarGuard {
|
||||
key: &'static str,
|
||||
old: Option<std::ffi::OsString>,
|
||||
}
|
||||
|
||||
impl EnvVarGuard {
|
||||
fn set(key: &'static str, value: &Path) -> Self {
|
||||
let old = std::env::var_os(key);
|
||||
std::env::set_var(key, value);
|
||||
Self { key, old }
|
||||
}
|
||||
}
|
||||
|
||||
impl Drop for EnvVarGuard {
|
||||
fn drop(&mut self) {
|
||||
match self.old.as_ref() {
|
||||
Some(value) => std::env::set_var(self.key, value),
|
||||
None => std::env::remove_var(self.key),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
#[serial]
|
||||
fn build_env_keeps_test_local_xdg_state_home() {
|
||||
let root = tempfile::tempdir().expect("create docker support tempdir");
|
||||
let host_state = tempfile::tempdir().expect("create host xdg state tempdir");
|
||||
let _guard = EnvVarGuard::set("XDG_STATE_HOME", host_state.path());
|
||||
|
||||
let layout = TestLayout::new(root.path());
|
||||
layout.create();
|
||||
|
||||
let env = build_env(&layout, &AuthConfig::disabled(), &TestAppOptions::default());
|
||||
assert_eq!(
|
||||
env.get("XDG_STATE_HOME"),
|
||||
Some(&layout.xdg_state_home.to_string_lossy().to_string())
|
||||
);
|
||||
}
|
||||
}
|
||||
|
|
@@ -1,37 +1,14 @@
 use std::fs;
 use std::path::Path;
 
-use axum::body::Body;
-use axum::http::{Method, Request, StatusCode};
 use futures::StreamExt;
-use http_body_util::BodyExt;
-use sandbox_agent::router::{build_router, AppState, AuthConfig};
-use sandbox_agent_agent_management::agents::AgentManager;
+use reqwest::{Method, StatusCode};
+use sandbox_agent::router::AuthConfig;
 use serde_json::{json, Value};
-use tempfile::TempDir;
-use tower::util::ServiceExt;
 
-struct TestApp {
-    app: axum::Router,
-    _install_dir: TempDir,
-}
-
-impl TestApp {
-    fn with_setup<F>(setup: F) -> Self
-    where
-        F: FnOnce(&Path),
-    {
-        let install_dir = tempfile::tempdir().expect("create temp install dir");
-        setup(install_dir.path());
-        let manager = AgentManager::new(install_dir.path()).expect("create agent manager");
-        let state = AppState::new(AuthConfig::disabled(), manager);
-        let app = build_router(state);
-        Self {
-            app,
-            _install_dir: install_dir,
-        }
-    }
-}
+#[path = "support/docker.rs"]
+mod docker_support;
+use docker_support::TestApp;
 
 fn write_executable(path: &Path, script: &str) {
     fs::write(path, script).expect("write executable");
@@ -101,28 +78,29 @@ fn setup_stub_agent_process_only(install_dir: &Path, agent: &str) {
 }
 
 async fn send_request(
-    app: &axum::Router,
+    app: &docker_support::DockerApp,
     method: Method,
     uri: &str,
     body: Option<Value>,
 ) -> (StatusCode, Vec<u8>) {
-    let mut builder = Request::builder().method(method).uri(uri);
-    let request_body = if let Some(body) = body {
-        builder = builder.header("content-type", "application/json");
-        Body::from(body.to_string())
+    let client = reqwest::Client::new();
+    let response = if let Some(body) = body {
+        client
+            .request(method, app.http_url(uri))
+            .header("content-type", "application/json")
+            .body(body.to_string())
+            .send()
+            .await
+            .expect("request handled")
     } else {
-        Body::empty()
+        client
+            .request(method, app.http_url(uri))
+            .send()
+            .await
+            .expect("request handled")
    };
 
-    let request = builder.body(request_body).expect("build request");
-    let response = app.clone().oneshot(request).await.expect("request handled");
     let status = response.status();
-    let bytes = response
-        .into_body()
-        .collect()
-        .await
-        .expect("collect body")
-        .to_bytes();
+    let bytes = response.bytes().await.expect("collect body");
 
     (status, bytes.to_vec())
 }
@@ -145,7 +123,7 @@ async fn agent_process_matrix_smoke_and_jsonrpc_conformance() {
         .chain(agent_process_only_agents.iter())
         .copied()
         .collect();
-    let test_app = TestApp::with_setup(|install_dir| {
+    let test_app = TestApp::with_setup(AuthConfig::disabled(), |install_dir| {
         for agent in native_agents {
             setup_stub_artifacts(install_dir, agent);
         }
@@ -201,21 +179,15 @@
     assert_eq!(new_json["id"], 2, "{agent}: session/new id");
     assert_eq!(new_json["result"]["echoedMethod"], "session/new");
 
-    let request = Request::builder()
-        .method(Method::GET)
-        .uri(format!("/v1/acp/{agent}-server"))
-        .body(Body::empty())
-        .expect("build sse request");
-
-    let response = test_app
-        .app
-        .clone()
-        .oneshot(request)
+    let response = reqwest::Client::new()
+        .get(test_app.app.http_url(&format!("/v1/acp/{agent}-server")))
+        .header("accept", "text/event-stream")
+        .send()
         .await
         .expect("sse response");
     assert_eq!(response.status(), StatusCode::OK);
 
-    let mut stream = response.into_body().into_data_stream();
+    let mut stream = response.bytes_stream();
     let chunk = tokio::time::timeout(std::time::Duration::from_secs(5), async move {
         while let Some(item) = stream.next().await {
             let bytes = item.expect("sse chunk");
@@ -1,128 +1,19 @@
 use std::fs;
 use std::io::{Read, Write};
-use std::net::{SocketAddr, TcpListener, TcpStream};
+use std::net::{TcpListener, TcpStream};
 use std::path::Path;
 use std::time::Duration;
 
-use axum::body::Body;
-use axum::http::{header, HeaderMap, Method, Request, StatusCode};
-use axum::Router;
 use futures::StreamExt;
-use http_body_util::BodyExt;
-use sandbox_agent::router::{build_router, AppState, AuthConfig};
-use sandbox_agent_agent_management::agents::AgentManager;
+use reqwest::header::{self, HeaderMap, HeaderName, HeaderValue};
+use reqwest::{Method, StatusCode};
+use sandbox_agent::router::AuthConfig;
 use serde_json::{json, Value};
 use serial_test::serial;
-use tempfile::TempDir;
-use tokio::sync::oneshot;
-use tokio::task::JoinHandle;
-use tower::util::ServiceExt;
 
-struct TestApp {
-    app: Router,
-    install_dir: TempDir,
-}
-
-impl TestApp {
-    fn new(auth: AuthConfig) -> Self {
-        Self::with_setup(auth, |_| {})
-    }
-
-    fn with_setup<F>(auth: AuthConfig, setup: F) -> Self
-    where
-        F: FnOnce(&Path),
-    {
-        let install_dir = tempfile::tempdir().expect("create temp install dir");
-        setup(install_dir.path());
-        let manager = AgentManager::new(install_dir.path()).expect("create agent manager");
-        let state = AppState::new(auth, manager);
-        let app = build_router(state);
-        Self { app, install_dir }
-    }
-
-    fn install_path(&self) -> &Path {
-        self.install_dir.path()
-    }
-}
-
-struct EnvVarGuard {
-    key: &'static str,
-    previous: Option<std::ffi::OsString>,
-}
-
-struct LiveServer {
-    address: SocketAddr,
-    shutdown_tx: Option<oneshot::Sender<()>>,
-    task: JoinHandle<()>,
-}
-
-impl LiveServer {
-    async fn spawn(app: Router) -> Self {
-        let listener = tokio::net::TcpListener::bind("127.0.0.1:0")
-            .await
-            .expect("bind live server");
-        let address = listener.local_addr().expect("live server address");
-        let (shutdown_tx, shutdown_rx) = oneshot::channel::<()>();
-
-        let task = tokio::spawn(async move {
-            let server =
-                axum::serve(listener, app.into_make_service()).with_graceful_shutdown(async {
-                    let _ = shutdown_rx.await;
-                });
-
-            let _ = server.await;
-        });
-
-        Self {
-            address,
-            shutdown_tx: Some(shutdown_tx),
-            task,
-        }
-    }
-
-    fn http_url(&self, path: &str) -> String {
-        format!("http://{}{}", self.address, path)
-    }
-
-    fn ws_url(&self, path: &str) -> String {
-        format!("ws://{}{}", self.address, path)
-    }
-
-    async fn shutdown(mut self) {
-        if let Some(shutdown_tx) = self.shutdown_tx.take() {
-            let _ = shutdown_tx.send(());
-        }
-
-        let _ = tokio::time::timeout(Duration::from_secs(3), async {
-            let _ = self.task.await;
-        })
-        .await;
-    }
-}
-
-impl EnvVarGuard {
-    fn set(key: &'static str, value: &str) -> Self {
-        let previous = std::env::var_os(key);
-        std::env::set_var(key, value);
-        Self { key, previous }
-    }
-
-    fn set_os(key: &'static str, value: &std::ffi::OsStr) -> Self {
-        let previous = std::env::var_os(key);
-        std::env::set_var(key, value);
-        Self { key, previous }
-    }
-}
-
-impl Drop for EnvVarGuard {
-    fn drop(&mut self) {
-        if let Some(previous) = self.previous.as_ref() {
-            std::env::set_var(self.key, previous);
-        } else {
-            std::env::remove_var(self.key);
-        }
-    }
-}
+#[path = "support/docker.rs"]
+mod docker_support;
+use docker_support::{LiveServer, TestApp};
 
 fn write_executable(path: &Path, script: &str) {
     fs::write(path, script).expect("write executable");
@@ -168,17 +59,18 @@ exit 0
 }
 
 fn serve_registry_once(document: Value) -> String {
-    let listener = TcpListener::bind("127.0.0.1:0").expect("bind registry server");
-    let address = listener.local_addr().expect("registry address");
+    let listener = TcpListener::bind("0.0.0.0:0").expect("bind registry server");
+    let port = listener.local_addr().expect("registry address").port();
     let body = document.to_string();
 
-    std::thread::spawn(move || {
-        if let Ok((mut stream, _)) = listener.accept() {
-            respond_json(&mut stream, &body);
+    std::thread::spawn(move || loop {
+        match listener.accept() {
+            Ok((mut stream, _)) => respond_json(&mut stream, &body),
+            Err(_) => break,
         }
     });
 
-    format!("http://{address}/registry.json")
+    format!("http://127.0.0.1:{port}/registry.json")
 }
 
 fn respond_json(stream: &mut TcpStream, body: &str) {
@@ -196,74 +88,96 @@ fn respond_json(stream: &mut TcpStream, body: &str) {
 }
 
 async fn send_request(
-    app: &Router,
+    app: &docker_support::DockerApp,
     method: Method,
     uri: &str,
     body: Option<Value>,
     headers: &[(&str, &str)],
 ) -> (StatusCode, HeaderMap, Vec<u8>) {
-    let mut builder = Request::builder().method(method).uri(uri);
+    let client = reqwest::Client::new();
+    let mut builder = client.request(method, app.http_url(uri));
     for (name, value) in headers {
-        builder = builder.header(*name, *value);
+        let header_name = HeaderName::from_bytes(name.as_bytes()).expect("header name");
+        let header_value = HeaderValue::from_str(value).expect("header value");
+        builder = builder.header(header_name, header_value);
     }
 
-    let request_body = if let Some(body) = body {
-        builder = builder.header(header::CONTENT_TYPE, "application/json");
-        Body::from(body.to_string())
+    let response = if let Some(body) = body {
+        builder
+            .header(header::CONTENT_TYPE, "application/json")
+            .body(body.to_string())
+            .send()
+            .await
+            .expect("request handled")
     } else {
-        Body::empty()
+        builder.send().await.expect("request handled")
     };
 
-    let request = builder.body(request_body).expect("build request");
-    let response = app.clone().oneshot(request).await.expect("request handled");
     let status = response.status();
     let headers = response.headers().clone();
-    let bytes = response
-        .into_body()
-        .collect()
-        .await
-        .expect("collect body")
-        .to_bytes();
+    let bytes = response.bytes().await.expect("collect body");
 
     (status, headers, bytes.to_vec())
 }
 
 async fn send_request_raw(
-    app: &Router,
+    app: &docker_support::DockerApp,
     method: Method,
     uri: &str,
     body: Option<Vec<u8>>,
     headers: &[(&str, &str)],
     content_type: Option<&str>,
 ) -> (StatusCode, HeaderMap, Vec<u8>) {
-    let mut builder = Request::builder().method(method).uri(uri);
+    let client = reqwest::Client::new();
+    let mut builder = client.request(method, app.http_url(uri));
     for (name, value) in headers {
-        builder = builder.header(*name, *value);
+        let header_name = HeaderName::from_bytes(name.as_bytes()).expect("header name");
+        let header_value = HeaderValue::from_str(value).expect("header value");
+        builder = builder.header(header_name, header_value);
    }
 
-    let request_body = if let Some(body) = body {
+    let response = if let Some(body) = body {
         if let Some(content_type) = content_type {
             builder = builder.header(header::CONTENT_TYPE, content_type);
         }
-        Body::from(body)
+        builder.body(body).send().await.expect("request handled")
     } else {
-        Body::empty()
+        builder.send().await.expect("request handled")
     };
 
-    let request = builder.body(request_body).expect("build request");
-    let response = app.clone().oneshot(request).await.expect("request handled");
     let status = response.status();
     let headers = response.headers().clone();
-    let bytes = response
-        .into_body()
-        .collect()
-        .await
-        .expect("collect body")
-        .to_bytes();
+    let bytes = response.bytes().await.expect("collect body");
 
     (status, headers, bytes.to_vec())
 }
 
+async fn launch_desktop_focus_window(app: &docker_support::DockerApp, display: &str) {
+    let command = r#"nohup xterm -geometry 80x24+40+40 -title 'Sandbox Desktop Test' -e sh -lc 'sleep 60' >/tmp/sandbox-agent-xterm.log 2>&1 < /dev/null & for _ in $(seq 1 50); do wid="$(xdotool search --onlyvisible --name 'Sandbox Desktop Test' 2>/dev/null | head -n 1 || true)"; if [ -n "$wid" ]; then xdotool windowactivate "$wid"; exit 0; fi; sleep 0.1; done; exit 1"#;
+    let (status, _, body) = send_request(
+        app,
+        Method::POST,
+        "/v1/processes/run",
+        Some(json!({
+            "command": "sh",
+            "args": ["-lc", command],
+            "env": {
+                "DISPLAY": display,
+            },
+            "timeoutMs": 10_000
+        })),
+        &[],
+    )
+    .await;
+
+    assert_eq!(
+        status,
+        StatusCode::OK,
+        "unexpected desktop focus window launch response: {}",
+        String::from_utf8_lossy(&body)
+    );
+    let parsed = parse_json(&body);
+    assert_eq!(parsed["exitCode"], 0);
+}
+
 fn parse_json(bytes: &[u8]) -> Value {
     if bytes.is_empty() {
         Value::Null
@@ -284,7 +198,7 @@ fn initialize_payload() -> Value {
     })
 }
 
-async fn bootstrap_server(app: &Router, server_id: &str, agent: &str) {
+async fn bootstrap_server(app: &docker_support::DockerApp, server_id: &str, agent: &str) {
     let initialize = initialize_payload();
     let (status, _, _body) = send_request(
         app,
@@ -297,17 +211,17 @@ async fn bootstrap_server(app: &Router, server_id: &str, agent: &str) {
     assert_eq!(status, StatusCode::OK);
 }
 
-async fn read_first_sse_data(app: &Router, server_id: &str) -> String {
-    let request = Request::builder()
-        .method(Method::GET)
-        .uri(format!("/v1/acp/{server_id}"))
-        .body(Body::empty())
-        .expect("build request");
-
-    let response = app.clone().oneshot(request).await.expect("sse response");
+async fn read_first_sse_data(app: &docker_support::DockerApp, server_id: &str) -> String {
+    let client = reqwest::Client::new();
+    let response = client
+        .get(app.http_url(&format!("/v1/acp/{server_id}")))
+        .header("accept", "text/event-stream")
+        .send()
+        .await
+        .expect("sse response");
     assert_eq!(response.status(), StatusCode::OK);
 
-    let mut stream = response.into_body().into_data_stream();
+    let mut stream = response.bytes_stream();
     tokio::time::timeout(Duration::from_secs(5), async move {
         while let Some(chunk) = stream.next().await {
             let bytes = chunk.expect("stream chunk");
@@ -323,21 +237,21 @@ async fn read_first_sse_data(app: &Router, server_id: &str) -> String {
 }
 
 async fn read_first_sse_data_with_last_id(
-    app: &Router,
+    app: &docker_support::DockerApp,
     server_id: &str,
     last_event_id: u64,
 ) -> String {
-    let request = Request::builder()
-        .method(Method::GET)
-        .uri(format!("/v1/acp/{server_id}"))
+    let client = reqwest::Client::new();
+    let response = client
+        .get(app.http_url(&format!("/v1/acp/{server_id}")))
+        .header("accept", "text/event-stream")
         .header("last-event-id", last_event_id.to_string())
-        .body(Body::empty())
-        .expect("build request");
-
-    let response = app.clone().oneshot(request).await.expect("sse response");
+        .send()
+        .await
+        .expect("sse response");
     assert_eq!(response.status(), StatusCode::OK);
 
-    let mut stream = response.into_body().into_data_stream();
+    let mut stream = response.bytes_stream();
     tokio::time::timeout(Duration::from_secs(5), async move {
         while let Some(chunk) = stream.next().await {
             let bytes = chunk.expect("stream chunk");
@@ -375,5 +289,7 @@ mod acp_transport;
 mod config_endpoints;
 #[path = "v1_api/control_plane.rs"]
 mod control_plane;
+#[path = "v1_api/desktop.rs"]
+mod desktop;
 #[path = "v1_api/processes.rs"]
 mod processes;
@@ -22,8 +22,9 @@ async fn mcp_config_requires_directory_and_name() {
 #[tokio::test]
 async fn mcp_config_crud_round_trip() {
     let test_app = TestApp::new(AuthConfig::disabled());
-    let project = tempfile::tempdir().expect("tempdir");
-    let directory = project.path().to_string_lossy().to_string();
+    let project = test_app.root_path().join("mcp-config-project");
+    fs::create_dir_all(&project).expect("create project dir");
+    let directory = project.to_string_lossy().to_string();
 
     let entry = json!({
         "type": "local",
@@ -99,8 +100,9 @@ async fn skills_config_requires_directory_and_name() {
 #[tokio::test]
 async fn skills_config_crud_round_trip() {
     let test_app = TestApp::new(AuthConfig::disabled());
-    let project = tempfile::tempdir().expect("tempdir");
-    let directory = project.path().to_string_lossy().to_string();
+    let project = test_app.root_path().join("skills-config-project");
+    fs::create_dir_all(&project).expect("create project dir");
+    let directory = project.to_string_lossy().to_string();
 
     let entry = json!({
         "sources": [
@@ -1,4 +1,5 @@
 use super::*;
+use std::collections::BTreeMap;
 
 #[tokio::test]
 async fn v1_health_removed_legacy_and_opencode_unmounted() {
@@ -137,10 +138,19 @@ async fn v1_filesystem_endpoints_round_trip() {
 #[tokio::test]
 #[serial]
 async fn require_preinstall_blocks_missing_agent() {
-    let test_app = {
-        let _preinstall = EnvVarGuard::set("SANDBOX_AGENT_REQUIRE_PREINSTALL", "true");
-        TestApp::new(AuthConfig::disabled())
-    };
+    let mut env = BTreeMap::new();
+    env.insert(
+        "SANDBOX_AGENT_REQUIRE_PREINSTALL".to_string(),
+        "true".to_string(),
+    );
+    let test_app = TestApp::with_options(
+        AuthConfig::disabled(),
+        docker_support::TestAppOptions {
+            env,
+            ..Default::default()
+        },
+        |_| {},
+    );
 
     let (status, _, body) = send_request(
         &test_app.app,
@@ -176,20 +186,26 @@ async fn lazy_install_runs_on_first_bootstrap() {
         ]
     }));
 
-    let _registry = EnvVarGuard::set("SANDBOX_AGENT_ACP_REGISTRY_URL", &registry_url);
-    let test_app = TestApp::with_setup(AuthConfig::disabled(), |install_path| {
-        fs::create_dir_all(install_path.join("agent_processes"))
-            .expect("create agent processes dir");
-        write_executable(&install_path.join("codex"), "#!/usr/bin/env sh\nexit 0\n");
-        fs::create_dir_all(install_path.join("bin")).expect("create bin dir");
-        write_fake_npm(&install_path.join("bin").join("npm"));
-    });
+    let helper_bin_root = tempfile::tempdir().expect("helper bin tempdir");
+    let helper_bin = helper_bin_root.path().join("bin");
+    fs::create_dir_all(&helper_bin).expect("create helper bin dir");
+    write_fake_npm(&helper_bin.join("npm"));
 
-    let original_path = std::env::var_os("PATH").unwrap_or_default();
-    let mut paths = vec![test_app.install_path().join("bin")];
-    paths.extend(std::env::split_paths(&original_path));
-    let merged_path = std::env::join_paths(paths).expect("join PATH");
-    let _path_guard = EnvVarGuard::set_os("PATH", merged_path.as_os_str());
+    let mut env = BTreeMap::new();
+    env.insert("SANDBOX_AGENT_ACP_REGISTRY_URL".to_string(), registry_url);
+    let test_app = TestApp::with_options(
+        AuthConfig::disabled(),
+        docker_support::TestAppOptions {
+            env,
+            extra_paths: vec![helper_bin.clone()],
+            ..Default::default()
+        },
+        |install_path| {
+            fs::create_dir_all(install_path.join("agent_processes"))
+                .expect("create agent processes dir");
+            write_executable(&install_path.join("codex"), "#!/usr/bin/env sh\nexit 0\n");
+        },
+    );
 
     let (status, _, _) = send_request(
         &test_app.app,
494 server/packages/sandbox-agent/tests/v1_api/desktop.rs Normal file
@@ -0,0 +1,494 @@
use super::*;
use futures::{SinkExt, StreamExt};
use serial_test::serial;
use std::collections::BTreeMap;
use tokio_tungstenite::connect_async;
use tokio_tungstenite::tungstenite::Message;

fn png_dimensions(bytes: &[u8]) -> (u32, u32) {
    assert!(bytes.starts_with(b"\x89PNG\r\n\x1a\n"));
    let width = u32::from_be_bytes(bytes[16..20].try_into().expect("png width bytes"));
    let height = u32::from_be_bytes(bytes[20..24].try_into().expect("png height bytes"));
    (width, height)
}

async fn recv_ws_message(
    ws: &mut tokio_tungstenite::WebSocketStream<
        tokio_tungstenite::MaybeTlsStream<tokio::net::TcpStream>,
    >,
) -> Message {
    tokio::time::timeout(Duration::from_secs(5), ws.next())
        .await
        .expect("timed out waiting for websocket frame")
        .expect("websocket stream ended")
        .expect("websocket frame")
}

#[tokio::test]
#[serial]
async fn v1_desktop_status_reports_install_required_when_dependencies_are_missing() {
    let temp = tempfile::tempdir().expect("create empty path tempdir");
    let mut env = BTreeMap::new();
    env.insert(
        "PATH".to_string(),
        temp.path().to_string_lossy().to_string(),
    );

    let test_app = TestApp::with_options(
        AuthConfig::disabled(),
        docker_support::TestAppOptions {
            env,
            replace_path: true,
            ..Default::default()
        },
        |_| {},
    );

    let (status, _, body) =
        send_request(&test_app.app, Method::GET, "/v1/desktop/status", None, &[]).await;

    assert_eq!(status, StatusCode::OK);
    let parsed = parse_json(&body);
    assert_eq!(parsed["state"], "install_required");
    assert!(parsed["missingDependencies"]
        .as_array()
        .expect("missingDependencies array")
        .iter()
        .any(|value| value == "Xvfb"));
    assert_eq!(
        parsed["installCommand"],
        "sandbox-agent install desktop --yes"
    );
}

#[tokio::test]
#[serial]
async fn v1_desktop_lifecycle_and_actions_work_with_real_runtime() {
    let test_app = TestApp::new(AuthConfig::disabled());

    let (status, _, body) = send_request(
        &test_app.app,
        Method::POST,
        "/v1/desktop/start",
        Some(json!({
            "width": 1440,
            "height": 900,
            "dpi": 96
        })),
        &[],
    )
    .await;
    assert_eq!(
        status,
        StatusCode::OK,
        "unexpected start response: {}",
        String::from_utf8_lossy(&body)
    );
    let parsed = parse_json(&body);
    assert_eq!(parsed["state"], "active");
    let display = parsed["display"]
        .as_str()
        .expect("desktop display")
        .to_string();
    assert!(display.starts_with(':'));
    assert_eq!(parsed["resolution"]["width"], 1440);
    assert_eq!(parsed["resolution"]["height"], 900);

    let (status, headers, body) = send_request_raw(
        &test_app.app,
        Method::GET,
        "/v1/desktop/screenshot",
        None,
        &[],
        None,
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    assert_eq!(
        headers
            .get(header::CONTENT_TYPE)
            .and_then(|value| value.to_str().ok()),
        Some("image/png")
    );
    assert!(body.starts_with(b"\x89PNG\r\n\x1a\n"));
    assert_eq!(png_dimensions(&body), (1440, 900));

    let (status, headers, body) = send_request_raw(
        &test_app.app,
        Method::GET,
        "/v1/desktop/screenshot?format=jpeg&quality=50",
        None,
        &[],
        None,
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    assert_eq!(
        headers
            .get(header::CONTENT_TYPE)
            .and_then(|value| value.to_str().ok()),
        Some("image/jpeg")
    );
    assert!(body.starts_with(&[0xff, 0xd8, 0xff]));

    let (status, headers, body) = send_request_raw(
        &test_app.app,
        Method::GET,
        "/v1/desktop/screenshot?scale=0.5",
        None,
        &[],
        None,
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    assert_eq!(
        headers
            .get(header::CONTENT_TYPE)
            .and_then(|value| value.to_str().ok()),
        Some("image/png")
    );
    assert_eq!(png_dimensions(&body), (720, 450));

    let (status, _, body) = send_request_raw(
        &test_app.app,
        Method::GET,
        "/v1/desktop/screenshot/region?x=10&y=20&width=30&height=40",
None,
|
||||
&[],
|
||||
None,
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert!(body.starts_with(b"\x89PNG\r\n\x1a\n"));
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::GET,
|
||||
"/v1/desktop/display/info",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let display_info = parse_json(&body);
|
||||
assert_eq!(display_info["display"], display);
|
||||
assert_eq!(display_info["resolution"]["width"], 1440);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/move",
|
||||
Some(json!({ "x": 400, "y": 300 })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let mouse = parse_json(&body);
|
||||
assert_eq!(mouse["x"], 400);
|
||||
assert_eq!(mouse["y"], 300);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/drag",
|
||||
Some(json!({
|
||||
"startX": 100,
|
||||
"startY": 110,
|
||||
"endX": 220,
|
||||
"endY": 230,
|
||||
"button": "left"
|
||||
})),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let dragged = parse_json(&body);
|
||||
assert_eq!(dragged["x"], 220);
|
||||
assert_eq!(dragged["y"], 230);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/click",
|
||||
Some(json!({
|
||||
"x": 220,
|
||||
"y": 230,
|
||||
"button": "left",
|
||||
"clickCount": 1
|
||||
})),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let clicked = parse_json(&body);
|
||||
assert_eq!(clicked["x"], 220);
|
||||
assert_eq!(clicked["y"], 230);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/down",
|
||||
Some(json!({
|
||||
"x": 220,
|
||||
"y": 230,
|
||||
"button": "left"
|
||||
})),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let mouse_down = parse_json(&body);
|
||||
assert_eq!(mouse_down["x"], 220);
|
||||
assert_eq!(mouse_down["y"], 230);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/move",
|
||||
Some(json!({ "x": 260, "y": 280 })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let moved_while_down = parse_json(&body);
|
||||
assert_eq!(moved_while_down["x"], 260);
|
||||
assert_eq!(moved_while_down["y"], 280);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/up",
|
||||
Some(json!({ "button": "left" })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let mouse_up = parse_json(&body);
|
||||
assert_eq!(mouse_up["x"], 260);
|
||||
assert_eq!(mouse_up["y"], 280);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/mouse/scroll",
|
||||
Some(json!({
|
||||
"x": 220,
|
||||
"y": 230,
|
||||
"deltaY": -3
|
||||
})),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let scrolled = parse_json(&body);
|
||||
assert_eq!(scrolled["x"], 220);
|
||||
assert_eq!(scrolled["y"], 230);
|
||||
|
||||
let (status, _, body) =
|
||||
send_request(&test_app.app, Method::GET, "/v1/desktop/windows", None, &[]).await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert!(parse_json(&body)["windows"].is_array());
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::GET,
|
||||
"/v1/desktop/mouse/position",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let position = parse_json(&body);
|
||||
assert_eq!(position["x"], 220);
|
||||
assert_eq!(position["y"], 230);
|
||||
|
||||
launch_desktop_focus_window(&test_app.app, &display).await;
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/keyboard/type",
|
||||
Some(json!({ "text": "hello world", "delayMs": 5 })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["ok"], true);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/keyboard/press",
|
||||
Some(json!({ "key": "ctrl+l" })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["ok"], true);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/keyboard/press",
|
||||
Some(json!({
|
||||
"key": "l",
|
||||
"modifiers": {
|
||||
"ctrl": true
|
||||
}
|
||||
})),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["ok"], true);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/keyboard/down",
|
||||
Some(json!({ "key": "shift" })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["ok"], true);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/keyboard/up",
|
||||
Some(json!({ "key": "shift" })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["ok"], true);
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/recording/start",
|
||||
Some(json!({ "fps": 8 })),
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let recording = parse_json(&body);
|
||||
let recording_id = recording["id"].as_str().expect("recording id").to_string();
|
||||
assert_eq!(recording["status"], "recording");
|
||||
|
||||
tokio::time::sleep(Duration::from_secs(2)).await;
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/recording/stop",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
let stopped_recording = parse_json(&body);
|
||||
assert_eq!(stopped_recording["id"], recording_id);
|
||||
assert_eq!(stopped_recording["status"], "completed");
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::GET,
|
||||
"/v1/desktop/recordings",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert!(parse_json(&body)["recordings"].is_array());
|
||||
|
||||
let (status, headers, body) = send_request_raw(
|
||||
&test_app.app,
|
||||
Method::GET,
|
||||
&format!("/v1/desktop/recordings/{recording_id}/download"),
|
||||
None,
|
||||
&[],
|
||||
None,
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(
|
||||
headers
|
||||
.get(header::CONTENT_TYPE)
|
||||
.and_then(|value| value.to_str().ok()),
|
||||
Some("video/mp4")
|
||||
);
|
||||
assert!(body.windows(4).any(|window| window == b"ftyp"));
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/stream/start",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["active"], true);
|
||||
|
||||
let (mut ws, _) = connect_async(test_app.app.ws_url("/v1/desktop/stream/ws"))
|
||||
.await
|
||||
.expect("connect desktop stream websocket");
|
||||
|
||||
let ready = recv_ws_message(&mut ws).await;
|
||||
match ready {
|
||||
Message::Text(text) => {
|
||||
let value: Value = serde_json::from_str(&text).expect("desktop stream ready frame");
|
||||
assert_eq!(value["type"], "ready");
|
||||
assert_eq!(value["width"], 1440);
|
||||
assert_eq!(value["height"], 900);
|
||||
}
|
||||
other => panic!("expected text ready frame, got {other:?}"),
|
||||
}
|
||||
|
||||
let frame = recv_ws_message(&mut ws).await;
|
||||
match frame {
|
||||
Message::Binary(bytes) => assert!(bytes.starts_with(&[0xff, 0xd8, 0xff])),
|
||||
other => panic!("expected binary jpeg frame, got {other:?}"),
|
||||
}
|
||||
|
||||
ws.send(Message::Text(
|
||||
json!({
|
||||
"type": "moveMouse",
|
||||
"x": 320,
|
||||
"y": 330
|
||||
})
|
||||
.to_string()
|
||||
.into(),
|
||||
))
|
||||
.await
|
||||
.expect("send desktop stream mouse move");
|
||||
let _ = ws.close(None).await;
|
||||
|
||||
let (status, _, body) = send_request(
|
||||
&test_app.app,
|
||||
Method::POST,
|
||||
"/v1/desktop/stream/stop",
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["active"], false);
|
||||
|
||||
let (status, _, _) = send_request(
|
||||
&test_app.app,
|
||||
Method::DELETE,
|
||||
&format!("/v1/desktop/recordings/{recording_id}"),
|
||||
None,
|
||||
&[],
|
||||
)
|
||||
.await;
|
||||
assert_eq!(status, StatusCode::NO_CONTENT);
|
||||
|
||||
let (status, _, body) =
|
||||
send_request(&test_app.app, Method::POST, "/v1/desktop/stop", None, &[]).await;
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(parse_json(&body)["state"], "inactive");
|
||||
}
|
||||
|
|
@@ -2,6 +2,7 @@ use super::*;
use base64::engine::general_purpose::STANDARD as BASE64;
use base64::Engine;
use futures::{SinkExt, StreamExt};
use serial_test::serial;
use tokio_tungstenite::connect_async;
use tokio_tungstenite::tungstenite::Message;
@@ -277,6 +278,98 @@ async fn v1_process_tty_input_and_logs() {
    assert_eq!(status, StatusCode::NO_CONTENT);
}

#[tokio::test]
#[serial]
async fn v1_processes_owner_filter_separates_user_and_desktop_processes() {
    let test_app = TestApp::new(AuthConfig::disabled());

    let (status, _, body) = send_request(
        &test_app.app,
        Method::POST,
        "/v1/processes",
        Some(json!({
            "command": "sh",
            "args": ["-lc", "sleep 30"],
            "tty": false,
            "interactive": false
        })),
        &[],
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    let user_process_id = parse_json(&body)["id"]
        .as_str()
        .expect("process id")
        .to_string();

    let (status, _, body) = send_request(
        &test_app.app,
        Method::POST,
        "/v1/desktop/start",
        Some(json!({
            "width": 1024,
            "height": 768
        })),
        &[],
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    assert_eq!(parse_json(&body)["state"], "active");

    let (status, _, body) = send_request(
        &test_app.app,
        Method::GET,
        "/v1/processes?owner=user",
        None,
        &[],
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    let user_processes = parse_json(&body)["processes"]
        .as_array()
        .cloned()
        .unwrap_or_default();
    assert!(user_processes
        .iter()
        .any(|process| process["id"] == user_process_id));
    assert!(user_processes
        .iter()
        .all(|process| process["owner"] == "user"));

    let (status, _, body) = send_request(
        &test_app.app,
        Method::GET,
        "/v1/processes?owner=desktop",
        None,
        &[],
    )
    .await;
    assert_eq!(status, StatusCode::OK);
    let desktop_processes = parse_json(&body)["processes"]
        .as_array()
        .cloned()
        .unwrap_or_default();
    assert!(desktop_processes.len() >= 2);
    assert!(desktop_processes
        .iter()
        .all(|process| process["owner"] == "desktop"));

    let (status, _, _) = send_request(
        &test_app.app,
        Method::POST,
        &format!("/v1/processes/{user_process_id}/kill"),
        None,
        &[],
    )
    .await;
    assert_eq!(status, StatusCode::OK);

    let (status, _, body) =
        send_request(&test_app.app, Method::POST, "/v1/desktop/stop", None, &[]).await;
    assert_eq!(status, StatusCode::OK);
    assert_eq!(parse_json(&body)["state"], "inactive");
}

#[tokio::test]
async fn v1_process_not_found_returns_404() {
    let test_app = TestApp::new(AuthConfig::disabled());
@@ -413,22 +506,17 @@ async fn v1_process_logs_follow_sse_streams_entries() {
         .expect("process id")
         .to_string();

-    let request = Request::builder()
-        .method(Method::GET)
-        .uri(format!(
+    let response = reqwest::Client::new()
+        .get(test_app.app.http_url(&format!(
             "/v1/processes/{process_id}/logs?stream=stdout&follow=true"
-        ))
-        .body(Body::empty())
-        .expect("build request");
-    let response = test_app
-        .app
-        .clone()
-        .oneshot(request)
+        )))
+        .header("accept", "text/event-stream")
+        .send()
         .await
         .expect("sse response");
     assert_eq!(response.status(), StatusCode::OK);

-    let mut stream = response.into_body().into_data_stream();
+    let mut stream = response.bytes_stream();
     let chunk = tokio::time::timeout(Duration::from_secs(5), async move {
         while let Some(chunk) = stream.next().await {
             let bytes = chunk.expect("stream chunk");