feat: desktop computer-use APIs with neko-based streaming

Add desktop runtime management (Xvfb, openbox, dbus), screen capture,
mouse/keyboard input, and video streaming via neko binary extracted
from the m1k1o/neko container. Includes Docker test rig, TypeScript SDK
desktop support, and inspector Desktop tab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Nathan Flurry 2026-03-16 17:56:39 -07:00
parent 3895e34bdb
commit 33821d8660
66 changed files with 13190 additions and 1135 deletions

View file

@ -277,3 +277,13 @@ Update this file continuously during the migration.
- Owner: Unassigned.
- Status: resolved
- Links: `sdks/acp-http-client/src/index.ts`, `sdks/acp-http-client/tests/smoke.test.ts`, `sdks/typescript/tests/integration.test.ts`
- Date: 2026-03-07
- Area: Desktop host/runtime API boundary
- Issue: Desktop automation needed screenshot/input/file-transfer-like host capabilities, but routing it through ACP would have mixed agent protocol semantics with host-owned runtime control and binary payloads.
- Impact: A desktop feature built as ACP methods would blur the division between agent/session behavior and Sandbox Agent host/runtime APIs, and would complicate binary screenshot transport.
- Proposed direction: Ship desktop as first-party HTTP endpoints under `/v1/desktop/*`, keep health/install/remediation in the server runtime, and expose the feature through the SDK and inspector without ACP extension methods.
- Decision: Accepted and implemented for phase one.
- Owner: Unassigned.
- Status: resolved
- Links: `server/packages/sandbox-agent/src/router.rs`, `server/packages/sandbox-agent/src/desktop_runtime.rs`, `sdks/typescript/src/client.ts`, `frontend/packages/inspector/src/components/debug/DesktopTab.tsx`