feat: add turn streaming and inspector updates

Nathan Flurry 2026-01-27 06:18:43 -08:00
parent bf58891edf
commit 34d4f3693e
49 changed files with 4629 additions and 1146 deletions


@ -105,6 +105,7 @@ Each session tracks:
POST /v1/sessions/{sessionId} Create session, auto-install agent
POST /v1/sessions/{id}/messages Spawn agent subprocess, stream output
POST /v1/sessions/{id}/messages/stream Post and stream a single turn
GET /v1/sessions/{id}/events Poll for new events (offset-based)
GET /v1/sessions/{id}/events/sse Subscribe to SSE stream
@ -133,16 +134,30 @@ When a message is sent:
## Agent Execution
Each agent has a different execution model and communication pattern.
Each agent has a different execution model and communication pattern. There are two main architectural patterns:
### Architecture Patterns
**Subprocess Model (Claude, Amp):**
- New process spawned per message/turn
- Process terminates after turn completes
- Multi-turn via CLI resume flags (`--resume`, `--continue`)
- Simple, but pays process-spawn overhead on every turn
**Client/Server Model (OpenCode, Codex):**
- Single long-running server process
- Multiple sessions/threads multiplexed via RPC
- Multi-turn via server-side thread persistence
- Avoids per-turn process startup, so repeated interactions are cheaper
### Overview
| Agent | Execution Model | Binary Source | Session Resume |
|-------|-----------------|---------------|----------------|
| Claude Code | CLI subprocess | GCS (Anthropic) | Yes (`--resume`) |
| Codex | App Server subprocess (JSON-RPC) | GitHub releases | No |
| OpenCode | HTTP server + SSE | GitHub releases | Yes (server-side) |
| Amp | CLI subprocess | GCS (Amp) | Yes (`--continue`) |
| Agent | Architecture | Binary Source | Multi-Turn Method |
|-------|--------------|---------------|-------------------|
| Claude Code | Subprocess (per-turn) | GCS (Anthropic) | `--resume` flag |
| Codex | **Shared Server (JSON-RPC)** | GitHub releases | **Thread persistence** |
| OpenCode | HTTP Server (SSE) | GitHub releases | Server-side sessions |
| Amp | Subprocess (per-turn) | GCS (Amp) | `--continue` flag |
### Claude Code
@ -161,15 +176,25 @@ claude --print --output-format stream-json --verbose \
### Codex
Spawned as a subprocess using the App Server JSON-RPC protocol:
Uses a **shared app-server process** that handles multiple sessions via JSON-RPC over stdio:
```bash
codex app-server
```
- JSON-RPC over stdio (JSONL)
- Uses `initialize`, `thread/start`, and `turn/start` requests
- Approval requests arrive as server JSON-RPC requests
**Daemon flow:**
1. The first Codex session spawns `codex app-server`
2. The daemon performs the `initialize` / `initialized` handshake
3. Each session creation sends `thread/start` and receives a `thread_id`
4. Messages are sent via `turn/start` with the `thread_id`
5. Notifications are routed back to the session by `thread_id`
**Key characteristics:**
- Single process handles all Codex sessions
- JSON-RPC over stdio (JSONL format)
- Thread IDs map to daemon session IDs
- Approval requests arrive as server-to-client JSON-RPC requests
- Process lifetime matches daemon lifetime (not per-turn)
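The thread-to-session routing described above can be sketched as a small lookup table. This is a hypothetical illustration, not the daemon's actual code: `ThreadRouter`, its methods, and the `params.thread_id` message shape are assumptions made for the example, not the documented Codex schema.

```typescript
// Hypothetical sketch: route app-server notifications to daemon sessions by thread id.
// The notification shape (`params.thread_id`) is an assumption for illustration only.
interface Notification {
  method: string;
  params?: { thread_id?: string; [key: string]: unknown };
}

class ThreadRouter {
  private sessionsByThread = new Map<string, string>();
  private delivered = new Map<string, Notification[]>();

  // Record the thread_id returned by `thread/start` for a daemon session.
  register(threadId: string, sessionId: string): void {
    this.sessionsByThread.set(threadId, sessionId);
    this.delivered.set(sessionId, []);
  }

  // Parse one JSONL line read from stdout and deliver it to the owning session.
  // Returns the session id, or null when the line cannot be routed.
  routeLine(line: string): string | null {
    let msg: Notification | null;
    try {
      msg = JSON.parse(line) as Notification | null;
    } catch {
      return null; // not valid JSON
    }
    const threadId = msg?.params?.thread_id;
    if (typeof threadId !== "string") return null;
    const sessionId = this.sessionsByThread.get(threadId);
    if (sessionId === undefined) return null; // unknown thread
    this.delivered.get(sessionId)!.push(msg as Notification);
    return sessionId;
  }

  // Notifications delivered to a session so far.
  eventsFor(sessionId: string): Notification[] {
    return this.delivered.get(sessionId) ?? [];
  }
}
```

Because one process serves every session, a table like this is what keeps one session's notifications from leaking into another's event stream.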
### OpenCode
@ -208,12 +233,21 @@ amp [--execute|--print] [--output-format stream-json] \
### Communication Patterns
**Subprocess agents (Claude, Codex, Amp):**
**Per-turn subprocess agents (Claude, Amp):**
1. Agent CLI spawned with appropriate flags
2. Stdout/stderr read line-by-line
3. Each line parsed as JSON
4. Events converted via `parse_agent_line()` → agent-specific converter
5. Universal events recorded and broadcast to SSE subscribers
6. Process terminated on turn completion
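Steps 2 and 3 above can be sketched as a small line buffer. This is a hypothetical TypeScript illustration, not the daemon's `parse_agent_line()` implementation: `JsonLineBuffer` is a name introduced here, and silently skipping non-JSON lines is an assumption about how stray output might be handled.

```typescript
// Hypothetical sketch: buffer stdout chunks, split on newlines, parse each complete line as JSON.
class JsonLineBuffer {
  private pending = "";

  // Feed one chunk of stdout; returns the JSON values completed by this chunk.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" when the chunk ended with a newline).
    this.pending = lines.pop() ?? "";
    const parsed: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        parsed.push(JSON.parse(trimmed));
      } catch {
        // Assumption: skip non-JSON output such as plain-text log lines.
      }
    }
    return parsed;
  }
}
```

Chunks from a pipe can end in the middle of a line, so the buffer holds the trailing fragment until the rest arrives.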
**Shared stdio server agent (Codex):**
1. Single `codex app-server` process started on first session
2. `initialize`/`initialized` handshake performed once
3. New sessions send `thread/start`, receive `thread_id`
4. Messages sent via `turn/start` with `thread_id`
5. Notifications read from stdout, routed by `thread_id`
6. Process persists across sessions and turns
**HTTP server agent (OpenCode):**
1. Server started on available port (if not running)


@ -131,13 +131,15 @@ timestamps, not ordering.
## Optional raw payloads
If you need provider-level debugging, pass `include_raw=true` when streaming or polling events to
receive the `raw` payload for each event.
If you need provider-level debugging, pass `include_raw=true` when streaming or polling events
(including one-turn streams) to receive the `raw` payload for each event.
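As a rough illustration, the sketch below builds the URL for each of the three event routes with the optional flag. `eventsUrl` and `EventRoute` are hypothetical names introduced here. Only the paths and the `include_raw` query parameter come from the API.

```typescript
// Hypothetical helper: build an events URL, optionally requesting raw provider payloads.
type EventRoute = "events" | "events/sse" | "messages/stream";

function eventsUrl(
  base: string,
  sessionId: string,
  route: EventRoute,
  includeRaw = false,
): string {
  // Strip trailing slashes from the base so the path is not doubled up.
  const path = `${base.replace(/\/+$/, "")}/v1/sessions/${encodeURIComponent(sessionId)}/${route}`;
  return includeRaw ? `${path}?include_raw=true` : path;
}
```

For example, `eventsUrl("http://127.0.0.1:2468", "my-session", "messages/stream", true)` gives the one-turn stream URL with raw payloads enabled.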
## SSE vs polling
## SSE vs polling vs turn streaming
- SSE gives low-latency updates and simplifies streaming UIs.
- Polling is simpler to debug and works in any environment.
- Turn streaming (`POST /v1/sessions/{session_id}/messages/stream`) is a one-shot stream tied to a
single prompt. The stream closes automatically once the turn completes.
All three yield the same event payloads.


@ -67,6 +67,16 @@ sandbox-agent sessions send-message my-session \
```
</details>
<details>
<summary><strong>sessions send-message-stream</strong></summary>
```bash
sandbox-agent sessions send-message-stream my-session \
--message "Summarize the repository" \
--endpoint http://127.0.0.1:2468
```
</details>
<details>
<summary><strong>sessions events</strong></summary>


@ -408,6 +408,60 @@
}
}
},
"/v1/sessions/{session_id}/messages/stream": {
"post": {
"tags": [
"sessions"
],
"operationId": "post_message_stream",
"parameters": [
{
"name": "session_id",
"in": "path",
"description": "Session id",
"required": true,
"schema": {
"type": "string"
}
},
{
"name": "include_raw",
"in": "query",
"description": "Include raw provider payloads",
"required": false,
"schema": {
"type": "boolean",
"nullable": true
}
}
],
"requestBody": {
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/MessageRequest"
}
}
},
"required": true
},
"responses": {
"200": {
"description": "SSE event stream"
},
"404": {
"description": "",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ProblemDetails"
}
}
}
}
}
}
},
"/v1/sessions/{session_id}/permissions/{permission_id}/reply": {
"post": {
"tags": [
@ -1431,6 +1485,15 @@
"daemon"
]
},
"TurnStreamQuery": {
"type": "object",
"properties": {
"includeRaw": {
"type": "boolean",
"nullable": true
}
}
},
"UniversalEvent": {
"type": "object",
"required": [


@ -70,6 +70,15 @@ curl "http://127.0.0.1:2468/v1/sessions/my-session/events/sse?offset=0" \
-H "Authorization: Bearer $SANDBOX_TOKEN"
```
For a single-turn stream (post a message and get one streamed response):
```bash
curl -N -X POST "http://127.0.0.1:2468/v1/sessions/my-session/messages/stream" \
-H "Authorization: Bearer $SANDBOX_TOKEN" \
-H "content-type: application/json" \
-d '{"message":"Hello"}'
```
## 5. CLI shortcuts
The CLI mirrors the HTTP API:
@ -78,4 +87,6 @@ The CLI mirrors the HTTP API:
sandbox-agent sessions create my-session --agent claude --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
sandbox-agent sessions send-message my-session --message "Hello" --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
sandbox-agent sessions send-message-stream my-session --message "Hello" --endpoint http://127.0.0.1:2468 --token "$SANDBOX_TOKEN"
```


@ -86,10 +86,21 @@ for await (const event of client.streamEvents("demo-session", {
The SDK parses `text/event-stream` into `UniversalEvent` objects. If you want full control, use
`getEventsSse()` and parse the stream yourself.
## Stream a single turn
```ts
for await (const event of client.streamTurn("demo-session", { message: "Hello" })) {
console.log(event.type, event.data);
}
```
This method posts the message and streams only the next turn. For manual control, call
`postMessageStream()` and parse the SSE response yourself.
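If you parse the stream yourself, the core of it might look like the sketch below. It assumes the response body has already been read into a string and that each event's `data:` field carries JSON. `parseSse` and `SseEvent` are hypothetical names, and how you obtain the body from `postMessageStream()` is not shown.

```typescript
// Hypothetical helper: parse a complete text/event-stream body into JSON payloads.
interface SseEvent {
  event: string;
  data: unknown;
}

function parseSse(body: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line; normalize CRLF line endings first.
  for (const block of body.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message"; // default event name in the SSE format
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) continue; // nothing to dispatch
    // Assumption: payloads are JSON; JSON.parse throws on anything else.
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}
```

A real client would parse incrementally as chunks arrive rather than waiting for the whole body, but the field handling is the same.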
## Optional raw payloads
Set `includeRaw: true` on `getEvents` or `streamEvents` to include the raw provider payload in
`event.raw`. This is useful for debugging and conversion analysis.
Set `includeRaw: true` on `getEvents`, `streamEvents`, or `streamTurn` to include the raw provider
payload in `event.raw`. This is useful for debugging and conversion analysis.
## Error handling


@ -3,7 +3,7 @@ title: "Telemetry"
description: "Anonymous telemetry collected by sandbox-agent."
---
sandbox-agent sends a small, anonymous telemetry payload on startup to help us understand usage and improve reliability.
sandbox-agent sends a small, anonymous telemetry payload on startup and then every 5 minutes to help us understand usage and improve reliability.
## What gets sent
@ -12,6 +12,7 @@ sandbox-agent sends a small, anonymous telemetry payload on startup to help us u
- Detected sandbox provider (for example: Docker, E2B, Vercel Sandboxes).
Each sandbox gets a random anonymous ID stored on disk so usage can be counted without identifying users.
The last successful send time is also stored on disk, and heartbeats are rate-limited to at most one every 5 minutes.
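The rate limit can be pictured as a single comparison against the stored timestamp. The sketch below is a hypothetical TypeScript illustration, not sandbox-agent's own code: `shouldSendHeartbeat` and `HEARTBEAT_INTERVAL_MS` are names introduced for the example.

```typescript
// Hypothetical sketch of the heartbeat rate limit described above.
const HEARTBEAT_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes

// Returns true when a heartbeat may be sent, given the last successful send time
// (milliseconds since the epoch, or null when nothing has been sent yet).
function shouldSendHeartbeat(lastSentMs: number | null, nowMs: number): boolean {
  if (lastSentMs === null) return true;
  return nowMs - lastSentMs >= HEARTBEAT_INTERVAL_MS;
}
```

Storing the timestamp on disk means a restart inside the 5-minute window does not trigger an extra send.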
## Opting out