mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-15 05:02:11 +00:00
feat: add mock server mode for UI testing
parent
f5d1a6383d
commit
d24f983e2c
21 changed files with 1108 additions and 848 deletions
|
|
@ -34,10 +34,12 @@ Universal schema guidance:
|
|||
- Do not make breaking changes to API endpoints.
|
||||
- When changing API routes, ensure the HTTP/SSE test suite has full coverage of every route.
|
||||
- When agent schema changes, ensure API tests cover the new schema and event shapes end-to-end.
|
||||
- When the universal schema changes, update mock-mode events to cover the new fields or event types.
|
||||
- Update `docs/conversion.md` whenever agent-native schema terms, synthetic events, identifier mappings, or conversion logic change.
|
||||
- Never use synthetic data or mocked responses in tests.
|
||||
- Never manually write agent types; always use generated types in `resources/agent-schemas/`. If types are broken, fix the generated types.
|
||||
- The universal schema must provide consistent behavior across providers; avoid requiring frontend/client logic to special-case agents.
|
||||
- The UI must reflect every field in AgentCapabilities; keep it in sync with the README feature matrix and `agent_capabilities_for`.
|
||||
- When parsing agent data, if something is unexpected or does not match the schema, bail out and surface the error rather than trying to continue with partial parsing.
|
||||
- When defining the universal schema, choose the option most compatible with native agent APIs, and add synthetics to fill gaps for other agents.
|
||||
- Use `docs/glossary.md` as the source of truth for universal schema terminology and keep it updated alongside schema changes.
|
||||
|
|
|
|||
|
|
@ -62,6 +62,10 @@ Docs
|
|||
|
||||
### Server
|
||||
|
||||
- Install server
|
||||
- curl (fastest & does not require npm)
|
||||
- npm i -g (slower)
|
||||
- npx (for quick runs)
|
||||
- Run server
|
||||
- Auth
|
||||
|
||||
|
|
@ -71,6 +75,10 @@ Docs
|
|||
|
||||
Docs
|
||||
|
||||
### Tip: Extracting API Keys
|
||||
|
||||
TODO: npx command to get API keys
|
||||
|
||||
## Project Goals
|
||||
|
||||
This project aims to solve 3 problems with agents:
|
||||
|
|
|
|||
|
|
@ -1,5 +1,6 @@
|
|||
## launch
|
||||
|
||||
- examples for daytona, e2b
|
||||
- provide mock data for validating your rendering
|
||||
- provides history with all items, then iterates through all items on a stream
|
||||
- this is a special type of serve function
|
||||
|
|
|
|||
2
bugs.md
|
|
@ -1,2 +0,0 @@
|
|||
- openai extracted credentials do not work
|
||||
|
||||
167
docs/building-chat-ui.mdx
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
---
|
||||
title: "Building a Chat UI"
|
||||
description: "Design a client that renders universal session events consistently across providers."
|
||||
---
|
||||
|
||||
This guide explains how to build a chat UI that works across all agents using the universal event
|
||||
stream.
|
||||
|
||||
## High-level flow
|
||||
|
||||
1. List agents and read their capabilities.
|
||||
2. Create a session for the selected agent.
|
||||
3. Send user messages.
|
||||
4. Subscribe to events (polling or SSE).
|
||||
5. Render items and deltas into a stable message timeline.
|
||||
|
||||
## Use agent capabilities
|
||||
|
||||
Capabilities tell you which features are supported for the selected agent:
|
||||
|
||||
- `toolCalls` and `toolResults` indicate tool execution events.
|
||||
- `questions` and `permissions` indicate human-in-the-loop (HITL) flows.
|
||||
- `planMode` indicates that the agent supports plan-only execution.
|
||||
|
||||
Use these to enable or disable UI affordances (tool panels, approval buttons, etc.).
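
As a rough illustration, the flags can be mapped to UI switches in one place. This is a minimal sketch: the `Capabilities` subset, the `UiAffordances` type, and the `uiAffordances` helper are hypothetical names, and the camelCase keys assume the JSON shape the Inspector reads (`planMode`, `toolCalls`).

```ts
// Hypothetical subset of the capability flags (assumes camelCase JSON keys).
type Capabilities = {
  planMode: boolean;
  permissions: boolean;
  questions: boolean;
  toolCalls: boolean;
  toolResults: boolean;
};

// Hypothetical UI switches derived from the flags above.
type UiAffordances = {
  showToolPanel: boolean;
  showApprovalButtons: boolean;
  showQuestionPrompts: boolean;
  showPlanToggle: boolean;
};

function uiAffordances(caps: Capabilities): UiAffordances {
  return {
    // Show the tool panel if the agent reports either side of tool execution.
    showToolPanel: caps.toolCalls || caps.toolResults,
    showApprovalButtons: caps.permissions,
    showQuestionPrompts: caps.questions,
    showPlanToggle: caps.planMode,
  };
}
```

Deriving all switches in a single function keeps capability checks out of individual components, so a new flag only needs one new mapping.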
|
||||
|
||||
## Event model
|
||||
|
||||
Every event includes:
|
||||
|
||||
- `event_id`, `sequence`, and `time` for identification, ordering, and timestamps, respectively.
|
||||
- `session_id` for the universal session.
|
||||
- `native_session_id` for provider-specific debugging.
|
||||
- `event_type` with one of:
|
||||
- `session.started`, `session.ended`
|
||||
- `item.started`, `item.delta`, `item.completed`
|
||||
- `permission.requested`, `permission.resolved`
|
||||
- `question.requested`, `question.resolved`
|
||||
- `error`, `agent.unparsed`
|
||||
- `data` which holds the payload for the event type.
|
||||
- `synthetic` and `source` to show daemon-generated events.
|
||||
- `raw` (optional) when `include_raw=true`.
|
||||
|
||||
## Rendering items
|
||||
|
||||
Items are emitted in three phases:
|
||||
|
||||
- `item.started`: first snapshot of a message or tool item.
|
||||
- `item.delta`: incremental updates (token streaming or synthetic deltas).
|
||||
- `item.completed`: final snapshot.
|
||||
|
||||
Recommended render flow:
|
||||
|
||||
```ts
|
||||
type ItemState = {
|
||||
item: UniversalItem;
|
||||
deltas: string[];
|
||||
};
|
||||
|
||||
const items = new Map<string, ItemState>();
|
||||
const order: string[] = [];
|
||||
|
||||
function applyEvent(event: UniversalEvent) {
|
||||
if (event.event_type === "item.started") {
|
||||
const item = event.data.item;
|
||||
items.set(item.item_id, { item, deltas: [] });
|
||||
order.push(item.item_id);
|
||||
}
|
||||
|
||||
if (event.event_type === "item.delta") {
|
||||
const { item_id, delta } = event.data;
|
||||
const state = items.get(item_id);
|
||||
if (state) {
|
||||
state.deltas.push(delta);
|
||||
}
|
||||
}
|
||||
|
||||
if (event.event_type === "item.completed") {
|
||||
const item = event.data.item;
|
||||
const state = items.get(item.item_id);
|
||||
if (state) {
|
||||
state.item = item;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
When rendering, combine the item's content with its accumulated deltas. A delta that arrives before
|
||||
its `item.started` event should not happen; treat it as an error rather than silently dropping it, as the `if (state)` guard above does.
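
One possible way to combine snapshot and deltas is sketched below. The `Part`, `Item`, and `ItemState` shapes, the `completed` flag, and both helper names are hypothetical. The sketch also assumes that the `item.completed` snapshot already contains the full text, so deltas are only appended while the item is still in progress.

```ts
// Hypothetical minimal shapes; the real UniversalItem has more fields.
type Part = { type: string; text?: string };
type Item = { item_id: string; content: Part[] };
type ItemState = { item: Item; deltas: string[]; completed: boolean };

// Record a delta, failing loudly if its item was never started.
function applyDelta(items: Map<string, ItemState>, itemId: string, delta: string): void {
  const state = items.get(itemId);
  if (!state) {
    throw new Error(`delta received for unknown item ${itemId}`);
  }
  state.deltas.push(delta);
}

// Build the text to display for one item.
function renderText(state: ItemState): string {
  const snapshot = state.item.content
    .filter((part) => part.type === "text")
    .map((part) => part.text ?? "")
    .join("");
  // Assumption: the completed snapshot is authoritative, so deltas are not re-applied.
  if (state.completed) {
    return snapshot;
  }
  return snapshot + state.deltas.join("");
}
```

If your provider's completed snapshot turns out not to include streamed text, drop the `completed` short-circuit and always append the deltas.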
|
||||
|
||||
## Content parts
|
||||
|
||||
Each `UniversalItem` has `content` parts. Your UI can branch on `part.type`:
|
||||
|
||||
- `text` for normal chat text.
|
||||
- `tool_call` and `tool_result` for tool execution.
|
||||
- `file_ref` for file read/write/patch previews.
|
||||
- `reasoning` if you display public reasoning text.
|
||||
- `status` for progress updates.
|
||||
- `image` for image outputs.
|
||||
|
||||
Treat `item.kind` as the primary layout decision (message vs tool call vs system), and use content
|
||||
parts for the detailed rendering.
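
Branching on `part.type` could look like the sketch below. Only the `type` values come from the list above; the other field names (`name`, `output`, `path`, `label`) and the `describePart` helper are assumptions for illustration, not the generated SDK types.

```ts
// Hypothetical part shapes; only the `type` discriminants are taken from the docs.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string }
  | { type: "tool_result"; output: string }
  | { type: "file_ref"; path: string }
  | { type: "reasoning"; text: string }
  | { type: "status"; label: string }
  | { type: "image"; path: string };

// Produce a plain-text description of a part; a real UI would return components.
function describePart(part: ContentPart): string {
  switch (part.type) {
    case "text":
      return part.text;
    case "tool_call":
      return `[tool call] ${part.name}`;
    case "tool_result":
      return `[tool result] ${part.output}`;
    case "file_ref":
      return `[file] ${part.path}`;
    case "reasoning":
      return `[reasoning] ${part.text}`;
    case "status":
      return `[status] ${part.label}`;
    case "image":
      return `[image] ${part.path}`;
    default: {
      // Compile-time exhaustiveness check: a new part type makes this assignment fail.
      const unhandled: never = part;
      throw new Error(`unhandled content part: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

The `never` assignment means the compiler flags this switch whenever a part type is added to the union, which matches the project's preference for surfacing unexpected data instead of skipping it.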
|
||||
|
||||
## Questions and permissions
|
||||
|
||||
Question and permission events are out-of-band from item flow. Render them as modal or inline UI
|
||||
blocks that must be resolved via:
|
||||
|
||||
- `POST /v1/sessions/{session_id}/questions/{question_id}/reply`
|
||||
- `POST /v1/sessions/{session_id}/questions/{question_id}/reject`
|
||||
- `POST /v1/sessions/{session_id}/permissions/{permission_id}/reply`
|
||||
|
||||
If an agent does not advertise these capabilities, keep those UI controls hidden.
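
The three routes above can be built from a single helper, sketched here. The `Resolution` type and `resolutionPath` function are hypothetical, and request bodies are left out because their shape is not described in this guide.

```ts
// Hypothetical description of which pending request is being resolved.
type Resolution =
  | { kind: "question_reply"; questionId: string }
  | { kind: "question_reject"; questionId: string }
  | { kind: "permission_reply"; permissionId: string };

// Build the POST path for a resolution, escaping IDs so they are safe in a URL.
function resolutionPath(sessionId: string, resolution: Resolution): string {
  const base = `/v1/sessions/${encodeURIComponent(sessionId)}`;
  switch (resolution.kind) {
    case "question_reply":
      return `${base}/questions/${encodeURIComponent(resolution.questionId)}/reply`;
    case "question_reject":
      return `${base}/questions/${encodeURIComponent(resolution.questionId)}/reject`;
    case "permission_reply":
      return `${base}/permissions/${encodeURIComponent(resolution.permissionId)}/reply`;
  }
}
```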
|
||||
|
||||
## Error and unparsed events
|
||||
|
||||
- `error` events are structured failures from the daemon or agent.
|
||||
- `agent.unparsed` indicates the provider emitted something the converter could not parse.
|
||||
|
||||
Treat `agent.unparsed` as a hard failure in development so you can fix converters quickly.
|
||||
|
||||
## Event ordering
|
||||
|
||||
Prefer `sequence` for ordering. It is monotonic for a given session. The `time` field is for
|
||||
timestamps, not ordering.
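
If events can reach the client from more than one fetch (for example, a poll followed by a stream), they can be merged by `sequence`. This is a sketch under two assumptions: `sequence` is a number, and it is unique within a session. The `SequencedEvent` type and `mergeBySequence` helper are hypothetical.

```ts
// Hypothetical minimal event shape; only `sequence` matters for ordering.
type SequencedEvent = { sequence: number; event_type: string };

// Merge two batches, dropping duplicates and returning events in sequence order.
function mergeBySequence<T extends SequencedEvent>(existing: T[], incoming: T[]): T[] {
  const bySequence = new Map<number, T>();
  for (const event of existing.concat(incoming)) {
    // A later copy of the same sequence number overwrites the earlier one.
    bySequence.set(event.sequence, event);
  }
  return Array.from(bySequence.values()).sort((a, b) => a.sequence - b.sequence);
}
```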
|
||||
|
||||
## Handling session end
|
||||
|
||||
`session.ended` includes the reason and who terminated it. Disable input after a terminal event.
|
||||
|
||||
## Optional raw payloads
|
||||
|
||||
If you need provider-level debugging, pass `include_raw=true` when streaming or polling events to
|
||||
receive the `raw` payload for each event.
|
||||
|
||||
## SSE vs polling
|
||||
|
||||
- SSE gives low-latency updates and simplifies streaming UIs.
|
||||
- Polling is simpler to debug and works in any environment.
|
||||
|
||||
Both yield the same event payloads.
|
||||
|
||||
## Mock mode for UI testing
|
||||
|
||||
Run the server with `--mock` to emit a looping, feature-complete event history for UI development:
|
||||
|
||||
```bash
|
||||
sandbox-agent server --mock --no-token
|
||||
```
|
||||
|
||||
Behavior in mock mode:
|
||||
|
||||
- Sessions emit a fixed history that covers every event type and content part.
|
||||
- The history repeats in a loop, with ~200ms between events and a ~2s pause between loops.
|
||||
- `session.started` and `session.ended` are included in every loop so UIs can exercise lifecycle handling.
|
||||
- `send-message` is accepted but does not change the mock stream.
|
||||
|
||||
If your UI stops rendering after `session.ended`, disable that behavior while testing mock mode so the
|
||||
loop remains visible.
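
One way to make that behavior switchable is a small helper with an explicit opt-out, sketched below. `LifecycleEvent`, `isSessionTerminal`, and the `ignoreSessionEnded` flag are hypothetical names, not part of the SDK.

```ts
// Hypothetical minimal event shape for lifecycle checks.
type LifecycleEvent = { event_type: string };

// Decide whether the UI should treat the session as finished.
// Pass ignoreSessionEnded = true while testing against mock mode.
function isSessionTerminal(events: LifecycleEvent[], ignoreSessionEnded: boolean): boolean {
  if (ignoreSessionEnded) {
    return false;
  }
  return events.some((event) => event.event_type === "session.ended");
}
```

With a real agent the flag stays `false`, so input is disabled after `session.ended`; with mock mode it is set to `true` so each loop keeps rendering.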
|
||||
|
||||
## Reference implementation
|
||||
|
||||
The [Inspector chat UI](https://github.com/rivet-dev/sandbox-agent/blob/main/frontend/packages/inspector/src/App.tsx)
|
||||
is a complete reference implementation showing how to build a chat interface using the universal event
|
||||
stream. It demonstrates session management, event rendering, item lifecycle handling, and HITL approval
|
||||
flows.
|
||||
|
|
@ -30,7 +30,14 @@
|
|||
{
|
||||
"group": "Operations",
|
||||
"pages": [
|
||||
"frontend"
|
||||
"frontend",
|
||||
"building-chat-ui"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "SDKs",
|
||||
"pages": [
|
||||
"sdks/typescript"
|
||||
]
|
||||
}
|
||||
]
|
||||
|
|
|
|||
130
docs/sdks/typescript.mdx
Normal file
|
|
@ -0,0 +1,130 @@
|
|||
---
|
||||
title: "TypeScript SDK"
|
||||
description: "Use the generated client to manage sessions and stream events."
|
||||
---
|
||||
|
||||
The TypeScript SDK is generated from the OpenAPI spec that ships with the daemon. It provides a typed
|
||||
client for sessions, events, and agent operations.
|
||||
|
||||
## Install
|
||||
|
||||
```bash
|
||||
npm install sandbox-agent
|
||||
```
|
||||
|
||||
## Create a client
|
||||
|
||||
```ts
|
||||
import { SandboxDaemonClient } from "sandbox-agent";
|
||||
|
||||
const client = new SandboxDaemonClient({
|
||||
baseUrl: "http://127.0.0.1:2468",
|
||||
token: process.env.SANDBOX_TOKEN,
|
||||
});
|
||||
```
|
||||
|
||||
Or with the factory helper:
|
||||
|
||||
```ts
|
||||
import { createSandboxDaemonClient } from "sandbox-agent";
|
||||
|
||||
const client = createSandboxDaemonClient({
|
||||
baseUrl: "http://127.0.0.1:2468",
|
||||
});
|
||||
```
|
||||
|
||||
## Autospawn (Node only)
|
||||
|
||||
If you run locally, the SDK can launch the daemon for you.
|
||||
|
||||
```ts
|
||||
import { connectSandboxDaemonClient } from "sandbox-agent";
|
||||
|
||||
const client = await connectSandboxDaemonClient({
|
||||
spawn: { enabled: true },
|
||||
});
|
||||
|
||||
await client.dispose();
|
||||
```
|
||||
|
||||
Autospawn uses the local `sandbox-agent` binary. Install `@sandbox-agent/cli` (recommended) or set
|
||||
`SANDBOX_AGENT_BIN` to a custom path.
|
||||
|
||||
## Sessions and messages
|
||||
|
||||
```ts
|
||||
await client.createSession("demo-session", {
|
||||
agent: "codex",
|
||||
agent_mode: "default",
|
||||
permission_mode: "plan",
|
||||
});
|
||||
|
||||
await client.postMessage("demo-session", { message: "Hello" });
|
||||
```
|
||||
|
||||
List agents and pick a compatible one:
|
||||
|
||||
```ts
|
||||
const agents = await client.listAgents();
|
||||
const codex = agents.agents.find((agent) => agent.id === "codex");
|
||||
console.log(codex?.capabilities);
|
||||
```
|
||||
|
||||
## Poll events
|
||||
|
||||
```ts
|
||||
const events = await client.getEvents("demo-session", {
|
||||
offset: 0,
|
||||
limit: 200,
|
||||
include_raw: false,
|
||||
});
|
||||
|
||||
for (const event of events.events) {
|
||||
console.log(event.event_type, event.data);
|
||||
}
|
||||
```
|
||||
|
||||
## Stream events (SSE)
|
||||
|
||||
```ts
|
||||
for await (const event of client.streamEvents("demo-session", {
|
||||
offset: 0,
|
||||
include_raw: false,
|
||||
})) {
|
||||
console.log(event.event_type, event.data);
|
||||
}
|
||||
```
|
||||
|
||||
The SDK parses `text/event-stream` into `UniversalEvent` objects. If you want full control, use
|
||||
`getEventsSse()` and parse the stream yourself.
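
For readers who take the manual route, the parsing step can be sketched as follows. What `getEventsSse()` returns is not covered here; this hypothetical `parseSseChunk` helper only handles already-decoded text, assumes each SSE `data` field carries one JSON-encoded event, and does not handle an event split across two chunks.

```ts
// Parse complete SSE events out of a block of decoded text.
function parseSseChunk(text: string): unknown[] {
  const events: unknown[] = [];
  // Events are separated by a blank line; normalize CRLF line endings first.
  for (const block of text.replace(/\r\n/g, "\n").split("\n\n")) {
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("data:")) {
        // The SSE format strips a single leading space after the colon.
        const value = line.slice(5);
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
      // Other fields (event:, id:, retry:) and comment lines are ignored here.
    }
    if (dataLines.length > 0) {
      // Multiple data lines in one event are joined with newlines before decoding.
      events.push(JSON.parse(dataLines.join("\n")));
    }
  }
  return events;
}
```

A production parser would also buffer incomplete trailing text between chunks, which is one reason to prefer `streamEvents` unless you need that control.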
|
||||
|
||||
## Optional raw payloads
|
||||
|
||||
Set `include_raw: true` on `getEvents` or `streamEvents` to include the raw provider payload in
|
||||
`event.raw`. This is useful for debugging and conversion analysis.
|
||||
|
||||
## Error handling
|
||||
|
||||
All HTTP errors throw `SandboxDaemonError`:
|
||||
|
||||
```ts
|
||||
import { SandboxDaemonError } from "sandbox-agent";
|
||||
|
||||
try {
|
||||
await client.postMessage("missing-session", { message: "Hi" });
|
||||
} catch (error) {
|
||||
if (error instanceof SandboxDaemonError) {
|
||||
console.error(error.status, error.problem);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Types
|
||||
|
||||
The SDK exports OpenAPI-derived types for events, items, and capabilities:
|
||||
|
||||
```ts
|
||||
import type { UniversalEvent, UniversalItem, AgentCapabilities } from "sandbox-agent";
|
||||
```
|
||||
|
||||
See `docs/universal-api.mdx` for the universal schema fields and semantics.
|
||||
|
|
@ -1314,6 +1314,37 @@
|
|||
white-space: nowrap;
|
||||
}
|
||||
|
||||
/* Capability Badges */
|
||||
.capability-badges {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 6px;
|
||||
}
|
||||
|
||||
.capability-badge {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 4px;
|
||||
padding: 3px 8px;
|
||||
border-radius: 4px;
|
||||
font-size: 10px;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.capability-badge.enabled {
|
||||
background: rgba(48, 209, 88, 0.12);
|
||||
color: var(--success);
|
||||
}
|
||||
|
||||
.capability-badge.disabled {
|
||||
background: rgba(255, 255, 255, 0.04);
|
||||
color: var(--muted-2);
|
||||
}
|
||||
|
||||
.capability-badge svg {
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
/* Scrollbar */
|
||||
.messages-container::-webkit-scrollbar,
|
||||
.debug-content::-webkit-scrollbar {
|
||||
|
|
|
|||
|
|
@ -2,6 +2,7 @@ import {
|
|||
Clipboard,
|
||||
Cloud,
|
||||
Download,
|
||||
GitBranch,
|
||||
HelpCircle,
|
||||
MessageSquare,
|
||||
PauseCircle,
|
||||
|
|
@ -11,6 +12,7 @@ import {
|
|||
Send,
|
||||
Shield,
|
||||
Terminal,
|
||||
Wrench,
|
||||
Zap
|
||||
} from "lucide-react";
|
||||
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
|
||||
|
|
@ -85,14 +87,24 @@ const formatJson = (value: unknown) => {
|
|||
|
||||
const escapeSingleQuotes = (value: string) => value.replace(/'/g, `'\\''`);
|
||||
|
||||
const formatCapabilities = (capabilities: AgentCapabilities) => {
|
||||
const parts = [
|
||||
`planMode ${capabilities.planMode ? "✓" : "—"}`,
|
||||
`permissions ${capabilities.permissions ? "✓" : "—"}`,
|
||||
`questions ${capabilities.questions ? "✓" : "—"}`,
|
||||
`toolCalls ${capabilities.toolCalls ? "✓" : "—"}`
|
||||
const CapabilityBadges = ({ capabilities }: { capabilities: AgentCapabilities }) => {
|
||||
const items = [
|
||||
{ key: "planMode", label: "Plan", icon: GitBranch, enabled: capabilities.planMode },
|
||||
{ key: "permissions", label: "Perms", icon: Shield, enabled: capabilities.permissions },
|
||||
{ key: "questions", label: "Q&A", icon: HelpCircle, enabled: capabilities.questions },
|
||||
{ key: "toolCalls", label: "Tools", icon: Wrench, enabled: capabilities.toolCalls }
|
||||
];
|
||||
return parts.join(" · ");
|
||||
|
||||
return (
|
||||
<div className="capability-badges">
|
||||
{items.map(({ key, label, icon: Icon, enabled }) => (
|
||||
<span key={key} className={`capability-badge ${enabled ? "enabled" : "disabled"}`}>
|
||||
<Icon size={12} />
|
||||
<span>{label}</span>
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
);
|
||||
};
|
||||
|
||||
const buildCurl = (method: string, url: string, body?: string, token?: string) => {
|
||||
|
|
@ -1459,8 +1471,8 @@ export default function App() {
|
|||
{agent.version ? `v${agent.version}` : "Version unknown"}
|
||||
{agent.path && <span className="mono muted" style={{ marginLeft: 8 }}>{agent.path}</span>}
|
||||
</div>
|
||||
<div className="card-meta" style={{ marginTop: 8 }}>
|
||||
Capabilities: {formatCapabilities(agent.capabilities ?? emptyCapabilities)}
|
||||
<div style={{ marginTop: 8 }}>
|
||||
<CapabilityBadges capabilities={agent.capabilities ?? emptyCapabilities} />
|
||||
</div>
|
||||
{modesByAgent[agent.id] && modesByAgent[agent.id].length > 0 && (
|
||||
<div className="card-meta" style={{ marginTop: 8 }}>
|
||||
|
|
|
|||
|
|
@ -5,6 +5,12 @@ export default defineConfig(({ command }) => ({
|
|||
base: command === "build" ? "/ui/" : "/",
|
||||
plugins: [react()],
|
||||
server: {
|
||||
port: 5173
|
||||
}
|
||||
port: 5173,
|
||||
proxy: {
|
||||
"/v1": {
|
||||
target: "http://localhost:2468",
|
||||
changeOrigin: true,
|
||||
},
|
||||
},
|
||||
},
|
||||
}));
|
||||
|
|
|
|||
|
|
@ -7,7 +7,8 @@
|
|||
"build": "turbo run build",
|
||||
"dev": "turbo run dev --parallel",
|
||||
"generate": "turbo run generate",
|
||||
"typecheck": "turbo run typecheck"
|
||||
"typecheck": "turbo run typecheck",
|
||||
"docs": "pnpm dlx mintlify dev docs"
|
||||
},
|
||||
"devDependencies": {
|
||||
"turbo": "^2.4.0"
|
||||
|
|
|
|||
|
|
@ -11,7 +11,7 @@ use sandbox_agent_agent_management::credentials::{
|
|||
ProviderCredentials,
|
||||
};
|
||||
use sandbox_agent::router::{
|
||||
AgentInstallRequest, AppState, AuthConfig, CreateSessionRequest, MessageRequest,
|
||||
AgentInstallRequest, AppState, AuthConfig, CreateSessionRequest, MessageRequest, MockConfig,
|
||||
PermissionReply, PermissionReplyRequest, QuestionReplyRequest,
|
||||
};
|
||||
use sandbox_agent::router::{AgentListResponse, AgentModesResponse, CreateSessionResponse, EventsResponse};
|
||||
|
|
@ -72,6 +72,9 @@ struct ServerArgs {
|
|||
|
||||
#[arg(long = "cors-allow-credentials", short = 'C')]
|
||||
cors_allow_credentials: bool,
|
||||
|
||||
#[arg(long)]
|
||||
mock: bool,
|
||||
}
|
||||
|
||||
#[derive(Args, Debug)]
|
||||
|
|
@ -334,7 +337,12 @@ fn run_server(cli: &Cli, server: &ServerArgs) -> Result<(), CliError> {
|
|||
|
||||
let agent_manager =
|
||||
AgentManager::new(default_install_dir()).map_err(|err| CliError::Server(err.to_string()))?;
|
||||
let state = AppState::new(auth, agent_manager);
|
||||
let mock = if server.mock {
|
||||
MockConfig::enabled()
|
||||
} else {
|
||||
MockConfig::disabled()
|
||||
};
|
||||
let state = AppState::new(auth, agent_manager, mock);
|
||||
let mut router = build_router(state);
|
||||
|
||||
if let Some(cors) = build_cors_layer(server)? {
|
||||
|
|
|
|||
|
|
@ -65,6 +65,34 @@ use sandbox_agent_agent_management::credentials::{
|
|||
};
|
||||
use crate::ui;
|
||||
|
||||
const MOCK_EVENT_DELAY_MS: u64 = 200;
|
||||
const MOCK_LOOP_DELAY_MS: u64 = 2000;
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct MockConfig {
|
||||
enabled: bool,
|
||||
event_delay: Duration,
|
||||
loop_delay: Duration,
|
||||
}
|
||||
|
||||
impl MockConfig {
|
||||
pub fn disabled() -> Self {
|
||||
Self {
|
||||
enabled: false,
|
||||
event_delay: Duration::ZERO,
|
||||
loop_delay: Duration::ZERO,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn enabled() -> Self {
|
||||
Self {
|
||||
enabled: true,
|
||||
event_delay: Duration::from_millis(MOCK_EVENT_DELAY_MS),
|
||||
loop_delay: Duration::from_millis(MOCK_LOOP_DELAY_MS),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct AppState {
|
||||
auth: AuthConfig,
|
||||
|
|
@ -73,9 +101,9 @@ pub struct AppState {
|
|||
}
|
||||
|
||||
impl AppState {
|
||||
pub fn new(auth: AuthConfig, agent_manager: AgentManager) -> Self {
|
||||
pub fn new(auth: AuthConfig, agent_manager: AgentManager, mock: MockConfig) -> Self {
|
||||
let agent_manager = Arc::new(agent_manager);
|
||||
let session_manager = Arc::new(SessionManager::new(agent_manager.clone()));
|
||||
let session_manager = Arc::new(SessionManager::new(agent_manager.clone(), mock));
|
||||
Self {
|
||||
auth,
|
||||
agent_manager,
|
||||
|
|
@ -565,6 +593,7 @@ struct SessionManager {
|
|||
sessions: Mutex<HashMap<String, SessionState>>,
|
||||
opencode_server: Mutex<Option<OpencodeServer>>,
|
||||
http_client: Client,
|
||||
mock: MockConfig,
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
|
|
@ -580,12 +609,13 @@ struct SessionSubscription {
|
|||
}
|
||||
|
||||
impl SessionManager {
|
||||
fn new(agent_manager: Arc<AgentManager>) -> Self {
|
||||
fn new(agent_manager: Arc<AgentManager>, mock: MockConfig) -> Self {
|
||||
Self {
|
||||
agent_manager,
|
||||
sessions: Mutex::new(HashMap::new()),
|
||||
opencode_server: Mutex::new(None),
|
||||
http_client: Client::new(),
|
||||
mock,
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -602,6 +632,10 @@ impl SessionManager {
|
|||
}
|
||||
}
|
||||
|
||||
if self.mock.enabled {
|
||||
return self.create_mock_session(session_id, agent_id, request).await;
|
||||
}
|
||||
|
||||
let manager = self.agent_manager.clone();
|
||||
let agent_version = request.agent_version.clone();
|
||||
let agent_name = request.agent.clone();
|
||||
|
|
@ -660,6 +694,32 @@ impl SessionManager {
|
|||
})
|
||||
}
|
||||
|
||||
async fn create_mock_session(
|
||||
self: &Arc<Self>,
|
||||
session_id: String,
|
||||
agent_id: AgentId,
|
||||
request: CreateSessionRequest,
|
||||
) -> Result<CreateSessionResponse, SandboxError> {
|
||||
let mut session = SessionState::new(session_id.clone(), agent_id, &request)?;
|
||||
session.native_session_id = Some(format!("mock-{session_id}"));
|
||||
let native_session_id = session.native_session_id.clone();
|
||||
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
sessions.insert(session_id.clone(), session);
|
||||
drop(sessions);
|
||||
|
||||
let manager = Arc::clone(self);
|
||||
tokio::spawn(async move {
|
||||
manager.run_mock_loop(session_id).await;
|
||||
});
|
||||
|
||||
Ok(CreateSessionResponse {
|
||||
healthy: true,
|
||||
error: None,
|
||||
native_session_id,
|
||||
})
|
||||
}
|
||||
|
||||
async fn agent_modes(&self, agent: AgentId) -> Result<Vec<AgentModeInfo>, SandboxError> {
|
||||
if agent != AgentId::Opencode {
|
||||
return Ok(agent_modes_for(agent));
|
||||
|
|
@ -683,6 +743,11 @@ impl SessionManager {
|
|||
session_id: String,
|
||||
message: String,
|
||||
) -> Result<(), SandboxError> {
|
||||
if self.mock.enabled {
|
||||
self.session_snapshot(&session_id, false).await?;
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let session_snapshot = self.session_snapshot(&session_id, false).await?;
|
||||
if session_snapshot.agent == AgentId::Opencode {
|
||||
self.ensure_opencode_stream(session_id.clone()).await?;
|
||||
|
|
@ -833,6 +898,43 @@ impl SessionManager {
|
|||
question_id: &str,
|
||||
answers: Vec<Vec<String>>,
|
||||
) -> Result<(), SandboxError> {
|
||||
if self.mock.enabled {
|
||||
let pending = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
let session = sessions.get_mut(session_id).ok_or_else(|| SandboxError::SessionNotFound {
|
||||
session_id: session_id.to_string(),
|
||||
})?;
|
||||
if let Some(err) = session.ended_error() {
|
||||
return Err(err);
|
||||
}
|
||||
session.take_question(question_id)
|
||||
};
|
||||
let (prompt, options) = match pending {
|
||||
Some(pending) => (pending.prompt, pending.options),
|
||||
None => (
|
||||
"Mock question prompt".to_string(),
|
||||
vec!["Option A".to_string(), "Option B".to_string()],
|
||||
),
|
||||
};
|
||||
let response = answers
|
||||
.first()
|
||||
.and_then(|inner| inner.first())
|
||||
.cloned();
|
||||
let resolved = EventConversion::new(
|
||||
UniversalEventType::QuestionResolved,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id: question_id.to_string(),
|
||||
prompt,
|
||||
options,
|
||||
response,
|
||||
status: QuestionStatus::Answered,
|
||||
}),
|
||||
)
|
||||
.synthetic();
|
||||
let _ = self.record_conversions(session_id, vec![resolved]).await;
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let (agent, native_session_id, pending_question) = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
let session = sessions.get_mut(session_id).ok_or_else(|| SandboxError::SessionNotFound {
|
||||
|
|
@ -891,6 +993,39 @@ impl SessionManager {
|
|||
session_id: &str,
|
||||
question_id: &str,
|
||||
) -> Result<(), SandboxError> {
|
||||
if self.mock.enabled {
|
||||
let pending = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
let session = sessions.get_mut(session_id).ok_or_else(|| SandboxError::SessionNotFound {
|
||||
session_id: session_id.to_string(),
|
||||
})?;
|
||||
if let Some(err) = session.ended_error() {
|
||||
return Err(err);
|
||||
}
|
||||
session.take_question(question_id)
|
||||
};
|
||||
let (prompt, options) = match pending {
|
||||
Some(pending) => (pending.prompt, pending.options),
|
||||
None => (
|
||||
"Mock question prompt".to_string(),
|
||||
vec!["Option A".to_string(), "Option B".to_string()],
|
||||
),
|
||||
};
|
||||
let resolved = EventConversion::new(
|
||||
UniversalEventType::QuestionResolved,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id: question_id.to_string(),
|
||||
prompt,
|
||||
options,
|
||||
response: None,
|
||||
status: QuestionStatus::Rejected,
|
||||
}),
|
||||
)
|
||||
.synthetic();
|
||||
let _ = self.record_conversions(session_id, vec![resolved]).await;
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let (agent, native_session_id, pending_question) = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
let session = sessions.get_mut(session_id).ok_or_else(|| SandboxError::SessionNotFound {
|
||||
|
|
@ -945,6 +1080,40 @@ impl SessionManager {
|
|||
permission_id: &str,
|
||||
reply: PermissionReply,
|
||||
) -> Result<(), SandboxError> {
|
||||
if self.mock.enabled {
|
||||
let pending = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
let session = sessions.get_mut(session_id).ok_or_else(|| SandboxError::SessionNotFound {
|
||||
session_id: session_id.to_string(),
|
||||
})?;
|
||||
if let Some(err) = session.ended_error() {
|
||||
return Err(err);
|
||||
}
|
||||
session.take_permission(permission_id)
|
||||
};
|
||||
|
||||
let (action, metadata) = match pending {
|
||||
Some(pending) => (pending.action, pending.metadata),
|
||||
None => ("mock.permission".to_string(), None),
|
||||
};
|
||||
let status = match reply {
|
||||
PermissionReply::Reject => PermissionStatus::Denied,
|
||||
PermissionReply::Once | PermissionReply::Always => PermissionStatus::Approved,
|
||||
};
|
||||
let resolved = EventConversion::new(
|
||||
UniversalEventType::PermissionResolved,
|
||||
UniversalEventData::Permission(PermissionEventData {
|
||||
permission_id: permission_id.to_string(),
|
||||
action,
|
||||
status,
|
||||
metadata,
|
||||
}),
|
||||
)
|
||||
.synthetic();
|
||||
let _ = self.record_conversions(session_id, vec![resolved]).await;
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let reply_for_status = reply.clone();
|
||||
let (agent, native_session_id, codex_sender, pending_permission) = {
|
||||
let mut sessions = self.sessions.lock().await;
|
||||
|
|
@ -1055,6 +1224,50 @@ impl SessionManager {
|
|||
Ok(())
|
||||
}
|
||||
|
||||
async fn run_mock_loop(self: Arc<Self>, session_id: String) {
|
||||
let mut cycle = 0_u64;
|
||||
let event_delay = self.mock.event_delay;
|
||||
let loop_delay = self.mock.loop_delay;
|
||||
|
||||
loop {
|
||||
if self.is_session_ended(&session_id).await {
|
||||
return;
|
||||
}
|
||||
let snapshot = match self.session_snapshot(&session_id, true).await {
|
||||
Ok(snapshot) => snapshot,
|
||||
Err(_) => return,
|
||||
};
|
||||
cycle = cycle.saturating_add(1);
|
||||
let conversions = mock_event_conversions(cycle, &snapshot);
|
||||
for conversion in conversions {
|
||||
if self
|
||||
.record_conversions(&session_id, vec![conversion])
|
||||
.await
|
||||
.is_err()
|
||||
{
|
||||
return;
|
||||
}
|
||||
if event_delay != Duration::ZERO {
|
||||
sleep(event_delay).await;
|
||||
}
|
||||
if self.is_session_ended(&session_id).await {
|
||||
return;
|
||||
}
|
||||
}
|
||||
if loop_delay != Duration::ZERO {
|
||||
sleep(loop_delay).await;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
async fn is_session_ended(&self, session_id: &str) -> bool {
|
||||
let sessions = self.sessions.lock().await;
|
||||
match sessions.get(session_id) {
|
||||
Some(session) => session.ended,
|
||||
None => true,
|
||||
}
|
||||
}
|
||||
|
||||
async fn session_snapshot(
|
||||
&self,
|
||||
session_id: &str,
|
||||
|
|
@ -1756,10 +1969,22 @@ pub struct AgentModesResponse {
|
|||
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema, JsonSchema)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
pub struct AgentCapabilities {
|
||||
// TODO: add agent-agnostic tests that cover every capability flag here.
|
||||
pub plan_mode: bool,
|
||||
pub permissions: bool,
|
||||
pub questions: bool,
|
||||
pub tool_calls: bool,
|
||||
pub tool_results: bool,
|
||||
pub text_messages: bool,
|
||||
pub images: bool,
|
||||
pub file_attachments: bool,
|
||||
pub session_lifecycle: bool,
|
||||
pub error_events: bool,
|
||||
pub reasoning: bool,
|
||||
pub command_execution: bool,
|
||||
pub file_changes: bool,
|
||||
pub mcp_tools: bool,
|
||||
pub streaming_deltas: bool,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema, JsonSchema)]
|
||||
|
|
@ -2246,24 +2471,68 @@ fn agent_capabilities_for(agent: AgentId) -> AgentCapabilities {
|
|||
permissions: false,
|
||||
questions: false,
|
||||
tool_calls: false,
|
||||
tool_results: false,
|
||||
text_messages: true,
|
||||
images: false,
|
||||
file_attachments: false,
|
||||
session_lifecycle: false,
|
||||
error_events: false,
|
||||
reasoning: false,
|
||||
command_execution: false,
|
||||
file_changes: false,
|
||||
mcp_tools: false,
|
||||
streaming_deltas: false,
|
||||
},
|
||||
AgentId::Codex => AgentCapabilities {
|
||||
plan_mode: true,
|
||||
permissions: true,
|
||||
questions: false,
|
||||
tool_calls: true,
|
||||
tool_results: true,
|
||||
text_messages: true,
|
||||
images: true,
|
||||
file_attachments: true,
|
||||
session_lifecycle: true,
|
||||
error_events: true,
|
||||
reasoning: true,
|
||||
command_execution: true,
|
||||
file_changes: true,
|
||||
mcp_tools: true,
|
||||
streaming_deltas: true,
|
||||
},
|
||||
AgentId::Opencode => AgentCapabilities {
|
||||
plan_mode: false,
|
||||
permissions: false,
|
||||
questions: false,
|
||||
tool_calls: true,
|
||||
tool_results: true,
|
||||
text_messages: true,
|
||||
images: true,
|
||||
file_attachments: true,
|
||||
session_lifecycle: true,
|
||||
error_events: true,
|
||||
reasoning: false,
|
||||
command_execution: false,
|
||||
file_changes: false,
|
||||
mcp_tools: false,
|
||||
streaming_deltas: true,
|
||||
},
|
||||
AgentId::Amp => AgentCapabilities {
|
||||
plan_mode: false,
|
||||
permissions: false,
|
||||
questions: false,
|
||||
tool_calls: true,
|
||||
tool_results: true,
|
||||
text_messages: true,
|
||||
images: false,
|
||||
file_attachments: false,
|
||||
session_lifecycle: false,
|
||||
error_events: true,
|
||||
reasoning: false,
|
||||
command_execution: false,
|
||||
file_changes: false,
|
||||
mcp_tools: false,
|
||||
streaming_deltas: false,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
|
@ -3172,6 +3441,441 @@ fn text_delta_from_parts(parts: &[ContentPart]) -> Option<String> {
|
|||
}
|
||||
}
|
||||
|
||||
fn mock_event_conversions(loop_index: u64, session: &SessionSnapshot) -> Vec<EventConversion> {
|
||||
let prefix = format!("mock_{loop_index}");
|
||||
let system_native = format!("{prefix}_system");
|
||||
let user_native = format!("{prefix}_user");
|
||||
let assistant_native = format!("{prefix}_assistant");
|
||||
let status_native = format!("{prefix}_status");
|
||||
let tool_call_native = format!("{prefix}_tool_call");
|
||||
let tool_result_native = format!("{prefix}_tool_result");
|
||||
let image_native = format!("{prefix}_image");
|
||||
let unknown_native = format!("{prefix}_unknown");
|
||||
let permission_id = format!("{prefix}_permission");
|
||||
let permission_deny_id = format!("{prefix}_permission_denied");
|
||||
let question_id = format!("{prefix}_question");
|
||||
let question_reject_id = format!("{prefix}_question_reject");
|
||||
let call_id = format!("{prefix}_call");
|
||||
|
||||
let metadata = json!({
|
||||
"agent": session.agent.as_str(),
|
||||
"agentMode": session.agent_mode.clone(),
|
||||
"permissionMode": session.permission_mode.clone(),
|
||||
"model": session.model.clone(),
|
||||
"variant": session.variant.clone(),
|
||||
"mockCycle": loop_index,
|
||||
});
|
||||
|
||||
let mut events = Vec::new();
|
||||
|
||||
events.push(
|
||||
EventConversion::new(
|
||||
UniversalEventType::SessionStarted,
|
||||
UniversalEventData::SessionStarted(SessionStartedData {
|
||||
metadata: Some(metadata),
|
||||
}),
|
||||
)
|
||||
.synthetic(),
|
||||
);
|
||||
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
system_native.clone(),
|
||||
ItemKind::System,
|
||||
ItemRole::System,
|
||||
ItemStatus::InProgress,
|
||||
vec![ContentPart::Text {
|
||||
text: "System ready for mock events.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
system_native,
|
||||
ItemKind::System,
|
||||
ItemRole::System,
|
||||
ItemStatus::Completed,
|
||||
vec![ContentPart::Text {
|
||||
text: "System ready for mock events.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
user_native.clone(),
|
||||
ItemKind::Message,
|
||||
ItemRole::User,
|
||||
ItemStatus::InProgress,
|
||||
vec![ContentPart::Text {
|
||||
text: "User: run the mock pipeline.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
user_native,
|
||||
ItemKind::Message,
|
||||
ItemRole::User,
|
||||
ItemStatus::Completed,
|
||||
vec![ContentPart::Text {
|
||||
text: "User: run the mock pipeline.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
|
||||
let assistant_parts = vec![
|
||||
ContentPart::Text {
|
||||
text: "Mock assistant response with rich content.".to_string(),
|
||||
},
|
||||
ContentPart::Reasoning {
|
||||
text: "Public reasoning for display.".to_string(),
|
||||
visibility: ReasoningVisibility::Public,
|
||||
},
|
||||
ContentPart::Reasoning {
|
||||
text: "Private reasoning hidden by default.".to_string(),
|
||||
visibility: ReasoningVisibility::Private,
|
||||
},
|
||||
ContentPart::Json {
|
||||
json: json!({
|
||||
"stage": "analysis",
|
||||
"ok": true,
|
||||
"cycle": loop_index
|
||||
}),
|
||||
},
|
||||
];
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
assistant_native.clone(),
|
||||
ItemKind::Message,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::InProgress,
|
||||
assistant_parts.clone(),
|
||||
),
|
||||
));
|
||||
events.push(mock_delta(assistant_native.clone(), "Mock assistant "));
|
||||
events.push(mock_delta(assistant_native.clone(), "streaming delta."));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
assistant_native,
|
||||
ItemKind::Message,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::Completed,
|
||||
assistant_parts,
|
||||
),
|
||||
));
|
||||
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
status_native.clone(),
|
||||
ItemKind::Status,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::InProgress,
|
||||
vec![ContentPart::Status {
|
||||
label: "Indexing".to_string(),
|
||||
detail: Some("2 files".to_string()),
|
||||
}],
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
status_native,
|
||||
ItemKind::Status,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::Completed,
|
||||
vec![ContentPart::Status {
|
||||
label: "Indexing".to_string(),
|
||||
detail: Some("Done".to_string()),
|
||||
}],
|
||||
),
|
||||
));
|
||||
|
||||
let tool_call_part = ContentPart::ToolCall {
|
||||
name: "mock.search".to_string(),
|
||||
arguments: "{\"query\":\"example\"}".to_string(),
|
||||
call_id: call_id.clone(),
|
||||
};
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
tool_call_native.clone(),
|
||||
ItemKind::ToolCall,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::InProgress,
|
||||
vec![tool_call_part.clone()],
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
tool_call_native,
|
||||
ItemKind::ToolCall,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::Completed,
|
||||
vec![tool_call_part],
|
||||
),
|
||||
));
|
||||
|
||||
let tool_result_parts = vec![
|
||||
ContentPart::ToolResult {
|
||||
call_id: call_id.clone(),
|
||||
output: "mock search results".to_string(),
|
||||
},
|
||||
ContentPart::FileRef {
|
||||
path: format!("{prefix}/readme.md"),
|
||||
action: FileAction::Read,
|
||||
diff: None,
|
||||
},
|
||||
ContentPart::FileRef {
|
||||
path: format!("{prefix}/output.txt"),
|
||||
action: FileAction::Write,
|
||||
diff: Some("+mock output\n".to_string()),
|
||||
},
|
||||
ContentPart::FileRef {
|
||||
path: format!("{prefix}/patch.txt"),
|
||||
action: FileAction::Patch,
|
||||
diff: Some("@@ -1,1 +1,1 @@\n-old\n+new\n".to_string()),
|
||||
},
|
||||
];
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
tool_result_native.clone(),
|
||||
ItemKind::ToolResult,
|
||||
ItemRole::Tool,
|
||||
ItemStatus::InProgress,
|
||||
tool_result_parts.clone(),
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
tool_result_native,
|
||||
ItemKind::ToolResult,
|
||||
ItemRole::Tool,
|
||||
ItemStatus::Failed,
|
||||
tool_result_parts,
|
||||
),
|
||||
));
|
||||
|
||||
let image_parts = vec![
|
||||
ContentPart::Text {
|
||||
text: "Here is a mock image output.".to_string(),
|
||||
},
|
||||
ContentPart::Image {
|
||||
path: format!("{prefix}/image.png"),
|
||||
mime: Some("image/png".to_string()),
|
||||
},
|
||||
];
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
image_native.clone(),
|
||||
ItemKind::Message,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::InProgress,
|
||||
image_parts.clone(),
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
image_native,
|
||||
ItemKind::Message,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::Completed,
|
||||
image_parts,
|
||||
),
|
||||
));
|
||||
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemStarted,
|
||||
mock_item(
|
||||
unknown_native.clone(),
|
||||
ItemKind::Unknown,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::InProgress,
|
||||
vec![ContentPart::Text {
|
||||
text: "Unknown item kind example.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
events.push(mock_item_event(
|
||||
UniversalEventType::ItemCompleted,
|
||||
mock_item(
|
||||
unknown_native,
|
||||
ItemKind::Unknown,
|
||||
ItemRole::Assistant,
|
||||
ItemStatus::Completed,
|
||||
vec![ContentPart::Text {
|
||||
text: "Unknown item kind example.".to_string(),
|
||||
}],
|
||||
),
|
||||
));
|
||||
|
||||
let permission_metadata = json!({
|
||||
"codexRequestKind": "commandExecution",
|
||||
"command": "echo mock"
|
||||
});
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::PermissionRequested,
|
||||
UniversalEventData::Permission(PermissionEventData {
|
||||
permission_id: permission_id.clone(),
|
||||
action: "command_execution".to_string(),
|
||||
status: PermissionStatus::Requested,
|
||||
metadata: Some(permission_metadata),
|
||||
}),
|
||||
));
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::PermissionResolved,
|
||||
UniversalEventData::Permission(PermissionEventData {
|
||||
permission_id,
|
||||
action: "command_execution".to_string(),
|
||||
status: PermissionStatus::Approved,
|
||||
metadata: None,
|
||||
}),
|
||||
));
|
||||
|
||||
let permission_metadata_deny = json!({
|
||||
"codexRequestKind": "fileChange",
|
||||
"path": format!("{prefix}/deny.txt")
|
||||
});
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::PermissionRequested,
|
||||
UniversalEventData::Permission(PermissionEventData {
|
||||
permission_id: permission_deny_id.clone(),
|
||||
action: "file_change".to_string(),
|
||||
status: PermissionStatus::Requested,
|
||||
metadata: Some(permission_metadata_deny),
|
||||
}),
|
||||
));
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::PermissionResolved,
|
||||
UniversalEventData::Permission(PermissionEventData {
|
||||
permission_id: permission_deny_id,
|
||||
action: "file_change".to_string(),
|
||||
status: PermissionStatus::Denied,
|
||||
metadata: None,
|
||||
}),
|
||||
));
|
||||
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::QuestionRequested,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id: question_id.clone(),
|
||||
prompt: "Choose a color".to_string(),
|
||||
options: vec!["Red".to_string(), "Blue".to_string()],
|
||||
response: None,
|
||||
status: QuestionStatus::Requested,
|
||||
}),
|
||||
));
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::QuestionResolved,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id,
|
||||
prompt: "Choose a color".to_string(),
|
||||
options: vec!["Red".to_string(), "Blue".to_string()],
|
||||
response: Some("Blue".to_string()),
|
||||
status: QuestionStatus::Answered,
|
||||
}),
|
||||
));
|
||||
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::QuestionRequested,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id: question_reject_id.clone(),
|
||||
prompt: "Allow mock experiment?".to_string(),
|
||||
options: vec!["Yes".to_string(), "No".to_string()],
|
||||
response: None,
|
||||
status: QuestionStatus::Requested,
|
||||
}),
|
||||
));
|
||||
events.push(EventConversion::new(
|
||||
UniversalEventType::QuestionResolved,
|
||||
UniversalEventData::Question(QuestionEventData {
|
||||
question_id: question_reject_id,
|
||||
prompt: "Allow mock experiment?".to_string(),
|
||||
options: vec!["Yes".to_string(), "No".to_string()],
|
||||
response: None,
|
||||
status: QuestionStatus::Rejected,
|
||||
}),
|
||||
));
|
||||
|
||||
events.push(
|
||||
EventConversion::new(
|
||||
UniversalEventType::Error,
|
||||
UniversalEventData::Error(ErrorData {
|
||||
message: "Mock error event.".to_string(),
|
||||
code: Some("mock_error".to_string()),
|
||||
details: Some(json!({ "cycle": loop_index })),
|
||||
}),
|
||||
)
|
||||
.synthetic(),
|
||||
);
|
||||
events.push(agent_unparsed(
|
||||
"mock.stream",
|
||||
"unsupported payload",
|
||||
json!({ "raw": "mock" }),
|
||||
));
|
||||
|
||||
events.push(
|
||||
EventConversion::new(
|
||||
UniversalEventType::SessionEnded,
|
||||
UniversalEventData::SessionEnded(SessionEndedData {
|
||||
reason: SessionEndReason::Completed,
|
||||
terminated_by: TerminatedBy::Agent,
|
||||
}),
|
||||
)
|
||||
.synthetic(),
|
||||
);
|
||||
|
||||
events
|
||||
}
|
||||
|
||||
fn mock_item(
|
||||
native_item_id: String,
|
||||
kind: ItemKind,
|
||||
role: ItemRole,
|
||||
status: ItemStatus,
|
||||
content: Vec<ContentPart>,
|
||||
) -> UniversalItem {
|
||||
UniversalItem {
|
||||
item_id: String::new(),
|
||||
native_item_id: Some(native_item_id),
|
||||
parent_id: None,
|
||||
kind,
|
||||
role: Some(role),
|
||||
content,
|
||||
status,
|
||||
}
|
||||
}
|
||||
|
||||
fn mock_item_event(event_type: UniversalEventType, item: UniversalItem) -> EventConversion {
|
||||
EventConversion::new(
|
||||
event_type,
|
||||
UniversalEventData::Item(ItemEventData { item }),
|
||||
)
|
||||
}
|
||||
|
||||
fn mock_delta(native_item_id: String, delta: &str) -> EventConversion {
|
||||
EventConversion::new(
|
||||
UniversalEventType::ItemDelta,
|
||||
UniversalEventData::ItemDelta(ItemDeltaData {
|
||||
item_id: String::new(),
|
||||
native_item_id: Some(native_item_id),
|
||||
delta: delta.to_string(),
|
||||
}),
|
||||
)
|
||||
}
|
||||
|
||||
fn agent_unparsed(location: &str, error: &str, raw: Value) -> EventConversion {
|
||||
EventConversion::new(
|
||||
UniversalEventType::AgentUnparsed,
|
||||
|
|
|
|||
|
|
@@ -17,6 +17,7 @@ use sandbox_agent::router::{
|
|||
AgentCapabilities,
|
||||
AgentListResponse,
|
||||
AuthConfig,
|
||||
MockConfig,
|
||||
};
|
||||
|
||||
const PROMPT: &str = "Reply with exactly the single word OK.";
|
||||
|
|
@@ -41,7 +42,11 @@ impl TestApp {
|
|||
let install_dir = tempfile::tempdir().expect("create temp install dir");
|
||||
let manager = AgentManager::new(install_dir.path())
|
||||
.expect("create agent manager");
|
||||
let state = sandbox_agent::router::AppState::new(AuthConfig::disabled(), manager);
|
||||
let state = sandbox_agent::router::AppState::new(
|
||||
AuthConfig::disabled(),
|
||||
manager,
|
||||
MockConfig::disabled(),
|
||||
);
|
||||
let app = build_router(state);
|
||||
Self {
|
||||
app,
|
||||
|
|
|
|||
|
|
@@ -12,7 +12,7 @@ use tempfile::TempDir;
|
|||
use sandbox_agent_agent_management::agents::{AgentId, AgentManager};
|
||||
use sandbox_agent_agent_management::testing::{test_agents_from_env, TestAgentConfig};
|
||||
use sandbox_agent_agent_credentials::ExtractedCredentials;
|
||||
use sandbox_agent::router::{build_router, AppState, AuthConfig};
|
||||
use sandbox_agent::router::{build_router, AppState, AuthConfig, MockConfig};
|
||||
use tower::util::ServiceExt;
|
||||
use tower_http::cors::CorsLayer;
|
||||
|
||||
|
|
@@ -39,7 +39,7 @@ impl TestApp {
|
|||
let install_dir = tempfile::tempdir().expect("create temp install dir");
|
||||
let manager = AgentManager::new(install_dir.path())
|
||||
.expect("create agent manager");
|
||||
let state = AppState::new(auth, manager);
|
||||
let state = AppState::new(auth, manager, MockConfig::disabled());
|
||||
let mut app = build_router(state);
|
||||
if let Some(cors) = cors {
|
||||
app = app.layer(cors);
|
||||
|
|
|
|||
|
|
@@ -2,7 +2,7 @@ use axum::body::Body;
|
|||
use axum::http::{Request, StatusCode};
|
||||
use http_body_util::BodyExt;
|
||||
use sandbox_agent_agent_management::agents::AgentManager;
|
||||
use sandbox_agent::router::{build_router, AppState, AuthConfig};
|
||||
use sandbox_agent::router::{build_router, AppState, AuthConfig, MockConfig};
|
||||
use sandbox_agent::ui;
|
||||
use tempfile::TempDir;
|
||||
use tower::util::ServiceExt;
|
||||
|
|
@@ -15,7 +15,7 @@ async fn serves_inspector_ui() {
|
|||
|
||||
let install_dir = TempDir::new().expect("create temp install dir");
|
||||
let manager = AgentManager::new(install_dir.path()).expect("create agent manager");
|
||||
let state = AppState::new(AuthConfig::disabled(), manager);
|
||||
let state = AppState::new(AuthConfig::disabled(), manager, MockConfig::disabled());
|
||||
let app = build_router(state);
|
||||
|
||||
let request = Request::builder()
|
||||
|
|
|
|||
|
|
@@ -1,5 +0,0 @@
# Open Questions / Ambiguities

- OpenCode server HTTP paths and payloads may differ; current implementation assumes `POST /session`, `POST /session/{id}/prompt`, and `GET /event/subscribe` with JSON `data:` SSE frames.
- OpenCode question/permission reply endpoints are assumed as `POST /question/reply`, `/question/reject`, `/permission/reply` with `requestID` fields; confirm actual API shape.
- SSE events may not always include `sessionID`/`sessionId` fields; confirm if filtering should use a different field.
|
|
@@ -1,7 +0,0 @@
# Required Tests

- Session manager streams JSONL line-by-line for Claude/Codex/Amp and yields incremental events.
- `/sessions/{id}/messages` returns immediately while background ingestion populates `/events` and `/events/sse`.
- SSE subscription delivers live events after the initial offset batch.
- OpenCode server mode: create session, send prompt, and receive SSE events filtered to the session.
- OpenCode question/permission reply endpoints forward to server APIs.
|
|
@@ -1,553 +0,0 @@
|
|||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"title": "UniversalEvent",
|
||||
"type": "object",
|
||||
"required": [
|
||||
"data",
|
||||
"event_id",
|
||||
"sequence",
|
||||
"session_id",
|
||||
"source",
|
||||
"synthetic",
|
||||
"time",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"data": {
|
||||
"$ref": "#/definitions/UniversalEventData"
|
||||
},
|
||||
"event_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"native_session_id": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"raw": true,
|
||||
"sequence": {
|
||||
"type": "integer",
|
||||
"format": "uint64",
|
||||
"minimum": 0.0
|
||||
},
|
||||
"session_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"source": {
|
||||
"$ref": "#/definitions/EventSource"
|
||||
},
|
||||
"synthetic": {
|
||||
"type": "boolean"
|
||||
},
|
||||
"time": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"$ref": "#/definitions/UniversalEventType"
|
||||
}
|
||||
},
|
||||
"definitions": {
|
||||
"AgentUnparsedData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"error",
|
||||
"location"
|
||||
],
|
||||
"properties": {
|
||||
"error": {
|
||||
"type": "string"
|
||||
},
|
||||
"location": {
|
||||
"type": "string"
|
||||
},
|
||||
"raw_hash": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
"ContentPart": {
|
||||
"oneOf": [
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"text",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"text": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"text"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"json",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"json": true,
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"json"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"arguments",
|
||||
"call_id",
|
||||
"name",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"arguments": {
|
||||
"type": "string"
|
||||
},
|
||||
"call_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"name": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"tool_call"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"call_id",
|
||||
"output",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"call_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"output": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"tool_result"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"action",
|
||||
"path",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"action": {
|
||||
"$ref": "#/definitions/FileAction"
|
||||
},
|
||||
"diff": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"path": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"file_ref"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"text",
|
||||
"type",
|
||||
"visibility"
|
||||
],
|
||||
"properties": {
|
||||
"text": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"reasoning"
|
||||
]
|
||||
},
|
||||
"visibility": {
|
||||
"$ref": "#/definitions/ReasoningVisibility"
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"path",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"mime": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"path": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"image"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"required": [
|
||||
"label",
|
||||
"type"
|
||||
],
|
||||
"properties": {
|
||||
"detail": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"label": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"status"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
"ErrorData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"message"
|
||||
],
|
||||
"properties": {
|
||||
"code": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"details": true,
|
||||
"message": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
},
|
||||
"EventSource": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"agent",
|
||||
"daemon"
|
||||
]
|
||||
},
|
||||
"FileAction": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"read",
|
||||
"write",
|
||||
"patch"
|
||||
]
|
||||
},
|
||||
"ItemDeltaData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"delta",
|
||||
"item_id"
|
||||
],
|
||||
"properties": {
|
||||
"delta": {
|
||||
"type": "string"
|
||||
},
|
||||
"item_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"native_item_id": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
"ItemEventData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"item"
|
||||
],
|
||||
"properties": {
|
||||
"item": {
|
||||
"$ref": "#/definitions/UniversalItem"
|
||||
}
|
||||
}
|
||||
},
|
||||
"ItemKind": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"message",
|
||||
"tool_call",
|
||||
"tool_result",
|
||||
"system",
|
||||
"status",
|
||||
"unknown"
|
||||
]
|
||||
},
|
||||
"ItemRole": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"user",
|
||||
"assistant",
|
||||
"system",
|
||||
"tool"
|
||||
]
|
||||
},
|
||||
"ItemStatus": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"in_progress",
|
||||
"completed",
|
||||
"failed"
|
||||
]
|
||||
},
|
||||
"PermissionEventData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"action",
|
||||
"permission_id",
|
||||
"status"
|
||||
],
|
||||
"properties": {
|
||||
"action": {
|
||||
"type": "string"
|
||||
},
|
||||
"metadata": true,
|
||||
"permission_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"status": {
|
||||
"$ref": "#/definitions/PermissionStatus"
|
||||
}
|
||||
}
|
||||
},
|
||||
"PermissionStatus": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"requested",
|
||||
"approved",
|
||||
"denied"
|
||||
]
|
||||
},
|
||||
"QuestionEventData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"options",
|
||||
"prompt",
|
||||
"question_id",
|
||||
"status"
|
||||
],
|
||||
"properties": {
|
||||
"options": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"prompt": {
|
||||
"type": "string"
|
||||
},
|
||||
"question_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"response": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"status": {
|
||||
"$ref": "#/definitions/QuestionStatus"
|
||||
}
|
||||
}
|
||||
},
|
||||
"QuestionStatus": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"requested",
|
||||
"answered",
|
||||
"rejected"
|
||||
]
|
||||
},
|
||||
"ReasoningVisibility": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"public",
|
||||
"private"
|
||||
]
|
||||
},
|
||||
"SessionEndReason": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"completed",
|
||||
"error",
|
||||
"terminated"
|
||||
]
|
||||
},
|
||||
"SessionEndedData": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"reason",
|
||||
"terminated_by"
|
||||
],
|
||||
"properties": {
|
||||
"reason": {
|
||||
"$ref": "#/definitions/SessionEndReason"
|
||||
},
|
||||
"terminated_by": {
|
||||
"$ref": "#/definitions/TerminatedBy"
|
||||
}
|
||||
}
|
||||
},
|
||||
"SessionStartedData": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"metadata": true
|
||||
}
|
||||
},
|
||||
"TerminatedBy": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"agent",
|
||||
"daemon"
|
||||
]
|
||||
},
|
||||
"UniversalEventData": {
|
||||
"anyOf": [
|
||||
{
|
||||
"$ref": "#/definitions/SessionStartedData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/SessionEndedData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/ItemEventData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/ItemDeltaData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/ErrorData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/PermissionEventData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/QuestionEventData"
|
||||
},
|
||||
{
|
||||
"$ref": "#/definitions/AgentUnparsedData"
|
||||
}
|
||||
]
|
||||
},
|
||||
"UniversalEventType": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"session.started",
|
||||
"session.ended",
|
||||
"item.started",
|
||||
"item.delta",
|
||||
"item.completed",
|
||||
"error",
|
||||
"permission.requested",
|
||||
"permission.resolved",
|
||||
"question.requested",
|
||||
"question.resolved",
|
||||
"agent.unparsed"
|
||||
]
|
||||
},
|
||||
"UniversalItem": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"content",
|
||||
"item_id",
|
||||
"kind",
|
||||
"status"
|
||||
],
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "#/definitions/ContentPart"
|
||||
}
|
||||
},
|
||||
"item_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"kind": {
|
||||
"$ref": "#/definitions/ItemKind"
|
||||
},
|
||||
"native_item_id": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"parent_id": {
|
||||
"type": [
|
||||
"string",
|
||||
"null"
|
||||
]
|
||||
},
|
||||
"role": {
|
||||
"anyOf": [
|
||||
{
|
||||
"$ref": "#/definitions/ItemRole"
|
||||
},
|
||||
{
|
||||
"type": "null"
|
||||
}
|
||||
]
|
||||
},
|
||||
"status": {
|
||||
"$ref": "#/definitions/ItemStatus"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@@ -1,143 +0,0 @@
# Universal Schema (Single Version, Breaking)

This document defines the canonical universal session + event model. It replaces prior versions; there is no v2. The design prioritizes compatibility with native agent APIs and fills gaps with explicit synthetics.

Principles
- Most-compatible-first: choose semantics that map cleanly to native APIs (Codex/OpenCode/Amp/Claude).
- Uniform behavior: clients should not special-case agents; the daemon normalizes differences.
- Synthetics fill gaps: when a provider lacks a feature (session start/end, deltas, user messages), we synthesize events with `source=daemon`.
- Raw preservation: always keep native payloads in `raw` for agent-sourced events.
- UI coverage: update the inspector/UI to the new schema and ensure UI tests cover all session features (messages, deltas, tools, permissions, questions, errors, termination).

Identifiers
- session_id: daemon-generated session identifier.
- native_session_id: provider thread/session/run identifier (thread_id is merged here).
- item_id: daemon-generated identifier for any universal item.
- native_item_id: provider-native item/message identifier if available; otherwise null.

Event envelope
```json
{
  "event_id": "evt_...",
  "sequence": 42,
  "time": "2026-01-27T19:10:11Z",
  "session_id": "sess_...",
  "native_session_id": "provider_...",
  "synthetic": false,
  "source": "agent|daemon",
  "type": "session.started|session.ended|item.started|item.delta|item.completed|error|permission.requested|permission.resolved|question.requested|question.resolved|agent.unparsed",
  "data": { "..." : "..." },
  "raw": { "..." : "..." }
}
```

Notes:
- `source=agent` for native events; `source=daemon` for synthetics.
- `synthetic` is always present and mirrors whether the event is daemon-produced.
- `raw` is always present. It may be null unless the client opts in to raw payloads; when opt-in is enabled, raw is populated for all events.
- For synthetic events derived from native payloads, include the underlying payload in `raw` when possible.
- Parsing failures emit agent.unparsed (source=daemon, synthetic=true) and should be treated as test failures.

Raw payload opt-in
- Events endpoints accept `include_raw=true` to populate the `raw` field.
- When `include_raw` is not set or false, `raw` is still present but null.
- Applies to both HTTP and SSE event streams.
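
The envelope notes and the raw opt-in rule can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: `Envelope` is a hypothetical, trimmed-down struct, `raw` is held as a plain string instead of a JSON value, and `for_client` is an assumed helper name.

```rust
/// Who produced the event; mirrors the `source` field of the envelope.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventSource {
    Agent,
    Daemon,
}

/// Hypothetical, trimmed-down envelope (assumption: not the real type).
/// `raw` carries the native payload as text; `None` stands for JSON null.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub sequence: u64,
    pub source: EventSource,
    pub synthetic: bool,
    pub raw: Option<String>,
}

impl Envelope {
    /// Builds an envelope so that `synthetic` always mirrors `source == Daemon`.
    pub fn new(sequence: u64, source: EventSource, raw: Option<String>) -> Self {
        Envelope {
            sequence,
            source,
            synthetic: source == EventSource::Daemon,
            raw,
        }
    }

    /// Applies the opt-in rule: without `include_raw`, the field stays
    /// present but is nulled out before the event reaches the client.
    pub fn for_client(mut self, include_raw: bool) -> Self {
        if !include_raw {
            self.raw = None;
        }
        self
    }
}
```

Deriving `synthetic` inside the constructor keeps the two fields from drifting apart, and nulling `raw` at the response boundary lets the HTTP and SSE paths share one stored event.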

Item model
```json
{
  "item_id": "itm_...",
  "native_item_id": "provider_item_...",
  "parent_id": "itm_parent_or_null",
  "kind": "message|tool_call|tool_result|system|status|unknown",
  "role": "user|assistant|system|tool|null",
  "content": [ { "type": "...", "...": "..." } ],
  "status": "in_progress|completed|failed"
}
```

Content parts (non-exhaustive; extend as needed)
- text: `{ "type": "text", "text": "..." }`
- json: `{ "type": "json", "json": { ... } }`
- tool_call: `{ "type": "tool_call", "name": "...", "arguments": "...", "call_id": "..." }`
- tool_result: `{ "type": "tool_result", "call_id": "...", "output": "..." }`
- file_ref: `{ "type": "file_ref", "path": "...", "action": "read|write|patch", "diff": "..." }`
- reasoning: `{ "type": "reasoning", "text": "...", "visibility": "public|private" }`
- image: `{ "type": "image", "path": "...", "mime": "..." }`
- status: `{ "type": "status", "label": "...", "detail": "..." }`

Event types

session.started
```json
{ "metadata": { "...": "..." } }
```

session.ended
```json
{ "reason": "completed|error|terminated", "terminated_by": "agent|daemon" }
```

item.started
```json
{ "item": { ...Item } }
```

item.delta
```json
{ "item_id": "itm_...", "native_item_id": "provider_item_or_null", "delta": "text fragment" }
```

item.completed
```json
{ "item": { ...Item } }
```

error
```json
{ "message": "...", "code": "optional", "details": { "...": "..." } }
```

agent.unparsed
```json
{ "error": "parse failure message", "location": "agent parser name", "raw_hash": "optional" }
```

permission.requested / permission.resolved
```json
{ "permission_id": "...", "action": "...", "status": "requested|approved|denied", "metadata": { "...": "..." } }
```

question.requested / question.resolved
```json
{ "question_id": "...", "prompt": "...", "options": ["..."], "response": "...", "status": "requested|answered|rejected" }
```

Delta policy (uniform across agents)
- Always emit item.delta for messages.
- For agents without native deltas (Claude/Amp), emit a single synthetic delta containing the full final content immediately before item.completed.
- For Codex/OpenCode, forward native deltas as-is and still emit item.completed with the final content.
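
One way to picture the delta policy is the sketch below. It is an illustration under assumptions: `DeltaEvent` and `finish_message` are hypothetical names, and the events are reduced to the fields needed to show the ordering.

```rust
/// Hypothetical, reduced event type used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum DeltaEvent {
    ItemDelta { item_id: String, delta: String, synthetic: bool },
    ItemCompleted { item_id: String, text: String },
}

/// Returns the events to emit when a message item finishes.
/// `native_deltas_seen` is true for providers that already streamed deltas
/// (Codex/OpenCode) and false for providers that did not (Claude/Amp).
pub fn finish_message(item_id: &str, final_text: &str, native_deltas_seen: bool) -> Vec<DeltaEvent> {
    let mut out = Vec::new();
    if !native_deltas_seen {
        // No native deltas: emit one synthetic delta with the full content,
        // placed immediately before the completion event.
        out.push(DeltaEvent::ItemDelta {
            item_id: item_id.to_string(),
            delta: final_text.to_string(),
            synthetic: true,
        });
    }
    // item.completed always carries the final content.
    out.push(DeltaEvent::ItemCompleted {
        item_id: item_id.to_string(),
        text: final_text.to_string(),
    });
    out
}
```

With this shape a client can always render from deltas and treat item.completed as a final consistency check, regardless of provider.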
|
||||
|
||||
User messages
|
||||
- If the provider emits user messages (Codex/OpenCode/Amp), map directly to message items with role=user.
|
||||
- If the provider does not emit user messages (Claude), synthesize user message items from the input we send; mark source=daemon and set native_item_id=null.

Tool normalization
- Tool calls/results are always emitted as their own items (kind=tool_call/tool_result) with parent_id pointing to the originating message item.
- Codex: mcp tool call progress and tool items map directly.
- OpenCode: tool parts in message.part.updated are mapped into tool items with lifecycle states.
- Amp: tool_call/tool_result map directly.
- Claude: synthesize tool items from CLI tool usage where possible; if insufficient, omit tool items and preserve raw payloads.

OpenCode ordering rule
- OpenCode may emit message.part.updated before message.updated.
- When a part delta arrives first, create a stub item.started (source=daemon) for the parent message item, then emit item.delta.
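
The ordering rule can be sketched as a small guard that remembers which items have already been started. This TypeScript version is illustrative only: the `OrderingGuard` class, its method names, and the flattened `StubEvent` shape are hypothetical and are not taken from the daemon's code.

```typescript
// Hypothetical, flattened event shape for illustration.
type StubEvent =
  | { type: "item.started"; source: "daemon"; item_id: string }
  | { type: "item.delta"; item_id: string; delta: string };

class OrderingGuard {
  // Item ids for which item.started has already been emitted.
  private started = new Set<string>();

  // Call when message.updated arrived first and item.started was
  // emitted through the normal path.
  markStarted(itemId: string): void {
    this.started.add(itemId);
  }

  // Call for each part delta. If the parent item has not been started
  // yet, emit a stub item.started (source=daemon) before the delta.
  onPartDelta(itemId: string, delta: string): StubEvent[] {
    const out: StubEvent[] = [];
    if (!this.started.has(itemId)) {
      this.started.add(itemId);
      out.push({ type: "item.started", source: "daemon", item_id: itemId });
    }
    out.push({ type: "item.delta", item_id: itemId, delta });
    return out;
  }
}
```

Tracking started ids in a set means the stub is emitted at most once per item, so consumers always see item.started before the first item.delta.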

Session lifecycle
- If an agent does not emit a session start/end, emit session.started/session.ended synthetically (source=daemon).
- session.ended uses terminated_by=daemon when our termination API is used; terminated_by=agent when the provider ends the session.

Native ID mapping
- native_session_id is the only provider session identifier.
- native_item_id preserves the provider item/message id when available; otherwise null.
- item_id is always daemon-generated.
118
todo.md
@ -1,116 +1,4 @@
# TODO (from spec.md)
# Todo

## Universal API + Types
- [x] Define universal base types for agent input/output (common denominator across schemas)
- [x] Add universal question + permission types (HITL) and ensure they are supported end-to-end
- [x] Define `UniversalEvent` + `UniversalEventData` union and `AgentError` shape
- [x] Define a universal message type for "failed to parse" with raw JSON payload
- [x] Implement 2-way converters:
  - [x] Universal input message <-> agent-specific input
  - [x] Universal event <-> agent-specific event
- [x] Normalize Claude system/init events into universal started events
- [x] Support Codex CLI type-based event format in universal converter
- [x] Enforce agentMode vs permissionMode semantics + defaults at the API boundary
- [x] Ensure session id vs agentSessionId semantics are respected and surfaced consistently

## Daemon (Rust HTTP server)
- [x] Build axum router + utoipa + schemars integration
- [x] Implement RFC 7807 Problem Details error responses backed by a `thiserror` enum
- [x] Implement canonical error `type` values + required error variants from spec
- [x] Implement offset semantics for events (exclusive last-seen id, default offset 0)
- [x] Implement SSE endpoint for events with same semantics as JSON endpoint
- [x] Replace in-memory session store with sandbox session manager (questions/permissions routing, long-lived processes)
- [x] Remove legacy token header support
- [x] Embed inspector frontend and serve it at `/ui`
- [x] Log inspector URL when starting the HTTP server

## CLI
- [x] Implement clap CLI flags: `--token`, `--no-token`, `--host`, `--port`, CORS flags
- [x] Implement a CLI endpoint for every HTTP endpoint
- [x] Update `CLAUDE.md` to keep CLI endpoints in sync with HTTP API changes
- [x] Prefix CLI API requests with `/v1`
- [x] Add CLI credentials extractor subcommand
- [x] Move daemon startup to `server` subcommand
- [x] Add `sandbox-daemon` CLI alias

## HTTP API Endpoints
- [x] POST `/agents/{}/install` with `reinstall` handling
- [x] GET `/agents/{}/modes` (mode discovery or hardcoded)
- [x] GET `/agents` (installed/version/path; version checked at request time)
- [x] POST `/sessions/{}` (create session, install if needed, return health + agentSessionId)
- [x] POST `/sessions/{}/messages` (send prompt)
- [x] GET `/sessions/{}/events` (pagination with offset/limit)
- [x] GET `/sessions/{}/events/sse` (streaming)
- [x] POST `/sessions/{}/questions/{questionId}/reply`
- [x] POST `/sessions/{}/questions/{questionId}/reject`
- [x] POST `/sessions/{}/permissions/{permissionId}/reply`
- [x] Prefix all HTTP API endpoints with `/v1`

## Agent Management
- [x] Implement install/version/spawn basics for Claude/Codex/OpenCode/Amp
- [x] Implement agent install URL patterns + platform mappings for supported OS/arch
- [x] Parse JSONL output for subprocess agents and extract session/result metadata
- [x] Migrate Codex subprocess to App Server JSON-RPC protocol
- [x] Map permissionMode to agent CLI flags (Claude/Codex/Amp)
- [x] Implement session resume flags for Claude/OpenCode/Amp (Codex unsupported)
- [x] Replace sandbox-agent core agent modules with new agent-management crate (delete originals)
- [x] Stabilize agent-management crate API and fix build issues (sandbox-agent currently wired to WIP crate)
- [x] Implement OpenCode shared server lifecycle (`opencode serve`, health, restart)
- [x] Implement OpenCode HTTP session APIs + SSE event stream integration
- [x] Implement JSONL parsing for subprocess agents and map to `UniversalEvent`
- [x] Capture agent session id from events and expose as `agentSessionId`
- [x] Handle agent process exit and map to `agent_process_exited` error
- [x] Implement agentMode discovery rules (OpenCode API, hardcoded others)
- [x] Enforce permissionMode behavior (default/plan/bypass) for subprocesses

## Credentials
- [x] Implement credential extraction module (Claude/Codex/OpenCode)
- [x] Add Amp credential extraction (config-based)
- [x] Move credential extraction into `agent-credentials` crate
- [ ] Pass extracted credentials into subprocess env vars per agent
- [ ] Ensure OpenCode server reads credentials from config on startup

## Testing
- [ ] Build a universal agent test suite that exercises all features (messages, questions, permissions, etc.) using HTTP API
- [ ] Run the full suite against every agent (Claude/Codex/OpenCode/Amp) without mocks
- [x] Add real install/version/spawn tests for Claude/Codex/OpenCode (Amp conditional)
- [x] Expand agent lifecycle tests (reinstall, session id extraction, resume, plan mode)
- [x] Add OpenCode server-mode tests (session create, prompt, SSE)
- [ ] Add tests for question/permission flows using deterministic prompts
- [x] Add HTTP/SSE snapshot tests for real agents (env-configured)
- [x] Add snapshot coverage for auth, CORS, and concurrent sessions
- [x] Add inspector UI route test

## Frontend (frontend/packages/inspector)
- [x] Build Vite + React app with connect screen (endpoint + optional token)
- [x] Add instructions to run sandbox-agent (including CORS)
- [x] Implement full agent UI covering all features
- [x] Add HTTP request log with copyable curl command
- [x] Add Content-Type header to CORS callout command
- [x] Default inspector endpoint to current origin and auto-connect via health check
- [x] Update inspector to universal schema events (items, deltas, approvals, errors)

## TypeScript SDK
- [x] Generate OpenAPI from utoipa and run `openapi-typescript`
- [x] Implement a thin fetch-based client wrapper
- [x] Update `CLAUDE.md` to require SDK + CLI updates when API changes
- [x] Prefix SDK requests with `/v1`

## Examples + Tests
- [ ] Add examples for Docker, E2B, Daytona, Vercel Sandboxes, Cloudflare Sandboxes
- [ ] Add Vitest unit test for each example (Cloudflare requires special setup)

## Documentation
- [ ] Write README covering architecture, agent compatibility, and deployment guide
- [ ] Add universal API feature checklist (questions, approve plan, etc.)
- [ ] Document CLI, HTTP API, frontend app, and TypeScript SDK usage
- [ ] Use collapsible sections for endpoints and SDK methods
- [x] Integrate OpenAPI spec with Mintlify (docs/openapi.json + validation)

---

- [x] implement release pipeline
- implement e2b example
- implement typescript "start locally" by pulling from server using version
- [x] Move agent schema sources to src/agents
- [x] Add Vercel AI SDK UIMessage schema extractor
- [x] Add server --mock mode with looping mock session events.
- [x] Document mock mode in building chat UI docs and update CLAUDE.md guidance.