This commit is contained in:
NathanFlurry 2026-02-11 14:47:41 +00:00
parent 70287ec471
commit e72eb9f611
No known key found for this signature in database
GPG key ID: 6A5F43A4F3241BCA
264 changed files with 18559 additions and 51021 deletions


@@ -1,122 +1,40 @@
# Server
# Server Instructions
See [ARCHITECTURE.md](./ARCHITECTURE.md) for detailed architecture documentation covering the daemon, agent schema pipeline, session management, agent execution patterns, and SDK modes.
## ACP v2 Architecture
## Skill Source Installation
- Public API routes are defined in `server/packages/sandbox-agent/src/router.rs`.
- ACP runtime/process bridge is in `server/packages/sandbox-agent/src/acp_runtime.rs`.
- `/v2` is the only active API surface for sessions/prompts (`/v2/rpc`).
- Keep binary filesystem transfer endpoints as dedicated HTTP APIs:
  - `GET /v2/fs/file`
  - `PUT /v2/fs/file`
  - `POST /v2/fs/upload-batch`
- Rationale: these endpoints need host-owned, cross-agent-consistent behavior and must stream large binary payloads, which ACP JSON-RPC is not suited to do efficiently.
- Maintain ACP variants in parallel only when they share the same underlying filesystem implementation; SDK defaults should still prefer HTTP for large/binary transfers.
- `/v1/*` must remain hard-removed (`410`) and `/opencode/*` stays disabled (`503`) until Phase 7.
- Agent install logic (native + ACP agent process + lazy install) is handled by `server/packages/agent-management/`.
Skills are installed via `skills.sources` in the session create request. The [vercel-labs/skills](https://github.com/vercel-labs/skills) repo (`~/misc/skills`) provides reference for skill installation patterns and source parsing logic. The server handles fetching GitHub repos (via zip download) and git repos (via clone) to `~/.sandbox-agent/skills-cache/`, discovering `SKILL.md` files, and symlinking into agent skill roots.
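The fetch-and-discover flow described above can be sketched minimally. This is an illustrative, std-only directory walk; the function name and cache layout here are assumptions for the example, not the server's actual code:

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Walk a skills cache directory and collect every SKILL.md manifest,
// mirroring the discovery step before symlinking into agent skill roots.
fn discover_skill_manifests(root: &Path) -> Vec<PathBuf> {
    let mut found = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        let Ok(entries) = fs::read_dir(&dir) else { continue };
        for entry in entries.flatten() {
            let path = entry.path();
            if path.is_dir() {
                stack.push(path);
            } else if path.file_name().map_or(false, |n| n == "SKILL.md") {
                found.push(path);
            }
        }
    }
    found
}

fn main() {
    let dir = std::env::temp_dir().join("skills-cache-demo").join("my-skill");
    fs::create_dir_all(&dir).unwrap();
    fs::write(dir.join("SKILL.md"), "# demo skill").unwrap();
    let manifests = discover_skill_manifests(&dir);
    assert_eq!(manifests.len(), 1);
}
```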
## API Contract Rules
# Server Testing
- Every `#[utoipa::path(...)]` handler needs a summary line + description lines in its doc comment.
- Every `responses(...)` entry must include `description`.
- Regenerate `docs/openapi.json` after endpoint contract changes.
- Keep CLI and HTTP endpoint behavior aligned (`docs/cli.mdx`).
## Test placement
## Tests
Place all new tests under `server/packages/**/tests/` (or a package-specific `tests/` folder). Avoid inline tests inside source files unless there is no viable alternative.
Primary v2 integration coverage:
- `server/packages/sandbox-agent/tests/v2_api.rs`
- `server/packages/sandbox-agent/tests/v2_agent_process_matrix.rs`
## Test locations (overview)
- Sandbox-agent integration tests live under `server/packages/sandbox-agent/tests/`:
  - Agent flow coverage in `agent-flows/`
  - Agent management coverage in `agent-management/`
  - Shared server manager coverage in `server-manager/`
  - HTTP endpoint snapshots in `http/` (snapshots in `http/snapshots/`)
  - Session feature coverage snapshots in `sessions/` (one file per feature, e.g. `session_lifecycle.rs`, `permissions.rs`, `questions.rs`, `reasoning.rs`, `status.rs`; snapshots in `sessions/snapshots/`)
  - UI coverage in `ui/`
  - Shared helpers in `common/`
- Extracted agent schema roundtrip tests live under `server/packages/extracted-agent-schemas/tests/`
## Snapshot tests
HTTP endpoint snapshot entrypoint:
- `server/packages/sandbox-agent/tests/http_endpoints.rs`
Session snapshot entrypoint:
- `server/packages/sandbox-agent/tests/sessions.rs`
Snapshots are written to:
- `server/packages/sandbox-agent/tests/http/snapshots/` (HTTP endpoint snapshots)
- `server/packages/sandbox-agent/tests/sessions/snapshots/` (session/feature coverage snapshots)
## Agent selection
`SANDBOX_TEST_AGENTS` controls which agents run. It accepts a comma-separated list or `all`.
If it is **not set**, tests will auto-detect installed agents by checking:
- binaries on `PATH`, and
- the default install dir (`$XDG_DATA_HOME/sandbox-agent/bin` or `./.sandbox-agent/bin`)
If no agents are found, tests fail with a clear error.
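That detection order can be sketched as follows, under the assumption it checks `PATH` first and then the default install dirs listed above; the names here are illustrative, not the real test-harness code:

```rust
use std::env;
use std::path::PathBuf;

// Default install directories checked after PATH, per the doc above.
fn candidate_install_dirs() -> Vec<PathBuf> {
    let mut dirs = Vec::new();
    if let Ok(xdg) = env::var("XDG_DATA_HOME") {
        dirs.push(PathBuf::from(xdg).join("sandbox-agent").join("bin"));
    }
    dirs.push(PathBuf::from("./.sandbox-agent/bin"));
    dirs
}

// Returns the first location where the agent binary is found, or None.
fn find_agent_binary(name: &str) -> Option<PathBuf> {
    // 1. Check every directory on PATH.
    if let Ok(path) = env::var("PATH") {
        for dir in env::split_paths(&path) {
            let candidate = dir.join(name);
            if candidate.is_file() {
                return Some(candidate);
            }
        }
    }
    // 2. Fall back to the default install dirs.
    candidate_install_dirs()
        .into_iter()
        .map(|d| d.join(name))
        .find(|c| c.is_file())
}

fn main() {
    // A made-up binary name should not be found anywhere.
    assert!(find_agent_binary("definitely-not-an-agent-xyz").is_none());
}
```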
## Credential handling
Credentials are pulled from the host by default via `extract_all_credentials`:
- environment variables (e.g. `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)
- local CLI configs (Claude/Codex/Amp/OpenCode)
You can override host credentials for tests with:
- `SANDBOX_TEST_ANTHROPIC_API_KEY`
- `SANDBOX_TEST_OPENAI_API_KEY`
If `SANDBOX_TEST_AGENTS` includes an agent that requires a provider credential and it is missing,
tests fail before starting.
## Credential health checks
Before running agent tests, credentials are validated with minimal API calls:
- Anthropic: `GET https://api.anthropic.com/v1/models`
  - `x-api-key` for API keys
  - `Authorization: Bearer` for OAuth tokens
  - `anthropic-version: 2023-06-01`
- OpenAI: `GET https://api.openai.com/v1/models` with `Authorization: Bearer`
401/403 yields a hard failure (`invalid credentials`). Other non-2xx responses or network
errors fail with a health-check error.
Health checks run in a blocking thread to avoid Tokio runtime drop errors inside async tests.
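The status-code handling above (2xx passes, 401/403 is a hard credential failure, anything else is a health-check error) can be captured as a small pure function. The type and function names are illustrative, not the real test code:

```rust
#[derive(Debug, PartialEq)]
enum HealthOutcome {
    Ok,
    InvalidCredentials,
    HealthCheckError,
}

// Classify an HTTP status from the provider's /v1/models endpoint.
// Network errors are mapped by the caller to HealthCheckError as well.
fn classify_health_status(status: u16) -> HealthOutcome {
    match status {
        200..=299 => HealthOutcome::Ok,
        401 | 403 => HealthOutcome::InvalidCredentials,
        _ => HealthOutcome::HealthCheckError,
    }
}

fn main() {
    assert_eq!(classify_health_status(200), HealthOutcome::Ok);
    assert_eq!(classify_health_status(401), HealthOutcome::InvalidCredentials);
    assert_eq!(classify_health_status(502), HealthOutcome::HealthCheckError);
}
```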
## Snapshot stability
To keep snapshots deterministic:
- Use the mock agent as the **master** event sequence; all other agents must match its behavior 1:1.
- Snapshots should compare a **canonical event skeleton** (event order matters) with strict ordering across:
  - `item.started` → `item.delta` → `item.completed`
  - presence/absence of `session.ended`
  - permission/question request and resolution flows
- Scrub non-deterministic fields from snapshots:
  - IDs, timestamps, native IDs
  - text content, tool inputs/outputs, provider-specific metadata
  - `source` and `synthetic` flags (these are implementation details)
- Scrub `reasoning` and `status` content from session-baseline snapshots to keep the core event skeleton consistent across agents; validate those content types separately in their feature-coverage-specific tests.
- The sandbox-agent is responsible for emitting **synthetic events** so that real agents match the mock sequence exactly.
- Event streams are truncated after the first assistant or error event.
- Permission flow snapshots are truncated after the permission request (or first assistant) event.
- Unknown events are preserved as `kind: unknown` (raw payload in universal schema).
- Prefer snapshot-based event skeleton assertions over manual event-order assertions in tests.
- **Never update snapshots based on any agent that is not the mock agent.** The mock agent is the source of truth for snapshots; other agents must be compared against the mock snapshots without regenerating them.
- Agent-specific endpoints keep per-agent snapshots; any session-related snapshots must use the mock baseline as the single source of truth.
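The scrubbing rules above amount to a recursive walk that replaces non-deterministic fields with a placeholder. This sketch uses a minimal std-only stand-in for a JSON value (the real tests operate on `serde_json::Value`), and the scrubbed key set here is illustrative:

```rust
use std::collections::BTreeMap;

// Minimal JSON stand-in so the sketch stays self-contained.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    String(String),
    Object(BTreeMap<String, Json>),
    Array(Vec<Json>),
}

// Illustrative subset of the non-deterministic fields listed above.
const SCRUBBED_KEYS: &[&str] = &["id", "native_id", "timestamp", "text", "source", "synthetic"];

// Recursively replace scrubbed fields while keeping the event skeleton.
fn scrub(value: &mut Json) {
    match value {
        Json::Object(map) => {
            for (key, child) in map.iter_mut() {
                if SCRUBBED_KEYS.contains(&key.as_str()) {
                    *child = Json::String("[scrubbed]".to_string());
                } else {
                    scrub(child);
                }
            }
        }
        Json::Array(items) => items.iter_mut().for_each(scrub),
        Json::String(_) => {}
    }
}

fn main() {
    let mut event = Json::Object(BTreeMap::from([
        ("id".to_string(), Json::String("item_123".to_string())),
        ("kind".to_string(), Json::String("item.started".to_string())),
    ]));
    scrub(&mut event);
    if let Json::Object(map) = &event {
        // The ID is scrubbed; the event kind (skeleton) is preserved.
        assert_eq!(map["id"], Json::String("[scrubbed]".to_string()));
        assert_eq!(map["kind"], Json::String("item.started".to_string()));
    }
}
```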
## Typical commands
Run only Claude session snapshots:
```
SANDBOX_TEST_AGENTS=claude cargo test -p sandbox-agent --test sessions
```
Run:
```bash
cargo test -p sandbox-agent --test v2_api
cargo test -p sandbox-agent --test v2_agent_process_matrix
```
Run all detected session snapshots:
```
cargo test -p sandbox-agent --test sessions
```
## Migration Docs Sync
Run HTTP endpoint snapshots:
```
cargo test -p sandbox-agent --test http_endpoints
```
## Universal Schema
When modifying agent conversion code in `server/packages/universal-agent-schema/src/agents/` or adding/changing properties on the universal schema, update the feature matrix in `README.md` to reflect which agents support which features.
## Feature coverage sync
When updating agent feature coverage (flags or values), keep them in sync across:
- `README.md` (feature matrix / documented support)
- server Rust implementation (`AgentCapabilities` + `agent_capabilities_for`)
- frontend feature coverage views/badges (Inspector UI)
- Keep `research/acp/spec.md` as the source spec.
- Update `research/acp/todo.md` when scope/status changes.
- Log blockers/decisions in `research/acp/friction.md`.


@@ -9,7 +9,6 @@ repository.workspace = true
[dependencies]
sandbox-agent-agent-credentials.workspace = true
sandbox-agent-extracted-agent-schemas.workspace = true
thiserror.workspace = true
serde.workspace = true
serde_json.workspace = true


@@ -8,12 +8,15 @@ use utoipa::ToSchema;
#[serde(rename_all = "snake_case")]
pub enum ErrorType {
InvalidRequest,
Conflict,
UnsupportedAgent,
AgentNotInstalled,
InstallFailed,
AgentProcessExited,
TokenInvalid,
PermissionDenied,
NotAcceptable,
UnsupportedMediaType,
SessionNotFound,
SessionAlreadyExists,
ModeNotSupported,
@@ -25,12 +28,15 @@ impl ErrorType {
pub fn as_urn(&self) -> &'static str {
match self {
Self::InvalidRequest => "urn:sandbox-agent:error:invalid_request",
Self::Conflict => "urn:sandbox-agent:error:conflict",
Self::UnsupportedAgent => "urn:sandbox-agent:error:unsupported_agent",
Self::AgentNotInstalled => "urn:sandbox-agent:error:agent_not_installed",
Self::InstallFailed => "urn:sandbox-agent:error:install_failed",
Self::AgentProcessExited => "urn:sandbox-agent:error:agent_process_exited",
Self::TokenInvalid => "urn:sandbox-agent:error:token_invalid",
Self::PermissionDenied => "urn:sandbox-agent:error:permission_denied",
Self::NotAcceptable => "urn:sandbox-agent:error:not_acceptable",
Self::UnsupportedMediaType => "urn:sandbox-agent:error:unsupported_media_type",
Self::SessionNotFound => "urn:sandbox-agent:error:session_not_found",
Self::SessionAlreadyExists => "urn:sandbox-agent:error:session_already_exists",
Self::ModeNotSupported => "urn:sandbox-agent:error:mode_not_supported",
@@ -42,12 +48,15 @@ impl ErrorType {
pub fn title(&self) -> &'static str {
match self {
Self::InvalidRequest => "Invalid Request",
Self::Conflict => "Conflict",
Self::UnsupportedAgent => "Unsupported Agent",
Self::AgentNotInstalled => "Agent Not Installed",
Self::InstallFailed => "Install Failed",
Self::AgentProcessExited => "Agent Process Exited",
Self::TokenInvalid => "Token Invalid",
Self::PermissionDenied => "Permission Denied",
Self::NotAcceptable => "Not Acceptable",
Self::UnsupportedMediaType => "Unsupported Media Type",
Self::SessionNotFound => "Session Not Found",
Self::SessionAlreadyExists => "Session Already Exists",
Self::ModeNotSupported => "Mode Not Supported",
@@ -59,12 +68,15 @@ impl ErrorType {
pub fn status_code(&self) -> u16 {
match self {
Self::InvalidRequest => 400,
Self::Conflict => 409,
Self::UnsupportedAgent => 400,
Self::AgentNotInstalled => 404,
Self::InstallFailed => 500,
Self::AgentProcessExited => 500,
Self::TokenInvalid => 401,
Self::PermissionDenied => 403,
Self::NotAcceptable => 406,
Self::UnsupportedMediaType => 415,
Self::SessionNotFound => 404,
Self::SessionAlreadyExists => 409,
Self::ModeNotSupported => 400,
@@ -118,6 +130,8 @@ pub struct AgentError {
pub enum SandboxError {
#[error("invalid request: {message}")]
InvalidRequest { message: String },
#[error("conflict: {message}")]
Conflict { message: String },
#[error("unsupported agent: {agent}")]
UnsupportedAgent { agent: String },
#[error("agent not installed: {agent}")]
@@ -137,6 +151,10 @@ pub enum SandboxError {
TokenInvalid { message: Option<String> },
#[error("permission denied")]
PermissionDenied { message: Option<String> },
#[error("not acceptable: {message}")]
NotAcceptable { message: String },
#[error("unsupported media type: {message}")]
UnsupportedMediaType { message: String },
#[error("session not found: {session_id}")]
SessionNotFound { session_id: String },
#[error("session already exists: {session_id}")]
@@ -153,12 +171,15 @@ impl SandboxError {
pub fn error_type(&self) -> ErrorType {
match self {
Self::InvalidRequest { .. } => ErrorType::InvalidRequest,
Self::Conflict { .. } => ErrorType::Conflict,
Self::UnsupportedAgent { .. } => ErrorType::UnsupportedAgent,
Self::AgentNotInstalled { .. } => ErrorType::AgentNotInstalled,
Self::InstallFailed { .. } => ErrorType::InstallFailed,
Self::AgentProcessExited { .. } => ErrorType::AgentProcessExited,
Self::TokenInvalid { .. } => ErrorType::TokenInvalid,
Self::PermissionDenied { .. } => ErrorType::PermissionDenied,
Self::NotAcceptable { .. } => ErrorType::NotAcceptable,
Self::UnsupportedMediaType { .. } => ErrorType::UnsupportedMediaType,
Self::SessionNotFound { .. } => ErrorType::SessionNotFound,
Self::SessionAlreadyExists { .. } => ErrorType::SessionAlreadyExists,
Self::ModeNotSupported { .. } => ErrorType::ModeNotSupported,
@@ -170,6 +191,11 @@ impl SandboxError {
pub fn to_agent_error(&self) -> AgentError {
let (agent, session_id, details) = match self {
Self::InvalidRequest { .. } => (None, None, None),
Self::Conflict { message } => {
let mut map = Map::new();
map.insert("message".to_string(), Value::String(message.clone()));
(None, None, Some(Value::Object(map)))
}
Self::UnsupportedAgent { agent } => (Some(agent.clone()), None, None),
Self::AgentNotInstalled { agent } => (Some(agent.clone()), None, None),
Self::InstallFailed { agent, stderr } => {
@@ -228,6 +254,16 @@ impl SandboxError {
});
(None, None, details)
}
Self::NotAcceptable { message } => {
let mut map = Map::new();
map.insert("message".to_string(), Value::String(message.clone()));
(None, None, Some(Value::Object(map)))
}
Self::UnsupportedMediaType { message } => {
let mut map = Map::new();
map.insert("message".to_string(), Value::String(message.clone()));
(None, None, Some(Value::Object(map)))
}
Self::SessionNotFound { session_id } => (None, Some(session_id.clone()), None),
Self::SessionAlreadyExists { session_id } => (None, Some(session_id.clone()), None),
Self::ModeNotSupported { agent, mode } => {


@@ -1,21 +0,0 @@
[package]
name = "sandbox-agent-extracted-agent-schemas"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true
description.workspace = true
repository.workspace = true
[dependencies]
serde.workspace = true
serde_json.workspace = true
regress.workspace = true
chrono.workspace = true
[build-dependencies]
typify.workspace = true
serde_json.workspace = true
schemars.workspace = true
prettyplease.workspace = true
syn.workspace = true


@@ -1,70 +0,0 @@
use std::fs;
use std::io::{self, Write};
use std::path::Path;
fn main() {
let out_dir = std::env::var("OUT_DIR").unwrap();
let schema_dir = Path::new("../../../resources/agent-schemas/artifacts/json-schema");
let schemas = [
("opencode", "opencode.json"),
("claude", "claude.json"),
("codex", "codex.json"),
("amp", "amp.json"),
("pi", "pi.json"),
];
for (name, file) in schemas {
let schema_path = schema_dir.join(file);
// Tell cargo to rerun if schema changes
emit_stdout(&format!("cargo:rerun-if-changed={}", schema_path.display()));
if !schema_path.exists() {
emit_stdout(&format!(
"cargo:warning=Schema file not found: {}",
schema_path.display()
));
// Write empty module
let out_path = Path::new(&out_dir).join(format!("{}.rs", name));
fs::write(&out_path, "// Schema not found\n").unwrap();
continue;
}
let schema_content = fs::read_to_string(&schema_path)
.unwrap_or_else(|e| panic!("Failed to read {}: {}", schema_path.display(), e));
let schema: schemars::schema::RootSchema = serde_json::from_str(&schema_content)
.unwrap_or_else(|e| panic!("Failed to parse {}: {}", schema_path.display(), e));
let mut type_space = typify::TypeSpace::default();
type_space
.add_root_schema(schema)
.unwrap_or_else(|e| panic!("Failed to process {}: {}", schema_path.display(), e));
let contents = type_space.to_stream();
// Format the generated code
let formatted = prettyplease::unparse(
&syn::parse2(contents.clone())
.unwrap_or_else(|e| panic!("Failed to parse generated code for {}: {}", name, e)),
);
let out_path = Path::new(&out_dir).join(format!("{}.rs", name));
fs::write(&out_path, formatted)
.unwrap_or_else(|e| panic!("Failed to write {}: {}", out_path.display(), e));
// emit_stdout(&format!(
// "cargo:warning=Generated {} types from {}",
// name, file
// ));
}
}
fn emit_stdout(message: &str) {
let mut out = io::stdout();
let _ = out.write_all(message.as_bytes());
let _ = out.write_all(b"\n");
let _ = out.flush();
}


@@ -1,33 +0,0 @@
//! Generated types from AI coding agent JSON schemas.
//!
//! This crate provides Rust types for:
//! - OpenCode SDK
//! - Claude Code SDK
//! - Codex SDK
//! - AMP Code SDK
//! - Pi RPC
pub mod opencode {
//! OpenCode SDK types extracted from OpenAPI 3.1.1 spec.
include!(concat!(env!("OUT_DIR"), "/opencode.rs"));
}
pub mod claude {
//! Claude Code SDK types extracted from TypeScript definitions.
include!(concat!(env!("OUT_DIR"), "/claude.rs"));
}
pub mod codex {
//! Codex SDK types.
include!(concat!(env!("OUT_DIR"), "/codex.rs"));
}
pub mod amp {
//! AMP Code SDK types.
include!(concat!(env!("OUT_DIR"), "/amp.rs"));
}
pub mod pi {
//! Pi RPC types.
include!(concat!(env!("OUT_DIR"), "/pi.rs"));
}


@@ -1,104 +0,0 @@
use sandbox_agent_extracted_agent_schemas::{amp, claude, codex};
#[test]
fn test_claude_bash_input() {
let input = claude::BashInput {
command: "ls -la".to_string(),
timeout: Some(5000.0),
working_directory: None,
};
let json = serde_json::to_string(&input).unwrap();
assert!(json.contains("ls -la"));
let parsed: claude::BashInput = serde_json::from_str(&json).unwrap();
assert_eq!(parsed.command, "ls -la");
}
#[test]
fn test_codex_server_notification() {
let notification = codex::ServerNotification::ItemCompleted(codex::ItemCompletedNotification {
item: codex::ThreadItem::AgentMessage {
id: "msg-123".to_string(),
text: "Hello from Codex".to_string(),
},
thread_id: "thread-123".to_string(),
turn_id: "turn-456".to_string(),
});
let json = serde_json::to_string(&notification).unwrap();
assert!(json.contains("item/completed"));
assert!(json.contains("Hello from Codex"));
assert!(json.contains("agentMessage"));
}
#[test]
fn test_codex_thread_item_variants() {
let user_msg = codex::ThreadItem::UserMessage {
content: vec![codex::UserInput::Text {
text: "Hello".to_string(),
text_elements: vec![],
}],
id: "user-1".to_string(),
};
let json = serde_json::to_string(&user_msg).unwrap();
assert!(json.contains("userMessage"));
assert!(json.contains("Hello"));
let cmd = codex::ThreadItem::CommandExecution {
aggregated_output: Some("output".to_string()),
command: "ls -la".to_string(),
command_actions: vec![],
cwd: "/tmp".to_string(),
duration_ms: Some(100),
exit_code: Some(0),
id: "cmd-1".to_string(),
process_id: None,
status: codex::CommandExecutionStatus::Completed,
};
let json = serde_json::to_string(&cmd).unwrap();
assert!(json.contains("commandExecution"));
assert!(json.contains("ls -la"));
}
#[test]
fn test_amp_message() {
let msg = amp::Message {
role: amp::MessageRole::User,
content: "Hello".to_string(),
tool_calls: vec![],
};
let json = serde_json::to_string(&msg).unwrap();
assert!(json.contains("user"));
assert!(json.contains("Hello"));
}
#[test]
fn test_amp_stream_json_message_types() {
// Test that all new message types can be parsed
let system_msg = r#"{"type":"system","subtype":"init","cwd":"/tmp","session_id":"sess-1","tools":["Bash"],"mcp_servers":[]}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(system_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::System));
let user_msg = r#"{"type":"user","message":{"role":"user","content":"Hello"},"session_id":"sess-1"}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(user_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::User));
let assistant_msg = r#"{"type":"assistant","message":{"role":"assistant","content":"Hi there"},"session_id":"sess-1"}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(assistant_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::Assistant));
let result_msg = r#"{"type":"result","subtype":"success","duration_ms":1000,"is_error":false,"num_turns":1,"result":"Done","session_id":"sess-1"}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(result_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::Result));
// Test legacy types still work
let message_msg = r#"{"type":"message","id":"msg-1","content":"Hello"}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(message_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::Message));
let done_msg = r#"{"type":"done"}"#;
let parsed: amp::StreamJsonMessage = serde_json::from_str(done_msg).unwrap();
assert!(matches!(parsed.type_, amp::StreamJsonMessageType::Done));
}


@@ -15,7 +15,6 @@ path = "src/main.rs"
sandbox-agent-error.workspace = true
sandbox-agent-agent-management.workspace = true
sandbox-agent-agent-credentials.workspace = true
sandbox-agent-universal-agent-schema.workspace = true
thiserror.workspace = true
serde.workspace = true
serde_json.workspace = true
@@ -26,7 +25,7 @@ reqwest.workspace = true
dirs.workspace = true
time.workspace = true
chrono.workspace = true
tokio.workspace = true
tokio = { workspace = true, features = ["process", "io-util", "sync"] }
tokio-stream.workspace = true
tower-http.workspace = true
utoipa.workspace = true
@@ -52,6 +51,7 @@ http-body-util.workspace = true
insta.workspace = true
tower.workspace = true
tempfile.workspace = true
serial_test = "3.2"
[features]
test-utils = ["tempfile"]


@@ -14,9 +14,12 @@ fn main() {
.join("packages")
.join("inspector")
.join("dist");
let inspector_pkg_dir = root_dir.join("frontend").join("packages").join("inspector");
println!("cargo:rerun-if-env-changed=SANDBOX_AGENT_SKIP_INSPECTOR");
println!("cargo:rerun-if-env-changed=SANDBOX_AGENT_VERSION");
// Watch the inspector package directory so Cargo reruns when dist appears/disappears.
println!("cargo:rerun-if-changed={}", inspector_pkg_dir.display());
let dist_exists = dist_dir.exists();
if dist_exists {
println!("cargo:rerun-if-changed={}", dist_dir.display());


@@ -0,0 +1,330 @@
use super::*;
impl SharedAgentBackend {
pub(super) fn new_mock(agent: AgentId) -> Arc<Self> {
Arc::new(Self {
agent,
sender: BackendSender::Mock(new_mock_backend()),
pending_client_responses: Mutex::new(HashMap::new()),
})
}
pub(super) async fn new_process(
agent: AgentId,
launch: AgentProcessLaunchSpec,
runtime: Arc<AcpRuntimeInner>,
) -> Result<Arc<Self>, SandboxError> {
let mut command = Command::new(&launch.program);
command
.args(&launch.args)
.stdin(std::process::Stdio::piped())
.stdout(std::process::Stdio::piped())
.stderr(std::process::Stdio::piped());
for (key, value) in &launch.env {
command.env(key, value);
}
let mut child = command.spawn().map_err(|err| SandboxError::StreamError {
message: format!(
"failed to start ACP agent process {}: {err}",
launch.program.display()
),
})?;
let stdin = child
.stdin
.take()
.ok_or_else(|| SandboxError::StreamError {
message: "failed to capture ACP agent process stdin".to_string(),
})?;
let stdout = child
.stdout
.take()
.ok_or_else(|| SandboxError::StreamError {
message: "failed to capture ACP agent process stdout".to_string(),
})?;
let stderr = child
.stderr
.take()
.ok_or_else(|| SandboxError::StreamError {
message: "failed to capture ACP agent process stderr".to_string(),
})?;
let process = ProcessBackend {
stdin: Arc::new(Mutex::new(stdin)),
child: Arc::new(Mutex::new(child)),
stderr_capture: Arc::new(Mutex::new(StderrCapture::default())),
terminate_requested: Arc::new(AtomicBool::new(false)),
};
let backend = Arc::new(Self {
agent,
sender: BackendSender::Process(process.clone()),
pending_client_responses: Mutex::new(HashMap::new()),
});
backend.start_process_pumps(runtime, stdout, stderr, process);
Ok(backend)
}
pub(super) async fn is_alive(&self) -> bool {
match &self.sender {
BackendSender::Mock(_) => true,
BackendSender::Process(process) => process.is_alive().await,
}
}
pub(super) fn start_process_pumps(
self: &Arc<Self>,
runtime: Arc<AcpRuntimeInner>,
stdout: tokio::process::ChildStdout,
stderr: tokio::process::ChildStderr,
process: ProcessBackend,
) {
let backend = self.clone();
let runtime_stdout = runtime.clone();
tokio::spawn(async move {
let mut lines = BufReader::new(stdout).lines();
while let Ok(Some(line)) = lines.next_line().await {
if line.trim().is_empty() {
continue;
}
let message = match serde_json::from_str::<Value>(&line) {
Ok(message) => message,
Err(err) => json!({
"jsonrpc": "2.0",
"method": AGENT_UNPARSED_METHOD,
"params": {
"error": err.to_string(),
"raw": line,
},
}),
};
runtime_stdout
.handle_backend_message(backend.agent, message)
.await;
}
});
let backend = self.clone();
let stderr_capture = process.clone();
tokio::spawn(async move {
let mut lines = BufReader::new(stderr).lines();
while let Ok(Some(line)) = lines.next_line().await {
stderr_capture.record_stderr_line(line.clone()).await;
tracing::debug!(
agent = %backend.agent,
"ACP agent process stderr: {}",
line
);
}
});
let backend = self.clone();
let runtime_exit = runtime.clone();
tokio::spawn(async move {
loop {
let probe = {
let mut child = process.child.lock().await;
match child.try_wait() {
Ok(Some(status)) => Ok(Some(status)),
Ok(None) => Ok(None),
Err(err) => Err(err.to_string()),
}
};
match probe {
Ok(Some(status)) => {
runtime_exit
.remove_backend_if_same(backend.agent, &backend)
.await;
runtime_exit
.handle_backend_process_exit(
backend.agent,
Some(status),
process.terminated_by(),
process.stderr_output().await,
)
.await;
break;
}
Ok(None) => {
tokio::time::sleep(Duration::from_millis(200)).await;
}
Err(err) => {
runtime_exit
.remove_backend_if_same(backend.agent, &backend)
.await;
runtime_exit
.mark_backend_stopped(
backend.agent,
Some(format!("failed to poll ACP agent process status: {err}")),
)
.await;
break;
}
}
}
});
}
pub(super) async fn send(
&self,
runtime: Arc<AcpRuntimeInner>,
payload: Value,
) -> Result<(), SandboxError> {
match &self.sender {
BackendSender::Process(process) => {
let mut stdin = process.stdin.lock().await;
let encoded =
serde_json::to_vec(&payload).map_err(|err| SandboxError::InvalidRequest {
message: format!("failed to serialize JSON-RPC payload: {err}"),
})?;
if let Err(err) = stdin.write_all(&encoded).await {
let message = format!("failed to write to ACP agent process stdin: {err}");
runtime
.mark_backend_stopped(self.agent, Some(message.clone()))
.await;
return Err(SandboxError::StreamError { message });
}
if let Err(err) = stdin.write_all(b"\n").await {
let message =
format!("failed to write line delimiter to ACP agent process stdin: {err}");
runtime
.mark_backend_stopped(self.agent, Some(message.clone()))
.await;
return Err(SandboxError::StreamError { message });
}
if let Err(err) = stdin.flush().await {
let message = format!("failed to flush ACP agent process stdin: {err}");
runtime
.mark_backend_stopped(self.agent, Some(message.clone()))
.await;
return Err(SandboxError::StreamError { message });
}
Ok(())
}
BackendSender::Mock(mock) => {
let agent = self.agent;
Box::pin(handle_mock_payload(mock, &payload, |message| {
let runtime = runtime.clone();
async move {
runtime.handle_backend_message(agent, message).await;
}
}))
.await
}
}
}
pub(super) async fn shutdown(&self, grace: Duration) {
if let BackendSender::Process(process) = &self.sender {
process.terminate_requested.store(true, Ordering::SeqCst);
tokio::time::sleep(grace).await;
let mut child = process.child.lock().await;
match child.try_wait() {
Ok(Some(_)) => {}
Ok(None) => {
let _ = child.kill().await;
let _ = child.wait().await;
}
Err(_) => {
let _ = child.kill().await;
}
}
}
}
}
impl ProcessBackend {
pub(super) async fn record_stderr_line(&self, line: String) {
self.stderr_capture.lock().await.record(line);
}
pub(super) async fn stderr_output(&self) -> Option<StderrOutput> {
self.stderr_capture.lock().await.snapshot()
}
pub(super) async fn is_alive(&self) -> bool {
let mut child = self.child.lock().await;
matches!(child.try_wait(), Ok(None))
}
pub(super) fn terminated_by(&self) -> TerminatedBy {
if self.terminate_requested.load(Ordering::SeqCst) {
TerminatedBy::Daemon
} else {
TerminatedBy::Agent
}
}
}
impl AcpClient {
pub(super) fn new(id: String, default_agent: AgentId) -> Arc<Self> {
let (sender, _rx) = broadcast::channel(512);
Arc::new(Self {
id,
default_agent,
seq: AtomicU64::new(0),
closed: AtomicBool::new(false),
sse_stream_active: Arc::new(AtomicBool::new(false)),
sender,
ring: Mutex::new(VecDeque::with_capacity(RING_BUFFER_SIZE)),
pending: Mutex::new(HashMap::new()),
})
}
pub(super) async fn push_stream(&self, payload: Value) {
let sequence = self.seq.fetch_add(1, Ordering::SeqCst) + 1;
let message = StreamMessage { sequence, payload };
{
let mut ring = self.ring.lock().await;
ring.push_back(message.clone());
while ring.len() > RING_BUFFER_SIZE {
ring.pop_front();
}
}
let _ = self.sender.send(message);
}
pub(super) async fn subscribe(
&self,
last_event_id: Option<u64>,
) -> (Vec<StreamMessage>, broadcast::Receiver<StreamMessage>) {
let replay = {
let ring = self.ring.lock().await;
ring.iter()
.filter(|message| {
if let Some(last_event_id) = last_event_id {
message.sequence > last_event_id
} else {
true
}
})
.cloned()
.collect::<Vec<_>>()
};
(replay, self.sender.subscribe())
}
pub(super) fn try_claim_sse_stream(&self) -> bool {
self.sse_stream_active
.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
.is_ok()
}
pub(super) fn sse_active_flag(&self) -> Arc<AtomicBool> {
self.sse_stream_active.clone()
}
pub(super) async fn close(&self) {
if self.closed.swap(true, Ordering::SeqCst) {
return;
}
self.sse_stream_active.store(false, Ordering::SeqCst);
self.pending.lock().await.clear();
}
}


@@ -0,0 +1,123 @@
use super::*;
// Canonical extension namespace used for ACP _meta values.
pub(super) const SANDBOX_META_KEY: &str = "sandboxagent.dev";
// _meta[sandboxagent.dev].extensions key in initialize response.
pub(super) const EXTENSIONS_META_KEY: &str = "extensions";
// _meta[sandboxagent.dev].extensions.sessionDetach => method _sandboxagent/session/detach
pub(super) const EXTENSION_KEY_SESSION_DETACH: &str = "sessionDetach";
// _meta[sandboxagent.dev].extensions.sessionTerminate => method _sandboxagent/session/terminate
pub(super) const EXTENSION_KEY_SESSION_TERMINATE: &str = "sessionTerminate";
// _meta[sandboxagent.dev].extensions.sessionEndedNotification => method _sandboxagent/session/ended
pub(super) const EXTENSION_KEY_SESSION_ENDED_NOTIFICATION: &str = "sessionEndedNotification";
// _meta[sandboxagent.dev].extensions.sessionListModels => method _sandboxagent/session/list_models
pub(super) const EXTENSION_KEY_SESSION_LIST_MODELS: &str = "sessionListModels";
// _meta[sandboxagent.dev].extensions.sessionSetMetadata => method _sandboxagent/session/set_metadata
pub(super) const EXTENSION_KEY_SESSION_SET_METADATA: &str = "sessionSetMetadata";
// _meta[sandboxagent.dev].extensions.sessionAgentMeta => session/new + initialize require _meta[sandboxagent.dev].agent
pub(super) const EXTENSION_KEY_SESSION_AGENT_META: &str = "sessionAgentMeta";
// _meta[sandboxagent.dev].extensions.agentList => method _sandboxagent/agent/list
pub(super) const EXTENSION_KEY_AGENT_LIST: &str = "agentList";
// _meta[sandboxagent.dev].extensions.agentInstall => method _sandboxagent/agent/install
pub(super) const EXTENSION_KEY_AGENT_INSTALL: &str = "agentInstall";
// _meta[sandboxagent.dev].extensions.sessionList => method _sandboxagent/session/list
pub(super) const EXTENSION_KEY_SESSION_LIST: &str = "sessionList";
// _meta[sandboxagent.dev].extensions.sessionGet => method _sandboxagent/session/get
pub(super) const EXTENSION_KEY_SESSION_GET: &str = "sessionGet";
// _meta[sandboxagent.dev].extensions.fsListEntries => method _sandboxagent/fs/list_entries
pub(super) const EXTENSION_KEY_FS_LIST_ENTRIES: &str = "fsListEntries";
// _meta[sandboxagent.dev].extensions.fsReadFile => method _sandboxagent/fs/read_file
pub(super) const EXTENSION_KEY_FS_READ_FILE: &str = "fsReadFile";
// _meta[sandboxagent.dev].extensions.fsWriteFile => method _sandboxagent/fs/write_file
pub(super) const EXTENSION_KEY_FS_WRITE_FILE: &str = "fsWriteFile";
// _meta[sandboxagent.dev].extensions.fsDeleteEntry => method _sandboxagent/fs/delete_entry
pub(super) const EXTENSION_KEY_FS_DELETE_ENTRY: &str = "fsDeleteEntry";
// _meta[sandboxagent.dev].extensions.fsMkdir => method _sandboxagent/fs/mkdir
pub(super) const EXTENSION_KEY_FS_MKDIR: &str = "fsMkdir";
// _meta[sandboxagent.dev].extensions.fsMove => method _sandboxagent/fs/move
pub(super) const EXTENSION_KEY_FS_MOVE: &str = "fsMove";
// _meta[sandboxagent.dev].extensions.fsStat => method _sandboxagent/fs/stat
pub(super) const EXTENSION_KEY_FS_STAT: &str = "fsStat";
// _meta[sandboxagent.dev].extensions.fsUploadBatch => method _sandboxagent/fs/upload_batch
pub(super) const EXTENSION_KEY_FS_UPLOAD_BATCH: &str = "fsUploadBatch";
// _meta[sandboxagent.dev].extensions.methods => list of supported extension methods
pub(super) const EXTENSION_KEY_METHODS: &str = "methods";
pub(super) fn extract_sandbox_session_meta(payload: &Value) -> Option<Map<String, Value>> {
payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("_meta"))
.and_then(Value::as_object)
.and_then(|meta| meta.get(SANDBOX_META_KEY))
.and_then(Value::as_object)
.cloned()
}
pub(super) fn inject_extension_capabilities(payload: &mut Value) {
let Some(result) = payload.get_mut("result").and_then(Value::as_object_mut) else {
return;
};
let Some(agent_capabilities) = result
.get_mut("agentCapabilities")
.and_then(Value::as_object_mut)
else {
return;
};
let meta = agent_capabilities
.entry("_meta".to_string())
.or_insert_with(|| Value::Object(Map::new()));
let Some(meta_object) = meta.as_object_mut() else {
return;
};
let sandbox = meta_object
.entry(SANDBOX_META_KEY.to_string())
.or_insert_with(|| Value::Object(Map::new()));
let Some(sandbox_object) = sandbox.as_object_mut() else {
return;
};
sandbox_object.insert(
EXTENSIONS_META_KEY.to_string(),
json!({
EXTENSION_KEY_SESSION_DETACH: true,
EXTENSION_KEY_SESSION_TERMINATE: true,
EXTENSION_KEY_SESSION_ENDED_NOTIFICATION: true,
EXTENSION_KEY_SESSION_LIST_MODELS: true,
EXTENSION_KEY_SESSION_SET_METADATA: true,
EXTENSION_KEY_SESSION_AGENT_META: true,
EXTENSION_KEY_AGENT_LIST: true,
EXTENSION_KEY_AGENT_INSTALL: true,
EXTENSION_KEY_SESSION_LIST: true,
EXTENSION_KEY_SESSION_GET: true,
EXTENSION_KEY_FS_LIST_ENTRIES: true,
EXTENSION_KEY_FS_READ_FILE: true,
EXTENSION_KEY_FS_WRITE_FILE: true,
EXTENSION_KEY_FS_DELETE_ENTRY: true,
EXTENSION_KEY_FS_MKDIR: true,
EXTENSION_KEY_FS_MOVE: true,
EXTENSION_KEY_FS_STAT: true,
EXTENSION_KEY_FS_UPLOAD_BATCH: true,
EXTENSION_KEY_METHODS: [
SESSION_DETACH_METHOD,
SESSION_TERMINATE_METHOD,
SESSION_ENDED_METHOD,
SESSION_LIST_MODELS_METHOD,
SESSION_SET_METADATA_METHOD,
AGENT_LIST_METHOD,
AGENT_INSTALL_METHOD,
SESSION_LIST_METHOD,
SESSION_GET_METHOD,
FS_LIST_ENTRIES_METHOD,
FS_READ_FILE_METHOD,
FS_WRITE_FILE_METHOD,
FS_DELETE_ENTRY_METHOD,
FS_MKDIR_METHOD,
FS_MOVE_METHOD,
FS_STAT_METHOD,
FS_UPLOAD_BATCH_METHOD,
]
}),
);
}
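For reference, a successful `initialize` response ends up shaped roughly like this after injection (a trimmed, hand-written sketch: only a few of the keys from the code above are shown, and the method strings follow the naming pattern in the constant comments, e.g. `_sandboxagent/session/terminate`):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "agentCapabilities": {
      "_meta": {
        "sandboxagent.dev": {
          "extensions": {
            "sessionDetach": true,
            "sessionTerminate": true,
            "fsReadFile": true,
            "methods": [
              "_sandboxagent/session/terminate",
              "_sandboxagent/fs/read_file"
            ]
          }
        }
      }
    }
  }
}
```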

File diff suppressed because it is too large


@ -0,0 +1,573 @@
use super::*;
pub(super) fn validate_jsonrpc_envelope(payload: &Value) -> Result<(), SandboxError> {
let object = payload
.as_object()
.ok_or_else(|| SandboxError::InvalidRequest {
message: "JSON-RPC payload must be an object".to_string(),
})?;
let Some(jsonrpc) = object.get("jsonrpc").and_then(Value::as_str) else {
return Err(SandboxError::InvalidRequest {
message: "JSON-RPC payload must include jsonrpc field".to_string(),
});
};
if jsonrpc != "2.0" {
return Err(SandboxError::InvalidRequest {
message: "jsonrpc must be '2.0'".to_string(),
});
}
let has_method = object.get("method").is_some();
let has_id = object.get("id").is_some();
let has_result_or_error = object.get("result").is_some() || object.get("error").is_some();
if !has_method && !has_id {
return Err(SandboxError::InvalidRequest {
message: "JSON-RPC payload must include either method or id".to_string(),
});
}
if has_method && has_result_or_error {
return Err(SandboxError::InvalidRequest {
message: "JSON-RPC request/notification must not include result or error".to_string(),
});
}
Ok(())
}
pub(super) fn required_sandbox_agent_meta(
payload: &Value,
method: &str,
) -> Result<AgentId, SandboxError> {
let Some(agent) = payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("_meta"))
.and_then(Value::as_object)
.and_then(|meta| meta.get(SANDBOX_META_KEY))
.and_then(Value::as_object)
.and_then(|sandbox| sandbox.get("agent"))
.and_then(Value::as_str)
.map(str::trim)
.filter(|value| !value.is_empty())
else {
return Err(SandboxError::InvalidRequest {
message: format!("{method} requires params._meta[\"{SANDBOX_META_KEY}\"].agent"),
});
};
AgentId::parse(agent).ok_or_else(|| SandboxError::UnsupportedAgent {
agent: agent.to_string(),
})
}
pub(super) fn explicit_agent_param(payload: &Value) -> Result<Option<AgentId>, SandboxError> {
let Some(agent_value) = payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("agent"))
else {
return Ok(None);
};
let Some(agent_name) = agent_value.as_str() else {
return Err(SandboxError::InvalidRequest {
message: "params.agent must be a string".to_string(),
});
};
let agent_name = agent_name.trim();
if agent_name.is_empty() {
return Err(SandboxError::InvalidRequest {
message: "params.agent must be non-empty".to_string(),
});
}
AgentId::parse(agent_name)
.map(Some)
.ok_or_else(|| SandboxError::UnsupportedAgent {
agent: agent_name.to_string(),
})
}
pub(super) fn to_sse_event(message: StreamMessage) -> Event {
let data = serde_json::to_string(&message.payload).unwrap_or_else(|_| "{}".to_string());
Event::default()
.event("message")
.id(message.sequence.to_string())
.data(data)
}
pub(super) fn message_id_key(id: &Value) -> String {
serde_json::to_string(id).unwrap_or_else(|_| "null".to_string())
}
pub(super) fn set_payload_id(payload: &mut Value, id: Value) {
if let Some(object) = payload.as_object_mut() {
object.insert("id".to_string(), id);
}
}
pub(super) fn extract_session_id_from_payload(payload: &Value) -> Option<String> {
payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("sessionId"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
}
pub(super) fn extract_session_id_from_response(payload: &Value) -> Option<String> {
payload
.get("result")
.and_then(Value::as_object)
.and_then(|result| result.get("sessionId"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
}
pub(super) fn extract_cwd_from_payload(payload: &Value) -> Option<String> {
payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("cwd"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
}
pub(super) fn extract_model_id_from_payload(payload: &Value) -> Option<String> {
payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("modelId"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
}
pub(super) fn extract_mode_id_from_payload(payload: &Value) -> Option<String> {
payload
.get("params")
.and_then(Value::as_object)
.and_then(|params| params.get("modeId"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
}
pub(super) fn extract_models_from_response(response: &Value) -> Option<AgentModelSnapshot> {
let result = response.get("result")?.as_object()?;
let models_root = result
.get("models")
.and_then(Value::as_object)
.cloned()
.unwrap_or_else(|| result.clone());
let available_models = models_root
.get("availableModels")
.and_then(Value::as_array)?
.iter()
.filter_map(|entry| {
let object = entry.as_object()?;
let model_id = object.get("modelId").and_then(Value::as_str)?.to_string();
let mut variants = object
.get("variants")
.and_then(Value::as_array)
.map(|values| {
values
.iter()
.filter_map(Value::as_str)
.map(ToOwned::to_owned)
.collect::<Vec<_>>()
})
.unwrap_or_default();
variants.sort();
Some(AgentModelInfo {
model_id,
name: object
.get("name")
.and_then(Value::as_str)
.map(ToOwned::to_owned),
description: object
.get("description")
.and_then(Value::as_str)
.map(ToOwned::to_owned),
default_variant: object
.get("defaultVariant")
.and_then(Value::as_str)
.map(ToOwned::to_owned),
variants,
})
})
.collect::<Vec<_>>();
let current_model_id = models_root
.get("currentModelId")
.and_then(Value::as_str)
.map(ToOwned::to_owned)
.or_else(|| {
available_models
.first()
.map(|entry| entry.model_id.to_string())
});
Some(AgentModelSnapshot {
available_models,
current_model_id,
})
}
pub(super) fn extract_modes_from_response(response: &Value) -> Option<AgentModeSnapshot> {
let result = response.get("result")?.as_object()?;
let modes_root = result
.get("modes")
.and_then(Value::as_object)
.cloned()
.unwrap_or_else(|| result.clone());
let available_modes = modes_root
.get("availableModes")
.and_then(Value::as_array)?
.iter()
.filter_map(|entry| {
let object = entry.as_object()?;
let mode_id = object
.get("modeId")
.and_then(Value::as_str)
.or_else(|| object.get("id").and_then(Value::as_str))?
.to_string();
Some(AgentModeInfo {
mode_id,
name: object
.get("name")
.and_then(Value::as_str)
.map(ToOwned::to_owned),
description: object
.get("description")
.and_then(Value::as_str)
.map(ToOwned::to_owned),
})
})
.collect::<Vec<_>>();
let current_mode_id = modes_root
.get("currentModeId")
.and_then(Value::as_str)
.map(ToOwned::to_owned)
.or_else(|| {
available_modes
.first()
.map(|entry| entry.mode_id.to_string())
});
Some(AgentModeSnapshot {
available_modes,
current_mode_id,
})
}
pub(super) fn fallback_models_for_agent(agent: AgentId) -> AgentModelSnapshot {
match agent {
// Copied from pre-ACP v1 fallback behavior in router.rs.
AgentId::Claude => AgentModelSnapshot {
available_models: vec![
AgentModelInfo {
model_id: "default".to_string(),
name: Some("Default (recommended)".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
},
AgentModelInfo {
model_id: "sonnet".to_string(),
name: Some("Sonnet".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
},
AgentModelInfo {
model_id: "opus".to_string(),
name: Some("Opus".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
},
AgentModelInfo {
model_id: "haiku".to_string(),
name: Some("Haiku".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
},
],
current_model_id: Some("default".to_string()),
},
AgentId::Amp => AgentModelSnapshot {
available_models: vec![AgentModelInfo {
model_id: "amp-default".to_string(),
name: Some("Amp Default".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
}],
current_model_id: Some("amp-default".to_string()),
},
AgentId::Mock => AgentModelSnapshot {
available_models: vec![AgentModelInfo {
model_id: "mock".to_string(),
name: Some("Mock".to_string()),
description: None,
default_variant: None,
variants: Vec::new(),
}],
current_model_id: Some("mock".to_string()),
},
AgentId::Codex | AgentId::Opencode => AgentModelSnapshot::default(),
}
}
pub(super) fn to_stream_error(
error: sandbox_agent_agent_management::agents::AgentError,
) -> SandboxError {
SandboxError::StreamError {
message: error.to_string(),
}
}
pub(super) fn duration_from_env_ms(var_name: &str, default: Duration) -> Duration {
std::env::var(var_name)
.ok()
.and_then(|value| value.trim().parse::<u64>().ok())
.filter(|value| *value > 0)
.map(Duration::from_millis)
.unwrap_or(default)
}
impl SessionEndReason {
fn as_str(self) -> &'static str {
match self {
Self::Completed => "completed",
Self::Error => "error",
Self::Terminated => "terminated",
}
}
}
impl TerminatedBy {
fn as_str(self) -> &'static str {
match self {
Self::Agent => "agent",
Self::Daemon => "daemon",
}
}
}
impl StderrCapture {
pub(super) fn record(&mut self, line: String) {
self.total_lines = self.total_lines.saturating_add(1);
if self.full_if_small.len() < STDERR_HEAD_LINES + STDERR_TAIL_LINES {
self.full_if_small.push(line.clone());
}
if self.head.len() < STDERR_HEAD_LINES {
self.head.push(line.clone());
}
self.tail.push_back(line);
while self.tail.len() > STDERR_TAIL_LINES {
self.tail.pop_front();
}
}
pub(super) fn snapshot(&self) -> Option<StderrOutput> {
if self.total_lines == 0 {
return None;
}
let max_untruncated = STDERR_HEAD_LINES + STDERR_TAIL_LINES;
if self.total_lines <= max_untruncated {
let head = if self.full_if_small.is_empty() {
None
} else {
Some(self.full_if_small.join("\n"))
};
return Some(StderrOutput {
head,
tail: None,
truncated: false,
total_lines: Some(self.total_lines),
});
}
Some(StderrOutput {
head: if self.head.is_empty() {
None
} else {
Some(self.head.join("\n"))
},
tail: if self.tail.is_empty() {
None
} else {
Some(self.tail.iter().cloned().collect::<Vec<_>>().join("\n"))
},
truncated: true,
total_lines: Some(self.total_lines),
})
}
}
impl From<MetaSession> for SessionRuntimeInfo {
fn from(value: MetaSession) -> Self {
Self {
session_id: value.session_id,
created_at: value.created_at,
updated_at: value.updated_at_ms,
ended: value.ended,
event_count: value.event_count,
model_hint: value.model_hint,
mode_hint: value.mode_hint,
title: value.title,
cwd: value.cwd,
sandbox_meta: value.sandbox_meta,
agent: value.agent,
ended_data: value.ended_data,
}
}
}
impl From<AgentModelSnapshot> for RuntimeModelSnapshot {
fn from(value: AgentModelSnapshot) -> Self {
Self {
available_models: value
.available_models
.into_iter()
.map(|model| RuntimeModelInfo {
model_id: model.model_id,
name: model.name,
description: model.description,
})
.collect(),
current_model_id: value.current_model_id,
}
}
}
impl From<AgentModeSnapshot> for RuntimeModeSnapshot {
fn from(value: AgentModeSnapshot) -> Self {
Self {
available_modes: value
.available_modes
.into_iter()
.map(|mode| RuntimeModeInfo {
mode_id: mode.mode_id,
name: mode.name,
description: mode.description,
})
.collect(),
current_mode_id: value.current_mode_id,
}
}
}
pub(super) fn ended_data_to_value(data: &SessionEndedData) -> Value {
let mut output = Map::new();
output.insert(
"reason".to_string(),
Value::String(data.reason.as_str().to_string()),
);
output.insert(
"terminated_by".to_string(),
Value::String(data.terminated_by.as_str().to_string()),
);
if let Some(message) = &data.message {
output.insert("message".to_string(), Value::String(message.clone()));
}
if let Some(exit_code) = data.exit_code {
output.insert("exit_code".to_string(), Value::from(exit_code));
}
if let Some(stderr) = &data.stderr {
let mut stderr_value = Map::new();
if let Some(head) = &stderr.head {
stderr_value.insert("head".to_string(), Value::String(head.clone()));
}
if let Some(tail) = &stderr.tail {
stderr_value.insert("tail".to_string(), Value::String(tail.clone()));
}
stderr_value.insert("truncated".to_string(), Value::Bool(stderr.truncated));
if let Some(total_lines) = stderr.total_lines {
stderr_value.insert("total_lines".to_string(), Value::from(total_lines as u64));
}
output.insert("stderr".to_string(), Value::Object(stderr_value));
}
Value::Object(output)
}
pub(super) fn ended_data_from_process_exit(
status: Option<ExitStatus>,
terminated_by: TerminatedBy,
stderr: Option<StderrOutput>,
) -> SessionEndedData {
if terminated_by == TerminatedBy::Daemon {
return SessionEndedData {
reason: SessionEndReason::Terminated,
terminated_by,
message: None,
exit_code: None,
stderr: None,
};
}
if status.as_ref().is_some_and(ExitStatus::success) {
return SessionEndedData {
reason: SessionEndReason::Completed,
terminated_by,
message: None,
exit_code: None,
stderr: None,
};
}
let message = status
.as_ref()
.map(|value| format!("agent exited with status {value}"))
.or_else(|| Some("agent exited".to_string()));
SessionEndedData {
reason: SessionEndReason::Error,
terminated_by,
message,
exit_code: status.and_then(|value| value.code()),
stderr,
}
}
pub(super) fn infer_base_url_from_launch(launch: &AgentProcessLaunchSpec) -> Option<String> {
for (key, value) in &launch.env {
if (key.contains("BASE_URL") || key.ends_with("_URL")) && is_http_url(value) {
return Some(value.clone());
}
}
for arg in &launch.args {
if let Some(value) = arg.strip_prefix("--base-url=") {
if is_http_url(value) {
return Some(value.to_string());
}
}
if let Some(value) = arg.strip_prefix("--base_url=") {
if is_http_url(value) {
return Some(value.to_string());
}
}
if let Some(value) = arg.strip_prefix("--url=") {
if is_http_url(value) {
return Some(value.to_string());
}
}
}
let mut args = launch.args.iter();
while let Some(arg) = args.next() {
if arg == "--base-url" || arg == "--base_url" || arg == "--url" {
if let Some(value) = args.next() {
if is_http_url(value) {
return Some(value.to_string());
}
}
}
}
None
}
pub(super) fn is_http_url(value: &str) -> bool {
value.starts_with("http://") || value.starts_with("https://")
}
pub(super) fn now_ms() -> i64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|duration| duration.as_millis() as i64)
.unwrap_or(0)
}
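The `StderrCapture` head/tail policy above (keep the first N and last M lines verbatim, and only report a truncated head/tail split once the total exceeds N+M) can be sketched std-only. The budgets here are illustrative, not the real `STDERR_HEAD_LINES`/`STDERR_TAIL_LINES` values:

```rust
use std::collections::VecDeque;

// Illustrative budgets; the daemon's real constants may differ.
const HEAD: usize = 2;
const TAIL: usize = 2;

#[derive(Default)]
pub struct Capture {
    total: usize,
    full_if_small: Vec<String>, // verbatim copy while the output is small
    head: Vec<String>,
    tail: VecDeque<String>,
}

impl Capture {
    pub fn record(&mut self, line: &str) {
        self.total += 1;
        if self.full_if_small.len() < HEAD + TAIL {
            self.full_if_small.push(line.to_string());
        }
        if self.head.len() < HEAD {
            self.head.push(line.to_string());
        }
        self.tail.push_back(line.to_string());
        while self.tail.len() > TAIL {
            self.tail.pop_front();
        }
    }

    /// (head, tail, truncated): tail is only reported once lines were dropped.
    pub fn snapshot(&self) -> (String, Option<String>, bool) {
        if self.total <= HEAD + TAIL {
            return (self.full_if_small.join("\n"), None, false);
        }
        let tail = self.tail.iter().cloned().collect::<Vec<_>>().join("\n");
        (self.head.join("\n"), Some(tail), true)
    }
}
```

The two-buffer design mirrors the real implementation: `full_if_small` avoids a lossy head/tail split for short outputs, while `head`/`tail` bound memory for arbitrarily long streams.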


@ -0,0 +1,425 @@
use sandbox_agent_error::SandboxError;
use serde_json::{json, Value};
use std::collections::HashSet;
use std::future::Future;
use std::time::Duration;
use tokio::sync::Mutex;
use tokio::time::sleep;
const MOCK_WORD_STREAM_DELAY_MS: u64 = 30;
#[derive(Debug)]
pub(super) struct MockBackend {
session_counter: Mutex<u64>,
permission_counter: Mutex<u64>,
sessions: Mutex<HashSet<String>>,
ended_sessions: Mutex<HashSet<String>>,
}
pub(super) fn new_mock_backend() -> MockBackend {
MockBackend {
session_counter: Mutex::new(0),
permission_counter: Mutex::new(0),
sessions: Mutex::new(HashSet::new()),
ended_sessions: Mutex::new(HashSet::new()),
}
}
pub(super) async fn handle_mock_payload<F, Fut>(
mock: &MockBackend,
payload: &Value,
mut emit: F,
) -> Result<(), SandboxError>
where
F: FnMut(Value) -> Fut,
Fut: Future<Output = ()>,
{
if let Some(method) = payload.get("method").and_then(Value::as_str) {
let id = payload.get("id").cloned();
let params = payload.get("params").cloned().unwrap_or(Value::Null);
if let Some(id_value) = id {
let response = mock_request(mock, &mut emit, id_value, method, params).await;
emit(response).await;
return Ok(());
}
mock_notification(&mut emit, method, params).await;
return Ok(());
}
Ok(())
}
async fn mock_request<F, Fut>(
mock: &MockBackend,
emit: &mut F,
id: Value,
method: &str,
params: Value,
) -> Value
where
F: FnMut(Value) -> Fut,
Fut: Future<Output = ()>,
{
match method {
"initialize" => json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"protocolVersion": params
.get("protocolVersion")
.cloned()
.unwrap_or(Value::String("1.0".to_string())),
"agentCapabilities": {
"loadSession": true,
"promptCapabilities": {
"image": false,
"audio": false
},
"canSetMode": true,
"canSetModel": true,
"sessionCapabilities": {
"list": {}
}
},
"authMethods": []
}
}),
"session/new" => {
let mut counter = mock.session_counter.lock().await;
*counter += 1;
let session_id = format!("mock-session-{}", *counter);
mock.sessions.lock().await.insert(session_id.clone());
mock.ended_sessions.lock().await.remove(&session_id);
json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"sessionId": session_id,
"availableModes": [],
"configOptions": []
}
})
}
"session/prompt" => {
let known_session = {
let sessions = mock.sessions.lock().await;
sessions.iter().next().cloned()
};
let session_id = params
.get("sessionId")
.and_then(Value::as_str)
.map(ToString::to_string)
.or(known_session)
.unwrap_or_else(|| "mock-session-1".to_string());
mock.sessions.lock().await.insert(session_id.clone());
if mock.ended_sessions.lock().await.contains(&session_id) {
return json!({
"jsonrpc": "2.0",
"id": id,
"error": {
"code": -32000,
"message": "session already ended"
}
});
}
let prompt_text = extract_prompt_text(&params);
let response_text = prompt_text
.clone()
.map(|text| {
if text.trim().is_empty() {
"OK".to_string()
} else {
format!("mock: {text}")
}
})
.unwrap_or_else(|| "OK".to_string());
let requires_permission = prompt_text
.as_deref()
.map(|text| text.to_ascii_lowercase().contains("permission"))
.unwrap_or(false);
if requires_permission {
let mut permission_counter = mock.permission_counter.lock().await;
*permission_counter += 1;
let permission_id = format!("mock-permission-{}", *permission_counter);
emit(json!({
"jsonrpc": "2.0",
"id": permission_id,
"method": "session/request_permission",
"params": {
"sessionId": session_id,
"options": [
{
"id": "allow_once",
"name": "Allow once"
},
{
"id": "deny",
"name": "Deny"
}
],
"toolCall": {
"toolCallId": "tool-call-1",
"kind": "execute",
"status": "pending",
"rawInput": {
"command": "echo test"
}
}
}
}))
.await;
}
let should_crash = prompt_text
.as_deref()
.map(|text| text.to_ascii_lowercase().contains("crash"))
.unwrap_or(false);
if should_crash {
mock.ended_sessions.lock().await.insert(session_id.clone());
emit(json!({
"jsonrpc": "2.0",
"method": "_sandboxagent/session/ended",
"params": {
"session_id": session_id,
"data": {
"reason": "error",
"terminated_by": "agent",
"message": "mock process crashed",
"exit_code": 1,
"stderr": {
"head": "mock stderr line 1\nmock stderr line 2",
"truncated": false,
"total_lines": 2
}
}
}
}))
.await;
return json!({
"jsonrpc": "2.0",
"id": id,
"error": {
"code": -32000,
"message": "mock process crashed"
}
});
}
let word_chunks = split_text_into_word_chunks(&response_text);
for (index, chunk) in word_chunks.iter().enumerate() {
emit(json!({
"jsonrpc": "2.0",
"method": "session/update",
"params": {
"sessionId": session_id,
"update": {
"sessionUpdate": "agent_message_chunk",
"content": {
"type": "text",
"text": chunk
}
}
}
}))
.await;
if index + 1 < word_chunks.len() {
sleep(Duration::from_millis(MOCK_WORD_STREAM_DELAY_MS)).await;
}
}
json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"stopReason": "end_turn"
}
})
}
"session/list" => {
let sessions = mock
.sessions
.lock()
.await
.iter()
.cloned()
.collect::<Vec<_>>();
let sessions = sessions
.into_iter()
.map(|session_id| {
json!({
"sessionId": session_id,
"cwd": "/"
})
})
.collect::<Vec<_>>();
json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"sessions": sessions,
"nextCursor": null
}
})
}
"session/fork" | "session/resume" | "session/load" => {
let session_id = params
.get("sessionId")
.and_then(Value::as_str)
.unwrap_or("mock-session-1")
.to_string();
mock.sessions.lock().await.insert(session_id.clone());
mock.ended_sessions.lock().await.remove(&session_id);
json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"sessionId": session_id,
"configOptions": [],
"availableModes": []
}
})
}
"session/set_mode" | "session/set_model" | "session/set_config_option" => json!({
"jsonrpc": "2.0",
"id": id,
"result": {}
}),
"authenticate" => json!({
"jsonrpc": "2.0",
"id": id,
"result": {}
}),
"$/cancel_request" => json!({
"jsonrpc": "2.0",
"id": id,
"result": {}
}),
"_sandboxagent/session/terminate" => {
let fallback_session = {
let sessions = mock.sessions.lock().await;
sessions.iter().next().cloned()
};
let session_id = params
.get("sessionId")
.and_then(Value::as_str)
.map(ToOwned::to_owned)
.or(fallback_session)
.unwrap_or_else(|| "mock-session-1".to_string());
let exists = mock.sessions.lock().await.contains(&session_id);
let mut ended_sessions = mock.ended_sessions.lock().await;
let terminated = exists && ended_sessions.insert(session_id.clone());
drop(ended_sessions);
if terminated {
emit(json!({
"jsonrpc": "2.0",
"method": "_sandboxagent/session/ended",
"params": {
"session_id": session_id,
"data": {
"reason": "terminated",
"terminated_by": "daemon"
}
}
}))
.await;
}
json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"terminated": terminated,
"alreadyEnded": !terminated,
"reason": "terminated",
"terminatedBy": "daemon"
}
})
}
_ => json!({
"jsonrpc": "2.0",
"id": id,
"result": {
"_meta": {
"sandboxagent.dev": {
"mockMethod": method,
"echoParams": params
}
}
}
}),
}
}
async fn mock_notification<F, Fut>(emit: &mut F, method: &str, params: Value)
where
F: FnMut(Value) -> Fut,
Fut: Future<Output = ()>,
{
if method == "session/cancel" {
let session_id = params
.get("sessionId")
.and_then(Value::as_str)
.unwrap_or("mock-session-1");
emit(json!({
"jsonrpc": "2.0",
"method": "session/update",
"params": {
"sessionId": session_id,
"update": {
"sessionUpdate": "agent_message_chunk",
"content": {
"type": "text",
"text": "cancelled"
}
}
}
}))
.await;
}
}
fn split_text_into_word_chunks(text: &str) -> Vec<String> {
let words: Vec<&str> = text.split_whitespace().collect();
if words.is_empty() {
return vec![text.to_string()];
}
let last = words.len() - 1;
words
.into_iter()
.enumerate()
.map(|(index, word)| {
if index == last {
word.to_string()
} else {
format!("{word} ")
}
})
.collect()
}
fn extract_prompt_text(params: &Value) -> Option<String> {
let prompt = params.get("prompt")?.as_array()?;
let mut output = String::new();
for block in prompt {
if block.get("type").and_then(Value::as_str) == Some("text") {
if let Some(text) = block.get("text").and_then(Value::as_str) {
if !output.is_empty() {
output.push('\n');
}
output.push_str(text);
}
}
}
if output.is_empty() {
None
} else {
Some(output)
}
}
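The mock streamer's chunking (one word per `agent_message_chunk`, with a trailing space on every chunk except the last) can be summarized as a std-only equivalent of `split_text_into_word_chunks`. Note that concatenating the chunks is whitespace-normalized, not byte-identical to the input, since `split_whitespace` collapses runs of spaces and newlines:

```rust
// One word per chunk; a trailing space on all but the last chunk, so that
// concatenating the streamed chunks reproduces the (whitespace-normalized) text.
pub fn word_chunks(text: &str) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    if words.is_empty() {
        // Preserve the original (possibly empty) text as a single chunk.
        return vec![text.to_string()];
    }
    let last = words.len() - 1;
    words
        .iter()
        .enumerate()
        .map(|(i, w)| if i == last { (*w).to_string() } else { format!("{w} ") })
        .collect()
}
```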

File diff suppressed because it is too large


@ -3,6 +3,14 @@ mod unix;
#[cfg(windows)]
mod windows;
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct StderrOutput {
pub head: Option<String>,
pub tail: Option<String>,
pub truncated: bool,
pub total_lines: Option<usize>,
}
#[cfg(unix)]
pub use unix::AgentServerLogs;
#[cfg(windows)]


@ -3,9 +3,10 @@ use std::io::{BufRead, BufReader};
use std::path::{Path, PathBuf};
use sandbox_agent_error::SandboxError;
use sandbox_agent_universal_agent_schema::StderrOutput;
use time::{Duration, OffsetDateTime};
use super::StderrOutput;
const LOG_RETENTION_DAYS: i64 = 7;
const LOG_HEAD_LINES: usize = 20;
const LOG_TAIL_LINES: usize = 50;


@ -2,9 +2,10 @@ use std::fs::OpenOptions;
use std::path::{Path, PathBuf};
use sandbox_agent_error::SandboxError;
use sandbox_agent_universal_agent_schema::StderrOutput;
use time::{Duration, OffsetDateTime};
use super::StderrOutput;
const LOG_RETENTION_DAYS: i64 = 7;
pub struct AgentServerLogs {

File diff suppressed because it is too large


@ -1 +0,0 @@
pub use sandbox_agent_agent_credentials::*;


@ -145,7 +145,7 @@ pub fn is_process_running(pid: u32) -> bool {
// ---------------------------------------------------------------------------
pub fn check_health(base_url: &str, token: Option<&str>) -> Result<bool, CliError> {
let url = format!("{base_url}/v1/health");
let url = format!("{base_url}/v2/health");
let started_at = Instant::now();
let client = HttpClient::builder()
.connect_timeout(HEALTH_CHECK_CONNECT_TIMEOUT)
@ -205,7 +205,7 @@ pub fn wait_for_health(
}
}
let url = format!("{base_url}/v1/health");
let url = format!("{base_url}/v2/health");
let mut request = client.get(&url);
if let Some(token) = token {
request = request.bearer_auth(token);


@ -1,8 +1,10 @@
//! Sandbox agent core utilities.
mod acp_runtime;
mod agent_server_logs;
mod opencode_session_manager;
mod universal_events;
pub mod cli;
pub mod credentials;
pub mod daemon;
pub mod http_client;
pub mod opencode_compat;

File diff suppressed because it is too large


@ -0,0 +1,626 @@
use super::*;
pub(super) async fn v1_removed() -> Response {
let problem = ProblemDetails {
type_: "urn:sandbox-agent:error:v1_removed".to_string(),
title: "v1 API removed".to_string(),
status: 410,
detail: Some("v1 API removed; use /v2".to_string()),
instance: None,
extensions: serde_json::Map::new(),
};
(
StatusCode::GONE,
[(header::CONTENT_TYPE, "application/problem+json")],
Json(problem),
)
.into_response()
}
pub(super) async fn opencode_disabled() -> Response {
let problem = ProblemDetails {
type_: "urn:sandbox-agent:error:opencode_disabled".to_string(),
title: "OpenCode compatibility disabled".to_string(),
status: 503,
detail: Some(
"/opencode is disabled during ACP core bring-up and will return in Phase 7".to_string(),
),
instance: None,
extensions: serde_json::Map::new(),
};
(
StatusCode::SERVICE_UNAVAILABLE,
[(header::CONTENT_TYPE, "application/problem+json")],
Json(problem),
)
.into_response()
}
pub(super) async fn not_found() -> Response {
let problem = ProblemDetails {
type_: ErrorType::InvalidRequest.as_urn().to_string(),
title: "Not Found".to_string(),
status: 404,
detail: Some("endpoint not found".to_string()),
instance: None,
extensions: serde_json::Map::new(),
};
(
StatusCode::NOT_FOUND,
[(header::CONTENT_TYPE, "application/problem+json")],
Json(problem),
)
.into_response()
}
pub(super) async fn require_token(
State(state): State<Arc<AppState>>,
request: Request<axum::body::Body>,
next: Next,
) -> Result<Response, ApiError> {
let Some(expected) = state.auth.token.as_ref() else {
return Ok(next.run(request).await);
};
let bearer = request
.headers()
.get(header::AUTHORIZATION)
.and_then(|value| value.to_str().ok())
.and_then(|value| value.strip_prefix("Bearer "));
if bearer == Some(expected.as_str()) {
return Ok(next.run(request).await);
}
Err(ApiError::Sandbox(SandboxError::TokenInvalid {
message: Some("missing or invalid bearer token".to_string()),
}))
}
pub(super) type PinBoxSseStream =
std::pin::Pin<Box<dyn Stream<Item = Result<axum::response::sse::Event, Infallible>> + Send>>;
pub(super) fn map_runtime_session(session: crate::acp_runtime::SessionRuntimeInfo) -> SessionInfo {
SessionInfo {
session_id: session.session_id,
agent: session.agent.as_str().to_string(),
agent_mode: session
.mode_hint
.clone()
.unwrap_or_else(|| "build".to_string()),
permission_mode: session
.sandbox_meta
.get("permissionMode")
.or_else(|| session.sandbox_meta.get("permission_mode"))
.and_then(Value::as_str)
.map(ToOwned::to_owned)
.unwrap_or_else(|| "default".to_string()),
model: session.model_hint,
native_session_id: session
.sandbox_meta
.get("nativeSessionId")
.or_else(|| session.sandbox_meta.get("native_session_id"))
.and_then(Value::as_str)
.map(ToOwned::to_owned),
ended: session.ended,
event_count: session.event_count,
created_at: session.created_at,
updated_at: session.updated_at,
directory: Some(session.cwd),
title: session.title,
termination_info: session.ended_data.map(map_termination_info),
}
}
pub(super) fn map_termination_info(ended: crate::acp_runtime::SessionEndedData) -> TerminationInfo {
let reason = match ended.reason {
crate::acp_runtime::SessionEndReason::Completed => "completed",
crate::acp_runtime::SessionEndReason::Error => "error",
crate::acp_runtime::SessionEndReason::Terminated => "terminated",
}
.to_string();
let terminated_by = match ended.terminated_by {
crate::acp_runtime::TerminatedBy::Agent => "agent",
crate::acp_runtime::TerminatedBy::Daemon => "daemon",
}
.to_string();
TerminationInfo {
reason,
terminated_by,
message: ended.message,
exit_code: ended.exit_code,
stderr: ended.stderr.map(|stderr| StderrOutput {
head: stderr.head,
tail: stderr.tail,
truncated: stderr.truncated,
total_lines: stderr.total_lines,
}),
}
}
pub(super) fn map_server_status(
status: &crate::acp_runtime::RuntimeServerStatus,
) -> ServerStatusInfo {
let server_status = if status.running {
ServerStatus::Running
} else if status.last_error.is_some() {
ServerStatus::Error
} else {
ServerStatus::Stopped
};
ServerStatusInfo {
status: server_status,
base_url: status.base_url.clone(),
uptime_ms: status.uptime_ms.map(|value| value.max(0) as u64),
restart_count: status.restart_count,
last_error: status.last_error.clone(),
}
}
pub(super) fn credentials_available_for(
agent: AgentId,
has_anthropic: bool,
has_openai: bool,
) -> bool {
match agent {
AgentId::Claude | AgentId::Amp => has_anthropic,
AgentId::Codex => has_openai,
AgentId::Opencode => has_anthropic || has_openai,
AgentId::Mock => true,
}
}
pub(super) fn agent_capabilities_for(agent: AgentId) -> AgentCapabilities {
match agent {
AgentId::Claude => AgentCapabilities {
plan_mode: false,
permissions: true,
questions: true,
tool_calls: true,
tool_results: true,
text_messages: true,
images: false,
file_attachments: false,
session_lifecycle: false,
error_events: false,
reasoning: false,
status: false,
command_execution: false,
file_changes: false,
mcp_tools: true,
streaming_deltas: true,
item_started: false,
shared_process: false,
},
AgentId::Codex => AgentCapabilities {
plan_mode: true,
permissions: true,
questions: false,
tool_calls: true,
tool_results: true,
text_messages: true,
images: true,
file_attachments: true,
session_lifecycle: true,
error_events: true,
reasoning: true,
status: true,
command_execution: true,
file_changes: true,
mcp_tools: true,
streaming_deltas: true,
item_started: true,
shared_process: true,
},
AgentId::Opencode => AgentCapabilities {
plan_mode: false,
permissions: false,
questions: false,
tool_calls: true,
tool_results: true,
text_messages: true,
images: true,
file_attachments: true,
session_lifecycle: true,
error_events: true,
reasoning: false,
status: false,
command_execution: false,
file_changes: false,
mcp_tools: true,
streaming_deltas: true,
item_started: true,
shared_process: true,
},
AgentId::Amp => AgentCapabilities {
plan_mode: false,
permissions: false,
questions: false,
tool_calls: true,
tool_results: true,
text_messages: true,
images: false,
file_attachments: false,
session_lifecycle: false,
error_events: true,
reasoning: false,
status: false,
command_execution: false,
file_changes: false,
mcp_tools: true,
streaming_deltas: false,
item_started: false,
shared_process: false,
},
AgentId::Mock => AgentCapabilities {
plan_mode: true,
permissions: true,
questions: true,
tool_calls: true,
tool_results: true,
text_messages: true,
images: true,
file_attachments: true,
session_lifecycle: true,
error_events: true,
reasoning: true,
status: true,
command_execution: true,
file_changes: true,
mcp_tools: true,
streaming_deltas: true,
item_started: true,
shared_process: false,
},
}
}
pub(super) fn agent_modes_for(agent: AgentId) -> Vec<AgentModeInfo> {
match agent {
AgentId::Opencode => vec![
AgentModeInfo {
id: "build".to_string(),
name: "Build".to_string(),
description: "Default build mode".to_string(),
},
AgentModeInfo {
id: "plan".to_string(),
name: "Plan".to_string(),
description: "Planning mode".to_string(),
},
AgentModeInfo {
id: "custom".to_string(),
name: "Custom".to_string(),
description: "Any user-defined OpenCode agent name".to_string(),
},
],
AgentId::Codex => vec![
AgentModeInfo {
id: "build".to_string(),
name: "Build".to_string(),
description: "Default build mode".to_string(),
},
AgentModeInfo {
id: "plan".to_string(),
name: "Plan".to_string(),
description: "Planning mode via prompt prefix".to_string(),
},
],
AgentId::Claude => vec![
AgentModeInfo {
id: "build".to_string(),
name: "Build".to_string(),
description: "Default build mode".to_string(),
},
AgentModeInfo {
id: "plan".to_string(),
name: "Plan".to_string(),
description: "Plan mode (prompt-only)".to_string(),
},
],
AgentId::Amp => vec![AgentModeInfo {
id: "build".to_string(),
name: "Build".to_string(),
description: "Default build mode".to_string(),
}],
AgentId::Mock => vec![
AgentModeInfo {
id: "build".to_string(),
name: "Build".to_string(),
description: "Mock agent for UI testing".to_string(),
},
AgentModeInfo {
id: "plan".to_string(),
name: "Plan".to_string(),
description: "Plan-only mock mode".to_string(),
},
],
}
}
pub(super) fn fallback_models_for_agent(
agent: AgentId,
) -> Option<(Vec<AgentModelInfo>, Option<String>)> {
match agent {
AgentId::Claude => Some((
vec![
AgentModelInfo {
id: "default".to_string(),
name: Some("Default (recommended)".to_string()),
variants: None,
default_variant: None,
},
AgentModelInfo {
id: "sonnet".to_string(),
name: Some("Sonnet".to_string()),
variants: None,
default_variant: None,
},
AgentModelInfo {
id: "opus".to_string(),
name: Some("Opus".to_string()),
variants: None,
default_variant: None,
},
AgentModelInfo {
id: "haiku".to_string(),
name: Some("Haiku".to_string()),
variants: None,
default_variant: None,
},
],
Some("default".to_string()),
)),
AgentId::Amp => Some((
vec![AgentModelInfo {
id: "amp-default".to_string(),
name: Some("Amp Default".to_string()),
variants: None,
default_variant: None,
}],
Some("amp-default".to_string()),
)),
AgentId::Mock => Some((
vec![AgentModelInfo {
id: "mock".to_string(),
name: Some("Mock".to_string()),
variants: None,
default_variant: None,
}],
Some("mock".to_string()),
)),
AgentId::Codex | AgentId::Opencode => None,
}
}
pub(super) fn map_install_result(result: InstallResult) -> AgentInstallResponse {
AgentInstallResponse {
already_installed: result.already_installed,
artifacts: result
.artifacts
.into_iter()
.map(|artifact| AgentInstallArtifact {
kind: map_artifact_kind(artifact.kind),
path: artifact.path.to_string_lossy().to_string(),
source: map_install_source(artifact.source),
version: artifact.version,
})
.collect(),
}
}
pub(super) fn map_install_source(source: InstallSource) -> String {
match source {
InstallSource::Registry => "registry",
InstallSource::Fallback => "fallback",
InstallSource::LocalPath => "local_path",
InstallSource::Builtin => "builtin",
}
.to_string()
}
pub(super) fn map_artifact_kind(kind: InstalledArtifactKind) -> String {
match kind {
InstalledArtifactKind::NativeAgent => "native_agent",
InstalledArtifactKind::AgentProcess => "agent_process",
}
.to_string()
}
pub(super) async fn resolve_fs_path(
state: &Arc<AppState>,
session_id: Option<&str>,
raw_path: &str,
) -> Result<PathBuf, SandboxError> {
let path = PathBuf::from(raw_path);
if path.is_absolute() {
return Ok(path);
}
let root = resolve_fs_root(state, session_id).await?;
let relative = sanitize_relative_path(&path)?;
Ok(root.join(relative))
}
pub(super) async fn resolve_fs_root(
state: &Arc<AppState>,
session_id: Option<&str>,
) -> Result<PathBuf, SandboxError> {
if let Some(session_id) = session_id {
let session = state
.acp_runtime()
.get_session(session_id)
.await
.ok_or_else(|| SandboxError::SessionNotFound {
session_id: session_id.to_string(),
})?;
return Ok(PathBuf::from(session.cwd));
}
let home = std::env::var_os("HOME")
.or_else(|| std::env::var_os("USERPROFILE"))
.map(PathBuf::from)
.ok_or_else(|| SandboxError::InvalidRequest {
message: "home directory unavailable".to_string(),
})?;
Ok(home)
}
pub(super) fn sanitize_relative_path(path: &StdPath) -> Result<PathBuf, SandboxError> {
use std::path::Component;
let mut sanitized = PathBuf::new();
for component in path.components() {
match component {
Component::CurDir => {}
Component::Normal(value) => sanitized.push(value),
Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
return Err(SandboxError::InvalidRequest {
message: format!("invalid relative path: {}", path.display()),
});
}
}
}
Ok(sanitized)
}
pub(super) fn map_fs_error(path: &StdPath, err: std::io::Error) -> SandboxError {
if err.kind() == std::io::ErrorKind::NotFound {
SandboxError::InvalidRequest {
message: format!("path not found: {}", path.display()),
}
} else {
SandboxError::StreamError {
message: err.to_string(),
}
}
}
pub(super) fn header_str(headers: &HeaderMap, name: &str) -> Option<String> {
headers
.get(name)
.and_then(|value| value.to_str().ok())
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
}
pub(super) fn content_type_is(headers: &HeaderMap, expected: &str) -> bool {
let Some(value) = headers
.get(header::CONTENT_TYPE)
.and_then(|value| value.to_str().ok())
else {
return false;
};
media_type_eq(value, expected)
}
pub(super) fn accept_allows(headers: &HeaderMap, expected: &str) -> bool {
let values = headers.get_all(header::ACCEPT);
if values.iter().next().is_none() {
return true;
}
values
.iter()
.filter_map(|value| value.to_str().ok())
.flat_map(|value| value.split(','))
.any(|value| media_type_matches(value, expected))
}
fn media_type_eq(raw: &str, expected: &str) -> bool {
normalize_media_type(raw).as_deref() == Some(expected)
}
fn media_type_matches(raw: &str, expected: &str) -> bool {
let Some(media) = normalize_media_type(raw) else {
return false;
};
if media == expected || media == "*/*" {
return true;
}
let Some((media_type, media_subtype)) = media.split_once('/') else {
return false;
};
let Some((expected_type, _expected_subtype)) = expected.split_once('/') else {
return false;
};
media_subtype == "*" && media_type == expected_type
}
fn normalize_media_type(raw: &str) -> Option<String> {
let media = raw.split(';').next().unwrap_or_default().trim();
if media.is_empty() {
return None;
}
Some(media.to_ascii_lowercase())
}
pub(super) fn parse_last_event_id(headers: &HeaderMap) -> Result<Option<u64>, SandboxError> {
let value = headers
.get("last-event-id")
.and_then(|value| value.to_str().ok());
match value {
Some(value) if !value.trim().is_empty() => {
value
.trim()
.parse::<u64>()
.map(Some)
.map_err(|_| SandboxError::InvalidRequest {
message: "Last-Event-ID must be a positive integer".to_string(),
})
}
_ => Ok(None),
}
}
pub(super) fn set_client_id_header(
response: &mut Response,
client_id: &str,
) -> Result<(), ApiError> {
let header_value = HeaderValue::from_str(client_id).map_err(|err| {
ApiError::Sandbox(SandboxError::StreamError {
message: format!("invalid client id header value: {err}"),
})
})?;
response
.headers_mut()
.insert(ACP_CLIENT_HEADER, header_value);
Ok(())
}
pub(super) fn request_principal(state: &AppState, headers: &HeaderMap) -> String {
if state.auth.token.is_some() {
headers
.get(header::AUTHORIZATION)
.and_then(|value| value.to_str().ok())
.map(ToOwned::to_owned)
.unwrap_or_else(|| "authenticated".to_string())
} else {
"anonymous".to_string()
}
}
pub(super) fn problem_from_sandbox_error(error: &SandboxError) -> ProblemDetails {
let mut problem = error.to_problem_details();
match error {
SandboxError::SessionNotFound { .. } => {
problem.type_ = "urn:sandbox-agent:error:client_not_found".to_string();
problem.title = "ACP client not found".to_string();
problem.detail = Some("unknown ACP client id".to_string());
problem.status = 404;
}
SandboxError::InvalidRequest { .. } => {
problem.status = 400;
}
SandboxError::Timeout { .. } => {
problem.status = 504;
}
_ => {}
}
problem
}
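
The `Accept`/`Content-Type` matching above supports an exact match, the `*/*` wildcard, and `type/*` subtype wildcards, with parameters like `; charset=utf-8` stripped first. A std-only sketch of that logic (the functions here are illustrative re-implementations for demonstration, not imports from the crate):

```rust
// Strip media-type parameters, trim, and lowercase; None for an empty value.
fn normalize_media_type(raw: &str) -> Option<String> {
    let media = raw.split(';').next().unwrap_or_default().trim();
    if media.is_empty() {
        return None;
    }
    Some(media.to_ascii_lowercase())
}

// Exact match, the "*/*" wildcard, or a "type/*" subtype wildcard.
fn media_type_matches(raw: &str, expected: &str) -> bool {
    let Some(media) = normalize_media_type(raw) else {
        return false;
    };
    if media == expected || media == "*/*" {
        return true;
    }
    let Some((media_type, media_subtype)) = media.split_once('/') else {
        return false;
    };
    let Some((expected_type, _)) = expected.split_once('/') else {
        return false;
    };
    media_subtype == "*" && media_type == expected_type
}
```

Note that parameters are ignored entirely, so `application/json; charset=utf-8` matches `application/json`, while `text/*` does not match `application/json` because the type component differs.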


@@ -0,0 +1,346 @@
use super::*;
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
pub struct HealthResponse {
pub status: String,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentModeInfo {
pub id: String,
pub name: String,
pub description: String,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentModelInfo {
pub id: String,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub name: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub variants: Option<Vec<String>>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub default_variant: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentModelsResponse {
pub models: Vec<AgentModelInfo>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub default_model: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "lowercase")]
pub enum ServerStatus {
Running,
Stopped,
Error,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct ServerStatusInfo {
pub status: ServerStatus,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub base_url: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub uptime_ms: Option<u64>,
pub restart_count: u64,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub last_error: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentCapabilities {
pub plan_mode: bool,
pub permissions: bool,
pub questions: bool,
pub tool_calls: bool,
pub tool_results: bool,
pub text_messages: bool,
pub images: bool,
pub file_attachments: bool,
pub session_lifecycle: bool,
pub error_events: bool,
pub reasoning: bool,
pub status: bool,
pub command_execution: bool,
pub file_changes: bool,
pub mcp_tools: bool,
pub streaming_deltas: bool,
pub item_started: bool,
pub shared_process: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentInfo {
pub id: String,
pub installed: bool,
pub credentials_available: bool,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub version: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub path: Option<String>,
pub capabilities: AgentCapabilities,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub server_status: Option<ServerStatusInfo>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub models: Option<Vec<AgentModelInfo>>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub default_model: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub modes: Option<Vec<AgentModeInfo>>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct AgentListResponse {
pub agents: Vec<AgentInfo>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema, Default)]
#[serde(rename_all = "camelCase")]
pub struct AgentInstallRequest {
pub reinstall: Option<bool>,
pub agent_version: Option<String>,
pub agent_process_version: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
pub struct AgentInstallArtifact {
pub kind: String,
pub path: String,
pub source: String,
pub version: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
pub struct AgentInstallResponse {
pub already_installed: bool,
pub artifacts: Vec<AgentInstallArtifact>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct StderrOutput {
#[serde(default, skip_serializing_if = "Option::is_none")]
pub head: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub tail: Option<String>,
pub truncated: bool,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub total_lines: Option<usize>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct TerminationInfo {
pub reason: String,
pub terminated_by: String,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub message: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub exit_code: Option<i32>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub stderr: Option<StderrOutput>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct SessionInfo {
pub session_id: String,
pub agent: String,
pub agent_mode: String,
pub permission_mode: String,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub model: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub native_session_id: Option<String>,
pub ended: bool,
pub event_count: u64,
pub created_at: i64,
pub updated_at: i64,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub directory: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub title: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub termination_info: Option<TerminationInfo>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
pub struct SessionListResponse {
pub sessions: Vec<SessionInfo>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct CreateSessionRequest {
pub agent: String,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub agent_mode: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub permission_mode: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub model: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub variant: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub agent_version: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub directory: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub title: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub mcp: Option<Value>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub skills: Option<Value>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "lowercase")]
pub enum PermissionReply {
Once,
Always,
Reject,
}
impl std::str::FromStr for PermissionReply {
type Err = String;
fn from_str(value: &str) -> Result<Self, Self::Err> {
match value.to_ascii_lowercase().as_str() {
"once" => Ok(Self::Once),
"always" => Ok(Self::Always),
"reject" => Ok(Self::Reject),
_ => Err(format!("invalid permission reply: {value}")),
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsPathQuery {
pub path: String,
#[serde(default, skip_serializing_if = "Option::is_none", alias = "session_id")]
pub session_id: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsEntriesQuery {
#[serde(default, skip_serializing_if = "Option::is_none")]
pub path: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none", alias = "session_id")]
pub session_id: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsSessionQuery {
#[serde(default, skip_serializing_if = "Option::is_none", alias = "session_id")]
pub session_id: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsDeleteQuery {
pub path: String,
#[serde(default, skip_serializing_if = "Option::is_none", alias = "session_id")]
pub session_id: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub recursive: Option<bool>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsUploadBatchQuery {
#[serde(default, skip_serializing_if = "Option::is_none")]
pub path: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none", alias = "session_id")]
pub session_id: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "lowercase")]
pub enum FsEntryType {
File,
Directory,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsEntry {
pub name: String,
pub path: String,
pub entry_type: FsEntryType,
pub size: u64,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub modified: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsStat {
pub path: String,
pub entry_type: FsEntryType,
pub size: u64,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub modified: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsWriteResponse {
pub path: String,
pub bytes_written: u64,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsMoveRequest {
pub from: String,
pub to: String,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub overwrite: Option<bool>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsMoveResponse {
pub from: String,
pub to: String,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsActionResponse {
pub path: String,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct FsUploadBatchResponse {
pub paths: Vec<String>,
pub truncated: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, ToSchema)]
pub struct AcpEnvelope {
pub jsonrpc: String,
#[serde(default)]
pub id: Option<Value>,
#[serde(default)]
pub method: Option<String>,
#[serde(default)]
pub params: Option<Value>,
#[serde(default)]
pub result: Option<Value>,
#[serde(default)]
pub error: Option<Value>,
}
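
`PermissionReply` serializes as lowercase over the wire, but its `FromStr` accepts any casing. A minimal std-only sketch mirroring that parse, with usage (the enum here is a standalone copy for illustration):

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum PermissionReply {
    Once,
    Always,
    Reject,
}

impl FromStr for PermissionReply {
    type Err = String;

    // Case-insensitive parse: "once" / "ONCE" / "Once" all succeed.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value.to_ascii_lowercase().as_str() {
            "once" => Ok(Self::Once),
            "always" => Ok(Self::Always),
            "reject" => Ok(Self::Reject),
            _ => Err(format!("invalid permission reply: {value}")),
        }
    }
}
```

This lets callers write `"ALWAYS".parse::<PermissionReply>()` without worrying about the casing a client sends.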


@@ -1,45 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use common::*;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::Value;
use std::time::Duration;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_basic_reply() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
for config in &configs {
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("basic-{}", config.agent.as_str());
create_session(&app.app, config.agent, &session_id, "default").await;
send_message(&app.app, &session_id, PROMPT).await;
let events = poll_events_until(&app.app, &session_id, Duration::from_secs(120), |events| {
has_event_type(events, "error") || find_assistant_message_item(events).is_some()
})
.await;
assert!(
!events.is_empty(),
"no events collected for {}",
config.agent.as_str()
);
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if caps.tool_calls {
assert!(
!events.iter().any(|event| {
event.get("type").and_then(Value::as_str) == Some("agent.unparsed")
}),
"agent.unparsed event detected"
);
}
}
}


@@ -1,105 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use axum::http::Method;
use common::*;
use sandbox_agent_agent_management::agents::AgentId;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::Value;
use std::fs;
use std::time::{Duration, Instant};
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_file_edit_flow() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
for config in &configs {
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.file_changes {
continue;
}
if config.agent == AgentId::Mock {
// Mock agent only emits synthetic file change events.
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let temp_dir = tempfile::tempdir().expect("create temp dir");
let file_path = temp_dir.path().join("edit.txt");
fs::write(&file_path, "before\n").expect("write seed file");
let session_id = format!("file-edit-{}", config.agent.as_str());
create_session(
&app.app,
config.agent,
&session_id,
test_permission_mode(config.agent),
)
.await;
let prompt = format!(
"Edit the file at {} so its entire contents are exactly 'updated' (no quotes). \
Do not change any other files. Reply only with DONE after editing.",
file_path.display()
);
send_message(&app.app, &session_id, &prompt).await;
let start = Instant::now();
let mut offset = 0u64;
let mut events = Vec::new();
let mut replied = false;
let mut updated = false;
while start.elapsed() < Duration::from_secs(180) {
let path = format!("/v1/sessions/{session_id}/events?offset={offset}&limit=200");
let (status, payload) = send_json(&app.app, Method::GET, &path, None).await;
assert_eq!(status, axum::http::StatusCode::OK, "poll events");
let new_events = payload
.get("events")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
if !new_events.is_empty() {
if let Some(last) = new_events
.last()
.and_then(|event| event.get("sequence"))
.and_then(Value::as_u64)
{
offset = last;
}
events.extend(new_events);
if !replied {
if let Some(permission_id) = find_permission_id(&events) {
let _ = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/permissions/{permission_id}/reply"),
Some(serde_json::json!({ "reply": "once" })),
)
.await;
replied = true;
}
}
}
let contents = fs::read_to_string(&file_path).unwrap_or_default();
let trimmed = contents.trim_end_matches(&['\r', '\n'][..]);
if trimmed == "updated" {
updated = true;
break;
}
tokio::time::sleep(Duration::from_millis(800)).await;
}
assert!(
updated,
"file edit did not complete for {}",
config.agent.as_str()
);
}
}


@@ -1,144 +0,0 @@
//! Tests for session resumption behavior.
use std::time::Duration;
use axum::body::Body;
use axum::http::{Method, Request, StatusCode};
use axum::Router;
use http_body_util::BodyExt;
use serde_json::{json, Value};
use tempfile::TempDir;
use sandbox_agent::router::{build_router, AppState, AuthConfig};
use sandbox_agent_agent_management::agents::{AgentId, AgentManager};
use tower::util::ServiceExt;
struct TestApp {
app: Router,
_install_dir: TempDir,
}
impl TestApp {
fn new() -> Self {
let install_dir = tempfile::tempdir().expect("create temp install dir");
let manager = AgentManager::new(install_dir.path()).expect("create agent manager");
let state = AppState::new(AuthConfig::disabled(), manager);
let app = build_router(state);
Self {
app,
_install_dir: install_dir,
}
}
}
async fn send_json(
app: &Router,
method: Method,
path: &str,
body: Option<Value>,
) -> (StatusCode, Value) {
let mut builder = Request::builder().method(method).uri(path);
let body = if let Some(body) = body {
builder = builder.header("content-type", "application/json");
Body::from(body.to_string())
} else {
Body::empty()
};
let request = builder.body(body).expect("request");
let response = app.clone().oneshot(request).await.expect("request handled");
let status = response.status();
let bytes = response
.into_body()
.collect()
.await
.expect("read body")
.to_bytes();
let value = if bytes.is_empty() {
Value::Null
} else {
serde_json::from_slice(&bytes)
.unwrap_or(Value::String(String::from_utf8_lossy(&bytes).to_string()))
};
(status, value)
}
async fn create_session(app: &Router, agent: AgentId, session_id: &str) {
let (status, _) = send_json(
app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({
"agent": agent.as_str(),
"permissionMode": "bypass"
})),
)
.await;
assert_eq!(status, StatusCode::OK, "create session {agent}");
}
/// Send a message and return the status code (allows checking for errors)
async fn send_message_with_status(
app: &Router,
session_id: &str,
message: &str,
) -> (StatusCode, Value) {
send_json(
app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": message })),
)
.await
}
fn is_session_ended(event: &Value) -> bool {
event
.get("type")
.and_then(Value::as_str)
.map(|t| t == "session.ended")
.unwrap_or(false)
}
/// Test that verifies the session can be reopened after ending
#[tokio::test]
async fn session_reopen_after_end() {
let test_app = TestApp::new();
let session_id = "reopen-test";
// Create session with mock agent
create_session(&test_app.app, AgentId::Mock, session_id).await;
// Send "end" command to mock agent to end the session
let (status, _) = send_message_with_status(&test_app.app, session_id, "end").await;
assert_eq!(status, StatusCode::NO_CONTENT);
// Wait for session to end
tokio::time::sleep(Duration::from_millis(500)).await;
// Verify session is ended
let path = format!("/v1/sessions/{session_id}/events?offset=0&limit=100");
let (_, payload) = send_json(&test_app.app, Method::GET, &path, None).await;
let events = payload
.get("events")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
let has_ended = events.iter().any(|e| is_session_ended(e));
assert!(has_ended, "Session should be ended after 'end' command");
    // Try to send another message; the mock agent supports resume, so this should work
    // (or fail if reopen has not been implemented for mock).
    let (status, body) = send_message_with_status(&test_app.app, session_id, "hello again").await;
    // For the mock agent, the session should be reopenable since mock is in agent_supports_resume,
    // but mock's session.ended is triggered differently from real agents'.
    // This test documents the current behavior rather than asserting either outcome.
if status == StatusCode::NO_CONTENT {
eprintln!("Mock agent session was successfully reopened after end");
} else {
eprintln!(
"Mock agent session could not be reopened (status {}): {:?}",
status, body
);
}
}
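
The fixed `tokio::time::sleep` wait above is a simpler stand-in for the poll-until-predicate loop the shared `poll_events_until` helper provides. A synchronous, std-only sketch of that loop (`poll_until` is an illustrative name; the real helper is async and fetches events over HTTP):

```rust
use std::thread;
use std::time::{Duration, Instant};

// Repeatedly fetch new items, accumulating them until the predicate is
// satisfied or the timeout elapses; always returns what was collected.
fn poll_until<T>(
    mut fetch: impl FnMut() -> Vec<T>,
    timeout: Duration,
    done: impl Fn(&[T]) -> bool,
) -> Vec<T> {
    let start = Instant::now();
    let mut events = Vec::new();
    loop {
        events.extend(fetch());
        if done(&events) || start.elapsed() >= timeout {
            return events;
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Returning the accumulated events even on timeout lets callers assert on whatever was observed, matching how the tests above report "no events collected" failures.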


@@ -1,70 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use axum::http::Method;
use common::*;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::json;
use std::time::Duration;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_permission_flow() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
for config in &configs {
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !(caps.plan_mode && caps.permissions) {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("perm-{}", config.agent.as_str());
create_session(&app.app, config.agent, &session_id, "plan").await;
send_message(&app.app, &session_id, TOOL_PROMPT).await;
let events = poll_events_until(&app.app, &session_id, Duration::from_secs(120), |events| {
find_permission_id(events).is_some() || has_event_type(events, "error")
})
.await;
let permission_id = find_permission_id(&events).expect("permission.requested missing");
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/permissions/{permission_id}/reply"),
Some(json!({ "reply": "once" })),
)
.await;
assert_eq!(
status,
axum::http::StatusCode::NO_CONTENT,
"permission reply"
);
let resolved =
poll_events_until(&app.app, &session_id, Duration::from_secs(120), |events| {
events.iter().any(|event| {
event.get("type").and_then(serde_json::Value::as_str)
== Some("permission.resolved")
})
})
.await;
assert!(
resolved.iter().any(|event| {
event.get("type").and_then(serde_json::Value::as_str) == Some("permission.resolved")
&& event
.get("synthetic")
.and_then(serde_json::Value::as_bool)
.unwrap_or(false)
}),
"permission.resolved should be synthetic"
);
}
}


@@ -1,67 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use axum::http::Method;
use common::*;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::json;
use std::time::Duration;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_question_flow() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
for config in &configs {
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.questions {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("question-{}", config.agent.as_str());
create_session_with_mode(&app.app, config.agent, &session_id, "plan", "plan").await;
send_message(&app.app, &session_id, QUESTION_PROMPT).await;
let events = poll_events_until(&app.app, &session_id, Duration::from_secs(120), |events| {
find_question_id(events).is_some() || has_event_type(events, "error")
})
.await;
let question_id = find_question_id(&events).expect("question.requested missing");
let answers = find_first_answer(&events).unwrap_or_else(|| vec![vec![]]);
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/questions/{question_id}/reply"),
Some(json!({ "answers": answers })),
)
.await;
assert_eq!(status, axum::http::StatusCode::NO_CONTENT, "question reply");
let resolved =
poll_events_until(&app.app, &session_id, Duration::from_secs(120), |events| {
events.iter().any(|event| {
event.get("type").and_then(serde_json::Value::as_str)
== Some("question.resolved")
})
})
.await;
assert!(
resolved.iter().any(|event| {
event.get("type").and_then(serde_json::Value::as_str) == Some("question.resolved")
&& event
.get("synthetic")
.and_then(serde_json::Value::as_bool)
.unwrap_or(false)
}),
"question.resolved should be synthetic"
);
}
}


@@ -1,56 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use axum::http::Method;
use common::*;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::json;
use std::time::Duration;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_termination() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
for config in &configs {
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("terminate-{}", config.agent.as_str());
create_session(&app.app, config.agent, &session_id, "default").await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/terminate"),
None,
)
.await;
assert_eq!(
status,
axum::http::StatusCode::NO_CONTENT,
"terminate session"
);
let events = poll_events_until(&app.app, &session_id, Duration::from_secs(30), |events| {
has_event_type(events, "session.ended")
})
.await;
assert!(
has_event_type(&events, "session.ended"),
"missing session.ended"
);
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": PROMPT })),
)
.await;
assert!(
!status.is_success(),
"terminated session should reject messages"
);
}
}

@ -1,93 +0,0 @@
#[path = "../common/mod.rs"]
mod common;
use axum::http::Method;
use common::*;
use sandbox_agent_agent_management::testing::test_agents_from_env;
use serde_json::Value;
use std::time::{Duration, Instant};
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_tool_flow() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
for config in &configs {
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.tool_calls {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("tool-{}", config.agent.as_str());
create_session(
&app.app,
config.agent,
&session_id,
test_permission_mode(config.agent),
)
.await;
send_message(&app.app, &session_id, TOOL_PROMPT).await;
let start = Instant::now();
let mut offset = 0u64;
let mut events = Vec::new();
let mut replied = false;
while start.elapsed() < Duration::from_secs(180) {
let path = format!("/v1/sessions/{session_id}/events?offset={offset}&limit=200");
let (status, payload) = send_json(&app.app, Method::GET, &path, None).await;
assert_eq!(status, axum::http::StatusCode::OK, "poll events");
let new_events = payload
.get("events")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
if !new_events.is_empty() {
if let Some(last) = new_events
.last()
.and_then(|event| event.get("sequence"))
.and_then(Value::as_u64)
{
offset = last;
}
events.extend(new_events);
if !replied {
if let Some(permission_id) = find_permission_id(&events) {
let _ = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/permissions/{permission_id}/reply"),
Some(serde_json::json!({ "reply": "once" })),
)
.await;
replied = true;
}
}
if has_tool_result(&events) {
break;
}
}
tokio::time::sleep(Duration::from_millis(800)).await;
}
let tool_call = find_tool_call(&events);
let tool_result = has_tool_result(&events);
assert!(
tool_call.is_some(),
"tool_call missing for tool-capable agent {}",
config.agent.as_str()
);
if tool_call.is_some() {
assert!(
tool_result,
"tool_result missing after tool_call for {}",
config.agent.as_str()
);
}
}
}

@ -1,8 +0,0 @@
mod agent_basic_reply;
mod agent_file_edit_flow;
mod agent_multi_turn;
mod agent_permission_flow;
mod agent_question_flow;
mod agent_termination;
mod agent_tool_flow;
mod pi_rpc_integration;

@ -1,182 +0,0 @@
use std::collections::HashMap;
use std::env;
use sandbox_agent_agent_management::agents::{
AgentError, AgentId, AgentManager, InstallOptions, SpawnOptions,
};
use sandbox_agent_agent_management::credentials::{
extract_all_credentials, CredentialExtractionOptions,
};
fn build_env() -> HashMap<String, String> {
let options = CredentialExtractionOptions::new();
let credentials = extract_all_credentials(&options);
let mut env = HashMap::new();
if let Some(anthropic) = credentials.anthropic {
env.insert("ANTHROPIC_API_KEY".to_string(), anthropic.api_key);
}
if let Some(openai) = credentials.openai {
env.insert("OPENAI_API_KEY".to_string(), openai.api_key);
}
env
}
fn amp_configured() -> bool {
let home = dirs::home_dir().unwrap_or_default();
home.join(".amp").join("config.json").exists()
}
fn prompt_ok(label: &str) -> String {
format!("Respond with exactly the text {label} and nothing else.")
}
fn pi_tests_enabled() -> bool {
env::var("SANDBOX_TEST_PI")
.map(|value| {
let value = value.trim().to_ascii_lowercase();
value == "1" || value == "true" || value == "yes"
})
.unwrap_or(false)
}
fn pi_on_path() -> bool {
let binary = AgentId::Pi.binary_name();
let path_var = match env::var_os("PATH") {
Some(path) => path,
None => return false,
};
for path in env::split_paths(&path_var) {
if path.join(binary).exists() {
return true;
}
}
false
}
#[test]
fn pi_spawn_streaming_is_rejected_with_runtime_contract_error(
) -> Result<(), Box<dyn std::error::Error>> {
let temp_dir = tempfile::tempdir()?;
let manager = AgentManager::new(temp_dir.path().join("bin"))?;
let err = manager
.spawn_streaming(AgentId::Pi, SpawnOptions::new(prompt_ok("IGNORED")))
.expect_err("expected Pi spawn_streaming to be rejected");
assert!(matches!(
err,
AgentError::UnsupportedRuntimePath {
agent: AgentId::Pi,
operation: "spawn_streaming",
..
}
));
Ok(())
}
#[test]
fn test_agents_install_version_spawn() -> Result<(), Box<dyn std::error::Error>> {
let temp_dir = tempfile::tempdir()?;
let manager = AgentManager::new(temp_dir.path().join("bin"))?;
let env = build_env();
assert!(!env.is_empty(), "expected credentials to be available");
let mut agents = vec![
AgentId::Claude,
AgentId::Codex,
AgentId::Opencode,
AgentId::Amp,
];
if pi_tests_enabled() && pi_on_path() {
agents.push(AgentId::Pi);
}
for agent in agents {
let install = manager.install(agent, InstallOptions::default())?;
assert!(install.path.exists(), "expected install for {agent}");
assert!(
manager.is_installed(agent),
"expected is_installed for {agent}"
);
manager.install(
agent,
InstallOptions {
reinstall: true,
version: None,
},
)?;
let version = manager.version(agent)?;
assert!(version.is_some(), "expected version for {agent}");
if agent != AgentId::Amp || amp_configured() {
let mut spawn = SpawnOptions::new(prompt_ok("OK"));
spawn.env = env.clone();
let result = manager.spawn(agent, spawn)?;
assert!(
result.status.success(),
"spawn failed for {agent}: {}",
result.stderr
);
assert!(
!result.events.is_empty(),
"expected events for {agent} but got none"
);
assert!(
result.session_id.is_some(),
"expected session id for {agent}"
);
let combined = format!("{}{}", result.stdout, result.stderr);
let output = result.result.clone().unwrap_or(combined);
assert!(
output.contains("OK"),
"expected OK for {agent}, got: {output}"
);
if agent == AgentId::Claude
|| agent == AgentId::Opencode
|| (agent == AgentId::Amp && amp_configured())
{
let mut resume = SpawnOptions::new(prompt_ok("OK2"));
resume.env = env.clone();
resume.session_id = result.session_id.clone();
let resumed = manager.spawn(agent, resume)?;
assert!(
resumed.status.success(),
"resume spawn failed for {agent}: {}",
resumed.stderr
);
let combined = format!("{}{}", resumed.stdout, resumed.stderr);
let output = resumed.result.clone().unwrap_or(combined);
assert!(
output.contains("OK2"),
"expected OK2 for {agent}, got: {output}"
);
} else if agent == AgentId::Codex {
let mut resume = SpawnOptions::new(prompt_ok("OK2"));
resume.env = env.clone();
resume.session_id = result.session_id.clone();
let err = manager
.spawn(agent, resume)
.expect_err("expected resume error for codex");
assert!(matches!(err, AgentError::ResumeUnsupported { .. }));
}
if agent == AgentId::Claude || agent == AgentId::Codex {
let mut plan = SpawnOptions::new(prompt_ok("OK3"));
plan.env = env.clone();
plan.permission_mode = Some("plan".to_string());
let planned = manager.spawn(agent, plan)?;
assert!(
planned.status.success(),
"plan spawn failed for {agent}: {}",
planned.stderr
);
let combined = format!("{}{}", planned.stdout, planned.stderr);
let output = planned.result.clone().unwrap_or(combined);
assert!(
output.contains("OK3"),
"expected OK3 for {agent}, got: {output}"
);
}
}
}
Ok(())
}

@ -1 +0,0 @@
mod agents;

@ -1,2 +0,0 @@
#[path = "agent-flows/mod.rs"]
mod agent_flows;

@ -1,2 +0,0 @@
#[path = "agent-management/mod.rs"]
mod agent_management;

File diff suppressed because it is too large.

@ -1,339 +0,0 @@
use std::collections::HashMap;
use std::time::{Duration, Instant};
use axum::body::Body;
use axum::http::{Method, Request, StatusCode};
use axum::Router;
use http_body_util::BodyExt;
use serde_json::{json, Value};
use tempfile::TempDir;
use tower::util::ServiceExt;
use sandbox_agent::router::{build_router, AgentCapabilities, AgentListResponse, AuthConfig};
use sandbox_agent_agent_credentials::ExtractedCredentials;
use sandbox_agent_agent_management::agents::{AgentId, AgentManager};
pub const PROMPT: &str = "Reply with exactly the single word OK.";
pub const TOOL_PROMPT: &str =
"Use the bash tool to run `ls` in the current directory. Do not answer without using the tool.";
pub const QUESTION_PROMPT: &str =
"Call the AskUserQuestion tool with exactly one yes/no question and wait for a reply. Do not answer yourself.";
pub struct TestApp {
pub app: Router,
_install_dir: TempDir,
}
impl TestApp {
pub fn new() -> Self {
let install_dir = tempfile::tempdir().expect("create temp install dir");
let manager = AgentManager::new(install_dir.path()).expect("create agent manager");
let state = sandbox_agent::router::AppState::new(AuthConfig::disabled(), manager);
let app = build_router(state);
Self {
app,
_install_dir: install_dir,
}
}
}
pub struct EnvGuard {
saved: HashMap<String, Option<String>>,
}
impl Drop for EnvGuard {
fn drop(&mut self) {
for (key, value) in &self.saved {
match value {
Some(value) => std::env::set_var(key, value),
None => std::env::remove_var(key),
}
}
}
}
pub fn apply_credentials(creds: &ExtractedCredentials) -> EnvGuard {
let keys = [
"ANTHROPIC_API_KEY",
"CLAUDE_API_KEY",
"OPENAI_API_KEY",
"CODEX_API_KEY",
];
let mut saved = HashMap::new();
for key in keys {
saved.insert(key.to_string(), std::env::var(key).ok());
}
match creds.anthropic.as_ref() {
Some(cred) => {
std::env::set_var("ANTHROPIC_API_KEY", &cred.api_key);
std::env::set_var("CLAUDE_API_KEY", &cred.api_key);
}
None => {
std::env::remove_var("ANTHROPIC_API_KEY");
std::env::remove_var("CLAUDE_API_KEY");
}
}
match creds.openai.as_ref() {
Some(cred) => {
std::env::set_var("OPENAI_API_KEY", &cred.api_key);
std::env::set_var("CODEX_API_KEY", &cred.api_key);
}
None => {
std::env::remove_var("OPENAI_API_KEY");
std::env::remove_var("CODEX_API_KEY");
}
}
EnvGuard { saved }
}
pub async fn send_json(
app: &Router,
method: Method,
path: &str,
body: Option<Value>,
) -> (StatusCode, Value) {
let request = Request::builder()
.method(method)
.uri(path)
.header("content-type", "application/json")
.body(Body::from(
body.map(|value| value.to_string()).unwrap_or_default(),
))
.expect("request");
let response = app.clone().oneshot(request).await.expect("response");
let status = response.status();
let bytes = response
.into_body()
.collect()
.await
.expect("body")
.to_bytes();
let payload = if bytes.is_empty() {
Value::Null
} else {
serde_json::from_slice(&bytes).unwrap_or(Value::Null)
};
(status, payload)
}
pub async fn send_status(
app: &Router,
method: Method,
path: &str,
body: Option<Value>,
) -> StatusCode {
let (status, _) = send_json(app, method, path, body).await;
status
}
pub async fn install_agent(app: &Router, agent: AgentId) {
let status = send_status(
app,
Method::POST,
&format!("/v1/agents/{}/install", agent.as_str()),
Some(json!({})),
)
.await;
assert_eq!(
status,
StatusCode::NO_CONTENT,
"install agent {}",
agent.as_str()
);
}
pub async fn create_session(app: &Router, agent: AgentId, session_id: &str, permission_mode: &str) {
let status = send_status(
app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({
"agent": agent.as_str(),
"permissionMode": permission_mode,
})),
)
.await;
assert_eq!(status, StatusCode::OK, "create session");
}
pub async fn create_session_with_mode(
app: &Router,
agent: AgentId,
session_id: &str,
agent_mode: &str,
permission_mode: &str,
) {
let status = send_status(
app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({
"agent": agent.as_str(),
"agentMode": agent_mode,
"permissionMode": permission_mode,
})),
)
.await;
assert_eq!(status, StatusCode::OK, "create session");
}
pub fn test_permission_mode(agent: AgentId) -> &'static str {
match agent {
AgentId::Opencode | AgentId::Pi => "default",
_ => "bypass",
}
}
pub async fn send_message(app: &Router, session_id: &str, message: &str) {
let status = send_status(
app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": message })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send message");
}
pub async fn poll_events_until<F>(
app: &Router,
session_id: &str,
timeout: Duration,
mut stop: F,
) -> Vec<Value>
where
F: FnMut(&[Value]) -> bool,
{
let start = Instant::now();
let mut offset = 0u64;
let mut events = Vec::new();
while start.elapsed() < timeout {
let path = format!("/v1/sessions/{session_id}/events?offset={offset}&limit=200");
let (status, payload) = send_json(app, Method::GET, &path, None).await;
assert_eq!(status, StatusCode::OK, "poll events");
let new_events = payload
.get("events")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
if !new_events.is_empty() {
if let Some(last) = new_events
.last()
.and_then(|event| event.get("sequence"))
.and_then(Value::as_u64)
{
offset = last;
}
events.extend(new_events);
if stop(&events) {
break;
}
}
tokio::time::sleep(Duration::from_millis(800)).await;
}
events
}
pub async fn fetch_capabilities(app: &Router) -> HashMap<String, AgentCapabilities> {
let (status, payload) = send_json(app, Method::GET, "/v1/agents", None).await;
assert_eq!(status, StatusCode::OK, "list agents");
let response: AgentListResponse = serde_json::from_value(payload).expect("agents payload");
response
.agents
.into_iter()
.map(|agent| (agent.id, agent.capabilities))
.collect()
}
pub fn has_event_type(events: &[Value], event_type: &str) -> bool {
events
.iter()
.any(|event| event.get("type").and_then(Value::as_str) == Some(event_type))
}
pub fn find_assistant_message_item(events: &[Value]) -> Option<String> {
events.iter().find_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("item.completed") {
return None;
}
let item = event.get("data")?.get("item")?;
let role = item.get("role")?.as_str()?;
let kind = item.get("kind")?.as_str()?;
if role != "assistant" || kind != "message" {
return None;
}
item.get("item_id")?.as_str().map(|id| id.to_string())
})
}
pub fn find_permission_id(events: &[Value]) -> Option<String> {
events.iter().find_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("permission.requested") {
return None;
}
event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
.map(|id| id.to_string())
})
}
pub fn find_question_id(events: &[Value]) -> Option<String> {
events.iter().find_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("question.requested") {
return None;
}
event
.get("data")
.and_then(|data| data.get("question_id"))
.and_then(Value::as_str)
.map(|id| id.to_string())
})
}
pub fn find_first_answer(events: &[Value]) -> Option<Vec<Vec<String>>> {
events.iter().find_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("question.requested") {
return None;
}
let options = event
.get("data")
.and_then(|data| data.get("options"))
.and_then(Value::as_array)?;
let option = options.first()?.as_str()?.to_string();
Some(vec![vec![option]])
})
}
pub fn find_tool_call(events: &[Value]) -> Option<String> {
events.iter().find_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("item.started")
&& event.get("type").and_then(Value::as_str) != Some("item.completed")
{
return None;
}
let item = event.get("data")?.get("item")?;
let kind = item.get("kind")?.as_str()?;
if kind != "tool_call" {
return None;
}
item.get("item_id")?.as_str().map(|id| id.to_string())
})
}
pub fn has_tool_result(events: &[Value]) -> bool {
events.iter().any(|event| {
if event.get("type").and_then(Value::as_str) != Some("item.completed") {
return false;
}
let item = match event.get("data").and_then(|data| data.get("item")) {
Some(item) => item,
None => return false,
};
item.get("kind").and_then(Value::as_str) == Some("tool_result")
})
}

@ -1,411 +0,0 @@
// Agent-specific HTTP endpoints live here; session-related snapshots are in tests/sessions/.
include!("../common/http.rs");
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn auth_snapshots() {
let token = "test-token";
let app = TestApp::new_with_auth(AuthConfig::with_token(token.to_string()));
let (status, payload) = send_json(&app.app, Method::GET, "/v1/health", None).await;
assert_eq!(status, StatusCode::OK, "health should be public");
insta::with_settings!({
snapshot_suffix => snapshot_name("auth_health_public", None),
}, {
insta::assert_yaml_snapshot!(json!({
"status": status.as_u16(),
"payload": normalize_health(&payload),
}));
});
let (status, payload) = send_json(&app.app, Method::GET, "/v1/agents", None).await;
assert_eq!(status, StatusCode::UNAUTHORIZED, "missing token should 401");
insta::with_settings!({
snapshot_suffix => snapshot_name("auth_missing_token", None),
}, {
insta::assert_yaml_snapshot!(json!({
"status": status.as_u16(),
"payload": payload,
}));
});
let request = Request::builder()
.method(Method::GET)
.uri("/v1/agents")
.header(header::AUTHORIZATION, "Bearer wrong-token")
.body(Body::empty())
.expect("auth invalid request");
let (status, _headers, payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::UNAUTHORIZED, "invalid token should 401");
insta::with_settings!({
snapshot_suffix => snapshot_name("auth_invalid_token", None),
}, {
insta::assert_yaml_snapshot!(json!({
"status": status.as_u16(),
"payload": payload,
}));
});
let request = Request::builder()
.method(Method::GET)
.uri("/v1/agents")
.header(header::AUTHORIZATION, format!("Bearer {token}"))
.body(Body::empty())
.expect("auth valid request");
let (status, _headers, payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "valid token should succeed");
insta::with_settings!({
snapshot_suffix => snapshot_name("auth_valid_token", None),
}, {
insta::assert_yaml_snapshot!(json!({
"status": status.as_u16(),
"payload": normalize_agent_list(&payload),
}));
});
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn cors_snapshots() {
let cors = CorsLayer::new()
.allow_origin("http://example.com".parse::<HeaderValue>().unwrap())
.allow_methods([Method::GET, Method::POST])
.allow_headers([header::CONTENT_TYPE, header::AUTHORIZATION]);
let app = TestApp::new_with_auth_and_cors(AuthConfig::disabled(), Some(cors));
let preflight = Request::builder()
.method(Method::OPTIONS)
.uri("/v1/agents")
.header(header::ORIGIN, "http://example.com")
.header(header::ACCESS_CONTROL_REQUEST_METHOD, "GET")
.header(
header::ACCESS_CONTROL_REQUEST_HEADERS,
"authorization,content-type",
)
.body(Body::empty())
.expect("cors preflight request");
let (status, headers, _payload) = send_request(&app.app, preflight).await;
insta::with_settings!({
snapshot_suffix => snapshot_name("cors_preflight", None),
}, {
insta::assert_yaml_snapshot!(snapshot_cors(status, &headers));
});
let actual = Request::builder()
.method(Method::GET)
.uri("/v1/health")
.header(header::ORIGIN, "http://example.com")
.body(Body::empty())
.expect("cors actual request");
let (status, headers, payload) = send_json_request(&app.app, actual).await;
assert_eq!(status, StatusCode::OK, "cors actual request should succeed");
insta::with_settings!({
snapshot_suffix => snapshot_name("cors_actual", None),
}, {
insta::assert_yaml_snapshot!(json!({
"cors": snapshot_cors(status, &headers),
"payload": normalize_health(&payload),
}));
});
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn agent_endpoints_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
let app = TestApp::new();
let (status, health) = send_json(&app.app, Method::GET, "/v1/health", None).await;
assert_eq!(status, StatusCode::OK, "health status");
insta::with_settings!({
snapshot_suffix => snapshot_name("health", None),
}, {
insta::assert_yaml_snapshot!(normalize_health(&health));
});
// List agents (verify IDs only; install state is environment-dependent).
let (status, agents) = send_json(&app.app, Method::GET, "/v1/agents", None).await;
assert_eq!(status, StatusCode::OK, "agents list");
insta::with_settings!({
snapshot_suffix => snapshot_name("agents_list", None),
}, {
insta::assert_yaml_snapshot!(normalize_agent_list(&agents));
});
for config in &configs {
let _guard = apply_credentials(&config.credentials);
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/agents/{}/install", config.agent.as_str()),
Some(json!({})),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "install agent");
insta::with_settings!({
snapshot_suffix => snapshot_name("agent_install", Some(config.agent)),
}, {
insta::assert_yaml_snapshot!(snapshot_status(status));
});
}
for config in &configs {
let _guard = apply_credentials(&config.credentials);
let (status, modes) = send_json(
&app.app,
Method::GET,
&format!("/v1/agents/{}/modes", config.agent.as_str()),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "agent modes");
insta::with_settings!({
snapshot_suffix => snapshot_name("agent_modes", Some(config.agent)),
}, {
insta::assert_yaml_snapshot!(normalize_agent_modes(&modes));
});
}
for config in &configs {
let _guard = apply_credentials(&config.credentials);
let (status, models) = send_json(
&app.app,
Method::GET,
&format!("/v1/agents/{}/models", config.agent.as_str()),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "agent models");
let model_count = models
.get("models")
.and_then(|value| value.as_array())
.map(|models| models.len())
.unwrap_or_default();
assert!(model_count > 0, "agent models should not be empty");
insta::with_settings!({
snapshot_suffix => snapshot_name("agent_models", Some(config.agent)),
}, {
insta::assert_yaml_snapshot!(normalize_agent_models(&models, config.agent));
});
}
}
fn pi_test_config() -> Option<TestAgentConfig> {
let configs = match test_agents_from_env() {
Ok(configs) => configs,
Err(err) => {
eprintln!("Skipping PI endpoint variant test: {err}");
return None;
}
};
configs
.into_iter()
.find(|config| config.agent == AgentId::Pi)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn pi_capabilities_and_models_expose_variants() {
let Some(config) = pi_test_config() else {
return;
};
let app = TestApp::new();
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, AgentId::Pi).await;
let capabilities = fetch_capabilities(&app.app).await;
let pi_caps = capabilities.get("pi").expect("pi capabilities");
assert!(pi_caps.variants, "pi capabilities should enable variants");
let (status, payload) = send_json(&app.app, Method::GET, "/v1/agents/pi/models", None).await;
assert_eq!(status, StatusCode::OK, "pi models endpoint");
let models = payload
.get("models")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
assert!(!models.is_empty(), "pi models should not be empty");
let full_levels = vec!["off", "minimal", "low", "medium", "high", "xhigh"];
for model in models {
let model_id = model
.get("id")
.and_then(Value::as_str)
.unwrap_or("<unknown>");
let variants = model
.get("variants")
.and_then(Value::as_array)
.expect("pi model variants");
let default_variant = model
.get("defaultVariant")
.and_then(Value::as_str)
.expect("pi model defaultVariant");
let variant_ids = variants
.iter()
.filter_map(Value::as_str)
.collect::<Vec<_>>();
assert!(
!variant_ids.is_empty(),
"pi model {model_id} has no variants"
);
if variant_ids == vec!["off"] {
assert_eq!(
default_variant, "off",
"pi model {model_id} expected default off for non-thinking model"
);
} else {
assert_eq!(
variant_ids, full_levels,
"pi model {model_id} expected full thinking levels"
);
assert_eq!(
default_variant, "medium",
"pi model {model_id} expected medium default for thinking model"
);
}
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn create_session_with_skill_sources() {
let app = TestApp::new();
// Create a temp skill directory with SKILL.md
let skill_dir = tempfile::tempdir().expect("create skill dir");
let skill_path = skill_dir.path().join("my-test-skill");
std::fs::create_dir_all(&skill_path).expect("create skill subdir");
std::fs::write(skill_path.join("SKILL.md"), "# Test Skill\nA test skill.")
.expect("write SKILL.md");
// Create session with local skill source
let (status, payload) = send_json(
&app.app,
Method::POST,
"/v1/sessions/skill-test-session",
Some(json!({
"agent": "mock",
"skills": {
"sources": [
{
"type": "local",
"source": skill_dir.path().to_string_lossy()
}
]
}
})),
)
.await;
assert_eq!(
status,
StatusCode::OK,
"create session with skills: {payload}"
);
assert!(
payload
.get("healthy")
.and_then(Value::as_bool)
.unwrap_or(false),
"session should be healthy"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn create_session_with_skill_sources_filter() {
let app = TestApp::new();
// Create a temp directory with two skills
let skill_dir = tempfile::tempdir().expect("create skill dir");
let wanted = skill_dir.path().join("wanted-skill");
let unwanted = skill_dir.path().join("unwanted-skill");
std::fs::create_dir_all(&wanted).expect("create wanted dir");
std::fs::create_dir_all(&unwanted).expect("create unwanted dir");
std::fs::write(wanted.join("SKILL.md"), "# Wanted").expect("write wanted SKILL.md");
std::fs::write(unwanted.join("SKILL.md"), "# Unwanted").expect("write unwanted SKILL.md");
// Create session with filter
let (status, payload) = send_json(
&app.app,
Method::POST,
"/v1/sessions/skill-filter-session",
Some(json!({
"agent": "mock",
"skills": {
"sources": [
{
"type": "local",
"source": skill_dir.path().to_string_lossy(),
"skills": ["wanted-skill"]
}
]
}
})),
)
.await;
assert_eq!(
status,
StatusCode::OK,
"create session with skill filter: {payload}"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn create_session_with_invalid_skill_source() {
let app = TestApp::new();
// Use a non-existent path
let (status, _payload) = send_json(
&app.app,
Method::POST,
"/v1/sessions/skill-invalid-session",
Some(json!({
"agent": "mock",
"skills": {
"sources": [
{
"type": "local",
"source": "/nonexistent/path/to/skills"
}
]
}
})),
)
.await;
// Should fail with a 4xx or 5xx error
assert_ne!(
status,
StatusCode::OK,
"session with invalid skill source should fail"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn create_session_with_skill_filter_no_match() {
let app = TestApp::new();
let skill_dir = tempfile::tempdir().expect("create skill dir");
let skill_path = skill_dir.path().join("alpha");
std::fs::create_dir_all(&skill_path).expect("create alpha dir");
std::fs::write(skill_path.join("SKILL.md"), "# Alpha").expect("write SKILL.md");
// Filter for a skill that doesn't exist
let (status, _payload) = send_json(
&app.app,
Method::POST,
"/v1/sessions/skill-nomatch-session",
Some(json!({
"agent": "mock",
"skills": {
"sources": [
{
"type": "local",
"source": skill_dir.path().to_string_lossy(),
"skills": ["nonexistent"]
}
]
}
})),
)
.await;
assert_ne!(
status,
StatusCode::OK,
"session with no matching skills should fail"
);
}

@ -1,270 +0,0 @@
// Filesystem HTTP endpoints.
include!("../common/http.rs");
use std::fs as stdfs;
use tar::{Builder, Header};
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn fs_read_write_move_delete() {
let app = TestApp::new();
let cwd = std::env::current_dir().expect("cwd");
let temp = tempfile::tempdir_in(&cwd).expect("tempdir");
let dir_path = temp.path();
let file_path = dir_path.join("hello.txt");
let file_path_str = file_path.to_string_lossy().to_string();
let request = Request::builder()
.method(Method::PUT)
.uri(format!("/v1/fs/file?path={file_path_str}"))
.header(header::CONTENT_TYPE, "application/octet-stream")
.body(Body::from("hello"))
.expect("write request");
let (status, _headers, _payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "write file");
let request = Request::builder()
.method(Method::GET)
.uri(format!("/v1/fs/file?path={file_path_str}"))
.body(Body::empty())
.expect("read request");
let (status, headers, bytes) = send_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "read file");
assert_eq!(
headers
.get(header::CONTENT_TYPE)
.and_then(|value| value.to_str().ok()),
Some("application/octet-stream")
);
assert_eq!(bytes.as_ref(), b"hello");
let entries_path = dir_path.to_string_lossy().to_string();
let (status, entries) = send_json(
&app.app,
Method::GET,
&format!("/v1/fs/entries?path={entries_path}"),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "list entries");
let entry_list = entries.as_array().cloned().unwrap_or_default();
let entry_names: Vec<String> = entry_list
.iter()
.filter_map(|entry| entry.get("name").and_then(|value| value.as_str()))
.map(|value| value.to_string())
.collect();
assert!(entry_names.contains(&"hello.txt".to_string()));
let new_path = dir_path.join("moved.txt");
let new_path_str = new_path.to_string_lossy().to_string();
let (status, _payload) = send_json(
&app.app,
Method::POST,
"/v1/fs/move",
Some(json!({
"from": file_path_str,
"to": new_path_str,
"overwrite": true
})),
)
.await;
assert_eq!(status, StatusCode::OK, "move file");
assert!(new_path.exists(), "moved file exists");
let (status, _payload) = send_json(
&app.app,
Method::DELETE,
&format!("/v1/fs/entry?path={}", new_path.to_string_lossy()),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "delete file");
assert!(!new_path.exists(), "file deleted");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn fs_upload_batch_tar() {
let app = TestApp::new();
let cwd = std::env::current_dir().expect("cwd");
let dest_dir = tempfile::tempdir_in(&cwd).expect("tempdir");
let mut builder = Builder::new(Vec::new());
let mut tar_header = Header::new_gnu();
let contents = b"hello";
tar_header.set_size(contents.len() as u64);
tar_header.set_cksum();
builder
.append_data(&mut tar_header, "a.txt", &contents[..])
.expect("append tar entry");
let mut tar_header = Header::new_gnu();
let contents = b"world";
tar_header.set_size(contents.len() as u64);
tar_header.set_cksum();
builder
.append_data(&mut tar_header, "nested/b.txt", &contents[..])
.expect("append tar entry");
let tar_bytes = builder.into_inner().expect("tar bytes");
let request = Request::builder()
.method(Method::POST)
.uri(format!(
"/v1/fs/upload-batch?path={}",
dest_dir.path().to_string_lossy()
))
.header(header::CONTENT_TYPE, "application/x-tar")
.body(Body::from(tar_bytes))
.expect("tar request");
let (status, _headers, payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "upload batch");
assert!(payload
.get("paths")
.and_then(|value| value.as_array())
.map(|value| !value.is_empty())
.unwrap_or(false));
assert!(payload.get("truncated").and_then(|value| value.as_bool()) == Some(false));
let a_path = dest_dir.path().join("a.txt");
let b_path = dest_dir.path().join("nested").join("b.txt");
assert!(a_path.exists(), "a.txt extracted");
assert!(b_path.exists(), "b.txt extracted");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn fs_relative_paths_use_session_dir() {
let app = TestApp::new();
let session_id = "fs-session";
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({ "agent": "mock" })),
)
.await;
assert_eq!(status, StatusCode::OK, "create session");
let cwd = std::env::current_dir().expect("cwd");
let temp = tempfile::tempdir_in(&cwd).expect("tempdir");
let relative_dir = temp
.path()
.strip_prefix(&cwd)
.expect("strip prefix")
.to_path_buf();
let relative_path = relative_dir.join("session.txt");
let request = Request::builder()
.method(Method::PUT)
.uri(format!(
"/v1/fs/file?session_id={session_id}&path={}",
relative_path.to_string_lossy()
))
.header(header::CONTENT_TYPE, "application/octet-stream")
.body(Body::from("session"))
.expect("write request");
let (status, _headers, _payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "write relative file");
let absolute_path = cwd.join(relative_path);
let content = stdfs::read_to_string(&absolute_path).expect("read file");
assert_eq!(content, "session");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn fs_upload_batch_truncates_paths() {
let app = TestApp::new();
let cwd = std::env::current_dir().expect("cwd");
let dest_dir = tempfile::tempdir_in(&cwd).expect("tempdir");
let mut builder = Builder::new(Vec::new());
for index in 0..1030 {
let mut tar_header = Header::new_gnu();
tar_header.set_size(0);
tar_header.set_cksum();
let name = format!("file_{index}.txt");
builder
.append_data(&mut tar_header, name, &[][..])
.expect("append tar entry");
}
let tar_bytes = builder.into_inner().expect("tar bytes");
let request = Request::builder()
.method(Method::POST)
.uri(format!(
"/v1/fs/upload-batch?path={}",
dest_dir.path().to_string_lossy()
))
.header(header::CONTENT_TYPE, "application/x-tar")
.body(Body::from(tar_bytes))
.expect("tar request");
let (status, _headers, payload) = send_json_request(&app.app, request).await;
assert_eq!(status, StatusCode::OK, "upload batch");
let paths = payload
.get("paths")
.and_then(|value| value.as_array())
.cloned()
.unwrap_or_default();
assert_eq!(paths.len(), 1024);
assert_eq!(
payload.get("truncated").and_then(|value| value.as_bool()),
Some(true)
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn fs_mkdir_stat_and_delete_directory() {
let app = TestApp::new();
let cwd = std::env::current_dir().expect("cwd");
let temp = tempfile::tempdir_in(&cwd).expect("tempdir");
let dir_path = temp.path().join("nested");
let dir_path_str = dir_path.to_string_lossy().to_string();
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/fs/mkdir?path={dir_path_str}"),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "mkdir");
assert!(dir_path.exists(), "directory created");
let (status, stat) = send_json(
&app.app,
Method::GET,
&format!("/v1/fs/stat?path={dir_path_str}"),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "stat directory");
assert_eq!(stat["entryType"], "directory");
let file_path = dir_path.join("note.txt");
stdfs::write(&file_path, "content").expect("write file");
let file_path_str = file_path.to_string_lossy().to_string();
let (status, stat) = send_json(
&app.app,
Method::GET,
&format!("/v1/fs/stat?path={file_path_str}"),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "stat file");
assert_eq!(stat["entryType"], "file");
let status = send_status(
&app.app,
Method::DELETE,
&format!("/v1/fs/entry?path={dir_path_str}&recursive=true"),
None,
)
.await;
assert_eq!(status, StatusCode::OK, "delete directory");
assert!(!dir_path.exists(), "directory deleted");
}


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 874
expression: snapshot_status(status)
---
status: 204


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 898
expression: snapshot_status(status)
---
status: 204


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/http_sse_snapshots.rs
assertion_line: 1016
expression: snapshot_status(status)
---
status: 204


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 907
expression: snapshot_status(status)
---
status: 204


@ -1,19 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_models(&models, config.agent)
---
nonEmpty: true
hasDefault: true
defaultInList: true
hasVariants: true
modelCount: 4
ids:
- deep
- free
- rush
- smart
defaultModel: smart
variants:
- high
- medium
- xhigh


@ -1,8 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_models(&models, config.agent)
---
nonEmpty: true
hasDefault: true
defaultInList: true
hasVariants: "<redacted>"


@ -1,8 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_models(&models, config.agent)
---
nonEmpty: true
hasDefault: true
defaultInList: true
hasVariants: true


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_models(&models, config.agent)
---
nonEmpty: true
hasDefault: true
defaultInList: true
hasVariants: false
modelCount: 1
ids:
- mock
defaultModel: mock


@ -1,8 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_models(&models, config.agent)
---
nonEmpty: true
hasDefault: true
defaultInList: true
hasVariants: "<redacted>"


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 900
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 916
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/http_sse_snapshots.rs
assertion_line: 1034
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,14 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: custom
name: Custom
- description: true
id: plan
name: Plan


@ -1,10 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: normalize_agent_list(&agents)
---
agents:
- id: amp
- id: claude
- id: codex
- id: mock
- id: opencode


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 850
expression: normalize_health(&health)
---
status: ok


@ -1,8 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 765
expression: "json!({ \"status\": status.as_u16(), \"payload\": normalize_health(&payload), })"
---
payload:
status: ok
status: 200


@ -1,13 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 793
expression: "json!({ \"status\": status.as_u16(), \"payload\": payload, })"
---
payload:
detail: token invalid
details:
message: missing or invalid token
status: 401
title: Token Invalid
type: "urn:sandbox-agent:error:token_invalid"
status: 401


@ -1,13 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 776
expression: "json!({ \"status\": status.as_u16(), \"payload\": payload, })"
---
payload:
detail: token invalid
details:
message: missing or invalid token
status: 401
title: Token Invalid
type: "urn:sandbox-agent:error:token_invalid"
status: 401


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
expression: "json!({\n \"status\": status.as_u16(), \"payload\": normalize_agent_list(&payload),\n})"
---
payload:
agents:
- id: amp
- id: claude
- id: codex
- id: mock
- id: opencode
status: 200


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 842
expression: "json!({\n \"cors\": snapshot_cors(status, &headers), \"payload\":\n normalize_health(&payload),\n})"
---
cors:
access-control-allow-credentials: "true"
access-control-allow-origin: "http://example.com"
status: 200
vary: "origin, access-control-request-method, access-control-request-headers"
payload:
status: ok


@ -1,11 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http_sse_snapshots.rs
assertion_line: 818
expression: "snapshot_cors(status, &headers)"
---
access-control-allow-credentials: "true"
access-control-allow-headers: "content-type,authorization"
access-control-allow-methods: "GET,POST"
access-control-allow-origin: "http://example.com"
status: 200
vary: "origin, access-control-request-method, access-control-request-headers"


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 145
expression: snapshot_status(status)
---
status: 204


@ -1,5 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: snapshot_status(status)
---
status: 204


@ -1,6 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 145
expression: snapshot_status(status)
---
status: 204


@ -1,5 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: snapshot_status(status)
---
status: 204


@ -1,5 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: snapshot_status(status)
---
status: 204


@ -1,13 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 185
expression: "normalize_agent_models(&models, config.agent)"
---
defaultInList: true
defaultModel: amp-default
hasDefault: true
hasVariants: false
ids:
- amp-default
modelCount: 1
nonEmpty: true


@ -1,9 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 185
expression: "normalize_agent_models(&models, config.agent)"
---
defaultInList: true
hasDefault: true
hasVariants: "<redacted>"
nonEmpty: true


@ -1,9 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 185
expression: "normalize_agent_models(&models, config.agent)"
---
defaultInList: true
hasDefault: true
hasVariants: false
nonEmpty: true


@ -1,8 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "normalize_agent_models(&models, config.agent)"
---
defaultInList: true
hasDefault: true
hasVariants: "<redacted>"
nonEmpty: true


@ -1,9 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 162
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build


@ -1,11 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 162
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,11 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: plan
name: Plan


@ -1,14 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: normalize_agent_modes(&modes)
---
modes:
- description: true
id: build
name: Build
- description: true
id: custom
name: Custom
- description: true
id: plan
name: Plan


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 129
expression: normalize_agent_list(&agents)
---
agents:
- id: amp
- id: claude
- id: codex
- id: mock
- id: opencode
- id: pi


@ -1,5 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: normalize_health(&health)
---
status: ok


@ -1,7 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "json!({ \"status\": status.as_u16(), \"payload\": normalize_health(&payload), })"
---
payload:
status: ok
status: 200


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "json!({ \"status\": status.as_u16(), \"payload\": payload, })"
---
payload:
detail: token invalid
details:
message: missing or invalid token
status: 401
title: Token Invalid
type: "urn:sandbox-agent:error:token_invalid"
status: 401


@ -1,12 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "json!({ \"status\": status.as_u16(), \"payload\": payload, })"
---
payload:
detail: token invalid
details:
message: missing or invalid token
status: 401
title: Token Invalid
type: "urn:sandbox-agent:error:token_invalid"
status: 401


@ -1,14 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
assertion_line: 59
expression: "json!({\n \"status\": status.as_u16(), \"payload\": normalize_agent_list(&payload),\n})"
---
payload:
agents:
- id: amp
- id: claude
- id: codex
- id: mock
- id: opencode
- id: pi
status: 200


@ -1,10 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "json!({\n \"cors\": snapshot_cors(status, &headers), \"payload\":\n normalize_health(&payload),\n})"
---
cors:
access-control-allow-origin: "http://example.com"
status: 200
vary: "origin, access-control-request-method, access-control-request-headers"
payload:
status: ok


@ -1,9 +0,0 @@
---
source: server/packages/sandbox-agent/tests/http/agent_endpoints.rs
expression: "snapshot_cors(status, &headers)"
---
access-control-allow-headers: "content-type,authorization"
access-control-allow-methods: "GET,POST"
access-control-allow-origin: "http://example.com"
status: 200
vary: "origin, access-control-request-method, access-control-request-headers"


@ -1,4 +0,0 @@
#[path = "http/agent_endpoints.rs"]
mod agent_endpoints;
#[path = "http/fs_endpoints.rs"]
mod fs_endpoints;


@ -79,7 +79,7 @@ async function waitForHealth(
}
try {
const response = await fetch(`${baseUrl}/v1/health`, {
const response = await fetch(`${baseUrl}/v2/health`, {
headers: { Authorization: `Bearer ${token}` },
});
if (response.ok) {


@ -34,22 +34,22 @@ describe("OpenCode-compatible Session API", () => {
}
async function getBackingSessionPermissionMode(sessionId: string) {
const response = await fetch(`${handle.baseUrl}/v1/sessions`, {
const response = await fetch(`${handle.baseUrl}/opencode/session`, {
headers: { Authorization: `Bearer ${handle.token}` },
});
expect(response.ok).toBe(true);
const data = await response.json();
const session = (data.sessions ?? []).find((item: any) => item.sessionId === sessionId);
const sessions = await response.json();
const session = (sessions ?? []).find((item: any) => item.id === sessionId);
return session?.permissionMode;
}
async function getBackingSession(sessionId: string) {
const response = await fetch(`${handle.baseUrl}/v1/sessions`, {
const response = await fetch(`${handle.baseUrl}/opencode/session`, {
headers: { Authorization: `Bearer ${handle.token}` },
});
expect(response.ok).toBe(true);
const data = await response.json();
return (data.sessions ?? []).find((item: any) => item.sessionId === sessionId);
const sessions = await response.json();
return (sessions ?? []).find((item: any) => item.id === sessionId);
}
async function initSessionViaHttp(


@ -34,6 +34,13 @@ fn official_spec_path() -> PathBuf {
#[test]
fn opencode_openapi_matches_official_paths() {
let official_path = official_spec_path();
if !official_path.exists() {
eprintln!(
"skipping OpenCode OpenAPI parity check; official spec missing at {:?}",
official_path
);
return;
}
let official_json = fs::read_to_string(&official_path).unwrap_or_else(|err| {
panic!("failed to read official OpenCode spec at {official_path:?}: {err}")
});


@ -1,122 +0,0 @@
use std::sync::Arc;
use sandbox_agent::router::test_utils::{exit_status, spawn_sleep_process, TestHarness};
use sandbox_agent_agent_management::agents::AgentId;
use sandbox_agent_universal_agent_schema::SessionEndReason;
use tokio::time::{timeout, Duration};
async fn wait_for_exit(child: &Arc<std::sync::Mutex<Option<std::process::Child>>>) {
for _ in 0..20 {
let done = {
let mut guard = child.lock().expect("child lock");
match guard.as_mut() {
Some(child) => child.try_wait().ok().flatten().is_some(),
None => true,
}
};
if done {
return;
}
tokio::time::sleep(Duration::from_millis(50)).await;
}
}
#[tokio::test]
async fn register_and_unregister_sessions() {
let harness = TestHarness::new().await;
harness
.register_session(AgentId::Codex, "sess-1", Some("thread-1"))
.await;
assert!(harness.has_session_mapping(AgentId::Codex, "sess-1").await);
assert_eq!(
harness
.native_mapping(AgentId::Codex, "thread-1")
.await
.as_deref(),
Some("sess-1")
);
harness
.unregister_session(AgentId::Codex, "sess-1", Some("thread-1"))
.await;
assert!(!harness.has_session_mapping(AgentId::Codex, "sess-1").await);
assert!(harness
.native_mapping(AgentId::Codex, "thread-1")
.await
.is_none());
}
#[tokio::test]
async fn shutdown_marks_servers_stopped_and_kills_child() {
let harness = TestHarness::new().await;
let child = harness
.insert_stdio_server(AgentId::Codex, Some(spawn_sleep_process()), 0)
.await;
harness.shutdown().await;
assert!(matches!(
harness.server_status(AgentId::Codex).await,
Some(sandbox_agent::router::ServerStatus::Stopped)
));
wait_for_exit(&child).await;
let exited = {
let mut guard = child.lock().expect("child lock");
guard
.as_mut()
.and_then(|child| child.try_wait().ok().flatten())
.is_some()
};
assert!(exited);
}
#[tokio::test]
async fn handle_process_exit_marks_error_and_ends_sessions() {
let harness = TestHarness::new().await;
harness
.insert_session("sess-1", AgentId::Codex, Some("thread-1"))
.await;
harness
.register_session(AgentId::Codex, "sess-1", Some("thread-1"))
.await;
harness.insert_stdio_server(AgentId::Codex, None, 1).await;
harness
.handle_process_exit(AgentId::Codex, 1, exit_status(7))
.await;
assert!(matches!(
harness.server_status(AgentId::Codex).await,
Some(sandbox_agent::router::ServerStatus::Error)
));
assert!(harness
.server_last_error(AgentId::Codex)
.await
.unwrap_or_default()
.contains("exited"));
assert!(harness.session_ended("sess-1").await);
assert!(matches!(
harness.session_end_reason("sess-1").await,
Some(SessionEndReason::Error)
));
}
#[tokio::test]
async fn auto_restart_notifier_emits_signal() {
let harness = TestHarness::new().await;
let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();
harness.set_restart_notifier(tx).await;
harness.insert_http_server(AgentId::Mock, 2).await;
harness
.handle_process_exit(AgentId::Mock, 2, exit_status(2))
.await;
let received = timeout(Duration::from_millis(200), rx.recv())
.await
.expect("timeout");
assert_eq!(received, Some(AgentId::Mock));
}


@ -1,2 +0,0 @@
#[cfg(feature = "test-utils")]
mod agent_server_manager;


@ -1,2 +0,0 @@
#[path = "server-manager/mod.rs"]
mod server_manager;


@ -1,2 +0,0 @@
#[path = "sessions/mod.rs"]
mod sessions;


@ -1,6 +0,0 @@
mod multi_turn;
mod permissions;
mod questions;
mod reasoning;
mod session_lifecycle;
mod status;


@ -1,128 +0,0 @@
// Multi-turn session snapshots use the mock baseline as the single source of truth.
include!("../common/http.rs");
const FIRST_PROMPT: &str = "Reply with exactly the word FIRST.";
const SECOND_PROMPT: &str = "Reply with exactly the word SECOND.";
fn session_snapshot_suffix(prefix: &str) -> String {
snapshot_name(prefix, Some(AgentId::Mock))
}
fn assert_session_snapshot(prefix: &str, value: Value) {
insta::with_settings!({
snapshot_suffix => session_snapshot_suffix(prefix),
}, {
insta::assert_yaml_snapshot!(value);
});
}
async fn send_message_with_text(app: &Router, session_id: &str, text: &str) {
let status = send_status(
app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": text })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send message");
}
async fn poll_events_until_from(
app: &Router,
session_id: &str,
offset: u64,
timeout: Duration,
) -> (Vec<Value>, u64) {
let start = Instant::now();
let mut offset = offset;
let mut events = Vec::new();
while start.elapsed() < timeout {
let path = format!("/v1/sessions/{session_id}/events?offset={offset}&limit=200");
let (status, payload) = send_json(app, Method::GET, &path, None).await;
assert_eq!(status, StatusCode::OK, "poll events");
let new_events = payload
.get("events")
.and_then(Value::as_array)
.cloned()
.unwrap_or_default();
if !new_events.is_empty() {
if let Some(last) = new_events
.last()
.and_then(|event| event.get("sequence"))
.and_then(Value::as_u64)
{
offset = last;
}
events.extend(new_events);
if should_stop(&events) {
break;
}
}
tokio::time::sleep(Duration::from_millis(800)).await;
}
(events, offset)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn multi_turn_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("multi-turn-{}", config.agent.as_str());
create_session(
&app.app,
config.agent,
&session_id,
test_permission_mode(config.agent),
)
.await;
send_message_with_text(&app.app, &session_id, FIRST_PROMPT).await;
let (first_events, offset) =
poll_events_until_from(&app.app, &session_id, 0, Duration::from_secs(120)).await;
let first_events = truncate_after_first_stop(&first_events);
assert!(
!first_events.is_empty(),
"no events collected for first turn {}",
config.agent
);
assert!(
should_stop(&first_events),
"timed out waiting for assistant/error event for first turn {}",
config.agent
);
send_message_with_text(&app.app, &session_id, SECOND_PROMPT).await;
let (second_events, _offset) =
poll_events_until_from(&app.app, &session_id, offset, Duration::from_secs(120)).await;
let second_events = truncate_after_first_stop(&second_events);
assert!(
!second_events.is_empty(),
"no events collected for second turn {}",
config.agent
);
assert!(
should_stop(&second_events),
"timed out waiting for assistant/error event for second turn {}",
config.agent
);
let snapshot = json!({
"first": normalize_events(&first_events),
"second": normalize_events(&second_events),
});
assert_session_snapshot("multi_turn", snapshot);
}
}


@ -1,272 +0,0 @@
// Permission flow snapshots compare every agent to the mock baseline.
include!("../common/http.rs");
fn session_snapshot_suffix(prefix: &str) -> String {
snapshot_name(prefix, Some(AgentId::Mock))
}
fn assert_session_snapshot(prefix: &str, value: Value) {
insta::with_settings!({
snapshot_suffix => session_snapshot_suffix(prefix),
}, {
insta::assert_yaml_snapshot!(value);
});
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn permission_flow_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !(caps.plan_mode && caps.permissions) {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let permission_session = format!("perm-{}", config.agent.as_str());
create_session(&app.app, config.agent, &permission_session, "plan").await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{permission_session}/messages"),
Some(json!({ "message": PERMISSION_PROMPT })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send permission prompt");
let permission_events = poll_events_until_match(
&app.app,
&permission_session,
Duration::from_secs(120),
|events| find_permission_id(events).is_some() || should_stop(events),
)
.await;
let permission_events = truncate_permission_events(&permission_events);
assert_session_snapshot("permission_events", normalize_events(&permission_events));
if let Some(permission_id) = find_permission_id(&permission_events) {
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{permission_session}/permissions/{permission_id}/reply"),
Some(json!({ "reply": "once" })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "reply permission");
assert_session_snapshot("permission_reply", snapshot_status(status));
} else {
let (status, payload) = send_json(
&app.app,
Method::POST,
&format!("/v1/sessions/{permission_session}/permissions/missing-permission/reply"),
Some(json!({ "reply": "once" })),
)
.await;
assert!(!status.is_success(), "missing permission id should error");
assert_session_snapshot(
"permission_reply_missing",
json!({
"status": status.as_u16(),
"payload": payload,
}),
);
}
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn permission_reply_always_sets_accept_for_session_status() {
let app = TestApp::new();
install_agent(&app.app, AgentId::Mock).await;
let session_id = "perm-always-mock";
create_session(&app.app, AgentId::Mock, session_id, "plan").await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": PERMISSION_PROMPT })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send permission prompt");
let events = poll_events_until_match(&app.app, session_id, Duration::from_secs(30), |events| {
find_permission_id(events).is_some() || should_stop(events)
})
.await;
let permission_id = find_permission_id(&events).expect("permission.requested missing");
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/permissions/{permission_id}/reply"),
Some(json!({ "reply": "always" })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "reply permission always");
let resolved_events =
poll_events_until_match(&app.app, session_id, Duration::from_secs(30), |events| {
events.iter().any(|event| {
event.get("type").and_then(Value::as_str) == Some("permission.resolved")
&& event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
== Some(permission_id.as_str())
})
})
.await;
let resolved = resolved_events
.iter()
.rev()
.find(|event| {
event.get("type").and_then(Value::as_str) == Some("permission.resolved")
&& event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
== Some(permission_id.as_str())
})
.expect("permission.resolved missing");
let status = resolved
.get("data")
.and_then(|data| data.get("status"))
.and_then(Value::as_str);
assert_eq!(status, Some("accept_for_session"));
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn permission_reply_always_auto_approves_subsequent_permissions() {
let app = TestApp::new();
install_agent(&app.app, AgentId::Mock).await;
let session_id = "perm-always-auto-mock";
create_session(&app.app, AgentId::Mock, session_id, "plan").await;
let first_status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": PERMISSION_PROMPT })),
)
.await;
assert_eq!(
first_status,
StatusCode::NO_CONTENT,
"send first permission prompt"
);
let first_events =
poll_events_until_match(&app.app, session_id, Duration::from_secs(30), |events| {
find_permission_id(events).is_some() || should_stop(events)
})
.await;
let first_permission_id =
find_permission_id(&first_events).expect("first permission.requested missing");
let reply_status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/permissions/{first_permission_id}/reply"),
Some(json!({ "reply": "always" })),
)
.await;
assert_eq!(
reply_status,
StatusCode::NO_CONTENT,
"reply first permission always"
);
let second_status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": PERMISSION_PROMPT })),
)
.await;
assert_eq!(
second_status,
StatusCode::NO_CONTENT,
"send second permission prompt"
);
let events = poll_events_until_match(&app.app, session_id, Duration::from_secs(30), |events| {
let requested_ids = events
.iter()
.filter_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("permission.requested") {
return None;
}
event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
.map(|value| value.to_string())
})
.collect::<Vec<_>>();
if requested_ids.len() < 2 {
return false;
}
let second_permission_id = &requested_ids[1];
events.iter().any(|event| {
event.get("type").and_then(Value::as_str) == Some("permission.resolved")
&& event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
== Some(second_permission_id.as_str())
&& event
.get("data")
.and_then(|data| data.get("status"))
.and_then(Value::as_str)
== Some("accept_for_session")
})
})
.await;
let requested_ids = events
.iter()
.filter_map(|event| {
if event.get("type").and_then(Value::as_str) != Some("permission.requested") {
return None;
}
event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
.map(|value| value.to_string())
})
.collect::<Vec<_>>();
assert!(
requested_ids.len() >= 2,
"expected at least two permission.requested events"
);
let second_permission_id = &requested_ids[1];
let second_resolved = events.iter().any(|event| {
event.get("type").and_then(Value::as_str) == Some("permission.resolved")
&& event
.get("data")
.and_then(|data| data.get("permission_id"))
.and_then(Value::as_str)
== Some(second_permission_id.as_str())
&& event
.get("data")
.and_then(|data| data.get("status"))
.and_then(Value::as_str)
== Some("accept_for_session")
});
assert!(
second_resolved,
"second permission should auto-resolve as accept_for_session"
);
}


@ -1,140 +0,0 @@
// Question flow snapshots compare every agent to the mock baseline.
include!("../common/http.rs");
fn session_snapshot_suffix(prefix: &str) -> String {
snapshot_name(prefix, Some(AgentId::Mock))
}
fn assert_session_snapshot(prefix: &str, value: Value) {
insta::with_settings!({
snapshot_suffix => session_snapshot_suffix(prefix),
}, {
insta::assert_yaml_snapshot!(value);
});
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn question_flow_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.questions {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let question_reply_session = format!("question-reply-{}", config.agent.as_str());
create_session(&app.app, config.agent, &question_reply_session, "plan").await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{question_reply_session}/messages"),
Some(json!({ "message": QUESTION_PROMPT })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send question prompt");
let question_events = poll_events_until_match(
&app.app,
&question_reply_session,
Duration::from_secs(120),
|events| find_question_id_and_answers(events).is_some() || should_stop(events),
)
.await;
let question_events = truncate_question_events(&question_events);
assert_session_snapshot("question_reply_events", normalize_events(&question_events));
if let Some((question_id, answers)) = find_question_id_and_answers(&question_events) {
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{question_reply_session}/questions/{question_id}/reply"),
Some(json!({ "answers": answers })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "reply question");
assert_session_snapshot("question_reply", snapshot_status(status));
} else {
let (status, payload) = send_json(
&app.app,
Method::POST,
&format!("/v1/sessions/{question_reply_session}/questions/missing-question/reply"),
Some(json!({ "answers": [] })),
)
.await;
assert!(!status.is_success(), "missing question id should error");
assert_session_snapshot(
"question_reply_missing",
json!({
"status": status.as_u16(),
"payload": payload,
}),
);
}
let question_reject_session = format!("question-reject-{}", config.agent.as_str());
create_session(&app.app, config.agent, &question_reject_session, "plan").await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{question_reject_session}/messages"),
Some(json!({ "message": QUESTION_PROMPT })),
)
.await;
assert_eq!(
status,
StatusCode::NO_CONTENT,
"send question prompt reject"
);
let reject_events = poll_events_until_match(
&app.app,
&question_reject_session,
Duration::from_secs(120),
|events| find_question_id_and_answers(events).is_some() || should_stop(events),
)
.await;
let reject_events = truncate_question_events(&reject_events);
assert_session_snapshot("question_reject_events", normalize_events(&reject_events));
if let Some((question_id, _)) = find_question_id_and_answers(&reject_events) {
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{question_reject_session}/questions/{question_id}/reject"),
None,
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "reject question");
assert_session_snapshot("question_reject", snapshot_status(status));
} else {
let (status, payload) = send_json(
&app.app,
Method::POST,
&format!(
"/v1/sessions/{question_reject_session}/questions/missing-question/reject"
),
None,
)
.await;
assert!(
!status.is_success(),
"missing question id reject should error"
);
assert_session_snapshot(
"question_reject_missing",
json!({
"status": status.as_u16(),
"payload": payload,
}),
);
}
}
}

@@ -1,53 +0,0 @@
// Reasoning capability checks are isolated from baseline snapshots.
include!("../common/http.rs");
fn reasoning_prompt(_agent: AgentId) -> &'static str {
"Answer briefly and include your reasoning."
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reasoning_events_present() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.reasoning {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("reasoning-{}", config.agent.as_str());
create_session(
&app.app,
config.agent,
&session_id,
test_permission_mode(config.agent),
)
.await;
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": reasoning_prompt(config.agent) })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send reasoning prompt");
let events =
poll_events_until_match(&app.app, &session_id, Duration::from_secs(120), |events| {
events_have_content_type(events, "reasoning") || events.iter().any(is_error_event)
})
.await;
assert!(
events_have_content_type(&events, "reasoning"),
"expected reasoning content for {}",
config.agent
);
}
}

@@ -1,263 +0,0 @@
// Session lifecycle and streaming snapshots use the mock baseline as the single source of truth.
include!("../common/http.rs");
fn session_snapshot_suffix(prefix: &str) -> String {
snapshot_name(prefix, Some(AgentId::Mock))
}
fn assert_session_snapshot(prefix: &str, value: Value) {
insta::with_settings!({
snapshot_suffix => session_snapshot_suffix(prefix),
}, {
insta::assert_yaml_snapshot!(value);
});
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn session_endpoints_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
let _guard = apply_credentials(&config.credentials);
install_agent(&app.app, config.agent).await;
let session_id = format!("snapshot-{}", config.agent.as_str());
let permission_mode = test_permission_mode(config.agent);
let (status, created) = send_json(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({
"agent": config.agent.as_str(),
"permissionMode": permission_mode
})),
)
.await;
assert_eq!(status, StatusCode::OK, "create session");
assert_session_snapshot("create_session", normalize_create_session(&created));
let (status, sessions) = send_json(&app.app, Method::GET, "/v1/sessions", None).await;
assert_eq!(status, StatusCode::OK, "list sessions");
assert_session_snapshot("sessions_list", normalize_sessions(&sessions));
let status = send_status(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}/messages"),
Some(json!({ "message": PROMPT })),
)
.await;
assert_eq!(status, StatusCode::NO_CONTENT, "send message");
assert_session_snapshot("send_message", snapshot_status(status));
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn http_events_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
// OpenCode's embedded bun hangs when installing plugins, blocking event streaming.
if config.agent == AgentId::Opencode {
continue;
}
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
run_http_events_snapshot(&app.app, config).await;
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn accept_edits_noop_for_non_claude() {
let app = TestApp::new();
let session_id = "accept-edits-noop";
let (status, _) = send_json(
&app.app,
Method::POST,
&format!("/v1/sessions/{session_id}"),
Some(json!({
"agent": AgentId::Mock.as_str(),
"permissionMode": "acceptEdits"
})),
)
.await;
assert_eq!(status, StatusCode::OK, "create session with acceptEdits");
let (status, sessions) = send_json(&app.app, Method::GET, "/v1/sessions", None).await;
assert_eq!(status, StatusCode::OK, "list sessions");
let sessions = sessions
.get("sessions")
.and_then(Value::as_array)
.expect("sessions list");
let session = sessions
.iter()
.find(|entry| {
entry
.get("sessionId")
.and_then(Value::as_str)
.is_some_and(|id| id == session_id)
})
.expect("created session");
let permission_mode = session
.get("permissionMode")
.and_then(Value::as_str)
.expect("permissionMode");
assert_eq!(permission_mode, "default");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn sse_events_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
// OpenCode's embedded bun hangs when installing plugins, blocking SSE event streaming.
if config.agent == AgentId::Opencode {
continue;
}
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
run_sse_events_snapshot(&app.app, config).await;
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn concurrency_snapshots() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
run_concurrency_snapshot(&app.app, config).await;
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn turn_stream_route() {
let configs = test_agents_from_env().expect("configure SANDBOX_TEST_AGENTS or install agents");
for config in &configs {
// OpenCode's embedded bun can hang while installing plugins, which blocks turn streaming.
// OpenCode turn behavior is covered by the dedicated opencode-compat suite.
if config.agent == AgentId::Opencode {
continue;
}
let app = TestApp::new();
let capabilities = fetch_capabilities(&app.app).await;
let caps = capabilities
.get(config.agent.as_str())
.expect("capabilities missing");
if !caps.session_lifecycle {
continue;
}
run_turn_stream_check(&app.app, config).await;
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn turn_stream_emits_turn_lifecycle_for_mock() {
let app = TestApp::new();
install_agent(&app.app, AgentId::Mock).await;
let session_id = "turn-lifecycle-mock";
create_session(
&app.app,
AgentId::Mock,
session_id,
test_permission_mode(AgentId::Mock),
)
.await;
let events = read_turn_stream_events(&app.app, session_id, Duration::from_secs(30)).await;
let started_count = events
.iter()
.filter(|event| event.get("type").and_then(Value::as_str) == Some("turn.started"))
.count();
let ended_count = events
.iter()
.filter(|event| event.get("type").and_then(Value::as_str) == Some("turn.ended"))
.count();
assert_eq!(started_count, 1, "expected exactly one turn.started event");
assert_eq!(ended_count, 1, "expected exactly one turn.ended event");
}
async fn run_concurrency_snapshot(app: &Router, config: &TestAgentConfig) {
let _guard = apply_credentials(&config.credentials);
install_agent(app, config.agent).await;
let session_a = format!("concurrent-a-{}", config.agent.as_str());
let session_b = format!("concurrent-b-{}", config.agent.as_str());
let perm_mode = test_permission_mode(config.agent);
create_session(app, config.agent, &session_a, perm_mode).await;
create_session(app, config.agent, &session_b, perm_mode).await;
let app_a = app.clone();
let app_b = app.clone();
let send_a = send_message(&app_a, &session_a);
let send_b = send_message(&app_b, &session_b);
tokio::join!(send_a, send_b);
let app_a = app.clone();
let app_b = app.clone();
let poll_a = poll_events_until(&app_a, &session_a, Duration::from_secs(120));
let poll_b = poll_events_until(&app_b, &session_b, Duration::from_secs(120));
let (events_a, events_b) = tokio::join!(poll_a, poll_b);
let events_a = truncate_after_first_stop(&events_a);
let events_b = truncate_after_first_stop(&events_b);
assert!(
!events_a.is_empty(),
"no events collected for concurrent session a {}",
config.agent
);
assert!(
!events_b.is_empty(),
"no events collected for concurrent session b {}",
config.agent
);
assert!(
should_stop(&events_a),
"timed out waiting for assistant/error event for concurrent session a {}",
config.agent
);
assert!(
should_stop(&events_b),
"timed out waiting for assistant/error event for concurrent session b {}",
config.agent
);
let snapshot = json!({
"session_a": normalize_events(&events_a),
"session_b": normalize_events(&events_b),
});
assert_session_snapshot("concurrency_events", snapshot);
}
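The concurrency test trims each event list with `truncate_after_first_stop` before snapshotting, so both sessions end at a deterministic point. A self-contained sketch of that truncation is below; the terminal-event predicate here (matching `"item.completed"` or `"error"` on bare type strings) is a simplified stand-in for the real `should_stop` logic, which inspects full event objects.

```rust
// Hypothetical sketch: keep events up to and including the first terminal
// event so snapshots stay stable across runs. The terminal predicate is a
// stand-in assumption, not the suite's actual should_stop check.
fn truncate_after_first_stop<'a>(events: &[&'a str]) -> Vec<&'a str> {
    let mut out = Vec::new();
    for &event in events {
        out.push(event);
        if event == "item.completed" || event == "error" {
            break;
        }
    }
    out
}

fn main() {
    let events = ["session.started", "item.started", "item.completed", "item.started"];
    let truncated = truncate_after_first_stop(&events);
    // Everything after the first terminal event is dropped.
    assert_eq!(truncated, ["session.started", "item.started", "item.completed"]);
    println!("ok");
}
```

Truncating rather than filtering preserves event order and the terminal event itself, which the snapshot relies on to prove the turn actually finished.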

@@ -1,87 +0,0 @@
---
source: server/packages/sandbox-agent/tests/sessions/multi_turn.rs
assertion_line: 15
expression: value
---
first:
- metadata: true
seq: 1
session: started
type: session.started
- item:
content_types:
- text
kind: message
role: user
status: in_progress
seq: 2
type: item.started
- item:
content_types:
- text
kind: message
role: user
status: completed
seq: 3
type: item.completed
- item:
content_types:
- text
kind: message
role: assistant
status: in_progress
seq: 4
type: item.started
- delta:
delta: "<redacted>"
item_id: "<redacted>"
native_item_id: "<redacted>"
seq: 5
type: item.delta
- item:
content_types:
- text
kind: message
role: assistant
status: completed
seq: 6
type: item.completed
second:
- item:
content_types:
- text
kind: message
role: user
status: in_progress
seq: 1
type: item.started
- item:
content_types:
- text
kind: message
role: user
status: completed
seq: 2
type: item.completed
- item:
content_types:
- text
kind: message
role: assistant
status: in_progress
seq: 3
type: item.started
- delta:
delta: "<redacted>"
item_id: "<redacted>"
native_item_id: "<redacted>"
seq: 4
type: item.delta
- item:
content_types:
- text
kind: message
role: assistant
status: completed
seq: 5
type: item.completed
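The `"<redacted>"` values in the snapshot above come from a normalization pass that replaces volatile fields (text deltas, item ids) before comparison, while stable fields like `seq` survive. A minimal sketch of that idea is below; the `DeltaEvent` struct and its field names are assumptions for illustration, since the real `normalize_events` operates on JSON values.

```rust
// Hypothetical sketch of the redaction behind normalize_events: volatile
// fields become "<redacted>" so snapshots are stable across runs, while
// ordering information (seq) is preserved. Field names are assumptions.
#[derive(Debug, PartialEq)]
struct DeltaEvent {
    delta: String,
    item_id: String,
    seq: u64,
}

fn redact(mut event: DeltaEvent) -> DeltaEvent {
    event.delta = "<redacted>".to_string();
    event.item_id = "<redacted>".to_string();
    event
}

fn main() {
    let event = DeltaEvent {
        delta: "Hello".into(),
        item_id: "itm_123".into(),
        seq: 5,
    };
    let redacted = redact(event);
    assert_eq!(redacted.delta, "<redacted>");
    assert_eq!(redacted.item_id, "<redacted>");
    assert_eq!(redacted.seq, 5); // sequence numbers survive redaction
    println!("ok");
}
```

Redacting content but keeping structure lets the snapshot assert the shape and order of the event stream without coupling it to model output or generated ids.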
