Feature 6: Server Status
Implementation approach: Extension fields on GET /v1/agents and GET /v1/health
Summary
The previous API had ServerStatus (Running/Stopped/Error) and ServerStatusInfo (baseUrl, lastError, restartCount, uptimeMs) per agent. The current v1 API has none of this. This feature adds status tracking for server and agent processes.
Current v1 State
GET /v1/agents returns AgentInfo with install state only:
pub struct AgentInfo {
    pub id: String,
    pub native_required: bool,
    pub native_installed: bool,
    pub native_version: Option<String>,
    pub agent_process_installed: bool,
    pub agent_process_source: Option<String>,
    pub agent_process_version: Option<String>,
    pub capabilities: AgentCapabilities,
}
No runtime status (running/stopped/error), no error tracking, no restart counts.
Previous Types (exact, from router.rs)
/// Status of a shared server process for an agent
#[derive(Debug, Clone, Serialize, Deserialize, ToSchema, JsonSchema)]
#[serde(rename_all = "lowercase")]
pub enum ServerStatus {
    /// Server is running and accepting requests
    Running,
    /// Server is not currently running
    Stopped,
    /// Server is running but unhealthy
    Error,
}

#[derive(Debug, Clone, Serialize, Deserialize, ToSchema, JsonSchema)]
#[serde(rename_all = "camelCase")]
pub struct ServerStatusInfo {
    pub status: ServerStatus,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub base_url: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub uptime_ms: Option<u64>,
    pub restart_count: u64,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub last_error: Option<String>,
}
Previous Implementation (exact)
ManagedServer::status_info
fn status_info(&self) -> ServerStatusInfo {
    let uptime_ms = self.start_time
        .map(|started| started.elapsed().as_millis() as u64);
    ServerStatusInfo {
        status: self.status.clone(),
        base_url: self.base_url(),
        uptime_ms,
        restart_count: self.restart_count,
        last_error: self.last_error.clone(),
    }
}
AgentServerManager::status_snapshot
async fn status_snapshot(&self) -> HashMap<AgentId, ServerStatusInfo> {
    let servers = self.servers.lock().await;
    servers.iter()
        .map(|(agent, server)| (*agent, server.status_info()))
        .collect()
}
AgentServerManager::update_server_error
async fn update_server_error(&self, agent: AgentId, message: String) {
    let mut servers = self.servers.lock().await;
    if let Some(server) = servers.get_mut(&agent) {
        server.status = ServerStatus::Error;
        server.start_time = None;
        server.last_error = Some(message);
    }
}
Implementation Plan
ACP Runtime Tracking
The AcpRuntime needs to track the state of each agent's backend process:
struct AgentProcessStatus {
    status: String, // "running" | "stopped" | "error"
    start_time: Option<Instant>,
    restart_count: u64,
    last_error: Option<String>,
}
Track:
- Process start → set status to "running", record start_time, increment restart_count
- Process exit (normal) → set status to "stopped", clear start_time
- Process exit (error) → set status to "error", record last_error, clear start_time
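The three transitions above could be sketched as methods on the planned struct. This is a minimal illustration, not the runtime's actual code: the method names `new`, `on_start`, and `on_exit` are hypothetical, and the initial "stopped" state for a never-started agent is an assumption.

```rust
use std::time::Instant;

/// Field layout follows the planned `AgentProcessStatus` struct.
#[derive(Debug)]
struct AgentProcessStatus {
    status: String, // "running" | "stopped" | "error"
    start_time: Option<Instant>,
    restart_count: u64,
    last_error: Option<String>,
}

impl AgentProcessStatus {
    /// Assumption: an agent that has never been started reports "stopped".
    fn new() -> Self {
        Self {
            status: "stopped".to_string(),
            start_time: None,
            restart_count: 0,
            last_error: None,
        }
    }

    /// Process start: mark running, record the start time, bump the counter.
    fn on_start(&mut self) {
        self.status = "running".to_string();
        self.start_time = Some(Instant::now());
        self.restart_count += 1;
    }

    /// Process exit: `None` means a normal exit, `Some(message)` an error exit.
    /// Both cases clear `start_time`; only the error case touches `last_error`.
    fn on_exit(&mut self, error: Option<String>) {
        self.start_time = None;
        match error {
            None => self.status = "stopped".to_string(),
            Some(message) => {
                self.status = "error".to_string();
                self.last_error = Some(message);
            }
        }
    }
}
```

Because the counter is incremented on every start, the first launch already yields a restart_count of 1 in this sketch. A normal exit leaves last_error untouched here, so the most recent failure stays visible after a later clean shutdown; whether it should be cleared instead is a design choice the plan does not settle.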
Add to AgentInfo
pub struct AgentInfo {
    // ... existing fields ...
    pub server_status: Option<ServerStatusInfo>,
}
Only include server_status for agents that use shared processes (Codex, OpenCode).
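One way the agent listing might fill in this field is sketched below, using simplified stand-in types without serde. The names `uses_shared_process`, `to_info`, and `server_status_for`, and the `Other` variant, are hypothetical. Reporting "stopped" for a shared-process agent that has no tracked entry yet is likewise an assumption.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Illustrative agent ids; `Other` stands in for any agent without a shared process.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum AgentId {
    Codex,
    OpenCode,
    Other,
}

/// Simplified stand-in for `ServerStatusInfo` (no serde, no base_url).
#[derive(Debug, Clone, PartialEq)]
struct ServerStatusInfo {
    status: String,
    uptime_ms: Option<u64>,
    restart_count: u64,
    last_error: Option<String>,
}

/// Tracked runtime state, with the fields from the plan.
struct AgentProcessStatus {
    status: String,
    start_time: Option<Instant>,
    restart_count: u64,
    last_error: Option<String>,
}

impl AgentProcessStatus {
    /// Convert tracked state into the API shape; uptime is measured up to `now`.
    fn to_info(&self, now: Instant) -> ServerStatusInfo {
        ServerStatusInfo {
            status: self.status.clone(),
            uptime_ms: self
                .start_time
                .map(|started| now.saturating_duration_since(started).as_millis() as u64),
            restart_count: self.restart_count,
            last_error: self.last_error.clone(),
        }
    }
}

/// Per the plan, only Codex and OpenCode use shared processes.
fn uses_shared_process(agent: AgentId) -> bool {
    matches!(agent, AgentId::Codex | AgentId::OpenCode)
}

/// Value for `AgentInfo::server_status`: `None` for agents without a shared process.
fn server_status_for(
    agent: AgentId,
    snapshot: &HashMap<AgentId, ServerStatusInfo>,
) -> Option<ServerStatusInfo> {
    if !uses_shared_process(agent) {
        return None;
    }
    // Assumption: a shared-process agent with no tracked entry reports "stopped".
    Some(snapshot.get(&agent).cloned().unwrap_or(ServerStatusInfo {
        status: "stopped".to_string(),
        uptime_ms: None,
        restart_count: 0,
        last_error: None,
    }))
}
```

Passing `now` into `to_info` rather than calling `Instant::now()` inside it keeps the uptime calculation deterministic in tests. Combined with `skip_serializing_if = "Option::is_none"` on the new field, a `None` result would keep the key out of the JSON response entirely for other agents.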
Files to Modify
| File | Change |
|---|---|
| server/packages/sandbox-agent/src/acp_runtime/mod.rs | Track agent process lifecycle (start/stop/error/restart count) per AgentId; expose status_snapshot() method |
| server/packages/sandbox-agent/src/router.rs | Add ServerStatus, ServerStatusInfo types; add server_status to AgentInfo; query runtime for status in get_v1_agents |
| sdks/typescript/src/client.ts | Update AgentInfo type with serverStatus |
| server/packages/sandbox-agent/tests/v1_api.rs | Test server status in agent listing |
Docs to Update
| Doc | Change |
|---|---|
| docs/openapi.json | Update /v1/agents response with server_status |
| docs/sdks/typescript.mdx | Document serverStatus field |