mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-17 03:03:48 +00:00
I need to update the terminology of 'capabilities' to 'feature coverage' in the Inspector UI and anywhere else it's mentioned that's not in the actual API (#61)
parent cc37ed0458
commit 24de9e686c
11 changed files with 35 additions and 32 deletions

@@ -40,7 +40,7 @@ Universal schema guidance:
 - Never use synthetic data or mocked responses in tests.
 - Never manually write agent types; always use generated types in `resources/agent-schemas/`. If types are broken, fix the generated types.
 - The universal schema must provide consistent behavior across providers; avoid requiring frontend/client logic to special-case agents.
-- The UI must reflect every field in AgentCapabilities; keep it in sync with `docs/session-transcript-schema.mdx` and `agent_capabilities_for`.
+- The UI must reflect every field in AgentCapabilities (feature coverage); keep it in sync with `docs/session-transcript-schema.mdx` and `agent_capabilities_for`.
 - When parsing agent data, if something is unexpected or does not match the schema, bail out and surface the error rather than trying to continue with partial parsing.
 - When defining the universal schema, choose the option most compatible with native agent APIs, and add synthetics to fill gaps for other agents.
 - Use `docs/session-transcript-schema.mdx` as the source of truth for schema terminology and keep it updated alongside schema changes.

@@ -11,7 +11,7 @@ icon: "comments"
 ```ts
 const { agents } = await client.listAgents();

-// Each agent has capabilities that determine what UI to show
+// Each agent exposes feature coverage via `capabilities` to determine what UI to show
 const claude = agents.find((a) => a.id === "claude");
 if (claude?.capabilities.permissions) {
 // Show permission approval UI

@@ -51,7 +51,7 @@ await client.createSession("demo-session", {
 await client.postMessage("demo-session", { message: "Hello" });
 ```

-List agents and pick a compatible one:
+List agents and inspect feature coverage (available on `capabilities`):

 ```ts
 const agents = await client.listAgents();
@@ -142,7 +142,7 @@ Parameters:

 ## Types

-The SDK exports OpenAPI-derived types for events, items, and capabilities:
+The SDK exports OpenAPI-derived types for events, items, and feature coverage:

 ```ts
 import type { UniversalEvent, UniversalItem, AgentCapabilities } from "sandbox-agent";

@@ -10,7 +10,7 @@ The schema is defined in [OpenAPI format](https://github.com/rivet-dev/sandbox-a

 ## Coverage Matrix

-This table shows which agent capabilities appear in the universal event stream. All agents retain their full native capabilities—this only reflects what's normalized into the schema.
+This table shows which agent feature coverage appears in the universal event stream. All agents retain their full native feature coverage—this only reflects what's normalized into the schema.

 | Feature | Claude | Codex | OpenCode | Amp |
 |--------------------|:------:|:-----:|:------------:|:------------:|

@@ -1676,14 +1676,14 @@
 white-space: nowrap;
 }

-/* Capability Badges */
-.capability-badges {
+/* Feature Coverage Badges */
+.feature-coverage-badges {
 display: flex;
 flex-wrap: wrap;
 gap: 6px;
 }

-.capability-badge {
+.feature-coverage-badge {
 display: inline-flex;
 align-items: center;
 gap: 4px;
@@ -1693,17 +1693,17 @@
 font-weight: 500;
 }

-.capability-badge.enabled {
+.feature-coverage-badge.enabled {
 background: rgba(48, 209, 88, 0.12);
 color: var(--success);
 }

-.capability-badge.disabled {
+.feature-coverage-badge.disabled {
 background: rgba(255, 255, 255, 0.04);
 color: var(--muted-2);
 }

-.capability-badge svg {
+.feature-coverage-badge svg {
 flex-shrink: 0;
 }


@@ -18,7 +18,7 @@ import {
 Terminal,
 Wrench
 } from "lucide-react";
-import type { AgentCapabilitiesView } from "../../types/agents";
+import type { FeatureCoverageView } from "../../types/agents";

 const badges = [
 { key: "planMode", label: "Plan", icon: GitBranch },
@@ -42,14 +42,14 @@ const badges = [

 type BadgeItem = (typeof badges)[number];

-const getEnabled = (capabilities: AgentCapabilitiesView, key: BadgeItem["key"]) =>
-Boolean((capabilities as Record<string, boolean | undefined>)[key]);
+const getEnabled = (featureCoverage: FeatureCoverageView, key: BadgeItem["key"]) =>
+Boolean((featureCoverage as Record<string, boolean | undefined>)[key]);

-const CapabilityBadges = ({ capabilities }: { capabilities: AgentCapabilitiesView }) => {
+const FeatureCoverageBadges = ({ featureCoverage }: { featureCoverage: FeatureCoverageView }) => {
 return (
-<div className="capability-badges">
+<div className="feature-coverage-badges">
 {badges.map(({ key, label, icon: Icon }) => (
-<span key={key} className={`capability-badge ${getEnabled(capabilities, key) ? "enabled" : "disabled"}`}>
+<span key={key} className={`feature-coverage-badge ${getEnabled(featureCoverage, key) ? "enabled" : "disabled"}`}>
 <Icon size={12} />
 <span>{label}</span>
 </span>
@@ -58,4 +58,4 @@ const CapabilityBadges = ({ capabilities }: { capabilities: AgentCapabilitiesVie
 );
 };

-export default CapabilityBadges;
+export default FeatureCoverageBadges;
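
The renamed lookup can be exercised outside React. The following is a minimal sketch, not the repository's code: `AgentCapabilities` is a local stand-in for the type exported by `sandbox-agent` (only flags visible in this commit are reproduced), the two-entry `badges` array is an abbreviated, hypothetical subset with icons omitted, and `coverage` and `classNames` are illustrative names.

```ts
// Stand-in for the SDK's AgentCapabilities (assumption: only fields visible in this diff).
type AgentCapabilities = {
  planMode: boolean;
  permissions: boolean;
  questions: boolean;
};

// UI-side view: API flags plus optional UI-only flags, as in types/agents.
type FeatureCoverageView = AgentCapabilities & {
  toolResults?: boolean;
  textMessages?: boolean;
  images?: boolean;
};

// Abbreviated badge list (hypothetical subset; the "Images" label is illustrative).
const badges = [
  { key: "planMode", label: "Plan" },
  { key: "images", label: "Images" },
] as const;

type BadgeItem = (typeof badges)[number];

// A flag that is missing or undefined is treated as disabled.
const getEnabled = (featureCoverage: FeatureCoverageView, key: BadgeItem["key"]): boolean =>
  Boolean((featureCoverage as Record<string, boolean | undefined>)[key]);

// Example input: plan mode supported, `images` left unset.
const coverage: FeatureCoverageView = { planMode: true, permissions: false, questions: false };

// Same class-name expression the component uses for each badge.
const classNames = badges.map(
  ({ key }) => `feature-coverage-badge ${getEnabled(coverage, key) ? "enabled" : "disabled"}`
);
```

Note that only the UI-facing names change here; the value passed in still comes from the API's `capabilities` field.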

@@ -1,7 +1,7 @@
 import { Download, RefreshCw } from "lucide-react";
 import type { AgentInfo, AgentModeInfo } from "sandbox-agent";
-import CapabilityBadges from "../agents/CapabilityBadges";
-import { emptyCapabilities } from "../../types/agents";
+import FeatureCoverageBadges from "../agents/FeatureCoverageBadges";
+import { emptyFeatureCoverage } from "../../types/agents";

 const AgentsTab = ({
 agents,
@@ -41,7 +41,7 @@ const AgentsTab = ({
 installed: false,
 version: undefined,
 path: undefined,
-capabilities: emptyCapabilities
+capabilities: emptyFeatureCoverage
 }))).map((agent) => (
 <div key={agent.id} className="card">
 <div className="card-header">
@@ -54,8 +54,11 @@ const AgentsTab = ({
 {agent.version ? `v${agent.version}` : "Version unknown"}
 {agent.path && <span className="mono muted" style={{ marginLeft: 8 }}>{agent.path}</span>}
 </div>
+<div className="card-meta" style={{ marginTop: 8 }}>
+Feature coverage
+</div>
 <div style={{ marginTop: 8 }}>
-<CapabilityBadges capabilities={agent.capabilities ?? emptyCapabilities} />
+<FeatureCoverageBadges featureCoverage={agent.capabilities ?? emptyFeatureCoverage} />
 </div>
 {modesByAgent[agent.id] && modesByAgent[agent.id].length > 0 && (
 <div className="card-meta" style={{ marginTop: 8 }}>

@@ -1,6 +1,6 @@
 import type { AgentCapabilities } from "sandbox-agent";

-export type AgentCapabilitiesView = AgentCapabilities & {
+export type FeatureCoverageView = AgentCapabilities & {
 toolResults?: boolean;
 textMessages?: boolean;
 images?: boolean;
@@ -16,7 +16,7 @@ export type AgentCapabilitiesView = AgentCapabilities & {
 itemStarted?: boolean;
 };

-export const emptyCapabilities: AgentCapabilitiesView = {
+export const emptyFeatureCoverage: FeatureCoverageView = {
 planMode: false,
 permissions: false,
 questions: false,

@@ -140,7 +140,7 @@ const permissions: PermissionRule[] = [

 No documented agent mode concept. Behavior controlled via:
 - `--toolbox` flag for different tool configurations
-- Permission rules for capability restrictions
+- Permission rules for feature coverage restrictions

 ### Bypass All Permissions


@@ -567,7 +567,7 @@ console.log(`Running on: ${provider.provider} (${provider.confidence} confidence
 Several open-source projects implement cloud detection patterns:

 - **cloud-detect** (Python, `pip install cloud-detect`): Detects AWS, GCP, Azure, Alibaba, DigitalOcean, Oracle via filesystem + metadata
-- **cloud-detect-js** (Node, `npm install cloud-detect-js`): JavaScript port with similar capabilities
+- **cloud-detect-js** (Node, `npm install cloud-detect-js`): JavaScript port with similar feature coverage
 - **banzaicloud/satellite** (Go): Uses two-tier detection with sysfs first, then metadata fallback
 - **OpenTelemetry Resource Detectors**: Production-grade detectors across Node.js, Python, Go — use `@opentelemetry/resource-detector-aws`, `@opentelemetry/resource-detector-gcp`, etc.


@@ -15,7 +15,7 @@ Place all new tests under `server/packages/**/tests/` (or a package-specific `te
 - Agent management coverage in `agent-management/`
 - Shared server manager coverage in `server-manager/`
 - HTTP endpoint snapshots in `http/` (snapshots in `http/snapshots/`)
-- Session capability snapshots in `sessions/` (one file per capability, e.g. `session_lifecycle.rs`, `permissions.rs`, `questions.rs`, `reasoning.rs`, `status.rs`; snapshots in `sessions/snapshots/`)
+- Session feature coverage snapshots in `sessions/` (one file per feature, e.g. `session_lifecycle.rs`, `permissions.rs`, `questions.rs`, `reasoning.rs`, `status.rs`; snapshots in `sessions/snapshots/`)
 - UI coverage in `ui/`
 - Shared helpers in `common/`
 - Extracted agent schema roundtrip tests live under `server/packages/extracted-agent-schemas/tests/`
@@ -30,7 +30,7 @@ Session snapshot entrypoint:

 Snapshots are written to:
 - `server/packages/sandbox-agent/tests/http/snapshots/` (HTTP endpoint snapshots)
-- `server/packages/sandbox-agent/tests/sessions/snapshots/` (session/capability snapshots)
+- `server/packages/sandbox-agent/tests/sessions/snapshots/` (session/feature coverage snapshots)

 ## Agent selection

@@ -80,7 +80,7 @@ To keep snapshots deterministic:
 - IDs, timestamps, native IDs
 - text content, tool inputs/outputs, provider-specific metadata
 - `source` and `synthetic` flags (these are implementation details)
-- Scrub `reasoning` and `status` content from session-baseline snapshots to keep the core event skeleton consistent across agents; validate those content types separately in their capability-specific tests.
+- Scrub `reasoning` and `status` content from session-baseline snapshots to keep the core event skeleton consistent across agents; validate those content types separately in their feature-coverage-specific tests.
 - The sandbox-agent is responsible for emitting **synthetic events** so that real agents match the mock sequence exactly.
 - Event streams are truncated after the first assistant or error event.
 - Permission flow snapshots are truncated after the permission request (or first assistant) event.
@@ -110,9 +110,9 @@ cargo test -p sandbox-agent --test http_endpoints

 When modifying agent conversion code in `server/packages/universal-agent-schema/src/agents/` or adding/changing properties on the universal schema, update the feature matrix in `README.md` to reflect which agents support which features.

-## Capabilities sync
+## Feature coverage sync

-When updating agent capabilities (flags or values), keep them in sync across:
+When updating agent feature coverage (flags or values), keep them in sync across:
 - `README.md` (feature matrix / documented support)
 - server Rust implementation (`AgentCapabilities` + `agent_capabilities_for`)
-- frontend capability views/badges (Inspector UI)
+- frontend feature coverage views/badges (Inspector UI)
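
One way to make the frontend part of this sync mechanical is a compile-time key check. The sketch below is illustrative only: `AgentCapabilities` is a local, abbreviated stand-in for the type exported by `sandbox-agent`, and `badgeKeys`, `checkedBadgeKeys`, `noMissingBadge`, and `enabledBadges` are hypothetical names that do not appear in the repository.

```ts
// Local stand-in for the SDK's AgentCapabilities (assumption: abbreviated to flags shown in this commit).
type AgentCapabilities = {
  planMode: boolean;
  permissions: boolean;
  questions: boolean;
};

// UI-side view type, mirroring the shape shown in this commit.
type FeatureCoverageView = AgentCapabilities & {
  toolResults?: boolean;
  itemStarted?: boolean;
};

// Hypothetical list of keys the Inspector renders as badges.
const badgeKeys = ["planMode", "permissions", "questions", "toolResults"] as const;

// Check 1: this assignment fails to type-check if a badge key is not a field
// of FeatureCoverageView (for example after a flag is renamed or removed).
const checkedBadgeKeys: readonly (keyof FeatureCoverageView)[] = badgeKeys;

// Check 2: this fails to type-check if an API flag has no badge, which is
// one way to enforce "the UI must reflect every field in AgentCapabilities".
type MissingBadge = Exclude<keyof AgentCapabilities, (typeof badgeKeys)[number]>;
const noMissingBadge: [MissingBadge] extends [never] ? true : never = true;

const emptyFeatureCoverage: FeatureCoverageView = {
  planMode: false,
  permissions: false,
  questions: false,
};

// Runtime helper: the badge keys whose flag is set, in badge order.
const enabledBadges = (coverage: FeatureCoverageView): string[] =>
  checkedBadgeKeys.filter((key) => Boolean(coverage[key]));
```

A check like this covers only the TypeScript side; `README.md` and the Rust `agent_capabilities_for` implementation still have to be updated by hand.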