feat: expand api snapshots and schema tooling

2026-04-20 22:02:42 +00:00 · 2026-01-26 00:13:17 -08:00 · 2026-01-26 00:13:17 -08:00 · 011ca27287
commit 011ca27287
parent ee014b0838
72 changed files with 29480 additions and 1081 deletions
--- a/server/CLAUDE.md
+++ b/server/CLAUDE.md
@@ -0,0 +1,68 @@
+# Server Testing
+
+## Snapshot tests
+
+The HTTP/SSE snapshot suite lives in:
+- `server/packages/sandbox-agent/tests/http_sse_snapshots.rs`
+
+Snapshots are written to:
+- `server/packages/sandbox-agent/tests/snapshots/`
+
+## Agent selection
+
+`SANDBOX_TEST_AGENTS` controls which agents run. It accepts a comma-separated list or `all`.
+If it is **not set**, tests will auto-detect installed agents by checking:
+- binaries on `PATH`, and
+- the default install dir (`$XDG_DATA_HOME/sandbox-agent/bin` or `./.sandbox-agent/bin`)
+
+If no agents are found, tests fail with a clear error.
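
The selection rule described above can be sketched in Rust as follows. This is only an illustration, not the code from this commit: the `AgentSelection` enum and `parse_agent_selection` function are hypothetical names, the lowercase agent name `codex` is an assumption (only `claude` appears in the commands in this file), and treating an empty value as an empty list is also an assumption.

```rust
use std::env;

/// Hypothetical representation of what `SANDBOX_TEST_AGENTS` requested.
#[derive(Debug, PartialEq)]
enum AgentSelection {
    /// Variable not set: fall back to auto-detecting installed agents.
    AutoDetect,
    /// The literal value `all`.
    All,
    /// An explicit comma-separated list of agent names.
    Named(Vec<String>),
}

/// Parse the raw value of `SANDBOX_TEST_AGENTS` (or `None` if it is unset).
fn parse_agent_selection(raw: Option<&str>) -> AgentSelection {
    match raw.map(str::trim) {
        None => AgentSelection::AutoDetect,
        Some("all") => AgentSelection::All,
        // Assumption: blank entries are skipped, so an empty value
        // produces an empty list rather than triggering auto-detection.
        Some(list) => AgentSelection::Named(
            list.split(',')
                .map(str::trim)
                .filter(|name| !name.is_empty())
                .map(str::to_owned)
                .collect(),
        ),
    }
}

fn main() {
    let raw = env::var("SANDBOX_TEST_AGENTS").ok();
    println!("{:?}", parse_agent_selection(raw.as_deref()));
}
```

Keeping the parser as a pure function over `Option<&str>` means it can be exercised without mutating the process environment.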
+
+## Credential handling
+
+Credentials are pulled from the host by default via `extract_all_credentials`:
+- environment variables (e.g. `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)
+- local CLI configs (Claude/Codex/Amp/OpenCode)
+
+You can override host credentials for tests with:
+- `SANDBOX_TEST_ANTHROPIC_API_KEY`
+- `SANDBOX_TEST_OPENAI_API_KEY`
+
+If `SANDBOX_TEST_AGENTS` includes an agent that requires a provider credential and that credential is missing,
+tests fail before starting.
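
A minimal sketch of the override precedence described above, assuming the test-specific variable wins over the host value. The helper names `non_blank` and `pick_credential` are hypothetical, and treating a blank value as missing is an assumption. The real `extract_all_credentials` also reads local CLI configs, which this sketch leaves out.

```rust
use std::env;

/// Treat a blank value the same as a missing one (an assumption).
fn non_blank(value: Option<&str>) -> Option<&str> {
    value.filter(|v| !v.trim().is_empty())
}

/// Prefer the test override; otherwise fall back to the host credential.
fn pick_credential<'a>(
    test_override: Option<&'a str>,
    host: Option<&'a str>,
) -> Option<&'a str> {
    non_blank(test_override).or(non_blank(host))
}

fn main() {
    let test_override = env::var("SANDBOX_TEST_ANTHROPIC_API_KEY").ok();
    let host = env::var("ANTHROPIC_API_KEY").ok();
    match pick_credential(test_override.as_deref(), host.as_deref()) {
        Some(_) => println!("Anthropic credential available"),
        None => eprintln!("missing Anthropic credential"),
    }
}
```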
+
+## Credential health checks
+
+Before running agent tests, credentials are validated with minimal API calls:
+- Anthropic: `GET https://api.anthropic.com/v1/models`
+  - `x-api-key` for API keys
+  - `Authorization: Bearer` for OAuth tokens
+  - `anthropic-version: 2023-06-01`
+- OpenAI: `GET https://api.openai.com/v1/models` with `Authorization: Bearer`
+
+A 401 or 403 response yields a hard failure (`invalid credentials`). Any other non-2xx response or a network
+error fails with a health-check error.
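
The status handling above amounts to a three-way classification. The sketch below shows it in isolation. The `HealthCheck` enum and `classify_status` function are hypothetical names, and network errors (which never produce a status code) are not modeled here.

```rust
/// Hypothetical outcome of a credential health check.
#[derive(Debug, PartialEq)]
enum HealthCheck {
    /// Any 2xx response.
    Ok,
    /// 401 or 403: the credential itself was rejected.
    InvalidCredentials,
    /// Any other status, reported as a generic health-check error.
    Failed(u16),
}

/// Map an HTTP status code from the `/v1/models` probe to an outcome.
fn classify_status(status: u16) -> HealthCheck {
    match status {
        200..=299 => HealthCheck::Ok,
        401 | 403 => HealthCheck::InvalidCredentials,
        other => HealthCheck::Failed(other),
    }
}

fn main() {
    for status in [200, 401, 403, 429, 500] {
        println!("{status}: {:?}", classify_status(status));
    }
}
```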
+
+Health checks run in a blocking thread to avoid Tokio runtime drop errors inside async tests.
+
+## Snapshot stability
+
+To keep snapshots deterministic:
+- Event streams are truncated after the first assistant or error event.
+- Permission flow snapshots are truncated after the permission request (or first assistant) event.
+- Unknown events are preserved as `kind: unknown`, with the raw payload kept in the universal schema.
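
The first truncation rule can be illustrated with a short sketch. The `EventKind` enum and its variant names are hypothetical stand-ins for the real event types, and the sketch assumes the matching assistant or error event itself is kept, with everything after it dropped.

```rust
/// Hypothetical, simplified event kinds for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum EventKind {
    Started,
    Assistant,
    Error,
    Unknown,
}

/// Keep events up to and including the first assistant or error event.
/// If no such event exists, the whole stream is kept.
fn truncate_after_first_terminal(events: &[EventKind]) -> Vec<EventKind> {
    let end = events
        .iter()
        .position(|e| matches!(e, EventKind::Assistant | EventKind::Error))
        .map_or(events.len(), |idx| idx + 1);
    events[..end].to_vec()
}

fn main() {
    let events = [
        EventKind::Started,
        EventKind::Unknown,
        EventKind::Assistant,
        EventKind::Assistant,
        EventKind::Error,
    ];
    println!("{:?}", truncate_after_first_terminal(&events));
}
```

Cutting at a fixed, content-defined point keeps the snapshot independent of how many follow-up events a given agent run happens to emit.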
+
+## Typical commands
+
+Run only Claude snapshots:
+```
+SANDBOX_TEST_AGENTS=claude cargo test -p sandbox-agent-core --test http_sse_snapshots
+```
+
+Run all detected agents:
+```
+cargo test -p sandbox-agent-core --test http_sse_snapshots
+```
+
+## Universal Schema
+
+When modifying agent conversion code in `server/packages/universal-agent-schema/src/agents/` or adding/changing properties on the universal schema, update the feature matrix in `README.md` to reflect which agents support which features.