feat(foundry): memory investigation tooling and VFS pool spec

Add memory monitoring instrumentation, investigation findings, and
SQLite VFS pool design spec for addressing WASM SQLite memory spikes.

- Add /debug/memory endpoint and periodic memory logging (dev only)
- Add mem-monitor.sh script for continuous memory profiling with
  automatic heap snapshot capture on spike detection
- Add configureRunnerPool to registry setup for engine driver support
- Document memory investigation findings (per-actor cost, spike behavior)
- Write SQLite VFS pool spec for bin-packing actors onto shared WASM instances
- Add foundry-mem-monitor and foundry-dev-engine justfile recipes
- Add compose.dev.yaml engine driver and platform support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nathan Flurry 2026-03-17 23:46:03 -07:00
parent 7b23e519c2
commit ee99d0b318
18 changed files with 888 additions and 496 deletions

@@ -136,6 +136,7 @@ Do not use polling (`refetchInterval`), empty "go re-fetch" broadcast events, or
- **Task actor** materializes its own detail state (session summaries, sandbox info, diffs, file tree). `getTaskDetail` reads from the task actor's own SQLite. The task actor broadcasts updates directly to clients connected to it.
- **Session data** lives on the task actor but is a separate subscription topic. The task topic includes `sessions_summary` (list without content). The `session` topic provides full transcript and draft state. Clients subscribe to the `session` topic for whichever session is active, and filter `sessionUpdated` events by session ID (ignoring events for other sessions on the same actor).
- There is no fan-out on the read path. The organization actor owns all task summaries locally.
- **Never build client-side fan-out** that iterates task summaries and calls `getTaskDetail`/`getSessionDetail` on each. This wakes every actor simultaneously and causes OOM crashes in production (~25 MB per actor wake). The subscription system connects to at most 4 actors at a time (app + org + task + session).
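
A minimal sketch of the two rules above, not the actual client code: filtering `sessionUpdated` events by session ID, and a back-of-envelope estimate of why fan-out is dangerous. The event shape and the names `SessionUpdatedEvent`, `isForActiveSession`, and `estimateWakeCostMb` are hypothetical; only the `sessionUpdated` event name and the ~25 MB per actor wake figure come from the text.

```typescript
// Hypothetical event shape; the real payload may carry different fields.
interface SessionUpdatedEvent {
  type: "sessionUpdated";
  sessionId: string;
  payload: unknown;
}

// Keep only events for the session the client is currently viewing.
// Events for other sessions on the same task actor are ignored.
function isForActiveSession(
  event: SessionUpdatedEvent,
  activeSessionId: string | null,
): boolean {
  return activeSessionId !== null && event.sessionId === activeSessionId;
}

// Approximate figure quoted in the guideline above (~25 MB per actor wake).
const APPROX_MB_PER_ACTOR_WAKE = 25;

// Rough memory cost of waking `actorCount` actors at the same time.
function estimateWakeCostMb(actorCount: number): number {
  return actorCount * APPROX_MB_PER_ACTOR_WAKE;
}

const events: SessionUpdatedEvent[] = [
  { type: "sessionUpdated", sessionId: "s1", payload: { draft: "a" } },
  { type: "sessionUpdated", sessionId: "s2", payload: { draft: "b" } },
  { type: "sessionUpdated", sessionId: "s1", payload: { draft: "c" } },
];

const visible = events.filter((e) => isForActiveSession(e, "s1"));
console.log(visible.length); // 2

// At most 4 connections versus a fan-out over 200 tasks.
console.log(estimateWakeCostMb(4), estimateWakeCostMb(200)); // 100 5000
```

Under these assumptions, the subscription path stays near 100 MB, while iterating 200 task summaries would wake roughly 5 GB worth of actors at once.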
### Subscription manager