diff --git a/CLAUDE.md b/CLAUDE.md index aa1994b..12b33d0 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,20 +2,24 @@ Multi-agent orchestration system for Claude Code. Scale horizontally (multiple planning sessions) and vertically (weavers executing in parallel). +**Read the complete workflow:** [WORKFLOW.md](WORKFLOW.md) + ## Architecture ``` You (Terminal) - | - v -Planner (interactive) <- You talk here - | - v (specs) + │ + │ /plan + ▼ +Planner (interactive) ← You talk here + │ + │ (writes specs) + ▼ Orchestrator (tmux background) - | - +-> Weaver 01 (tmux) -> Verifier (subagent) -> PR - +-> Weaver 02 (tmux) -> Verifier (subagent) -> PR - +-> Weaver 03 (tmux) -> Verifier (subagent) -> PR + │ + ├─→ Weaver 01 (tmux) → Verifier (subagent) → PR + ├─→ Weaver 02 (tmux) → Verifier (subagent) → PR + └─→ Weaver 03 (tmux) → Verifier (subagent) → PR ``` ## Commands @@ -57,18 +61,44 @@ claude state.json # Orchestrator state summary.md # Human-readable results weavers/ - w-01.json # Weaver status + session ID + w-01.json # Weaver status w-02.json ``` ## All Agents Use Opus -Every agent in the system uses `claude-opus-4-5-20250514`: +Every agent uses `claude-opus-4-5-20250514`: - Planner - Orchestrator - Weavers - Verifiers (subagents) +## Key Rules + +### Tests Never Ship + +Weavers may write tests for verification. They are **never committed**: +```bash +git reset HEAD -- '*.test.*' '*.spec.*' '__tests__/' 'tests/' +``` + +### PRs Are Always Created + +A weaver's success = PR created. No PR = failure. + +### Verification Is Mandatory + +Weavers spawn verifier subagents. No self-verification. + +### Context Is Isolated + +| Agent | Sees | Doesn't See | +|-------|------|-------------| +| Planner | Full codebase, human | Weaver impl | +| Orchestrator | Specs, skill index | Actual code | +| Weaver | Its spec, its skills | Other weavers | +| Verifier | Verification spec | Building spec | + ## Tmux Session Naming ``` @@ -79,16 +109,10 @@ vertical--w-02 # Weaver 2 ## Skill Index -Skills live in `skill-index/skills/`. The orchestrator uses `skill-index/index.yaml` to match `skill_hints` from specs to actual skills. - -To add a skill: -1. Create `skill-index/skills//SKILL.md` -2. Add entry to `skill-index/index.yaml` +Skills live in `skill-index/skills/`. The orchestrator uses `skill-index/index.yaml` to match `skill_hints` from specs. ## Resuming Sessions -Every Claude session can be resumed: - ```bash # Get session ID from weaver status cat .claude/vertical/plans//run/weavers/w-01.json | jq -r .session_id @@ -100,20 +124,12 @@ claude --resume ## Debugging ```bash -# Source helpers source lib/tmux.sh -# List all sessions -vertical_list_sessions - -# Attach to a weaver -vertical_attach vertical-plan-20260119-1430-w-01 - -# Capture output -vertical_capture_output vertical-plan-20260119-1430-w-01 - -# Kill a plan's sessions -vertical_kill_plan plan-20260119-1430 +vertical_list_sessions # List all sessions +vertical_attach vertical-plan-20260119-1430-w-01 # Attach to weaver +vertical_capture_output # Capture output +vertical_kill_plan plan-20260119-1430 # Kill plan sessions ``` ## Spec Format @@ -149,18 +165,19 @@ pr: title: "feat: description" ``` -## Context Isolation +## Oracle for Complex Planning -Each agent gets minimal, focused context: +For complex tasks (5+ specs, unclear dependencies, architecture decisions), the planner invokes Oracle: -| Agent | Receives | Does NOT Receive | -|-------|----------|------------------| -| Planner | Full codebase, your questions | Weaver implementation | -| Orchestrator | Specs, skill index, status | Actual code | -| Weaver | Spec + skills | Other weavers' work | -| Verifier | Verification spec only | Building requirements | +```bash +npx -y @steipete/oracle \ + --engine browser \ + --model gpt-5.2-codex \ + -p "$(cat /tmp/oracle-prompt.txt)" \ + --file "src/**" +``` -This prevents context bloat and keeps agents focused. +Oracle takes 10-60 minutes and outputs `plan.md`. The planner transforms this into spec YAMLs. ## Parallel Execution @@ -173,13 +190,3 @@ Terminal 3: /plan notification system ``` Each spawns its own orchestrator and weavers. All run in parallel. - -## Weavers Always Create PRs - -Weavers follow the eval-skill pattern: -1. Build implementation -2. Spawn verifier subagent -3. Fix on failure (max 5 iterations) -4. Create PR on success - -No PR = failure. This is enforced. diff --git a/README.md b/README.md index 9a5c930..b9db6df 100644 --- a/README.md +++ b/README.md @@ -31,20 +31,30 @@ claude > /status plan-20260119-1430 ``` +## Documentation + +| Document | Description | +|----------|-------------| +| [WORKFLOW.md](WORKFLOW.md) | Complete workflow guide with examples | +| [CLAUDE.md](CLAUDE.md) | Project instructions and quick reference | +| [docs/spec-schema-v2.md](docs/spec-schema-v2.md) | Full spec YAML schema | + ## Architecture ``` You (Terminal) - | - v -Planner (interactive) <- You talk here - | - v (specs) + │ + │ /plan + ▼ +Planner (interactive) ← You talk here + │ + │ (writes specs) + ▼ Orchestrator (tmux background) - | - +-> Weaver 01 (tmux) -> Verifier (subagent) -> PR - +-> Weaver 02 (tmux) -> Verifier (subagent) -> PR - +-> Weaver 03 (tmux) -> Verifier (subagent) -> PR + │ + ├─→ Weaver 01 (tmux) → Verifier (subagent) → PR + ├─→ Weaver 02 (tmux) → Verifier (subagent) → PR + └─→ Weaver 03 (tmux) → Verifier (subagent) → PR ``` ## Commands @@ -60,15 +70,17 @@ Orchestrator (tmux background) ``` claude-code-vertical/ ├── CLAUDE.md # Project instructions +├── WORKFLOW.md # Complete workflow guide ├── skills/ │ ├── planner/ # Interactive planning │ ├── orchestrator/ # Tmux + weaver management │ ├── weaver-base/ # Base skill for all weavers -│ └── verifier/ # Verification subagent +│ ├── verifier/ # Verification subagent +│ └── oracle/ # Deep planning with GPT-5.2 Codex ├── commands/ -│ ├── plan.md -│ ├── build.md -│ └── status.md +│ ├── vertical-plan.md +│ ├── vertical-build.md +│ └── vertical-status.md ├── skill-index/ │ ├── index.yaml # Skill registry │ └── skills/ # Available skills @@ -83,16 +95,17 @@ claude-code-vertical/ Every agent uses `claude-opus-4-5-20250514` for maximum capability. +## Key Rules + +1. **Tests never ship** - Weavers may write tests for verification, but they're never committed +2. **PRs always created** - Weaver success = PR created +3. **Verification mandatory** - Weavers spawn verifier subagents +4. **Context isolated** - Each agent sees only what it needs + ## Skill Index The orchestrator matches `skill_hints` from specs to skills in `skill-index/index.yaml`. -Current skills include: -- Swift/iOS development (concurrency, SwiftUI, testing, debugging) -- Build and memory debugging -- Database and networking patterns -- Agent orchestration tools - ## Resume Any Session ```bash @@ -113,3 +126,13 @@ vertical_list_sessions # List tmux sessions vertical_attach # Attach to session vertical_kill_plan # Kill all sessions for a plan ``` + +## Oracle for Complex Planning + +For complex tasks, the planner invokes Oracle (GPT-5.2 Codex) for deep planning: + +```bash +npx -y @steipete/oracle --engine browser --model gpt-5.2-codex ... +``` + +Oracle runs 10-60 minutes and outputs `plan.md`, which the planner transforms into specs. diff --git a/WORKFLOW.md b/WORKFLOW.md new file mode 100644 index 0000000..9aca073 --- /dev/null +++ b/WORKFLOW.md @@ -0,0 +1,628 @@ +# Claude Code Vertical - Complete Workflow + +This document describes the complete multi-agent workflow for Claude Code Vertical. + +## Philosophy + +This system is **prompt-driven**. Every agent is Claude reading markdown instructions. There is no traditional code—the markdown files ARE the program. Claude's ability to follow instructions precisely makes this possible. + +**Key principles:** +- Tight prompts with no ambiguity +- Each agent has a single, focused responsibility +- Context isolation prevents bloat and confusion +- Human remains in control at key decision points + +--- + +## Architecture Overview + +``` +┌─────────────────────────────────────────────────────────────────────────────┐ +│ HUMAN │ +│ │ +│ You talk to the Planner. Everything else runs in the background. │ +│ │ +│ Commands: │ +│ /plan Start planning session │ +│ /build Execute plan (spawns orchestrator + weavers) │ +│ /status Check progress │ +│ │ +└─────────────────────────────────────────────────────────────────────────────┘ + │ + │ /plan + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ PLANNER (Interactive) │ +│ │ +│ Location: Your terminal (direct Claude session) │ +│ Skill: skills/planner/SKILL.md │ +│ │ +│ Responsibilities: │ +│ • Ask clarifying questions │ +│ • Research the codebase │ +│ • Invoke Oracle for complex tasks (optional) │ +│ • Design specs (each spec = one PR) │ +│ • Write specs to .claude/vertical/plans//specs/ │ +│ │ +│ Output: Spec YAML files + meta.json │ +│ Handoff: Tells you to run /build │ +│ │ +└─────────────────────────────────────────────────────────────────────────────┘ + │ + │ /build + ▼ +┌─────────────────────────────────────────────────────────────────────────────┐ +│ ORCHESTRATOR (tmux background) │ +│ │ +│ Location: tmux session "vertical--orch" │ +│ Skill: skills/orchestrator/SKILL.md │ +│ │ +│ Responsibilities: │ +│ • Read specs and analyze dependencies │ +│ • Match skill_hints → skill-index │ +│ • Launch weavers in parallel tmux sessions │ +│ • Track progress by polling status files │ +│ • Launch dependent specs when dependencies complete │ +│ • Write summary.md when all done │ +│ │ +│ Human Interaction: None (runs autonomously) │ +│ Output: run/state.json, run/summary.md │ +│ │ +└─────────────────────────────────────────────────────────────────────────────┘ + │ + ┌────────────────────────┼────────────────────────┐ + │ │ │ + ▼ ▼ ▼ +┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ +│ WEAVER 01 │ │ WEAVER 02 │ │ WEAVER 03 │ +│ │ │ │ │ │ +│ tmux: vertical- │ │ tmux: vertical- │ │ tmux: vertical- │ +│ -w-01 │ │ -w-02 │ │ -w-03 │ +│ │ │ │ │ │ +│ Skill: │ │ Skill: │ │ Skill: │ +│ weaver-base + │ │ weaver-base + │ │ weaver-base + │ +│ matched skills │ │ matched skills │ │ matched skills │ +│ │ │ │ │ │ +│ 1. Parse spec │ │ 1. Parse spec │ │ 1. (waiting for │ +│ 2. Build code │ │ 2. Build code │ │ dependency) │ +│ 3. Spawn verifier │ │ 3. Spawn verifier │ │ │ +│ 4. Fix if failed │ │ 4. Fix if failed │ │ │ +│ 5. Create PR │ │ 5. Create PR │ │ │ +│ │ │ │ │ │ +│ Output: PR #42 │ │ Output: PR #43 │ │ │ +└─────────┬─────────┘ └─────────┬─────────┘ └───────────────────┘ + │ │ + │ Task tool │ Task tool + ▼ ▼ +┌───────────────────┐ ┌───────────────────┐ +│ VERIFIER │ │ VERIFIER │ +│ (subagent) │ │ (subagent) │ +│ │ │ │ +│ • Run checks │ │ • Run checks │ +│ • Fast-fail │ │ • Fast-fail │ +│ • PASS or FAIL │ │ • PASS or FAIL │ +└───────────────────┘ └───────────────────┘ +``` + +--- + +## Complete Workflow + +### Phase 1: Planning + +**You start a planning session:** + +```bash +$ claude +> /plan +``` + +**What happens:** + +1. Claude loads `skills/planner/SKILL.md` +2. Claude generates plan ID: `plan-20260119-143052` +3. Claude creates `.claude/vertical/plans//specs/` directory +4. Claude asks: "What would you like to build?" + +**Interactive Q&A:** + +``` +Planner: What would you like to build? + +You: I want to add user authentication with Google OAuth and email/password. + +Planner: Let me understand the requirements better. + + 1. What's your tech stack? + +You: Next.js with Prisma and PostgreSQL. + +Planner: 2. Do you need role-based access control? + +You: Yes, admin and user roles. + +Planner: 3. Any specific features? + +You: Password reset and email verification. + +Planner: Let me research your codebase... + + [Reads existing patterns, checks file structure] + + I found: + - Stack: Next.js 14, Prisma, PostgreSQL + - Existing patterns: Using NextAuth would fit well + - Files to modify: prisma/schema.prisma, src/pages/api/... + + This is a complex task. I'll use Oracle for deep planning. +``` + +**Oracle (for complex tasks):** + +``` +Planner: Running Oracle for deep planning... + This typically takes 10-60 minutes. + I'll monitor and let you know when ready. + + [Oracle runs via browser engine with GPT-5.2 Codex] + + ... + +Planner: Oracle complete. Here's the breakdown: + + Phase 1: Schema + 01-schema.yaml - Add User, Session, Account tables + + Phase 2: Backend + 02-password-service.yaml - Password hashing with bcrypt + 03-google-oauth.yaml - Google OAuth integration + 04-auth-endpoints.yaml - Login/register/reset endpoints + + Phase 3: Frontend + 05-login-ui.yaml - Login page with Google button + + Parallel: 01, 02, 03 can run together + Sequential: 04 waits for 02 and 03; 05 waits for 04 + + Approve this breakdown? [yes/modify] +``` + +**Specs written:** + +``` +Planner: ════════════════════════════════════════════════════════════════ + PLANNING COMPLETE: plan-20260119-143052 + ════════════════════════════════════════════════════════════════ + + Specs created: + .claude/vertical/plans/plan-20260119-143052/specs/ + 01-schema.yaml + 02-password-service.yaml + 03-google-oauth.yaml + 04-auth-endpoints.yaml + 05-login-ui.yaml + + To execute: + /build plan-20260119-143052 + + ════════════════════════════════════════════════════════════════ +``` + +### Phase 2: Execution + +**You trigger the build:** + +```bash +> /build plan-20260119-143052 +``` + +**What happens:** + +1. Claude generates orchestrator prompt with plan context +2. Claude launches tmux session: `vertical-plan-20260119-143052-orch` +3. Claude confirms launch and returns control to you + +``` +════════════════════════════════════════════════════════════════ +BUILD LAUNCHED: plan-20260119-143052 +════════════════════════════════════════════════════════════════ + +Orchestrator: vertical-plan-20260119-143052-orch + +Monitor commands: + /status plan-20260119-143052 + tmux attach -t vertical-plan-20260119-143052-orch + +Results will be at: + .claude/vertical/plans/plan-20260119-143052/run/summary.md + +════════════════════════════════════════════════════════════════ +``` + +**Behind the scenes (orchestrator):** + +``` +Orchestrator: Reading specs... + Found 5 specs. + + Dependency analysis: + 01-schema.yaml → base: main (parallel) + 02-password.yaml → base: main (parallel) + 03-google-oauth.yaml → base: main (parallel) + 04-endpoints.yaml → base: feature/password (waits) + 05-login-ui.yaml → base: feature/endpoints (waits) + + Matching skills... + 01 → database-patterns + 02 → security-patterns + 03 → security-patterns + 04 → api-design + 05 → react-patterns + + Launching parallel weavers... + vertical-plan-20260119-143052-w-01 → 01-schema + vertical-plan-20260119-143052-w-02 → 02-password + vertical-plan-20260119-143052-w-03 → 03-google-oauth + + Polling status... +``` + +**Behind the scenes (weaver):** + +``` +Weaver 01: Parsing spec 01-schema.yaml... + + Requirements: + - Add User model to Prisma + - Add Session model + - Add Account model for OAuth + + Building... + + Files created: + + prisma/schema.prisma (modified) + + Spawning verifier... + +Verifier: 1. [command] npx prisma validate → PASS + 2. [command] npx prisma generate → PASS + 3. [file-contains] model User → PASS + + RESULT: PASS + +Weaver 01: Verification passed. Creating PR... + + git checkout -b auth/01-schema main + git add prisma/schema.prisma + git commit -m "feat(auth): add user authentication schema" + git push -u origin auth/01-schema + gh pr create... + + PR #42 created. +``` + +### Phase 3: Monitoring + +**You check status periodically:** + +```bash +> /status plan-20260119-143052 +``` + +``` +=== Plan: plan-20260119-143052 === +Status: running +Started: 2026-01-19T14:35:00Z + +=== Specs === + 01-schema.yaml + 02-password-service.yaml + 03-google-oauth.yaml + 04-auth-endpoints.yaml + 05-login-ui.yaml + +=== Weavers === + w-01 complete 01-schema.yaml PR #42 + w-02 verifying 02-password-service.yaml - + w-03 complete 03-google-oauth.yaml PR #44 + w-04 waiting 04-auth-endpoints.yaml - + w-05 waiting 05-login-ui.yaml - + +=== Tmux Sessions === + vertical-plan-20260119-143052-orch running + vertical-plan-20260119-143052-w-02 running +``` + +**Optional: Attach to watch:** + +```bash +$ tmux attach -t vertical-plan-20260119-143052-w-02 +``` + +### Phase 4: Results + +**When complete, the orchestrator outputs:** + +``` +╔══════════════════════════════════════════════════════════════════╗ +║ BUILD COMPLETE: plan-20260119-143052 ║ +╠══════════════════════════════════════════════════════════════════╣ +║ ✓ 01-schema complete PR #42 ║ +║ ✓ 02-password complete PR #43 ║ +║ ✓ 03-google-oauth complete PR #44 ║ +║ ✓ 04-endpoints complete PR #45 ║ +║ ✓ 05-login-ui complete PR #46 ║ +╠══════════════════════════════════════════════════════════════════╣ +║ ║ +║ Summary: .claude/vertical/plans/plan-20260119-143052/run/summary.md +║ PRs: gh pr list ║ +║ ║ +╚══════════════════════════════════════════════════════════════════╝ +``` + +**View summary:** + +```bash +$ cat .claude/vertical/plans/plan-20260119-143052/run/summary.md + +# Build Complete: plan-20260119-143052 + +**Status**: complete +**Started**: 2026-01-19T14:35:00Z +**Completed**: 2026-01-19T15:12:00Z + +## Results + +| Spec | Status | PR | +|------|--------|-----| +| 01-schema | ✓ complete | [#42](https://github.com/.../42) | +| 02-password | ✓ complete | [#43](https://github.com/.../43) | +| 03-google-oauth | ✓ complete | [#44](https://github.com/.../44) | +| 04-endpoints | ✓ complete | [#45](https://github.com/.../45) | +| 05-login-ui | ✓ complete | [#46](https://github.com/.../46) | + +## PRs Ready for Review + +Merge in this order (stacked): +1. #42 - feat(auth): add user authentication schema +2. #43 - feat(auth): add password hashing service +3. #44 - feat(auth): add Google OAuth integration +4. #45 - feat(auth): add authentication endpoints +5. #46 - feat(auth): add login UI + +## Commands + +```bash +gh pr list +gh pr merge 42 --merge +``` +``` + +**Review and merge PRs:** + +```bash +$ gh pr list +$ gh pr view 42 +$ gh pr merge 42 --merge +$ gh pr merge 43 --merge +# ... etc +``` + +--- + +## Tmux Session Reference + +| Session | Purpose | Attach Command | +|---------|---------|----------------| +| `vertical--orch` | Orchestrator | `tmux attach -t vertical--orch` | +| `vertical--w-01` | Weaver 01 | `tmux attach -t vertical--w-01` | +| `vertical--w-02` | Weaver 02 | `tmux attach -t vertical--w-02` | +| ... | ... | ... | + +**Detach from tmux:** `Ctrl+B` then `D` + +**Helper functions:** + +```bash +source lib/tmux.sh + +vertical_list_sessions # List all vertical sessions +vertical_status # Show all plans with status +vertical_attach # Attach to session +vertical_capture_output # Capture recent output +vertical_kill_plan # Kill all sessions for plan +vertical_kill_all # Kill all vertical sessions +``` + +--- + +## Directory Structure + +``` +.claude/vertical/ + plans/ + plan-20260119-143052/ + meta.json # Plan metadata + specs/ + 01-schema.yaml # Spec files + 02-password.yaml + ... + run/ + state.json # Orchestrator state + summary.md # Human-readable summary + weavers/ + w-01.json # Weaver 01 status + w-02.json # Weaver 02 status + ... +``` + +--- + +## Spec Format + +```yaml +name: feature-name +description: | + What this PR accomplishes. + +skill_hints: + - security-patterns + - typescript-patterns + +building_spec: + requirements: + - Specific requirement 1 + - Specific requirement 2 + constraints: + - Rule to follow + files: + - src/path/to/file.ts + +verification_spec: + - type: command + run: "npm run typecheck" + expect: exit_code 0 + - type: file-contains + path: src/path/to/file.ts + pattern: "expected" + +pr: + branch: feature/name + base: main + title: "feat(scope): description" +``` + +--- + +## Error Handling + +### Weaver Fails After 5 Iterations + +The weaver stops and reports failure. The orchestrator: +1. Marks the spec as failed +2. Blocks dependent specs +3. Continues with independent specs +4. Notes failure in summary + +**To debug:** + +```bash +# Check the error +cat .claude/vertical/plans//run/weavers/w-01.json + +# Resume the session +claude --resume + +# Or attach to tmux if still running +tmux attach -t vertical--w-01 +``` + +### Dependency Failure + +If spec A fails and spec B depends on A: +- B is marked as `blocked` +- B is never launched +- Summary notes the blocking + +### Killing a Stuck Build + +```bash +source lib/tmux.sh +vertical_kill_plan +``` + +--- + +## Key Rules + +### Tests Are Never Committed + +Weavers may write tests during implementation to satisfy verification. But they **never commit test files**: + +```bash +# Weaver always runs: +git reset HEAD -- '*.test.*' '*.spec.*' '__tests__/' 'tests/' +``` + +Tests are ephemeral. They verify the implementation but don't ship. + +### PRs Are Always Created + +A weaver's success condition is a created PR. No PR = failure. + +### Verification Is Mandatory + +Weavers must spawn a verifier subagent. They cannot self-verify. + +### Context Is Isolated + +| Agent | Sees | Doesn't See | +|-------|------|-------------| +| Planner | Full codebase, human input | Weaver implementation | +| Orchestrator | Specs, skill index | Actual code | +| Weaver | Its spec, its skills | Other weavers | +| Verifier | Verification spec | Building requirements | + +--- + +## Parallel Execution + +### Multiple Plans + +Run multiple planning sessions: + +``` +Terminal 1: /plan authentication +Terminal 2: /plan payments +Terminal 3: /plan notifications +``` + +Each creates its own plan ID and executes independently. + +### Parallel Weavers + +Within a plan, independent specs run in parallel: + +``` +01-schema.yaml (base: main) ──┬── parallel +02-backend.yaml (base: main) ──┤ +03-frontend.yaml (base: feature/backend) ── sequential (waits for 02) +``` + +--- + +## Resuming Sessions + +Every Claude session can be resumed: + +```bash +# Find session ID +cat .claude/vertical/plans//meta.json | jq -r .planner_session + +# Resume +claude --resume +``` + +Weaver sessions are also saved: + +```bash +cat .claude/vertical/plans//run/weavers/w-01.json | jq -r .session_id +claude --resume +``` + +--- + +## Quick Reference + +| Command | Description | +|---------|-------------| +| `/plan` | Start planning session | +| `/build ` | Execute plan | +| `/status ` | Check status | +| `tmux attach -t ` | Watch a session | +| `Ctrl+B D` | Detach from tmux | +| `source lib/tmux.sh` | Load helper functions | +| `vertical_status` | Show all plans | +| `vertical_kill_plan ` | Kill a plan | +| `gh pr list` | List created PRs | +| `gh pr merge ` | Merge a PR | diff --git a/commands/vertical-build.md b/commands/vertical-build.md index 0b2db91..8d3307e 100644 --- a/commands/vertical-build.md +++ b/commands/vertical-build.md @@ -1,5 +1,5 @@ --- -description: Execute a plan by launching orchestrator and weavers in tmux. Creates PRs for each spec. +description: Execute a plan by launching orchestrator in tmux. Creates PRs for each spec. argument-hint: [spec-names...] --- @@ -14,75 +14,129 @@ Execute a plan. Launches orchestrator in tmux, which spawns weavers for each spe /build plan-20260119-1430 01-schema 02-backend ``` -## What Happens +## What You Do -1. Read plan from `.claude/vertical/plans//` -2. Launch orchestrator in tmux: `vertical--orch` -3. Orchestrator reads specs, selects skills, spawns weavers -4. Each weaver runs in tmux: `vertical--w-01`, etc. -5. Weavers build, verify, create PRs -6. Results written to `.claude/vertical/plans//run/` +When `/build ` is invoked: -## Execution Flow - -``` -/build plan-20260119-1430 - | - +-> Orchestrator (tmux: vertical-plan-20260119-1430-orch) - | - +-> Weaver 01 (tmux: vertical-plan-20260119-1430-w-01) - | | - | +-> Verifier (subagent) - | +-> PR #42 - | - +-> Weaver 02 (tmux: vertical-plan-20260119-1430-w-02) - | | - | +-> Verifier (subagent) - | +-> PR #43 - | - +-> Summary written to run/summary.md -``` - -## Parallelization - -- Independent specs (all with `pr.base: main`) run in parallel -- Dependent specs (with `pr.base: `) wait for dependencies - -## Monitoring - -Check status while running: - -``` -/status plan-20260119-1430 -``` - -Or directly: +### Step 1: Validate Plan Exists ```bash -# List tmux sessions -tmux list-sessions | grep vertical +if [ ! -f ".claude/vertical/plans//meta.json" ]; then + echo "Error: Plan not found: " + echo "Run /status to see available plans" + exit 1 +fi +``` -# Attach to orchestrator -tmux attach -t vertical-plan-20260119-1430-orch +### Step 2: Generate Orchestrator Prompt -# Attach to a weaver -tmux attach -t vertical-plan-20260119-1430-w-01 +```bash +cat > /tmp/orch-prompt-.md << 'PROMPT_EOF' + +$(cat skills/orchestrator/SKILL.md) + -# Capture weaver output -tmux capture-pane -t vertical-plan-20260119-1430-w-01 -p + +$(pwd) + +Execute the plan. Spawn weavers. Track progress. Write summary. + +Begin now. +PROMPT_EOF +``` + +### Step 3: Launch Orchestrator in Tmux + +```bash +tmux new-session -d -s "vertical--orch" -c "$(pwd)" \ + "claude -p \"\$(cat /tmp/orch-prompt-.md)\" --dangerously-skip-permissions --model claude-opus-4-5-20250514; echo '[Orchestrator complete. Press any key to close.]'; read" +``` + +### Step 4: Confirm Launch + +Output to human: + +``` +════════════════════════════════════════════════════════════════ +BUILD LAUNCHED: +════════════════════════════════════════════════════════════════ + +Orchestrator: vertical--orch + +The orchestrator will: + 1. Read specs from .claude/vertical/plans//specs/ + 2. Spawn weavers in parallel tmux sessions + 3. Track progress and handle dependencies + 4. Write summary when complete + +Monitor commands: + /status # Check status + tmux attach -t vertical--orch # Watch orchestrator + tmux list-sessions | grep vertical # List all sessions + +Results will be at: + .claude/vertical/plans//run/summary.md + +════════════════════════════════════════════════════════════════ +``` + +## Partial Execution + +To execute specific specs only: + +``` +/build plan-20260119-1430 01-schema 02-backend +``` + +Modify the orchestrator prompt: + +``` +01-schema, 02-backend +``` + +The orchestrator will only process those specs. + +## Monitoring While Running + +### Check Status + +```bash +/status +``` + +### Attach to Orchestrator + +```bash +tmux attach -t vertical--orch +# Detach: Ctrl+B then D +``` + +### Attach to a Weaver + +```bash +tmux attach -t vertical--w-01 +# Detach: Ctrl+B then D +``` + +### Capture Weaver Output + +```bash +tmux capture-pane -t vertical--w-01 -p -S -100 ``` ## Results When complete, find results at: -- `.claude/vertical/plans//run/state.json` - Overall status -- `.claude/vertical/plans//run/summary.md` - Human-readable summary -- `.claude/vertical/plans//run/weavers/w-*.json` - Per-weaver status +| File | Contents | +|------|----------| +| `run/state.json` | Overall status and weaver states | +| `run/summary.md` | Human-readable summary with PR links | +| `run/weavers/w-*.json` | Per-weaver status | ## Debugging Failures -If a weaver fails, you can resume its session: +### Resume a Failed Weaver ```bash # Get session ID from weaver status @@ -92,7 +146,7 @@ cat .claude/vertical/plans//run/weavers/w-01.json | jq -r .session_id claude --resume ``` -Or attach to the tmux session if still running: +### Attach to Running Weaver ```bash tmux attach -t vertical--w-01 @@ -101,20 +155,40 @@ tmux attach -t vertical--w-01 ## Killing a Build ```bash -# Kill all sessions for a plan +# Source helpers source lib/tmux.sh -vertical_kill_plan plan-20260119-1430 + +# Kill all sessions for this plan +vertical_kill_plan # Or kill everything vertical_kill_all ``` -## Implementation Notes +## What Happens Behind the Scenes -This command: -1. Loads orchestrator skill -2. Generates orchestrator prompt with plan context -3. Spawns tmux session with `claude -p "" --dangerously-skip-permissions --model opus` -4. Returns immediately (orchestrator runs in background) - -The orchestrator handles everything from there. +``` +/build plan-20260119-1430 + │ + ├─→ Create orchestrator prompt + │ + ├─→ Launch tmux: vertical-plan-20260119-1430-orch + │ │ + │ ├─→ Read specs + │ ├─→ Analyze dependencies + │ ├─→ Launch weavers in parallel + │ │ │ + │ │ ├─→ vertical-plan-20260119-1430-w-01 + │ │ │ └─→ Build → Verify → PR #42 + │ │ │ + │ │ ├─→ vertical-plan-20260119-1430-w-02 + │ │ │ └─→ Build → Verify → PR #43 + │ │ │ + │ │ └─→ (waits for dependencies...) + │ │ + │ ├─→ Poll weaver status + │ ├─→ Launch dependent specs when ready + │ └─→ Write summary.md + │ + └─→ Human runs /status to check progress +``` diff --git a/commands/vertical-plan.md b/commands/vertical-plan.md index 94102d1..a989b0b 100644 --- a/commands/vertical-plan.md +++ b/commands/vertical-plan.md @@ -1,5 +1,5 @@ --- -description: Start an interactive planning session. Design specs through Q&A, then hand off to build. +description: Start an interactive planning session. Design specs through Q&A with the planner. argument-hint: [description] --- @@ -14,52 +14,100 @@ Start a planning session. You become the planner agent. /plan Add user authentication with OAuth ``` -## What Happens +## What You Do -1. Load the planner skill from `skills/planner/SKILL.md` -2. Generate a plan ID: `plan-YYYYMMDD-HHMMSS` -3. Create plan directory: `.claude/vertical/plans//` -4. Enter interactive planning mode +When `/plan` is invoked: -## Planning Flow +### Step 1: Load Planner Skill -1. **Understand** - Ask questions until the task is crystal clear -2. **Research** - Explore the codebase, find patterns -3. **Design** - Break into specs (each = one PR) -4. **Write** - Create spec files in `specs/` directory -5. **Hand off** - Tell user to run `/build ` +Read and internalize `skills/planner/SKILL.md`. -## Spec Output +### Step 2: Generate Plan ID -Specs go to: `.claude/vertical/plans//specs/` +```bash +plan_id="plan-$(date +%Y%m%d-%H%M%S)" +``` + +### Step 3: Create Plan Directory + +```bash +mkdir -p ".claude/vertical/plans/${plan_id}/specs" +mkdir -p ".claude/vertical/plans/${plan_id}/run/weavers" +``` + +### Step 4: Start Interactive Planning + +Follow the planner skill phases: + +1. **Understand** - Ask questions until task is crystal clear +2. **Research** - Explore codebase, find patterns +3. **Assess Complexity** - Decide if Oracle is needed +4. **Oracle (optional)** - For complex tasks, invoke Oracle +5. **Design** - Break into specs (each = one PR) +6. **Write** - Create spec YAML files +7. **Hand off** - Tell human to run `/build ` + +### Step 5: Confirm Plan Ready + +When specs are written: ``` -01-schema.yaml -02-backend.yaml -03-frontend.yaml +════════════════════════════════════════════════════════════════ +PLANNING COMPLETE: +════════════════════════════════════════════════════════════════ + +Specs created: + .claude/vertical/plans//specs/ + 01-schema.yaml + 02-backend.yaml + 03-frontend.yaml + +To execute: + /build + +To check status: + /status + +════════════════════════════════════════════════════════════════ +``` + +## When to Use Oracle + +The planner will invoke Oracle for: + +| Trigger | Why | +|---------|-----| +| 5+ specs needed | Complex dependency management | +| Unclear dependencies | Need deep analysis | +| Architecture decisions | Needs extended thinking | +| Performance/migration planning | Requires careful sequencing | + +Oracle runs via browser engine (10-60 minutes typical). + +## Spec Output Location + +``` +.claude/vertical/plans// + meta.json # Plan metadata + specs/ + 01-schema.yaml # First spec + 02-backend.yaml # Second spec + 03-frontend.yaml # Third spec ``` ## Transitioning to Build -When specs are ready: +When specs are ready, the human runs: ``` -Specs ready. To execute: - - /build - -To execute specific specs: - - /build 01-schema 02-backend - -To check status: - - /status +/build ``` +This launches the orchestrator in tmux, which spawns weavers. + ## Multiple Planning Sessions -You can run multiple planning sessions in parallel: +Run multiple planning sessions in parallel: ``` # Terminal 1 @@ -67,16 +115,27 @@ You can run multiple planning sessions in parallel: # Terminal 2 /plan Add payment processing + +# Terminal 3 +/plan Add notification system ``` Each gets its own plan-id and can be built independently. -## Resuming +## Resuming a Planning Session Planning sessions are Claude Code sessions. Resume with: -``` +```bash claude --resume ``` The session ID is saved in `.claude/vertical/plans//meta.json`. + +## Example Interaction + +``` +Human: /plan + +Claude: Starting plan: plan-20260119-143052 + What would you like to build? \ No newline at end of file diff --git a/commands/vertical-status.md b/commands/vertical-status.md index 07ffa26..49b9647 100644 --- a/commands/vertical-status.md +++ b/commands/vertical-status.md @@ -10,13 +10,49 @@ Check the status of plans and weavers. ## Usage ``` -/status # All plans -/status plan-20260119-1430 # Specific plan +/status # All plans +/status plan-20260119-1430 # Specific plan ``` -## Output +## What You Do -### All Plans +When `/status` is invoked: + +### Without Arguments (All Plans) + +```bash +# List active tmux sessions +echo "=== Active Tmux Sessions ===" +tmux list-sessions 2>/dev/null | grep "^vertical-" || echo "No active sessions" +echo "" + +# List plan statuses +echo "=== Plan Status ===" +if [ -d ".claude/vertical/plans" ]; then + for plan_dir in .claude/vertical/plans/*/; do + if [ -d "$plan_dir" ]; then + plan_id=$(basename "$plan_dir") + state_file="${plan_dir}run/state.json" + + if [ -f "$state_file" ]; then + status=$(jq -r '.status // "unknown"' "$state_file" 2>/dev/null) + echo " ${plan_id}: ${status}" + else + meta_file="${plan_dir}meta.json" + if [ -f "$meta_file" ]; then + echo " ${plan_id}: ready (not started)" + else + echo " ${plan_id}: incomplete" + fi + fi + fi + done +else + echo " No plans found" +fi +``` + +Output: ``` === Active Tmux Sessions === @@ -31,7 +67,67 @@ vertical-plan-20260119-1445-orch plan-20260119-1400: complete ``` -### Specific Plan +### With Plan ID (Specific Plan) + +```bash +plan_id="$1" +plan_dir=".claude/vertical/plans/${plan_id}" + +# Validate plan exists +if [ ! -d "$plan_dir" ]; then + echo "Error: Plan not found: ${plan_id}" + exit 1 +fi + +# Read state +state_file="${plan_dir}/run/state.json" +if [ -f "$state_file" ]; then + status=$(jq -r '.status // "unknown"' "$state_file") + started=$(jq -r '.started_at // "-"' "$state_file") +else + status="ready (not started)" + started="-" +fi + +echo "=== Plan: ${plan_id} ===" +echo "Status: ${status}" +echo "Started: ${started}" +echo "" + +# List specs +echo "=== Specs ===" +for spec in "${plan_dir}/specs/"*.yaml; do + [ -f "$spec" ] && echo " $(basename "$spec")" +done +echo "" + +# List weaver status +echo "=== Weavers ===" +weavers_dir="${plan_dir}/run/weavers" +if [ -d "$weavers_dir" ]; then + for weaver_file in "${weavers_dir}"/*.json; do + if [ -f "$weaver_file" ]; then + w_name=$(basename "$weaver_file" .json) + w_status=$(jq -r '.status // "unknown"' "$weaver_file" 2>/dev/null) + w_spec=$(jq -r '.spec // "?"' "$weaver_file" 2>/dev/null) + w_pr=$(jq -r '.pr // "-"' "$weaver_file" 2>/dev/null) + printf " %-10s %-15s %-30s %s\n" "$w_name" "$w_status" "$w_spec" "$w_pr" + fi + done +else + echo " No weavers yet" +fi +echo "" + +# List tmux sessions for this plan +echo "=== Tmux Sessions ===" +tmux list-sessions 2>/dev/null | grep "vertical-${plan_id}" | while read line; do + session_name=$(echo "$line" | cut -d: -f1) + echo " ${session_name}" +done || echo " No active sessions" +``` + +Output: ``` === Plan: plan-20260119-1430 === @@ -44,70 +140,74 @@ Started: 2026-01-19T14:35:00Z 03-frontend.yaml === Weavers === - w-01 complete 01-schema.yaml https://github.com/owner/repo/pull/42 - w-02 verifying 02-backend.yaml - - w-03 waiting 03-frontend.yaml - + w-01 complete 01-schema.yaml https://github.com/... + w-02 verifying 02-backend.yaml - + w-03 waiting 03-frontend.yaml - === Tmux Sessions === - vertical-plan-20260119-1430-orch running - vertical-plan-20260119-1430-w-01 done - vertical-plan-20260119-1430-w-02 running + vertical-plan-20260119-1430-orch + vertical-plan-20260119-1430-w-01 + vertical-plan-20260119-1430-w-02 ``` -## Weaver Statuses +## Weaver Status Reference | Status | Meaning | |--------|---------| -| waiting | Waiting for dependency | +| waiting | Waiting for dependency to complete | | building | Implementing the spec | | verifying | Running verification checks | | fixing | Fixing verification failures | | complete | PR created successfully | | failed | Failed after max iterations | | blocked | Dependency failed | +| crashed | Session terminated unexpectedly | -## Quick Commands +## Quick Actions + +Based on status, suggest actions: + +| Status | Suggested Action | +|--------|------------------| +| complete | `gh pr list` to review PRs | +| running | `tmux attach -t ` to watch | +| failed | `claude --resume ` to debug | +| ready | `/build ` to start | + +## Viewing Results + +```bash +# Summary (after completion) +cat .claude/vertical/plans//run/summary.md + +# State JSON +cat .claude/vertical/plans//run/state.json | jq + +# Specific weaver +cat .claude/vertical/plans//run/weavers/w-01.json | jq + +# List PRs +gh pr list +``` + +## Tmux Helper Commands ```bash # Source helpers source lib/tmux.sh -# List all sessions +# List all vertical sessions vertical_list_sessions -# Status for all plans +# Full status vertical_status # Weaver status for a plan vertical_weaver_status plan-20260119-1430 -# Capture recent output from a weaver +# Capture recent output vertical_capture_output vertical-plan-20260119-1430-w-01 # Attach to a session vertical_attach vertical-plan-20260119-1430-w-01 ``` - -## Reading Results - -After completion: - -```bash -# Summary -cat .claude/vertical/plans/plan-20260119-1430/run/summary.md - -# State -cat .claude/vertical/plans/plan-20260119-1430/run/state.json | jq - -# Specific weaver -cat .claude/vertical/plans/plan-20260119-1430/run/weavers/w-01.json | jq -``` - -## PRs Created - -When weavers complete, PRs are listed in: -- The summary.md file -- Each weaver's status JSON (`pr` field) -- The overall state.json (`weavers..pr`) - -Merge order is indicated in summary.md for stacked PRs. diff --git a/skills/oracle/SKILL.md b/skills/oracle/SKILL.md index fce906c..4ce95ea 100644 --- a/skills/oracle/SKILL.md +++ b/skills/oracle/SKILL.md @@ -1,53 +1,119 @@ --- -name: oracle-planner -description: Codex 5.2 planning specialist using oracle CLI + llmjunky's proven prompt patterns. Call when planning is complex or requires structured breakdown. +name: oracle +description: Deep planning via Oracle CLI (GPT-5.2 Codex). Use for complex tasks requiring extended thinking (10-60 minutes). Outputs plan.md for planner to transform into specs. model: opus --- -# Oracle Planner - Codex Planning Specialist +# Oracle -Leverage oracle CLI with Codex 5.2 for deep, structured planning. Uses llmjunky's battle-tested prompt patterns. +Oracle bundles your prompt + codebase files into a single request for GPT-5.2 Codex. Use it when planning is complex and requires deep, extended thinking. -## When to Call Me +## When to Use Oracle -- Complex features needing careful breakdown -- Multi-phase implementations -- Unclear dependency graphs -- Parallel task identification needed -- Architecture decisions -- Migration plans -- Any planning where you need ~10min+ of deep thinking +| Trigger | Why | +|---------|-----| +| 5+ specs needed | Complex dependency management | +| Unclear dependency graph | Needs analysis | +| Architecture decisions | Extended thinking helps | +| Migration planning | Requires careful sequencing | +| Performance optimization | Needs deep code analysis | +| Any planning >10 minutes | Offload to Codex | + +## When NOT to Use Oracle + +- Simple 1-2 spec tasks +- Clear, linear implementations +- Bug fixes +- Quick refactors ## Prerequisites -Oracle CLI must be installed: +Oracle CLI installed: + ```bash npm install -g @steipete/oracle ``` -Reference: skill-index/skills/oracle/SKILL.md for oracle CLI best practices +Or use npx: -## How I Work +```bash +npx -y @steipete/oracle --help +``` -1. **Planner gathers context** - uses AskUserTool for clarifying questions -2. **Planner crafts perfect prompt** - following templates below -3. **Planner calls oracle** - via CLI with full context -4. **Oracle outputs plan.md** - structured breakdown from Codex 5.2 -5. **Planner transforms** - plan.md → spec yamls +## Workflow -## Calling Oracle (Exact Commands) +### Step 1: Craft the Prompt + +Write to `/tmp/oracle-prompt.txt`: + +``` +Create a detailed implementation plan for [TASK]. + +## Context +- Project: [what the project does] +- Stack: [frameworks, languages, tools] +- Location: [key directories and files] + +## Requirements +[ALL requirements gathered from human] +- [Requirement 1] +- [Requirement 2] +- Features needed: + - [Feature A] + - [Feature B] +- NOT needed: [explicit out-of-scope] + +## Plan Structure + +Output as plan.md with this structure: + +# Plan: [Task Name] + +## Overview +[Summary + recommended approach] + +## Phase N: [Phase Name] +### Task N.M: [Task Name] +- Location: [file paths] +- Description: [what to do] +- Dependencies: [task IDs this depends on] +- Complexity: [1-10] +- Acceptance Criteria: [specific, testable] + +## Dependency Graph +[Which tasks run parallel vs sequential] + +## Testing Strategy +[What tests prove success] + +## Instructions +- Write complete plan to plan.md +- Do NOT ask clarifying questions +- Be specific and actionable +- Include file paths and code locations +``` + +### Step 2: Preview Token Count -**1. Preview first (no tokens spent):** ```bash npx -y @steipete/oracle --dry-run summary --files-report \ -p "$(cat /tmp/oracle-prompt.txt)" \ --file "src/**" \ --file "!**/*.test.*" \ - --file "!**/*.snap" + --file "!**/*.snap" \ + --file "!node_modules" \ + --file "!dist" ``` -Check token count (target: <196k tokens). If over budget, narrow files. -**2. Run with browser engine (main path):** +**Target:** <196k tokens + +**If over budget:** +- Narrow file selection +- Exclude more test/build directories +- Split into focused prompts + +### Step 3: Run Oracle + ```bash npx -y @steipete/oracle \ --engine browser \ @@ -62,225 +128,152 @@ npx -y @steipete/oracle \ --file "!dist" ``` -**Why browser engine?** +**Why browser engine:** - GPT-5.2 Codex runs take 10-60 minutes (normal) -- Browser mode handles long runs + reattach +- Browser mode handles long runs - Sessions stored in `~/.oracle/sessions` +- Can reattach if timeout + +### Step 4: Monitor + +Tell the human: +``` +Oracle is running. This typically takes 10-60 minutes. +I will check status periodically. +``` + +Check status: -**3. Check status (if run detached):** ```bash npx -y @steipete/oracle status --hours 1 ``` -**4. Reattach to session:** +### Step 5: Reattach (if timeout) + +If the CLI times out, do NOT re-run. Reattach: + ```bash -npx -y @steipete/oracle session --render > /tmp/oracle-plan-result.txt +npx -y @steipete/oracle session --render > /tmp/oracle-result.txt ``` -**Important:** -- Don't re-run if timeout - reattach instead -- Use `--force` only if you truly want a duplicate run -- Files >1MB are rejected (split or narrow match) -- Default-ignored: node_modules, dist, coverage, .git, .next, build, tmp +### Step 6: Read Output -## Prompt Template +Oracle writes `plan.md` to current directory. Read it: -Use this EXACT structure when crafting my prompt: +```bash +cat plan.md +``` + +### Step 7: Transform to Specs + +Convert Oracle's phases/tasks → spec YAML files: + +| Oracle Output | Spec YAML | +|---------------|-----------| +| Phase N | Group of related specs | +| Task N.M | Individual spec file | +| Dependencies | pr.base field | +| Location | building_spec.files | +| Acceptance Criteria | verification_spec | + +## File Attachment Patterns + +**Include:** +```bash +--file "src/**" +--file "prisma/**" +--file "convex/**" +``` + +**Exclude:** +```bash +--file "!**/*.test.*" +--file "!**/*.spec.*" +--file "!**/*.snap" +--file "!node_modules" +--file "!dist" +--file "!build" +--file "!coverage" +--file "!.next" +``` + +**Default ignored:** node_modules, dist, coverage, .git, .turbo, .next, build, tmp + +**Size limit:** Files >1MB are rejected + +## Prompt Templates + +### For Authentication ``` -Create a detailed implementation plan for [TASK]. +Create a detailed implementation plan for adding authentication. + +## Context +- Project: [app name] +- Stack: Next.js, Prisma, PostgreSQL +- Location: src/pages/api/ for API, src/components/ for UI ## Requirements -[List ALL requirements gathered from user] -- [Requirement 1] -- [Requirement 2] -- Features needed: - - [Feature A] - - [Feature B] -- NOT needed: [Explicitly state what's out of scope] - -## Plan Structure - -Use this template structure: - -```markdown -# Plan: [Task Name] - -**Generated**: [Date] -**Estimated Complexity**: [Low/Medium/High] - -## Overview -[Brief summary of what needs to be done and the general approach, including recommended libraries/tools] - -## Prerequisites -- [Dependencies or requirements that must be met first] -- [Tools, libraries, or access needed] - -## Phase 1: [Phase Name] -**Goal**: [What this phase accomplishes] - -### Task 1.1: [Task Name] -- **Location**: [File paths or components involved] -- **Description**: [What needs to be done] -- **Dependencies**: [Task IDs this depends on, e.g., "None" or "1.2, 2.1"] -- **Complexity**: [1-10] -- **Test-First Approach**: - - [Test to write before implementation] - - [What the test should verify] -- **Acceptance Criteria**: - - [Specific, testable criteria] - -### Task 1.2: [Task Name] -[Same structure...] - -## Phase 2: [Phase Name] -[...] - -## Testing Strategy -- **Unit Tests**: [What to unit test, frameworks to use] -- **Integration Tests**: [API/service integration tests] -- **E2E Tests**: [Critical user flows to test end-to-end] -- **Test Coverage Goals**: [Target coverage percentage] - -## Dependency Graph -[Show which tasks can run in parallel vs which must be sequential] -- Tasks with no dependencies: [list - these can start immediately] -- Task dependency chains: [show critical path] - -## Potential Risks -- [Things that could go wrong] -- [Mitigation strategies] - -## Rollback Plan -- [How to undo changes if needed] -``` - -### Task Guidelines -Each task must: -- Be specific and actionable (not vague) -- Have clear inputs and outputs -- Be independently testable -- Include file paths and specific code locations -- Include dependencies so parallel execution is possible -- Include complexity score (1-10) - -Break large tasks into smaller ones: -- ✗ Bad: "Implement Google OAuth" -- ✓ Good: - - "Add Google OAuth config to environment variables" - - "Install and configure passport-google-oauth20 package" - - "Create OAuth callback route handler in src/routes/auth.ts" - - "Add Google sign-in button to login UI" - - "Write integration tests for OAuth flow" - -## Instructions -- Write the complete plan to a file called `plan.md` in the current directory -- Do NOT ask any clarifying questions - you have all the information needed -- Be specific and actionable - include code snippets where helpful -- Follow test-driven development: specify what tests to write BEFORE implementation for each task -- Identify task dependencies so parallel work is possible -- Just write the plan and save the file - -Begin immediately. -``` - -## Clarifying Question Patterns - -Before calling me, gather context with these question types: - -### For "implement auth" -- What authentication methods do you need? (email/password, OAuth providers like Google/GitHub, SSO, magic links) -- Do you need role-based access control (RBAC) or just authenticated/unauthenticated? -- What's your backend stack? (Node/Express, Python/Django, etc.) -- Where will you store user credentials/sessions? (Database, Redis, JWT stateless) -- Do you need features like: password reset, email verification, 2FA? -- Any compliance requirements? (SOC2, GDPR, HIPAA) - -### For "build an API" -- What resources/entities does this API need to manage? -- REST or GraphQL? -- What authentication will the API use? -- Expected scale/traffic? -- Do you need rate limiting, caching, versioning? - -### For "migrate to microservices" -- Which parts of the monolith are you migrating first? -- What's your deployment target? (K8s, ECS, etc.) -- How will services communicate? (REST, gRPC, message queues) -- What's your timeline and team capacity? - -### For "add testing" -- What testing levels do you need? (unit, integration, e2e) -- What's your current test coverage? -- What frameworks do you prefer or already use? -- What's the most critical functionality to test first? - -### For "performance optimization" -- What specific performance issues are you seeing? -- Current metrics (load time, response time, throughput)? -- What's the target performance? -- Where are the bottlenecks? (DB queries, API calls, rendering) -- What profiling have you done? - -### For "database migration" -- What's the source and target database? -- How much data needs to be migrated? -- Can you afford downtime? If so, how much? -- Do you need dual-write/read during migration? -- What's your rollback strategy? - -## Example Prompt (Auth) - -**After gathering:** - Methods: Email/password + Google OAuth -- Stack: Next.js + Prisma + PostgreSQL -- Roles: Admin/User +- Roles: Admin and User - Features: Password reset, email verification -- No 2FA, no special compliance - -**Prompt to send me:** - -``` -Create a detailed implementation plan for adding authentication to a Next.js web application. - -## Requirements -- Authentication methods: Email/password + Google OAuth -- Framework: Next.js (App Router) -- Database: PostgreSQL with Prisma ORM -- Role-based access: Admin and User roles -- Features needed: - - User registration and login - - Password reset flow - - Email verification - - Google OAuth integration - - Session management -- NOT needed: 2FA, SSO, special compliance +- NOT needed: 2FA, SSO ## Plan Structure -[Use the full template above] - -## Instructions -- Write the complete plan to a file called `plan.md` in the current directory -- Do NOT ask any clarifying questions - you have all the information needed -- Be specific and actionable - include code snippets where helpful -- Follow test-driven development: specify what tests to write BEFORE implementation for each task -- Identify task dependencies so parallel work is possible -- Just write the plan and save the file - -Begin immediately. +[standard structure] ``` -## Important Notes +### For API Development -- **I only output plan.md** - you transform it into spec yamls -- **No interaction** - one-shot execution with full context -- **Always gpt-5.2-codex + xhigh** - this is my configuration -- **File output: plan.md** in current directory -- **Parallel tasks identified** - I explicitly list dependencies +``` +Create a detailed implementation plan for building a REST API. -## After I Run +## Context +- Project: [app name] +- Stack: [framework] +- Location: src/api/ for routes + +## Requirements +- Resources: [entities] +- Auth: [method] +- Rate limiting: [yes/no] +- NOT needed: [out of scope] + +## Plan Structure +[standard structure] +``` + +### For Migration + +``` +Create a detailed implementation plan for migrating [from] to [to]. + +## Context +- Current: [current state] +- Target: [target state] +- Constraints: [downtime, rollback needs] + +## Requirements +- Data to migrate: [what] +- Dual-write period: [yes/no] +- Rollback strategy: [required] + +## Plan Structure +[standard structure] +``` + +## Important Rules + +1. **One-shot execution** - Oracle doesn't interact, just outputs +2. **Always gpt-5.2-codex** - Use Codex model for coding tasks +3. **File output: plan.md** - Always outputs to current directory +4. **Don't re-run on timeout** - Reattach to session instead +5. **Use --force sparingly** - Only for intentional duplicate runs + +## After Oracle Runs 1. Read `plan.md` -2. Transform phases/tasks into spec yamls -3. Map tasks to weavers -4. Hand off to orchestrator +2. Review phases and tasks +3. Present breakdown to human for approval +4. Transform to spec YAMLs +5. Continue planner workflow diff --git a/skills/orchestrator/SKILL.md b/skills/orchestrator/SKILL.md index 13f3dc7..9f36d6e 100644 --- a/skills/orchestrator/SKILL.md +++ b/skills/orchestrator/SKILL.md @@ -1,6 +1,6 @@ --- name: orchestrator -description: Manages weaver execution via tmux. Reads specs, selects skills, launches weavers, tracks progress. Runs in background - human does not interact directly. +description: Manages weaver execution via tmux. Reads specs, selects skills, launches weavers in parallel, tracks progress. Runs in background. model: opus --- @@ -11,10 +11,13 @@ You manage weaver execution. You run in the background via tmux. The human does ## Your Role 1. Read specs from a plan -2. Select skills for each spec from the skill index -3. Launch weavers in tmux sessions -4. Track weaver progress -5. Report results (PRs created, failures) +2. Analyze dependencies and determine execution order +3. Select skills for each spec from the skill index +4. Launch weavers in tmux sessions +5. Track weaver progress by polling status files +6. Handle failures and dependency blocking +7. Write summary when complete +8. Notify human of results ## What You Do NOT Do @@ -25,54 +28,87 @@ You manage weaver execution. You run in the background via tmux. The human does ## Inputs -You receive: -1. **Plan ID**: e.g., `plan-20260119-1430` -2. **Plan directory**: `.claude/vertical/plans//` -3. **Specs to execute**: All specs, or specific ones +You receive via prompt: -## Startup +``` + +[This skill] + -When launched with `/build `: +plan-20260119-1430 +/path/to/repo -1. Read plan metadata from `.claude/vertical/plans//meta.json` -2. Read all specs from `.claude/vertical/plans//specs/` -3. Analyze dependencies (from `pr.base` fields) -4. Create execution plan +Execute the plan. Spawn weavers. Track progress. Write summary. +``` + +## Startup Sequence + +### 1. Read Plan + +```bash +cd +cat .claude/vertical/plans//meta.json +``` + +Extract: +- `specs` array +- `repo` path +- `description` + +### 2. Read All Specs + +```bash +for spec in .claude/vertical/plans//specs/*.yaml; do + cat "$spec" +done +``` + +### 3. Analyze Dependencies + +Build dependency graph from `pr.base` field: + +| Spec | pr.base | Can Start | +|------|---------|-----------| +| 01-schema.yaml | main | Immediately | +| 02-backend.yaml | main | Immediately | +| 03-frontend.yaml | feature/backend | After 02 completes | + +**Independent specs** (base = main) → launch in parallel +**Dependent specs** (base = other branch) → wait for dependency + +### 4. Initialize State + +```bash +cat > .claude/vertical/plans//run/state.json << 'EOF' +{ + "plan_id": "", + "started_at": "", + "status": "running", + "weavers": {} +} +EOF +``` ## Skill Selection For each spec, match `skill_hints` against `skill-index/index.yaml`: ```yaml -# From spec: -skill_hints: - - security-patterns - - typescript-best-practices - -# Match against index: -skills: - - id: security-patterns - path: skill-index/skills/security-patterns/SKILL.md - triggers: [security, auth, encryption, password] +# Read the index +cat skill-index/index.yaml ``` -If no match, weaver runs with base skill only. +Match algorithm: +1. For each hint in `skill_hints` +2. Find matching skill ID in index +3. Get skill path from index +4. Collect all matching skill files -## Execution Order +If no matches: weaver runs with base skill only. -Analyze `pr.base` to determine order: +## Launching Weavers -``` -01-schema.yaml: pr.base = main -> can start immediately -02-backend.yaml: pr.base = main -> can start immediately (parallel) -03-frontend.yaml: pr.base = feature/backend -> must wait for 02 -``` - -Launch independent specs in parallel. Wait for dependencies. - -## Tmux Session Management - -### Naming Convention +### Session Naming ``` vertical--orch # This orchestrator @@ -80,10 +116,11 @@ vertical--w-01 # Weaver for spec 01 vertical--w-02 # Weaver for spec 02 ``` -### Launching a Weaver +### Generate Weaver Prompt + +For each spec, create `/tmp/weaver-prompt-.md`: ```bash -# Generate the weaver prompt cat > /tmp/weaver-prompt-01.md << 'PROMPT_EOF' $(cat skills/weaver-base/SKILL.md) @@ -94,150 +131,268 @@ $(cat .claude/vertical/plans//specs/01-schema.yaml) -$(cat skill-index/skills/security-patterns/SKILL.md) +$(cat skill-index/skills//SKILL.md) + +$(cat skills/verifier/SKILL.md) + + Execute the spec. Spawn verifier when implementation is complete. Write results to: .claude/vertical/plans//run/weavers/w-01.json + +Begin now. PROMPT_EOF - -# Launch in tmux -tmux new-session -d -s "vertical--w-01" -c "" \ - "claude -p \"\$(cat /tmp/weaver-prompt-01.md)\" --dangerously-skip-permissions --model opus" ``` -### Tracking Progress - -Weavers write their status to: -`.claude/vertical/plans//run/weavers/w-.json` - -```json -{ - "spec": "01-schema.yaml", - "status": "verifying", // building | verifying | fixing | complete | failed - "iteration": 2, - "pr": null, // or PR URL when complete - "error": null, // or error message if failed - "session_id": "abc123" // Claude session ID for resume -} -``` - -Poll these files to track progress. - -### Checking Tmux Output +### Launch Tmux Session ```bash -# Check if session is still running -tmux has-session -t "vertical--w-01" 2>/dev/null && echo "running" || echo "done" - -# Capture recent output -tmux capture-pane -t "vertical--w-01" -p -S -50 +tmux new-session -d -s "vertical--w-01" -c "" \ + "claude -p \"\$(cat /tmp/weaver-prompt-01.md)\" --dangerously-skip-permissions --model claude-opus-4-5-20250514; echo '[Weaver complete]'; sleep 5" ``` -## State Management - -Write orchestrator state to `.claude/vertical/plans//run/state.json`: +### Update State ```json { - "plan_id": "plan-20260119-1430", - "started_at": "2026-01-19T14:35:00Z", - "status": "running", // running | complete | partial | failed "weavers": { - "w-01": {"spec": "01-schema.yaml", "status": "complete", "pr": "https://github.com/..."}, - "w-02": {"spec": "02-backend.yaml", "status": "building"}, - "w-03": {"spec": "03-frontend.yaml", "status": "waiting"} // waiting for w-02 + "w-01": { + "spec": "01-schema.yaml", + "status": "running", + "session": "vertical--w-01" + } } } ``` +## Progress Tracking + +### Poll Loop + +Every 30 seconds: + +```bash +# Check each weaver status file +for f in .claude/vertical/plans//run/weavers/*.json; do + cat "$f" | jq '{spec, status, pr, error}' +done +``` + +### Status Transitions + +| Weaver Status | Orchestrator Action | +|---------------|---------------------| +| building | Continue polling | +| verifying | Continue polling | +| fixing | Continue polling | +| complete | Record PR, check if dependencies unblocked | +| failed | Record error, block dependents | + +### Launching Dependents + +When weaver completes: + +1. Check if any waiting specs depend on this one +2. If dependency met, launch that weaver +3. Update state + +``` +Spec 02-backend complete → PR #43 +Checking dependents... + 03-frontend depends on feature/backend + Launching w-03... +``` + +### Checking Tmux Status + +```bash +# Is session still running? +tmux has-session -t "vertical--w-01" 2>/dev/null && echo "running" || echo "done" + +# Capture recent output for debugging +tmux capture-pane -t "vertical--w-01" -p -S -50 +``` + +## Error Handling + +### Weaver Crash + +If tmux session disappears without status file update: + +```json +{ + "weavers": { + "w-01": { + "status": "crashed", + "error": "Session terminated without completion" + } + } +} +``` + +### Dependency Failure + +If a spec's dependency fails: + +1. Mark dependent spec as `blocked` +2. Do not launch it +3. Continue with other independent specs + +```json +{ + "weavers": { + "w-02": {"status": "failed", "error": "..."}, + "w-03": {"status": "blocked", "error": "Dependency w-02 failed"} + } +} +``` + +### Max Iterations + +If weaver reports `iteration >= 5` without success: +- Already marked as failed by weaver +- Orchestrator notes and continues + ## Completion -When all weavers complete: +When all weavers are done (complete, failed, or blocked): -1. Update state to `complete` or `partial` (if some failed) -2. Write summary to `.claude/vertical/plans//run/summary.md`: +### 1. Determine Overall Status -```markdown -# Build Complete: plan-20260119-1430 +| Condition | Status | +|-----------|--------| +| All complete | `complete` | +| Some complete, some failed | `partial` | +| All failed | `failed` | + +### 2. Write Summary + +```bash +cat > .claude/vertical/plans//run/summary.md << 'EOF' +# Build Complete: + +**Status**: +**Started**: +**Completed**: ## Results | Spec | Status | PR | |------|--------|-----| -| 01-schema | complete | #42 | -| 02-backend | complete | #43 | -| 03-frontend | failed | - | +| 01-schema | ✓ complete | [#42](url) | +| 02-backend | ✓ complete | [#43](url) | +| 03-frontend | ✗ failed | - | + +## PRs Ready for Review + +Merge in this order (stacked): +1. #42 - feat(auth): add users table +2. #43 - feat(auth): add password hashing ## Failures ### 03-frontend -- Failed after 3 iterations -- Last error: TypeScript error in component -- Session ID: xyz789 (use `claude --resume xyz789` to debug) +- **Error**: TypeScript error in component +- **Iterations**: 5 +- **Session**: w-03 +- **Debug**: `cat .claude/vertical/plans//run/weavers/w-03.json` +- **Resume**: Find session ID in status file, then `claude --resume ` -## PRs Ready for Review - -1. #42 - feat(auth): add users table -2. #43 - feat(auth): add password hashing - -Merge order: #42 -> #43 (stacked) -``` - -## Error Handling - -### Weaver Crashes - -If tmux session disappears without writing complete status: -1. Mark weaver as `failed` -2. Log the error -3. Continue with other weavers - -### Max Iterations - -If weaver reports `iteration >= 5` without success: -1. Mark as `failed` -2. Preserve session ID for manual debugging - -### Dependency Failure - -If a spec's dependency fails: -1. Mark dependent spec as `blocked` -2. Don't launch it -3. Note in summary - -## Commands Reference +## Commands ```bash -# List all sessions for a plan -tmux list-sessions | grep "vertical-" +# View PR list +gh pr list -# Kill all sessions for a plan -tmux kill-session -t "vertical--orch" -for sess in $(tmux list-sessions -F '#{session_name}' | grep "vertical--w-01" +# Debug failed weaver +tmux attach -t vertical--w-03 # if still running +claude --resume # to continue +``` +EOF +``` + +### 3. Update Final State + +```json +{ + "plan_id": "", + "started_at": "", + "completed_at": "", + "status": "complete", + "weavers": { + "w-01": {"spec": "01-schema.yaml", "status": "complete", "pr": "#42"}, + "w-02": {"spec": "02-backend.yaml", "status": "complete", "pr": "#43"}, + "w-03": {"spec": "03-frontend.yaml", "status": "failed", "error": "..."} + } +} +``` + +### 4. Notify Human + +Output to stdout (visible in tmux): + +``` +╔══════════════════════════════════════════════════════════════════╗ +║ BUILD COMPLETE: ║ +╠══════════════════════════════════════════════════════════════════╣ +║ ✓ 01-schema complete PR #42 ║ +║ ✓ 02-backend complete PR #43 ║ +║ ✗ 03-frontend failed - ║ +╠══════════════════════════════════════════════════════════════════╣ +║ ║ +║ Summary: .claude/vertical/plans//run/summary.md ║ +║ PRs: gh pr list ║ +║ ║ +╚══════════════════════════════════════════════════════════════════╝ ``` ## Full Execution Flow ``` -1. Read plan meta.json +1. Read meta.json 2. Read all specs from specs/ 3. Create run/ directory structure -4. Analyze dependencies -5. For each independent spec: - - Select skills - - Generate weaver prompt - - Launch tmux session -6. Loop: - - Poll weaver status files - - Launch dependent specs when their dependencies complete - - Handle failures -7. When all done: - - Write summary - - Update state to complete/partial/failed +4. Analyze dependencies (build graph) +5. Write initial state.json +6. For each independent spec: + a. Match skills from index + b. Generate weaver prompt file + c. Launch tmux session + d. Update state +7. Poll loop: + a. Check weaver status files every 30s + b. Check if tmux sessions still running + c. Launch dependents when their deps complete + d. Handle failures +8. When all done: + a. Determine overall status + b. Write summary.md + c. Update state.json + d. Print completion notification +``` + +## Tmux Commands Reference + +```bash +# List all sessions for this plan +tmux list-sessions | grep "vertical-" + +# Attach to orchestrator +tmux attach -t "vertical--orch" + +# Attach to weaver +tmux attach -t "vertical--w-01" + +# Capture weaver output +tmux capture-pane -t "vertical--w-01" -p -S -100 + +# Kill all sessions for plan +for sess in $(tmux list-sessions -F '#{session_name}' | grep "vertical-/` -3. **Ask what they want to build** +1. Generate plan ID: `plan-YYYYMMDD-HHMMSS` (e.g., `plan-20260119-143052`) +2. Create directory structure: + ```bash + mkdir -p .claude/vertical/plans//specs + mkdir -p .claude/vertical/plans//run/weavers + ``` +3. Confirm to human: `Starting plan: ` +4. Ask: "What would you like to build?" ## Workflow ### Phase 1: Understand -Ask questions to understand: -- What's the goal? -- What repo/project? -- Any constraints? -- What does success look like? +Ask questions until you have complete clarity: -Keep asking until you have clarity. Don't assume. +| Category | Questions | +|----------|-----------| +| Goal | What is the end result? What does success look like? | +| Scope | What's in scope? What's explicitly out of scope? | +| Constraints | Tech stack? Performance requirements? Security? | +| Dependencies | What must exist first? External services needed? | +| Validation | How will we know it works? What tests prove success? | + +**Rules:** +- Ask ONE category at a time +- Wait for answers before proceeding +- Summarize understanding back to human +- Get explicit confirmation before moving on ### Phase 2: Research Explore the codebase: -- Check existing patterns -- Understand the architecture -- Find relevant files -- Identify dependencies -Share findings with the human. Let them correct you. +``` +1. Read relevant existing code +2. Identify patterns and conventions +3. Find related files and dependencies +4. Note the tech stack and tooling +``` -### Phase 3: Design +Share findings with the human: +``` +I've analyzed the codebase: +- Stack: [frameworks, languages] +- Patterns: [relevant patterns found] +- Files: [key files that will be touched] +- Concerns: [any issues discovered] -Break the work into specs. Each spec = one PR's worth of work. +Does this match your understanding? +``` -**Sizing heuristics:** -- XS: <50 lines, single file -- S: 50-150 lines, 2-4 files -- M: 150-400 lines, 4-8 files -- If bigger: split into multiple specs +### Phase 3: Complexity Assessment + +Assess if Oracle is needed: + +| Complexity | Indicators | Action | +|------------|------------|--------| +| Simple | 1-2 specs, clear path, <4 files | Proceed to Phase 4 | +| Medium | 3-4 specs, some dependencies | Consider Oracle | +| Complex | 5+ specs, unclear dependencies, architecture decisions | Use Oracle | + +**Oracle Triggers:** +- Multi-phase implementation with unclear ordering +- Dependency graph is tangled +- Architecture decisions needed +- Performance optimization requiring analysis +- Migration with rollback planning + +### Phase 3.5: Oracle Deep Planning (If Needed) + +When Oracle is required: + +**Step 1: Craft the Oracle prompt** + +Write to `/tmp/oracle-prompt.txt`: + +``` +Create a detailed implementation plan for [TASK]. + +## Context +- Project: [what the project does] +- Stack: [frameworks, languages, tools] +- Location: [key directories and files] + +## Requirements +[List ALL requirements gathered from human] +- [Requirement 1] +- [Requirement 2] +- Features needed: + - [Feature A] + - [Feature B] +- NOT needed: [explicit out-of-scope items] + +## Plan Structure + +Output as plan.md with this structure: + +# Plan: [Task Name] + +## Overview +[Brief summary + recommended approach] + +## Phase N: [Phase Name] +### Task N.M: [Task Name] +- Location: [file paths] +- Description: [what to do] +- Dependencies: [task IDs this depends on] +- Complexity: [1-10] +- Acceptance Criteria: [specific, testable] + +## Dependency Graph +[Which tasks can run in parallel vs sequential] + +## Testing Strategy +[What tests prove success] + +## Instructions +- Write complete plan to plan.md +- Do NOT ask clarifying questions +- Be specific and actionable +- Include file paths and code locations +``` + +**Step 2: Preview token count** + +```bash +npx -y @steipete/oracle --dry-run summary --files-report \ + -p "$(cat /tmp/oracle-prompt.txt)" \ + --file "src/**" \ + --file "!**/*.test.*" \ + --file "!**/*.snap" \ + --file "!node_modules" \ + --file "!dist" +``` + +Target: <196k tokens. If over, narrow file selection. + +**Step 3: Run Oracle** + +```bash +npx -y @steipete/oracle \ + --engine browser \ + --model gpt-5.2-codex \ + --slug "vertical-plan-$(date +%Y%m%d-%H%M)" \ + -p "$(cat /tmp/oracle-prompt.txt)" \ + --file "src/**" \ + --file "!**/*.test.*" \ + --file "!**/*.snap" +``` + +Tell the human: +``` +Oracle is running. This typically takes 10-60 minutes. +I will check status periodically. +``` + +**Step 4: Monitor** + +```bash +npx -y @steipete/oracle status --hours 1 +``` + +**Step 5: Retrieve result** + +```bash +npx -y @steipete/oracle session --render > /tmp/oracle-result.txt +``` + +Read `plan.md` from current directory. + +**Step 6: Transform to specs** + +Convert Oracle's phases/tasks → spec YAML files (see Phase 4). + +### Phase 4: Design Specs + +Break work into specs. Each spec = one PR's worth of work. + +**Sizing:** +| Size | Lines | Files | Example | +|------|-------|-------|---------| +| XS | <50 | 1 | Add utility function | +| S | 50-150 | 2-4 | Add API endpoint | +| M | 150-400 | 4-8 | Add feature with tests | +| L | >400 | >8 | SPLIT INTO MULTIPLE SPECS | **Ordering:** -- Schema/migrations first -- Backend before frontend -- Dependencies before dependents -- Use numbered prefixes: `01-`, `02-`, `03-` +1. Schema/migrations first +2. Backend before frontend +3. Dependencies before dependents +4. Number prefixes: `01-`, `02-`, `03-` -Present the breakdown to the human. Iterate until they approve. - -### Phase 4: Write Specs - -Write specs to `.claude/vertical/plans//specs/` - -Each spec file: `-.yaml` - -Example: +**Present to human:** ``` -.claude/vertical/plans/plan-20260119-1430/specs/ - 01-schema.yaml - 02-backend.yaml - 03-frontend.yaml +Proposed breakdown: + 01-schema.yaml - Database schema changes + 02-backend.yaml - API endpoints + 03-frontend.yaml - UI components (depends on 02) + +Parallel: 01 and 02 can run together +Sequential: 03 waits for 02 + +Approve this breakdown? [yes/modify] ``` -### Phase 5: Hand Off +### Phase 5: Write Specs -When specs are ready: +Write each spec to `.claude/vertical/plans//specs/-.yaml` -1. Write the plan metadata to `.claude/vertical/plans//meta.json`: -```json -{ - "id": "plan-20260119-1430", - "description": "Add user authentication", - "repo": "/path/to/repo", - "created_at": "2026-01-19T14:30:00Z", - "status": "ready", - "specs": ["01-schema.yaml", "02-backend.yaml", "03-frontend.yaml"] -} -``` - -2. Tell the human: -``` -Specs ready at .claude/vertical/plans//specs/ - -To execute: - /build - -Or execute specific specs: - /build 01-schema - -To check status later: - /status -``` - -## Spec Format +**Spec Format:** ```yaml -name: auth-passwords -description: Password hashing with bcrypt +name: feature-name +description: | + What this PR accomplishes. + Clear enough for PR reviewer. -# Skills for orchestrator to assign to weaver skill_hints: - - security-patterns - - typescript-best-practices + - relevant-skill-1 + - relevant-skill-2 -# What to build building_spec: requirements: - - Create password service in src/auth/password.ts - - Use bcrypt with cost factor 12 - - Export hashPassword and verifyPassword functions + - Specific requirement 1 + - Specific requirement 2 constraints: - - No plaintext password logging - - Async functions only + - Rule that must be followed + - Another constraint files: - - src/auth/password.ts + - src/path/to/file.ts + - src/path/to/other.ts -# How to verify (deterministic first, agent checks last) verification_spec: - type: command run: "npm run typecheck" expect: exit_code 0 - type: command - run: "npm test -- password" + run: "npm test -- " expect: exit_code 0 - type: file-contains - path: src/auth/password.ts - pattern: "bcrypt" + path: src/path/to/file.ts + pattern: "expected pattern" - type: file-not-contains path: src/ - pattern: "console.log.*password" + pattern: "forbidden pattern" -# PR metadata pr: - branch: auth/02-passwords - base: main # or previous spec's branch for stacking - title: "feat(auth): add password hashing service" + branch: feature/ + base: main + title: "feat(): description" ``` -## Skill Hints +**Skill Hints Reference:** -When writing specs, add `skill_hints` so orchestrator can assign the right skills to weavers: +| Task Type | Skill Hints | +|-----------|-------------| +| Swift/iOS | `swift-concurrency`, `swiftui`, `swift-testing` | +| React | `react-patterns`, `react-testing` | +| API | `api-design`, `typescript-patterns` | +| Security | `security-patterns` | +| Database | `database-patterns`, `prisma` | +| Testing | `testing-patterns` | -| Task Pattern | Skill Hint | -|--------------|------------| -| Swift/iOS | swift-concurrency, swiftui | -| React/frontend | react-patterns | -| API design | api-design | -| Security | security-patterns | -| Database | database-patterns | -| Testing | testing-patterns | - -Orchestrator will match these against the skill index. - -## Parallel vs Sequential - -In the spec, indicate dependencies: +**Dependency via PR base:** ```yaml -# Independent specs - can run in parallel +# Independent (can run in parallel) pr: - branch: feature/auth-passwords + branch: feature/auth-schema base: main -# Dependent spec - must wait for prior +# Dependent (waits for prior) pr: branch: feature/auth-endpoints - base: feature/auth-passwords # stacked on prior PR + base: feature/auth-schema ``` -Orchestrator handles the execution order. +### Phase 6: Hand Off -## Example Session +1. Write plan metadata: + +```bash +cat > .claude/vertical/plans//meta.json << 'EOF' +{ + "id": "", + "description": "", + "repo": "", + "created_at": "", + "status": "ready", + "specs": ["01-name.yaml", "02-name.yaml", "03-name.yaml"] +} +EOF +``` + +2. Tell the human: + +``` +════════════════════════════════════════════════════════════════ +PLANNING COMPLETE: +════════════════════════════════════════════════════════════════ + +Specs created: + .claude/vertical/plans//specs/ + 01-schema.yaml + 02-backend.yaml + 03-frontend.yaml + +To execute all specs: + /build + +To execute specific specs: + /build 01-schema 02-backend + +To check status: + /status + +════════════════════════════════════════════════════════════════ +``` + +## Example Interaction ``` Human: /plan -Planner: Starting planning session: plan-20260119-1430 +Planner: Starting plan: plan-20260119-143052 What would you like to build? \ No newline at end of file diff --git a/skills/verifier/SKILL.md b/skills/verifier/SKILL.md index 3684c94..e82b30b 100644 --- a/skills/verifier/SKILL.md +++ b/skills/verifier/SKILL.md @@ -1,6 +1,6 @@ --- name: verifier -description: Verification subagent. Runs checks from verification_spec, reports pass/fail with evidence. Does NOT modify code. +description: Verification subagent. Runs checks from verification_spec in order. Fast-fails on first error. Reports PASS or FAIL with evidence. Does NOT modify code. model: opus --- @@ -12,47 +12,42 @@ You verify implementations. You do NOT modify code. 1. Run each check in order 2. Stop on first failure (fast-fail) -3. Report pass/fail with evidence -4. Suggest fix (one line) on failure +3. Report PASS or FAIL with evidence +4. Suggest one-line fix on failure ## What You Do NOT Do - Modify source code - Skip checks - Claim pass without evidence -- Fix issues (that's the weaver's job) +- Fix issues (weaver does this) +- Continue after first failure -## Input +## Input Format -You receive the `verification_spec` from a spec YAML: +``` + +[This skill] + -```yaml -verification_spec: - - type: command - run: "npm run typecheck" - expect: exit_code 0 + +- type: command + run: "npm run typecheck" + expect: exit_code 0 - - type: file-contains - path: src/auth/password.ts - pattern: "bcrypt" +- type: file-contains + path: src/auth/password.ts + pattern: "bcrypt" + - - type: file-not-contains - path: src/ - pattern: "console.log.*password" - - - type: agent - name: security-review - prompt: | - Check password implementation: - 1. Verify bcrypt usage - 2. Check cost factor >= 10 +Run all checks. Report PASS or FAIL with details. ``` ## Check Types ### command -Run a command and check the exit code. +Run a command, check exit code. ```yaml - type: command @@ -66,12 +61,12 @@ npm run typecheck echo "Exit code: $?" ``` -**Pass:** Exit code matches expected -**Fail:** Exit code differs, capture stderr +**PASS:** Exit code matches expected +**FAIL:** Exit code differs ### file-contains -Check if a file contains a pattern. +Check if file contains pattern. ```yaml - type: file-contains @@ -84,12 +79,12 @@ Check if a file contains a pattern. grep -q "bcrypt" src/auth/password.ts && echo "FOUND" || echo "NOT FOUND" ``` -**Pass:** Pattern found -**Fail:** Pattern not found +**PASS:** Pattern found +**FAIL:** Pattern not found ### file-not-contains -Check if a file does NOT contain a pattern. +Check if file does NOT contain pattern. ```yaml - type: file-not-contains @@ -102,20 +97,20 @@ Check if a file does NOT contain a pattern. grep -E "console.log.*password" src/auth/password.ts && echo "FOUND (BAD)" || echo "NOT FOUND (GOOD)" ``` -**Pass:** Pattern not found -**Fail:** Pattern found (show the offending line) +**PASS:** Pattern not found +**FAIL:** Pattern found (show offending line) ### file-exists -Check if a file exists. +Check if file exists. ```yaml - type: file-exists path: src/auth/password.ts ``` -**Pass:** File exists -**Fail:** File missing +**PASS:** File exists +**FAIL:** File missing ### agent @@ -125,166 +120,184 @@ Semantic verification requiring judgment. - type: agent name: security-review prompt: | - Check the password implementation: - 1. Verify bcrypt is used (not md5/sha1) - 2. Check cost factor is >= 10 - 3. Confirm no password logging + Check password implementation: + 1. Verify bcrypt usage + 2. Check cost factor >= 10 ``` **Execution:** -1. Read the relevant code -2. Evaluate against the prompt criteria -3. Report findings with evidence (code snippets) +1. Read relevant code +2. Evaluate against criteria +3. Report with code snippets as evidence -**Pass:** All criteria met -**Fail:** Any criterion not met, with explanation +**PASS:** All criteria met +**FAIL:** Any criterion failed ## Execution Order -Run checks in order. **Stop on first failure.** +Run checks in EXACT order listed. **Stop on first failure.** ``` -Check 1: command (npm typecheck) -> PASS -Check 2: file-contains (bcrypt) -> PASS -Check 3: file-not-contains (password logging) -> FAIL +Check 1: [command] npm typecheck -> PASS +Check 2: [file-contains] bcrypt -> PASS +Check 3: [file-not-contains] password log -> FAIL STOP - Do not run remaining checks ``` Why fast-fail: - Saves time - Weaver fixes one thing at a time -- Cleaner iteration loop +- Clear iteration loop ## Output Format -### On PASS +### PASS ``` RESULT: PASS -Checks completed: -1. [command] npm run typecheck - PASS (exit 0) -2. [command] npm test - PASS (exit 0) -3. [file-contains] bcrypt in password.ts - PASS -4. [file-not-contains] password logging - PASS -5. [agent] security-review - PASS - - bcrypt: yes - - cost factor: 12 - - no logging: confirmed +Checks completed: 5/5 -All 5 checks passed. +1. [command] npm run typecheck + Status: PASS + Exit code: 0 + +2. [command] npm test + Status: PASS + Exit code: 0 + +3. [file-contains] bcrypt in src/auth/password.ts + Status: PASS + Found: line 5: import bcrypt from 'bcrypt' + +4. [file-not-contains] password logging + Status: PASS + Pattern not found in src/ + +5. [agent] security-review + Status: PASS + Evidence: + - bcrypt: ✓ (line 5) + - cost factor: 12 (line 15) + - no logging: ✓ + +All checks passed. ``` -### On FAIL +### FAIL ``` RESULT: FAIL -Checks completed: -1. [command] npm run typecheck - PASS (exit 0) -2. [command] npm test - FAIL (exit 1) +Checks completed: 2/5 -Failed check: npm test -Expected: exit 0 -Actual: exit 1 +1. [command] npm run typecheck + Status: PASS + Exit code: 0 -Error output: - FAIL src/auth/password.test.ts - - hashPassword should return hashed string - Error: Cannot find module 'bcrypt' +2. [command] npm test + Status: FAIL + Exit code: 1 + Expected: exit_code 0 + Actual: exit_code 1 -Suggested fix: Install bcrypt: npm install bcrypt + Error output: + FAIL src/auth/password.test.ts + ✕ hashPassword should return hashed string + Error: Cannot find module 'bcrypt' + +Suggested fix: Run `npm install bcrypt` ``` ## Evidence Collection -For agent checks, provide evidence: +For each check, provide evidence: + +**command:** Exit code + relevant stderr/stdout +**file-contains:** Line number + line content +**file-not-contains:** "Pattern not found" or offending line +**agent:** Code snippets proving criteria met/failed + +### Example Evidence (agent check) ``` -5. [agent] security-review - FAIL +5. [agent] security-review + Status: FAIL -Evidence: - File: src/auth/password.ts - Line 15: const hash = md5(password) // VIOLATION: using md5, not bcrypt + Evidence: + File: src/auth/password.ts + Line 15: const hash = md5(password) - Criterion failed: "Verify bcrypt is used (not md5/sha1)" + Criterion failed: "Verify bcrypt is used (not md5)" + Found: md5 usage instead of bcrypt -Suggested fix: Replace md5 with bcrypt.hash() + Suggested fix: Replace md5 with bcrypt.hash() ``` -## Guidelines - -### Be Thorough - -- Run exactly the checks specified -- Don't skip any -- Don't add extra checks - -### Be Honest - -- If it fails, say so -- Include the actual error output -- Don't gloss over issues - -### Be Helpful - -- Suggest a specific fix -- Point to the exact line/file -- Keep suggestions concise (one line) - -### Be Fast - -- Stop on first failure -- Don't over-explain passes -- Get to the point - ## Error Handling ### Command Not Found ``` -1. [command] npm run typecheck - ERROR +1. [command] npm run typecheck + Status: ERROR + Error: Command 'npm' not found -Error: Command 'npm' not found - -This is an environment issue, not a code issue. -Suggested fix: Ensure npm is installed and in PATH + This is an environment issue, not code. + Suggested fix: Ensure npm is installed and in PATH ``` ### File Not Found ``` -2. [file-contains] bcrypt in password.ts - FAIL +2. [file-contains] bcrypt in src/auth/password.ts + Status: FAIL + Error: File not found: src/auth/password.ts -Error: File not found: src/auth/password.ts - -The file doesn't exist. Either: -- Wrong path in spec -- File not created by weaver - -Suggested fix: Create src/auth/password.ts + The file doesn't exist. + Suggested fix: Create src/auth/password.ts ``` ### Timeout -If a command takes too long (>60 seconds): +If command takes >60 seconds: ``` -1. [command] npm test - TIMEOUT +1. [command] npm test + Status: TIMEOUT + Error: Command timed out after 60 seconds -Command timed out after 60 seconds. -This might indicate: -- Infinite loop in tests -- Missing test setup -- Hung process + Possible causes: + - Infinite loop in tests + - Missing test setup + - Hung process -Suggested fix: Check test configuration + Suggested fix: Check test configuration ``` -## Important Rules +## Rules -1. **Never modify code** - You only observe and report +1. **Never modify code** - Observe and report only 2. **Fast-fail** - Stop on first failure 3. **Evidence required** - Show what you found 4. **One-line fixes** - Keep suggestions actionable 5. **Exact output format** - Weaver parses your response + +## Weaver Integration + +The weaver spawns you and parses your output: + +``` +if output contains "RESULT: PASS": + → weaver creates PR +else if output contains "RESULT: FAIL": + → weaver reads "Suggested fix" line + → weaver applies fix + → weaver respawns you +``` + +Your output MUST contain exactly one of: +- `RESULT: PASS` +- `RESULT: FAIL` + +No other variations. No "PARTIALLY PASS". No "CONDITIONAL PASS". diff --git a/skills/weaver-base/SKILL.md b/skills/weaver-base/SKILL.md index 95c4338..f4d18d1 100644 --- a/skills/weaver-base/SKILL.md +++ b/skills/weaver-base/SKILL.md @@ -1,6 +1,6 @@ --- name: weaver-base -description: Base skill for all weavers. Implements specs, spawns verifiers, loops until pass, creates PR. All weavers receive this plus spec-specific skills. +description: Base skill for all weavers. Implements specs, spawns verifiers, loops until pass, creates PR. Tests are never committed. model: opus --- @@ -10,11 +10,12 @@ You are a weaver. You implement a single spec, verify it, and create a PR. ## Your Role -1. Read the spec you've been given +1. Parse the spec you receive 2. Implement the requirements -3. Spawn a verifier subagent to check your work +3. Spawn a verifier subagent 4. Fix issues if verification fails (max 5 iterations) 5. Create a PR when verification passes +6. Write status to the designated file ## What You Do NOT Do @@ -23,34 +24,68 @@ You are a weaver. You implement a single spec, verify it, and create a PR. - Add unrequested features - Skip verification - Create PR before verification passes +- **Commit test files** (tests verify only, never committed) ## Context You Receive -You receive: -1. **This base skill** - your core instructions -2. **The spec** - what to build and how to verify -3. **Additional skills** - domain-specific knowledge (optional) +``` + +[This skill - your core instructions] + + + +[The spec YAML - what to build and verify] + + + +[Optional domain-specific skills] + + +Write results to: .claude/vertical/plans//run/weavers/w-.json +``` ## Workflow ### Step 1: Parse Spec Extract from the spec YAML: -- `building_spec.requirements` - what to build -- `building_spec.constraints` - rules to follow -- `building_spec.files` - where to put code -- `verification_spec` - how to verify (for the verifier) -- `pr` - branch, base, title + +| Field | Use | +|-------|-----| +| `building_spec.requirements` | What to build | +| `building_spec.constraints` | Rules to follow | +| `building_spec.files` | Where to write code | +| `verification_spec` | Checks for verifier | +| `pr.branch` | Branch name | +| `pr.base` | Base branch | +| `pr.title` | Commit/PR title | + +Write initial status: + +```bash +cat > << 'EOF' +{ + "spec": ".yaml", + "status": "building", + "iteration": 1, + "pr": null, + "error": null, + "started_at": "" +} +EOF +``` ### Step 2: Build -Implement each requirement: -1. Read existing code patterns +For each requirement in `building_spec.requirements`: + +1. Read existing code patterns in the repo 2. Write clean, working code -3. Follow constraints exactly -4. Only touch files in the spec (or necessary imports) +3. Follow all constraints exactly +4. Only modify files listed in `building_spec.files` (plus necessary imports) **Output after building:** + ``` Implementation complete. @@ -64,196 +99,266 @@ Files modified: Ready for verification. ``` +Update status: + +```json +{ + "status": "verifying", + "iteration": 1 +} +``` + ### Step 3: Spawn Verifier Use the Task tool to spawn a verifier subagent: ``` Task tool parameters: -- subagent_type: "general-purpose" -- description: "Verify spec implementation" -- prompt: | + description: "Verify implementation against spec" + prompt: | - {contents of skills/verifier/SKILL.md} + [Contents of skills/verifier/SKILL.md] - {verification_spec section from the spec YAML} + [verification_spec section from the spec YAML] - Run all checks. Report PASS or FAIL with details. + Run all checks in order. Stop on first failure. + Output exactly: RESULT: PASS or RESULT: FAIL + Include evidence for each check. ``` -The verifier returns: -- `RESULT: PASS` - proceed to PR -- `RESULT: FAIL` - fix and re-verify +**Parse verifier response:** -### Step 4: Fix (If Failed) +- If contains `RESULT: PASS` → go to Step 5 (Create PR) +- If contains `RESULT: FAIL` → go to Step 4 (Fix) + +### Step 4: Fix (On Failure) + +The verifier reports: -On failure, the verifier reports: ``` RESULT: FAIL -Failed check: npm test -Expected: exit 0 -Actual: exit 1 -Error: Cannot find module 'bcrypt' +Failed check: [check name] +Expected: [expectation] +Actual: [what happened] +Error: [error message] -Suggested fix: Install bcrypt dependency +Suggested fix: [one-line fix suggestion] ``` -Fix ONLY the specific issue: -``` -Fixing: missing bcrypt dependency +**Your action:** -Changes: - npm install bcrypt - npm install -D @types/bcrypt +1. Fix ONLY the specific issue mentioned +2. Do not make unrelated changes +3. Update status: + ```json + { + "status": "fixing", + "iteration": 2 + } + ``` +4. Re-spawn verifier (Step 3) -Re-spawning verifier... +**Maximum 5 iterations.** If still failing after 5: + +```json +{ + "status": "failed", + "iteration": 5, + "error": "", + "completed_at": "" +} ``` -**Max 5 iterations.** If still failing after 5, report the failure. +Stop and do not create PR. ### Step 5: Create PR After `RESULT: PASS`: +**5a. Checkout branch:** + ```bash -# Create branch from base git checkout -b +``` -# Stage prod files only (no test files, no .claude/) -git add +**5b. Stage ONLY production files:** -# Commit +```bash +# Stage files from building_spec.files +git add ... + +# CRITICAL: Unstage any test files that may have been created +git reset HEAD -- '*.test.ts' '*.test.tsx' '*.test.js' '*.test.jsx' +git reset HEAD -- '*.spec.ts' '*.spec.tsx' '*.spec.js' '*.spec.jsx' +git reset HEAD -- '__tests__/' 'tests/' '**/__tests__/**' '**/tests/**' +git reset HEAD -- '*.snap' +git reset HEAD -- '.claude/' +``` + +**NEVER COMMIT:** +- Test files (`*.test.*`, `*.spec.*`) +- Snapshot files (`*.snap`) +- Test directories (`__tests__/`, `tests/`) +- Internal state (`.claude/`) + +**5c. Commit:** + +```bash git commit -m " -Verification passed: -- npm typecheck: exit 0 -- npm test: exit 0 -- file-contains: bcrypt found +Implements: +Verification: All checks passed -Built from spec: +- +- -Co-Authored-By: Claude Opus 4.5 " +Co-Authored-By: Claude " +``` -# Push and create PR +**5d. Push and create PR:** + +```bash git push -u origin -gh pr create --base --title "" --body "" -``` -### Step 6: Report Results - -Write status to the path specified (from orchestrator): -`.claude/vertical/plans//run/weavers/w-.json` - -```json -{ - "spec": ".yaml", - "status": "complete", - "iterations": 2, - "pr": "https://github.com/owner/repo/pull/42", - "error": null, - "completed_at": "2026-01-19T14:45:00Z" -} -``` - -On failure: -```json -{ - "spec": ".yaml", - "status": "failed", - "iterations": 5, - "pr": null, - "error": "TypeScript error: Property 'hash' does not exist on type 'Bcrypt'", - "completed_at": "2026-01-19T14:50:00Z" -} -``` - -## PR Body Template - -```markdown -## Summary +gh pr create \ + --base \ + --title "" \ + --body "## Summary ## Changes - +$(git diff --stat ) ## Verification All checks passed: -- `npm run typecheck` - exit 0 -- `npm test` - exit 0 -- file-contains: `bcrypt` in password.ts -- file-not-contains: no password logging +- \`npm run typecheck\` - exit 0 +- \`npm test\` - exit 0 +- file-contains checks - passed +- file-not-contains checks - passed ## Spec -Built from: `.claude/vertical/plans//specs/.yaml` +Built from: \`.claude/vertical/plans//specs/.yaml\` --- Iterations: -Weaver session: +Weaver: " +``` + +### Step 6: Report Results + +**On success:** + +```bash +cat > << 'EOF' +{ + "spec": ".yaml", + "status": "complete", + "iteration": , + "pr": "", + "error": null, + "completed_at": "" +} +EOF +``` + +**On failure:** + +```bash +cat > << 'EOF' +{ + "spec": ".yaml", + "status": "failed", + "iteration": 5, + "pr": null, + "error": "", + "completed_at": "" +} +EOF ``` ## Guidelines ### Do -- Read the spec carefully before coding +- Read the spec completely before coding - Follow existing code patterns in the repo - Keep changes minimal and focused - Write clean, readable code -- Report clearly what you did +- Report status at each phase +- Stage only production files -### Don't +### Do Not - Add features not in the spec - Refactor unrelated code -- Skip the verification step +- Skip verification - Claim success without verification - Create PR before verification passes +- Commit test files (EVER) +- Commit `.claude/` directory ## Error Handling -### Build Error +### Build Error (Blocked) + +If you cannot build due to unclear requirements or missing dependencies: -If you can't build (e.g., missing dependency, unclear requirement): ```json { "status": "failed", - "error": "BLOCKED: Unclear requirement - spec says 'use standard auth' but no auth library exists" -} -``` - -### Verification Timeout - -If verifier takes too long: -```json -{ - "status": "failed", - "error": "Verification timeout after 5 minutes" + "error": "BLOCKED: ", + "completed_at": "" } ``` ### Git Conflict -If branch already exists or conflicts: +If branch already exists: + ```bash -# Try to update existing branch git checkout git rebase -# If conflict, report failure +# If conflict cannot be resolved: ``` -## Resume Support - -Your Claude session ID is saved. If you crash or are interrupted, the human can resume: -```bash -claude --resume +```json +{ + "status": "failed", + "error": "Git conflict on branch ", + "completed_at": "" +} ``` -Make sure to checkpoint your progress by writing status updates. +### Verification Timeout + +If verifier takes >5 minutes: + +```json +{ + "status": "failed", + "error": "Verification timeout after 5 minutes", + "completed_at": "" +} +``` + +## Test Files: Write But Never Commit + +You MAY write test files during implementation to: +- Help with development +- Satisfy the verifier's test checks + +But you MUST NOT commit them: +- Tests exist only for verification +- They are ephemeral +- The PR contains only production code +- Human will add tests separately if needed + +After PR creation, test files remain in working directory but are not part of the commit.