# 1. Codebase Analyzer Prompt - System Design > **Priority**: 🟑 HIGH β€” Core LLM logic > **Complexity**: High (prompt engineering) > **Effort Estimate**: 8-12 hours (iterative refinement) --- ## Overview The Codebase Analyzer takes structured `RepoSummary` from the introspector and generates `EvalSpec` JSON defining what tests to create. Key insight: **Claude generates specs, not code**. Test code is deterministically rendered from specs. --- ## Architecture ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Codebase Analyzer Agent β”‚ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ RepoSummary │───▢│ Claude Agent │───▢│ EvalSpec β”‚ β”‚ β”‚ β”‚ JSON β”‚ β”‚ SDK β”‚ β”‚ JSON β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β”‚ β”‚ β–Ό β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚AskUserQuestionβ”‚ β”‚ β”‚ β”‚ (optional) β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` --- ## Core Types ```typescript interface EvalSpec { version: '1.0'; repo: { name: string; languages: string[]; analyzedAt: string }; scenarios: EvalScenario[]; grading: { deterministic: DeterministicGrade[]; rubrics: RubricGrade[]; }; metadata: { generatedBy: string; totalTokens: number; questionsAsked: number; confidence: 'low' | 'medium' | 'high'; }; } interface EvalScenario { id: string; // "auth-login-success" name: string; description: string; target: { module: string; function: string; type: 'function' | 'method' | 'class'; }; category: 'unit' | 'integration' | 'edge-case' | 'negative'; priority: 'critical' | 'high' | 'medium' | 'low'; setup?: { fixtures: string[]; mocks: MockSpec[] }; input: { args: Record; kwargs?: Record }; assertions: Assertion[]; tags: string[]; } ``` --- ## Prompt Architecture (Three-Part) ### 1. System Prompt - Defines Claude's identity as codebase analyzer - Constraints: functional tests only, no syntax checks, ask don't assume ### 2. Developer Prompt - Contains EvalSpec JSON schema - Formatting rules (snake_case, kebab-case IDs) - Assertion type reference ### 3. User Prompt (Template) - Injects RepoSummary JSON - User context about what to evaluate - Instructions for output format --- ## Key Implementation ```typescript async function generateEvalSpec(options: GenerateOptions): Promise { const agentOptions: ClaudeAgentOptions = { systemPrompt: await loadPrompt('analyzer-system.md'), permissionMode: options.interactive ? 'default' : 'dontAsk', canUseTool: async ({ toolName, input }) => { if (toolName === 'AskUserQuestion' && options.onQuestion) { const answer = await options.onQuestion(input); return { behavior: 'allow', updatedInput: { ...input, answers: { [input.question]: answer } } }; } return { behavior: 'deny' }; }, outputFormat: { type: 'json_schema', json_schema: { name: 'EvalSpec', schema: EVAL_SPEC_SCHEMA } }, }; for await (const msg of query(prompt, agentOptions)) { if (msg.type === 'result') return msg.output as EvalSpec; } } ``` --- ## File Structure ``` src/analyzer/ β”œβ”€β”€ index.ts # Main entry point β”œβ”€β”€ types.ts # EvalSpec types β”œβ”€β”€ spec-generator.ts # Claude Agent SDK integration β”œβ”€β”€ validator.ts # JSON schema validation └── prompt-builder.ts # Builds prompts from templates prompts/ β”œβ”€β”€ analyzer-system.md β”œβ”€β”€ analyzer-developer.md └── analyzer-user.md ``` --- ## Success Criteria - [ ] Generates valid EvalSpec JSON for Python repos - [ ] Generates valid EvalSpec JSON for TypeScript repos - [ ] Asks 2-3 clarifying questions on complex repos - [ ] <10k tokens per analysis - [ ] 100% assertion coverage (every scenario has assertions)