# evaluclaude > **Zero-to-evals in one command.** Claude analyzes your codebase and generates functional tests. ![Version](https://img.shields.io/badge/version-0.1.0-blue) ![Node](https://img.shields.io/badge/node-%3E%3D18.0.0-green) ![License](https://img.shields.io/badge/license-MIT-brightgreen) ```mermaid flowchart LR subgraph Input A[Your Codebase] end subgraph Pipeline B[Introspect] C[Analyze] D[Render] E[Run] end subgraph Output F[Test Results] G[Traces] end A --> B B -->|RepoSummary| C C -->|EvalSpec| D D -->|Test Files| E E --> F E --> G B -.-|tree-sitter| B C -.-|Claude| C D -.-|pytest/vitest| D ``` A CLI tool that uses Claude to understand your codebase and generate real, runnable functional tests. Tree-sitter parses your code structure, Claude generates test specs, and deterministic renderers create the actual tests. ## Quick Start ```bash npm install -g evaluclaude-harness export ANTHROPIC_API_KEY=your-key # Run the full pipeline evaluclaude pipeline . # Or step by step evaluclaude intro . # Parse codebase evaluclaude analyze . -o spec.json -i # Generate spec (interactive) evaluclaude render spec.json # Create test files evaluclaude run # Execute tests ``` ## Commands | Command | Description | | :------------------------- | :-------------------------------------------- | | `pipeline [path]` | Run full pipeline: intro -> analyze -> render -> run | | `intro [path]` | Introspect codebase with tree-sitter | | `analyze [path]` | Generate EvalSpec with Claude | | `render ` | Render EvalSpec to test files | | `run [test-dir]` | Execute tests and collect results | | `grade ` | Grade output using LLM rubric | | `rubrics` | List available rubrics | | `calibrate` | Calibrate rubric against examples | | `view [trace-id]` | View trace details | | `traces` | List all traces | | `ui` | Launch Promptfoo dashboard | | `eval` | Run Promptfoo evaluations | ## Supported Languages | Language | Parser | Test Framework | | :------------ | :----------------------- | :-------------- | | Python | tree-sitter-python | pytest | | TypeScript | tree-sitter-typescript | vitest, jest | | JavaScript | tree-sitter-typescript | vitest, jest | ## Output Structure ``` .evaluclaude/ spec.json # Generated EvalSpec traces/ # Execution traces results/ # Test results promptfooconfig.yaml # Promptfoo config ``` ## How This Was Built This project was built in a few hours using [Amp Code](https://ampcode.com). You can explore the development threads: - [Initial setup and CLI structure](https://ampcode.com/threads/T-019bae58-69c2-74d0-a975-4be84c7d98dc) - [Introspection and tree-sitter integration](https://ampcode.com/threads/T-019bafc5-0d57-702a-a407-44b9b884b9d0) - [EvalSpec analysis with Claude](https://ampcode.com/threads/T-019baf57-7bc1-7368-9e3d-d649da47b68b) - [Test rendering and framework support](https://ampcode.com/threads/T-019baeef-6079-70d6-9c4a-3cfd439190f1) - [Grading and rubrics system](https://ampcode.com/threads/T-019baf12-e566-733d-b086-d099880c77c1) - [Promptfoo integration](https://ampcode.com/threads/T-019baf43-abcf-7715-8a65-0f5ac5df87ce) - [UI polish and final touches](https://ampcode.com/threads/T-019baf63-8c9e-7018-b8bc-538c5a3cada7) ## Development ```bash npm run build # Build npm run dev # Dev mode npm test # Run tests npm run typecheck # Type check ``` ## License MIT