evaluclaude-harness/AGENTS.md
2026-01-11 16:58:40 -05:00

Evaluclaude Harness - Agent Instructions

Project Overview

This is a CLI tool that uses Claude to generate evaluation tests for a codebase. The core philosophy is "Zero-to-evals in one command."

Commands

# Build the project
npm run build

# Run typecheck
npm run typecheck

# Run tests
npm test

# Run the CLI
npm start -- intro <path>

Project Structure

src/
├── cli/              # Commander.js CLI
├── introspector/     # Tree-sitter codebase parsing (NO LLM)
│   ├── parsers/      # Language-specific parsers
│   ├── scanner.ts    # File discovery
│   ├── git.ts        # Git integration
│   └── summarizer.ts # Main analysis logic
└── index.ts          # Main exports

Key Principles

  1. Tree-sitter for introspection: Never send raw code to Claude for structure extraction
  2. Claude generates specs, not code: Claude produces the EvalSpec JSON; test code is rendered from it deterministically
  3. Git-aware incremental analysis: Only re-analyze files that Git reports as changed
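
Principles 2 and 3 can be sketched as two small pure functions. This is only an illustration under stated assumptions: the EvalSpec fields, the renderVitest and changedSourceFiles names, and the choice of vitest as the render target are all hypothetical, since the real schema and renderer are not described here. The only external fact relied on is that `git diff --name-only` prints one changed path per line.

```typescript
// Hypothetical EvalSpec shape; the project's actual schema may differ.
interface EvalCase {
  name: string;     // human-readable test title
  target: string;   // exported function under test
  args: unknown[];  // JSON-serializable call arguments
  expected: unknown; // JSON-serializable expected result
}

interface EvalSpec {
  module: string;   // import path of the module under test
  cases: EvalCase[];
}

// Principle 2: rendering is a pure function of the spec, with no LLM call,
// so the same spec always yields byte-identical test source.
function renderVitest(spec: EvalSpec): string {
  // Deduplicate and sort the imported names so output order is stable.
  const targets = spec.cases
    .map((c) => c.target)
    .filter((t, i, all) => all.indexOf(t) === i)
    .sort();
  const lines: string[] = [
    `import { describe, it, expect } from "vitest";`,
    `import { ${targets.join(", ")} } from ${JSON.stringify(spec.module)};`,
    ``,
    `describe(${JSON.stringify(spec.module)}, () => {`,
  ];
  for (const c of spec.cases) {
    const args = c.args.map((a) => JSON.stringify(a)).join(", ");
    lines.push(`  it(${JSON.stringify(c.name)}, () => {`);
    lines.push(`    expect(${c.target}(${args})).toEqual(${JSON.stringify(c.expected)});`);
    lines.push(`  });`);
  }
  lines.push(`});`, ``);
  return lines.join("\n");
}

// Principle 3: given the text printed by `git diff --name-only <base>`,
// keep only the source files worth re-analyzing.
function changedSourceFiles(
  diffOutput: string,
  extensions: string[] = [".py", ".ts"],
): string[] {
  return diffOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .filter((file) => extensions.some((ext) => file.slice(-ext.length) === ext));
}

// Usage with made-up data.
const spec: EvalSpec = {
  module: "./math",
  cases: [{ name: "adds two numbers", target: "add", args: [1, 2], expected: 3 }],
};
const rendered = renderVitest(spec);
const changed = changedSourceFiles("src/a.ts\nREADME.md\n\npkg/b.py\n");
console.log(rendered);
console.log(changed);
```

Keeping both steps free of I/O means they can be unit-tested without a Git repository or an API key; the caller would be responsible for running Git and writing the rendered file to disk.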

Dependencies

  • tree-sitter: Native AST parsing
  • tree-sitter-python: Python grammar
  • tree-sitter-typescript: TypeScript grammar
  • commander: CLI framework
  • glob: File pattern matching

Testing

Use vitest for testing. Test files go in the tests/ directory.