fix(ai): ensure maxTokens > thinkingBudget for Claude thinking models

Claude requires max_tokens > thinking.budget_tokens. When a caller specifies a small maxTokens (e.g. compaction with ~13k tokens) and reasoning is enabled with a high budget (16k tokens), the constraint was violated.

Fix: in mapOptionsForApi, add thinkingBudget on top of the caller's maxTokens (capped at model.maxTokens). If there is still not enough room, reduce thinkingBudget to leave space for output. Applied to both the anthropic-messages and google-gemini-cli APIs.

Also adds test utilities for OAuth credential resolution and tests for compaction with thinking models.

fixes #413
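
The adjustment described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual mapOptionsForApi code: the function name `adjustTokenBudget`, the `TokenBudget` shape, and the `minOutputTokens` reserve of 1024 are all assumptions made for the example.

```typescript
// Hypothetical helper illustrating the constraint maxTokens > thinkingBudget.
// Names and the default output reserve are assumptions, not the project's API.
interface TokenBudget {
  maxTokens: number;
  thinkingBudget: number;
}

function adjustTokenBudget(
  callerMaxTokens: number,
  thinkingBudget: number,
  modelMaxTokens: number,
  minOutputTokens: number = 1024, // assumed reserve for visible output
): TokenBudget {
  // Step 1: add the thinking budget on top of the caller's limit,
  // capped at what the model supports.
  const maxTokens = Math.min(callerMaxTokens + thinkingBudget, modelMaxTokens);

  // Step 2: if the cap leaves no room for output, shrink the thinking budget.
  let budget = thinkingBudget;
  if (maxTokens <= budget) {
    budget = Math.max(0, maxTokens - minOutputTokens);
  }

  return { maxTokens, thinkingBudget: budget };
}
```

With the figures from the message (a caller limit of about 13k and a 16k thinking budget), a model limit of 64k would leave the thinking budget untouched and raise the request limit to the sum of the two. Only when the model limit itself is at or below the thinking budget does the second step reduce the budget.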
2026-04-15 18:01:22 +00:00 · 2026-01-03 02:45:30 +01:00 · 8df22faedf
commit 8df22faedf
parent 97af788344
4 changed files with 347 additions and 7 deletions
--- a/packages/coding-agent/README.md
+++ b/packages/coding-agent/README.md
@@ -586,6 +586,7 @@ Global `~/.pi/agent/settings.json` stores persistent preferences:
 | `retry.baseDelayMs` | Base delay for exponential backoff | `2000` |
 | `terminal.showImages` | Render images inline (supported terminals) | `true` |
 | `images.autoResize` | Auto-resize images to 2000x2000 max for better model compatibility | `true` |
+
 | `doubleEscapeAction` | Action for double-escape with empty editor: `tree` or `branch` | `tree` |
 | `hooks` | Additional hook file paths | `[]` |
 | `customTools` | Additional custom tool file paths | `[]` |