fix(ai): ensure maxTokens > thinkingBudget for Claude thinking models

Claude requires max_tokens > thinking.budget_tokens. When caller specifies
a small maxTokens (e.g. compaction with ~13k tokens) and reasoning is enabled
with high budget (16k tokens), the constraint was violated.

Fix: In mapOptionsForApi, add thinkingBudget on top of caller's maxTokens
(capped at model.maxTokens). If still not enough room, reduce thinkingBudget
to leave space for output.

Applied to both anthropic-messages and google-gemini-cli APIs.

Also adds test utilities for OAuth credential resolution and tests for
compaction with thinking models.

fixes #413
This commit is contained in:
Mario Zechner 2026-01-03 02:45:30 +01:00
parent 97af788344
commit 8df22faedf
4 changed files with 347 additions and 7 deletions

View file

@ -586,6 +586,7 @@ Global `~/.pi/agent/settings.json` stores persistent preferences:
| `retry.baseDelayMs` | Base delay for exponential backoff | `2000` |
| `terminal.showImages` | Render images inline (supported terminals) | `true` |
| `images.autoResize` | Auto-resize images to 2000x2000 max for better model compatibility | `true` |
| `doubleEscapeAction` | Action for double-escape with empty editor: `tree` or `branch` | `tree` |
| `hooks` | Additional hook file paths | `[]` |
| `customTools` | Additional custom tool file paths | `[]` |