feat: add maxDelayMs setting to cap server-requested retry delays

When a provider (e.g., Google Gemini CLI) requests a retry delay longer
than maxDelayMs (default: 60s), the request fails immediately with an
informative error instead of waiting silently for hours.

The error is then handled by agent-level auto-retry, which shows the
delay to the user and allows aborting with Escape.

- Add maxRetryDelayMs to StreamOptions (packages/ai)
- Add maxRetryDelayMs to AgentOptions (packages/agent)
- Add retry.maxDelayMs to settings (packages/coding-agent)
- Update _isRetryableError to match 'retry delay' errors

fixes #1123
This commit is contained in:
Mario Zechner 2026-02-01 00:50:41 +01:00
parent 1bd68327f3
commit 030a61d88c
11 changed files with 65 additions and 4 deletions

View file

@ -77,13 +77,17 @@ Edit directly or use `/settings` for common options.
| `retry.enabled` | boolean | `true` | Enable automatic retry on transient errors |
| `retry.maxRetries` | number | `3` | Maximum retry attempts |
| `retry.baseDelayMs` | number | `2000` | Base delay for exponential backoff (2s, 4s, 8s) |
| `retry.maxDelayMs` | number | `60000` | Max server-requested delay before failing (60s) |
When a provider requests a retry delay longer than `maxDelayMs` (e.g., Google's "quota will reset after 5h"), the request fails immediately with an informative error instead of waiting silently. Set to `0` to disable the cap.
```json
{
"retry": {
"enabled": true,
"maxRetries": 3,
"baseDelayMs": 2000
"baseDelayMs": 2000,
"maxDelayMs": 60000
}
}
```