mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-21 11:04:35 +00:00
feat(agent): Comprehensive reasoning token support across providers
Added provider-specific reasoning/thinking token support for: - OpenAI (o1, o3, gpt-5): Full reasoning events via Responses API, token counts via Chat Completions - Groq: reasoning_format:"parsed" for Chat Completions, no summary support for Responses - Gemini 2.5: extra_body.google.thinking_config with <thought> tag extraction - OpenRouter: Unified reasoning parameter with message.reasoning field - Anthropic: Limited support via OpenAI compatibility layer Key improvements: - Centralized provider detection based on baseURL - parseReasoningFromMessage() extracts provider-specific reasoning content - adjustRequestForProvider() handles provider-specific request modifications - Smart reasoning support detection with caching per API type - Comprehensive README documentation with provider support matrix Fixes reasoning tokens not appearing for GPT-5 and other reasoning models.
This commit is contained in:
parent
62d9eefc2a
commit
99ce76d66e
5 changed files with 345 additions and 58 deletions
|
|
@ -33,14 +33,20 @@ pi-agent
|
|||
# Continue most recently modified session in current directory
|
||||
pi-agent --continue "Follow up question"
|
||||
|
||||
# GPT-OSS via Groq
|
||||
# GPT-OSS via Groq (supports reasoning with both APIs)
|
||||
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY --model openai/gpt-oss-120b
|
||||
|
||||
# GLM 4.5 via OpenRouter
|
||||
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY --model z-ai/glm-4.5
|
||||
|
||||
# Claude via Anthropic (no prompt caching support - see https://docs.anthropic.com/en/api/openai-sdk)
|
||||
# Claude via Anthropic's OpenAI compatibility layer
|
||||
# Note: No prompt caching or thinking content support. For full features, use the native Anthropic API.
|
||||
# See: https://docs.anthropic.com/en/api/openai-sdk
|
||||
pi-agent --base-url https://api.anthropic.com/v1 --api-key $ANTHROPIC_API_KEY --model claude-opus-4-1-20250805
|
||||
|
||||
# Gemini via Google AI (set GEMINI_API_KEY environment variable)
|
||||
# Note: Gemini 2.5 models support reasoning but require extra_body for thinking content (not yet supported)
|
||||
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ --api-key $GEMINI_API_KEY --model gemini-2.5-flash
|
||||
```
|
||||
|
||||
## Usage Modes
|
||||
|
|
@ -137,13 +143,18 @@ When using `--json`, the agent outputs these event types:
|
|||
- `user_message` - User input
|
||||
- `assistant_start` - Assistant begins responding
|
||||
- `assistant_message` - Assistant's response
|
||||
- `thinking` - Reasoning/thinking (for models that support it)
|
||||
- `thinking` - Reasoning/thinking (for models that support it, requires `--api responses`)
|
||||
- `tool_call` - Tool being called
|
||||
- `tool_result` - Result from tool
|
||||
- `token_usage` - Token usage statistics
|
||||
- `token_usage` - Token usage statistics (includes `reasoningTokens` for models with reasoning)
|
||||
- `error` - Error occurred
|
||||
- `interrupted` - Processing was interrupted
|
||||
|
||||
**Note:**
|
||||
- OpenAI's Chat Completions API (`--api completions`, the default) only returns reasoning token *counts* but not the actual thinking content. To see thinking events, use the Responses API with `--api responses` for supported models (o1, o3, gpt-5).
|
||||
- Anthropic's OpenAI compatibility layer doesn't return thinking content. Use the native Anthropic API for full extended thinking features.
|
||||
- Gemini 2.5 models support reasoning via `reasoning_effort` but require the `extra_body` parameter with `thinking_config.include_thoughts: true` to get thinking content, which is not yet supported in pi-agent.
|
||||
|
||||
The complete TypeScript type definition for `AgentEvent` can be found in [`src/agent.ts`](src/agent.ts#L6).
|
||||
|
||||
## Build an Interactive UI with JSON Mode
|
||||
|
|
@ -284,6 +295,66 @@ agent.on('error', (err) => {
|
|||
console.log('Pi Agent Interactive Chat');
|
||||
```
|
||||
|
||||
## Reasoning
|
||||
|
||||
Pi-agent supports reasoning/thinking tokens for models that provide this capability:
|
||||
|
||||
### Supported Providers
|
||||
|
||||
| Provider | API | Reasoning Tokens | Thinking Content | Notes |
|
||||
|----------|-----|------------------|------------------|-------|
|
||||
| OpenAI (o1, o3) | Responses | ✅ | ✅ | Full support via `reasoning` events |
|
||||
| OpenAI (o1, o3) | Chat Completions | ✅ | ❌ | Token counts only, no content |
|
||||
| OpenAI (gpt-5) | Responses | ✅ | ⚠️ | Model returns empty summaries |
|
||||
| OpenAI (gpt-5) | Chat Completions | ✅ | ❌ | Token counts only |
|
||||
| Groq (gpt-oss) | Responses | ✅ | ❌ | No reasoning.summary support |
|
||||
| Groq (gpt-oss) | Chat Completions | ✅ | ✅ | Via `reasoning_format: "parsed"` |
|
||||
| Gemini 2.5 | Chat Completions | ✅ | ✅ | Via `extra_body.google.thinking_config` |
|
||||
| Anthropic | OpenAI Compat | ❌ | ❌ | Not supported in compatibility layer |
|
||||
| OpenRouter | Various | ✅ | ✅ | Model-dependent, see provider docs |
|
||||
|
||||
### Usage Examples
|
||||
|
||||
```bash
|
||||
# OpenAI o1/o3 - see thinking content with Responses API
|
||||
pi-agent --api responses --model o1-mini "Explain quantum computing"
|
||||
|
||||
# Groq gpt-oss - reasoning with Chat Completions
|
||||
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY \
|
||||
--model openai/gpt-oss-120b "Complex math problem"
|
||||
|
||||
# Gemini 2.5 - thinking content automatically configured
|
||||
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ \
|
||||
--api-key $GEMINI_API_KEY --model gemini-2.5-flash "Think step by step"
|
||||
|
||||
# OpenRouter - supports various reasoning models
|
||||
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY \
|
||||
--model "qwen/qwen3-235b-a22b-thinking-2507" "Complex reasoning task"
|
||||
```
|
||||
|
||||
### JSON Mode Events
|
||||
|
||||
When reasoning is active, you'll see:
|
||||
- `reasoning` events with thinking text (when available)
|
||||
- `token_usage` events include `reasoningTokens` field
|
||||
- Console/TUI show reasoning tokens with ⚡ symbol
|
||||
|
||||
### Technical Details
|
||||
|
||||
The agent automatically:
|
||||
- Detects provider from base URL
|
||||
- Tests model reasoning support on first use (cached)
|
||||
- Adjusts request parameters per provider:
|
||||
- OpenAI: `reasoning_effort` (minimal/low)
|
||||
- Groq: `reasoning_format: "parsed"`
|
||||
- Gemini: `extra_body.google.thinking_config`
|
||||
- OpenRouter: `reasoning` object with `effort` field
|
||||
- Parses provider-specific response formats:
|
||||
- Gemini: Extracts from `<thought>` tags
|
||||
- Groq: Uses `message.reasoning` field
|
||||
- OpenRouter: Uses `message.reasoning` field
|
||||
- OpenAI: Uses standard `reasoning` events
|
||||
|
||||
## Architecture
|
||||
|
||||
The agent is built with:
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue