Gemini OAuth Integration Guide
This document provides a comprehensive analysis of how OAuth authentication could be implemented for Google Gemini in the pi coding-agent, based on the existing Anthropic OAuth implementation and the Gemini CLI's approach.
Table of Contents
- Current Anthropic OAuth Implementation
- Gemini CLI Authentication Analysis
- Gemini API Capabilities
- Gemini API Endpoints
- Implementation Plan
Current Anthropic OAuth Implementation
The pi coding-agent implements OAuth for Anthropic with the following architecture:
Key Components
- OAuth Flow (packages/coding-agent/src/core/oauth/anthropic.ts):
  - Uses PKCE (Proof Key for Code Exchange) flow for security
  - Client ID: 9d1c250a-e61b-44d9-88ed-5944d1962f5e
  - Authorization URL: https://claude.ai/oauth/authorize
  - Token URL: https://console.anthropic.com/v1/oauth/token
  - Scopes: org:create_api_key user:profile user:inference
- Token Storage (packages/coding-agent/src/core/oauth/storage.ts):
  - Stores credentials in ~/.pi/agent/oauth.json
  - File permissions set to 0600 (owner read/write only)
  - Format: { provider: { type: "oauth", refresh: string, access: string, expires: number } }
- Token Management (packages/coding-agent/src/core/oauth/index.ts):
  - Auto-refresh tokens when expired (with 5-minute buffer)
  - Supports multiple providers through the SupportedOAuthProvider type
  - Provider info includes id, name, and availability status
- Model Integration (packages/coding-agent/src/core/model-config.ts):
  - Checks OAuth tokens first, then environment variables
  - OAuth status cached to avoid repeated file reads
  - Maps providers to OAuth providers via providerToOAuthProvider
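The stored credential format and the five-minute refresh buffer described above can be sketched as follows. This is a minimal illustration rather than the actual pi code: needsRefresh and REFRESH_BUFFER_MS are hypothetical names, and expires is assumed to be a Unix timestamp in milliseconds.

```typescript
// Shape of one stored entry, following the format listed under Token Storage.
interface OAuthCredentials {
  type: "oauth";
  refresh: string;
  access: string;
  expires: number; // assumption: Unix timestamp in milliseconds
}

// Refresh slightly early so a token does not expire mid-request.
const REFRESH_BUFFER_MS = 5 * 60 * 1000;

// Hypothetical helper: true once the token is within the buffer of its expiry.
function needsRefresh(credentials: OAuthCredentials, now: number = Date.now()): boolean {
  return now >= credentials.expires - REFRESH_BUFFER_MS;
}

// Placeholder values for illustration only.
const exampleCredentials: OAuthCredentials = {
  type: "oauth",
  refresh: "refresh-token-placeholder",
  access: "access-token-placeholder",
  expires: 1000000,
};
```

Passing now explicitly keeps the check deterministic in tests; production code would rely on the Date.now() default.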
Authentication Flow
- User initiates login with pi auth login
- Authorization URL is generated with PKCE challenge
- User opens URL in browser and authorizes
- User copies authorization code (format: code#state)
- Code is exchanged for access/refresh tokens
- Tokens are saved encrypted with expiry time
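The code#state format in the flow above could be split as in the sketch below. parseAuthorizationCode and ParsedAuthCode are hypothetical names, and the validation rules are an assumption; the actual Anthropic implementation may handle the pasted value differently.

```typescript
interface ParsedAuthCode {
  code: string;
  state: string;
}

// Hypothetical helper: split a pasted "code#state" value into its two parts.
function parseAuthorizationCode(input: string): ParsedAuthCode {
  const trimmed = input.trim();
  const separator = trimmed.indexOf("#");
  // Reject input with no "#", an empty code, or an empty state.
  if (separator <= 0 || separator === trimmed.length - 1) {
    throw new Error("Expected authorization code in the form code#state");
  }
  return {
    code: trimmed.slice(0, separator),
    state: trimmed.slice(separator + 1),
  };
}
```

In a PKCE flow the returned state would normally be compared against the value generated when the authorization URL was built, before the code is exchanged for tokens.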
Gemini CLI Authentication Analysis
The Gemini CLI uses a more complex OAuth implementation with several key differences:
Authentication Methods
Gemini supports multiple authentication types:
- LOGIN_WITH_GOOGLE (OAuth personal account)
- USE_GEMINI (API key)
- USE_VERTEX_AI (Vertex AI)
- COMPUTE_ADC (Application Default Credentials)
OAuth Implementation Details
- OAuth Configuration:
  - Client ID and Secret: See google-gemini/gemini-cli oauth2.ts (public for installed apps per Google's OAuth docs)
  - Scopes:
    - https://www.googleapis.com/auth/cloud-platform
    - https://www.googleapis.com/auth/userinfo.email
    - https://www.googleapis.com/auth/userinfo.profile
- Authentication Flows:
  - Web Flow: Opens browser, runs local HTTP server for callback
  - User Code Flow: For environments without browser (NO_BROWSER=true)
  - Uses Google's google-auth-library for OAuth handling
- Token Storage:
  - Supports encrypted storage via OAuthCredentialStorage
  - Falls back to plain JSON storage
  - Stores user info (email) separately
- API Integration:
  - Uses CodeAssistServer for API calls
  - Endpoint: https://cloudcode-pa.googleapis.com
  - Includes user tier information (FREE, STANDARD, etc.)
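To make the web flow concrete, the sketch below assembles a Google authorization URL by hand from the scopes listed above. The Gemini CLI delegates this step to google-auth-library, so this is only an illustration of what such a URL contains. buildGoogleAuthUrl is a hypothetical helper, and the client ID, redirect URI, and state are placeholders supplied by the caller.

```typescript
// Google's OAuth 2.0 authorization endpoint.
const GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth";

// Scopes as listed in the OAuth configuration above.
const GEMINI_SCOPES = [
  "https://www.googleapis.com/auth/cloud-platform",
  "https://www.googleapis.com/auth/userinfo.email",
  "https://www.googleapis.com/auth/userinfo.profile",
];

// Hypothetical helper: build the URL the user opens in the browser.
function buildGoogleAuthUrl(clientId: string, redirectUri: string, state: string): string {
  const params: Array<[string, string]> = [
    ["client_id", clientId],
    ["redirect_uri", redirectUri],
    ["response_type", "code"],
    ["scope", GEMINI_SCOPES.join(" ")],
    ["access_type", "offline"], // ask Google for a refresh token as well
    ["state", state],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${GOOGLE_AUTH_ENDPOINT}?${query}`;
}
```

A call such as buildGoogleAuthUrl("my-client", "http://localhost:8085/callback", "s1") yields a URL whose redirect URI points at the local HTTP server that receives the callback; the port and path here are assumptions.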
Gemini API Capabilities
Based on the Gemini CLI analysis:
System Prompts
✅ Yes, Gemini supports system prompts
- Implemented via getCoreSystemPrompt() in the codebase
- System instructions are part of the GenerateContentParameters
Tools/Function Calling
✅ Yes, Gemini supports tools and function calling
- Uses the Tool type from @google/genai
- Extensive tool support including:
  - File system operations (read, write, edit)
  - Web search and fetch
  - MCP (Model Context Protocol) tools
  - Custom tool registration
Content Generation
- Supports streaming and non-streaming generation
- Token counting capabilities
- Embedding support
- Context compression for long conversations
Gemini API Endpoints
When using OAuth tokens, the Gemini CLI talks to:
Primary Endpoint
- Base URL: https://cloudcode-pa.googleapis.com
- API Version: v1internal
Key Methods
- generateContent - Non-streaming content generation
- streamGenerateContent - Streaming content generation
- countTokens - Token counting
- embedContent - Text embeddings
- loadCodeAssist - User setup and tier information
- onboardUser - User onboarding
Authentication
- OAuth tokens are passed via AuthClient from google-auth-library
- Tokens are automatically refreshed by the library
- Project ID and session ID included in requests
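The endpoint, version, and method names above might combine as in the following sketch. The base/version:method URL layout is an assumption about how the Code Assist API addresses methods and should be checked against the Gemini CLI source. methodUrl and authHeaders are hypothetical helpers, and the bearer header stands in for what AuthClient attaches automatically.

```typescript
const CODE_ASSIST_ENDPOINT = "https://cloudcode-pa.googleapis.com";
const CODE_ASSIST_API_VERSION = "v1internal";

// Methods listed under Key Methods.
type CodeAssistMethod =
  | "generateContent"
  | "streamGenerateContent"
  | "countTokens"
  | "embedContent"
  | "loadCodeAssist"
  | "onboardUser";

// Assumption: methods are addressed as <base>/<version>:<method>.
function methodUrl(method: CodeAssistMethod): string {
  return `${CODE_ASSIST_ENDPOINT}/${CODE_ASSIST_API_VERSION}:${method}`;
}

// Hypothetical helper: headers for a JSON request authorized with an OAuth access token.
function authHeaders(accessToken: string): Record<string, string> {
  return {
    Authorization: `Bearer ${accessToken}`,
    "Content-Type": "application/json",
  };
}
```

Restricting method to a union type means a typo in a method name fails at compile time rather than as a 404 at runtime.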
Implementation Plan
1. Add Gemini OAuth Provider Support
File: packages/coding-agent/src/core/oauth/gemini.ts
import { OAuth2Client } from "google-auth-library";
import { type OAuthCredentials, saveOAuthCredentials } from "./storage.js";
// OAuth credentials from google-gemini/gemini-cli:
// https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/code_assist/oauth2.ts
const SCOPES = [
  "https://www.googleapis.com/auth/cloud-platform",
  "https://www.googleapis.com/auth/userinfo.email",
  "https://www.googleapis.com/auth/userinfo.profile",
];
export async function loginGemini(
  onAuthUrl: (url: string) => void,
  onPromptCode: () => Promise<string>,
): Promise<void> {
  // Implementation similar to Anthropic but using google-auth-library
  throw new Error("loginGemini is not implemented yet");
}
export async function refreshGeminiToken(refreshToken: string): Promise<OAuthCredentials> {
  // Use google-auth-library for refresh
  // Throwing keeps the stub type-correct until a real OAuthCredentials value is returned
  throw new Error("refreshGeminiToken is not implemented yet");
}
2. Update OAuth Index
File: packages/coding-agent/src/core/oauth/index.ts
export type SupportedOAuthProvider = "anthropic" | "github-copilot" | "gemini";
// Add Gemini to provider list
{
  id: "gemini",
  name: "Google Gemini (Code Assist)",
  available: true,
}
// Add cases for Gemini in login/refresh functions
3. Create Gemini API Client
File: packages/ai/src/providers/gemini-oauth.ts
export class GeminiOAuthProvider implements Provider {
  // Implement Provider interface
  // Use CodeAssistServer approach from Gemini CLI
  // Map to standard pi-ai API format
}
4. Update Model Configuration
File: packages/coding-agent/src/core/model-config.ts
// Add to providerToOAuthProvider mapping
gemini: "gemini",
// Add Gemini OAuth token check
if (model.provider === "gemini") {
  const oauthToken = await getOAuthToken("gemini");
  if (oauthToken) return oauthToken;
  const oauthEnv = process.env.GEMINI_OAUTH_TOKEN;
  if (oauthEnv) return oauthEnv;
}
5. Dependencies
Add to package.json:
{
  "dependencies": {
    "google-auth-library": "^9.0.0"
  }
}
6. Environment Variables
Support these environment variables:
- GEMINI_OAUTH_TOKEN - Manual OAuth token
- GOOGLE_CLOUD_PROJECT - For project-specific features
- NO_BROWSER - Force user code flow
Key Differences from Anthropic Implementation
- Authentication Library: Use google-auth-library instead of manual OAuth
- Multiple Auth Types: Support OAuth, API key, and ADC
- User Info: Fetch and cache user email/profile
- Project Context: Include project ID in API calls
- Tier Management: Handle user tier (FREE/STANDARD) responses
Challenges and Considerations
- API Access: The Code Assist API (cloudcode-pa.googleapis.com) might require special access or be in preview
- Model Naming: Need to map Gemini model names to Code Assist equivalents
- Rate Limits: Handle tier-based rate limits
- Error Handling: Map Google-specific errors to pi error types
- Token Scopes: Ensure scopes are sufficient for all operations
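For the error-handling item, one possible starting point is to classify responses by HTTP status before mapping them to pi error types. The category names and classifyHttpStatus below are hypothetical; the actual pi error types and the error bodies returned by the Code Assist API are not described in this document.

```typescript
// Hypothetical error categories; real pi error types may differ.
type AuthErrorKind =
  | "invalid_token"
  | "access_denied"
  | "rate_limited"
  | "server_error"
  | "unknown";

// Classify a response by its HTTP status code.
function classifyHttpStatus(status: number): AuthErrorKind {
  if (status === 401) return "invalid_token"; // expired or revoked token: refresh or log in again
  if (status === 403) return "access_denied"; // missing scope or no access to the API
  if (status === 429) return "rate_limited"; // tier-based limit reached: back off
  if (status >= 500 && status <= 599) return "server_error";
  return "unknown";
}
```

A caller could retry once after a token refresh on invalid_token and surface the other categories to the user.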
Testing Plan
- Test OAuth flow (browser and NO_BROWSER modes)
- Test token refresh
- Test API calls with OAuth tokens
- Test fallback to API keys
- Test error scenarios (invalid tokens, network errors)
- Test model switching and tier limits
Migration Path
- Users with GEMINI_API_KEY continue working unchanged
- New pi auth login gemini command for OAuth
- OAuth takes precedence over API keys when available
- Clear messaging about benefits (higher limits, better features)
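The precedence described in this migration path can be sketched as a single lookup: stored OAuth token first, then GEMINI_OAUTH_TOKEN, then GEMINI_API_KEY. resolveGeminiCredential and GeminiCredential are hypothetical names, and taking the environment as a parameter instead of reading process.env is a choice made here to keep the sketch testable.

```typescript
// Hypothetical result type distinguishing the two credential kinds.
type GeminiCredential =
  | { kind: "oauth"; token: string }
  | { kind: "api-key"; key: string };

// Resolve credentials in the order: stored OAuth token, env OAuth token, API key.
function resolveGeminiCredential(
  storedOAuthToken: string | undefined,
  env: Record<string, string | undefined>,
): GeminiCredential | undefined {
  if (storedOAuthToken) return { kind: "oauth", token: storedOAuthToken };
  const envToken = env["GEMINI_OAUTH_TOKEN"];
  if (envToken) return { kind: "oauth", token: envToken };
  const apiKey = env["GEMINI_API_KEY"];
  if (apiKey) return { kind: "api-key", key: apiKey };
  return undefined; // nothing configured: the caller should prompt for login
}
```

Because the API key is only consulted last, existing GEMINI_API_KEY setups keep working until the user opts in to OAuth.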