mirror of https://github.com/getcompanion-ai/co-mono.git synced 2026-04-15 23:01:30 +00:00

Mario Zechner 99b4b1aca0 Add Mistral as AI provider

- Add Mistral to KnownProvider type and model generation
- Implement Mistral-specific compat handling in openai-completions:
  - requiresToolResultName: tool results need name field
  - requiresAssistantAfterToolResult: synthetic assistant message between tool/user
  - requiresThinkingAsText: thinking blocks as <thinking> text
  - requiresMistralToolIds: tool IDs must be exactly 9 alphanumeric chars
- Add MISTRAL_API_KEY environment variable support
- Add Mistral tests across all test files
- Update documentation (README, CHANGELOG) for both ai and coding-agent packages
- Remove client IDs from gemini.md, reference upstream source instead

Closes #165

2025-12-10 20:36:19 +01:00

8.4 KiB


Gemini OAuth Integration Guide

This document analyzes how OAuth authentication could be implemented for Google Gemini in the pi coding-agent, based on the existing Anthropic OAuth implementation and the Gemini CLI's approach.

Current Anthropic OAuth Implementation
Gemini CLI Authentication Analysis
Gemini API Capabilities
Gemini API Endpoints
Implementation Plan

Current Anthropic OAuth Implementation

The pi coding-agent implements OAuth for Anthropic with the following architecture:

Key Components

OAuth Flow (packages/coding-agent/src/core/oauth/anthropic.ts):
- Uses PKCE (Proof Key for Code Exchange) flow for security
- Client ID: 9d1c250a-e61b-44d9-88ed-5944d1962f5e
- Authorization URL: https://claude.ai/oauth/authorize
- Token URL: https://console.anthropic.com/v1/oauth/token
- Scopes: org:create_api_key user:profile user:inference
Token Storage (packages/coding-agent/src/core/oauth/storage.ts):
- Stores credentials in ~/.pi/agent/oauth.json
- File permissions set to 0600 (owner read/write only)
- Format: { provider: { type: "oauth", refresh: string, access: string, expires: number } }
Token Management (packages/coding-agent/src/core/oauth/index.ts):
- Auto-refresh tokens when expired (with 5-minute buffer)
- Supports multiple providers through SupportedOAuthProvider type
- Provider info includes id, name, and availability status
Model Integration (packages/coding-agent/src/core/model-config.ts):
- Checks OAuth tokens first, then environment variables
- OAuth status cached to avoid repeated file reads
- Maps providers to OAuth providers via providerToOAuthProvider
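
The stored credential format and the 5-minute refresh buffer described above can be sketched as follows. This is a minimal illustration, not the actual code in index.ts: needsRefresh and REFRESH_BUFFER_MS are hypothetical names, expires is assumed to be epoch milliseconds, and the buffer is assumed to be applied when checking rather than when storing.

```typescript
// Shape taken from the storage format listed above.
interface OAuthCredentials {
  type: "oauth";
  refresh: string;
  access: string;
  expires: number; // assumption: epoch milliseconds
}

// Hypothetical constant for the 5-minute buffer.
const REFRESH_BUFFER_MS = 5 * 60 * 1000;

// Returns true once the token is within the buffer window of its expiry,
// so a refresh happens before the access token actually stops working.
function needsRefresh(creds: OAuthCredentials, now: number = Date.now()): boolean {
  return now >= creds.expires - REFRESH_BUFFER_MS;
}
```

Passing now as a parameter keeps the check deterministic in tests.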

Authentication Flow

1. User initiates login with pi auth login
2. Authorization URL is generated with PKCE challenge
3. User opens URL in browser and authorizes
4. User copies authorization code (format: code#state)
5. Code is exchanged for access/refresh tokens
6. Tokens are saved to ~/.pi/agent/oauth.json together with their expiry time
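
The PKCE challenge generation and the code#state parsing in the flow above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the S256 challenge method from RFC 7636 is assumed, and the real anthropic.ts may differ in its details.

```typescript
import { createHash, randomBytes } from "node:crypto";

// S256 challenge as defined in RFC 7636: base64url(sha256(verifier)).
function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier).digest().toString("base64url");
}

// Hypothetical helper: a random 32-byte verifier plus its challenge.
function generatePkce(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  return { verifier, challenge: challengeFor(verifier) };
}

// Hypothetical helper: splits the pasted "code#state" string into its parts.
function parseAuthorizationCode(input: string): { code: string; state: string } {
  const [code, state] = input.trim().split("#");
  if (!code || !state) {
    throw new Error("Expected authorization code in the form code#state");
  }
  return { code, state };
}
```

The verifier stays local and is sent only in the token exchange, while the challenge goes into the authorization URL.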

Gemini CLI Authentication Analysis

The Gemini CLI's OAuth implementation is more complex than the Anthropic one and differs in several ways:

Authentication Methods

Gemini supports multiple authentication types:

LOGIN_WITH_GOOGLE (OAuth personal account)
USE_GEMINI (API key)
USE_VERTEX_AI (Vertex AI)
COMPUTE_ADC (Application Default Credentials)

OAuth Implementation Details

OAuth Configuration:
- Client ID and Secret: See google-gemini/gemini-cli oauth2.ts (public for installed apps per Google's OAuth docs)
- Scopes:
  - https://www.googleapis.com/auth/cloud-platform
  - https://www.googleapis.com/auth/userinfo.email
  - https://www.googleapis.com/auth/userinfo.profile
Authentication Flows:
- Web Flow: Opens browser, runs local HTTP server for callback
- User Code Flow: For environments without browser (NO_BROWSER=true)
- Uses Google's google-auth-library for OAuth handling
Token Storage:
- Supports encrypted storage via OAuthCredentialStorage
- Falls back to plain JSON storage
- Stores user info (email) separately
API Integration:
- Uses CodeAssistServer for API calls
- Endpoint: https://cloudcode-pa.googleapis.com
- Includes user tier information (FREE, STANDARD, etc.)
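
To make the OAuth configuration above concrete, the following sketch builds a Google authorization URL by hand with the listed scopes. In practice google-auth-library would generate this URL; the sketch only shows what the request contains. buildGoogleAuthUrl and AuthUrlOptions are hypothetical names, the client ID and redirect URI are placeholders supplied by the caller, and including PKCE parameters is an assumption that mirrors the Anthropic flow.

```typescript
// Google's OAuth 2.0 authorization endpoint.
const GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth";

// Scopes as listed in the OAuth configuration above.
const GEMINI_SCOPES = [
  "https://www.googleapis.com/auth/cloud-platform",
  "https://www.googleapis.com/auth/userinfo.email",
  "https://www.googleapis.com/auth/userinfo.profile",
];

// Hypothetical options type; the values come from the caller.
interface AuthUrlOptions {
  clientId: string;
  redirectUri: string;
  state: string;
  codeChallenge: string;
}

function buildGoogleAuthUrl(opts: AuthUrlOptions): string {
  const url = new URL(GOOGLE_AUTH_ENDPOINT);
  url.searchParams.set("client_id", opts.clientId);
  url.searchParams.set("redirect_uri", opts.redirectUri);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("scope", GEMINI_SCOPES.join(" "));
  url.searchParams.set("access_type", "offline"); // request a refresh token
  url.searchParams.set("state", opts.state);
  url.searchParams.set("code_challenge", opts.codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");
  return url.toString();
}
```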

Gemini API Capabilities

Based on the Gemini CLI analysis:

System Prompts

✅ Yes, Gemini supports system prompts

Implemented via getCoreSystemPrompt() in the codebase
System instructions are part of the GenerateContentParameters

Tools/Function Calling

✅ Yes, Gemini supports tools and function calling

Uses the Tool type from @google/genai
Extensive tool support including:
- File system operations (read, write, edit)
- Web search and fetch
- MCP (Model Context Protocol) tools
- Custom tool registration

Content Generation

Supports streaming and non-streaming generation
Token counting capabilities
Embedding support
Context compression for long conversations

Gemini API Endpoints

When using OAuth tokens, the Gemini CLI talks to:

Primary Endpoint

Base URL: https://cloudcode-pa.googleapis.com
API Version: v1internal

Key Methods

generateContent - Non-streaming content generation
streamGenerateContent - Streaming content generation
countTokens - Token counting
embedContent - Text embeddings
loadCodeAssist - User setup and tier information
onboardUser - User onboarding

Authentication

OAuth tokens are passed via AuthClient from google-auth-library
Tokens are automatically refreshed by the library
Project ID and session ID included in requests
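
As an illustration of how the base URL, API version, and method names above might combine, here is a small sketch. The colon-separated version:method URL shape is an assumption based on how the Gemini CLI appears to address this API, and methodUrl and authHeaders are hypothetical helpers. In the actual design the AuthClient from google-auth-library would attach the token rather than hand-built headers.

```typescript
const CODE_ASSIST_ENDPOINT = "https://cloudcode-pa.googleapis.com";
const CODE_ASSIST_API_VERSION = "v1internal";

// Method names as listed under Key Methods.
type CodeAssistMethod =
  | "generateContent"
  | "streamGenerateContent"
  | "countTokens"
  | "embedContent"
  | "loadCodeAssist"
  | "onboardUser";

// Assumption: methods are addressed as <base>/<version>:<method>.
function methodUrl(method: CodeAssistMethod): string {
  return `${CODE_ASSIST_ENDPOINT}/${CODE_ASSIST_API_VERSION}:${method}`;
}

// Standard OAuth 2.0 bearer-token headers for a JSON request.
function authHeaders(accessToken: string): Record<string, string> {
  return {
    Authorization: `Bearer ${accessToken}`,
    "Content-Type": "application/json",
  };
}
```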

Implementation Plan

1. Add Gemini OAuth Provider Support

File: packages/coding-agent/src/core/oauth/gemini.ts

import { OAuth2Client } from 'google-auth-library';
import { type OAuthCredentials, saveOAuthCredentials } from "./storage.js";

// OAuth credentials from google-gemini/gemini-cli:
// https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/code_assist/oauth2.ts
const SCOPES = [
  "https://www.googleapis.com/auth/cloud-platform",
  "https://www.googleapis.com/auth/userinfo.email",
  "https://www.googleapis.com/auth/userinfo.profile"
];

export async function loginGemini(
  onAuthUrl: (url: string) => void,
  onPromptCode: () => Promise<string>,
): Promise<void> {
  // Implementation similar to Anthropic but using google-auth-library
  throw new Error("loginGemini is not implemented yet");
}

export async function refreshGeminiToken(refreshToken: string): Promise<OAuthCredentials> {
  // Use google-auth-library for refresh
  throw new Error("refreshGeminiToken is not implemented yet");
}
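
One piece of the refresh path that can be sketched without the library is the mapping from Google's token response to the stored credential format. The field names access_token, refresh_token, and expiry_date follow the Credentials shape in google-auth-library. GoogleTokens and toOAuthCredentials are hypothetical names, and OAuthCredentials is redeclared here only to keep the sketch self-contained.

```typescript
// Subset of the token fields returned by google-auth-library.
interface GoogleTokens {
  access_token?: string | null;
  refresh_token?: string | null;
  expiry_date?: number | null; // epoch milliseconds
}

// Mirrors the stored format described under Token Storage.
interface OAuthCredentials {
  type: "oauth";
  refresh: string;
  access: string;
  expires: number;
}

// Google often omits refresh_token on refresh responses, so the previous
// refresh token is reused when no new one is returned.
function toOAuthCredentials(tokens: GoogleTokens, previousRefresh?: string): OAuthCredentials {
  const refresh = tokens.refresh_token ?? previousRefresh;
  if (!tokens.access_token || !refresh || !tokens.expiry_date) {
    throw new Error("Incomplete token response from Google");
  }
  return {
    type: "oauth",
    access: tokens.access_token,
    refresh,
    expires: tokens.expiry_date,
  };
}
```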

2. Update OAuth Index

File: packages/coding-agent/src/core/oauth/index.ts

export type SupportedOAuthProvider = "anthropic" | "github-copilot" | "gemini";

// Add Gemini to provider list
{
  id: "gemini",
  name: "Google Gemini (Code Assist)",
  available: true,
}

// Add cases for Gemini in login/refresh functions

3. Create Gemini API Client

File: packages/ai/src/providers/gemini-oauth.ts

export class GeminiOAuthProvider implements Provider {
  // Implement Provider interface
  // Use CodeAssistServer approach from Gemini CLI
  // Map to standard pi-ai API format
}

4. Update Model Configuration

File: packages/coding-agent/src/core/model-config.ts

// Add to providerToOAuthProvider mapping
gemini: "gemini",

// Add Gemini OAuth token check
if (model.provider === "gemini") {
  const oauthToken = await getOAuthToken("gemini");
  if (oauthToken) return oauthToken;
  const oauthEnv = process.env.GEMINI_OAUTH_TOKEN;
  if (oauthEnv) return oauthEnv;
}
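
The lookup order implied by the snippet above, with the existing GEMINI_API_KEY as the final fallback, can be written as a pure function. This is only a sketch: resolveGeminiCredential and ResolvedCredential are hypothetical names, and the stored token is passed in rather than read through getOAuthToken so the example stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result type that records which source supplied the credential.
interface ResolvedCredential {
  kind: "oauth" | "api-key";
  value: string;
}

// Order: stored OAuth token, then GEMINI_OAUTH_TOKEN, then GEMINI_API_KEY.
function resolveGeminiCredential(
  storedOAuthToken: string | undefined,
  env: Env,
): ResolvedCredential | undefined {
  if (storedOAuthToken) return { kind: "oauth", value: storedOAuthToken };
  const oauthEnv = env["GEMINI_OAUTH_TOKEN"];
  if (oauthEnv) return { kind: "oauth", value: oauthEnv };
  const apiKey = env["GEMINI_API_KEY"];
  if (apiKey) return { kind: "api-key", value: apiKey };
  return undefined;
}
```

Returning the kind alongside the value would let a caller choose between the Code Assist endpoint and the regular API-key path.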

5. Dependencies

Add to package.json:

{
  "dependencies": {
    "google-auth-library": "^9.0.0"
  }
}

6. Environment Variables

Support these environment variables:

GEMINI_OAUTH_TOKEN - Manual OAuth token
GOOGLE_CLOUD_PROJECT - For project-specific features
NO_BROWSER - Force user code flow

Key Differences from Anthropic Implementation

Authentication Library: Use google-auth-library instead of manual OAuth
Multiple Auth Types: Support OAuth, API key, and ADC
User Info: Fetch and cache user email/profile
Project Context: Include project ID in API calls
Tier Management: Handle user tier (FREE/STANDARD) responses

Challenges and Considerations

API Access: The Code Assist API (cloudcode-pa.googleapis.com) might require special access or be in preview
Model Naming: Need to map Gemini model names to Code Assist equivalents
Rate Limits: Handle tier-based rate limits
Error Handling: Map Google-specific errors to pi error types
Token Scopes: Ensure scopes are sufficient for all operations

Testing Plan

Test OAuth flow (browser and NO_BROWSER modes)
Test token refresh
Test API calls with OAuth tokens
Test fallback to API keys
Test error scenarios (invalid tokens, network errors)
Test model switching and tier limits

Migration Path

Users with GEMINI_API_KEY continue working unchanged
New pi auth login gemini command for OAuth
OAuth takes precedence over API keys when available
Clear messaging about benefits (higher limits, better features)
