# Pi Reader Browser Extension A cross-browser extension that provides an AI-powered reading assistant in a side panel (Chrome/Edge) or sidebar (Firefox), built with mini-lit components and Tailwind CSS v4. ## Browser Support - **Chrome/Edge** - Uses Side Panel API (Manifest V3) - **Firefox** - Uses Sidebar Action API (Manifest V3) - **Opera** - Sidebar support (untested but should work with Firefox manifest) ## Architecture ### High-Level Overview The extension is a full-featured AI chat interface that runs in your browser's side panel/sidebar. It can communicate with AI providers in two ways: 1. **Direct Mode** (default) - Calls AI provider APIs directly from the browser using API keys stored locally 2. **Proxy Mode** - Routes requests through a proxy server using an auth token **Browser Adaptation:** - **Chrome/Edge** - Side Panel API for dedicated panel UI - **Firefox** - Sidebar Action API for sidebar UI - **Page Content Access** - Uses `chrome.scripting.executeScript` to extract page text ### Core Architecture Layers ``` ┌─────────────────────────────────────────────────────────────┐ │ UI Layer (sidepanel.ts) │ │ ├─ Header (theme toggle, settings) │ │ └─ ChatPanel │ │ └─ AgentInterface (main chat UI) │ │ ├─ MessageList (stable messages) │ │ ├─ StreamingMessageContainer (live updates) │ │ └─ MessageEditor (input + attachments) │ └─────────────────────────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────────────────────────┐ │ State Layer (state/) │ │ └─ AgentSession │ │ ├─ Manages conversation state │ │ ├─ Coordinates transport │ │ └─ Handles tool execution │ └─────────────────────────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────────────────────────┐ │ Transport Layer (state/transports/) │ │ ├─ DirectTransport (uses KeyStore for API keys) │ │ └─ ProxyTransport (uses auth token + proxy server) │ └─────────────────────────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────────────────────────┐ │ AI Provider APIs / Proxy Server │ │ (Anthropic, OpenAI, Google, etc.) │ └─────────────────────────────────────────────────────────────┘ ``` ### Directory Structure by Responsibility ``` src/ ├── UI Components (what users see) │ ├── sidepanel.ts # App entry point, header │ ├── ChatPanel.ts # Main chat container, creates AgentSession │ ├── AgentInterface.ts # Complete chat UI (messages + input) │ ├── MessageList.ts # Renders stable messages │ ├── StreamingMessageContainer.ts # Handles streaming updates │ ├── Messages.ts # Message components (user, assistant, tool) │ ├── MessageEditor.ts # Input field with attachments │ ├── ConsoleBlock.ts # Console-style output display │ ├── AttachmentTile.ts # Attachment preview thumbnails │ ├── AttachmentOverlay.ts # Full-screen attachment viewer │ └── ModeToggle.ts # Toggle between document/text view │ ├── Dialogs (modal interactions) │ ├── dialogs/ │ │ ├── DialogBase.ts # Base class for all dialogs │ │ ├── ModelSelector.ts # Select AI model │ │ ├── ApiKeysDialog.ts # Manage API keys (for direct mode) │ │ └── PromptDialog.ts # Simple text input dialog │ ├── State Management (business logic) │ ├── state/ │ │ ├── agent-session.ts # Core state manager (pub/sub pattern) │ │ ├── KeyStore.ts # API key storage (Chrome local storage) │ │ └── transports/ │ │ ├── types.ts # Transport interface definitions │ │ ├── DirectTransport.ts # Direct API calls │ │ └── ProxyTransport.ts # Proxy server calls │ ├── Tools (AI function calling) │ ├── tools/ │ │ ├── types.ts # ToolRenderer interface │ │ ├── renderer-registry.ts # Global tool renderer registry │ │ ├── index.ts # Tool exports and registration │ │ └── renderers/ # Custom tool UI renderers │ │ ├── DefaultRenderer.ts # Fallback for unknown tools │ │ ├── CalculateRenderer.ts # Calculator tool UI │ │ ├── GetCurrentTimeRenderer.ts │ │ └── BashRenderer.ts # Bash command execution UI │ ├── Utilities (shared helpers) │ └── utils/ │ ├── attachment-utils.ts # PDF, Office, image processing │ ├── auth-token.ts # Proxy auth token management │ ├── format.ts # Token usage, cost formatting │ └── i18n.ts # Internationalization (EN + DE) │ └── Entry Points (browser integration) ├── background.ts # Service worker (opens side panel) ├── sidepanel.html # HTML entry point └── live-reload.ts # Hot reload during development ``` --- ## Common Development Tasks ### "I want to add a new AI tool" **Tools** are functions the AI can call (e.g., calculator, web search, code execution). Here's how to add one: #### 1. Define the Tool (use `@mariozechner/pi-ai`) Tools come from the `@mariozechner/pi-ai` package. Use existing tools or create custom ones: ```typescript // src/tools/my-custom-tool.ts import type { AgentTool } from "@mariozechner/pi-ai"; export const myCustomTool: AgentTool = { name: "my_custom_tool", label: "My Custom Tool", description: "Does something useful", parameters: { type: "object", properties: { input: { type: "string", description: "Input parameter" } }, required: ["input"] }, execute: async (params) => { // Your tool logic here const result = processInput(params.input); return { output: result, details: { /* any structured data */ } }; } }; ``` #### 2. Create a Custom Renderer (Optional) Renderers control how the tool appears in the chat. If you don't create one, `DefaultRenderer` will be used. ```typescript // src/tools/renderers/MyCustomRenderer.ts import { html } from "@mariozechner/mini-lit"; import type { ToolResultMessage } from "@mariozechner/pi-ai"; import type { ToolRenderer } from "../types.js"; export class MyCustomRenderer implements ToolRenderer { renderParams(params: any, isStreaming?: boolean) { // Show tool call parameters (e.g., "Searching for: ") return html`
${isStreaming ? "Processing..." : `Input: ${params.input}`}
`; } renderResult(params: any, result: ToolResultMessage) { // Show tool result (e.g., search results, calculation output) if (result.isError) { return html`
${result.output}
`; } return html`
Result:
${result.output}
`; } } ``` **Renderer Tips:** - Use `ConsoleBlock` for command output (see `BashRenderer.ts`) - Use `` for code/JSON (from `@mariozechner/mini-lit`) - Use `` for markdown content - Check `isStreaming` to show loading states #### 3. Register the Tool and Renderer ```typescript // src/tools/index.ts import { myCustomTool } from "./my-custom-tool.js"; import { MyCustomRenderer } from "./renderers/MyCustomRenderer.js"; import { registerToolRenderer } from "./renderer-registry.js"; // Register the renderer registerToolRenderer("my_custom_tool", new MyCustomRenderer()); // Export the tool so ChatPanel can use it export { myCustomTool }; ``` #### 4. Add Tool to ChatPanel ```typescript // src/ChatPanel.ts import { myCustomTool } from "./tools/index.js"; // In AgentSession constructor: this.session = new AgentSession({ initialState: { tools: [calculateTool, getCurrentTimeTool, myCustomTool], // Add here // ... } }); ``` **File Locations:** - Tool definition: `src/tools/my-custom-tool.ts` - Tool renderer: `src/tools/renderers/MyCustomRenderer.ts` - Registration: `src/tools/index.ts` (register renderer) - Integration: `src/ChatPanel.ts` (add to tools array) --- ### "I want to change how messages are displayed" **Message components** control how conversations appear: - **User messages**: Edit `UserMessage` in `src/Messages.ts` - **Assistant messages**: Edit `AssistantMessage` in `src/Messages.ts` - **Tool call cards**: Edit `ToolMessage` in `src/Messages.ts` - **Markdown rendering**: Comes from `@mariozechner/mini-lit` (can't customize easily) - **Code blocks**: Comes from `@mariozechner/mini-lit` (can't customize easily) **Example: Change user message styling** ```typescript // src/Messages.ts - in UserMessage component render() { return html`
`; } ``` --- ### "I want to add a new model provider" Models come from `@mariozechner/pi-ai`. The package supports: - `anthropic` (Claude) - `openai` (GPT) - `google` (Gemini) - `groq`, `cerebras`, `xai`, `openrouter`, etc. **To add a provider:** 1. Ensure `@mariozechner/pi-ai` supports it (check package docs) 2. Add API key configuration in `src/dialogs/ApiKeysDialog.ts`: - Add provider to `PROVIDERS` array - Add test model to `TEST_MODELS` object 3. Users can then select models via the model selector **No code changes needed** - the extension auto-discovers all models from `@mariozechner/pi-ai`. --- ### "I want to modify the transport layer" **Transport** determines how requests reach AI providers: #### Direct Mode (Default) - **File**: `src/state/transports/DirectTransport.ts` - **How it works**: Gets API keys from `KeyStore` → calls provider APIs directly - **When to use**: Local development, no proxy server - **Configuration**: API keys stored in Chrome local storage #### Proxy Mode - **File**: `src/state/transports/ProxyTransport.ts` - **How it works**: Gets auth token → sends request to proxy server → proxy calls providers - **When to use**: Want to hide API keys, centralized auth, usage tracking - **Configuration**: Auth token stored in localStorage, proxy URL hardcoded **Switch transport mode in ChatPanel:** ```typescript // src/ChatPanel.ts this.session = new AgentSession({ transportMode: "direct", // or "proxy" authTokenProvider: async () => getAuthToken(), // Only needed for proxy // ... }); ``` **Proxy Server Requirements:** - Must accept POST to `/api/stream` endpoint - Request format: `{ model, context, options }` - Response format: SSE stream with delta events - See `ProxyTransport.ts` for expected event types **To add a new transport:** 1. Create `src/state/transports/MyTransport.ts` 2. Implement `AgentTransport` interface: ```typescript async *run(userMessage, cfg, signal): AsyncIterable ``` 3. Register in `ChatPanel.ts` constructor --- ### "I want to change the system prompt" **System prompts** guide the AI's behavior. Change in `ChatPanel.ts`: ```typescript // src/ChatPanel.ts this.session = new AgentSession({ initialState: { systemPrompt: "You are a helpful AI assistant specialized in code review.", // ... } }); ``` Or make it dynamic: ```typescript // Read from storage, settings dialog, etc. const systemPrompt = await chrome.storage.local.get("system-prompt"); ``` --- ### "I want to add attachment support for a new file type" **Attachment processing** happens in `src/utils/attachment-utils.ts`: 1. **Add file type detection** in `loadAttachment()`: ```typescript if (mimeType === "application/my-format" || fileName.endsWith(".myext")) { const { extractedText } = await processMyFormat(arrayBuffer, fileName); return { id, type: "document", fileName, mimeType, content, extractedText }; } ``` 2. **Add processor function**: ```typescript async function processMyFormat(buffer: ArrayBuffer, fileName: string) { // Extract text from your format const text = extractTextFromMyFormat(buffer); return { extractedText: `\n${text}\n` }; } ``` 3. **Update accepted types** in `MessageEditor.ts`: ```typescript acceptedTypes = "image/*,application/pdf,.myext,..."; ``` 4. **Optional: Add preview support** in `AttachmentOverlay.ts` **Supported formats:** - Images: All image/* (preview support) - PDF: Text extraction + thumbnail generation - Office: DOCX, PPTX, XLSX (text extraction) - Text: .txt, .md, .json, .xml, etc. --- ### "I want to customize the UI theme" The extension uses the **Claude theme** from `@mariozechner/mini-lit`. Colors are defined via CSS variables: **Option 1: Override theme variables** ```css /* src/app.css */ @layer base { :root { --primary: 210 100% 50%; /* Custom blue */ --radius: 0.5rem; } } ``` **Option 2: Use a different mini-lit theme** ```css /* src/app.css */ @import "@mariozechner/mini-lit/themes/default.css"; /* Instead of claude.css */ ``` **Available variables:** - `--background`, `--foreground` - Base colors - `--card`, `--card-foreground` - Card backgrounds - `--primary`, `--primary-foreground` - Primary actions - `--muted`, `--muted-foreground` - Secondary elements - `--accent`, `--accent-foreground` - Hover states - `--destructive` - Error/delete actions - `--border`, `--input` - Border colors - `--radius` - Border radius --- ### "I want to add a new settings option" Settings currently managed via dialogs. To add persistent settings: #### 1. Create storage helpers ```typescript // src/utils/config.ts (create this file) export async function getMySetting(): Promise { const result = await chrome.storage.local.get("my-setting"); return result["my-setting"] || "default-value"; } export async function setMySetting(value: string): Promise { await chrome.storage.local.set({ "my-setting": value }); } ``` #### 2. Create or extend settings dialog ```typescript // src/dialogs/SettingsDialog.ts (create this file, similar to ApiKeysDialog) // Add UI for your setting // Call getMySetting() / setMySetting() on save ``` #### 3. Open from header ```typescript // src/sidepanel.ts - in settings button onClick SettingsDialog.open(); ``` #### 4. Use in ChatPanel ```typescript // src/ChatPanel.ts const mySetting = await getMySetting(); this.session = new AgentSession({ initialState: { /* use mySetting */ } }); ``` --- ### "I want to access the current page content" Page content extraction is in `sidepanel.ts`: ```typescript // Example: Get page text const [tab] = await chrome.tabs.query({ active: true, currentWindow: true }); const results = await chrome.scripting.executeScript({ target: { tabId: tab.id }, func: () => document.body.innerText, }); const pageText = results[0].result; ``` **To use in chat:** 1. Extract page content in `ChatPanel` 2. Add to system prompt or first user message 3. Or create a tool that reads page content **Permissions required:** - `activeTab` - Access current tab - `scripting` - Execute scripts in pages - Already configured in `manifest.*.json` --- ## Transport Modes Explained ### Direct Mode (Default) **Flow:** ``` Browser Extension → KeyStore (get API key) → DirectTransport → Provider API (Anthropic/OpenAI/etc.) → Stream response back ``` **Pros:** - No external dependencies - Lower latency (direct connection) - Works offline for API key management - Full control over requests **Cons:** - API keys stored in browser (secure, but local) - Each user needs their own API keys - CORS restrictions (some providers may not work) - Can't track usage centrally **Setup:** 1. Open extension → Settings → Manage API Keys 2. Add keys for desired providers (Anthropic, OpenAI, etc.) 3. Select model and start chatting **Files involved:** - `src/state/transports/DirectTransport.ts` - Transport implementation - `src/state/KeyStore.ts` - API key storage - `src/dialogs/ApiKeysDialog.ts` - API key UI --- ### Proxy Mode **Flow:** ``` Browser Extension → Auth Token (from localStorage) → ProxyTransport → Proxy Server (https://genai.mariozechner.at or custom) → Provider API → Stream response back through proxy ``` **Pros:** - No API keys in browser - Centralized auth/usage tracking - Can implement rate limiting, quotas - Custom logic server-side - No CORS issues **Cons:** - Requires proxy server setup - Additional network hop (latency) - Dependency on proxy availability - Need to manage auth tokens **Setup:** 1. Get auth token from proxy server admin 2. Extension prompts for token on first use 3. Token stored in localStorage 4. Start chatting (proxy handles provider APIs) **Proxy URL Configuration:** Currently hardcoded in `ProxyTransport.ts`: ```typescript const PROXY_URL = "https://genai.mariozechner.at"; ``` To make configurable: 1. Add storage helper in `utils/config.ts` 2. Add UI in SettingsDialog 3. Pass to ProxyTransport constructor **Proxy Server Requirements:** The proxy server must implement: **Endpoint:** `POST /api/stream` **Request:** ```typescript { model: Model, // Provider + model ID context: Context, // System prompt, messages, tools options: { temperature?: number, maxTokens?: number, reasoning?: string } } ``` **Response:** SSE (Server-Sent Events) stream **Event Types:** ```typescript data: {"type":"start","partial":{...}} data: {"type":"text_start","contentIndex":0} data: {"type":"text_delta","contentIndex":0,"delta":"Hello"} data: {"type":"text_end","contentIndex":0,"contentSignature":"..."} data: {"type":"thinking_start","contentIndex":1} data: {"type":"thinking_delta","contentIndex":1,"delta":"..."} data: {"type":"toolcall_start","contentIndex":2,"id":"...","toolName":"..."} data: {"type":"toolcall_delta","contentIndex":2,"delta":"..."} data: {"type":"toolcall_end","contentIndex":2} data: {"type":"done","reason":"stop","usage":{...}} ``` **Auth:** Bearer token in `Authorization` header **Error Handling:** - Return 401 for invalid auth → extension clears token and re-prompts - Return 4xx/5xx with JSON: `{"error":"message"}` **Reference Implementation:** See `src/state/transports/ProxyTransport.ts` for full event parsing logic. --- ### Switching Between Modes **At runtime** (in ChatPanel): ```typescript const mode = await getTransportMode(); // "direct" or "proxy" this.session = new AgentSession({ transportMode: mode, authTokenProvider: mode === "proxy" ? async () => getAuthToken() : undefined, // ... }); ``` **Storage helpers** (create these): ```typescript // src/utils/config.ts export type TransportMode = "direct" | "proxy"; export async function getTransportMode(): Promise { const result = await chrome.storage.local.get("transport-mode"); return (result["transport-mode"] as TransportMode) || "direct"; } export async function setTransportMode(mode: TransportMode): Promise { await chrome.storage.local.set({ "transport-mode": mode }); } ``` **UI for switching** (create this): ```typescript // src/dialogs/SettingsDialog.ts // Radio buttons: ○ Direct (use API keys) / ○ Proxy (use auth token) // On save: setTransportMode(), reload AgentSession ``` --- ## Understanding mini-lit Before working on the UI, read these files to understand the component library: - `node_modules/@mariozechner/mini-lit/README.md` - Complete component documentation - `node_modules/@mariozechner/mini-lit/llms.txt` - LLM-friendly component reference - `node_modules/@mariozechner/mini-lit/dist/*.ts` - Source files for specific components Key concepts: - **Functional Components** - Stateless functions that return `TemplateResult` (Button, Badge, etc.) - **Custom Elements** - Stateful LitElement classes (``, ``, etc.) - **Reactive State** - Use `createState()` for reactive UI updates - **Claude Theme** - We use the Claude theme from mini-lit ## Project Structure ``` packages/browser-extension/ ├── src/ │ ├── app.css # Tailwind v4 entry point with Claude theme │ ├── background.ts # Service worker for opening side panel │ ├── sidepanel.html # Side panel HTML entry point │ └── sidepanel.ts # Main side panel app with hot reload ├── scripts/ │ ├── build.mjs # esbuild bundler configuration │ └── dev-server.mjs # WebSocket server for hot reloading ├── manifest.chrome.json # Chrome/Edge manifest ├── manifest.firefox.json # Firefox manifest ├── icon-*.png # Extension icons ├── dist-chrome/ # Chrome build (git-ignored) └── dist-firefox/ # Firefox build (git-ignored) ``` ## Development Setup ### Prerequisites 1. Install dependencies from monorepo root: ```bash npm install ``` 2. Build the extension: ```bash # Build for both browsers npm run build -w @mariozechner/pi-reader-extension # Or build for specific browser npm run build:chrome -w @mariozechner/pi-reader-extension npm run build:firefox -w @mariozechner/pi-reader-extension ``` 3. Load the extension: **Chrome/Edge:** - Open `chrome://extensions/` or `edge://extensions/` - Enable "Developer mode" - Click "Load unpacked" - Select `packages/browser-extension/dist-chrome/` **Firefox:** - Open `about:debugging` - Click "This Firefox" - Click "Load Temporary Add-on" - Select any file in `packages/browser-extension/dist-firefox/` ### Development Workflow 1. **Start the dev server** (from monorepo root): ```bash # For Chrome development npm run dev -w @mariozechner/pi-reader-extension # For Firefox development npm run dev:firefox -w @mariozechner/pi-reader-extension ``` This runs three processes in parallel: - **esbuild** - Watches and rebuilds TypeScript files - **Tailwind CSS v4** - Watches and rebuilds styles - **WebSocket server** - Watches dist/ and triggers extension reload 2. **Automatic reloading**: - Any change to source files triggers a rebuild - The WebSocket server detects dist/ changes - Side panel connects to `ws://localhost:8765` - Extension auto-reloads via `chrome.runtime.reload()` 3. **Open the side panel**: - Click the extension icon in Chrome toolbar - Or use Chrome's side panel button (top-right) ## Key Files ### `src/sidepanel.ts` Main application logic: - Extracts page content via `chrome.scripting.executeScript` - Manages chat UI with mini-lit components - Handles WebSocket connection for hot reload - Direct AI API calls (no background worker needed) ### `src/app.css` Tailwind v4 configuration: - Imports Claude theme from mini-lit - Uses `@source` directive to scan mini-lit components - Compiled to `dist/app.css` during build ### `scripts/build.mjs` Build configuration: - Uses esbuild for fast TypeScript bundling - Copies static files (HTML, manifest, icons) - Supports watch mode for development ### `scripts/dev-server.mjs` Hot reload server: - WebSocket server on port 8765 - Watches `dist/` directory for changes - Sends reload messages to connected clients ## Working with mini-lit Components ### Basic Usage Read `../../mini-lit/llms.txt` and `../../mini-lit/README.md` in full. If in doubt, find the component in `../../mini-lit/src/` and read its source file in full. ### Tailwind Classes All standard Tailwind utilities work, plus mini-lit's theme variables: - `bg-background`, `text-foreground` - Theme-aware colors - `bg-card`, `border-border` - Component backgrounds - `text-muted-foreground` - Secondary text - `bg-primary`, `text-primary-foreground` - Primary actions ## Troubleshooting ### Extension doesn't reload automatically - Check WebSocket server is running (port 8765) - Check console for connection errors - Manually reload at `chrome://extensions/` ### Side panel doesn't open - Check manifest permissions - Ensure background service worker is loaded - Try clicking extension icon directly ### Styles not updating - Ensure Tailwind watcher is running - Check `src/app.css` imports - Clear Chrome extension cache ## Building for Production ```bash npm run build -w @mariozechner/pi-reader-extension ``` This creates an optimized build in `dist/` without hot reload code.