Rewrite agent README with clearer structure and event flow documentation

Mario Zechner 2025-12-30 23:40:49 +01:00
parent 74637403b6
commit ae55389051

# @mariozechner/pi-agent

Stateful agent with tool execution and event streaming. Built on `@mariozechner/pi-ai`.

## Installation

```bash
npm install @mariozechner/pi-agent
```
## Quick Start

```typescript
import { Agent } from "@mariozechner/pi-agent";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update") {
    // Stream assistant response
    for (const block of event.message.content) {
      if (block.type === "text") process.stdout.write(block.text);
    }
  }
});

await agent.prompt("Hello!");
```
## Core Concepts

### AgentMessage vs LLM Message

The agent works with `AgentMessage`, a flexible type that can include:

- Standard LLM messages (`user`, `assistant`, `toolResult`)
- Custom app-specific message types via declaration merging

LLMs only understand `user`, `assistant`, and `toolResult`. The `convertToLlm` function bridges this gap by filtering and transforming messages before each LLM call.
### Message Flow

```
AgentMessage[] → transformContext() → AgentMessage[] → convertToLlm() → Message[] → LLM
                 (optional)                            (required)
```

1. **transformContext**: Prune old messages, inject external context
2. **convertToLlm**: Filter out UI-only messages, convert custom types to LLM format
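As a sketch, the two stages can be written as plain functions over a simplified message shape (`Msg` is a stand-in for `AgentMessage`, and both function names are illustrative, not part of the library):

```typescript
// Simplified stand-in for AgentMessage; the real type comes from the library.
type Msg = { role: string; content: string; timestamp: number };

// transformContext stage: naive pruning to the most recent N messages.
function pruneOldMessages(messages: Msg[], keep = 20): Msg[] {
  return messages.slice(-keep);
}

// convertToLlm stage: drop UI-only messages so the LLM never sees them.
function toLlm(messages: Msg[]): Msg[] {
  return messages.flatMap((m) => (m.role === "notification" ? [] : [m]));
}
```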
## Event Flow

The agent emits events for UI updates. Understanding the event sequence helps build responsive interfaces.

### prompt() Event Sequence

When you call `prompt("Hello")`:

```
prompt("Hello")
├─ agent_start
├─ turn_start
├─ message_start { message: userMessage }       // Your prompt
├─ message_end { message: userMessage }
├─ message_start { message: assistantMessage }  // LLM starts responding
├─ message_update { message: partial... }       // Streaming chunks
├─ message_update { message: partial... }
├─ message_end { message: assistantMessage }    // Complete response
├─ turn_end { message, toolResults: [] }
└─ agent_end { messages: [...] }
```

### With Tool Calls

If the assistant calls tools, the loop continues:

```
prompt("Read config.json")
├─ agent_start
├─ turn_start
├─ message_start/end { userMessage }
├─ message_start { assistantMessage with toolCall }
├─ message_update...
├─ message_end { assistantMessage }
├─ tool_execution_start { toolCallId, toolName, args }
├─ tool_execution_update { partialResult }      // If tool streams
├─ tool_execution_end { toolCallId, result }
├─ message_start/end { toolResultMessage }
├─ turn_end { message, toolResults: [toolResult] }
├─ turn_start                                   // Next turn
├─ message_start { assistantMessage }           // LLM responds to tool result
├─ message_update...
├─ message_end
├─ turn_end
└─ agent_end
```
### continue() Event Sequence

`continue()` resumes from the existing context without adding a new message. Use it for retries after errors.

```typescript
// After an error, retry from current state
await agent.continue();
```

The last message in context must be `user` or `toolResult` (not `assistant`): the LLM expects to respond to a user or tool result, not to its own assistant message.
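If you want to guard against an invalid tail before retrying, the check is a one-liner over the message roles. A sketch (`canContinue` is a hypothetical helper, not part of the library, and the message shape is simplified):

```typescript
// Hypothetical guard: continue() is only valid when the last message
// converts to a user or toolResult message.
type Tail = { role: string };

function canContinue(messages: Tail[]): boolean {
  const last = messages[messages.length - 1];
  if (!last) return false;
  return last.role === "user" || last.role === "toolResult";
}
```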
### Event Types

| Event | Description |
|-------|-------------|
| `agent_start` | Agent begins processing |
| `agent_end` | Agent completes with all new messages |
| `turn_start` | New turn begins (one LLM call + tool executions) |
| `turn_end` | Turn completes with assistant message and tool results |
| `message_start` | Any message begins (user, assistant, toolResult) |
| `message_update` | **Assistant only.** Partial message during streaming |
| `message_end` | Message completes |
| `tool_execution_start` | Tool begins |
| `tool_execution_update` | Tool streams progress |
| `tool_execution_end` | Tool completes |
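Beyond the streaming-only handler in Quick Start, a subscriber usually switches on the event type. A sketch that maps events to log lines (event shapes are simplified to just the fields used here):

```typescript
// Simplified event union covering the table above.
type Ev =
  | { type: "message_start"; message: { role: string } }
  | { type: "message_end"; message: { role: string } }
  | { type: "tool_execution_start"; toolName: string }
  | { type: "agent_start" }
  | { type: "agent_end" }
  | { type: "turn_start" }
  | { type: "turn_end" };

function describe(event: Ev): string {
  switch (event.type) {
    case "message_start":
      return `${event.message.role} message started`;
    case "message_end":
      return `${event.message.role} message complete`;
    case "tool_execution_start":
      return `Calling ${event.toolName}...`;
    default:
      // Lifecycle events carry no payload worth printing here
      return event.type;
  }
}
```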
## Agent Options

```typescript
const agent = new Agent({
  initialState: {
    systemPrompt: string,
    model: Model<any>,
    thinkingLevel: "off" | "minimal" | "low" | "medium" | "high" | "xhigh",
    tools: AgentTool<any>[],
    messages: AgentMessage[],
  },

  // Convert AgentMessage[] to LLM Message[] (required for custom message types)
  convertToLlm: (messages) => messages.filter(...),

  // Transform context before convertToLlm (for pruning, compaction)
  transformContext: async (messages, signal) => pruneOldMessages(messages),

  // How to handle queued messages: "one-at-a-time" (default) or "all"
  queueMode: "one-at-a-time",

  // Custom stream function (for proxy backends)
  streamFn: streamProxy,

  // Dynamic API key resolution (for expiring OAuth tokens)
  getApiKey: async (provider) => refreshToken(),
});
```
## Agent State

```typescript
interface AgentState {
  systemPrompt: string;
  model: Model<any>;
  thinkingLevel: ThinkingLevel;
  tools: AgentTool<any>[];
  messages: AgentMessage[];
  isStreaming: boolean;
  streamMessage: AgentMessage | null; // Current partial during streaming
  pendingToolCalls: Set<string>;
  error?: string;
}
```
Access via `agent.state`. During streaming, `streamMessage` contains the partial assistant message.
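This lets you render the partial outside the event handler, for example in a render loop or reactive binding. A minimal selector sketch (state shape simplified to the two fields involved):

```typescript
// Simplified slice of AgentState.
type StreamState = { isStreaming: boolean; streamMessage: { role: string } | null };

// Returns the message to render right now, or null when nothing is streaming.
function currentPartial(state: StreamState): { role: string } | null {
  return state.isStreaming && state.streamMessage ? state.streamMessage : null;
}
```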
## Methods

### Prompting

```typescript
// Text prompt
await agent.prompt("Hello");

// With images
await agent.prompt("What's in this image?", [
  { type: "image", data: base64Data, mimeType: "image/jpeg" }
]);

// AgentMessage directly
await agent.prompt({ role: "user", content: "Hello", timestamp: Date.now() });

// Continue from current context (last message must be user or toolResult)
await agent.continue();
```

### State Management

```typescript
agent.setSystemPrompt("New prompt");
agent.setModel(getModel("openai", "gpt-4o"));
agent.setThinkingLevel("medium");
agent.setTools([myTool]);
agent.replaceMessages(newMessages);
agent.appendMessage(message);
agent.clearMessages();
agent.reset(); // Clear everything
```

### Control

```typescript
agent.abort(); // Cancel current operation
await agent.waitForIdle(); // Wait for completion
```

### Events

```typescript
const unsubscribe = agent.subscribe((event) => {
  console.log(event.type);
});
unsubscribe();
```
## Message Queue

Queue messages to inject during tool execution (for user interruptions):

```typescript
agent.setQueueMode("one-at-a-time");

// While agent is running tools
agent.queueMessage({
  role: "user",
  content: "Stop! Do this instead.",
  timestamp: Date.now(),
});
```

When queued messages are detected after a tool completes:

1. Remaining tools are skipped with error results ("Skipped due to queued user message")
2. The queued message is injected
3. The LLM responds to the interruption
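The skip step can be pictured as a pure function over the not-yet-run tool calls. A sketch of the idea only; the real logic lives inside the agent loop and its result shape may differ:

```typescript
type SkippedResult = { toolCallId: string; isError: boolean; text: string };

// Given tool calls that have not run yet, produce the error results the
// agent substitutes when a queued message interrupts the turn.
function skipPending(pendingIds: string[]): SkippedResult[] {
  return pendingIds.map((toolCallId) => ({
    toolCallId,
    isError: true,
    text: "Skipped due to queued user message",
  }));
}
```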
## Custom Message Types

Extend `AgentMessage` via declaration merging:

```typescript
declare module "@mariozechner/pi-agent" {
  interface CustomAgentMessages {
    notification: { role: "notification"; text: string; timestamp: number };
  }
}

// Now valid
const msg: AgentMessage = { role: "notification", text: "Info", timestamp: Date.now() };
```

Handle custom types in `convertToLlm`:

```typescript
const agent = new Agent({
  convertToLlm: (messages) => messages.flatMap(m => {
    if (m.role === "notification") return []; // Filter out
    return [m];
  }),
});
```
## Tools

Tools extend `Tool` from pi-ai with an `execute` function:

```typescript
import fs from "node:fs/promises";
import { Type } from "@sinclair/typebox";

const readFileTool: AgentTool = {
  name: "read_file",
  label: "Read File", // For UI display
  description: "Read a file's contents",
  parameters: Type.Object({
    path: Type.String({ description: "File path" }),
  }),
  execute: async (toolCallId, params, signal, onUpdate) => {
    // Optional: stream progress while the work runs
    onUpdate?.({ content: [{ type: "text", text: "Reading..." }], details: {} });

    const content = await fs.readFile(params.path, "utf-8");

    return {
      content: [{ type: "text", text: content }],
      details: { path: params.path, size: content.length },
    };
  },
};

agent.setTools([readFileTool]);
```
## Proxy Usage

For browser apps that proxy through a backend:

```typescript
import { Agent, streamProxy } from "@mariozechner/pi-agent";

const agent = new Agent({
  streamFn: (model, context, options) =>
    streamProxy(model, context, {
      ...options,
      authToken: "...",
      proxyUrl: "https://your-server.com",
    }),
});
```
## Low-Level API

For direct control without the Agent class:

```typescript
import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent";
import { getModel } from "@mariozechner/pi-ai";

const context: AgentContext = {
  systemPrompt: "You are helpful.",
  messages: [],
  tools: [],
};

const config: AgentLoopConfig = {
  model: getModel("openai", "gpt-4o"),
  convertToLlm: (msgs) => msgs.filter(m => ["user", "assistant", "toolResult"].includes(m.role)),
};

const userMessage = { role: "user", content: "Hello", timestamp: Date.now() };

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}

// Continue from existing context
for await (const event of agentLoopContinue(context, config)) {
  console.log(event.type);
}
```
## License

MIT