mirror of https://github.com/getcompanion-ai/co-mono.git synced 2026-04-15 06:04:40 +00:00

Mario Zechner d103af4ca2 Fix: cumulative file tracking applies to both compaction and branch summarization

2025-12-31 12:38:25 +01:00

12 KiB


Compaction & Branch Summarization

LLMs have limited context windows. When a conversation grows too long, pi uses compaction to summarize older content while preserving recent work. This page covers compaction (automatic and manual) and branch summarization.

Overview

Pi has two summarization mechanisms:

| Mechanism | Trigger | Purpose |
|-----------|---------|---------|
| Compaction | Context exceeds threshold, or `/compact` | Summarize old messages to free up context |
| Branch summarization | `/tree` navigation | Preserve context when switching branches |

Both use the same structured summary format and track file operations cumulatively.

Compaction

When It Triggers

Auto-compaction triggers when:

contextTokens > contextWindow - reserveTokens

By default, reserveTokens is 16384 tokens (configurable in ~/.pi/agent/settings.json). This leaves room for the LLM's response.
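
As a minimal sketch, the trigger condition can be written as a predicate. The function name `shouldAutoCompact` and the 200000-token context window are illustrative assumptions, not part of pi's API; only the inequality and the 16384 default come from the text above.

```typescript
// Hypothetical helper: the name and signature are illustrative, not pi's API.
function shouldAutoCompact(
  contextTokens: number,
  contextWindow: number,
  reserveTokens: number = 16384, // default stated above
): boolean {
  // Compaction triggers once the context no longer leaves room for the reserve.
  return contextTokens > contextWindow - reserveTokens;
}

// With an assumed 200000-token window the threshold is 200000 - 16384 = 183616.
const atThreshold = shouldAutoCompact(183616, 200000); // false: not strictly greater
const overThreshold = shouldAutoCompact(183617, 200000); // true
```

The comparison is strict, so a context that sits exactly at the threshold does not trigger compaction.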

You can also trigger compaction manually with /compact [instructions]. The optional instructions tell the summarizer what to focus on.

How It Works

1. Find cut point: Walk backwards from the newest message, accumulating token estimates until keepRecentTokens (default 20000, configurable in ~/.pi/agent/settings.json) is reached.
2. Extract messages: Collect the messages from the previous compaction (or the start of the session) up to the cut point.
3. Generate summary: Call the LLM to summarize those messages in the structured format.
4. Append entry: Save a CompactionEntry with the summary and firstKeptEntryId.
5. Reload: The session reloads, using the summary plus the messages from firstKeptEntryId onwards.

Before compaction:

  entry:  0     1     2     3      4     5     6      7      8     9
        ┌─────┬─────┬─────┬──────┬─────┬─────┬──────┬──────┬─────┬──────┐
        │ hdr │ usr │ ass │ tool │ usr │ ass │ tool │ tool │ ass │ tool │
        └─────┴─────┴─────┴──────┴─────┴─────┴──────┴──────┴─────┴──────┘
               └────────┬───────┘ └──────────────────┬─────────────────┘
               messagesToSummarize             kept messages
                                   ↑
                          firstKeptEntryId (entry 4)

After compaction (new entry appended):

  entry:  0     1     2     3      4     5     6      7      8     9      10
        ┌─────┬─────┬─────┬──────┬─────┬─────┬──────┬──────┬─────┬──────┬─────┐
        │ hdr │ usr │ ass │ tool │ usr │ ass │ tool │ tool │ ass │ tool │ cmp │
        └─────┴─────┴─────┴──────┴─────┴─────┴──────┴──────┴─────┴──────┴─────┘
               └────────┬───────┘ └─────────────────────┬────────────────────┘
                 not sent to LLM                   sent to LLM
                                                        ↑
                                          starts from firstKeptEntryId

What the LLM sees:

  ┌────────┬─────────┬─────┬─────┬──────┬──────┬─────┬──────┐
  │ system │ summary │ usr │ ass │ tool │ tool │ ass │ tool │
  └────────┴─────────┴─────┴─────┴──────┴──────┴─────┴──────┘
       ↑         ↑      └─────────────────┬────────────────┘
    prompt   from cmp          messages from firstKeptEntryId
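
The reload step can be sketched as follows: take the latest compaction entry, then keep only the entries from firstKeptEntryId onwards. The `CtxEntry` shape, the `"header"` type string, and the function name are simplifying assumptions for illustration; pi's real session types carry more fields.

```typescript
// Simplified entry shape (an assumption); only the fields this sketch needs.
interface CtxEntry {
  id: string;
  type: string; // e.g. "header", "user", "assistant", "tool", "compaction"
  summary?: string; // present on compaction entries
  firstKeptEntryId?: string; // present on compaction entries
}

// Returns the summary (if any) and the ids of the entries sent to the LLM.
function buildContextSketch(
  entries: CtxEntry[],
): { summary: string | undefined; keptIds: string[] } {
  // The most recent compaction entry wins.
  let latest: CtxEntry | undefined;
  for (const entry of entries) {
    if (entry.type === "compaction") latest = entry;
  }
  // Header and compaction entries are bookkeeping, not messages.
  const isMessage = (e: CtxEntry) => e.type !== "header" && e.type !== "compaction";
  if (!latest) {
    return { summary: undefined, keptIds: entries.filter(isMessage).map((e) => e.id) };
  }
  const firstKeptId = latest.firstKeptEntryId;
  // Fall back to the start if the id is missing (defensive choice of this sketch).
  const start = Math.max(0, entries.findIndex((e) => e.id === firstKeptId));
  return {
    summary: latest.summary,
    keptIds: entries.slice(start).filter(isMessage).map((e) => e.id),
  };
}

// The eleven entries from the "after compaction" diagram.
const diagramTypes = ["header", "user", "assistant", "tool", "user", "assistant",
  "tool", "tool", "assistant", "tool"];
const diagramEntries: CtxEntry[] = diagramTypes.map((type, i) => ({ id: String(i), type }));
diagramEntries.push({ id: "10", type: "compaction", summary: "(summary text)", firstKeptEntryId: "4" });

const llmContext = buildContextSketch(diagramEntries);
// llmContext.keptIds is ["4", "5", "6", "7", "8", "9"]
```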

Split Turns

A "turn" starts with a user message and includes all assistant responses and tool calls until the next user message. Normally, compaction cuts at turn boundaries.

When a single turn exceeds keepRecentTokens, the cut point lands mid-turn at an assistant message. This is a "split turn":

Split turn (one huge turn exceeds budget):

  entry:  0     1     2      3     4      5      6     7      8
        ┌─────┬─────┬─────┬──────┬─────┬──────┬──────┬─────┬──────┐
        │ hdr │ usr │ ass │ tool │ ass │ tool │ tool │ ass │ tool │
        └─────┴─────┴─────┴──────┴─────┴──────┴──────┴─────┴──────┘
                ↑                                     ↑
         turnStartIndex = 1                  firstKeptEntryId = 7
                │                                     │
                └──── turnPrefixMessages (1-6) ───────┘
                                                      └── kept (7-8)

  isSplitTurn = true
  messagesToSummarize = []  (no complete turns before)
  turnPrefixMessages = [usr, ass, tool, ass, tool, tool]

For split turns, pi generates two summaries and merges them:

History summary: Previous context (if any)
Turn prefix summary: The early part of the split turn

Cut Point Rules

Valid cut points are:

User messages
Assistant messages
BashExecution messages
Hook messages (custom_message, branch_summary)

Pi never cuts at a tool result, because a tool result must stay with its tool call.
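
The cut-point search and the split-turn check can be sketched together. This is an illustration under stated assumptions, not pi's implementation: each entry is assumed to carry a pre-computed `tokens` estimate, the type strings are hypothetical, and when the budget is reached on an invalid entry the sketch moves the cut forward to the next valid cut point.

```typescript
// Hypothetical entry types; "toolResult" and "header" are never valid cut points.
type CutEntryType =
  | "header" | "user" | "assistant" | "toolResult"
  | "bashExecution" | "custom_message" | "branch_summary";

interface CutEntry {
  type: CutEntryType;
  tokens: number; // assumed pre-computed token estimate
}

const VALID_CUT_TYPES: ReadonlySet<CutEntryType> = new Set<CutEntryType>([
  "user", "assistant", "bashExecution", "custom_message", "branch_summary",
]);

interface CutPointSketch {
  firstKeptIndex: number; // index of the first kept entry
  turnStartIndex: number; // start of the split turn, or -1
  isSplitTurn: boolean;
}

function findCutPointSketch(entries: CutEntry[], keepRecentTokens: number): CutPointSketch {
  // Walk backwards, accumulating estimates until the budget is reached.
  let accumulated = 0;
  let budgetIndex = -1;
  for (let i = entries.length - 1; i >= 0; i--) {
    accumulated += entries[i]?.tokens ?? 0;
    if (accumulated >= keepRecentTokens) { budgetIndex = i; break; }
  }
  // Budget never reached: keep everything, nothing to summarize.
  if (budgetIndex === -1) return { firstKeptIndex: 0, turnStartIndex: -1, isSplitTurn: false };

  // Move forward to the nearest valid cut point (assumption of this sketch).
  let cut = budgetIndex;
  while (cut < entries.length) {
    const candidate = entries[cut];
    if (candidate && VALID_CUT_TYPES.has(candidate.type)) break;
    cut++;
  }
  if (cut >= entries.length || entries[cut]?.type === "user") {
    return { firstKeptIndex: cut, turnStartIndex: -1, isSplitTurn: false };
  }
  // The cut is mid-turn: find the user message that started this turn.
  let turnStart = cut;
  while (turnStart >= 0 && entries[turnStart]?.type !== "user") turnStart--;
  return { firstKeptIndex: cut, turnStartIndex: turnStart, isSplitTurn: turnStart !== -1 };
}

// The split-turn diagram above, with 10 tokens per message (made-up numbers).
const splitTurnTypes: CutEntryType[] = ["header", "user", "assistant", "toolResult",
  "assistant", "toolResult", "toolResult", "assistant", "toolResult"];
const splitTurnEntries: CutEntry[] = splitTurnTypes.map((type) => ({
  type,
  tokens: type === "header" ? 0 : 10,
}));
const splitResult = findCutPointSketch(splitTurnEntries, 25);
// splitResult: { firstKeptIndex: 7, turnStartIndex: 1, isSplitTurn: true }
```

With a budget of 25 tokens the walk reaches the budget at entry 6, a tool result, so the cut moves forward to the assistant message at entry 7, matching the diagram.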

CompactionEntry Structure

interface CompactionEntry<T = unknown> {
  type: "compaction";
  id: string;
  parentId: string;
  timestamp: number;
  summary: string;
  firstKeptEntryId: string;
  tokensBefore: number;
  fromHook?: boolean;  // true if hook provided the compaction
  details?: T;         // hook-specific data
}

// Default compaction uses this for details:
interface CompactionDetails {
  readFiles: string[];
  modifiedFiles: string[];
}

Hooks can store any JSON-serializable data in details. The default compaction tracks file operations, but custom compaction hooks can use their own structure.
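
As an illustration of a custom details payload, the sketch below types an entry with a hypothetical `TodoDetailsSketch` structure and round-trips it through JSON. The ids, timestamp, and token count are placeholder values, and the interface is repeated only to keep the block self-contained.

```typescript
// Repeated from the interface above so this block stands alone.
interface CompactionEntry<T = unknown> {
  type: "compaction";
  id: string;
  parentId: string;
  timestamp: number;
  summary: string;
  firstKeptEntryId: string;
  tokensBefore: number;
  fromHook?: boolean;
  details?: T;
}

// Hypothetical hook-specific payload; any JSON-serializable shape works.
interface TodoDetailsSketch {
  openTodos: string[];
}

const customEntry: CompactionEntry<TodoDetailsSketch> = {
  type: "compaction",
  id: "entry-10", // placeholder id
  parentId: "entry-9", // placeholder id
  timestamp: 0, // placeholder value
  summary: "Your summary...",
  firstKeptEntryId: "entry-4",
  tokensBefore: 185000, // placeholder value
  fromHook: true,
  details: { openTodos: ["add tests"] },
};

// JSON-serializable details survive a save/load round trip unchanged.
const restoredEntry = JSON.parse(JSON.stringify(customEntry)) as CompactionEntry<TodoDetailsSketch>;
```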

Branch Summarization

When It Triggers

When you use /tree to navigate to a different branch, pi offers to summarize the work you're leaving. The summary carries context from the branch you are leaving into the branch you are switching to.

How It Works

1. Find common ancestor: Locate the deepest node shared by the old and new positions.
2. Collect entries: Walk from the old leaf back to the common ancestor.
3. Prepare with budget: Include messages up to the token budget, newest first.
4. Generate summary: Call the LLM with the structured format.
5. Append entry: Save a BranchSummaryEntry at the navigation point.

Tree before navigation:

         ┌─ B ─ C ─ D (old leaf, being abandoned)
    A ───┤
         └─ E ─ F (target)

Common ancestor: A
Entries to summarize: B, C, D

After navigation with summary:

         ┌─ B ─ C ─ D ─ [summary of B,C,D]
    A ───┤
         └─ E ─ F (new leaf)
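
The first two steps (common ancestor and entry collection) can be sketched with parent links alone. The `BranchNode` shape and both function names are assumptions for illustration. The sketch assumes the entries form a tree, so following parentId always terminates at a root.

```typescript
// Minimal tree node (an assumption): each entry points at its parent.
interface BranchNode {
  id: string;
  parentId: string | null; // null for the root
}

// Ids from a leaf up to the root, leaf first.
function pathToRootSketch(nodes: Map<string, BranchNode>, leafId: string): string[] {
  const path: string[] = [];
  let current: string | null = leafId;
  while (current !== null) {
    path.push(current);
    current = nodes.get(current)?.parentId ?? null;
  }
  return path;
}

// Walk up from the old leaf until hitting a node that is also above the target.
function entriesToSummarizeSketch(
  nodes: Map<string, BranchNode>,
  oldLeafId: string,
  targetId: string,
): { commonAncestorId: string | null; entryIds: string[] } {
  const targetPath = new Set(pathToRootSketch(nodes, targetId));
  const entryIds: string[] = [];
  let commonAncestorId: string | null = null;
  for (const id of pathToRootSketch(nodes, oldLeafId)) {
    if (targetPath.has(id)) {
      commonAncestorId = id; // deepest shared node
      break;
    }
    entryIds.push(id);
  }
  return { commonAncestorId, entryIds: entryIds.reverse() }; // oldest first
}

// The tree from the diagram: A is the parent of B and E.
const treeNodes = new Map<string, BranchNode>([
  ["A", { id: "A", parentId: null }],
  ["B", { id: "B", parentId: "A" }],
  ["C", { id: "C", parentId: "B" }],
  ["D", { id: "D", parentId: "C" }],
  ["E", { id: "E", parentId: "A" }],
  ["F", { id: "F", parentId: "E" }],
]);
const navigation = entriesToSummarizeSketch(treeNodes, "D", "F");
// navigation: { commonAncestorId: "A", entryIds: ["B", "C", "D"] }
```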

Cumulative File Tracking

Both compaction and branch summarization track files cumulatively. When generating a summary, pi extracts file operations from:

Tool calls in the messages being summarized
Previous compaction or branch summary details (if any)

This means file tracking accumulates across multiple compactions or nested branch summaries, preserving the full history of read and modified files.
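
One way to picture the accumulation is a set union of the previous details with the newly extracted operations. The helper below is a sketch: its name is hypothetical, and de-duplicating, sorting, and keeping the two lists independent are choices of this example rather than documented behavior.

```typescript
// Same shape as CompactionDetails / BranchSummaryDetails.
interface FileDetailsSketch {
  readFiles: string[];
  modifiedFiles: string[];
}

// Union the previous summary's file lists with the newly extracted ones.
function mergeFileDetailsSketch(
  previous: FileDetailsSketch | undefined,
  current: FileDetailsSketch,
): FileDetailsSketch {
  const union = (a: string[], b: string[]): string[] =>
    Array.from(new Set(a.concat(b))).sort();
  return {
    readFiles: union(previous?.readFiles ?? [], current.readFiles),
    modifiedFiles: union(previous?.modifiedFiles ?? [], current.modifiedFiles),
  };
}

const firstCompaction: FileDetailsSketch = { readFiles: ["a.ts", "b.ts"], modifiedFiles: ["c.ts"] };
const newOperations: FileDetailsSketch = { readFiles: ["b.ts", "d.ts"], modifiedFiles: ["c.ts", "e.ts"] };
const cumulative = mergeFileDetailsSketch(firstCompaction, newOperations);
// cumulative.readFiles: ["a.ts", "b.ts", "d.ts"]; cumulative.modifiedFiles: ["c.ts", "e.ts"]
```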

BranchSummaryEntry Structure

interface BranchSummaryEntry<T = unknown> {
  type: "branch_summary";
  id: string;
  parentId: string;
  timestamp: number;
  summary: string;
  fromId: string;      // Entry we navigated from
  fromHook?: boolean;  // true if hook provided the summary
  details?: T;         // hook-specific data
}

// Default branch summarization uses this for details:
interface BranchSummaryDetails {
  readFiles: string[];
  modifiedFiles: string[];
}

As with compaction, hooks can store custom data in details.

Summary Format

Both compaction and branch summarization use the same structured format:

## Goal
[What the user is trying to accomplish]

## Constraints & Preferences
- [Requirements mentioned by user]

## Progress
### Done
- [x] [Completed tasks]

### In Progress
- [ ] [Current work]

### Blocked
- [Issues, if any]

## Key Decisions
- **[Decision]**: [Rationale]

## Next Steps
1. [What should happen next]

## Critical Context
- [Data needed to continue]

<read-files>
path/to/file1.ts
path/to/file2.ts
</read-files>

<modified-files>
path/to/changed.ts
</modified-files>

Message Serialization

Before summarization, messages are serialized to text:

[User]: What they said
[Assistant thinking]: Internal reasoning
[Assistant]: Response text
[Assistant tool calls]: read(path="foo.ts"); edit(path="bar.ts", ...)
[Tool result]: Output from tool

Serializing to labeled text prevents the model from treating the input as a conversation to continue.
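
A serializer producing the labeled lines above might look like the sketch below. The `SerializableMessage` union is a simplified assumption (pi's real message types are richer), and formatting arguments as key=value pairs is inferred from the example line rather than taken from pi's source.

```typescript
// Simplified message shapes (assumptions for this sketch).
type SerializableMessage =
  | { role: "user"; text: string }
  | {
      role: "assistant";
      thinking?: string;
      text?: string;
      toolCalls?: { name: string; args: Record<string, string> }[];
    }
  | { role: "toolResult"; output: string };

function serializeMessagesSketch(messages: SerializableMessage[]): string {
  const lines: string[] = [];
  for (const message of messages) {
    if (message.role === "user") {
      lines.push(`[User]: ${message.text}`);
    } else if (message.role === "toolResult") {
      lines.push(`[Tool result]: ${message.output}`);
    } else {
      if (message.thinking) lines.push(`[Assistant thinking]: ${message.thinking}`);
      if (message.text) lines.push(`[Assistant]: ${message.text}`);
      if (message.toolCalls && message.toolCalls.length > 0) {
        // Render each call as name(key="value", ...), joined with "; ".
        const calls = message.toolCalls.map((call) => {
          const args = Object.keys(call.args)
            .map((key) => `${key}=${JSON.stringify(call.args[key])}`)
            .join(", ");
          return `${call.name}(${args})`;
        });
        lines.push(`[Assistant tool calls]: ${calls.join("; ")}`);
      }
    }
  }
  return lines.join("\n");
}

const serialized = serializeMessagesSketch([
  { role: "user", text: "Fix the bug" },
  {
    role: "assistant",
    thinking: "Need to read the file",
    text: "Reading it now",
    toolCalls: [{ name: "read", args: { path: "foo.ts" } }],
  },
  { role: "toolResult", output: "file contents" },
]);
```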

Custom Summarization via Hooks

Hooks can intercept and customize both compaction and branch summarization.

session_before_compact

Fired before auto-compaction or /compact. The hook can cancel compaction or provide a custom summary.

pi.on("session_before_compact", async (event, ctx) => {
  const { preparation, branchEntries, customInstructions, signal } = event;

  // preparation.messagesToSummarize - messages to summarize
  // preparation.turnPrefixMessages - split turn prefix (if isSplitTurn)
  // preparation.previousSummary - previous compaction summary
  // preparation.fileOps - extracted file operations
  // preparation.tokensBefore - context tokens before compaction
  // preparation.firstKeptEntryId - where kept messages start
  // preparation.settings - compaction settings

  // branchEntries - all entries on current branch (for custom state)
  // signal - AbortSignal (pass to LLM calls)

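  // Return ONE of the following; a second return after the first would be unreachable.

  // Option 1, cancel compaction:
  // return { cancel: true };

  // Option 2, provide a custom summary:
  return {
    compaction: {
      summary: "Your summary...",
      firstKeptEntryId: preparation.firstKeptEntryId,
      tokensBefore: preparation.tokensBefore,
      details: { /* custom data */ },
    },
  };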
});

See examples/hooks/custom-compaction.ts for a complete example using a different model.

session_before_tree

Fired before /tree navigation with summarization. The hook can cancel navigation or provide a custom summary.

pi.on("session_before_tree", async (event, ctx) => {
  const { preparation, signal } = event;

  // preparation.targetId - where we're navigating to
  // preparation.oldLeafId - current position (being abandoned)
  // preparation.commonAncestorId - shared ancestor
  // preparation.entriesToSummarize - entries to summarize
  // preparation.userWantsSummary - whether user chose to summarize

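  // Return ONE of the following; a second return after the first would be unreachable.

  // Option 1, cancel navigation:
  // return { cancel: true };

  // Option 2, provide a custom summary (only if userWantsSummary):
  return {
    summary: {
      summary: "Your summary...",
      details: { /* custom data */ },
    },
  };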
});

Settings

Configure compaction in ~/.pi/agent/settings.json:

{
  "compaction": {
    "enabled": true,
    "reserveTokens": 16384,
    "keepRecentTokens": 20000
  }
}

| Setting | Default | Description |
|---------|---------|-------------|
| `enabled` | `true` | Enable auto-compaction |
| `reserveTokens` | `16384` | Tokens to reserve for LLM response |
| `keepRecentTokens` | `20000` | Recent tokens to keep (not summarized) |

Disable auto-compaction with "enabled": false. You can still compact manually with /compact.
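
As a sketch of how a partial settings file could combine with the defaults in the table, the helper below overlays the `compaction` object on the documented default values. The function name is hypothetical, and falling back to the default for each missing key is an assumption about pi's behavior, not something this page states.

```typescript
interface CompactionSettingsSketch {
  enabled: boolean;
  reserveTokens: number;
  keepRecentTokens: number;
}

// Defaults taken from the settings table.
const DEFAULT_COMPACTION_SETTINGS: CompactionSettingsSketch = {
  enabled: true,
  reserveTokens: 16384,
  keepRecentTokens: 20000,
};

// Hypothetical resolver: keys missing from the file keep their defaults.
function resolveCompactionSettingsSketch(settingsJson: string): CompactionSettingsSketch {
  const parsed = JSON.parse(settingsJson) as {
    compaction?: Partial<CompactionSettingsSketch>;
  };
  return { ...DEFAULT_COMPACTION_SETTINGS, ...(parsed.compaction ?? {}) };
}

// Only auto-compaction is disabled; the token budgets stay at their defaults.
const resolvedSettings = resolveCompactionSettingsSketch('{"compaction": {"enabled": false}}');
```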
