docs(coding-agent): fix context compaction documentation

- Correctly describe cut point mechanism that keeps recent messages verbatim
- Document keepRecentTokens setting (default 20k)
- Fix reserveTokens default (16k not 20k)
- Explain summary chaining for multiple compactions
Mario Zechner 2025-12-04 02:57:45 +01:00
parent a57768b91b
commit 0c460fcfe8
2 changed files with 15 additions and 9 deletions

@ -4,10 +4,11 @@
### Added
- **Context Compaction**: Long sessions can now be compacted to reduce context usage while preserving conversation history. ([#92](https://github.com/badlogic/pi-mono/issues/92))
- **Context Compaction**: Long sessions can now be compacted to reduce context usage while preserving recent conversation history. ([#92](https://github.com/badlogic/pi-mono/issues/92))
- `/compact [instructions]`: Manually compact context with optional custom instructions for the summary
- `/autocompact`: Toggle automatic compaction when context exceeds threshold
- Auto-compaction triggers when context reaches `contextWindow - reserveTokens` (default 20k reserve)
- Compaction summarizes older messages while keeping recent messages (default 20k tokens) verbatim
- Auto-compaction triggers when context reaches `contextWindow - reserveTokens` (default 16k reserve)
- Compacted sessions show a collapsible summary in the TUI (toggle with `o` key)
- HTML exports include compaction summaries as collapsible sections
- RPC mode supports `{"type":"compact"}` command and auto-compaction (emits compaction events)

@ -781,16 +781,19 @@ pi --session /path/to/my-session.jsonl
## Context Compaction
Long sessions can exhaust the model's context window. Context compaction summarizes the conversation history into a condensed form, preserving key information while dramatically reducing token usage.
Long sessions can exhaust the model's context window. Context compaction summarizes older conversation history while preserving recent messages, allowing sessions to continue indefinitely.
### How It Works
When compaction runs (manually via `/compact` or automatically):
1. The current conversation is sent to the model with instructions to create a comprehensive summary
2. The summary captures key decisions, code changes, file modifications, and ongoing tasks
3. The original messages are replaced with the summary as a single "context handoff" message
4. The session continues with the compacted context
1. A **cut point** is calculated to keep approximately `keepRecentTokens` (default: 20k) worth of recent messages
2. Messages **before** the cut point are sent to the model for summarization
3. Messages **after** the cut point are kept verbatim
4. The summary replaces the older messages as a "context handoff" message
5. If there was a previous compaction, its summary is included as context for the new summary (chaining)
Cut points are always placed at user message boundaries to preserve turn integrity.
The summary is displayed in the TUI as a collapsible block (toggle with `o` key). HTML exports also show compaction summaries as collapsible sections.
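The cut point selection described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `Message` shape, the per-message `tokens` field, and the `findCutPoint` name are all hypothetical, and the choice to move the cut *backwards* to the nearest user message (keeping slightly more than `keepRecentTokens`) is an assumption.

```typescript
// Hypothetical message shape; the real session types may differ.
interface Message {
  role: "user" | "assistant" | "toolResult";
  tokens: number; // assumed to be estimated elsewhere
}

// Returns the index of the first message to keep verbatim.
// Messages before that index would be sent for summarization.
function findCutPoint(messages: Message[], keepRecentTokens: number): number {
  let kept = 0;
  let index = messages.length;

  // Walk backwards until roughly keepRecentTokens worth of messages are covered.
  while (index > 0 && kept < keepRecentTokens) {
    index--;
    kept += messages[index].tokens;
  }

  // Assumption: move back to the nearest user message so a turn is never split.
  while (index > 0 && messages[index].role !== "user") {
    index--;
  }
  return index;
}

const history: Message[] = [
  { role: "user", tokens: 5000 },
  { role: "assistant", tokens: 9000 },
  { role: "user", tokens: 4000 },
  { role: "assistant", tokens: 12000 },
  { role: "user", tokens: 3000 },
  { role: "assistant", tokens: 6000 },
];

const cut = findCutPoint(history, 20000);
const toSummarize = history.slice(0, cut); // older messages, replaced by the summary
const toKeep = history.slice(cut);         // recent messages, kept verbatim
```

With this sample history the budget is reached inside the second turn, so the cut moves back to that turn's user message: the first two messages are summarized and the last four are kept. A result of `0` means there is nothing old enough to summarize.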
@ -819,13 +822,15 @@ Power users can tune compaction behavior in `~/.pi/agent/settings.json`:
{
"compaction": {
"enabled": true,
"reserveTokens": 20000
"reserveTokens": 16384,
"keepRecentTokens": 20000
}
}
```
- **enabled**: Whether auto-compaction is active (toggle with `/autocompact`)
- **reserveTokens**: Token buffer to keep free (default: 20,000). Auto-compaction triggers when `contextTokens > contextWindow - reserveTokens`
- **reserveTokens**: Token buffer to keep free (default: 16,384). Auto-compaction triggers when `contextTokens > contextWindow - reserveTokens`
- **keepRecentTokens**: How many tokens' worth of recent messages to preserve verbatim (default: 20,000). Older messages are summarized.
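
To make the trigger condition concrete, here is a small sketch of the check `contextTokens > contextWindow - reserveTokens` using the defaults listed above. The `CompactionSettings` interface and `shouldAutoCompact` function are hypothetical names for illustration, and the 200,000-token context window is an assumed example value, not a property of any particular model.

```typescript
// Mirrors the "compaction" block in settings.json; the type name is hypothetical.
interface CompactionSettings {
  enabled: boolean;
  reserveTokens: number;
  keepRecentTokens: number;
}

// Defaults as documented above.
const DEFAULT_COMPACTION: CompactionSettings = {
  enabled: true,
  reserveTokens: 16384,
  keepRecentTokens: 20000,
};

// True when auto-compaction should run for the current context size.
function shouldAutoCompact(
  contextTokens: number,
  contextWindow: number,
  settings: CompactionSettings = DEFAULT_COMPACTION,
): boolean {
  if (!settings.enabled) return false;
  return contextTokens > contextWindow - settings.reserveTokens;
}

// Assumed example: a 200,000-token window gives a threshold of 200000 - 16384 = 183616.
const contextWindow = 200000;
const threshold = contextWindow - DEFAULT_COMPACTION.reserveTokens;
const atThreshold = shouldAutoCompact(threshold, contextWindow);       // not yet over the limit
const overThreshold = shouldAutoCompact(threshold + 1, contextWindow); // one token over
```

Because the comparison is strict, compaction in this sketch fires only once the context exceeds the threshold, not when it merely equals it.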
### Supported Modes