feat(core): Unified Context Management and Tool Distillation. (#24157)

This commit is contained in:
joshualitt
2026-03-30 15:29:59 -07:00
committed by GitHub
parent 117a2d3844
commit dfba0e91e2
22 changed files with 1717 additions and 314 deletions

View File

@@ -155,21 +155,18 @@ they appear in the UI.
### Experimental
| UI Label | Setting | Description | Default |
| ---------------------------------- | ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| Enable Tool Output Masking | `experimental.toolOutputMasking.enabled` | Enables tool output masking to save tokens. | `true` |
| Enable Git Worktrees | `experimental.worktrees` | Enable automated Git worktree management for parallel work. | `false` |
| Use OSC 52 Paste | `experimental.useOSC52Paste` | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
| Use OSC 52 Copy | `experimental.useOSC52Copy` | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
| Plan | `experimental.plan` | Enable Plan Mode. | `true` |
| Model Steering | `experimental.modelSteering` | Enable model steering (user hints) to guide the model during tool execution. | `false` |
| Direct Web Fetch | `experimental.directWebFetch` | Enable web fetch behavior that bypasses LLM summarization. | `false` |
| Memory Manager Agent | `experimental.memoryManager` | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories. | `false` |
| Agent History Truncation | `experimental.agentHistoryTruncation` | Enable truncation window logic for the Agent History Provider. | `false` |
| Agent History Truncation Threshold | `experimental.agentHistoryTruncationThreshold` | The maximum number of messages before history is truncated. | `30` |
| Agent History Retained Messages | `experimental.agentHistoryRetainedMessages` | The number of recent messages to retain after truncation. | `15` |
| Agent History Summarization | `experimental.agentHistorySummarization` | Enable summarization of truncated content via a small model for the Agent History Provider. | `false` |
| Topic & Update Narration | `experimental.topicUpdateNarration` | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting. | `false` |
| UI Label | Setting | Description | Default |
| -------------------------- | ---------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| Enable Tool Output Masking | `experimental.toolOutputMasking.enabled` | Enables tool output masking to save tokens. | `true` |
| Enable Git Worktrees | `experimental.worktrees` | Enable automated Git worktree management for parallel work. | `false` |
| Use OSC 52 Paste | `experimental.useOSC52Paste` | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
| Use OSC 52 Copy | `experimental.useOSC52Copy` | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
| Plan | `experimental.plan` | Enable Plan Mode. | `true` |
| Model Steering | `experimental.modelSteering` | Enable model steering (user hints) to guide the model during tool execution. | `false` |
| Direct Web Fetch | `experimental.directWebFetch` | Enable web fetch behavior that bypasses LLM summarization. | `false` |
| Memory Manager Agent | `experimental.memoryManager` | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories. | `false` |
| Enable Context Management | `experimental.contextManagement` | Enable logic for context management. | `false` |
| Topic & Update Narration | `experimental.topicUpdateNarration` | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting. | `false` |
### Skills

View File

@@ -1702,25 +1702,8 @@ their corresponding top-level category object in your `settings.json` file.
- **Default:** `false`
- **Requires restart:** Yes
- **`experimental.agentHistoryTruncation`** (boolean):
- **Description:** Enable truncation window logic for the Agent History
Provider.
- **Default:** `false`
- **Requires restart:** Yes
- **`experimental.agentHistoryTruncationThreshold`** (number):
- **Description:** The maximum number of messages before history is truncated.
- **Default:** `30`
- **Requires restart:** Yes
- **`experimental.agentHistoryRetainedMessages`** (number):
- **Description:** The number of recent messages to retain after truncation.
- **Default:** `15`
- **Requires restart:** Yes
- **`experimental.agentHistorySummarization`** (boolean):
- **Description:** Enable summarization of truncated content via a small model
for the Agent History Provider.
- **`experimental.contextManagement`** (boolean):
- **Description:** Enable logic for context management.
- **Default:** `false`
- **Requires restart:** Yes
@@ -1815,6 +1798,49 @@ their corresponding top-level category object in your `settings.json` file.
prioritize available tools dynamically.
- **Default:** `[]`
#### `contextManagement`
- **`contextManagement.historyWindow.maxTokens`** (number):
- **Description:** The number of tokens to allow before triggering
compression.
- **Default:** `150000`
- **Requires restart:** Yes
- **`contextManagement.historyWindow.retainedTokens`** (number):
- **Description:** The number of tokens to always retain.
- **Default:** `40000`
- **Requires restart:** Yes
- **`contextManagement.messageLimits.normalMaxTokens`** (number):
- **Description:** The target number of tokens to budget for a normal
conversation turn.
- **Default:** `2500`
- **Requires restart:** Yes
- **`contextManagement.messageLimits.retainedMaxTokens`** (number):
- **Description:** The maximum number of tokens a single conversation turn can
consume before truncation.
- **Default:** `12000`
- **Requires restart:** Yes
- **`contextManagement.messageLimits.normalizationHeadRatio`** (number):
- **Description:** The ratio of tokens to retain from the beginning of a
truncated message (0.0 to 1.0).
- **Default:** `0.25`
- **Requires restart:** Yes
- **`contextManagement.toolDistillation.maxOutputTokens`** (number):
- **Description:** Maximum tokens to show when truncating large tool outputs.
- **Default:** `10000`
- **Requires restart:** Yes
- **`contextManagement.toolDistillation.summarizationThresholdTokens`**
(number):
- **Description:** Threshold above which truncated tool outputs will be
summarized by an LLM.
- **Default:** `20000`
- **Requires restart:** Yes
#### `admin`
- **`admin.secureModeEnabled`** (boolean):