feat(core): Unified Context Management and Tool Distillation. (#24157)

2026-06-13 04:48:09 -07:00 · 2026-03-30 15:29:59 -07:00
parent 117a2d3844
commit dfba0e91e2
22 changed files with 1717 additions and 314 deletions
@@ -155,21 +155,18 @@ they appear in the UI.

 ### Experimental

-| UI Label                           | Setting                                        | Description                                                                                                                                               | Default |
-| ---------------------------------- | ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
-| Enable Tool Output Masking         | `experimental.toolOutputMasking.enabled`       | Enables tool output masking to save tokens.                                                                                                               | `true`  |
-| Enable Git Worktrees               | `experimental.worktrees`                       | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
-| Use OSC 52 Paste                   | `experimental.useOSC52Paste`                   | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Use OSC 52 Copy                    | `experimental.useOSC52Copy`                    | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Plan                               | `experimental.plan`                            | Enable Plan Mode.                                                                                                                                         | `true`  |
-| Model Steering                     | `experimental.modelSteering`                   | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
-| Direct Web Fetch                   | `experimental.directWebFetch`                  | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
-| Memory Manager Agent               | `experimental.memoryManager`                   | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
-| Agent History Truncation           | `experimental.agentHistoryTruncation`          | Enable truncation window logic for the Agent History Provider.                                                                                            | `false` |
-| Agent History Truncation Threshold | `experimental.agentHistoryTruncationThreshold` | The maximum number of messages before history is truncated.                                                                                               | `30`    |
-| Agent History Retained Messages    | `experimental.agentHistoryRetainedMessages`    | The number of recent messages to retain after truncation.                                                                                                 | `15`    |
-| Agent History Summarization        | `experimental.agentHistorySummarization`       | Enable summarization of truncated content via a small model for the Agent History Provider.                                                               | `false` |
-| Topic & Update Narration           | `experimental.topicUpdateNarration`            | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting.                                      | `false` |
+| UI Label                   | Setting                                  | Description                                                                                                                                               | Default |
+| -------------------------- | ---------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| Enable Tool Output Masking | `experimental.toolOutputMasking.enabled` | Enables tool output masking to save tokens.                                                                                                               | `true`  |
+| Enable Git Worktrees       | `experimental.worktrees`                 | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
+| Use OSC 52 Paste           | `experimental.useOSC52Paste`             | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Use OSC 52 Copy            | `experimental.useOSC52Copy`              | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Plan                       | `experimental.plan`                      | Enable Plan Mode.                                                                                                                                         | `true`  |
+| Model Steering             | `experimental.modelSteering`             | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
+| Direct Web Fetch           | `experimental.directWebFetch`            | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
+| Memory Manager Agent       | `experimental.memoryManager`             | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
+| Enable Context Management  | `experimental.contextManagement`         | Enable logic for context management.                                                                                                                      | `false` |
+| Topic & Update Narration   | `experimental.topicUpdateNarration`      | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting.                                      | `false` |

 ### Skills

@@ -1702,25 +1702,8 @@ their corresponding top-level category object in your `settings.json` file.
  - **Default:** `false`
  - **Requires restart:** Yes

- **`experimental.agentHistoryTruncation`** (boolean):
-  - **Description:** Enable truncation window logic for the Agent History
-    Provider.
-  - **Default:** `false`
-  - **Requires restart:** Yes
-
- **`experimental.agentHistoryTruncationThreshold`** (number):
-  - **Description:** The maximum number of messages before history is truncated.
-  - **Default:** `30`
-  - **Requires restart:** Yes
-
- **`experimental.agentHistoryRetainedMessages`** (number):
-  - **Description:** The number of recent messages to retain after truncation.
-  - **Default:** `15`
-  - **Requires restart:** Yes
-
- **`experimental.agentHistorySummarization`** (boolean):
-  - **Description:** Enable summarization of truncated content via a small model
-    for the Agent History Provider.
+- **`experimental.contextManagement`** (boolean):
+  - **Description:** Enable logic for context management.
  - **Default:** `false`
  - **Requires restart:** Yes

@@ -1815,6 +1798,49 @@ their corresponding top-level category object in your `settings.json` file.
    prioritize available tools dynamically.
  - **Default:** `[]`

+#### `contextManagement`
+
+- **`contextManagement.historyWindow.maxTokens`** (number):
+  - **Description:** The number of tokens to allow before triggering
+    compression.
+  - **Default:** `150000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.historyWindow.retainedTokens`** (number):
+  - **Description:** The number of tokens to always retain.
+  - **Default:** `40000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.normalMaxTokens`** (number):
+  - **Description:** The target number of tokens to budget for a normal
+    conversation turn.
+  - **Default:** `2500`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.retainedMaxTokens`** (number):
+  - **Description:** The maximum number of tokens a single conversation turn can
+    consume before truncation.
+  - **Default:** `12000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.normalizationHeadRatio`** (number):
+  - **Description:** The ratio of tokens to retain from the beginning of a
+    truncated message (0.0 to 1.0).
+  - **Default:** `0.25`
+  - **Requires restart:** Yes
+
+- **`contextManagement.toolDistillation.maxOutputTokens`** (number):
+  - **Description:** Maximum tokens to show when truncating large tool outputs.
+  - **Default:** `10000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.toolDistillation.summarizationThresholdTokens`**
+  (number):
+  - **Description:** Threshold above which truncated tool outputs will be
+    summarized by an LLM.
+  - **Default:** `20000`
+  - **Requires restart:** Yes
+
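
As an illustration of how the `contextManagement` keys added above might fit together, here is a minimal `settings.json` sketch. It assumes that each dotted setting name maps to nested objects under its top-level category, and that the `experimental.contextManagement` flag must be on for these values to take effect; neither assumption is confirmed by this diff. All values shown are the documented defaults.

```json
{
  "experimental": {
    "contextManagement": true
  },
  "contextManagement": {
    "historyWindow": {
      "maxTokens": 150000,
      "retainedTokens": 40000
    },
    "messageLimits": {
      "normalMaxTokens": 2500,
      "retainedMaxTokens": 12000,
      "normalizationHeadRatio": 0.25
    },
    "toolDistillation": {
      "maxOutputTokens": 10000,
      "summarizationThresholdTokens": 20000
    }
  }
}
```

Every setting in this sketch is documented as requiring a restart, so a change would only apply to the next session.
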
 #### `admin`

 - **`admin.secureModeEnabled`** (boolean):