From 2a5b169179331d04850aaa57c9afc2b29ee3292f Mon Sep 17 00:00:00 2001 From: Your Name Date: Thu, 9 Apr 2026 19:36:23 +0000 Subject: [PATCH] delete old design docs --- packages/core/src/context/DESIGN.md | 175 ----------------- packages/core/src/context/migration-plan.md | 37 ---- .../core/src/context/oldContextManager.txt | 183 ------------------ .../processors/nodeDistillationProcessor.ts | 2 +- packages/core/src/context/typed-context-ir.md | 146 -------------- .../src/context/utils/snapshotGenerator.ts | 2 +- packages/core/src/telemetry/llmRole.ts | 2 +- 7 files changed, 3 insertions(+), 544 deletions(-) delete mode 100644 packages/core/src/context/DESIGN.md delete mode 100644 packages/core/src/context/migration-plan.md delete mode 100644 packages/core/src/context/oldContextManager.txt delete mode 100644 packages/core/src/context/typed-context-ir.md diff --git a/packages/core/src/context/DESIGN.md b/packages/core/src/context/DESIGN.md deleted file mode 100644 index 37c76e9be2..0000000000 --- a/packages/core/src/context/DESIGN.md +++ /dev/null @@ -1,175 +0,0 @@ -# Context Manager V0: High-Level Design - -## 1. Introduction & Motivation - -This document provides a high-level orientation to the Context Management system -within `@google/gemini-cli-core`. - -Previously, context management in the CLI was decentralized, synchronous, and -relied on fixed-function, destructive mutations of the raw Gemini `Content[]` -history. Because all context management was local, this approach made it nearly -impossible to reason about the global impact of any specific change. For -example, should we distill tool outputs, or mask them? Or maybe it's contextual? -What about other processors like the snapshotter, should they see masked -results? Distilled results? What about new approaches to context management, how -do they fit into the solution we've already built. The old approach to context -management made it nearly challenging to even attempt to answer any one of these -questions, let alone to try and answer all of them. - -To address these issues, we went back to the drawing board to create an explicit -Context Manager. As opposed to our old approach, the new Context Manager V0 is a -robust, event-driven, pluggable system. It introduces a non-destructive Episodic -Intermediate Representation (IR) and an asynchronous processing pipeline, -allowing the CLI to run expensive LLM summarization tasks in the background and -opportunistically project an optimized view of the history only when budget -constraints require it. - ---- - -## 2. Chief Innovations & Salient Features - -The architecture is built upon seven core principles that distinguish it from -the legacy system: - -1. **Centralized Budgeting:** The `ContextManager` is the sole source of truth - for the token budget. It makes the final, just-in-time decision about what - gets projected to the LLM. -2. **Statelessness via IR:** Raw history is never mutated or deleted. Instead, - it is translated into an Intermediate Representation (IR). Context reduction - is achieved by attaching compressed `Variant`s to the IR graph. The original - text is always recoverable. -3. **Asynchronicity:** Designed around a `ContextEventBus`. Heavy context - operations (like LLM-powered summarization) run as detached background tasks - without blocking the main agent loop. -4. **Configurability:** Driven by a typed JSON "Sidecar" configuration. Token - ceilings, fallback strategies, and processing pipelines are entirely - data-driven. -5. **Pluggability:** `ContextProcessor`s are isolated plugins with typed - schemas. They are registered via Dependency Injection and can be arranged - into arbitrary pipelines. -6. **Debuggability:** A built-in `ContextTracer` tracks every step of the - pipeline, providing full audit trails of exactly when, why, and how a - message was altered. -7. **Testability:** Global state has been eliminated. The system uses strict - Dependency Injection (`SidecarRegistry`, `ContextEnvironment`, - `ContextEventBus`), making every layer easily unit-testable. -8. **Orthogonality via Targets:** Processors do not implicitly scan the entire - history graph. The `ContextManager` computes exact Deltas (e.g., new nodes - just added, or specific nodes that just aged out of the retained buffer). - Processors are sandboxed by the `EpisodeEditor` to only iterate over and - mutate these specific `targetNodes`, ensuring surgical and highly efficient - reductions. - ---- - -## 3. The Major Pieces: Roles & Responsibilities - -### The Brain: `ContextManager` - -The central coordinator. It owns the "Pristine History" (the ground-truth -Episodic IR graph). Its primary responsibility is exposing -`projectCompressedHistory()`, which flattens the IR graph into a standard -`Content[]` array strictly adhering to the configured token budget. - -### The Data Model: Episodic Intermediate Representation (IR) - -Instead of a flat array of messages, interactions are grouped into `Episode`s. -An Episode represents a single turn: a User Prompt, followed by the Agent's -Thoughts and Tool Executions (Steps), concluding with a Yield. - -- **`IrNode`:** The base unit (e.g., `ToolExecution`, `AgentThought`). -- **`Variant`:** Compressed alternatives to the raw node (e.g., - `SummaryVariant`, `MaskedVariant`, `SnapshotVariant`). -- **`IrMetadata`:** An audit trail attached to every node, tracking token counts - and the chronological list of `transformations` applied by processors. - -### The Engine: `PipelineOrchestrator` & Sidecar - -The orchestrator reads the `SidecarConfig`. It manages the lifecycle of the -pipelines, registering triggers and executing processors in order. It dictates -whether a pipeline blocks the main thread or runs in the background. - -### The Workers: `ContextProcessor`s - -Small, highly-focused classes that implement context reduction strategies. They -do not mutate the graph directly; instead, they are given an `EpisodeEditor` -which provides a safe, scoped API to attach `Variant`s and append metadata. - -- _Examples:_ `ToolMaskingProcessor`, `NodeDistillationProcessor`, - `BlobDegradationProcessor`. - -### The Glue: `ContextEventBus` - -A Pub/Sub bus that decouples the components. It enables the `HistoryObserver` to -notify the system of new messages, and allows background processors to notify -the `ContextManager` when a new compressed variant is ready to be used. - ---- - -## 4. How They Interact: The Life of a Message - -To understand how these pieces fit together, let's walk through the lifecycle of -a single interaction as it moves through the context system. - -### Phase 1: Ingestion & Translation - -1. **Action:** The user sends a prompt, and the agent responds with a tool - call. These raw messages are appended to the standard `AgentChatHistory`. -2. **Observation:** The `HistoryObserver` detects the new messages. -3. **Translation:** The observer passes the raw `Content[]` to the `IrMapper`. - The mapper groups the prompt and the tool execution into a single, - structured `Episode`. -4. **Registration:** The new `Episode` is added to the `ContextManager`'s - pristine graph. - -### Phase 2: Triggering the Pipelines - -1. **Delta Generation:** The `ContextManager` receives the updated pristine - graph. It diffs it against the previous state and extracts a Delta—the exact - Set of new `IrNode` IDs. -2. **Event Emission:** The `ContextManager` fires a `ChunkReceivedEvent` (with - the Delta targets) over the `ContextEventBus`. -3. **Orchestration:** The `PipelineOrchestrator` hears the event and evaluates - its configured `PipelineDef`s. It finds a pipeline with the trigger - `on_turn`. -4. **Execution:** The Orchestrator creates an `EpisodeEditor` heavily sandboxed - to _only_ allow access to the targeted Delta nodes, and begins running the - processors in that pipeline sequentially. - -### Phase 3: Processing & Safe Editing - -1. **Processing:** A processor (e.g., `ToolMaskingProcessor`) receives the - `EpisodeEditor`. It iterates over `editor.targets` (ignoring the rest of the - historical graph). It identifies a massive JSON payload in one of the new - tool executions. -2. **Editing:** Instead of deleting the JSON, it calls `editor.editEpisode()`. - It creates a `MaskedVariant` containing a string summary of the JSON. If it - had attempted to edit a node outside its target Delta, the editor would have - thrown an error. -3. **Auditing:** The editor automatically appends a record to the node's - `IrMetadata.transformations` indicating that the `ToolMaskingProcessor` - applied a `MASKED` action. - -### Phase 4: Async Resolution - -1. **Completion:** The background pipeline finishes. The orchestrator fires a - `VariantReadyEvent` over the bus. -2. **Integration:** The `ContextManager` receives the event and securely - attaches the `MaskedVariant` to the correct `Episode` in the pristine graph. - (If the pipeline was synchronous/blocking, this happens immediately). - -### Phase 5: Just-In-Time Projection - -1. **Request:** The agent is ready to send the next prompt to Gemini. The core - routing logic calls `contextManager.projectCompressedHistory()`. -2. **Budget Evaluation:** The `IrProjector` calculates the current total tokens - of the pristine graph and compares it to the `SidecarConfig` budget. -3. **Variant Selection:** If the graph exceeds the budget, the projector looks - for available `Variant`s. It sees the newly attached `MaskedVariant` and - calculates the token deficit recovered by using it. -4. **Flattening:** The `graphUtils` safely swap the raw node for the - `MaskedVariant` in a temporary view, and flatten the Episodic IR back into a - raw Gemini `Content[]` array. -5. **Delivery:** The optimized, budget-compliant array is sent to the LLM. The - underlying pristine graph remains completely untouched and available for - future reference or alternative projections. diff --git a/packages/core/src/context/migration-plan.md b/packages/core/src/context/migration-plan.md deleted file mode 100644 index 6cdf27435a..0000000000 --- a/packages/core/src/context/migration-plan.md +++ /dev/null @@ -1,37 +0,0 @@ -# The Nodes of Theseus Migration Checklist - -- [x] **Phase 1: Core Types (`ir/types.ts`)** - - [x] Add `ConcreteNode` and `LogicalNode` types. - - [x] Add `episodeId` (or generic `parentId`) to all `ConcreteNode` - interfaces. - - [x] Add `replacesId` and `abstractsIds` pointers. - - [x] Remove `variants` dictionary from `IrNode`. - -- [x] **Phase 2: Processor Pipeline (`pipeline.ts`)** - - [x] Delete `EpisodeEditor`. - - [x] Define `ContextPatch`. - - [x] Update `ContextProcessor` signature to accept `ProcessArgs` and return - `Promise`. - -- [x] **Phase 3: The Reducer (`sidecar/orchestrator.ts`)** - - [x] Update `executePipeline` and `executeTriggerSync` to act as a reducer. - - [x] Map `ContextPatch` results onto the flat Nodes array. - -- [x] **Phase 4: Pristine Graph & Mapping (`contextManager.ts` & `ir/toIr.ts`)** - - [x] Update `toIr` to produce a flat list of `ConcreteNode`s and a tree of - `LogicalNode`s. - - [x] Make `ContextManager` track the Pristine Graph and instantiate the flat - Nodes. - - [x] Commit patches to the Pristine Graph history. - -- [x] **Phase 5: The Walker (`ir/projector.ts`)** - - [x] Update projection to simply walk the flat `ReadonlyArray`. - - [x] Skip nodes whose IDs are in a "skipped" set (based on `abstractsIds`). - -- [ ] **Phase 6: Refactoring Processors** - - [ ] `ToolMaskingProcessor` - - [ ] `NodeDistillationProcessor` - - [ ] `BlobDegradationProcessor` - - [ ] `HistoryTruncationProcessor` - - [ ] `NodeTruncationProcessor` - - [ ] `StateSnapshotProcessor` diff --git a/packages/core/src/context/oldContextManager.txt b/packages/core/src/context/oldContextManager.txt deleted file mode 100644 index b869441e87..0000000000 --- a/packages/core/src/context/oldContextManager.txt +++ /dev/null @@ -1,183 +0,0 @@ -export class ContextManager { - // The stateful, pristine Episodic Intermediate Representation graph. - // This allows the agent to remember and summarize continuously without losing data across turns. - private pristineEpisodes: Episode[] = []; - private readonly eventBus: ContextEventBus; - - // Internal sub-components - // Synchronous processors are instantiated but effectively used as singletons within this class - private orchestrator: PipelineOrchestrator; - private historyObserver?: HistoryObserver; - - static create( - sidecar: SidecarConfig, - env: ContextEnvironment, - tracer: ContextTracer, - orchestrator?: PipelineOrchestrator, - registry?: SidecarRegistry, - ): ContextManager { - if (!registry) { - registry = new SidecarRegistry(); - registerBuiltInProcessors(registry); - } - const orch = - orchestrator || - new PipelineOrchestrator(sidecar, env, env.eventBus, tracer, registry); - return new ContextManager(sidecar, env, tracer, orch); - } - - // Use ContextManager.create() instead - private constructor( - private sidecar: SidecarConfig, - private env: ContextEnvironment, - private readonly tracer: ContextTracer, - orchestrator: PipelineOrchestrator, - ) { - this.eventBus = env.eventBus; - this.orchestrator = orchestrator; - - this.eventBus.onPristineHistoryUpdated((event) => { - this.pristineEpisodes = event.episodes; - this.evaluateTriggers(event.newNodes); - }); - - this.eventBus.onVariantReady((event) => { - // Find the target episode in the pristine graph - const targetEp = this.pristineEpisodes.find( - (ep) => ep.id === event.targetId, - ); - if (targetEp) { - if (!targetEp.variants) { - targetEp.variants = {}; - } - targetEp.variants[event.variantId] = event.variant; - this.tracer.logEvent( - 'ContextManager', - `Received async variant [${event.variantId}] for Episode ${event.targetId}`, - ); - debugLogger.log( - `ContextManager: Received async variant [${event.variantId}] for Episode ${event.targetId}.`, - ); - } - }); - } - - /** - * Safely stops background workers and clears event listeners. - */ - shutdown() { - this.orchestrator.shutdown(); - if (this.historyObserver) { - this.historyObserver.stop(); - } - } - - /** - * Evaluates if the current working buffer exceeds configured budget thresholds, - * firing consolidation events if necessary. - */ - private evaluateTriggers(newNodes: Set) { - if (!this.sidecar.budget) return; - - const workingBuffer = this.getWorkingBufferView(); - const currentTokens = - this.env.tokenCalculator.calculateEpisodeListTokens(workingBuffer); - - this.tracer.logEvent('ContextManager', 'Evaluated triggers', { - currentTokens, - retainedTokens: this.sidecar.budget.retainedTokens, - }); - - // 1. Eager Compute Trigger (on_turn) - if (newNodes.size > 0) { - this.eventBus.emitChunkReceived({ episodes: this.pristineEpisodes, targetNodeIds: newNodes }); - } - - // 2. Budget Crossed Trigger - if (currentTokens > this.sidecar.budget.retainedTokens) { - const deficit = currentTokens - this.sidecar.budget.retainedTokens; - - // Calculate exactly which nodes aged out of the retainedTokens budget to form our target delta - const agedOutNodes = new Set(); - let rollingTokens = 0; - // Start from newest and count backwards - for (let i = workingBuffer.length - 1; i >= 0; i--) { - const ep = workingBuffer[i]; - const epTokens = this.env.tokenCalculator.calculateEpisodeListTokens([ep]); - rollingTokens += epTokens; - if (rollingTokens > this.sidecar.budget.retainedTokens) { - agedOutNodes.add(ep.id); - agedOutNodes.add(ep.trigger.id); - for (const step of ep.steps) agedOutNodes.add(step.id); - if (ep.yield) agedOutNodes.add(ep.yield.id); - } - } - - this.tracer.logEvent( - 'ContextManager', - 'Budget crossed. Emitting ConsolidationNeeded', - { deficit, agedOutCount: agedOutNodes.size }, - ); - this.eventBus.emitConsolidationNeeded({ - episodes: workingBuffer, - targetDeficit: deficit, - targetNodeIds: agedOutNodes, - }); - } - } - - /** - * Subscribes to the core AgentChatHistory to natively track all message events, - * converting them seamlessly into pristine Episodes. - */ - subscribeToHistory(chatHistory: AgentChatHistory) { - if (this.historyObserver) { - this.historyObserver.stop(); - } - - this.historyObserver = new HistoryObserver( - chatHistory, - this.eventBus, - this.tracer, - this.env.tokenCalculator, - ); - this.historyObserver.start(); - } - - /** - * Generates a computed view of the pristine log. - * Sweeps backwards (newest to oldest), tracking rolling tokens. - * When rollingTokens > retainedTokens, it injects the "best" available ready variant - * (snapshot > summary > masked) instead of the raw text. - * Handles N-to-1 variant skipping automatically. - */ - getWorkingBufferView(): Episode[] { - return generateWorkingBufferView( - this.pristineEpisodes, - this.sidecar.budget.retainedTokens, - this.tracer, - this.env, - ); - } - - /** - * Returns a temporary, compressed Content[] array to be used exclusively for the LLM request. - * This does NOT mutate the pristine episodic graph. - */ - async projectCompressedHistory(): Promise { - this.tracer.logEvent('ContextManager', 'Projection requested.'); - const protectedIds = new Set(); - if (this.pristineEpisodes.length > 0) { - protectedIds.add(this.pristineEpisodes[0].id); // Structural invariant - } - - return IrProjector.project( - this.getWorkingBufferView(), - this.orchestrator, - this.sidecar, - this.tracer, - this.env, - protectedIds, - ); - } -} diff --git a/packages/core/src/context/processors/nodeDistillationProcessor.ts b/packages/core/src/context/processors/nodeDistillationProcessor.ts index dedde91050..0d54a54ee9 100644 --- a/packages/core/src/context/processors/nodeDistillationProcessor.ts +++ b/packages/core/src/context/processors/nodeDistillationProcessor.ts @@ -55,7 +55,7 @@ export class NodeDistillationProcessor implements ContextProcessor { try { const response = await this.env.llmClient.generateContent({ role: LlmRole.UTILITY_COMPRESSOR, - modelConfigKey: { model: 'default' }, + modelConfigKey: { model: 'gemini-3-flash-base' }, promptId: this.env.promptId, abortSignal: new AbortController().signal, contents: [ diff --git a/packages/core/src/context/typed-context-ir.md b/packages/core/src/context/typed-context-ir.md deleted file mode 100644 index fc11091e2f..0000000000 --- a/packages/core/src/context/typed-context-ir.md +++ /dev/null @@ -1,146 +0,0 @@ -# Context Manager: The Pure Functional "Nodes of Theseus" IR - -This document outlines the architectural transition from the V0 Mutating Editor -pattern to the V1 Pure Functional, Immutable Episodic IR, designed to scale into -a multi-agent, async state transformation system. - -## 1. Core Philosophy: The Nodes of Theseus - -The primary constraint of deep immutable trees is the cascading cost of cloning -parent nodes when a leaf node changes. To solve this, we decouple the structural -hierarchy of the context from the actual data sent to the LLM. - -The IR is divided into two distinct domains: - -1. **Logical Nodes:** Structural boundaries that define the hierarchy (e.g., - `Task`, `Episode`). These nodes **do not render** to the LLM. They exist to - group related interactions and provide semantic meaning. -2. **Concrete Nodes:** The atomic, renderable pieces of data (e.g., - `UserPrompt`, `ToolExecution`, `Snapshot`, `RollingSummary`). These are the - actual "planks" of the nodes. - -Because Concrete Nodes carry a reference to their Logical Parent (e.g., -`episodeId`), they can be stored and processed as a **Flat List**. - -## 2. The Autonomous `ContextWorkingBuffer` - -The "Nodes" is no longer a dumb array; it is encapsulated in a rich -`ContextWorkingBuffer` entity. - -### Encapsulation of History - -The Buffer manages its own audit trail and lineage. If a processor needs the -pristine, unaltered data of a deeply compressed node (e.g., a Snapshotter -summarizing masked tools), it queries the Buffer directly: -`buffer.getPristineNode(id)` - -### Linear Temporal Progression (The Conveyor Belt) - -Processors do not vote or compete. Context degradation is a linear temporal -progression defined by triggers: - -1. **Frontbuffer Trim:** E.g., Tool Masking replaces raw tools immediately. -2. **Backbuffer Normalize:** E.g., Summarization replaces aging nodes in the - background. -3. **GC Backstop:** E.g., Truncation brutally destroys nodes only when the - absolute budget is breached. - -When a pipeline triggers, the Orchestrator runs its processors, gathers their -`ContextPatch`es, and applies them to the Buffer immediately. The state simply -advances. - -## 3. Type-Safe Async Coordination (The `ContextInbox`) - -To solve the async/sync barrier (where a slow background worker generates a -summary that a fast synchronous emergency backstop needs instantly), we -introduce the `ContextInbox`. - -This is a strictly-typed messaging system. A worker dispatches a -`SNAPSHOT_READY` message to the Inbox. The backstop peeks at the Inbox, -instantly retrieving the pre-computed summary and applying it. - -## 4. The Processor Contract - -Processors are purely functional map/filter operations. They evaluate a list of -unprotected targets and return the exact list of nodes they intend to -substitute. They do **not** generate manual `ContextPatch` objects or manage -`IrMetadata`. - -```typescript -export type InboxMessage = - | { type: 'SNAPSHOT_READY'; snapshot: Snapshot; abstractsIds: string[] } - | { type: 'BACKGROUND_SUMMARY'; summary: RollingSummary; targetId: string }; - -export interface ContextInbox { - dispatch(message: InboxMessage): void; - peek( - type: T, - ): Extract | undefined; -} - -export interface ContextWorkingBuffer { - /** The current active (projected) flat list of ConcreteNodes. */ - readonly nodes: ReadonlyArray; - - /** Retrieves the historical, pristine version of a node (before any masks/summaries). */ - getPristineNode(id: string): ConcreteNode | undefined; - - /** Retrieves the full audit lineage of a specific node ID. */ - getLineage(id: string): ReadonlyArray; -} - -export interface ProcessArgs { - /** The rich buffer containing current nodes and their history. */ - readonly buffer: ContextWorkingBuffer; - - /** - * The specific unprotected, mutable nodes the pipeline is allowed to operate on. - * The Orchestrator strictly filters out ANY protected nodes (like active tasks) before calling. - * Processors can assume all targets passed here are legally theirs to mutate or drop. - */ - readonly targets: ReadonlyArray; - - /** The token budget and accounting state. */ - readonly state: ContextAccountingState; - - /** Type-safe messaging system for async/sync coordination. */ - readonly inbox: ContextInbox; -} - -export interface ContextProcessor { - readonly id: string; - readonly name: string; - - /** - * A pure function. Returns the new state of the `targets`. - * If an ID from `targets` is missing in the return array, the Orchestrator deletes it. - * If a new synthetic node is in the return array, the Orchestrator inserts it. - * The Orchestrator automatically appends audit `IrMetadata` to any changes. - */ - process(args: ProcessArgs): Promise>; -} -``` - -## 5. The Node Taxonomy (`IrNodeType`) - -The `IrNodeType` union explicitly defines all valid nodes. Synthetic nodes (like -`Snapshot`) are first-class citizens. - -```typescript -export type IrNodeType = - // Logical Nodes - | 'TASK' - | 'EPISODE' - - // Organic Concrete Nodes - | 'USER_PROMPT' - | 'SYSTEM_EVENT' - | 'AGENT_THOUGHT' - | 'TOOL_EXECUTION' - | 'AGENT_YIELD' - - // Synthetic Concrete Nodes - | 'SNAPSHOT' - | 'ROLLING_SUMMARY' - | 'MASKED_TOOL'; -``` diff --git a/packages/core/src/context/utils/snapshotGenerator.ts b/packages/core/src/context/utils/snapshotGenerator.ts index 9d0a57a175..20865e9863 100644 --- a/packages/core/src/context/utils/snapshotGenerator.ts +++ b/packages/core/src/context/utils/snapshotGenerator.ts @@ -40,7 +40,7 @@ Output ONLY the raw factual snapshot, formatted compactly. Do not include markdo const response = await this.env.llmClient.generateContent({ role: LlmRole.UTILITY_STATE_SNAPSHOT_PROCESSOR, - modelConfigKey: { model: 'default' }, + modelConfigKey: { model: 'gemini-3-flash-base' }, contents: [{ role: 'user', parts: [{ text: userPromptText }] }], systemInstruction: { role: 'system', parts: [{ text: systemPrompt }] }, promptId: this.env.promptId, diff --git a/packages/core/src/telemetry/llmRole.ts b/packages/core/src/telemetry/llmRole.ts index 7d8f5d8df6..e2146755e0 100644 --- a/packages/core/src/telemetry/llmRole.ts +++ b/packages/core/src/telemetry/llmRole.ts @@ -16,5 +16,5 @@ export enum LlmRole { UTILITY_EDIT_CORRECTOR = 'utility_edit_corrector', UTILITY_AUTOCOMPLETE = 'utility_autocomplete', UTILITY_FAST_ACK_HELPER = 'utility_fast_ack_helper', - UTILITY_STATE_SNAPSHOT_PROCESSOR = 'utility_state_snapshot_processr', + UTILITY_STATE_SNAPSHOT_PROCESSOR = 'utility_state_snapshot_processor', }