delete old design docs

2026-05-16 06:43:07 -07:00 · 2026-04-09 19:36:23 +00:00
parent 17c9b4341a
commit 2a5b169179
7 changed files with 3 additions and 544 deletions
@@ -1,175 +0,0 @@
-# Context Manager V0: High-Level Design
-
-## 1. Introduction & Motivation
-
-This document provides a high-level orientation to the Context Management system
-within `@google/gemini-cli-core`.
-
-Previously, context management in the CLI was decentralized, synchronous, and
-relied on fixed-function, destructive mutations of the raw Gemini `Content[]`
-history. Because all context management was local, this approach made it nearly
-impossible to reason about the global impact of any specific change. For
-example, should we distill tool outputs, or mask them? Or maybe it's contextual?
-What about other processors like the snapshotter, should they see masked
-results? Distilled results? What about new approaches to context management, how
-do they fit into the solution we've already built? The old approach to context
-management made it challenging to even attempt to answer any one of these
-questions, let alone all of them.
-
-To address these issues, we went back to the drawing board to create an explicit
-Context Manager. As opposed to our old approach, the new Context Manager V0 is a
-robust, event-driven, pluggable system. It introduces a non-destructive Episodic
-Intermediate Representation (IR) and an asynchronous processing pipeline,
-allowing the CLI to run expensive LLM summarization tasks in the background and
-opportunistically project an optimized view of the history only when budget
-constraints require it.
-
----
-
-## 2. Chief Innovations & Salient Features
-
-The architecture is built upon eight core principles that distinguish it from
-the legacy system:
-
-1.  **Centralized Budgeting:** The `ContextManager` is the sole source of truth
-    for the token budget. It makes the final, just-in-time decision about what
-    gets projected to the LLM.
-2.  **Statelessness via IR:** Raw history is never mutated or deleted. Instead,
-    it is translated into an Intermediate Representation (IR). Context reduction
-    is achieved by attaching compressed `Variant`s to the IR graph. The original
-    text is always recoverable.
-3.  **Asynchronicity:** Designed around a `ContextEventBus`. Heavy context
-    operations (like LLM-powered summarization) run as detached background tasks
-    without blocking the main agent loop.
-4.  **Configurability:** Driven by a typed JSON "Sidecar" configuration. Token
-    ceilings, fallback strategies, and processing pipelines are entirely
-    data-driven.
-5.  **Pluggability:** `ContextProcessor`s are isolated plugins with typed
-    schemas. They are registered via Dependency Injection and can be arranged
-    into arbitrary pipelines.
-6.  **Debuggability:** A built-in `ContextTracer` tracks every step of the
-    pipeline, providing full audit trails of exactly when, why, and how a
-    message was altered.
-7.  **Testability:** Global state has been eliminated. The system uses strict
-    Dependency Injection (`SidecarRegistry`, `ContextEnvironment`,
-    `ContextEventBus`), making every layer easily unit-testable.
-8.  **Orthogonality via Targets:** Processors do not implicitly scan the entire
-    history graph. The `ContextManager` computes exact Deltas (e.g., new nodes
-    just added, or specific nodes that just aged out of the retained buffer).
-    Processors are sandboxed by the `EpisodeEditor` to only iterate over and
-    mutate these specific `targetNodes`, ensuring surgical and highly efficient
-    reductions.
-
----
-
-## 3. The Major Pieces: Roles & Responsibilities
-
-### The Brain: `ContextManager`
-
-The central coordinator. It owns the "Pristine History" (the ground-truth
-Episodic IR graph). Its primary responsibility is exposing
-`projectCompressedHistory()`, which flattens the IR graph into a standard
-`Content[]` array strictly adhering to the configured token budget.
-
-### The Data Model: Episodic Intermediate Representation (IR)
-
-Instead of a flat array of messages, interactions are grouped into `Episode`s.
-An Episode represents a single turn: a User Prompt, followed by the Agent's
-Thoughts and Tool Executions (Steps), concluding with a Yield.
-
-- **`IrNode`:** The base unit (e.g., `ToolExecution`, `AgentThought`).
-- **`Variant`:** Compressed alternatives to the raw node (e.g.,
-  `SummaryVariant`, `MaskedVariant`, `SnapshotVariant`).
-- **`IrMetadata`:** An audit trail attached to every node, tracking token counts
-  and the chronological list of `transformations` applied by processors.
-
-### The Engine: `PipelineOrchestrator` & Sidecar
-
-The orchestrator reads the `SidecarConfig`. It manages the lifecycle of the
-pipelines, registering triggers and executing processors in order. It dictates
-whether a pipeline blocks the main thread or runs in the background.
-
-### The Workers: `ContextProcessor`s
-
-Small, highly-focused classes that implement context reduction strategies. They
-do not mutate the graph directly; instead, they are given an `EpisodeEditor`
-which provides a safe, scoped API to attach `Variant`s and append metadata.
-
-- _Examples:_ `ToolMaskingProcessor`, `NodeDistillationProcessor`,
-  `BlobDegradationProcessor`.
-
-### The Glue: `ContextEventBus`
-
-A Pub/Sub bus that decouples the components. It enables the `HistoryObserver` to
-notify the system of new messages, and allows background processors to notify
-the `ContextManager` when a new compressed variant is ready to be used.
-
----
-
-## 4. How They Interact: The Life of a Message
-
-To understand how these pieces fit together, let's walk through the lifecycle of
-a single interaction as it moves through the context system.
-
-### Phase 1: Ingestion & Translation
-
-1.  **Action:** The user sends a prompt, and the agent responds with a tool
-    call. These raw messages are appended to the standard `AgentChatHistory`.
-2.  **Observation:** The `HistoryObserver` detects the new messages.
-3.  **Translation:** The observer passes the raw `Content[]` to the `IrMapper`.
-    The mapper groups the prompt and the tool execution into a single,
-    structured `Episode`.
-4.  **Registration:** The new `Episode` is added to the `ContextManager`'s
-    pristine graph.
-
-### Phase 2: Triggering the Pipelines
-
-1.  **Delta Generation:** The `ContextManager` receives the updated pristine
-    graph. It diffs it against the previous state and extracts a Delta—the exact
-    Set of new `IrNode` IDs.
-2.  **Event Emission:** The `ContextManager` fires a `ChunkReceivedEvent` (with
-    the Delta targets) over the `ContextEventBus`.
-3.  **Orchestration:** The `PipelineOrchestrator` hears the event and evaluates
-    its configured `PipelineDef`s. It finds a pipeline with the trigger
-    `on_turn`.
-4.  **Execution:** The Orchestrator creates an `EpisodeEditor` heavily sandboxed
-    to _only_ allow access to the targeted Delta nodes, and begins running the
-    processors in that pipeline sequentially.
-
-### Phase 3: Processing & Safe Editing
-
-1.  **Processing:** A processor (e.g., `ToolMaskingProcessor`) receives the
-    `EpisodeEditor`. It iterates over `editor.targets` (ignoring the rest of the
-    historical graph). It identifies a massive JSON payload in one of the new
-    tool executions.
-2.  **Editing:** Instead of deleting the JSON, it calls `editor.editEpisode()`.
-    It creates a `MaskedVariant` containing a string summary of the JSON. If it
-    had attempted to edit a node outside its target Delta, the editor would have
-    thrown an error.
-3.  **Auditing:** The editor automatically appends a record to the node's
-    `IrMetadata.transformations` indicating that the `ToolMaskingProcessor`
-    applied a `MASKED` action.
-
-### Phase 4: Async Resolution
-
-1.  **Completion:** The background pipeline finishes. The orchestrator fires a
-    `VariantReadyEvent` over the bus.
-2.  **Integration:** The `ContextManager` receives the event and securely
-    attaches the `MaskedVariant` to the correct `Episode` in the pristine graph.
-    (If the pipeline was synchronous/blocking, this happens immediately).
-
-### Phase 5: Just-In-Time Projection
-
-1.  **Request:** The agent is ready to send the next prompt to Gemini. The core
-    routing logic calls `contextManager.projectCompressedHistory()`.
-2.  **Budget Evaluation:** The `IrProjector` calculates the current total tokens
-    of the pristine graph and compares it to the `SidecarConfig` budget.
-3.  **Variant Selection:** If the graph exceeds the budget, the projector looks
-    for available `Variant`s. It sees the newly attached `MaskedVariant` and
-    calculates how much of the token deficit it would recover.
-4.  **Flattening:** The `graphUtils` safely swap the raw node for the
-    `MaskedVariant` in a temporary view, and flatten the Episodic IR back into a
-    raw Gemini `Content[]` array.
-5.  **Delivery:** The optimized, budget-compliant array is sent to the LLM. The
-    underlying pristine graph remains completely untouched and available for
-    future reference or alternative projections.
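The just-in-time projection described in Phase 5 of the deleted design doc can be illustrated with a small sketch. This is not the removed `IrProjector`; the `SketchNode` and `SketchVariant` shapes, the `projectView` function, and the "pick the smallest ready variant" rule are all assumptions made for illustration. It only shows the core idea that the projected view swaps in variants while the pristine nodes stay untouched.

```typescript
// Hypothetical, simplified shapes; the real IR types were richer.
interface SketchVariant {
  text: string;
  tokens: number;
}

interface SketchNode {
  id: string;
  text: string;
  tokens: number;
  // Attached by processors. The raw text above is never overwritten.
  variants: SketchVariant[];
}

// Walks newest to oldest. Nodes that still fit in the budget keep their raw
// text; nodes that would overflow it fall back to their smallest variant,
// if one is available.
function projectView(nodes: readonly SketchNode[], budget: number): string[] {
  const out: string[] = [];
  let used = 0;
  for (let i = nodes.length - 1; i >= 0; i--) {
    const node = nodes[i];
    let choice: { text: string; tokens: number } = node;
    if (used + node.tokens > budget && node.variants.length > 0) {
      choice = node.variants.reduce((a, b) => (b.tokens < a.tokens ? b : a));
    }
    used += choice.tokens;
    out.unshift(choice.text); // keep chronological order in the output
  }
  return out;
}

const pristine: SketchNode[] = [
  {
    id: 'n1',
    text: '<large tool output>',
    tokens: 900,
    variants: [{ text: '[masked: 900-token JSON]', tokens: 10 }],
  },
  { id: 'n2', text: 'user: what changed?', tokens: 20, variants: [] },
];

const projected = projectView(pristine, 100);
console.log(projected);
```

With a budget of 100 tokens, the older tool output is projected as its masked variant, while `pristine[0].text` still holds the original payload for any later, differently budgeted projection.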
@@ -1,37 +0,0 @@
-# The Nodes of Theseus Migration Checklist
-
-- [x] **Phase 1: Core Types (`ir/types.ts`)**
-  - [x] Add `ConcreteNode` and `LogicalNode` types.
-  - [x] Add `episodeId` (or generic `parentId`) to all `ConcreteNode`
-        interfaces.
-  - [x] Add `replacesId` and `abstractsIds` pointers.
-  - [x] Remove `variants` dictionary from `IrNode`.
-
-- [x] **Phase 2: Processor Pipeline (`pipeline.ts`)**
-  - [x] Delete `EpisodeEditor`.
-  - [x] Define `ContextPatch`.
-  - [x] Update `ContextProcessor` signature to accept `ProcessArgs` and return
-        `Promise<ContextPatch[]>`.
-
-- [x] **Phase 3: The Reducer (`sidecar/orchestrator.ts`)**
-  - [x] Update `executePipeline` and `executeTriggerSync` to act as a reducer.
-  - [x] Map `ContextPatch` results onto the flat Nodes array.
-
-- [x] **Phase 4: Pristine Graph & Mapping (`contextManager.ts` & `ir/toIr.ts`)**
-  - [x] Update `toIr` to produce a flat list of `ConcreteNode`s and a tree of
-        `LogicalNode`s.
-  - [x] Make `ContextManager` track the Pristine Graph and instantiate the flat
-        Nodes.
-  - [x] Commit patches to the Pristine Graph history.
-
-- [x] **Phase 5: The Walker (`ir/projector.ts`)**
-  - [x] Update projection to simply walk the flat `ReadonlyArray<ConcreteNode>`.
-  - [x] Skip nodes whose IDs are in a "skipped" set (based on `abstractsIds`).
-
-- [ ] **Phase 6: Refactoring Processors**
-  - [ ] `ToolMaskingProcessor`
-  - [ ] `NodeDistillationProcessor`
-  - [ ] `BlobDegradationProcessor`
-  - [ ] `HistoryTruncationProcessor`
-  - [ ] `NodeTruncationProcessor`
-  - [ ] `StateSnapshotProcessor`
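Phase 5 of the checklist describes the walker as a pass over the flat node list that skips anything in a "skipped" set. A minimal sketch of that idea follows. The `FlatNode` shape and `walkFlat` function are hypothetical, and treating `replacesId` the same way as `abstractsIds` is an assumption; the checklist only mentions `abstractsIds` for the skipped set.

```typescript
// Hypothetical minimal node; only the fields named in the checklist are modeled.
interface FlatNode {
  id: string;
  episodeId: string;
  text: string;
  replacesId?: string; // stands in for one earlier node (assumed to imply skipping it)
  abstractsIds?: string[]; // summarizes several earlier nodes
}

// First pass collects every ID that some other node supersedes.
// Second pass renders the remaining nodes in list order.
function walkFlat(nodes: ReadonlyArray<FlatNode>): string[] {
  const skipped = new Set<string>();
  for (const n of nodes) {
    if (n.replacesId) skipped.add(n.replacesId);
    for (const id of n.abstractsIds ?? []) skipped.add(id);
  }
  return nodes.filter((n) => !skipped.has(n.id)).map((n) => n.text);
}

const flat: FlatNode[] = [
  { id: 'a', episodeId: 'e1', text: 'prompt 1' },
  { id: 'b', episodeId: 'e1', text: 'tool output 1' },
  { id: 's', episodeId: 'e1', text: 'summary of episode e1', abstractsIds: ['a', 'b'] },
  { id: 'c', episodeId: 'e2', text: 'prompt 2' },
];

const rendered = walkFlat(flat);
console.log(rendered);
```

Because the abstracted nodes stay in the list and are only skipped at render time, dropping the summary node later would make `a` and `b` visible again without any restore step.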
@@ -1,183 +0,0 @@
-export class ContextManager {
-  // The stateful, pristine Episodic Intermediate Representation graph.
-  // This allows the agent to remember and summarize continuously without losing data across turns.
-  private pristineEpisodes: Episode[] = [];
-  private readonly eventBus: ContextEventBus;
-
-  // Internal sub-components
-  // Synchronous processors are instantiated but effectively used as singletons within this class
-  private orchestrator: PipelineOrchestrator;
-  private historyObserver?: HistoryObserver;
-
-  static create(
-    sidecar: SidecarConfig,
-    env: ContextEnvironment,
-    tracer: ContextTracer,
-    orchestrator?: PipelineOrchestrator,
-    registry?: SidecarRegistry,
-  ): ContextManager {
-    if (!registry) {
-      registry = new SidecarRegistry();
-      registerBuiltInProcessors(registry);
-    }
-    const orch =
-      orchestrator ||
-      new PipelineOrchestrator(sidecar, env, env.eventBus, tracer, registry);
-    return new ContextManager(sidecar, env, tracer, orch);
-  }
-
-  // Use ContextManager.create() instead
-  private constructor(
-    private sidecar: SidecarConfig,
-    private env: ContextEnvironment,
-    private readonly tracer: ContextTracer,
-    orchestrator: PipelineOrchestrator,
-  ) {
-    this.eventBus = env.eventBus;
-    this.orchestrator = orchestrator;
-
-    this.eventBus.onPristineHistoryUpdated((event) => {
-      this.pristineEpisodes = event.episodes;
-      this.evaluateTriggers(event.newNodes);
-    });
-
-    this.eventBus.onVariantReady((event) => {
-      // Find the target episode in the pristine graph
-      const targetEp = this.pristineEpisodes.find(
-        (ep) => ep.id === event.targetId,
-      );
-      if (targetEp) {
-        if (!targetEp.variants) {
-          targetEp.variants = {};
-        }
-        targetEp.variants[event.variantId] = event.variant;
-        this.tracer.logEvent(
-          'ContextManager',
-          `Received async variant [${event.variantId}] for Episode ${event.targetId}`,
-        );
-        debugLogger.log(
-          `ContextManager: Received async variant [${event.variantId}] for Episode ${event.targetId}.`,
-        );
-      }
-    });
-  }
-
-  /**
-   * Safely stops background workers and clears event listeners.
-   */
-  shutdown() {
-    this.orchestrator.shutdown();
-    if (this.historyObserver) {
-      this.historyObserver.stop();
-    }
-  }
-
-  /**
-   * Evaluates if the current working buffer exceeds configured budget thresholds,
-   * firing consolidation events if necessary.
-   */
-  private evaluateTriggers(newNodes: Set<string>) {
-    if (!this.sidecar.budget) return;
-
-    const workingBuffer = this.getWorkingBufferView();
-    const currentTokens =
-      this.env.tokenCalculator.calculateEpisodeListTokens(workingBuffer);
-
-    this.tracer.logEvent('ContextManager', 'Evaluated triggers', {
-      currentTokens,
-      retainedTokens: this.sidecar.budget.retainedTokens,
-    });
-
-    // 1. Eager Compute Trigger (on_turn)
-    if (newNodes.size > 0) {
-      this.eventBus.emitChunkReceived({ episodes: this.pristineEpisodes, targetNodeIds: newNodes });
-    }
-
-    // 2. Budget Crossed Trigger
-    if (currentTokens > this.sidecar.budget.retainedTokens) {
-      const deficit = currentTokens - this.sidecar.budget.retainedTokens;
-      
-      // Calculate exactly which nodes aged out of the retainedTokens budget to form our target delta
-      const agedOutNodes = new Set<string>();
-      let rollingTokens = 0;
-      // Start from newest and count backwards
-      for (let i = workingBuffer.length - 1; i >= 0; i--) {
-        const ep = workingBuffer[i];
-        const epTokens = this.env.tokenCalculator.calculateEpisodeListTokens([ep]);
-        rollingTokens += epTokens;
-        if (rollingTokens > this.sidecar.budget.retainedTokens) {
-          agedOutNodes.add(ep.id);
-          agedOutNodes.add(ep.trigger.id);
-          for (const step of ep.steps) agedOutNodes.add(step.id);
-          if (ep.yield) agedOutNodes.add(ep.yield.id);
-        }
-      }
-
-      this.tracer.logEvent(
-        'ContextManager',
-        'Budget crossed. Emitting ConsolidationNeeded',
-        { deficit, agedOutCount: agedOutNodes.size },
-      );
-      this.eventBus.emitConsolidationNeeded({
-        episodes: workingBuffer,
-        targetDeficit: deficit,
-        targetNodeIds: agedOutNodes,
-      });
-    }
-  }
-
-  /**
-   * Subscribes to the core AgentChatHistory to natively track all message events,
-   * converting them seamlessly into pristine Episodes.
-   */
-  subscribeToHistory(chatHistory: AgentChatHistory) {
-    if (this.historyObserver) {
-      this.historyObserver.stop();
-    }
-
-    this.historyObserver = new HistoryObserver(
-      chatHistory,
-      this.eventBus,
-      this.tracer,
-      this.env.tokenCalculator,
-    );
-    this.historyObserver.start();
-  }
-
-  /**
-   * Generates a computed view of the pristine log.
-   * Sweeps backwards (newest to oldest), tracking rolling tokens.
-   * When rollingTokens > retainedTokens, it injects the "best" available ready variant
-   * (snapshot > summary > masked) instead of the raw text.
-   * Handles N-to-1 variant skipping automatically.
-   */
-  getWorkingBufferView(): Episode[] {
-    return generateWorkingBufferView(
-      this.pristineEpisodes,
-      this.sidecar.budget.retainedTokens,
-      this.tracer,
-      this.env,
-    );
-  }
-
-  /**
-   * Returns a temporary, compressed Content[] array to be used exclusively for the LLM request.
-   * This does NOT mutate the pristine episodic graph.
-   */
-  async projectCompressedHistory(): Promise<Content[]> {
-    this.tracer.logEvent('ContextManager', 'Projection requested.');
-    const protectedIds = new Set<string>();
-    if (this.pristineEpisodes.length > 0) {
-      protectedIds.add(this.pristineEpisodes[0].id); // Structural invariant
-    }
-
-    return IrProjector.project(
-      this.getWorkingBufferView(),
-      this.orchestrator,
-      this.sidecar,
-      this.tracer,
-      this.env,
-      protectedIds,
-    );
-  }
-}
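The aged-out calculation inside the removed `evaluateTriggers` is easier to follow in isolation. The sketch below restates the same backwards sweep over a simplified, hypothetical `SizedEpisode` shape with precomputed token counts; the deleted code instead called `tokenCalculator` per episode and also collected the IDs of each episode's trigger, steps, and yield.

```typescript
// Hypothetical stand-in for an Episode with its token count already computed.
interface SizedEpisode {
  id: string;
  tokens: number;
}

// Sweeps newest to oldest, accumulating tokens. Every episode reached after the
// running total exceeds `retainedTokens` is considered aged out.
function findAgedOut(
  episodes: readonly SizedEpisode[],
  retainedTokens: number,
): string[] {
  const agedOut: string[] = [];
  let rolling = 0;
  for (let i = episodes.length - 1; i >= 0; i--) {
    rolling += episodes[i].tokens;
    if (rolling > retainedTokens) {
      agedOut.push(episodes[i].id);
    }
  }
  return agedOut.reverse(); // report oldest first
}

const sized: SizedEpisode[] = [
  { id: 'e1', tokens: 50 },
  { id: 'e2', tokens: 40 },
  { id: 'e3', tokens: 30 },
];

const agedOutIds = findAgedOut(sized, 60);
console.log(agedOutIds);
```

Note that the episode that straddles the boundary (`e2` here, which pushes the running total from 30 to 70) counts as aged out, which matches the comparison used in the deleted loop.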
@@ -55,7 +55,7 @@ export class NodeDistillationProcessor implements ContextProcessor {
    try {
      const response = await this.env.llmClient.generateContent({
        role: LlmRole.UTILITY_COMPRESSOR,
-        modelConfigKey: { model: 'default' },
+        modelConfigKey: { model: 'gemini-3-flash-base' },
        promptId: this.env.promptId,
        abortSignal: new AbortController().signal,
        contents: [
@@ -1,146 +0,0 @@
-# Context Manager: The Pure Functional "Nodes of Theseus" IR
-
-This document outlines the architectural transition from the V0 Mutating Editor
-pattern to the V1 Pure Functional, Immutable Episodic IR, designed to scale into
-a multi-agent, async state transformation system.
-
-## 1. Core Philosophy: The Nodes of Theseus
-
-The primary constraint of deep immutable trees is the cascading cost of cloning
-parent nodes when a leaf node changes. To solve this, we decouple the structural
-hierarchy of the context from the actual data sent to the LLM.
-
-The IR is divided into two distinct domains:
-
-1.  **Logical Nodes:** Structural boundaries that define the hierarchy (e.g.,
-    `Task`, `Episode`). These nodes **do not render** to the LLM. They exist to
-    group related interactions and provide semantic meaning.
-2.  **Concrete Nodes:** The atomic, renderable pieces of data (e.g.,
-    `UserPrompt`, `ToolExecution`, `Snapshot`, `RollingSummary`). These are the
-    actual "planks" of the nodes.
-
-Because Concrete Nodes carry a reference to their Logical Parent (e.g.,
-`episodeId`), they can be stored and processed as a **Flat List**.
-
-## 2. The Autonomous `ContextWorkingBuffer`
-
-The flat list of Concrete Nodes (the "Nodes") is no longer a dumb array; it is
-encapsulated in a rich `ContextWorkingBuffer` entity.
-
-### Encapsulation of History
-
-The Buffer manages its own audit trail and lineage. If a processor needs the
-pristine, unaltered data of a deeply compressed node (e.g., a Snapshotter
-summarizing masked tools), it queries the Buffer directly:
-`buffer.getPristineNode(id)`
-
-### Linear Temporal Progression (The Conveyor Belt)
-
-Processors do not vote or compete. Context degradation is a linear temporal
-progression defined by triggers:
-
-1.  **Frontbuffer Trim:** E.g., Tool Masking replaces raw tools immediately.
-2.  **Backbuffer Normalize:** E.g., Summarization replaces aging nodes in the
-    background.
-3.  **GC Backstop:** E.g., Truncation brutally destroys nodes only when the
-    absolute budget is breached.
-
-When a pipeline triggers, the Orchestrator runs its processors, gathers their
-`ContextPatch`es, and applies them to the Buffer immediately. The state simply
-advances.
-
-## 3. Type-Safe Async Coordination (The `ContextInbox`)
-
-To solve the async/sync barrier (where a slow background worker generates a
-summary that a fast synchronous emergency backstop needs instantly), we
-introduce the `ContextInbox`.
-
-This is a strictly-typed messaging system. A worker dispatches a
-`SNAPSHOT_READY` message to the Inbox. The backstop peeks at the Inbox,
-instantly retrieving the pre-computed summary and applying it.
-
-## 4. The Processor Contract
-
-Processors are purely functional map/filter operations. They evaluate a list of
-unprotected targets and return the exact list of nodes they intend to
-substitute. They do **not** generate manual `ContextPatch` objects or manage
-`IrMetadata`.
-
-```typescript
-export type InboxMessage =
-  | { type: 'SNAPSHOT_READY'; snapshot: Snapshot; abstractsIds: string[] }
-  | { type: 'BACKGROUND_SUMMARY'; summary: RollingSummary; targetId: string };
-
-export interface ContextInbox {
-  dispatch(message: InboxMessage): void;
-  peek<T extends InboxMessage['type']>(
-    type: T,
-  ): Extract<InboxMessage, { type: T }> | undefined;
-}
-
-export interface ContextWorkingBuffer {
-  /** The current active (projected) flat list of ConcreteNodes. */
-  readonly nodes: ReadonlyArray<ConcreteNode>;
-
-  /** Retrieves the historical, pristine version of a node (before any masks/summaries). */
-  getPristineNode(id: string): ConcreteNode | undefined;
-
-  /** Retrieves the full audit lineage of a specific node ID. */
-  getLineage(id: string): ReadonlyArray<ConcreteNode>;
-}
-
-export interface ProcessArgs {
-  /** The rich buffer containing current nodes and their history. */
-  readonly buffer: ContextWorkingBuffer;
-
-  /**
-   * The specific unprotected, mutable nodes the pipeline is allowed to operate on.
-   * The Orchestrator strictly filters out ANY protected nodes (like active tasks) before calling.
-   * Processors can assume all targets passed here are legally theirs to mutate or drop.
-   */
-  readonly targets: ReadonlyArray<ConcreteNode>;
-
-  /** The token budget and accounting state. */
-  readonly state: ContextAccountingState;
-
-  /** Type-safe messaging system for async/sync coordination. */
-  readonly inbox: ContextInbox;
-}
-
-export interface ContextProcessor {
-  readonly id: string;
-  readonly name: string;
-
-  /**
-   * A pure function. Returns the new state of the `targets`.
-   * If an ID from `targets` is missing in the return array, the Orchestrator deletes it.
-   * If a new synthetic node is in the return array, the Orchestrator inserts it.
-   * The Orchestrator automatically appends audit `IrMetadata` to any changes.
-   */
-  process(args: ProcessArgs): Promise<ReadonlyArray<ConcreteNode>>;
-}
-```
-
-## 5. The Node Taxonomy (`IrNodeType`)
-
-The `IrNodeType` union explicitly defines all valid nodes. Synthetic nodes (like
-`Snapshot`) are first-class citizens.
-
-```typescript
-export type IrNodeType =
-  // Logical Nodes
-  | 'TASK'
-  | 'EPISODE'
-
-  // Organic Concrete Nodes
-  | 'USER_PROMPT'
-  | 'SYSTEM_EVENT'
-  | 'AGENT_THOUGHT'
-  | 'TOOL_EXECUTION'
-  | 'AGENT_YIELD'
-
-  // Synthetic Concrete Nodes
-  | 'SNAPSHOT'
-  | 'ROLLING_SUMMARY'
-  | 'MASKED_TOOL';
-```
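The `ContextInbox` interface in the deleted doc has no implementation shown. One possible in-memory version is sketched below to illustrate how the typed `peek` narrows the message type. The `InMemoryInbox` class, the placeholder payload types, and the choice to return the most recent matching message without removing it are all assumptions; the doc does not specify these semantics.

```typescript
// Placeholder payloads; the real Snapshot and RollingSummary types are not shown.
type SketchSnapshot = { text: string };
type SketchSummary = { text: string };

type SketchMessage =
  | { type: 'SNAPSHOT_READY'; snapshot: SketchSnapshot; abstractsIds: string[] }
  | { type: 'BACKGROUND_SUMMARY'; summary: SketchSummary; targetId: string };

class InMemoryInbox {
  private readonly messages: SketchMessage[] = [];

  dispatch(message: SketchMessage): void {
    this.messages.push(message);
  }

  // Returns the most recent message of the given type and leaves it in place.
  peek<T extends SketchMessage['type']>(
    type: T,
  ): Extract<SketchMessage, { type: T }> | undefined {
    for (let i = this.messages.length - 1; i >= 0; i--) {
      const m = this.messages[i];
      if (m.type === type) {
        return m as Extract<SketchMessage, { type: T }>;
      }
    }
    return undefined;
  }
}

const inbox = new InMemoryInbox();
inbox.dispatch({
  type: 'SNAPSHOT_READY',
  snapshot: { text: 'state so far' },
  abstractsIds: ['a', 'b'],
});

// `ready` is typed as the SNAPSHOT_READY variant, so `.snapshot` needs no cast.
const ready = inbox.peek('SNAPSHOT_READY');
const noSummary = inbox.peek('BACKGROUND_SUMMARY');
console.log(ready?.snapshot.text, noSummary);
```

A synchronous backstop could call `peek('SNAPSHOT_READY')` and, when a message is present, use `abstractsIds` to decide which nodes the precomputed snapshot stands in for.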
@@ -40,7 +40,7 @@ Output ONLY the raw factual snapshot, formatted compactly. Do not include markdo

    const response = await this.env.llmClient.generateContent({
      role: LlmRole.UTILITY_STATE_SNAPSHOT_PROCESSOR,
-      modelConfigKey: { model: 'default' },
+      modelConfigKey: { model: 'gemini-3-flash-base' },
      contents: [{ role: 'user', parts: [{ text: userPromptText }] }],
      systemInstruction: { role: 'system', parts: [{ text: systemPrompt }] },
      promptId: this.env.promptId,
@@ -16,5 +16,5 @@ export enum LlmRole {
  UTILITY_EDIT_CORRECTOR = 'utility_edit_corrector',
  UTILITY_AUTOCOMPLETE = 'utility_autocomplete',
  UTILITY_FAST_ACK_HELPER = 'utility_fast_ack_helper',
-  UTILITY_STATE_SNAPSHOT_PROCESSOR = 'utility_state_snapshot_processr',
+  UTILITY_STATE_SNAPSHOT_PROCESSOR = 'utility_state_snapshot_processor',
 }