From 10009760b60b1bd02734d6d74f6bd63f3ea9e117 Mon Sep 17 00:00:00 2001
From: Your Name <joshualitt@google.com>
Date: Fri, 10 Apr 2026 16:54:19 +0000
Subject: [PATCH] cleanup docs for now

---
 docs/context-manager-async-mutations.md       |  52 ----
 .../src/context/context-manager-v0-design.md  | 243 ------------------
 2 files changed, 295 deletions(-)
 delete mode 100644 docs/context-manager-async-mutations.md
 delete mode 100644 packages/core/src/context/context-manager-v0-design.md

diff --git a/docs/context-manager-async-mutations.md b/docs/context-manager-async-mutations.md
deleted file mode 100644
index db4545b4b5..0000000000
--- a/docs/context-manager-async-mutations.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# Async Context Mutations (V1 Architecture)
-
-## The Problem
-
-In V0, the `ContextManager` processes LLM inputs sequentially and
-synchronously. Processors like `NodeTruncation` can safely mutate the graph
-because they hold an exclusive lock on the context state.
-
-However, operations like `StateSnapshotAsyncProcessor` take a long time to run
-(distilling thousands of tokens). If they run synchronously, they block the user
-from interacting with the agent. If they run asynchronously in the background,
-by the time they finish, the active context graph has likely moved on (new
-messages, tool calls, or other truncations have occurred).
-
-## The V1 Solution: Ancestral Replacement (Optimistic Concurrency)
-
-To allow async background pipelines to mutate the live context graph safely, we
-use an Optimistic Concurrency Control mechanism called **Ancestral
-Replacement**.
-
-### 1. Proof of Claim
-
-When an `AsyncContextProcessor` triggers, it is handed a `ProcessArgs`
-containing a snapshot of the graph (the targets it was asked to process). The
-processor records the specific IDs of the `ConcreteNode`s it is reading. This
-is its "Proof of Claim".
-
-### 2. Background Execution
-
-The processor runs in the background, completely detached from the live graph.
-It synthesizes a new state (e.g., a summarized snapshot node).
-
-### 3. The Commit Phase
-
-When the processor finishes, it returns its proposed mutations (an array of new
-`ConcreteNode`s that specify which old nodes they replace via the
-`replacesId` property).
-
-The Orchestrator then attempts to "rebase" this mutation into the live graph:
-
-1. It looks at the live graph.
-2. It checks: _Do all the original nodes (the Proof of Claim) still exist
-   unmodified in the live graph?_
-3. **If Yes (Clean Fast-Forward):** The orchestrator deletes the old nodes and
-   inserts the new synthesized nodes.
-4. **If No (Conflict):** At least one original node was deleted or modified by
-   another processor while the async task was running, so the orchestrator
-   rejects the async mutation entirely (or handles it via a conflict resolution
-   strategy).
-
-This guarantees that async pipelines can never corrupt the context state by
-overwriting newer information with stale data.
diff --git a/packages/core/src/context/context-manager-v0-design.md b/packages/core/src/context/context-manager-v0-design.md
deleted file mode 100644
index 9490d3b248..0000000000
--- a/packages/core/src/context/context-manager-v0-design.md
+++ /dev/null
@@ -1,243 +0,0 @@
-# Context Management Architecture: A Foundation for Scalable Context Exploration and Experimentation
-
-## 1. Executive Summary & Motivation
-
-As our agentic capabilities grow, the active context window becomes our most
-critical and constrained resource. Mismanaged context leads to broken caching,
-hallucination, and exorbitant token costs. Historically, context management has
-been decentralized, stateful, and highly coupled—making it dangerous to mutate
-and nearly impossible to safely experiment with.
-
-Our primary goal with this architecture is to establish a rigorous, structured
-model for context computation that guarantees:
-
-1. **Safety & Auditability:** Mutations happen in a predictable, auditable, and
-   recoverable way.
-2. **Asynchronous Safety:** Long-running LLM-driven graph analysis can execute
-   safely without blocking the user or creating race conditions.
-3. **Trivial Extensibility:** The system can be effortlessly augmented with new
-   compression strategies to scale our experiments.
-4. **A Universal Data-Plane:** Beyond simple text compression, the architecture
-   generalizes to safely structure _any_ computation in and around the context
-   (e.g., continuous reflection, semantic routing, long-term memory extraction).
-
-Because the ultimate "right answer" for context compression is unknown and
-constantly shifting, we have designed this architecture strictly around the
-**Open-Closed Principle**. We have "closed" the state-mutation engine to
-guarantee structural integrity, while leaving the behavioral logic entirely
-"open" for extension via decoupled primitives.
-
-To be clear: this is not an exercise in over-engineering. Rather, we are
-applying proven, boring, industry-standard software paradigms—specifically
-Functional Reactive Programming (FRP), the Actor Model, and the Open-Closed
-Principle—to tame the inherent complexity of managing an agent's context and
-prevent the system from collapsing under its own state.
-
-## 2. Embracing the Unknowns
-
-Before defining the architecture, we must acknowledge the fundamental realities
-of context management:
-
-- **The Trilemma (Quality vs. Cost vs. Caching):** There will never be one
-  single correct way to manage history. Aggressive summarization saves tokens
-  but breaks exact-string caching; retaining raw history maximizes caching but
-  blows up token budgets.
-- **The Precision / Recall Tension:** Context compression inherently damages
-  _recall_ (the ability to perfectly retrieve specific past details). However,
-  it protects _precision_ (preventing the LLM from being distracted by
-  irrelevant noise). We hypothesize that over time, active context management
-  will focus primarily on protecting precision, while recall will be better
-  protected in a targeted way via external memory, tasks, and planning systems.
-- **The Need for Rapid Experimentation:** Because these tradeoffs depend heavily
-  on the specific model, task, and user budget, we need a system that lets us
-  explore and express a wide range of options. We must be able to run a large
-  number of experiments—including in production—without risking catastrophic
-  state corruption.
-
-## 3. The Core Architectural Primitives
-
-To facilitate safe experimentation, we have separated the _execution_ of context
-mutations from the _logic_ of context mutations. This division reflects
-established, robust patterns.
-
-### The "Closed" Foundation: Synchronous Pipelines (Functional Reactive Programming)
-
-Drawing from the principles of **Functional Reactive Programming (FRP)**, the
-Context Working Buffer is treated as an immutable, ahead-of-time tracked graph.
-New versions of it can **only** be produced synchronously, via an
-event-triggered **Pipeline**: a linear list of functionally composed
-processors. By forcing all mutations through a synchronous, blocking pipeline
-of pure functions, we guarantee that the context is always modified in a
-predictable, deterministic sequence.
-
-### The "Open" Extensions: Processors, AsyncProcessors, and Inboxes (The Actor Model)
-
-To extend the system, developers work with three primitives:
-
-1. **Context Processors:** Pure, fast, synchronous functions that take an input
-   graph and return an immutable mutated graph. They run inside Pipelines.
-2. **Context AsyncProcessors:** Inspired by the **Actor Model**, these are
-   event-triggered background jobs designed for isolated, long-running async
-   computations (e.g., asking an LLM to distill 50 turns of history).
-3. **Inboxes:** Because the graph can only be mutated synchronously,
-   AsyncProcessors cannot touch the graph directly (preventing race conditions).
-   Instead, they drop their results via message-passing into point-in-time
-   snapshots called _Inboxes_. Processors later read from these Inboxes during a
-   synchronous pipeline run to safely apply the async processor's findings.
-
-## 4. Proofs of Construction
-
-To demonstrate why these primitives are perfectly suited to the problem,
-consider the following structural pseudocode.
-
-### Example A: Fast, Synchronous Sanitization (The Processor)
-
-_Scenario: We need to immediately truncate massive tool outputs before they blow
-out the context window._
-
-Because this requires no LLM calls, it is expressed as a simple **Processor**
-running in an ingestion pipeline.
-
-```typescript
-// A pure function that guarantees safe mutation
-class ToolMaskingProcessor implements ContextProcessor {
-  apply(graph: ContextGraph): ContextGraph {
-    const mutatedGraph = graph.clone();
-
-    for (const node of mutatedGraph.getNodes('TOOL_OUTPUT')) {
-      if (node.length > MAX_CHARS) {
-        // Safely replace the node, retaining structural lineage
-        mutatedGraph.replace(node, {
-          type: 'MASKED_TOOL',
-          text: `[Output truncated. Original size: ${node.length}]`,
-        });
-      }
-    }
-
-    return mutatedGraph;
-  }
-}
-```
-
-### Example B: Long-Running Summarization (AsyncProcessor + Inbox + Processor)
-
-_Scenario: The user has exceeded their token budget. We need to use an LLM to
-summarize the oldest 20 turns of conversation, but we cannot block the user from
-continuing to chat while the LLM generates the summary._
-
-This requires our async-to-sync bridge.
-
-**Step 1: The AsyncProcessor (Async Analysis)**
-
-```typescript
-class StateSnapshotAsyncProcessor implements ContextAsyncProcessor {
-  // Triggers automatically in the background when the budget is exceeded
-  async onBudgetExceeded(event: BudgetEvent, inbox: Inbox) {
-    const agedOutNodes = event.getAgedOutNodes();
-
-    // Slow, async LLM call
-    const summaryText = await llm.summarize(agedOutNodes);
-
-    // The AsyncProcessor CANNOT mutate the graph. It leaves a message in the Inbox.
-    inbox.deliver('SUMMARY_READY', {
-      targetNodes: agedOutNodes,
-      summary: summaryText,
-    });
-  }
-}
-```
-
-**Step 2: The Processor (Sync Application)**
-
-```typescript
-class StateSnapshotProcessor implements ContextProcessor {
-  // Runs fast and synchronously during the next Pipeline execution
-  apply(graph: ContextGraph, inbox: InboxSnapshot): ContextGraph {
-    // Check whether the background AsyncProcessor has finished its job
-    const messages = inbox.read('SUMMARY_READY');
-    if (messages.isEmpty()) return graph;
-
-    const mutatedGraph = graph.clone();
-
-    for (const msg of messages) {
-      // Safely swap the old nodes for the new summary
-      mutatedGraph.collapseNodes(msg.targetNodes, {
-        type: 'ROLLING_SUMMARY',
-        text: msg.summary,
-      });
-      inbox.markConsumed(msg);
-    }
-
-    return mutatedGraph;
-  }
-}
-```
-
-### Example C: Downstream Observation & Memory Extraction (The Ledger)
-
-_Scenario: We want to extract long-term memories from conversation turns
-immediately before they are permanently deleted from the active context window
-by a garbage collector (GC)._
-
-Because the Context Working Buffer immutably tracks its own mathematical deltas
-(the Audit Log) and computes structural lineage Ahead-Of-Time (AOT), downstream
-processors don't have to guess what happened; they can simply read the math.
-
-```typescript
-class MemoryExtractionProcessor implements ContextProcessor {
-  // Runs sequentially AFTER a Garbage Collection (GC) Processor
-  apply(graph: ContextGraph): ContextGraph {
-    // 1. Look at the immutable Audit Log to see what the previous step did
-    const latestMutation = graph.getAuditLog().latest();
-    if (latestMutation.processorId !== 'HistoryTruncationProcessor') {
-      return graph; // Nothing was GC'd, do nothing
-    }
-
-    // 2. Identify the exact pristine nodes that were permanently lost
-    const lostPristineNodes = new Set<ContextNode>();
-
-    for (const removedId of latestMutation.removedIds) {
-      // Because the graph tracks provenance Ahead-Of-Time (AOT),
-      // we perfectly resolve what original thoughts this synthetic node represented.
-      // (e.g. Deleting a ROLLING_SUMMARY implies losing the 3 original USER_PROMPTS it summarized).
-      const roots = graph.getPristineNodes(removedId);
-      roots.forEach((r) => lostPristineNodes.add(r));
-    }
-
-    // 3. Compare against the currently surviving graph to find the TRUE delta
-    // (Ensure the roots aren't still surviving inside some OTHER summary node)
-    for (const survivingNode of graph.getNodes()) {
-      const survivingRoots = graph.getPristineNodes(survivingNode.id);
-      survivingRoots.forEach((r) => lostPristineNodes.delete(r));
-    }
-
-    // 4. Dispatch the permanently lost nodes to the long-term memory subsystem
-    if (lostPristineNodes.size > 0) {
-      // Fire-and-forget async dispatch (Actor Model) to external DB
-      LongTermMemorySystem.dispatchForEmbedding(Array.from(lostPristineNodes));
-    }
-
-    // This processor is purely observational; it returns the graph unmutated
-    return graph;
-  }
-}
-```
-
-## 5. Conclusion & Future Evolution
-
-By treating the Context Graph as an immutable ledger updated only via functional
-Pipelines, we have eliminated race conditions and untraceable graph corruption.
-By utilizing AsyncProcessors and Inboxes, we have safely bridged the gap between
-slow LLM analysis and fast, synchronous terminal UI updates.
-
-We recognize this is not the final form: future iterations may require strict
-priority ordering for updates, or more advanced generational garbage collection.
-However, this architecture provides a solid, extensible foundation.
-
-More importantly, while this system was born from the need to manage tokens, its
-immutable ledger and reactive pipelines generalize to something far more
-profound. We have built a safe, predictable computational engine capable of
-driving the _entire_ agentic loop—from background reflection, to semantic
-routing, to long-term memory extraction. It empowers us to safely deploy,
-observe, and scale a wide array of strategies in pursuit of the optimal balance
-between token cost, caching efficiency, and agentic quality.