Add validated architectural notes

2026-06-15 22:07:29 -07:00 · 2026-04-05 09:13:57 -07:00
parent 137c4cd59c
commit 9f3a154014
6 changed files with 2294 additions and 0 deletions
@@ -0,0 +1,529 @@
+# ADK-TS Alignment Pass
+
+Every interface in our outline must map cleanly to ADK-TS. This document
+verifies that mapping field-by-field, identifies gaps, and confirms that the
+human-in-the-loop (HITL), plugin, and transfer patterns work.
+
+Source: ADK-TS v0.4.0 at `/Users/adamfweidman/Desktop/adk-int/adk-js/core/src/`
+
+---
+
+## 1. AgentDescriptor ↔ ADK Agent Hierarchy
+
+### Field-by-field mapping
+
+| AgentDescriptor field        | ADK-TS source                                | Notes                                                                                                                                                    |
+| ---------------------------- | -------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name`                       | `BaseAgent.name`                             | Direct. ADK validates it's a valid JS identifier.                                                                                                        |
+| `displayName`                | —                                            | ADK doesn't have this. No conflict.                                                                                                                      |
+| `description`                | `BaseAgent.description` (optional in ADK)    | Direct. Used for model routing in AgentTool.                                                                                                             |
+| `executor`                   | —                                            | New concept. ADK agents are always 'adk'. Adapter sets this.                                                                                             |
+| `inputSchema`                | `LlmAgent.inputSchema` (Zod or JSON Schema)  | Direct. ADK's AgentTool uses this for tool parameter generation.                                                                                         |
+| `outputSchema`               | `LlmAgent.outputSchema` (Zod or JSON Schema) | Direct. ADK uses for structured output + AgentTool response.                                                                                             |
+| `capabilities`               | —                                            | New concept. Adapter infers from agent type: LlmAgent gets `['elicitation', 'streaming', 'host_tool_execution']`, LoopAgent gets `['composition']`, etc. |
+| `ownTools`                   | `LlmAgent.tools: ToolUnion[]`                | Maps via ToolDescriptor adapter. ADK tools have `name`, `description`, `_getDeclaration()` which returns JSON Schema.                                    |
+| `requiredTools`              | —                                            | New concept. ADK agents don't declare required host tools. Adapter can infer from tool references.                                                       |
+| `subAgents`                  | `BaseAgent.subAgents: BaseAgent[]`           | Recursive. Each sub-agent becomes a nested AgentDescriptor.                                                                                              |
+| `constraints.maxTurns`       | `RunConfig.maxLlmCalls` (default 500)        | Maps, though semantics differ slightly (LLM calls vs turns).                                                                                             |
+| `constraints.maxTimeMinutes` | —                                            | ADK doesn't have time limits. No conflict — host enforces.                                                                                               |
+| `constraints.maxBudgetUsd`   | —                                            | ADK doesn't have budget. No conflict — host enforces.                                                                                                    |
+| `metadata`                   | —                                            | New concept. Adapter can populate from agent registration context.                                                                                       |
+
+### ADK-specific fields NOT in AgentDescriptor
+
+| ADK field                           | Where it lives | Our approach                                                                                               |
+| ----------------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------- |
+| `instruction` / `globalInstruction` | LlmAgent       | Executor-internal. Not in descriptor (it's runtime config, not identity).                                  |
+| `model`                             | LlmAgent       | Goes in ExecutionOptions.model or executor-internal config.                                                |
+| `generateContentConfig`             | LlmAgent       | Executor-internal.                                                                                         |
+| `disallowTransferToParent/Peers`    | LlmAgent       | Could be `constraints` or `_meta`. Transfer policy is host-enforced.                                       |
+| `includeContents`                   | LlmAgent       | Executor-internal (context management).                                                                    |
+| `outputKey`                         | LlmAgent       | Executor-internal (state management).                                                                      |
+| `beforeModelCallback`, etc.         | LlmAgent       | Executor-internal. These are ADK's callback system — our LifecycleInterceptor is the interface equivalent. |
+
+### Verdict: CLEAN MAPPING
+
+AgentDescriptor captures everything needed to describe an ADK agent externally.
+ADK-specific runtime config (instruction, model, callbacks) stays inside the
+executor — exactly right for the descriptor/executor separation.
+
+**Key ADK pattern preserved:** AgentTool wraps an agent as a tool using
+`inputSchema` for parameters and `description` for the tool description. Our
+AgentDescriptor has both, so SubagentTool can do the same thing.
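+
+As a minimal sketch of the descriptor side of this mapping, the adapter below
+builds an `AgentDescriptor` from an ADK-like agent tree. `AdkAgentLike`, its
+`kind` tag, `inferCapabilities`, and `toDescriptor` are hypothetical names used
+for illustration, not ADK-TS APIs. Treating every non-LLM agent as
+`['composition']` is an assumption that extends the LoopAgent example in the
+table.
+
```typescript
// Hypothetical structural stand-ins; these are not the real ADK-TS classes.
type AgentKind = 'llm' | 'loop' | 'sequential' | 'parallel';

interface AdkAgentLike {
  kind: AgentKind; // assumption: the adapter can tell agent types apart
  name: string;
  description?: string;
  inputSchema?: unknown;
  outputSchema?: unknown;
  subAgents: AdkAgentLike[];
}

interface AgentDescriptor {
  name: string;
  description: string;
  executor: 'adk';
  inputSchema?: unknown;
  outputSchema?: unknown;
  capabilities: string[];
  subAgents: AgentDescriptor[];
  constraints: { maxTurns: number };
}

// Capability inference as described in the `capabilities` row.
function inferCapabilities(kind: AgentKind): string[] {
  return kind === 'llm'
    ? ['elicitation', 'streaming', 'host_tool_execution']
    : ['composition']; // assumption for all workflow agents
}

// maxLlmCalls mirrors RunConfig.maxLlmCalls (default 500 per the table).
function toDescriptor(agent: AdkAgentLike, maxLlmCalls = 500): AgentDescriptor {
  return {
    name: agent.name,
    description: agent.description ?? '',
    executor: 'adk', // ADK agents are always 'adk'; the adapter sets this
    inputSchema: agent.inputSchema,
    outputSchema: agent.outputSchema,
    capabilities: inferCapabilities(agent.kind),
    // Recursive: each sub-agent becomes a nested descriptor.
    subAgents: agent.subAgents.map((child) => toDescriptor(child, maxLlmCalls)),
    constraints: { maxTurns: maxLlmCalls },
  };
}
```
+
+Instruction, model, and callbacks are deliberately absent from the output: they
+stay inside the executor.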
+
+---
+
+## 2. AgentSession ↔ ADK Runner
+
+### Method mapping
+
+| AgentSession method     | ADK-TS equivalent                                               | How adapter works                                                                                                                                                   |
+| ----------------------- | --------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `stream(data, options)` | `Runner.runAsync({ userId, sessionId, newMessage, runConfig })` | Adapter creates/loads session, maps data+options → runAsync params, wraps Event generator → AgentEvent generator. Each `stream()` call triggers a new `runAsync()`. |
+| `update(config)`        | No direct equivalent                                            | ADK doesn't support mid-stream config changes. Adapter queues updates for next `runAsync()` call.                                                                   |
+| `steer(data)`           | No direct equivalent                                            | ADK doesn't support mid-stream intervention. Adapter can queue for next invocation or ignore.                                                                       |
+| `abort()`               | No direct equivalent                                            | ADK uses `invocationContext.endInvocation = true`. Adapter sets this flag. Could also use AbortController.                                                          |
+
+### ExecutionRequest → Runner.runAsync mapping
+
+| ExecutionRequest field      | ADK mapping                                                      |
+| --------------------------- | ---------------------------------------------------------------- |
+| `descriptor`                | Used to find/create the BaseAgent instance                       |
+| `input`                     | → `newMessage: Content` (converted from ContentPart[] → Content) |
+| `sessionRef`                | → `sessionId` (string) or creates session from SessionSnapshot   |
+| `forkSession`               | Adapter clones session before running                            |
+| `options.tools`             | → merged into agent's `tools` config                             |
+| `options.model`             | → `LlmAgent.model` override                                      |
+| `options.hostToolExecution` | → `RunConfig.pauseOnToolCalls: true`                             |
+| `options.streaming`         | → `RunConfig.streamingMode`                                      |
+| `options.permissionMode`    | → SecurityPlugin config                                          |
+| `signal`                    | → wired to `invocationContext.endInvocation`                     |
+
+### HITL: How pauseOnToolCalls works end-to-end
+
+This is the critical path. Here's the full flow:
+
+```
+1. LLM returns tool call (FunctionCall in Event)
+2. ADK checks RunConfig.pauseOnToolCalls === true
+3. ADK sets invocationContext.endInvocation = true
+4. ADK yields the Event (with FunctionCall) and stops
+5. Runner.runAsync() generator completes
+
+   --- OUR INTERFACE BOUNDARY ---
+
+6. Adapter translates ADK Event → ToolRequestEvent
+7. Host receives ToolRequestEvent from session.stream() generator
+8. Host runs policy check (PolicyEvaluator.evaluate())
+9. Host fires hooks (LifecycleInterceptor.fire('before_tool', ...))
+10. If policy allows → Host executes tool → gets ToolResultData
+11. Host calls session.stream({ kind: 'tool_result', ... }) to get next stream
+
+   --- BACK INTO ADK ---
+
+12. Adapter receives tool result
+13. Adapter creates FunctionResponse Content
+14. Adapter calls Runner.runAsync() again with FunctionResponse as newMessage
+15. ADK loads session (has prior tool call event)
+16. ADK resumes agent with tool response
+17. Loop continues from step 1
+```
+
+**Why this works:** ADK's `pauseOnToolCalls` was designed for exactly this
+pattern: external tool execution by a host. The adapter translates between
+ADK's "end invocation + resume with FunctionResponse" pattern and our
+"ToolRequestEvent + `stream({ kind: 'tool_result' })`" pattern.
+
+**Key insight:** Each `session.stream()` call triggers a new `Runner.runAsync()`
+call. This means each ADK "invocation" maps to one `stream()` call. The session
+persists state across invocations. Mid-stream `update()` and `steer()` calls are
+queued for the next invocation since ADK doesn't support mid-turn changes.
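+
+The host side of this cycle (steps 7 to 11) can be sketched as a loop that
+keeps calling `stream()` until no tool request is pending. This is a
+simplification under stated assumptions: the real `stream()` is an async
+generator, while the sketch uses plain arrays; the event and input shapes,
+`runTurn`, `isAllowed`, and `HostTool` are hypothetical; and it assumes at most
+one tool request per invocation.
+
```typescript
// Hypothetical, simplified shapes for illustration only.
type AgentEvent =
  | { type: 'message'; text: string }
  | { type: 'tool_request'; requestId: string; name: string; args: Record<string, unknown> }
  | { type: 'stream_end'; reason: string };

type StreamInput =
  | { kind: 'message'; text: string }
  | { kind: 'tool_result'; requestId: string; llmContent: string; isError: boolean };

interface AgentSession {
  stream(input: StreamInput): AgentEvent[];
}

type HostTool = (args: Record<string, unknown>) => string;

function runTurn(
  session: AgentSession,
  prompt: string,
  isAllowed: (toolName: string) => boolean,
  tools: Record<string, HostTool>,
): string[] {
  const transcript: string[] = [];
  let next: StreamInput | undefined = { kind: 'message', text: prompt };
  while (next !== undefined) {
    const input: StreamInput = next;
    next = undefined;
    // Each stream() call stands for one Runner.runAsync() invocation.
    for (const event of session.stream(input)) {
      if (event.type === 'message') transcript.push(event.text);
      if (event.type !== 'tool_request') continue;
      // Policy check, host-side execution, then resume with the result.
      const tool = isAllowed(event.name) ? tools[event.name] : undefined;
      next = {
        kind: 'tool_result',
        requestId: event.requestId,
        llmContent: tool ? tool(event.args) : `Denied: ${event.name}`,
        isError: tool === undefined,
      };
    }
  }
  return transcript;
}
```
+
+A denied call is still reported back as a `tool_result` with `isError: true`,
+so the agent can react instead of stalling on an unanswered tool call.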
+
+### HITL: ToolConfirmation flow
+
+ADK also has a separate ToolConfirmation pattern (via
+`context.requestConfirmation()`):
+
+```
+1. beforeToolCallback calls context.requestConfirmation({ hint: '...' })
+2. This sets eventActions.requestedToolConfirmations[functionCallId]
+3. ADK yields event with requestedToolConfirmations populated
+4. Runner completes (invocation ends)
+
+   --- OUR INTERFACE BOUNDARY ---
+
+5. Adapter sees requestedToolConfirmations in event
+6. Adapter translates → ElicitationRequest { kind: 'tool_confirmation', ... }
+7. Host renders confirmation UI
+8. User responds → ElicitationResponse { action: 'accept' | 'decline' }
+
+   --- BACK INTO ADK ---
+
+9. Adapter receives elicitation response
+10. If accepted: Adapter creates FunctionResponse with confirmed=true
+11. Calls Runner.runAsync() with FunctionResponse
+12. ADK's SecurityPlugin or callback reads confirmation from session
+13. Tool executes
+```
+
+**Maps to our ElicitationRequest:** ADK's `ToolConfirmation.hint` →
+`ElicitationRequest.message`. ADK's `ToolConfirmation.payload` →
+`ElicitationRequest.context`. The `kind: 'tool_confirmation'` is the
+discriminator.
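+
+A minimal sketch of that translation (steps 5 and 6), assuming
+`requestedToolConfirmations` is a plain object keyed by function call ID. The
+`ToolConfirmation` and `ElicitationRequest` shapes, the `requestId` field, and
+the fallback message are hypothetical:
+
```typescript
// Hypothetical shapes based on the field names in the flow above.
interface ToolConfirmation {
  hint?: string;
  payload?: unknown;
}

interface ElicitationRequest {
  kind: 'tool_confirmation'; // discriminator
  requestId: string; // the ADK function call ID
  message: string;
  context?: unknown;
}

function toElicitations(
  requested: Record<string, ToolConfirmation>,
): ElicitationRequest[] {
  return Object.keys(requested).map((functionCallId): ElicitationRequest => {
    const confirmation = requested[functionCallId] ?? {};
    return {
      kind: 'tool_confirmation',
      requestId: functionCallId,
      // hint -> message, with a generic fallback (an assumption of this sketch)
      message: confirmation.hint ?? 'Confirm tool call?',
      // payload -> context
      context: confirmation.payload,
    };
  });
}
```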
+
+### HITL: Auth request flow
+
+```
+1. Tool or callback calls context.requestCredential(authConfig)
+2. Sets eventActions.requestedAuthConfigs[functionCallId]
+3. Event yields, invocation ends
+
+   --- OUR INTERFACE BOUNDARY ---
+
+4. Adapter sees requestedAuthConfigs
+5. Translates → ElicitationRequest { kind: 'auth_required', context: authConfig }
+6. User provides credentials
+7. ElicitationResponse { action: 'accept', content: { credential: ... } }
+
+   --- BACK INTO ADK ---
+
+8. Adapter stores credential via CredentialService
+9. Calls Runner.runAsync() again
+10. Tool calls context.getAuthResponse() → gets credential
+```
+
+**Maps to our ElicitationRequest:** ADK's auth pattern is just another
+elicitation kind. This validates our generic elicitation design — it handles
+tool confirmation, auth, and any future interaction type.
+
+---
+
+## 3. AgentEvent ↔ ADK Event
+
+### Event type mapping
+
+| Our AgentEvent        | ADK Event pattern                                                                 | Adapter translation                           |
+| --------------------- | --------------------------------------------------------------------------------- | --------------------------------------------- |
+| `InitializeEvent`     | First event from Runner.runAsync()                                                | Adapter emits on first stream() call          |
+| `SessionUpdateEvent`  | `eventActions.stateDelta`                                                         | Adapter emits when stateDelta is non-empty    |
+| `MessageEvent`        | `event.content` with text Parts                                                   | Filter text/thought parts from Content        |
+| `ToolRequestEvent`    | `getFunctionCalls(event)` returns FunctionCall[]                                  | Each FunctionCall → one ToolRequestEvent      |
+| `ToolUpdateEvent`     | `event.longRunningToolIds`                                                        | Adapter emits progress for long-running tools |
+| `ToolResponseEvent`   | `getFunctionResponses(event)` returns FunctionResponse[]                          | Each FunctionResponse → one ToolResponseEvent |
+| `ElicitationRequest`  | `eventActions.requestedToolConfirmations` or `requestedAuthConfigs`               | Map to generic elicitation                    |
+| `ElicitationResponse` | User input → FunctionResponse in next runAsync call                               | Reverse of above                              |
+| `UsageEvent`          | `event.usageMetadata` (GenerateContentResponseUsageMetadata)                      | Map token counts                              |
+| `ErrorEvent`          | `event.errorCode` + `event.errorMessage`                                          | Map error fields                              |
+| `stream_end`          | `isFinalResponse(event)`, `eventActions.transferToAgent`, `eventActions.escalate` | Derive `stream_end` reason from ADK signals   |
+| `CustomEvent`         | `event.customMetadata`                                                            | Pass through                                  |
+
+### ADK EventActions → Our events
+
+| EventActions field           | Our event                                                                 | Notes                                                                                                        |
+| ---------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
+| `stateDelta`                 | SessionUpdate or embedded in other events                                 | Delta state is a core ADK pattern                                                                            |
+| `artifactDelta`              | `CustomEvent { kind: 'artifact_delta' }`                                  | Artifacts not in our core events                                                                             |
+| `transferToAgent`            | Tool call (`transfer_to_agent`) + `stream_end` `reason: 'completed'`      | Handoff is a tool call. Host intercepts the tool request, mediates the handoff, originating agent completes. |
+| `escalate`                   | `stream_end` `reason: 'completed'` with `data: { escalateReason: '...' }` | LoopAgent exit signal. ADK's escalate = "I'm done, pass control back up"                                     |
+| `requestedToolConfirmations` | `ElicitationRequest { kind: 'tool_confirmation' }`                        | Per function call ID                                                                                         |
+| `requestedAuthConfigs`       | `ElicitationRequest { kind: 'auth_required' }`                            | Per function call ID                                                                                         |
+| `skipSummarization`          | `_meta: { skipSummarization: true }`                                      | ADK-specific, goes in metadata                                                                               |
+
+### AgentEventBase mapping
+
+| AgentEventBase field | ADK Event field                          | Notes                                             |
+| -------------------- | ---------------------------------------- | ------------------------------------------------- |
+| `id`                 | `event.id`                               | Direct                                            |
+| `timestamp`          | `event.timestamp` (number)               | Convert to ISO 8601 string                        |
+| `type`               | Derived from content analysis            | ADK doesn't have event types — adapter classifies |
+| `agentId`            | `event.author` (agent name) or context   | **New field** — which agent emitted this event    |
+| `threadId`           | `event.branch` (e.g., "agent_1.agent_2") | Direct mapping                                    |
+| `source`             | `event.author` ("user" or agent name)    | Direct                                            |
+| `_meta`              | `event.customMetadata`                   | Direct                                            |
+
+### Verdict: CLEAN MAPPING
+
+Every ADK event pattern maps to our event types. The adapter classifies ADK's
+untyped events into our typed event taxonomy. Key insight: ADK events are richer
+(they carry EventActions, function calls, auth requests all in one event), so
+the adapter may fan out one ADK Event into multiple AgentEvents (e.g., one
+Message + one ToolRequest + one ElicitationRequest). The new `agentId` field
+maps directly from ADK's `event.author`.
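+
+The fan-out can be sketched as a classifier that inspects one ADK-like event
+and emits zero or more typed events. `AdkEventLike` is a trimmed-down,
+hypothetical stand-in for the ADK Event; the sketch assumes `timestamp` is
+epoch milliseconds and suffixes each derived `id` so that fanned-out events
+stay unique, neither of which is confirmed by the tables above.
+
```typescript
// Hypothetical, trimmed-down ADK event; field names follow the tables above.
interface AdkEventLike {
  id: string;
  timestamp: number; // assumption: epoch milliseconds
  author: string;
  branch?: string;
  text?: string;
  functionCalls?: { id: string; name: string }[];
  requestedToolConfirmations?: Record<string, { hint?: string }>;
}

interface AgentEventBase {
  id: string;
  timestamp: string; // ISO 8601
  agentId: string;
  threadId: string | undefined;
}

type AgentEvent = AgentEventBase &
  (
    | { type: 'message'; text: string }
    | { type: 'tool_request'; requestId: string; name: string }
    | { type: 'elicitation_request'; kind: 'tool_confirmation'; requestId: string }
  );

function fanOut(source: AdkEventLike): AgentEvent[] {
  const base = (suffix: string): AgentEventBase => ({
    id: `${source.id}:${suffix}`,
    timestamp: new Date(source.timestamp).toISOString(),
    agentId: source.author,
    threadId: source.branch,
  });
  const out: AgentEvent[] = [];
  if (source.text) {
    out.push({ ...base('message'), type: 'message', text: source.text });
  }
  for (const call of source.functionCalls ?? []) {
    out.push({ ...base(`tool:${call.id}`), type: 'tool_request', requestId: call.id, name: call.name });
  }
  for (const callId of Object.keys(source.requestedToolConfirmations ?? {})) {
    out.push({
      ...base(`confirm:${callId}`),
      type: 'elicitation_request',
      kind: 'tool_confirmation',
      requestId: callId,
    });
  }
  return out;
}
```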
+
+---
+
+## 4. ToolContract ↔ ADK Tool System
+
+### ToolDescriptor ↔ BaseTool
+
+| ToolDescriptor field      | ADK source                                                    | Notes                             |
+| ------------------------- | ------------------------------------------------------------- | --------------------------------- |
+| `name`                    | `BaseTool.name`                                               | Direct                            |
+| `displayName`             | —                                                             | ADK doesn't have this             |
+| `description`             | `BaseTool.description`                                        | Direct                            |
+| `parametersSchema`        | `BaseTool._getDeclaration()` → FunctionDeclaration.parameters | JSON Schema from declaration      |
+| `annotations.readOnly`    | Inferred from tool type                                       | FunctionTool with no side effects |
+| `annotations.longRunning` | `BaseTool.isLongRunning`                                      | Direct                            |
+
+### ToolCallRequest ↔ FunctionCall
+
+| ToolCallRequest | ADK FunctionCall    | Notes  |
+| --------------- | ------------------- | ------ |
+| `requestId`     | `functionCall.id`   | Direct |
+| `name`          | `functionCall.name` | Direct |
+| `args`          | `functionCall.args` | Direct |
+
+### ToolResultData ↔ FunctionResponse + tool return
+
+| ToolResultData   | ADK                            | Notes                                            |
+| ---------------- | ------------------------------ | ------------------------------------------------ |
+| `llmContent`     | `FunctionResponse.response`    | Adapter wraps into ContentPart[]                 |
+| `displayContent` | —                              | ADK doesn't separate display from model content  |
+| `isError`        | Error thrown from `runAsync()` | Adapter catches and sets flag                    |
+| `tailCalls`      | —                              | ADK doesn't have tail calls (gemini-cli concept) |
+
+### AgentTool pattern
+
+ADK's `AgentTool` wraps a `BaseAgent` as a `BaseTool`:
+
+- Uses `agent.inputSchema` for tool parameters
+- Uses `agent.description` for tool description
+- Creates internal Runner with isolated session
+- Returns agent output as tool result
+- Merges state deltas back to parent
+
+**Our equivalent:** `SubagentTool` wraps `AgentDescriptor` as a tool:
+
+- Uses `descriptor.inputSchema` for tool parameters
+- Uses `descriptor.description` for tool description
+- Creates executor via `SessionFactory.create(descriptor, context)`
+- Returns execution result as tool result
+
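+**Mapping is near 1:1.** ADK works with concrete agent instances; we work with
+descriptors plus a factory. The one step without a listed counterpart is ADK's
+merge of state deltas back to the parent, which `SubagentTool` would need to
+handle explicitly if parent-visible state matters.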
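+
+A minimal sketch of the `SubagentTool` idea, written as a factory function. The
+`run()` method on the created session, the `parentSessionId` context field, and
+the empty-object schema fallback are invented for this example; only the use of
+`inputSchema`, `description`, and `SessionFactory.create(descriptor, context)`
+comes from the lists above.
+
```typescript
// Hypothetical shapes; only the fields named in the lists above.
interface AgentDescriptor {
  name: string;
  description: string;
  inputSchema?: Record<string, unknown>;
}

interface ToolDescriptor {
  name: string;
  description: string;
  parametersSchema: Record<string, unknown>;
}

interface ToolResultData {
  llmContent: string;
  isError: boolean;
}

// Stand-in for SessionFactory.create(descriptor, context); run() is invented.
interface SessionFactory {
  create(
    descriptor: AgentDescriptor,
    context: { parentSessionId: string },
  ): { run(args: Record<string, unknown>): string };
}

function createSubagentTool(descriptor: AgentDescriptor, factory: SessionFactory) {
  const tool: ToolDescriptor = {
    name: descriptor.name,
    description: descriptor.description,
    parametersSchema: descriptor.inputSchema ?? { type: 'object', properties: {} },
  };
  return {
    descriptor: tool,
    execute(args: Record<string, unknown>, parentSessionId: string): ToolResultData {
      try {
        // A fresh executor per call mirrors AgentTool's isolated session.
        const session = factory.create(descriptor, { parentSessionId });
        return { llmContent: session.run(args), isError: false };
      } catch (error) {
        return { llmContent: String(error), isError: true };
      }
    },
  };
}
```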
+
+---
+
+## 5. LifecycleInterceptor ↔ ADK Plugin System
+
+### Hook point mapping
+
+| Our hook point string | ADK Plugin callback     | Mapping                                    |
+| --------------------- | ----------------------- | ------------------------------------------ |
+| `'before_agent'`      | `beforeAgentCallback`   | `payload: { agent, context }`              |
+| `'after_agent'`       | `afterAgentCallback`    | `payload: { agent, context }`              |
+| `'before_model'`      | `beforeModelCallback`   | `payload: { context, llmRequest }`         |
+| `'after_model'`       | `afterModelCallback`    | `payload: { context, llmResponse }`        |
+| `'before_tool'`       | `beforeToolCallback`    | `payload: { tool, args, context }`         |
+| `'after_tool'`        | `afterToolCallback`     | `payload: { tool, args, context, result }` |
+| `'on_event'`          | `onEventCallback`       | `payload: { event }`                       |
+| `'on_user_message'`   | `onUserMessageCallback` | `payload: { userMessage }`                 |
+| `'before_run'`        | `beforeRunCallback`     | `payload: { context }`                     |
+| `'after_run'`         | `afterRunCallback`      | `payload: { context }`                     |
+| `'on_model_error'`    | `onModelErrorCallback`  | `payload: { request, error }`              |
+| `'on_tool_error'`     | `onToolErrorCallback`   | `payload: { tool, args, error }`           |
+
+### HookResult ↔ ADK callback return
+
+| HookResult field    | ADK pattern                                     | Notes                               |
+| ------------------- | ----------------------------------------------- | ----------------------------------- |
+| `action: 'proceed'` | Return `undefined`                              | Plugin returns nothing → continue   |
+| `action: 'block'`   | Return `Content` (for agent/model) or throw     | Non-undefined return short-circuits |
+| `modifications`     | Return modified `LlmRequest`/`LlmResponse`/args | Plugin returns modified version     |
+
+### ADK's early-exit pattern
+
+ADK plugins use "first non-undefined return wins":
+
+- `beforeModelCallback` returns `LlmResponse` → skips LLM call entirely (cache
+  hit)
+- `beforeToolCallback` returns modified `args` → tool runs with new args
+- `beforeAgentCallback` returns `Content` → skips agent run entirely
+
+Our `HookResult.modifications` carries the same data. The `action: 'block'` +
+return value pattern maps cleanly.
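+
+The early-exit rule can be sketched as a small dispatcher that runs callbacks
+in order and stops at the first non-undefined return. `HookResult`,
+`AdkCallback`, and `firePlugins` are hypothetical shapes; real ADK callbacks
+are asynchronous, which the sketch omits for brevity.
+
```typescript
// Hypothetical HookResult shape based on the table above.
type HookResult<T> =
  | { action: 'proceed' }
  | { action: 'block'; modifications: T };

// A callback returns undefined to continue, or a value to short-circuit.
type AdkCallback<P, T> = (payload: P) => T | undefined;

function firePlugins<P, T>(
  callbacks: AdkCallback<P, T>[],
  payload: P,
): HookResult<T> {
  for (const callback of callbacks) {
    const result = callback(payload);
    if (result !== undefined) {
      // First non-undefined return wins; later plugins are not consulted.
      return { action: 'block', modifications: result };
    }
  }
  return { action: 'proceed' };
}
```
+
+For `before_model`, `T` would be the cached `LlmResponse`; for `before_agent`,
+the `Content` that replaces the agent run.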
+
+### gemini-cli hooks NOT in ADK
+
+| gemini-cli hook       | ADK equivalent                       | Notes                                                         |
+| --------------------- | ------------------------------------ | ------------------------------------------------------------- |
+| `BeforeToolSelection` | —                                    | ADK doesn't let you modify which tools are available mid-turn |
+| `Notification`        | —                                    | ADK doesn't have notification hooks                           |
+| `SessionStart`        | `onUserMessageCallback` (first call) | Approximate: adapter must detect the first call itself        |
+| `SessionEnd`          | `afterRunCallback`                   | Approximate: fires after every run, not once per session      |
+| `PreCompress`         | —                                    | ADK doesn't have context compression hooks                    |
+
+These gaps are fine — they're gemini-cli-specific hook points. Our generic
+`fire(hookPoint, payload)` handles them because the hook point is an open
+string. ADK executors simply don't fire these hook points, and
+`supportedHookPoints()` reflects that.
+
+---
+
+## 6. PolicyEvaluator ↔ ADK SecurityPlugin
+
+### ADK SecurityPlugin
+
+```typescript
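+// Simplified paraphrase of ADK's SecurityPlugin, not the verbatim source.
+class SecurityPlugin extends BasePlugin {
+  policyEngine: BasePolicyEngine;
+
+  async beforeToolCallback({ tool, args, context }) {
+    const outcome = await this.policyEngine.evaluate(tool.name, args);
+    switch (outcome) {
+      case PolicyOutcome.DENY:
+        throw new Error(`Policy denied tool call: ${tool.name}`);
+      case PolicyOutcome.CONFIRM:
+        // Ends the invocation with a pending confirmation (see §2).
+        context.requestConfirmation({ hint: `Confirm call to ${tool.name}?` });
+        // Placeholder result so the tool does not run yet; the real payload
+        // is ADK-internal.
+        return { status: 'awaiting_confirmation' };
+      case PolicyOutcome.ALLOW:
+        return undefined; // proceed
+    }
+  }
+}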
+```
+
+### Mapping
+
+| Our PolicyEvaluator       | ADK SecurityPlugin                                        | Notes                                      |
+| ------------------------- | --------------------------------------------------------- | ------------------------------------------ |
+| `evaluate(request)`       | `policyEngine.evaluate(toolName, args)`                   | ADK is simpler — tool name + args only     |
+| `PolicyDecision.allow`    | `PolicyOutcome.ALLOW`                                     | Direct                                     |
+| `PolicyDecision.deny`     | `PolicyOutcome.DENY`                                      | Direct                                     |
+| `PolicyDecision.ask_user` | `PolicyOutcome.CONFIRM` → `context.requestConfirmation()` | ADK chains to ToolConfirmation             |
+| `getExcluded()`           | —                                                         | ADK doesn't pre-filter tools               |
+| `request.principal`       | —                                                         | ADK doesn't track who's calling            |
+| `request.principalPath`   | Could use `context.agentName` + branch                    | For hierarchical policy                    |
+| `request.context`         | —                                                         | Our extension point for host-specific data |
+
+### How ADK policy maps when host controls execution
+
+With `pauseOnToolCalls: true`, the flow is:
+
+1. ADK yields tool call → adapter converts to ToolRequestEvent
+2. **Host** runs PolicyEvaluator.evaluate() — NOT ADK's SecurityPlugin
+3. Host decides allow/deny/ask_user
+4. If allowed, host executes tool and sends result via `session.stream()`
+
+This means **ADK's SecurityPlugin is bypassed when the host controls tool
+execution** — which is correct! The host's PolicyEvaluator is the authority.
+ADK's SecurityPlugin only matters when ADK executes tools internally
+(`pauseOnToolCalls: false`).
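+
+With the host as the authority, its handling of a tool request reduces to a
+three-way switch on the policy decision. The sketch below is illustrative:
+`PolicyRequest`, `HostAction`, `nextAction`, and the message strings are
+hypothetical, and the evaluator is passed in as a plain synchronous function.
+
```typescript
type PolicyDecision = 'allow' | 'deny' | 'ask_user';

// Hypothetical request shape; principal is the field ADK has no equivalent for.
interface PolicyRequest {
  toolName: string;
  args: Record<string, unknown>;
  principal?: string;
}

type HostAction =
  | { step: 'execute' }
  | { step: 'reject'; llmContent: string }
  | { step: 'elicit'; kind: 'tool_confirmation'; message: string };

function nextAction(
  request: PolicyRequest,
  evaluate: (request: PolicyRequest) => PolicyDecision,
): HostAction {
  switch (evaluate(request)) {
    case 'allow':
      return { step: 'execute' };
    case 'deny':
      // The denial is returned to the agent as an error tool result.
      return { step: 'reject', llmContent: `Policy denied ${request.toolName}` };
    case 'ask_user':
      // Same elicitation kind that ADK's CONFIRM outcome maps to.
      return {
        step: 'elicit',
        kind: 'tool_confirmation',
        message: `Allow ${request.toolName}?`,
      };
  }
}
```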
+
+---
+
+## 7. SessionContract ↔ ADK Session
+
+### Session mapping
+
+| Our SessionHandle | ADK Session                              | Notes                                     |
+| ----------------- | ---------------------------------------- | ----------------------------------------- |
+| `id`              | `Session.id`                             | Direct                                    |
+| `agentName`       | `Session.appName`                        | ADK uses appName, not agent name          |
+| `events`          | `Session.events: Event[]`                | Direct (but ADK Events → our AgentEvents) |
+| `state`           | `Session.state: Record<string, unknown>` | Direct                                    |
+| `lastUpdateTime`  | `Session.lastUpdateTime`                 | Direct                                    |
+
+### SessionProvider ↔ BaseSessionService
+
+| Our SessionProvider           | ADK BaseSessionService                          | Notes                      |
+| ----------------------------- | ----------------------------------------------- | -------------------------- |
+| `create(agentName, metadata)` | `createSession({ appName, userId })`            | ADK requires userId        |
+| `load(sessionId)`             | `getSession({ appName, userId, sessionId })`    | ADK requires all three IDs |
+| `list(agentName)`             | `listSessions({ appName, userId })`             | ADK scopes by userId       |
+| `delete(sessionId)`           | `deleteSession({ appName, userId, sessionId })` | Same pattern               |
+
+### Gap: ADK requires userId
+
+ADK sessions are scoped by `(appName, userId, sessionId)`. Our interface uses
+just `sessionId`. The adapter can embed userId in the session metadata or derive
+it from HostContext.
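+
+One way the adapter could close this gap is to resolve the ADK triple from our
+single `sessionId`, looking for `userId` in the session metadata first and in
+the host context second. The `HostContext.userId` field, the metadata key, and
+`toAdkSessionKey` are assumptions made for this sketch.
+
```typescript
interface AdkSessionKey {
  appName: string;
  userId: string;
  sessionId: string;
}

// Hypothetical: assumes the host context can carry a user identity.
interface HostContext {
  userId?: string;
}

function toAdkSessionKey(
  agentName: string,
  sessionId: string,
  host: HostContext,
  metadata: { userId?: string } = {},
): AdkSessionKey {
  // Session metadata wins over the host-wide identity.
  const userId = metadata.userId ?? host.userId;
  if (userId === undefined) {
    throw new Error(`No userId available for session ${sessionId}`);
  }
  // Our agentName plays the role of ADK's appName.
  return { appName: agentName, userId, sessionId };
}
```
+
+Failing loudly when no `userId` can be found avoids silently pooling unrelated
+sessions under a shared default user.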
+
+### State prefixes (ADK-specific)
+
+ADK uses prefixed state keys:
+
+- `app:` — app-scoped, persisted
+- `user:` — user-scoped, persisted
+- `temp:` — temporary, stripped before persistence
+
+Our `SessionHandle.state` is a flat `Record<string, unknown>`. The adapter
+preserves prefixes as-is — they're just string keys. No conflict.
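+
+Because the prefixes are ordinary string keys, handling them takes only a
+couple of helpers. The sketch below classifies a key by prefix and drops
+`temp:` keys before persistence. `scopeOf` and `stripTemp` are hypothetical
+names, and labeling unprefixed keys as `'session'` scope is an assumption not
+stated in the list above.
+
```typescript
type SessionState = Record<string, unknown>;
type StateScope = 'app' | 'user' | 'temp' | 'session';

function scopeOf(key: string): StateScope {
  const colon = key.indexOf(':');
  const prefix = colon === -1 ? '' : key.slice(0, colon);
  if (prefix === 'app') return 'app';
  if (prefix === 'user') return 'user';
  if (prefix === 'temp') return 'temp';
  return 'session'; // assumption: unprefixed keys belong to the session
}

// Returns a copy without temp: keys; all other keys are kept verbatim.
function stripTemp(state: SessionState): SessionState {
  const persisted: SessionState = {};
  for (const key of Object.keys(state)) {
    if (scopeOf(key) !== 'temp') persisted[key] = state[key];
  }
  return persisted;
}
```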
+
+---
+
+## 8. ContentPart ↔ ADK Content/Part
+
+### ADK uses Google GenAI types
+
+ADK's `Content` and `Part` come from `@google/genai`:
+
+```typescript
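+interface Content {
+  role?: string; // 'user' | 'model'
+  parts?: Part[];
+}
+
+// Abridged: Part is one interface whose optional fields act as discriminator.
+interface Part {
+  text?: string;
+  thought?: boolean;
+  inlineData?: { mimeType?: string; data?: string };
+  fileData?: { fileUri?: string; mimeType?: string };
+  functionCall?: { id?: string; name?: string; args?: Record<string, unknown> };
+  functionResponse?: { id?: string; name?: string; response?: Record<string, unknown> };
+}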
+```
+
+### Mapping
+
+| Our ContentPart                                     | ADK/GenAI Part                                 | Notes                                                  |
+| --------------------------------------------------- | ---------------------------------------------- | ------------------------------------------------------ |
+| `{ type: 'text', text }`                            | `{ text: string }`                             | Direct                                                 |
+| `{ type: 'thought', thought }`                      | `{ thought: true, text: string }`              | ADK uses `thought` boolean flag on TextPart            |
+| `{ type: 'media', mimeType, data }`                 | `{ inlineData: { mimeType, data } }`           | Restructure                                            |
+| `{ type: 'reference', text, uri }`                  | `{ fileData: { fileUri, mimeType } }`          | Map fileData → reference                               |
+| `{ type: 'refusal', text }`                         | —                                              | Not in ADK/GenAI. Adapter would map from finishReason. |
+| `{ type: 'function_call', name, args, id }`         | `{ functionCall: { name, args, id } }`         | Unwrap                                                 |
+| `{ type: 'function_response', name, response, id }` | `{ functionResponse: { name, response, id } }` | Unwrap                                                 |
+
+### Verdict: CLEAN MAPPING
+
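+The adapter converts between our flat discriminated union and ADK's nested Part
+structure. Part types that exist on both sides map directly in either
+direction; `refusal` has no ADK/GenAI counterpart, so the adapter derives it
+from `finishReason`.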
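+
+A minimal sketch of that conversion for a few part kinds. `GenAiPart` is a
+local stand-in for the GenAI `Part` shapes in the table (not the real
+`@google/genai` export), and `fromGenAiPart` / `toGenAiPart` are hypothetical
+helper names:
+
+```typescript
+// Local stand-in for the nested GenAI Part shapes listed in the table.
+interface GenAiPart {
+  text?: string;
+  thought?: boolean;
+  inlineData?: { mimeType: string; data: string };
+  functionCall?: { name: string; args: Record<string, unknown>; id?: string };
+}
+
+// Subset of our flat discriminated union, enough for the sketch.
+type ContentPart =
+  | { type: 'text'; text: string }
+  | { type: 'thought'; thought: string }
+  | { type: 'media'; mimeType: string; data: string }
+  | {
+      type: 'function_call';
+      name: string;
+      args: Record<string, unknown>;
+      id?: string;
+    };
+
+// Nested GenAI Part -> flat ContentPart.
+function fromGenAiPart(part: GenAiPart): ContentPart | undefined {
+  if (part.functionCall) {
+    return { type: 'function_call', ...part.functionCall };
+  }
+  if (part.inlineData) {
+    return { type: 'media', ...part.inlineData };
+  }
+  if (part.text !== undefined) {
+    // The `thought` flag separates reasoning text from ordinary text.
+    return part.thought
+      ? { type: 'thought', thought: part.text }
+      : { type: 'text', text: part.text };
+  }
+  return undefined; // Part kinds not covered by this sketch
+}
+
+// Flat ContentPart -> nested GenAI Part.
+function toGenAiPart(part: ContentPart): GenAiPart {
+  switch (part.type) {
+    case 'text':
+      return { text: part.text };
+    case 'thought':
+      return { thought: true, text: part.thought };
+    case 'media':
+      return { inlineData: { mimeType: part.mimeType, data: part.data } };
+    case 'function_call':
+      return {
+        functionCall: { name: part.name, args: part.args, id: part.id },
+      };
+  }
+}
+```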
+
+---
+
+## 9. Composition ↔ ADK Agent Patterns
+
+| Our CompositionConfig.pattern | ADK Agent type                         | Notes                                            |
+| ----------------------------- | -------------------------------------- | ------------------------------------------------ |
+| `'hierarchical'`              | Any agent with `subAgents`             | Default — parent calls sub-agents as tools       |
+| `'sequential'`                | `SequentialAgent`                      | Runs children in order                           |
+| `'parallel'`                  | `ParallelAgent`                        | Runs children concurrently, branch isolation     |
+| `'loop'`                      | `LoopAgent`                            | Repeats children until escalate or maxIterations |
+| `'transfer'`                  | LlmAgent with `transfer_to_agent` tool | Peer-to-peer handoff                             |
+
+### Branch isolation
+
+ADK's `ParallelAgent` gives each child an isolated `branch` context:
+
+- Children don't see peer events
+- Each gets unique branch path: `"parent.child_0"`, `"parent.child_1"`
+- Results merged after all complete
+
+This maps to our `threadId`: each parallel branch gets a unique threadId. Events
+from different branches are interleaved by the host.
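+
+One way the mapping could look, assuming the branch path can be reused directly
+as the `threadId` and that events without a branch belong to a root thread
+(`'main'` is a made-up default). `AdkEventLike`, `threadIdFor` and
+`groupByThread` are illustrative names, not part of ADK or our outline:
+
+```typescript
+// Only the fields needed for the mapping.
+interface AdkEventLike {
+  branch?: string;
+  author?: string;
+}
+
+// Assumption: the branch path itself serves as the threadId.
+function threadIdFor(event: AdkEventLike, rootThreadId = 'main'): string {
+  return event.branch ?? rootThreadId;
+}
+
+// Regroup an interleaved event list into per-thread sequences,
+// preserving the original order inside each thread.
+function groupByThread<T extends AdkEventLike>(events: T[]): Map<string, T[]> {
+  const threads = new Map<string, T[]>();
+  for (const event of events) {
+    const id = threadIdFor(event);
+    const bucket = threads.get(id) ?? [];
+    bucket.push(event);
+    threads.set(id, bucket);
+  }
+  return threads;
+}
+```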
+
+---
+
+## 10. Summary: Gaps and Resolutions
+
+### No gaps blocking ADK integration:
+
+| Concern                 | Status    | Resolution                                                                |
+| ----------------------- | --------- | ------------------------------------------------------------------------- |
+| pauseOnToolCalls HITL   | **Works** | Adapter maps to stream() cycle (§2)                                       |
+| ToolConfirmation        | **Works** | Maps to ElicitationRequest (§2)                                           |
+| Auth requests           | **Works** | Maps to ElicitationRequest (§2)                                           |
+| Plugin hooks (12 types) | **Works** | Maps to LifecycleInterceptor.fire() (§5)                                  |
+| Agent transfers         | **Works** | Tool call (`transfer_to_agent`) + `stream_end` `reason: 'completed'` (§3) |
+| State delta pattern     | **Works** | SessionUpdateEvent or \_meta (§3)                                         |
+| Branch isolation        | **Works** | threadId mapping (§9)                                                     |
+| AgentTool pattern       | **Works** | SubagentTool with descriptor + factory (§4)                               |
+| Session management      | **Works** | Adapter maps userId into session (§7)                                     |
+
+### Minor adapter complexity:
+
+1. **Event fan-out:** One ADK Event may become multiple AgentEvents (message +
+   tool call + elicitation). Adapter logic needed but straightforward.
+2. **userId scoping:** ADK sessions require userId; our interface doesn't.
+   Adapter derives from HostContext.
+3. **Timestamp format:** ADK uses `number` (epoch ms); we use ISO 8601 string.
+   Simple conversion.
+4. **Content structure:** ADK uses nested Part types; we use flat discriminated
+   union. Adapter converts bidirectionally.
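+
+Items 1 and 3 can be sketched together. `AdkEvent` here is a flattened stand-in
+(the real Event carries `content.parts`), and the `AgentEvent` field names are
+illustrative rather than the outline's final shapes:
+
+```typescript
+// Flattened stand-in for the ADK Event fields used in this sketch.
+interface AdkEvent {
+  timestamp: number; // epoch milliseconds
+  author?: string;
+  text?: string;
+  functionCalls?: { id: string; name: string; args: Record<string, unknown> }[];
+}
+
+// Illustrative AgentEvent shapes with ISO 8601 timestamps.
+type AgentEvent =
+  | { type: 'message'; timestamp: string; agentId?: string; text: string }
+  | {
+      type: 'tool_request';
+      timestamp: string;
+      agentId?: string;
+      id: string;
+      name: string;
+      args: Record<string, unknown>;
+    };
+
+// One ADK event may fan out into several AgentEvents.
+function fanOut(event: AdkEvent): AgentEvent[] {
+  const base = {
+    timestamp: new Date(event.timestamp).toISOString(), // epoch ms -> ISO 8601
+    agentId: event.author,
+  };
+  const out: AgentEvent[] = [];
+  if (event.text !== undefined) {
+    out.push({ type: 'message', ...base, text: event.text });
+  }
+  for (const call of event.functionCalls ?? []) {
+    out.push({ type: 'tool_request', ...base, ...call });
+  }
+  return out;
+}
+```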
+
+### ADK features our interface supports that gemini-cli doesn't have yet:
+
+- `LoopAgent` / `ParallelAgent` / `SequentialAgent` composition → our
+  CompositionConfig
+- `eventActions.stateDelta` → our SessionUpdateEvent
+- `eventActions.transferToAgent` → tool call (`transfer_to_agent`) +
+  `stream_end` `reason: 'completed'`
+- `eventActions.escalate` → `stream_end` `reason: 'completed'` with
+  `data: { escalateReason }`
+- Long-running tools → our ToolUpdateEvent
+- Auth credential flow → our ElicitationRequest with kind: 'auth_required'
@@ -0,0 +1,274 @@
+# ADK-TS (Agent Development Kit - TypeScript) Architecture Notes
+
+## Package: `@google/adk` v0.4.0
+
+**Location:** `/Users/adamfweidman/Desktop/adk-int/adk-js/core/`
+
+## Agent Hierarchy
+
+```
+BaseAgent (abstract)
+├── LlmAgent         - Model-driven agent with tools (the main one)
+├── LoopAgent         - Runs sub-agents in a loop (maxIterations, escalate to exit)
+├── ParallelAgent     - Runs sub-agents concurrently (isolated branches)
+└── SequentialAgent   - Runs sub-agents sequentially
+```
+
+### BaseAgent Config
+
+- `name: string` - Unique identifier (must be valid JS identifier)
+- `description?: string` - One-line capability for model routing
+- `parentAgent?: BaseAgent` - Parent in agent tree
+- `subAgents?: BaseAgent[]` - Child agents
+- `beforeAgentCallback / afterAgentCallback` - Pre/post execution hooks
+
+### LlmAgent Config (extends BaseAgent)
+
+- `model?: string | BaseLlm` - LLM to use
+- `instruction?: string | InstructionProvider` - Agent-specific instructions
+- `globalInstruction?: string | InstructionProvider` - Tree-wide (root only)
+- `tools?: ToolUnion[]` - Available tools
+- `generateContentConfig?: GenerateContentConfig` - LLM params
+- `disallowTransferToParent / disallowTransferToPeers` - Transfer controls
+- `includeContents?: 'default' | 'none'` - Context history inclusion
+- `inputSchema / outputSchema` - Validation schemas
+- `outputKey?: string` - Session state key for output storage
+- `beforeModelCallback / afterModelCallback` - LLM hooks
+- `beforeToolCallback / afterToolCallback` - Tool hooks
+- `requestProcessors / responseProcessors` - LLM request/response processors
+- `codeExecutor?: BaseCodeExecutor`
+
+## Event System
+
+### Event Interface
+
+```typescript
+interface Event extends LlmResponse {
+  id: string;
+  invocationId: string;
+  author?: string; // "user" or agent name
+  actions: EventActions; // State/artifact/auth/transfer operations
+  longRunningToolIds?: string[];
+  branch?: string; // Hierarchical agent path
+  timestamp: number;
+  content?: Content;
+  partial?: boolean; // Streaming indicator
+}
+```
+
+### EventActions
+
+```typescript
+interface EventActions {
+  skipSummarization?: boolean;
+  stateDelta: Record<string, unknown>;
+  artifactDelta: Record<string, number>;
+  transferToAgent?: string;
+  escalate?: boolean;
+  requestedAuthConfigs: Record<string, AuthConfig>;
+  requestedToolConfirmations: Record<string, ToolConfirmation>;
+}
+```
+
+### Structured Events (utility layer)
+
+Converts raw Event to discriminated union:
+
+```
+EventType: THOUGHT | CONTENT | TOOL_CALL | TOOL_RESULT | CALL_CODE |
+           CODE_RESULT | ERROR | ACTIVITY | TOOL_CONFIRMATION | FINISHED
+```
+
+## Tool System
+
+### BaseTool (abstract)
+
+- `name, description, isLongRunning`
+- `_getDeclaration(): FunctionDeclaration` - OpenAPI schema for LLM
+- `runAsync(request): Promise<unknown>` - Execute tool
+- `processLlmRequest(request): Promise<void>` - Preprocessing
+
+### Concrete Tool Types
+
+1. **FunctionTool** - Generic typed tools (Zod schema support)
+2. **AgentTool** - Wrap agents as tools (for hierarchical composition)
+3. **MCPTool** - Model Context Protocol server tools
+4. **GoogleSearchTool** - Built-in web search
+5. **ExitLoopTool** - Signal loop exit
+6. **LongRunningFunctionTool** - Async long-running operations
+
+### BaseToolset
+
+- Filter tools by predicate or string list
+- `getTools(context)`, `close()`, `isToolSelected()`
+- **MCPToolset** - Toolset for MCP server connections
+
+## Session Management
+
+### Session Interface
+
+```typescript
+interface Session {
+  id: string;
+  appName: string;
+  userId: string;
+  state: Record<string, unknown>; // Mutable key-value store
+  events: Event[]; // Complete conversation history
+  lastUpdateTime: number;
+}
+```
+
+### Session Services
+
+- `BaseSessionService` (abstract) - createSession, getSession, listSessions,
+  deleteSession, appendEvent
+- `InMemorySessionService` - In-process storage
+- `DatabaseSessionService` - Mikro-ORM backed (SQL)
+
+### State Management
+
+- `State` class wraps base state + delta
+- `get()` returns from delta if present, else base
+- `set()` updates delta only
+- `hasDelta()` checks if changes made
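+
+The delta behaviour above can be sketched as follows. `DeltaState` is a
+simplified illustration of the described semantics, not ADK's actual `State`
+source:
+
+```typescript
+class DeltaState {
+  private readonly base: Record<string, unknown>;
+  private readonly delta: Record<string, unknown> = {};
+
+  constructor(base: Record<string, unknown> = {}) {
+    this.base = base;
+  }
+
+  // Reads prefer the pending delta, then fall back to the base state.
+  get(key: string): unknown {
+    return key in this.delta ? this.delta[key] : this.base[key];
+  }
+
+  // Writes touch only the delta; the base stays as it was.
+  set(key: string, value: unknown): void {
+    this.delta[key] = value;
+  }
+
+  hasDelta(): boolean {
+    return Object.keys(this.delta).length > 0;
+  }
+}
+```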
+
+## Human-in-the-Loop (HITL)
+
+### Tool Confirmation
+
+```typescript
+class ToolConfirmation {
+  hint?: string; // Guidance for user
+  confirmed: boolean; // User approval
+  payload?: unknown; // Additional context
+}
+```
+
+### Security Plugin
+
+- `beforeToolCallback` - Evaluates policy before tool execution
+- `BasePolicyEngine` interface with `evaluate()` method
+- `PolicyOutcome`: DENY | CONFIRM | ALLOW
+
+### Auth Requests
+
+- `context.requestCredential(authConfig)` - Request auth from user
+- `context.getAuthResponse(authConfig)` - Check for auth response
+- Sets `eventActions.requestedAuthConfigs[functionCallId]`
+
+## Multi-Agent Patterns
+
+### Agent Transfer
+
+- LlmAgent injects `transfer_to_agent(agentName)` tool
+- Sets `eventActions.transferToAgent = targetAgentName`
+- Runner resolves target and continues
+- Can transfer to: sub-agents, parent (if not disabled), peers (if not disabled)
+
+### Parallel Agent
+
+- Runs all subAgents concurrently
+- Isolates each via `branch` context
+- Sub-agents don't see peer history
+- Merges event streams with fair ordering
+
+### Loop Agent
+
+- Repeatedly runs subAgents
+- `maxIterations` caps loop count
+- Exits on `event.actions.escalate === true`
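+
+A sketch of those loop semantics with async generators. `LoopEvent`, `SubAgent`
+and `runLoop` are hypothetical names; this illustrates the described behaviour
+and is not ADK's `LoopAgent` implementation:
+
+```typescript
+interface LoopEvent {
+  author: string;
+  actions: { escalate?: boolean };
+}
+
+// A sub-agent is modelled as a function returning an event stream.
+type SubAgent = (iteration: number) => AsyncGenerator<LoopEvent>;
+
+async function* runLoop(
+  subAgents: SubAgent[],
+  maxIterations?: number,
+): AsyncGenerator<LoopEvent> {
+  // Without maxIterations the loop only ends when a sub-agent escalates.
+  for (let i = 0; maxIterations === undefined || i < maxIterations; i++) {
+    for (const agent of subAgents) {
+      for await (const event of agent(i)) {
+        yield event;
+        if (event.actions.escalate === true) return; // exit the whole loop
+      }
+    }
+  }
+}
+```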
+
+## Plugin System
+
+### BasePlugin Lifecycle Hooks (12 hooks)
+
+- `onUserMessageCallback` - Preprocess user messages
+- `beforeRunCallback` - Before agent run (can short-circuit)
+- `onEventCallback` - Per-event (can modify events)
+- `afterRunCallback` - Final cleanup
+- `beforeAgentCallback / afterAgentCallback` - Agent lifecycle
+- `beforeModelCallback / afterModelCallback` - LLM lifecycle
+- `onModelErrorCallback` - Model error handling
+- `beforeToolCallback / afterToolCallback` - Tool lifecycle
+- `onToolErrorCallback` - Tool error handling
+
+### Built-in Plugins
+
+- **LoggingPlugin** - Debug logging
+- **SecurityPlugin** - Policy enforcement + tool confirmation
+- **PluginManager** - Plugin orchestration
+
+## Runner
+
+### Runner Config
+
+```typescript
+interface RunnerConfig {
+  appName: string;
+  agent: BaseAgent; // Root agent
+  plugins?: BasePlugin[];
+  artifactService?: BaseArtifactService;
+  sessionService: BaseSessionService; // Required
+  memoryService?: BaseMemoryService;
+  credentialService?: BaseCredentialService;
+}
+```
+
+### RunConfig (per-run options)
+
+```typescript
+interface RunConfig {
+  speechConfig?: SpeechConfig;
+  responseModalities?: Modality[];
+  maxLlmCalls?: number; // Default 500
+  pauseOnToolCalls?: boolean; // Client-side tool execution
+  streamingMode?: StreamingMode; // NONE | SSE | BIDI
+  // ... audio/live configs
+}
+```
+
+### Execution Pipeline
+
+1. Load or create session
+2. Create InvocationContext
+3. Run pluginManager.runOnUserMessageCallback()
+4. Append user message to session
+5. Run agent.runAsync(invocationContext) → yields events
+6. For each non-partial event: append to session
+7. Run pluginManager.runOnEventCallback()
+8. Run pluginManager.runAfterRunCallback()
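+
+Steps 3 to 8 could be sketched like this. All names (`PipelineEvent`, `Hooks`,
+`runOnce`) are hypothetical simplifications: session loading and
+`InvocationContext` are omitted, and the session is reduced to a plain event
+array:
+
+```typescript
+interface PipelineEvent {
+  partial?: boolean;
+  text: string;
+}
+
+// Simplified stand-in for the plugin manager callbacks.
+interface Hooks {
+  onUserMessage?(message: string): string | undefined;
+  onEvent?(event: PipelineEvent): PipelineEvent | undefined;
+  afterRun?(): void;
+}
+
+type Agent = (message: string) => AsyncGenerator<PipelineEvent>;
+
+async function* runOnce(
+  agent: Agent,
+  history: PipelineEvent[],
+  hooks: Hooks,
+  message: string,
+): AsyncGenerator<PipelineEvent> {
+  const userMessage = hooks.onUserMessage?.(message) ?? message; // step 3
+  history.push({ text: userMessage }); // step 4
+  for await (const event of agent(userMessage)) { // step 5
+    if (!event.partial) history.push(event); // step 6: skip partial chunks
+    yield hooks.onEvent?.(event) ?? event; // step 7
+  }
+  hooks.afterRun?.(); // step 8
+}
+```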
+
+## Model Layer
+
+### BaseLlm (abstract)
+
+- `generateContentAsync(llmRequest, stream?): AsyncGenerator<LlmResponse>`
+- `connect(llmRequest): Promise<BaseLlmConnection>` - For live/streaming
+
+### Implementations
+
+- `Gemini` - Google Gemini API
+- `ApigeeLlm` - Apigee-wrapped models
+- `LLMRegistry` - Static registry for model lookup
+
+## Service Adapters (all abstract base + implementations)
+
+| Service               | Implementations                |
+| --------------------- | ------------------------------ |
+| BaseSessionService    | InMemory, Database (Mikro-ORM) |
+| BaseArtifactService   | InMemory, File, GCS            |
+| BaseMemoryService     | InMemory                       |
+| BaseCredentialService | InMemory                       |
+| BaseCodeExecutor      | BuiltIn                        |
+
+## Design Patterns
+
+1. **Symbol-based type guards** - Every class uses `Symbol.for()` + `isXxx()`
+2. **Abstract base classes** - Service interfaces via abstract classes
+3. **Async generators** - All agent execution yields events
+4. **Context objects** - Rich context passed to callbacks/tools
+5. **Delta state** - Session state + event action deltas
+6. **Plugin middleware** - 12 hooks at multiple execution points
+7. **Tree-based hierarchy** - Parent-child agents with root traversal
+8. **Branch isolation** - Parallel agents use branch paths
+9. **Callback chains** - Multiple callbacks per stage with early termination
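+
+Pattern 1 can be illustrated as below. The brand key string and the
+`ExampleTool` / `isExampleTool` names are made up for the sketch. One common
+motivation for `Symbol.for()` is that the guard keeps working when two copies of
+a package are loaded, where `instanceof` would fail:
+
+```typescript
+// Symbol.for() returns the same symbol for the same key across module copies.
+const EXAMPLE_TOOL_BRAND = Symbol.for('example.adk.tool');
+
+class ExampleTool {
+  readonly [EXAMPLE_TOOL_BRAND] = true;
+  readonly name: string;
+
+  constructor(name: string) {
+    this.name = name;
+  }
+}
+
+// Type guard that checks the brand instead of the prototype chain.
+function isExampleTool(value: unknown): value is ExampleTool {
+  return (
+    typeof value === 'object' &&
+    value !== null &&
+    EXAMPLE_TOOL_BRAND in value &&
+    (value as Record<symbol, unknown>)[EXAMPLE_TOOL_BRAND] === true
+  );
+}
+```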
@@ -0,0 +1,587 @@
+# Cross-SDK Comparison: Events, Agents, and Interface Superset
+
+## 1. AgentEvents: Our Outline vs Michael's
+
+Our outline and Michael's `Gemini CLI Agents.txt` are **nearly identical** in
+event taxonomy. The only difference is we added a `stream_end` event type:
+
+| #   | Michael's Events       | Our Outline           | Delta                                                                           |
+| --- | ---------------------- | --------------------- | ------------------------------------------------------------------------------- |
+| 1   | `initialize`           | `InitializeEvent`     | Same                                                                            |
+| 2   | `session_update`       | `SessionUpdateEvent`  | Same                                                                            |
+| 3   | `message`              | `MessageEvent`        | Same — streaming handled by AsyncGenerator                                      |
+| 4   | `tool_request`         | `ToolRequestEvent`    | Same                                                                            |
+| 5   | `tool_update`          | `ToolUpdateEvent`     | Same                                                                            |
+| 6   | `tool_response`        | `ToolResponseEvent`   | Same                                                                            |
+| 7   | `elicitation_request`  | `ElicitationRequest`  | Same                                                                            |
+| 8   | `elicitation_response` | `ElicitationResponse` | Same                                                                            |
+| 9   | `usage`                | `UsageEvent`          | Same                                                                            |
+| 10  | `error`                | `ErrorEvent`          | Same                                                                            |
+| 11  | `custom`               | `CustomEvent`         | Same                                                                            |
+| 12  | —                      | **StreamEnd**         | **Added**: completed, failed, aborted, max_turns, max_budget, max_time, refusal |
+
+### Minor structural differences:
+
+| Aspect                 | Michael                                             | Our Outline                                                               |
+| ---------------------- | --------------------------------------------------- | ------------------------------------------------------------------------- |
+| **Base type**          | `AgentEventCommon` with `type: string` (fully open) | `AgentEventBase` with `type: AgentEventType` (`'known' \| (string & {})`) |
+| **Agent ID**           | —                                                   | `agentId` on event base (which agent emitted this event)                  |
+| **Event map**          | Generic `interface AgentEvents` + mapped type       | Same — adopted Michael's pattern for declaration merging extensibility    |
+| **ContentPart.\_meta** | Required (`_meta: Record<string, unknown>`)         | Optional (`_meta?: Record<string, unknown>`)                              |
+| **ErrorData.status**   | Google RPC codes (`'RESOURCE_EXHAUSTED' \| '...'`)  | Open string (per our generic philosophy)                                  |
+| **Message.role**       | `'user' \| 'agent' \| 'developer'`                  | Same                                                                      |
+| **Stream end**         | None (only `initialize` at stream start)            | `stream_end` with `reason` field + open `data` bag                        |
+| **Handoff**            | Not covered                                         | Tool call (`transfer_to_agent`) — no dedicated event                      |
+| **Pausing**            | Implicit (elicitation/tool events)                  | Same — no explicit pause/resume events                                    |
+
+### Design decisions adopted from Michael
+
+1. **`interface AgentEvents` + mapped type** — Michael's pattern enables
+   declaration merging, letting any module add new event types without modifying
+   the base definition. Strictly better than an explicit union type.
+2. **`_meta` on ContentPart** — More extensible. We adopted it (as optional).
+3. **Implicit pausing** — No separate pause/resume events. When the agent emits
+   an `elicitation_request` or `tool_request`, the stream naturally pauses. The
+   host calls `stream()` to resume.
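+
+A small sketch of decision 1. The payload fields and the `compact_boundary`
+entry are illustrative only; across modules the second declaration would
+normally sit inside a `declare module` block:
+
+```typescript
+// Base event map: key = event type, value = payload.
+interface AgentEvents {
+  message: { text: string };
+  stream_end: { reason: string };
+}
+
+// Declaration merging: another declaration adds an event type
+// without editing the base definition.
+interface AgentEvents {
+  compact_boundary: { trigger: string };
+}
+
+// Mapped type: one discriminated-union member per key of AgentEvents.
+type AgentEvent = {
+  [K in keyof AgentEvents]: { type: K } & AgentEvents[K];
+}[keyof AgentEvents];
+
+function summarize(event: AgentEvent): string {
+  switch (event.type) {
+    case 'message':
+      return `message: ${event.text}`;
+    case 'stream_end':
+      return `end: ${event.reason}`;
+    case 'compact_boundary':
+      return `compacted (${event.trigger})`;
+  }
+}
+```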
+
+---
+
+## 2. Claude Agent SDK — Key Interfaces
+
+Source: `@anthropic-ai/claude-agent-sdk`
+
+### Agent Execution Model
+
+```typescript
+// Entry point — not an interface, a function
+declare function query(params: {
+  prompt: string | AsyncIterable<SDKUserMessage>;
+  options?: Options;
+}): Query; // extends AsyncGenerator<SDKMessage, void>
+```
+
+### Message Types (Event Stream)
+
+```typescript
+type SDKMessage =
+  | SystemMessage // subtype: "init" | "compact_boundary"
+  | AssistantMessage // Claude's response with tool calls
+  | UserMessage // Tool results fed back
+  | StreamEvent // Raw API stream events (opt-in)
+  | ResultMessage // Final: success | error_max_turns | error_max_budget_usd | error_during_execution
+  | CompactBoundaryMessage; // Context compaction marker
+```
+
+### Tool Approval (HITL)
+
+```typescript
+// Shape of the `canUseTool` option
+type CanUseTool = (
+  toolName: string,
+  input: Record<string, any>,
+) => Promise<
+  | { behavior: 'allow'; updatedInput: Record<string, any> }
+  | { behavior: 'deny'; message: string }
+>;
+```
+
+### Subagent Definition
+
+```typescript
+interface AgentDefinition {
+  description: string; // When to invoke
+  prompt: string; // System prompt
+  tools?: string[]; // Available tools (defaults to all)
+  model?: 'sonnet' | 'opus' | 'haiku' | 'inherit';
+}
+```
+
+### Session Management
+
+```typescript
+interface Options {
+  continue?: boolean;         // Resume most recent session
+  resume?: string;            // Resume by session ID
+  forkSession?: boolean;      // Branch from resume point
+  persistSession?: boolean;   // Default: true
+  maxTurns?: number;
+  maxBudgetUsd?: number;      // Spend limit
+  permissionMode?: 'default' | 'acceptEdits' | 'plan' | 'dontAsk' | 'bypassPermissions';
+  structuredOutput?: { type: "json_schema", ... };
+}
+```
+
+### Result (Termination)
+
+```typescript
+interface SDKResultMessage {
+  type: 'result';
+  subtype:
+    | 'success'
+    | 'error_max_turns'
+    | 'error_max_budget_usd'
+    | 'error_during_execution'
+    | 'error_max_structured_output_retries';
+  result?: string;
+  total_cost_usd: number;
+  usage: { input_tokens: number; output_tokens: number };
+  num_turns: number;
+  session_id: string;
+  stop_reason: string | null; // "end_turn", "max_tokens", "refusal"
+}
+```
+
+### V2 Preview (Simpler API)
+
+```typescript
+await using session = unstable_v2_createSession({ model: "..." });
+await session.send("Hello!");
+for await (const msg of session.stream()) { ... }
+await session.send("Follow-up");
+for await (const msg of session.stream()) { ... }
+```
+
+---
+
+## 3. OpenAI Codex SDK / Responses API — Key Interfaces
+
+### Codex SDK (TypeScript)
+
+```typescript
+// Client
+const codex = new Codex({ env?, config? });
+const thread = codex.startThread({ workingDirectory?, skipGitRepoCheck? });
+const thread = codex.resumeThread(threadId);
+
+// Execution
+const turn = await thread.run(prompt: string | InputEntry[], options?);
+const { events } = await thread.runStreamed(prompt);
+
+// Streaming
+for await (const event of events) {
+  switch (event.type) {
+    case "item.completed": // event.item
+    case "turn.completed": // event.usage
+  }
+}
+```
+
+### Responses API Streaming Events (53 types)
+
+Organized hierarchically (the groups listed below account for 51 of them):
+
+**Response Lifecycle (7):**
+
+- `response.queued`, `response.created`, `response.in_progress`
+- `response.completed`, `response.incomplete`, `response.failed`
+- `error`
+
+**Content Streaming (8):**
+
+- `response.output_item.added`, `response.output_item.done`
+- `response.content_part.added`, `response.content_part.done`
+- `response.output_text.delta`, `response.output_text.done`
+- `response.refusal.delta`, `response.refusal.done`
+
+**Reasoning (6):**
+
+- `response.reasoning_text.delta`, `response.reasoning_text.done`
+- `response.reasoning_summary_part.added`,
+  `response.reasoning_summary_part.done`
+- `response.reasoning_summary_text.delta`,
+  `response.reasoning_summary_text.done`
+
+**Function Calls (2):**
+
+- `response.function_call_arguments.delta`,
+  `response.function_call_arguments.done`
+
+**MCP (8):**
+
+- `response.mcp_call_arguments.delta`, `response.mcp_call_arguments.done`
+- `response.mcp_call.in_progress`, `response.mcp_call.completed`,
+  `response.mcp_call.failed`
+- `response.mcp_list_tools.in_progress`, `response.mcp_list_tools.completed`,
+  `response.mcp_list_tools.failed`
+
+**Built-in Tools (15):**
+
+- File search: `in_progress`, `searching`, `completed`
+- Web search: `in_progress`, `searching`, `completed`
+- Code interpreter: `in_progress`, `interpreting`, `code.delta`, `code.done`,
+  `completed`
+- Image gen: `in_progress`, `generating`, `partial_image`, `completed`
+
+**Audio (4):**
+
+- `response.audio.delta`, `response.audio.done`
+- `response.audio.transcript.delta`, `response.audio.transcript.done`
+
+**Annotations (1):**
+
+- `response.output_text.annotation.added`
+
+### OpenAI Agents SDK (higher-level)
+
+```python
+# Python-first, but patterns apply
+class RunItemStreamEvent:
+    name: Literal[
+        "message_output_created",
+        "handoff_requested",
+        "handoff_occurred",
+        "tool_called",
+        "tool_output",
+        "tool_search_called",
+        "tool_search_output_created",
+        "reasoning_item_created",
+        "mcp_approval_requested",
+        "mcp_approval_response",
+        "mcp_list_tools",
+    ]
+
+class AgentUpdatedStreamEvent:
+    # Fires when current agent changes (handoff)
+    new_agent: Agent
+```
+
+---
+
+## 4. Superset Analysis — What Changes Our Interfaces?
+
+### Concepts Present Across Systems
+
+| Concept               | gemini-cli | ADK-TS | Claude SDK    | Codex/OpenAI   | Our Interfaces          |
+| --------------------- | ---------- | ------ | ------------- | -------------- | ----------------------- |
+| Text streaming        | ✅         | ✅     | ✅            | ✅             | ✅ MessageEvent         |
+| Tool request/response | ✅         | ✅     | ✅            | ✅             | ✅ ToolRequest/Response |
+| Thinking/reasoning    | ✅         | ✅     | ✅ (thinking) | ✅ (reasoning) | ✅ ContentPart.thought  |
+| Error events          | ✅         | ✅     | ✅            | ✅             | ✅ ErrorEvent           |
+| Token usage           | ✅         | ✅     | ✅            | ✅             | ✅ UsageEvent           |
+| Tool progress         | ✅         | ✅     | —             | ✅             | ✅ ToolUpdateEvent      |
+| Session resume        | ✅         | ✅     | ✅            | ✅             | ✅ sessionRef           |
+| Subagents             | ✅         | ✅     | ✅            | —              | ✅ threadId             |
+| Abort/cancel          | ✅         | ✅     | ✅            | ✅             | ✅ abort()              |
+| Metadata escape hatch | —          | ✅     | —             | —              | ✅ \_meta               |
+
+### NEW Concepts From Claude/Codex That We Should Incorporate
+
+#### 4.1 Structured Stream End Reasons (HIGH PRIORITY)
+
+**What:** Claude SDK has typed termination:
+`success | error_max_turns | error_max_budget_usd | error_during_execution`.
+OpenAI has `completed | incomplete | failed`.
+
+**Why it matters:** We need a `stream_end` event that captures why the stream
+ended — the one signal not covered by other event types.
+
+**Final design — `stream_end` with `reason` + open `data` bag:**
+
+```typescript
+type StreamEndReason =
+  | 'completed'
+  | 'failed'
+  | 'aborted'
+  | 'max_turns'
+  | 'max_budget'
+  | 'max_time'
+  | 'refusal'
+  | (string & {});
+
+interface StreamEnd {
+  reason: StreamEndReason;
+  data?: Record<string, unknown>; // { result?, cost?, usage?, numTurns?, error?, ... }
+}
+```
+
+**Design rationale:**
+
+- Start is covered by `initialize`. Pausing is implicit (elicitation/tool
+  request events). Handoff is a tool call (`transfer_to_agent`).
+- End-of-stream details go in `data` as an open bag, not fixed fields.
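+
+As an illustration, a Claude adapter might fold the result message into
+`stream_end` as below. The subtype-to-reason mapping and the `data` keys are
+assumptions of this sketch; `ResultLike` copies only a few fields from the
+`SDKResultMessage` excerpt in §2:
+
+```typescript
+type StreamEndReason =
+  | 'completed'
+  | 'failed'
+  | 'aborted'
+  | 'max_turns'
+  | 'max_budget'
+  | 'max_time'
+  | 'refusal'
+  | (string & {});
+
+interface StreamEnd {
+  reason: StreamEndReason;
+  data?: Record<string, unknown>;
+}
+
+interface ResultLike {
+  subtype: string;
+  stop_reason: string | null;
+  num_turns: number;
+  total_cost_usd: number;
+}
+
+// Hypothetical adapter helper: typed reason, everything else in `data`.
+function toStreamEnd(result: ResultLike): StreamEnd {
+  const data = { numTurns: result.num_turns, cost: result.total_cost_usd };
+  if (result.stop_reason === 'refusal') return { reason: 'refusal', data };
+  switch (result.subtype) {
+    case 'success':
+      return { reason: 'completed', data };
+    case 'error_max_turns':
+      return { reason: 'max_turns', data };
+    case 'error_max_budget_usd':
+      return { reason: 'max_budget', data };
+    default:
+      // Unknown subtypes are kept in the open bag rather than dropped.
+      return { reason: 'failed', data: { ...data, subtype: result.subtype } };
+  }
+}
+```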
+
+#### 4.2 Budget Constraints (MEDIUM PRIORITY)
+
+**What:** Claude SDK has `maxBudgetUsd`. Neither gemini-cli nor ADK has this
+today.
+
+**Why it matters:** Cost control is critical for production deployments.
+
+**Proposed change to AgentConstraints:**
+
+```typescript
+interface AgentConstraints {
+  maxTurns?: number;
+  maxTimeMinutes?: number;
+  maxLlmCalls?: number;
+  maxBudgetUsd?: number; // NEW: from Claude SDK
+}
+```
+
+#### 4.3 Session Forking (MEDIUM PRIORITY)
+
+**What:** Claude SDK supports `forkSession: boolean` — branch from a resume
+point to explore alternatives.
+
+**Why it matters:** Enables "what if" exploration without destroying history.
+Useful for plan mode.
+
+**Proposed change to ExecutionRequest:**
+
+```typescript
+interface ExecutionRequest {
+  // ... existing fields ...
+  sessionRef?: string | SessionSnapshot;
+  forkSession?: boolean; // NEW: branch from sessionRef instead of continuing
+}
+```
+
+#### 4.4 Permission Modes on Execution (MEDIUM PRIORITY)
+
+**What:** Claude has 5 permission modes:
+`default | acceptEdits | plan | dontAsk | bypassPermissions`. gemini-cli has 4
+approval modes: `default | autoEdit | yolo | plan`.
+
+**Why it matters:** Both systems have this concept. It should be in
+ExecutionOptions, not hard-coded.
+
+**Proposed change to ExecutionOptions:**
+
+```typescript
+interface ExecutionOptions {
+  // ... existing fields ...
+  permissionMode?: string; // Open string. Conventions: 'default' | 'auto_edit' | 'autonomous' | 'plan' | string
+}
+```
+
+#### 4.5 Agent Handoff (MEDIUM PRIORITY)
+
+**What:** OpenAI Agents SDK has explicit `handoff_requested` /
+`handoff_occurred` events plus `AgentUpdatedStreamEvent`. ADK has
+`transfer_to_agent` tool + `eventActions.transferToAgent`. Claude SDK has
+subagent invocation via Agent tool.
+
+**Why it matters:** When agent A delegates to agent B, the host/UI needs to
+know.
+
+**Design decision: Handoff is a tool call, not a separate event type.**
+
+The agent calls `transfer_to_agent` as a tool (ToolRequest event). The host
+intercepts this tool call (since host controls tool execution), looks up the
+target agent, creates a new executor via the factory, and mediates the handoff.
+The originating agent's stream ends with `stream_end` reason `'completed'`.
+
+```typescript
+// 1. Agent emits tool request:
+{ type: 'tool_request', name: 'transfer_to_agent', args: { target: 'coder', reason: '...' } }
+
+// 2. Host mediates handoff, originating agent completes:
+{ type: 'stream_end', reason: 'completed', agentId: 'planner', data: { handoffTarget: 'coder' } }
+```
+
+This avoids duplicating routing logic between stream_end events and tool calls.
+Matches ADK's `transfer_to_agent` tool pattern.
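+
+A host-side sketch of that interception. `Executor`, `ExecutorFactory` and
+`mediateHandoff` are hypothetical names, and the executor is reduced to a plain
+function for brevity:
+
+```typescript
+interface ToolRequest {
+  type: 'tool_request';
+  name: string;
+  args: Record<string, unknown>;
+}
+
+interface StreamEndEvent {
+  type: 'stream_end';
+  reason: string;
+  agentId: string;
+  data?: Record<string, unknown>;
+}
+
+type Executor = (input: string) => string;
+type ExecutorFactory = (agentName: string) => Executor | undefined;
+
+// Returns undefined for ordinary tool calls so the host can run them normally.
+function mediateHandoff(
+  request: ToolRequest,
+  fromAgentId: string,
+  factory: ExecutorFactory,
+): { next: Executor; end: StreamEndEvent } | undefined {
+  if (request.name !== 'transfer_to_agent') return undefined;
+
+  const target = request.args['target'];
+  if (typeof target !== 'string') {
+    throw new Error('transfer_to_agent requires a string target');
+  }
+  const next = factory(target);
+  if (!next) throw new Error(`unknown agent: ${target}`);
+
+  // The originating agent's stream ends as 'completed'.
+  const end: StreamEndEvent = {
+    type: 'stream_end',
+    reason: 'completed',
+    agentId: fromAgentId,
+    data: { handoffTarget: target },
+  };
+  return { next, end };
+}
+```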
+
+#### 4.6 Refusal as Distinct Signal (LOW PRIORITY)
+
+**What:** OpenAI has explicit `response.refusal.delta/done` events. Claude has
+`stop_reason: "refusal"`.
+
+**Why it matters:** Model refusals are operationally important (safety, policy).
+
+**Proposed:** No new event type. Handle via `MessageEvent` with a `refusal`
+content part type, or via `ErrorEvent` with specific error code. ContentPart can
+be extended:
+
+```typescript
+| { type: 'refusal'; text: string }
+```
+
+#### 4.7 Content Annotations (LOW PRIORITY)
+
+**What:** OpenAI has `response.output_text.annotation.added` for citations, file
+paths.
+
+**Why it matters:** Citations and source attribution are increasingly important.
+
+**Proposed:** Michael's `reference` ContentPart already covers this. No change
+needed — `reference` with `uri` and `text` handles citations.
+
+#### 4.8 Context Compaction Events (LOW PRIORITY)
+
+**What:** Claude SDK has `CompactBoundaryMessage` marking when context was
+compressed.
+
+**Why it matters:** For long sessions, knowing when context was compressed helps
+with debugging and UI.
+
+**Proposed:** `CustomEvent` with `kind: 'compact_boundary'`. No new event type
+needed.
+
+#### 4.9 Structured Output Schema (ALREADY COVERED)
+
+**What:** Both Claude (`structuredOutput`) and OpenAI support JSON Schema output
+constraints.
+
+**Status:** Already covered by `AgentDescriptor.outputSchema: JsonSchema`. No
+change needed.
+
+### Concepts We DON'T Need to Adopt
+
+| Concept                                                                   | Why Skip                                                                                                               |
+| ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- |
+| OpenAI's 53 granular streaming events                                     | Too coupled to Responses API internals. Our `ToolUpdateEvent` + `MessageEvent` via AsyncGenerator abstracts over this. |
+| OpenAI's per-tool-type events (file_search, web_search, code_interpreter) | Tool-specific progress belongs in `ToolUpdateEvent.data`, not in the event taxonomy.                                   |
+| Audio/Image streaming events                                              | Handle via `ToolUpdateEvent` with media ContentParts. When needed, add as ContentPart types, not event types.          |
+| Claude's raw `StreamEvent` wrapper                                        | Implementation detail of the Claude API client. Our adapters consume these internally.                                 |
+| MCP-specific events (mcp_call, mcp_list_tools)                            | MCP tools are just tools. Use generic `ToolRequestEvent/ToolResponseEvent`. MCP approval is an `ElicitationRequest`.   |
+
+---
+
+## 5. Updated Event Type Comparison (Full Superset)
+
+| #   | Event Type           | Michael | Our Outline | Claude SDK                    | OpenAI                            | Verdict                                        |
+| --- | -------------------- | ------- | ----------- | ----------------------------- | --------------------------------- | ---------------------------------------------- |
+| 1   | Initialize           | ✅      | ✅          | SystemMessage(init)           | —                                 | **Keep**                                       |
+| 2   | Session Update       | ✅      | ✅          | —                             | —                                 | **Keep**                                       |
+| 3   | Message              | ✅      | ✅          | AssistantMessage              | output_text.delta/done            | **Keep**                                       |
+| 4   | Tool Request         | ✅      | ✅          | AssistantMessage.tool_use     | function_call_arguments           | **Keep**                                       |
+| 5   | Tool Update          | ✅      | ✅          | —                             | per-tool progress events          | **Keep**                                       |
+| 6   | Tool Response        | ✅      | ✅          | UserMessage                   | —                                 | **Keep**                                       |
+| 7   | Elicitation Request  | ✅      | ✅          | canUseTool callback           | mcp_approval_requested            | **Keep**                                       |
+| 8   | Elicitation Response | ✅      | ✅          | canUseTool return             | mcp_approval_response             | **Keep**                                       |
+| 9   | Usage                | ✅      | ✅          | ResultMessage.usage           | response.completed                | **Keep**                                       |
+| 10  | Error                | ✅      | ✅          | ResultMessage(error\_\*)      | response.failed                   | **Keep**                                       |
+| 11  | Custom               | ✅      | ✅          | —                             | —                                 | **Keep**                                       |
+| 12  | StreamEnd            | —       | ✅          | ResultMessage + SystemMessage | response.created/completed/failed | **Keep — `stream_end` with `reason` + `data`** |
+
+**Result: Our 12 event types are the right abstraction level.** Claude or
+OpenAI has a counterpart for ten of the twelve categories; only Session Update
+and Custom have no equivalent in either SDK. The granularity differences
+(OpenAI's 53 vs our 12) are implementation details that adapters handle
+internally. `stream_end` uses a single `reason` field with an open `data` bag.
+Handoff is a tool call. Pausing is implicit.
+
+---
+
+## 6. Updated ContentPart Types (Superset)
+
+```typescript
+type ContentPart = (
+  | { type: 'text'; text: string }
+  | { type: 'thought'; thought: string; thoughtSignature?: string }
+  | { type: 'media'; data?: string; uri?: string; mimeType?: string }
+  | {
+      type: 'reference';
+      text: string;
+      data?: string;
+      uri?: string;
+      mimeType?: string;
+    }
+  | { type: 'refusal'; text: string } // NEW: from OpenAI
+) &
+  // Future: type: string for unknown types from new SDKs
+  { _meta?: Record<string, unknown> };
+```
+
+Adding `refusal` as a ContentPart type (rather than a new event) keeps the event
+taxonomy stable while supporting model refusals from both Claude and OpenAI.
+
+---
+
+## 7. Key Architectural Patterns Across SDKs
+
+### Pattern: Execution Entry Points
+
+| SDK         | Entry Point                                                               | Multi-turn Pattern                              |
+| ----------- | ------------------------------------------------------------------------- | ----------------------------------------------- |
+| Michael     | `agent.send(trajectory, data)` / `session.send()` + `session.update()`    | Same method / three-method session              |
+| Our Outline | `session.stream(data)` + `session.update(config)` + `session.steer(data)` | Four-method session (stream/update/steer/abort) |
+| Claude SDK  | `query({ prompt, options })`                                              | New `query()` call with `resume: sessionId`     |
+| Claude V2   | `session.send()` + `session.stream()`                                     | Separate send/stream                            |
+| Codex SDK   | `thread.run(prompt)` / `thread.runStreamed(prompt)`                       | Same thread object                              |
+
+**Observation:** Claude V2 and Codex both use a stateful session/thread object
+(send + stream for Claude V2, run/runStreamed for Codex). Michael uses a single
+`send()` method. Our `stream()` method is the unified version: the first call
+starts, subsequent calls continue (like ADK's `runAsync()`).
+
+### Pattern: Tool Approval
+
+| SDK         | Pattern                                                  | Sync/Async               |
+| ----------- | -------------------------------------------------------- | ------------------------ |
+| gemini-cli  | PolicyEngine + ConfirmationBus                           | Async (message bus)      |
+| ADK-TS      | SecurityPlugin.policyCheck()                             | Async (plugin callback)  |
+| Claude SDK  | `canUseTool()` callback                                  | Async (callback)         |
+| OpenAI      | `mcp_approval_requested` event                           | Event-based              |
+| Our Outline | `ElicitationRequest` event + `PolicyEvaluator` interface | Both (event + interface) |
+
+**Observation:** Our approach covers both patterns: the `ElicitationRequest`
+event for event-based approval (like OpenAI), and the `PolicyEvaluator`
+interface for in-process policy checks invoked by the executor (like
+gemini-cli/ADK/Claude, all of which the table lists as async). This is the
+right superset.
+
+### Pattern: Subagent Definition
+
+| SDK           | Pattern                          | Key Fields                                                                                 |
+| ------------- | -------------------------------- | ------------------------------------------------------------------------------------------ |
+| gemini-cli    | `AgentDefinition` (local/remote) | name, description, kind, tools, model                                                      |
+| ADK-TS        | `BaseAgentConfig`                | name, description, subAgents, tools                                                        |
+| Claude SDK    | `AgentDefinition`                | description, prompt, tools, model                                                          |
+| OpenAI Agents | `Agent` class                    | name, instructions, tools, handoffs, model                                                 |
+| Our Outline   | `AgentDescriptor`                | name, description, executor, inputSchema, capabilities, ownTools, requiredTools, subAgents |
+
+**Observation:** Our `AgentDescriptor` is the most complete. Claude's `prompt`
+field and OpenAI's `instructions` are executor-level concerns (system prompt),
+not descriptor-level. The descriptor declares identity; the executor uses the
+prompt. This separation is correct.
+
+One gap: **handoffs**. OpenAI Agents has an explicit `handoffs` field listing
+which agents can be delegated to. Our `subAgents` field serves the same purpose
+but the naming implies hierarchy rather than peer delegation. Consider whether
+`subAgents` should be renamed to `delegateAgents` or kept as-is with
+documentation clarifying it covers both hierarchical and peer delegation.
+
+---
+
+## 8. Concrete Changes to outline.md
+
+Based on this analysis, the following changes should be made:
+
+### Applied (validated by multiple SDKs):
+
+1. ✅ **`type: AgentEventType`** with known values + `(string & {})`
+   (autocomplete + extensibility)
+2. ✅ **`interface AgentEvents` + mapped type** (adopted from Michael for
+   declaration merging)
+3. ✅ **`agentId` on event base** (which agent emitted this event)
+4. ✅ **`_meta` on ContentPart** (aligned with Michael)
+5. ✅ **`stream_end` event** — signals why the stream ended, with `reason`
+   field + open `data` bag
+6. ✅ **Handoff as tool call** — `transfer_to_agent` tool, not a separate event
+7. ✅ **`maxBudgetUsd` in AgentConstraints** (Claude SDK, increasingly standard)
+8. ✅ **`refusal` ContentPart type** (both Claude and OpenAI surface refusals)
+9. ✅ **`forkSession` in ExecutionRequest** (Claude SDK, valuable for
+   exploration)
+10. ✅ **`permissionMode` in ExecutionOptions** (both gemini-cli and Claude SDK)
+11. ✅ **`cost` field on Usage** (Claude SDK tracks total_cost_usd)
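+
+A minimal sketch of items 1 to 3, under stated assumptions: only `initialize`
+and `stream_end` are event names taken from these notes, while `message`, the
+payload fields, and the helper `summarize` are placeholders rather than the
+definitions in outline.md.
+
+```typescript
+// Hypothetical payloads; the real fields live in outline.md.
+interface AgentEvents {
+  initialize: { sessionId: string };
+  message: { text: string };
+  stream_end: { reason: string; data?: Record<string, unknown> };
+}
+
+// Known names get autocomplete; `string & {}` keeps the union open.
+type KnownEventType = keyof AgentEvents;
+type AgentEventType = KnownEventType | (string & {});
+
+// Mapped type: one event shape per key, with `agentId` on the event base.
+type AgentEvent = {
+  [K in KnownEventType]: { type: K; agentId: string } & AgentEvents[K];
+}[KnownEventType];
+
+// Narrowing on `type` gives access to the payload for that event.
+function summarize(event: AgentEvent): string {
+  switch (event.type) {
+    case 'initialize':
+      return `init ${event.sessionId}`;
+    case 'message':
+      return `message ${event.text}`;
+    case 'stream_end':
+      return `end (${event.reason})`;
+  }
+}
+
+// An unknown name is still a valid AgentEventType thanks to the open union.
+const custom: AgentEventType = 'vendor_specific';
+```
+
+Because `AgentEvents` is an interface, another package could add keys through
+declaration merging, and the mapped `AgentEvent` union would pick them up.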
+
+### Correctly abstracted (no change needed):
+
+- Event taxonomy (12 types) — validated as right abstraction level
+- `AgentDescriptor` shape — most complete across all SDKs
+- `AgentSession.stream/update/steer/abort` — covers all SDK patterns
+- ToolUpdate — correctly abstracts over OpenAI's 15+ tool-specific progress
+  events
+- `ElicitationRequest/Response` — covers both callback and event patterns
+- `ContentPart` types — text/thought/media/reference/refusal
+
+---
+
+## Sources
+
+- [Claude Agent SDK TypeScript Reference](https://platform.claude.com/docs/en/agent-sdk/typescript)
+- [Claude Agent SDK Streaming](https://platform.claude.com/docs/en/agent-sdk/streaming-output)
+- [Claude Agent SDK Sessions](https://platform.claude.com/docs/en/agent-sdk/sessions)
+- [Claude Agent SDK Subagents](https://platform.claude.com/docs/en/agent-sdk/subagents)
+- [OpenAI Codex SDK TypeScript](https://github.com/openai/codex/tree/main/sdk/typescript)
+- [OpenAI Codex SDK Docs](https://developers.openai.com/codex/sdk/)
+- [OpenAI Responses API Streaming Events](https://developers.openai.com/api/reference/resources/responses/streaming-events/)
+- [OpenAI Agents SDK Streaming](https://openai.github.io/openai-agents-python/streaming/)
+- [Responses API Streaming Guide (Community)](https://community.openai.com/t/responses-api-streaming-the-simple-guide-to-events/1363122)
@@ -0,0 +1,259 @@
+# Gemini CLI Architecture Notes
+
+## Project Structure
+
+**Monorepo packages:**
+
+- `packages/core/` - Main execution engine (the big one)
+- `packages/cli/` - CLI frontend
+- `packages/sdk/` - SDK for extensions
+- `packages/a2a-server/` - Agent-to-agent server
+- `packages/devtools/` - Dev utilities
+- `packages/vscode-ide-companion/` - VS Code extension
+
+## Core Execution Loop
+
+### GeminiClient (`core/src/core/client.ts` ~38KB)
+
+- **Primary orchestrator** for user interactions
+- Manages session lifecycle, message routing, model selection
+- Coordinates hooks, context management, error recovery
+- Enforces `MAX_TURNS = 100` per session
+- Tracks `currentSequenceModel` for multi-turn stickiness
+- Handles history compression when context grows
+
+### GeminiChat (`core/src/core/geminiChat.ts` ~34KB)
+
+- Bidirectional LLM communication
+- Maintains `history[]` alternating user/model turns
+- Retry logic: max 2 attempts, 500ms delay for invalid responses
+- Fires `BeforeModel` and `AfterModel` hooks
+- Integrates ChatRecordingService for persistence
+
+### Scheduler (`core/src/scheduler/scheduler.ts` ~23KB)
+
+- **Three-phase event-driven**: Ingestion → Processing → Completion
+- Tool call state machine:
+  `Validating → AwaitingApproval → Scheduled → Executing → Terminal`
+- Terminal states: `Success`, `Error`, `Cancelled`
+- Parallel execution for read-only and agent-type tools
+- Yields to event loop for user approval
+- Publishes state changes via MessageBus
+
+### CoreToolScheduler (`core/src/core/coreToolScheduler.ts` ~38KB)
+
+- Sequential, queue-based tool processing
+- Validates policy via PolicyEngine
+- Confirmation handling via ToolModificationHandler (editor integration)
+- Uses MessageBus for async confirmation responses
+
+## Tool System
+
+### DeclarativeTool Pattern
+
+- **Separation of concerns**: build() → validate → createInvocation() →
+  execute()
+- `ToolBuilder` defines metadata (name, displayName, description, kind) + schema
+  via `getSchema()`
+- `ToolInvocation` has: `getDescription()`, `toolLocations()`,
+  `shouldConfirmExecute()`, `execute()`
+- `ToolResult` contains: `llmContent` (for LLM), `returnDisplay` (for UI), error
+  details, tail calls
+
+### BaseToolInvocation
+
+- Abstract base with MessageBus integration for policy/confirmation
+- Three decision paths: ALLOW, DENY, ASK_USER via `getMessageBusDecision()`
+
+### ToolRegistry (`core/src/tools/tool-registry.ts`)
+
+- Registers tools via `registerTool()`
+- MCP tools with fully qualified names: `mcp_serverName_toolName`
+- Priority sorting: built-in → discovered → MCP (by server name)
+- Filters by active status based on configuration
+
+### Confirmation System
+
+- `ToolCallConfirmationDetails` union: edit, execute, MCP, info, ask_user,
+  exit_plan_mode
+- `ToolConfirmationOutcome` enum: ProceedOnce, ProceedAlways, etc.
+- Async confirmation via MessageBus pub/sub
+
+## Hooks System
+
+### Hook Types (11 hook points)
+
+| Hook                  | Trigger                 | Key Capability                    |
+| --------------------- | ----------------------- | --------------------------------- |
+| `BeforeTool`          | Before tool execution   | Modify tool_input                 |
+| `AfterTool`           | After tool completion   | Context injection, tail calls     |
+| `BeforeAgent`         | Before agent prompt     | Additional context                |
+| `AfterAgent`          | After agent response    | Clear context flag                |
+| `BeforeModel`         | Before LLM request      | Modify request or inject response |
+| `AfterModel`          | After LLM response      | Modify response                   |
+| `BeforeToolSelection` | Before tool selection   | Modify toolConfig                 |
+| `Notification`        | When notifications fire | Suppress/modify message           |
+| `SessionStart`        | Session begins          | Additional context                |
+| `SessionEnd`          | Session terminates      | Cleanup                           |
+| `PreCompress`         | Before compression      | Suppress/modify                   |
+
+### Hook Output Fields (common to all hooks)
+
+- `continue` - Whether execution proceeds
+- `stopReason` - Reason to halt
+- `suppressOutput` - Hide from user
+- `systemMessage` - Add to system context
+- `decision` - ask/block/deny/approve/allow
+
+### Hook System Components
+
+- `HookSystem` - Main coordinator
+- `HookRegistry` - Stores/manages configurations
+- `HookRunner` - Executes registered hooks
+- `HookAggregator` - Combines multiple hook results
+- `HookPlanner` - Determines execution order
+- `HookEventHandler` - Orchestrates event firing
+- `HookTranslator` - Converts between formats
+
+## Policy Engine
+
+### Rule Structure
+
+```typescript
+interface PolicyRule {
+  toolName: string;        // wildcards supported
+  decision: PolicyDecision; // ALLOW | DENY | ASK_USER
+  priority: number;
+  argsPattern?: RegExp;    // conditional on args
+  mcpName?: string;
+  source: string;
+}
+```
+
+### Tier Hierarchy (lowest → highest priority)
+
+1. Default (1) - Core built-in policies
+2. Extension (2) - Extension contributions
+3. Workspace (3) - Project-scoped (.gemini/)
+4. User (4) - User-provided (~/.gemini/)
+5. Admin (5) - System-level policies
+
+### Dynamic Rule Priorities (within User Tier)
+
+- 4.9 - MCP_EXCLUDED (persistent server blocks)
+- 4.4 - EXCLUDE_TOOLS_FLAG (CLI exclusions)
+- 4.3 - ALLOWED_TOOLS_FLAG (CLI allows)
+- 4.2 - TRUSTED_MCP_SERVER
+- 4.1 - ALLOWED_MCP_SERVER
+- 3.95 - ALWAYS_ALLOW (interactive selections; numerically this sits just below
+  the User tier, in the Workspace range)
+
+### Security Constraint
+
+- Extensions CANNOT contribute ALLOW rules or YOLO mode
+
+## Agent System
+
+### Agent Registry (`core/src/agents/registry.ts`)
+
+Discovery sources:
+
+1. Built-in: CodebaseInvestigator, CliHelp, Generalist, Browser
+2. User-level: `~/.gemini/agents/`
+3. Project-level: `.gemini/agents/` (requires folder trust)
+4. Extension-based: From active extensions
+
+### LocalAgentExecutor (`core/src/agents/local-executor.ts`)
+
+- Prompt processing: input augmentation → template expansion → system prompt
+  construction
+- Uses GeminiChat for accumulating conversation
+- ChatCompressionService for history management
+- Turn loop: invoke model → extract function calls → check auth → append results
+- Termination: complete_task tool, max turns, timeout
+
+### SubagentTool (`core/src/agents/subagent-tool.ts`)
+
+- Extends BaseDeclarativeTool - agents invoked like standard tools
+- Read-only status checking, user hint propagation
+- Execution: validate → optional confirmation → parameter enrichment →
+  SubagentToolWrapper
+
+### Remote Agents
+
+- A2A client manager for agent-to-agent protocol
+- Remote invocation for external agents
+- Agent acknowledgement system (security for project agents)
+
+## Model System
+
+### ModelConfigService
+
+- **Hierarchical alias system**: children override parents
+- Resolution: alias chain → level assignment → apply overrides
+- Deep merging with array override capability
+- Fallback to `chat-base` alias for unknown models
+
+### ModelRouterService
+
+Sequential strategy pattern:
+
+1. Fallback & Override
+2. Approval Mode Strategy
+3. Gemma Classifier (if enabled)
+4. Generic Classifier
+5. Numerical Classifier
+6. Default Strategy
+
+### ModelAvailabilityService
+
+Health states:
+
+- **Terminal** - permanently unavailable
+- **Sticky Retry** - failed once, can retry once per turn
+- **Healthy** - no issues
+
+## Services
+
+| Service                     | Purpose                                 |
+| --------------------------- | --------------------------------------- |
+| ChatRecordingService        | Session persistence (JSON files)        |
+| ChatCompressionService      | History summarization for token budgets |
+| ModelConfigService          | Hierarchical model config with aliases  |
+| ModelAvailabilityService    | Model health tracking                   |
+| ModelRouterService          | Model selection via strategies          |
+| FolderTrustDiscoveryService | Workspace security scanning             |
+| KeychainService             | Credential storage                      |
+| LoopDetectionService        | Detect repetitive agent loops           |
+
+## UI + Core Separation
+
+### IDE Client (`core/src/ide/ide-client.ts`)
+
+- Singleton managing CLI ↔ IDE communication via MCP
+- **Outbound** (CLI → IDE): `openDiff`, `closeDiff`
+- **Inbound** (IDE → CLI): `ide/contextUpdate`, `ide/diffAccepted`,
+  `ide/diffRejected`
+
+### Event Contract
+
+```typescript
+interface IdeContextNotification {
+  method: 'ide/contextUpdate';
+  params: { workspaceState: { openFiles: string[]; isTrusted: boolean } };
+}
+```
+
+### Confirmation Bus
+
+- `TOOL_CONFIRMATION_REQUEST` / `TOOL_CONFIRMATION_RESPONSE`
+- Detail types: edit, execute, MCP, info, ask_user, exit_plan_mode
+- Async pub/sub via MessageBus
+
+## Configuration (`core/src/config/config.ts` ~95KB!)
+
+- Tool config: core tools, allowed/excluded, MCP servers
+- File filtering: git ignore, fuzzy search, max counts, timeouts
+- Approval modes: policy engine config
+- Experiments: feature flags (GEMINI_3_1_PRO_LAUNCHED, ENABLE_ADMIN_CONTROLS,
+  etc.)
+- FolderTrust: discovery scans for commands, skills, settings, MCP, hooks
@@ -0,0 +1,296 @@
+# Deep Dive: Key Gemini-CLI Systems
+
+## Hooks System (Complete)
+
+### 11 Hook Points
+
+| Hook                | Input                                  | Key Output Capabilities                       |
+| ------------------- | -------------------------------------- | --------------------------------------------- |
+| BeforeTool          | toolName, toolInput, mcpContext        | Modify tool_input, block/allow, systemMessage |
+| AfterTool           | toolName, toolInput, toolResponse      | additionalContext, tailToolCallRequest        |
+| BeforeAgent         | prompt                                 | Additional context                            |
+| AfterAgent          | prompt, response, stopHookActive       | Clear context                                 |
+| BeforeModel         | llmRequest (GenerateContentParameters) | Modify llm_request OR inject llm_response     |
+| AfterModel          | llmRequest, llmResponse                | Modify llm_response                           |
+| BeforeToolSelection | llmRequest                             | Modify toolConfig (function list, mode)       |
+| Notification        | type, message, details                 | Suppress/modify                               |
+| SessionStart        | source (Startup/Resume/Clear)          | Additional context                            |
+| SessionEnd          | reason (Exit/Clear/Logout/etc)         | Cleanup                                       |
+| PreCompress         | trigger (Manual/Auto)                  | Suppress/modify                               |
+
+### Hook Configuration Types
+
+- **Runtime hooks** (HookType.Runtime): JS/TS functions, registered
+  programmatically
+- **Command hooks** (HookType.Command): External shell commands with JSON I/O
+
+### Exit Code Semantics (Command Hooks)
+
+- 0 = Success (allowed with system message)
+- 1 = Non-blocking error (warning, continues)
+- 2+ = Blocking failure (denied, stderr as reason)
+
+### Hook Decision Values
+
+`'ask' | 'block' | 'deny' | 'approve' | 'allow' | undefined`
+
+### Execution Strategies
+
+- **Parallel** (default): Promise.all(), independent
+- **Sequential** (opt-in per hook): Chained, output→input cascading
+
+### Aggregation
+
+- Blocking decisions: OR logic (any block → all block)
+- Field replacement: later overrides earlier
+- Tool selection: union of allowed functions, mode precedence NONE > ANY > AUTO
+
+### Trust Model
+
+- Project hooks require folder trust verification
+- TrustedHooksManager at `~/.gemini/trusted-hooks.json`
+- Environment sanitized for command hooks (sensitive vars removed)
+- `GEMINI_PROJECT_DIR` injected
+
+### Key Insight for Abstraction
+
+Hooks fire inside gemini-cli's execution loop. When ADK controls the model:
+
+- BeforeModel/AfterModel still fire because AdkGeminiModel wraps GeminiChat
+- BeforeTool/AfterTool still fire because AdkToolAdapter wraps DeclarativeTool
+- This is Dewitt's solution: adapters preserve hook injection points
+
+**For OpenRouter or opaque agents, hooks CANNOT fire unless the agent delegates
+model/tool calls back to gemini-cli.**
+
+---
+
+## Policy Engine (Complete)
+
+### TOML Rule Format
+
+```toml
+[[rules]]
+decision = "allow"           # one of: "allow" | "deny" | "ask_user"
+priority = 100               # example value; integer in 0-999
+toolName = "tool_name"       # wildcards: *, mcp_*, mcp_server_*
+mcpName = "server_name"      # MCP server filter
+argsPattern = "regex"        # matches JSON-stringified args
+commandPrefix = "cmd"        # shell command prefix match
+# commandRegex = "regex"     # shell command regex (mutually exclusive with commandPrefix)
+modes = ["default", "autoEdit", "yolo", "plan"]
+annotations = ["read-only", "experimental"]
+allowRedirection = true      # for shell commands
+allowMessage = "..."         # user-facing message on allow
+denyMessage = "..."          # user-facing message on deny
+```
+
+### 5-Tier Priority System
+
+- Tier 5 (Admin): 5.000-5.999
+- Tier 4 (User): 4.000-4.999
+- Tier 3 (Workspace): 3.000-3.999
+- Tier 2 (Extension): 2.000-2.999
+- Tier 1 (Default): 1.000-1.999
+
+Formula: `tier + (priority / 1000)`
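+
+As a worked example of the formula, a User-tier rule (tier 4) with
+`priority = 900` resolves to 4 + 900 / 1000 = 4.9. The helper below is an
+illustrative sketch; its name and the range check are assumptions, not
+gemini-cli code.
+
+```typescript
+// Illustrative helper: combines a tier (1-5) with a rule priority (0-999).
+function effectivePriority(tier: number, priority: number): number {
+  if (!Number.isInteger(priority) || priority < 0 || priority > 999) {
+    throw new RangeError(`priority must be an integer in 0-999, got ${priority}`);
+  }
+  // Dividing by 1000 keeps every rule inside its own tier's band.
+  return tier + priority / 1000;
+}
+```
+
+Since the fractional part never reaches 1, no rule in a lower tier can outrank
+a rule in a higher tier.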
+
+### 4 Approval Modes
+
+1. **default** — ASK_USER decisions prompt user
+2. **autoEdit** — File writes auto-approved with safety checking (conseca)
+3. **yolo** — All auto-approved except explicit ask_user rules
+4. **plan** — Read-only, blocks modifications, allows planning docs
+
+### Shell Command Safety
+
+- Parses multi-command sequences (&&, ;, ||)
+- Detects injection: $(...), `...`, <(...), >(...), --flag=$(...)
+- Each subcommand evaluated independently
+- DENY overrides everything; ASK_USER escalates; ALLOW only if all pass
+- Redirections (>) downgrade ALLOW → ASK_USER unless allowRedirection=true
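+
+The combination rule in the list above can be sketched as a fold over
+per-subcommand verdicts. This is an illustration only: the type and function
+names are hypothetical, parsing and injection detection are out of scope, and
+an empty command list is assumed to yield ALLOW.
+
+```typescript
+type Decision = 'ALLOW' | 'DENY' | 'ASK_USER';
+
+interface SubcommandVerdict {
+  decision: Decision; // policy result for this subcommand alone
+  hasRedirection: boolean; // true if the subcommand redirects output
+}
+
+function combineDecisions(
+  verdicts: SubcommandVerdict[],
+  allowRedirection = false,
+): Decision {
+  let result: Decision = 'ALLOW';
+  for (const v of verdicts) {
+    let d: Decision = v.decision;
+    // A redirection downgrades ALLOW to ASK_USER unless explicitly permitted.
+    if (d === 'ALLOW' && v.hasRedirection && !allowRedirection) d = 'ASK_USER';
+    if (d === 'DENY') return 'DENY'; // DENY overrides everything
+    if (d === 'ASK_USER') result = 'ASK_USER'; // ASK_USER escalates
+  }
+  return result; // ALLOW only if every subcommand passed
+}
+```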
+
+### Security Constraints
+
+- Extensions cannot contribute ALLOW rules or YOLO mode
+- Regex patterns validated for ReDoS
+- Tool name typos detected via Levenshtein distance ≤3
+- Policy file integrity: SHA-256 hash checking
+
+### Key Insight for Abstraction
+
+Policy is evaluated at the tool execution boundary. For the interface layer:
+
+- If CLI controls tool execution → policy naturally applies
+- If agent controls tool execution internally → policy bypassed (danger!)
+- This reinforces the `pauseOnToolCalls: true` approach for ADK
+- Need a `PolicyEvaluator` interface that any executor can call
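+
+One possible shape for such an interface, sketched under assumptions: the names
+`SimpleRule` and `RuleBasedEvaluator` are hypothetical, wildcard support is
+reduced to a trailing `*`, and unmatched tools fall back to ASK_USER.
+
+```typescript
+type PolicyDecision = 'ALLOW' | 'DENY' | 'ASK_USER';
+
+interface PolicyEvaluator {
+  evaluate(toolName: string, args: Record<string, unknown>): PolicyDecision;
+}
+
+interface SimpleRule {
+  toolName: string; // exact name, or a prefix ending in '*'
+  decision: PolicyDecision;
+  priority: number; // higher wins
+  argsPattern?: RegExp; // tested against JSON-stringified args
+}
+
+class RuleBasedEvaluator implements PolicyEvaluator {
+  private readonly rules: SimpleRule[];
+
+  constructor(rules: SimpleRule[]) {
+    // Sort once so the first match is also the highest-priority match.
+    this.rules = [...rules].sort((a, b) => b.priority - a.priority);
+  }
+
+  evaluate(toolName: string, args: Record<string, unknown>): PolicyDecision {
+    const serialized = JSON.stringify(args);
+    for (const rule of this.rules) {
+      const nameMatches = rule.toolName.endsWith('*')
+        ? toolName.startsWith(rule.toolName.slice(0, -1))
+        : rule.toolName === toolName;
+      if (!nameMatches) continue;
+      if (rule.argsPattern && !rule.argsPattern.test(serialized)) continue;
+      return rule.decision;
+    }
+    return 'ASK_USER'; // assumed default when nothing matches
+  }
+}
+```
+
+Any executor that runs tools itself could call `evaluate()` before each call,
+which keeps the policy decision at the tool execution boundary.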
+
+---
+
+## Tool System (Complete)
+
+### Core Abstraction Chain
+
+```
+ToolBuilder (metadata + schema)
+  → build(params) validates → ToolInvocation (ready to execute)
+    → shouldConfirmExecute() → execute(signal) → ToolResult
+```
+
+### DeclarativeTool Pattern
+
+- `build(params)` — Validate and create invocation
+- `buildAndExecute(params)` — One-step convenience
+- `validateBuildAndExecute(params)` — Non-throwing variant
+
+### BaseToolInvocation
+
+- Message bus integration for policy decisions
+- Three decision paths: ALLOW → execute, DENY → reject, ASK_USER → confirm
+
+### ToolResult Structure
+
+- `llmContent` — For LLM conversation history
+- `returnDisplay` — For UI presentation
+- `displayContent` — Additional display formatting
+- `errorDetails` — Optional error info
+- `result` — Structured data payload
+- `tailCall` — Optional chaining requests
+
+### Confirmation System (6 types)
+
+1. **edit** — File modification with diff
+2. **execute** — Command execution
+3. **mcp** — MCP tool with allowlist mgmt
+4. **info** — Information-only
+5. **ask_user** — General user approval
+6. **exit_plan_mode** — Plan exit notification
+
+### Confirmation Outcomes (7 values)
+
+ProceedOnce, ProceedAlways, ProceedAlwaysAndSave, ProceedAlwaysServer,
+ProceedAlwaysTool, ModifyWithEditor, Cancel
+
+### Tool Kinds
+
+- **Mutator**: Edit, Delete, Move, Execute
+- **Read-Only**: Read, Search, Fetch
+- **Other**: Think, Agent, Communicate, Plan, SwitchMode, Other
+
+### MCP Tools
+
+- Naming: `mcp_<server>_<toolname>` (64-char limit)
+- Schema validation via LenientJsonSchemaValidator
+- Response types: McpTextBlock, McpMediaBlock, McpResourceBlock,
+  McpResourceLinkBlock
+- Transform to GenAI Parts format
+
+### Error Types (20+)
+
+- **Recoverable**: INVALID_TOOL_PARAMS, FILE_NOT_FOUND,
+  EDIT_NO_OCCURRENCE_FOUND, SHELL_TIMEOUT, MCP_TOOL_ERROR...
+- **Fatal**: NO_SPACE_LEFT (only one!)
+
+### ModifiableTool
+
+- Extends DeclarativeTool with external editor support
+- `getModifyContext()` → temp files → editor opens → `getUpdatedParams()` → diff
+
+---
+
+## Execution Loop (Complete)
+
+### LocalAgentExecutor Flow
+
+1. Collect user hints, setup deadline timer
+2. **Turn loop**: executeTurn() repeatedly until completion
+3. Per-turn: compress chat → callModel() → processFunctionCalls()
+4. On limit hit: executeFinalWarningTurn() with 60s grace period
+5. Return OutputObject { result, terminate_reason }
+
+### AgentTerminateMode
+
+GOAL | TIMEOUT | MAX_TURNS | ABORTED | ERROR | ERROR_NO_COMPLETE_TASK_CALL
+
+### SubagentTool Architecture
+
+```
+Parent Agent
+  └─ SubagentTool (wraps AgentDefinition as DeclarativeTool)
+       └─ SubagentToolWrapper (routes by agent kind)
+            ├─ LocalSubagentInvocation → LocalAgentExecutor
+            ├─ RemoteAgentInvocation → A2AClientManager
+            └─ BrowserAgentInvocation
+```
+
+### Agent Types
+
+- `LocalAgentDefinition` — kind: 'local', has promptConfig, modelConfig,
+  runConfig, toolConfig
+- `RemoteAgentDefinition` — kind: 'remote', has agentCardUrl, auth config
+
+### Key Defaults
+
+- DEFAULT_MAX_TURNS = 15
+- DEFAULT_MAX_TIME_MINUTES = 5
+- A2A_TIMEOUT = 1800000 (30 min for remote agents)
+
+---
+
+## Services/Config (Complete)
+
+### ModelConfigService
+
+- **Alias chains**: Inheritance with `extends`, merged root-to-leaf
+- **Overrides**: Contextual (model, scope, retry, isChatModel), sorted by
+  specificity
+- **Runtime registration**: Dynamic aliases and overrides
+- **Deep merge**: Objects merged, arrays replaced entirely
+
+### ModelRouterService (Strategy Chain)
+
+1. Fallback & Override → 2. Approval Mode → 3. Gemma Classifier → 4. Generic
+   Classifier → 5. Numerical Classifier → 6. Default
+
+### ModelAvailabilityService
+
+- Terminal (permanent), Sticky_retry (one retry per turn), Healthy
+- `selectFirstAvailable()` iterates fallback chain
+- `resetTurn()` at turn boundaries enables fresh retries
+
+### Config (~95KB!)
+
+Central dependency injection. Initializes: ModelAvailabilityService →
+ModelConfigService → FolderTrustDiscoveryService → PolicyEngine →
+FileDiscoveryService → GitService → ToolRegistry → MCP → GeminiClient →
+HookSystem
+
+### CoreEventEmitter (UI Events)
+
+Event types: UserFeedback, ModelChanged, ConsoleLog, Output, RetryAttempt,
+ConsentRequest, McpProgress, Hook, QuotaChanged
+
+Backlog buffering (max 10,000) with head-pointer eviction and auto-compaction.
+
+### Scheduler Types
+
+```typescript
+ToolCallRequestInfo {
+  callId, name, args, originalRequestName,
+  isClientInitiated, prompt_id, checkpoint, traceId,
+  parentCallId, schedulerId
+}
+ToolCallResponseInfo {
+  callId, responseParts, resultDisplay, error, errorType,
+  outputFile, contentLength, data
+}
+CoreToolCallStatus: Validating → AwaitingApproval → Scheduled → Executing → Success|Error|Cancelled
+```
+
+### FolderTrust
+
+- Scans: commands (.toml), skills (SKILL.md), settings.json, MCP servers, hooks
+- Security warnings: auto-approved tools, autonomous agents, disabled trust,
+  disabled sandbox
+- Pattern: discovery → review → execution (no code runs during scan)
@@ -0,0 +1,349 @@
+# Interface Priority Analysis & Open Questions
+
+## The Big Picture
+
+We're defining **framework-agnostic interfaces** that allow gemini-cli to:
+
+1. Keep its existing execution loop working unchanged (Legacy path)
+2. Swap in ADK as an alternative runtime via config flag
+3. Eventually support OpenRouter or other agent backends
+4. Maintain all existing CLI behavior: hooks, policies, confirmations, UI events
+
+## Proposed Interface Layers (Priority Order)
+
+---
+
+### P0 (Critical Path - Must Define First)
+
+#### 1. AgentEvent / Event Stream Contract
+
+**Why first:** Everything else consumes or produces these events. The UI renders
+them. The hooks intercept them. The adapters translate to/from them.
+
+**Key decision:** Merge Dewitt's simpler model with Coworker's richer model?
+
+**Recommendation:** Coworker's approach is more complete. Key additions:
+
+- `threadId` for sub-agent tracking (AG-UI has `parentRunId`)
+- `tool_update` for progress on long-running tools
+- `elicitation_request/response` as first-class (not just tool_confirmation)
+- `usage` event for token tracking
+- `_meta` escape hatch (matches AG-UI's extensibility philosophy)
+- `initialize` event (matches AG-UI's RunStarted)
+
+**Open questions:**
+
+- Do we need AG-UI's start/content/end triple pattern for streaming? Or is
+  yielding partial events sufficient?
+- How do ContentPart types map to existing gemini-cli Part types?
+- Should events carry a `source` field? (useful for hook attribution)
+
+#### 2. Agent Interface
+
+**Why second:** This is the primary abstraction that LocalAgentExecutor, ADK
+adapters, and future OpenRouter adapters all implement.
+
+**Key decision:** Dewitt's `runAsync/runEphemeral` vs Coworker's
+`send(Trajectory|string)`
+
+**Recommendation:** Hybrid approach:
+
+- Dewitt's `runAsync/runEphemeral` split is ADK-aligned and cleaner for the
+  factory pattern
+- BUT add Coworker's elicitation support via AgentSend union type
+- The Trajectory concept is powerful but may be too opinionated for Phase 2
+
+```
+Agent<TInput, TOutput>
+  name: string
+  description: string
+  runAsync(input, options) → AsyncGenerator<AgentEvent, TOutput>
+  runEphemeral(input, options) → AsyncGenerator<AgentEvent, TOutput>
+```
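+
+One consequence of the `AsyncGenerator<AgentEvent, TOutput>` signature is that
+a `for await` loop discards the generator's return value, so a caller that
+wants `TOutput` has to drive `next()` itself. The sketch below shows this. The
+`AgentEvent` and `RunOptions` shapes and the `collect` helper are placeholders
+invented for illustration, not the outline's definitions.
+
+```typescript
+// Placeholder shapes; the real AgentEvent is defined in the event contract.
+interface AgentEvent {
+  type: string;
+  agentId: string;
+  data?: unknown;
+}
+
+interface RunOptions {
+  signal?: AbortSignal;
+}
+
+interface Agent<TInput, TOutput> {
+  name: string;
+  description: string;
+  runAsync(input: TInput, options?: RunOptions): AsyncGenerator<AgentEvent, TOutput>;
+  runEphemeral(input: TInput, options?: RunOptions): AsyncGenerator<AgentEvent, TOutput>;
+}
+
+// Forwards every yielded event to `onEvent` and resolves with the final value.
+async function collect<TOutput>(
+  stream: AsyncGenerator<AgentEvent, TOutput>,
+  onEvent: (event: AgentEvent) => void,
+): Promise<TOutput> {
+  while (true) {
+    const step = await stream.next();
+    if (step.done) return step.value; // the generator's return value
+    onEvent(step.value);
+  }
+}
+```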
+
+**Open questions:**
+
+- Should Agent also support `send()` for mid-stream interactions (elicitations)?
+- How does AbortSignal propagate through the adapter boundary?
+- Do we need a `capabilities` field (supports elicitation? supports HITL? etc.)?
+
+#### 3. Tool Execution Contract
+
+**Why third:** Tools are the primary action mechanism. Both the policy engine
+and hooks system wrap tool execution.
+
+**What needs abstracting:**
+
+- Tool declaration (name, schema) — already somewhat generic via JSON Schema
+- Tool execution (args → result)
+- Tool confirmation flow (ASK_USER → user decision → proceed/deny)
+- Tool result shape (llmContent + displayContent + error + tailCalls)
+
+**Key decision:** Keep DeclarativeTool pattern or flatten to a simpler
+interface?
+
+**Recommendation:** Define a minimal `ToolExecutor` interface:
+
+```
+ToolExecutor {
+  name: string
+  description: string
+  schema: JSONSchema
+  execute(args, context): Promise<ToolResult>
+  requiresConfirmation?(args, context): Promise<boolean>
+}
+```
+
+DeclarativeTool remains the concrete implementation. ADK's BaseTool adapts to
+this.
+
+**Open questions:**
+
+- How do MCP tools fit? They already have their own protocol.
+- Tool annotations (destructive hints) — should these be in the interface?
+- Long-running tools need progress reporting — how does this interact with
+  tool_update events?
+
+---
+
+### P1 (Important - Define After P0)
+
+#### 4. Policy / Permission Interface
+
+**Why important:** Every tool call goes through policy. External agents need
+policy enforcement too.
+
+**Current state:** gemini-cli has a sophisticated TOML-based policy engine with
+tiered priorities. ADK-TS has a simpler SecurityPlugin with PolicyOutcome
+(DENY/CONFIRM/ALLOW).
+
+**What needs abstracting:**
+
+```
+PolicyEngine {
+  evaluate(toolName, args, context): PolicyDecision  // ALLOW | DENY | ASK_USER
+  getExcludedTools(): string[]  // Tools statically denied
+}
+```
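+
+One way this could look in TypeScript, as a sketch only. The `PolicyRule` shape,
+the "highest priority wins" resolution, the `ASK_USER` fallback, and the
+`fromAdkOutcome` helper are assumptions; gemini-cli's TOML engine is richer, and
+the string union below merely stands in for ADK's `PolicyOutcome` values.
+
+```typescript
+type PolicyDecision = 'ALLOW' | 'DENY' | 'ASK_USER';
+
+// Hypothetical rule shape for illustration.
+interface PolicyRule {
+  toolName: string; // exact tool name, or '*' for any tool
+  decision: PolicyDecision;
+  priority: number; // higher wins
+}
+
+interface PolicyEngine {
+  evaluate(toolName: string, args: Record<string, unknown>, context?: unknown): PolicyDecision;
+  getExcludedTools(): string[];
+}
+
+class RulePolicyEngine implements PolicyEngine {
+  private readonly rules: PolicyRule[];
+
+  constructor(rules: PolicyRule[]) {
+    this.rules = rules;
+  }
+
+  evaluate(
+    toolName: string,
+    _args: Record<string, unknown> = {},
+    _context?: unknown,
+  ): PolicyDecision {
+    let best: PolicyRule | undefined;
+    for (const rule of this.rules) {
+      const matches = rule.toolName === toolName || rule.toolName === '*';
+      if (matches && (best === undefined || rule.priority > best.priority)) best = rule;
+    }
+    // Assumption: with no matching rule, fall back to asking the user.
+    return best ? best.decision : 'ASK_USER';
+  }
+
+  // Tools whose winning rule is DENY regardless of arguments.
+  getExcludedTools(): string[] {
+    const names = Array.from(new Set(this.rules.map((r) => r.toolName)));
+    return names.filter((n) => n !== '*' && this.evaluate(n) === 'DENY');
+  }
+}
+
+// Maps the three ADK outcome names mentioned above onto PolicyDecision.
+function fromAdkOutcome(outcome: 'DENY' | 'CONFIRM' | 'ALLOW'): PolicyDecision {
+  return outcome === 'CONFIRM' ? 'ASK_USER' : outcome;
+}
+```
+
+Since the two vocabularies differ only in `CONFIRM` versus `ASK_USER`, an
+adapter between the engines can stay this thin, at least for the decision value
+itself.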
+
+**Key decision:** Do external agents (OpenRouter, etc.) get the same policy
+enforcement?
+
+**Open questions:**
+
+- If an ADK agent calls a tool internally, does gemini-cli's policy apply?
+- With `pauseOnToolCalls: true` in ADK, the CLI controls execution — but what
+  about headless mode?
+- How do agent-level policies work? (allow/deny entire agents, not just tools)
+- Should policy be a middleware (AG-UI pattern) or a callback (ADK plugin
+  pattern)?
+
+#### 5. Hooks Interface
+
+**Why important:** Hooks are a major gemini-cli feature. They need to work
+regardless of which agent backend runs.
+
+**Current state:** 11 hook types firing at specific lifecycle points.
+
+**What needs abstracting:**
+
+- Hook lifecycle must be backend-agnostic
+- BeforeModel/AfterModel hooks need to work even when ADK controls the model
+- BeforeTool/AfterTool hooks need to intercept regardless of who executes the
+  tool
+
+**Key challenge:** When ADK runs the model internally, gemini-cli hooks can't
+easily intercept.
+
+**Dewitt's solution:** ADK uses gemini-cli's model via the AdkGeminiModel
+adapter, so hooks fire inside GeminiChat.
+
+**Open questions:**
+
+- If OpenRouter runs the model, how do BeforeModel/AfterModel hooks work?
+- Do we need a "model steering" abstraction (injecting context mid-stream)?
+- Can hooks be expressed as AG-UI middleware? (intercept event stream)
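+
+The reason the adapter approach keeps hooks working can be sketched as a wrapper
+around whatever function performs the model call. This is not the actual
+AdkGeminiModel or GeminiChat code; `ModelRequest`, `ModelResponse`, `ModelCall`,
+`ModelHooks`, and `withModelHooks` are hypothetical names, and the message shape
+is deliberately simplified.
+
+```typescript
+// Hypothetical, provider-neutral request/response shapes.
+interface ModelRequest {
+  messages: string[];
+}
+
+interface ModelResponse {
+  text: string;
+}
+
+type ModelCall = (request: ModelRequest) => Promise<ModelResponse>;
+
+interface ModelHooks {
+  beforeModel?(request: ModelRequest): ModelRequest | Promise<ModelRequest>;
+  afterModel?(response: ModelResponse): ModelResponse | Promise<ModelResponse>;
+}
+
+// Wraps a model call so hooks fire no matter which backend invokes it.
+function withModelHooks(call: ModelCall, hooks: ModelHooks): ModelCall {
+  return async (request) => {
+    const patched = hooks.beforeModel ? await hooks.beforeModel(request) : request;
+    const response = await call(patched);
+    return hooks.afterModel ? await hooks.afterModel(response) : response;
+  };
+}
+
+// Stub backend that reports how many messages it received.
+const fakeModel: ModelCall = async (request) => ({
+  text: `saw ${request.messages.length} message(s)`,
+});
+
+// BeforeModel injects context; AfterModel post-processes the response.
+const hooked = withModelHooks(fakeModel, {
+  beforeModel: (request) => ({ messages: ['[project context]', ...request.messages] }),
+  afterModel: (response) => ({ text: response.text.toUpperCase() }),
+});
+```
+
+Any backend that is handed `hooked` instead of `fakeModel` triggers both hooks
+without knowing they exist. A backend that makes its own model calls never goes
+through the wrapper, so its hooks cannot fire.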
+
+#### 6. Model / LLM Interface
+
+**Why important:** Model abstraction enables swapping LLM providers.
+
+**Dewitt's approach:** Exposes a Model interface; ADK uses it via the
+AdkGeminiModel adapter.
+
+**Coworker's approach:** Model is internal to Agent (no separate Model
+interface).
+
+**Recommendation:** Keep Dewitt's separate Model interface BUT make it
+provider-agnostic:
+
+- Remove `@google/genai` types from the interface signature
+- Define generic Message/Content types
+- Model interface is an implementation detail, not part of the Agent contract
+
+**Open questions:**
+
+- Can we define a truly provider-agnostic Model interface?
+- Or is the Model always tied to the agent backend? (ADK uses Gemini, OpenRouter
+  uses whatever)
+- Model routing (choosing which model) — is this a concern of the Model
+  interface or a separate service?
+
+---
+
+### P2 (Important but Can Follow)
+
+#### 7. Session / State Interface
+
+**Current state:** gemini-cli uses ChatRecordingService (JSON files). ADK uses
+Session with BaseSessionService.
+
+**What needs abstracting:**
+
+- Session creation/retrieval
+- State persistence across turns
+- History/trajectory management
+
+**Open questions:**
+
+- Does the trajectory (coworker's concept) replace gemini-cli's chat recording?
+- Should session state be shared between gemini-cli and the agent backend?
+
+#### 8. Elicitation / User Interaction Interface
+
+**What it covers:** Model fallback dialogs, tool confirmations, Ctrl+B
+interrupts, user questions
+
+**Current state:** gemini-cli uses ConfirmationBus + MessageBus. AG-UI uses
+frontend tools.
+
+**Open questions:**
+
+- Is elicitation just a special case of tool calls (AG-UI approach)?
+- Or is it a first-class event type (coworker's approach)?
+- How does Ctrl+B (cancel/interrupt) propagate through the agent boundary?
+
+#### 9. Configuration / Capability Discovery
+
+**What it covers:** Feature flags, experiment settings, agent capabilities
+
+**Open questions:**
+
+- How does an external agent declare its capabilities?
+- Does OpenRouter support HITL, elicitation, and tool confirmation? Each agent
+  may differ.
+- Do we need a `capabilities` negotiation step at connection time?
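+
+If a negotiation step is adopted, it could be as small as comparing required
+flags against what the agent declares. The sketch below is one possible shape;
+`AgentCapabilities`, its three flags, and `negotiate` are hypothetical and taken
+only from the questions above.
+
+```typescript
+// Hypothetical capability flags drawn from the open questions above.
+interface AgentCapabilities {
+  elicitation: boolean;
+  hitl: boolean;
+  toolConfirmation: boolean;
+}
+
+type Capability = keyof AgentCapabilities;
+
+interface NegotiationResult {
+  ok: boolean;
+  missing: Capability[];
+}
+
+// Reports which required capabilities the agent does not offer.
+function negotiate(required: Capability[], offered: AgentCapabilities): NegotiationResult {
+  const missing = required.filter((capability) => !offered[capability]);
+  return { ok: missing.length === 0, missing };
+}
+
+// Example declaration from an imaginary external agent.
+const externalAgent: AgentCapabilities = {
+  elicitation: true,
+  hitl: false,
+  toolConfirmation: true,
+};
+
+const outcome = negotiate(['hitl', 'elicitation'], externalAgent);
+```
+
+Returning the `missing` list instead of a bare boolean lets the CLI decide
+whether to refuse the agent or degrade gracefully, for example by running tools
+client-side when the agent cannot confirm them itself.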
+
+---
+
+### P3 (Future / Can Defer)
+
+#### 10. A2UI / Rich UI Interface
+
+- Declarative UI generation from agents
+- Not critical for Phase 2 but important for differentiation
+
+#### 11. Memory / Artifact Interface
+
+- ADK has memory/artifact services
+- gemini-cli has ChatRecordingService + memory tools
+- Can standardize later
+
+#### 12. Telemetry / Observability Interface
+
+- Both systems have telemetry
+- Can standardize later
+
+---
+
+## Critical Open Questions (Need Team Discussion)
+
+### 1. OpenRouter Integration Model
+
+**Question:** When OpenRouter (or any external agent) is used, what does the
+integration look like?
+
+**Option A: Full Agent Interface** — OpenRouter implements the Agent interface
+directly
+
+- Pro: Clean, uniform
+- Con: OpenRouter doesn't support HITL, hooks, policies natively
+
+**Option B: ACP Shim** — Agent Communication Protocol between CLI and external
+agents
+
+- Pro: Standards-based
+- Con: Additional protocol layer, may be premature
+
+**Option C: Model-only Integration** — OpenRouter is just an alternative Model,
+not Agent
+
+- Pro: Simpler, leverages existing agent loop
+- Con: Doesn't support OpenRouter-specific features
+
+**Recommendation:** Start with Option C (model-only). OpenRouter provides an LLM
+endpoint, and gemini-cli's own agent loop handles tools, policies, and hooks.
+Defining a provider-agnostic Model interface is therefore the key enabler.
+
+### 2. Tool Execution: Client-side vs Agent-side
+
+**Question:** Who executes tools — the CLI or the agent backend?
+
+**Option A: Always client-side** (CLI executes, agent suspends)
+
+- ADK: `pauseOnToolCalls: true`
+- Pro: CLI maintains control, policies enforced, hooks fire
+- Con: Higher latency, more round-trips
+
+**Option B: Agent-side execution** (agent runs tools internally)
+
+- Pro: Faster, simpler
+- Con: Bypasses CLI policies, hooks, confirmations
+
+**Option C: Configurable** — CLI decides per-tool or per-agent
+
+- Pro: Flexible
+- Con: Complex
+
+**Recommendation:** Use Option A for the safety-critical CLI use case. Reserve
+Option B for trusted or sandboxed sub-agents.
+
+### 3. Model Steering (Hooks that inject context mid-stream)
+
+**Question:** How do user-local hooks (like injecting project context) work with
+external agents?
+
+**Answer:** They can only work if:
+
+- The CLI controls the model (via Model interface adapter) — then BeforeModel
+  hook injects context
+- OR the agent supports a "system instruction update" mechanism
+
+For OpenRouter (under the model-only integration), model steering works because
+the CLI controls the model call. For ADK, it works because AdkGeminiModel wraps
+GeminiChat. For fully opaque agents, model steering **cannot work**; this is a
+known limitation.
+
+### 4. Elicitation Flow
+
+**Question:** When the agent needs user input (model fallback, clarification),
+how does it work?
+
+**For CLI-controlled agents:** The agent yields an `elicitation_request` event,
+the CLI renders the prompt, the user responds, and the CLI resumes the agent by
+sending the response back via
+`session.stream({ kind: 'elicitation_response', ... })`.
+
+**For external agents:** The agent uses the A2A protocol or similar to send the
+elicitation, the CLI bridges the request to the user, and the response is sent
+back over the same protocol.
+
+**Key insight:** Elicitation is fundamentally about the agent SUSPENDING and
+waiting for user input. ADK already supports this via `pauseOnToolCalls`. Can we
+generalize to `pauseOnElicitation`?
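+
+The suspend-and-resume idea can be illustrated with a plain async generator,
+where the value passed to `next()` plays the role of the elicitation response.
+This is a sketch under assumptions: `fallbackAgent` and `drive` are hypothetical,
+the event fields beyond `kind` are invented for the example, and the generator's
+`next(value)` channel stands in for the `session.stream(...)` call.
+
+```typescript
+type AgentEvent =
+  | { kind: 'text'; text: string }
+  | { kind: 'elicitation_request'; id: string; prompt: string };
+
+interface ElicitationResponse {
+  kind: 'elicitation_response';
+  id: string;
+  answer: string;
+}
+
+type Resume = ElicitationResponse | undefined;
+
+// Toy agent: suspends at the elicitation and resumes with the user's answer.
+async function* fallbackAgent(): AsyncGenerator<AgentEvent, string, Resume> {
+  yield { kind: 'text', text: 'Primary model unavailable.' };
+  const reply = yield {
+    kind: 'elicitation_request',
+    id: 'e1',
+    prompt: 'Switch to the fallback model?',
+  };
+  return reply?.answer === 'yes' ? 'used fallback' : 'stopped';
+}
+
+// CLI side: forwards elicitations to the user and feeds the answer back in.
+async function drive(
+  agent: AsyncGenerator<AgentEvent, string, Resume>,
+  askUser: (prompt: string) => Promise<string>,
+): Promise<string> {
+  let step = await agent.next(undefined);
+  while (!step.done) {
+    const event = step.value;
+    if (event.kind === 'elicitation_request') {
+      const answer = await askUser(event.prompt);
+      step = await agent.next({ kind: 'elicitation_response', id: event.id, answer });
+    } else {
+      step = await agent.next(undefined);
+    }
+  }
+  return step.value;
+}
+```
+
+The agent does no work between the `yield` and the matching `next(...)`, which
+is the same suspension property that `pauseOnToolCalls` provides for tool calls.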
+
+### 5. Sub-agent Identity and Policies
+
+**Question:** When a sub-agent spawns, does it inherit the parent's policies or
+get its own?
+
+**Current gemini-cli behavior:** Sub-agents are registered as tools and go
+through the same policy engine.
+
+**ADK behavior:** Sub-agents are child nodes in the agent tree and get the
+parent's plugins.
+
+**Recommendation:** Sub-agents inherit parent policy context. Additional
+restrictions can be layered (e.g., sub-agent X cannot use shell tool). This is
+already how gemini-cli works.
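+
+Layering restrictions on an inherited policy can be expressed as function
+composition. In this sketch, `PolicyCheck`, `inheritWithRestrictions`, and the
+tool names are hypothetical; the real policy engine takes arguments and context
+as well.
+
+```typescript
+type PolicyDecision = 'ALLOW' | 'DENY' | 'ASK_USER';
+type PolicyCheck = (toolName: string) => PolicyDecision;
+
+// Child policy = parent policy plus extra denials; it can tighten but never loosen.
+function inheritWithRestrictions(parent: PolicyCheck, deniedTools: string[]): PolicyCheck {
+  const denied = new Set(deniedTools);
+  return (toolName) => (denied.has(toolName) ? 'DENY' : parent(toolName));
+}
+
+// Toy parent policy with hypothetical tool names.
+const parentPolicy: PolicyCheck = (toolName) =>
+  toolName === 'read_file' ? 'ALLOW' : 'ASK_USER';
+
+// Sub-agent X inherits everything from the parent but cannot use the shell tool.
+const subAgentPolicy = inheritWithRestrictions(parentPolicy, ['shell']);
+```
+
+Because the child only ever adds `DENY` outcomes on top of the parent's answer,
+a sub-agent cannot end up with more permissions than the agent that spawned it.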