mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-05-13 21:32:56 -07:00
Add ADK replatforming design docs
@@ -0,0 +1,438 @@
# Architectural Design: Gemini CLI to ADK Migration

| Authors: [Adam Weidman](mailto:adamfweidman@google.com) Contributors: Reviewers: *See section [Status of this document](#status-of-this-document).* | Status: Draft Last revised: Apr 7, 2026 Visibility: Confidential |
| :--- | :--- |

---

# Goal

To migrate the Gemini CLI backend execution engine from its legacy fragmented loop structure to the Agent Development Kit (ADK). This migration will unify how agents and subagents are orchestrated, simplify state persistence, and expose a standard `AgentSession` interface for the CLI, future SDK surfaces, and subagent execution.

---

# Context

Over time, Gemini CLI has accumulated complex runtime behaviors: multi-tier tool scheduling, policy-driven approvals, payload masking, dynamic routing, and fine-grained telemetry. Integrating these with ADK requires a clean boundary that preserves Gemini CLI semantics without forking ADK core behavior.

The key migration boundary is:

- ADK runtime semantics -> Gemini CLI `AgentProtocol` / `AgentSession`
- ADK `Event` stream -> Gemini CLI `AgentEvent` stream

That boundary, not the model wrapper alone, is the architectural center of this design.

---

# Current State and Proposed Mappings

The following analysis maps existing Gemini CLI components onto ADK capabilities, citing both repositories (`gemini-cli` and `adk-js`).

## Core Runtime Architecture

The migration uses one shared ADK-backed runtime core. Every orchestrated agent, including subagents, is exposed through the same external session contract:

- **Top-level CLI agent** -> `AgentSession`
- **Future SDK entry point** -> `AgentSession`
- **Subagent execution** -> `AgentSession`

The runtime owns:

- ADK runner/session lifecycle
- tool execution
- policy integration
- routing, availability, compaction, and masking hooks
- persistence integration

The adapters own:

- `streamId` timing guarantees
- replay / reattach behavior
- translation from ADK `Event` to Gemini CLI `AgentEvent`
- top-level versus subagent event projection
- projection of a child `AgentSession` into parent-facing tool or thread events when a subagent is embedded inside another agent

The minimum shape is:

```typescript
// Gemini CLI-owned request pipeline that the model wrapper delegates to.
interface PipelineServices {
  run(
    request: LlmRequest,
    baseModel: BaseLlm,
    stream: boolean,
  ): AsyncGenerator<LlmResponse, void>;
  connect(
    request: LlmRequest,
    baseModel: BaseLlm,
  ): Promise<BaseLlmConnection>;
}

// Thin BaseLlm wrapper: it holds no request logic of its own.
class GcliAgentModel extends BaseLlm {
  constructor(
    private baseModel: BaseLlm,
    private pipeline: PipelineServices,
  ) {
    super({model: 'gcli-consolidated'});
  }

  async *generateContentAsync(
    request: LlmRequest,
    stream = false,
  ): AsyncGenerator<LlmResponse, void> {
    yield* this.pipeline.run(request, this.baseModel, stream);
  }

  async connect(request: LlmRequest): Promise<BaseLlmConnection> {
    return this.pipeline.connect(request, this.baseModel);
  }
}
```

Design rule:

- request mutation stays in the pipeline
- all orchestrated agents expose the same `AgentSession` contract
- session lifecycle, replay, approvals, and subagent projection stay in the runtime/adapters

`AdkAgentService` is the composition root for this architecture. It creates and resumes both top-level and subagent `AgentSession`s, builds scoped registries and message-bus instances, wires policy and approval bridges, and embeds child sessions through projection adapters rather than through a separate subagent runtime. See Appendix A for the intended initialization shape.

Composition rule:

- stateful tools are cloned or reinstantiated per session when needed; stateless tools may be shared
- MCP discovery is shared at the manager layer but registered into session-local tool, prompt, and resource registries
- `AgentLoopContext` is decomposed into pipeline config, tool/subagent config, session services, callback bridges, and UI projection rather than passed through as a single runtime object
- file persistence remains Gemini CLI-owned through `GcliFileSessionService extends BaseSessionService`

Persistence rule:

- persisted history is an append-only event log plus derived app/user/session state
- the session service provides atomic append, crash-safe recovery, and single-writer enforcement
- rewind truncates the event log, recomputes derived state, and invalidates confirmation/resumption state past the rewind point
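The persistence rule can be illustrated with a small in-memory sketch. `LoggedEvent`, `EventLog`, and the `stateDelta` field are hypothetical stand-ins, not the `GcliFileSessionService` API, and atomic append, crash recovery, and single-writer enforcement are deliberately left out. The point is only that derived state is folded from the log, so truncating the log also discards any confirmation state recorded past the rewind point.

```typescript
// Hypothetical event shape: each event may carry a partial state update.
interface LoggedEvent {
  id: string;
  stateDelta?: Record<string, unknown>;
}

class EventLog {
  private events: LoggedEvent[] = [];

  // Append-only: existing entries are never edited in place.
  append(event: LoggedEvent): void {
    this.events.push(event);
  }

  // Derived state is not stored separately; it is recomputed from the log.
  deriveState(): Record<string, unknown> {
    const state: Record<string, unknown> = {};
    for (const event of this.events) {
      Object.assign(state, event.stateDelta ?? {});
    }
    return state;
  }

  // Drop every event after `eventId`, then return the recomputed state.
  rewindTo(eventId: string): Record<string, unknown> {
    const index = this.events.findIndex((e) => e.id === eventId);
    if (index === -1) {
      throw new Error(`unknown event: ${eventId}`);
    }
    this.events = this.events.slice(0, index + 1);
    return this.deriveState();
  }

  get length(): number {
    return this.events.length;
  }
}

const log = new EventLog();
log.append({id: 'e1', stateDelta: {plan_mode: false}});
log.append({id: 'e2', stateDelta: {plan_mode: true, pendingConfirmation: 'c1'}});

// Rewinding to e1 removes e2 and, with it, the pending confirmation.
const rewound = log.rewindTo('e1');
```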

## 3.1 Authentication Flexibility

The CLI resolves distinct authentication flows (OAuth, ADC, Compute metadata) using standard Google libraries.

* **Current State:** Resolved in `packages/core/src/code_assist/oauth2.ts` based on `AuthType`.
  * OAuth (`LOGIN_WITH_GOOGLE`)
  * Compute Metadata Server (`COMPUTE_ADC`)
* **Constraint:** Standard `Gemini` construction in `adk-js/core/src/models/google_llm.ts` still selects backend from constructor-level `apiKey` or Vertex config. It does **not** natively accept Gemini CLI's `AuthClient`-driven auth shape.
* **Proposed ADK Mapping:** Phase 1 keeps auth in `GcliAgentModel`. The pipeline resolves refreshed credentials and injects them through `request.config.httpOptions.headers` for unary requests and `request.liveConnectConfig.httpOptions.headers` for live connections.
* **Design position:** This is a **bridge**, not native ADK auth parity. Long-term cleanup is either:
  * a `CodeAssistLlm extends BaseLlm`, or
  * an upstream ADK auth-provider abstraction.

```typescript
// `auth` is whatever credential source the pipeline resolved for this
// request; it is assumed here to yield a bare access-token string.
request.config ??= {};
request.config.httpOptions ??= {};
request.config.httpOptions.headers = {
  ...request.config.httpOptions.headers,
  Authorization: `Bearer ${await auth.getAccessToken()}`,
};
```

## 3.2 Model Steering and Mid-Stream Injection

User interjections (hints) course-correct the loop mid-turn.

* **Current State:** Steering today is tied to the legacy loop and injection services.
* **Proposed ADK Mapping (Next-Step Steering):** Supported in Phase 1. `beforeModelCallback` can read queued hints and mutate the next outbound request.
* **Proposed ADK Mapping (True Real-Time Interrupt):** Still blocked. TypeScript ADK live runtime is not complete yet, and input-stream semantics are not a stable dependency.
* **Phase 1 behavior:** user interjections are queued and prepended to the next model request at tool or turn boundaries. True in-place turn interruption remains out of scope.

## 3.3 State Management and Token Compaction

The CLI truncates large tool responses and summarizes older history to protect token budgets.

* **Current State:** `ChatCompressionService` in `packages/core/src/context/chatCompressionService.ts` implements reverse token budgeting and a two-phase verification loop.
* **Proposed ADK Mapping:** Compaction remains a Gemini CLI-owned history processor invoked by the request pipeline.
* **Design position:** Phase 1 does **not** force this onto ADK `BaseContextCompactor`. Gemini CLI compression is currently an outbound-history projection with truncation plus summary/verification sub-calls, while ADK's native compactor path is event-log-centric and mutates session history.
* **Design choice:** In Phase 1, compaction mutates **outgoing request history only**. The persisted session event log remains canonical.
* **Utility calls:** Compaction sub-calls use the same routing, auth, and availability pipeline as primary model calls.

## 3.4 Model Configuration and Hierarchical Overrides

Dynamic aliasing (for example, temperature scoped to specific sub-commands).

* **Current State:** Managed by `ModelConfigService`.
* **Proposed ADK Mapping:** Resolution stays **request-scoped**, not session-init-scoped.
* **Design choice:** The session stores requested model and override state. The runtime resolves the concrete model and temperature on each request, including retries, subagents, and utility calls.
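A minimal sketch of request-scoped resolution follows. All names (`SessionModelState`, `SCOPE_ALIASES`, `resolveForRequest`, the `summarizer` scope, `model-a`) are hypothetical and do not reflect the `ModelConfigService` API. The layering order (default, then scope alias, then session override) is an assumption made for illustration. What the sketch shows is that the session stores only the request and overrides, and each call resolves afresh, so an override changed mid-session takes effect on the very next request.

```typescript
interface GenerationSettings {
  model: string;
  temperature: number;
}

// What the session persists: the requested model and overrides,
// not a resolved configuration.
interface SessionModelState {
  requestedModel: string;
  overrides: Partial<GenerationSettings>;
}

// Hypothetical alias table keyed by scope (for example, a sub-command).
const SCOPE_ALIASES: Record<string, Partial<GenerationSettings>> = {
  summarizer: {temperature: 0},
};

const DEFAULT_TEMPERATURE = 1;

// Called once per outbound request, including retries and utility calls.
// Later layers win: default, then scope alias, then session override.
function resolveForRequest(
  session: SessionModelState,
  scope?: string,
): GenerationSettings {
  const alias = scope !== undefined ? SCOPE_ALIASES[scope] ?? {} : {};
  return {
    model: session.requestedModel,
    temperature: DEFAULT_TEMPERATURE,
    ...alias,
    ...session.overrides,
  };
}

const session: SessionModelState = {requestedModel: 'model-a', overrides: {}};
const scoped = resolveForRequest(session, 'summarizer');
const unscoped = resolveForRequest(session);

// An override set mid-session applies to the next request without re-init.
session.overrides = {temperature: 0.4};
const afterOverride = resolveForRequest(session, 'summarizer');
```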

## 3.5 Universal Policy Enforcement (TOML Rules)

Tiered workspace restrictions (for example, read-only tools in untrusted folders).

* **Current State:** Intercepted at tool scheduling time in legacy scheduler loops.
* **Proposed ADK Mapping (Container):** Standardize on ADK `SecurityPlugin`.
* **Proposed ADK Mapping (Decision Brain):** Implement `GcliPolicyEngineAdapter implements BasePolicyEngine`.
* **Richer context:** The adapter is fed session/mode/subagent/MCP metadata from the runtime so current Gemini CLI policy semantics are preserved as closely as possible.
* **Phase 1 suspension flow:** current tool approvals use existing Gemini CLI callbacks. This is intentionally **non-native** and **non-resumable across process death**.
* **Long-term target:** move the same policy decisions onto native ADK confirmation/resumption semantics once the broader live/elicitation surface is ready.

```typescript
interface GcliPolicyBridge {
  evaluate(context: ToolCallPolicyContext): Promise<PolicyCheckResult>;
}

export class GcliPolicyEngineAdapter implements BasePolicyEngine {
  constructor(private readonly policyBridge: GcliPolicyBridge) {}

  async evaluate(context: ToolCallPolicyContext): Promise<PolicyCheckResult> {
    return this.policyBridge.evaluate(context);
  }
}
```

## 3.6 Telemetry and Observability (Clearcut Tracking)

Hardware metrics, token counts, and step durations.

* **Current State:** `ClearcutLogger` reads system metrics and relies on deep scheduler hooks for latency accounting.
* **Proposed ADK Mapping:** Use a combination of:
  * passive event-stream observation, and
  * explicit runtime instrumentation where passive ADK events are insufficient.
* **Correlation rule:** tool timing is keyed by `functionCall.id`, **not** by `event.id`.
* **Design position:** Passive stream interception alone is not assumed to provide full parity.
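The correlation rule can be sketched as a passive observer. `TimedEvent` and `ToolTimingTracker` are hypothetical shapes chosen for illustration, not ADK or Gemini CLI types. The call and its response arrive on different events with different event ids, so only the function-call id can pair them.

```typescript
// Minimal stand-in for the parts of an event that matter for timing.
interface TimedEvent {
  id: string; // event id: differs between the call event and the response event
  timestampMs: number;
  functionCalls?: {id: string; name: string}[];
  functionResponses?: {id: string; name: string}[];
}

class ToolTimingTracker {
  private readonly startedAt = new Map<string, number>();
  readonly durationsMs = new Map<string, number>();

  // Passive observation: reads events, never modifies them.
  observe(event: TimedEvent): void {
    for (const call of event.functionCalls ?? []) {
      this.startedAt.set(call.id, event.timestampMs);
    }
    for (const response of event.functionResponses ?? []) {
      const start = this.startedAt.get(response.id);
      if (start === undefined) continue; // response without a recorded call
      this.durationsMs.set(response.id, event.timestampMs - start);
      this.startedAt.delete(response.id);
    }
  }
}

const tracker = new ToolTimingTracker();
tracker.observe({
  id: 'evt-1',
  timestampMs: 1000,
  functionCalls: [{id: 'call-a', name: 'read_file'}],
});
tracker.observe({
  id: 'evt-2',
  timestampMs: 1250,
  functionResponses: [{id: 'call-a', name: 'read_file'}],
});
```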

## 3.7 Dynamic Model Routing and Configurability

Banning a model mid-turn, auto-routing via classifiers, and falling back dynamically without reinitializing the session.

* **Current State:** Managed by `ModelRouterService` and a chain of `RoutingStrategy` implementations which require the full `RoutingContext` (`history`, `request`, `AbortSignal`).
* **Proposed ADK Mapping:** Mostly implementable now, but not “100% possible today” without bridge logic.
* **Design choice:** The runtime constructs a proper `RoutingContext` from:
  * canonical session history
  * the pending user request
  * requested model
  * abort signal
* **Execution point:** routing remains request-scoped and runs before final dispatch.
* **Model banning:** treated as routing/fallback selection, not as a synthetic terminal model error.

## 3.8 Fallbacks and Availability Management

Ensuring availability by retrying or switching models when rate limits (429s) or terminal faults occur.

* **Current State:** Managed by `ModelAvailabilityService` and `ModelPool`.
* **Proposed ADK Mapping (Preflight):** availability and fallback selection run before dispatch and before routing commits to a concrete model.
* **Proposed ADK Mapping (Post-failure):** handled separately by the runtime. Availability state mutation and retry decisions are not treated as pure request preprocessing.
* **Global Application:** utility calls use the same availability/fallback services as primary model calls.
* **Retry safety rule:** automatic full-turn replay is allowed only before any side-effecting tool has executed. After side effects, the runtime surfaces the failure and requires explicit user action.
* **Phase 1 transition path:** approvals and fallback prompts continue to use existing Gemini CLI callbacks. The doc treats this as an internal bridge, not as an existing standard ADK elicitation API.
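The retry safety rule reduces to a single check over the tools that already ran in the failed turn. The sketch below assumes each executed tool is tagged with a `sideEffecting` flag; that flag, `ExecutedTool`, and `decideAfterFailure` are hypothetical names, and how a real tool would be classified is outside this sketch.

```typescript
// Hypothetical record of a tool that already ran during the failed turn.
interface ExecutedTool {
  name: string;
  sideEffecting: boolean;
}

type FailureAction = 'auto_replay' | 'surface_to_user';

// Full-turn replay is only safe while nothing with side effects has run.
function decideAfterFailure(executed: readonly ExecutedTool[]): FailureAction {
  return executed.some((tool) => tool.sideEffecting)
    ? 'surface_to_user'
    : 'auto_replay';
}

const readOnlyTurn = decideAfterFailure([
  {name: 'read_file', sideEffecting: false},
]);
const mutatingTurn = decideAfterFailure([
  {name: 'read_file', sideEffecting: false},
  {name: 'write_file', sideEffecting: true},
]);
```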

## 3.9 State-Driven Mode Switching (Plan Mode)

Dynamically shifting system prompts and active tools when users switch interaction tiers (for example, Chat Mode to Plan Mode).

* **Current State:** Toggled via `/plan`, which changes `ApprovalMode` and related legacy scheduler behavior.
* **Proposed ADK Mapping (Dynamic Prompt):** Use `InstructionProvider` in `LlmAgentConfig.instruction`.
* **Proposed ADK Mapping (Dynamic Tooling):** Use a custom `BaseToolset` whose filter is derived from session mode.
* **Single source of truth:** mode is session/runtime state derived from current approval mode; prompt, toolset, policy, routing, and UI all consume the same state.

```typescript
export class GcliModeAwareToolset extends BaseToolset {
  constructor(
    private readonly chatTools: BaseTool[],
    private readonly planTools: BaseTool[],
  ) {
    super(() => true);
  }

  async getTools(context?: ReadonlyContext): Promise<BaseTool[]> {
    const isPlan = context?.state.get('plan_mode') === true;
    return isPlan ? this.planTools : this.chatTools;
  }

  async close(): Promise<void> {}
}
```

## 3.10 Tool Output Masking

Managing context window efficiency by offloading bulky tool outputs (for example, shell logs and large file reads) to files.

* **Current State:** `ToolOutputMaskingService` in `packages/core/src/context/toolOutputMaskingService.ts`.
* **Proposed ADK Mapping:** masking runs on **outgoing request history only**, after compaction.
* **Design choice:** persisted session history remains the canonical event log. Masking artifacts are session-scoped files referenced from the outbound request projection.
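The "outgoing request history only" rule, with masking applied after compaction, can be sketched as a pure projection over the persisted log. Everything here is a hypothetical stand-in: `HistoryEntry`, the summary stub, the `tool-output-N.txt` naming, and the thresholds are illustrative and do not describe `ChatCompressionService` or `ToolOutputMaskingService`. No file is actually written. The property being shown is that the projection returns new entries and leaves the canonical log untouched.

```typescript
interface HistoryEntry {
  role: 'user' | 'model' | 'tool';
  text: string;
}

// Hypothetical compaction: replace all but the last `keep` entries
// with a single summary stub.
function compact(history: readonly HistoryEntry[], keep: number): HistoryEntry[] {
  if (history.length <= keep) {
    return history.map((entry) => ({...entry}));
  }
  const dropped = history.length - keep;
  return [
    {role: 'user', text: `[summary of ${dropped} earlier entries]`},
    ...history.slice(-keep).map((entry) => ({...entry})),
  ];
}

// Hypothetical masking: swap bulky tool output for a file reference.
function maskToolOutputs(history: HistoryEntry[], maxChars: number): HistoryEntry[] {
  return history.map((entry, index) =>
    entry.role === 'tool' && entry.text.length > maxChars
      ? {...entry, text: `[output offloaded to tool-output-${index}.txt]`}
      : entry,
  );
}

// Masking runs after compaction; the input log is never mutated.
function projectOutbound(log: readonly HistoryEntry[]): HistoryEntry[] {
  return maskToolOutputs(compact(log, 2), 100);
}

const persistedLog: HistoryEntry[] = [
  {role: 'user', text: 'hello'},
  {role: 'model', text: 'hi'},
  {role: 'user', text: 'run the build'},
  {role: 'tool', text: 'x'.repeat(500)},
];
const outbound = projectOutbound(persistedLog);
```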

---

# 5. Known Gaps in ADK (Gating Blockers)

This section highlights existing gaps in standard ADK that prevent a seamless cutover without bridge logic or upstream changes.

## 5.1 Real-Time User Message Injections (Aborted Turns)

While next-step steering is possible today using `beforeModelCallback`, true real-time interruption requires:

- stable input-stream support, and
- a complete TypeScript live runtime

Until then, the supported behavior is queued next-step steering at model boundaries, not mid-stream interruption.

## 5.2 Conversation Rewind and State Reversal

Translating manual trajectory drops to ADK runtime state is cumbersome. While Python ADK supports rollback, TypeScript ADK does not yet support it natively.

* **Resolution Strategy:** Gemini CLI implements rewind in `GcliFileSessionService`, but **not** as a shallow JSON edit. Rewind truncates the event log, recomputes derived state, and invalidates any confirmation/resumption state past the rewind point.

## 5.3 Phase 1 Non-Goals

To keep the migration surface bounded, Phase 1 intentionally excludes:

- non-interactive surfacing of `elicitation_request` / `elicitation_response`
- true live/bidi interruption semantics
- native ADK confirmation/resumption parity for approvals and fallback prompts

---

# Long-Term Vision: Unification of Agents and Subagents

The long-term vision is that subagents and the primary agent share:

- the same runtime core
- the same tool definitions
- the same policy constraints
- the same configuration schemas
- the same orchestration contract: `AgentSession`

What may differ is the **embedding adapter**:

- standalone/top-level agent -> consumed directly as `AgentSession`
- subagent embedded by a parent agent -> projected from its `AgentSession` into parent-facing tool or child-thread events

For SDK-first unification, Gemini CLI orchestration targets `AgentSession`, not a specific ADK tool abstraction. When a subagent must be exposed to a parent model as a tool, Gemini CLI wraps that child `AgentSession` with a `FunctionTool` or custom `BaseTool` projection. `AgentTool` remains the native ADK nested-agent option if we later want full ADK-native nested-agent semantics.

---

# Migration Sequence and Unification Checklist

The migration remains non-sequential, but each phase has an explicit success condition:

- [ ] **Non-interactive session parity** `#22699`
  output/error parity, basic replay correctness, feature-flagged rollout
- [ ] **Interactive session parity** `#22701`
  projected event parity, approval bridge wiring, no live-interrupt claim
- [ ] **Subagent adapter parity** `#22700`
  tool isolation, activity projection, policy/routing parity for child runs
- [ ] **ADK session conformance** `#22974`
  `AgentSession` timing/replay guarantees preserved by the adapter
- [ ] **Skills parity** `#22966`
  skills work through the shared runtime without special legacy paths
- [ ] **Policy and confirmation parity** `#22964`
  phase-1 callback bridge stable; native confirmation migration scoped separately
- [ ] **Compaction quality parity** `#22979`
  comparable summarization and truncation quality to current behavior

---

# Appendix A: Initialization Sketch

The main agent and subagents share the same composition root. The difference is whether the resulting `AgentSession` is consumed directly or projected into a parent session.

```typescript
// Services shared across every session.
const shared = {
  baseModel,
  modelConfigService,
  availabilityService,
  router,
  authService,
  policyEngine,
  mcpClientManager,
  skillCatalog,
  baseToolCatalog,
  sessionService,
};

const agentService = new AdkAgentService(shared);

// Builds one session; passing parentSessionId marks it as a subagent session.
function createSession(definition, parentSessionId?) {
  const sessionRecord = sessionService.create({ definition, parentSessionId });
  const registries = buildScopedRegistries(definition, parentSessionId);
  const pipeline = createPipeline({
    sessionId: sessionRecord.id,
    agentId: definition.name,
    parentSessionId,
  });

  const model = new GcliAgentModel(baseModel, pipeline);
  const runtime = createAdkRuntime({
    definition,
    model,
    sessionRecord,
    registries,
    policyEngine,
  });

  return new AgentSession(new AdkAgentProtocolAdapter(runtime));
}

// Exposes a subagent to the parent model as a tool.
function createSubagentTool(definition) {
  return new FunctionTool(async (input, toolContext) => {
    const child = createSession(definition, toolContext.sessionId);
    const stream = await child.send(userMessage(input));

    for await (const event of child.stream(stream)) {
      projectChildEventToParent(toolContext, event);
    }

    return collectFinalText();
  });
}
```

Child session projection shape:

```typescript
emit(tool_request({ requestId, name: definition.name, args: input }));

for await (const event of childSession.stream({ streamId })) {
  // Child messages and tool progress surface to the parent as progress updates.
  if (event.type === 'message' || event.type === 'tool_update') {
    emit(
      tool_update({
        requestId,
        content: projectDisplayContent(event),
      }),
    );
    continue;
  }

  // A child error terminates the projection with an error tool_response.
  if (event.type === 'error') {
    emit(
      tool_response({
        requestId,
        name: definition.name,
        isError: true,
        content: projectErrorContent(event),
      }),
    );
    return;
  }

  // Normal completion: the collected child result becomes the tool_response.
  if (event.type === 'agent_end') {
    emit(
      tool_response({
        requestId,
        name: definition.name,
        content: collectFinalChildResult(),
      }),
    );
    return;
  }
}
```

Only the final `tool_response` is returned to the parent model. `tool_update` remains a progress/UI projection surface.

Service lifetime split:

- shared across sessions: model config, availability, routing, auth, policy engine, MCP manager, skill catalog
- scoped per agent/session: `AgentSession`, pipeline instance, tool/prompt/resource registries, derived message bus
- scoped per invocation: shell process, MCP request, confirmation continuation, tool progress updates

---

# Status of this document approvals table {#status-of-this-document}

| #begin-approvals-addon-section See [go/g3a-approvals](http://goto.google.com/g3a-approvals) for instructions on adding reviewers. |
| :---: |
@@ -0,0 +1,170 @@
# M5 Implementation Details

Companion to `replatforming_design.md`. Per-flow walkthroughs, the routing
and masking integration recipes, and the seam-level acceptance criteria
for the early scaffolding PRs. Anything strategic goes back into the
design doc; this file is the mechanical reference.

## Surface Phases

M5 lands by runnable surface, not by isolated subsystem. Each phase wires
one call site end-to-end before the next starts.

**Phase A: Non-interactive.** Wires the ADK runtime through the
non-interactive `AgentSession` path. Scope: translator, `GcliAgentModel`
dispatch through `ContentGenerator`, `GcliRoutingProcessor`, 429 retry +
`handleFallback` + `ModelAvailabilityService` (silent-policy branch swaps
models without a UI prompt; non-silent intents are unreachable until the
TUI handler lands), abort, scheduler-backed tool execution,
`ToolOutputMaskingProcessor`, `LoopDetectionAdkPlugin`, native
`MCPToolset` (`tools/list` + `tools/call`).

**Phase B: Local subagents.** Same `AdkAgentProtocol` as Phase A; new
work is subagent-only behavior: factory wiring at
[`subagent-tool-wrapper.ts:97`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/subagent-tool-wrapper.ts#L97), `complete_task` terminator, grace-period
recovery turn, scoped execution wrappers (workspace, memory-inbox,
auto-memory-extraction), and the `onWaitingForConfirmation` activity
signal. Event propagation to the parent is already an M4 property, so no
new wiring is needed.

**Phase C: Interactive.** Wires the TUI on top of the proven runtime.
Scope: stream rendering parity, `session_update`, `HookBridgePlugin` +
lifecycle hooks + `Notification`, plan mode (`InstructionProvider` +
mode-aware tools), user steering injection, the TUI-registered
`fallbackModelHandler` callback (unlocks `retry_once`/`stop`/`upgrade`),
and `/rewind`. **All `/slash commands` live outside the protocol
layer**; `/rewind` is a slash-command/runtime-adapter concern.

## Walkthroughs

**Typical message:** user input → `AdkAgentProtocol.send` → emit `agent_start` (`BeforeAgent` hook fires) → emit deterministic `session_update{model}` once `config.getModel()` is resolved (do NOT wait on first translator output, because the first output may be `error` or `usage`) → `Runner.runAsync` → request processors run in one explicit ordered list (ADK defaults through `CONTENT_REQUEST_PROCESSOR`, then `ToolOutputMaskingProcessor`, then `GcliRoutingProcessor`, then the remaining ADK defaults; supplying `config.requestProcessors` replaces ADK's default list, so we must include the defaults ourselves) → tool preprocessing → `beforeModelCallback` runs guards + `BeforeModel` / `BeforeToolSelection` hooks → `GcliAgentModel.generateContentAsync` dispatches through `config.getContentGenerator().generateContentStream(...)` → translator emits `message` + `tool_request` events → `toAdkTool.execute` calls `scheduler.schedule` (`BeforeTool` hook, approval, exec, per-call truncation at 40k chars via [`tool-executor.ts:196-292`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/tool-executor.ts#L196-L292), `AfterTool` hook) → feeds back → loop iterates → final `message` events → `agent_end` (`AfterAgent` hook fires). ADK currently dispatches function calls sequentially within a turn ([`functions.ts:345-491`](https://github.com/google/adk-js/blob/main/core/src/agents/functions.ts#L345-L491)), and `scheduler.schedule` queues concurrent callers, so there is no interleaving risk.
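The processor ordering described in the walkthrough amounts to splicing the Gemini CLI processors into the default list immediately after the content processor. The helper below is a hypothetical sketch over plain string labels. `DEFAULT_BEFORE` and `DEFAULT_AFTER` are placeholders for whatever ADK's default list actually contains, and a real implementation would operate on processor instances rather than names.

```typescript
// Insert `extra` right after `anchor`, keeping every default in place.
// Generic so it works for labels here and for processor objects elsewhere.
function spliceAfter<T>(
  defaults: readonly T[],
  anchor: T,
  extra: readonly T[],
): T[] {
  const index = defaults.indexOf(anchor);
  if (index === -1) {
    throw new Error('anchor processor is not in the default list');
  }
  return [
    ...defaults.slice(0, index + 1),
    ...extra,
    ...defaults.slice(index + 1),
  ];
}

// Placeholder default list; only CONTENT_REQUEST_PROCESSOR is named in the doc.
const adkDefaults = ['DEFAULT_BEFORE', 'CONTENT_REQUEST_PROCESSOR', 'DEFAULT_AFTER'];

// Masking runs before routing, so routing sees the masked contents.
const processorOrder = spliceAfter(adkDefaults, 'CONTENT_REQUEST_PROCESSOR', [
  'ToolOutputMaskingProcessor',
  'GcliRoutingProcessor',
]);
```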

**429 fallback:** `GcliAgentModel` owns the same retry loop shape that `GeminiChat` owns today: the content-generator call is inside a `retryWithBackoff` `apiCall` closure, `onPersistent429` calls `handleFallback(config, currentModel, authType, error)`, the handler applies the existing `ModelAvailabilityService` transition and may call `config.activateFallbackMode(...)` for `retry_always`, then the retry loop resets attempts and re-runs the same closure. Do not throw to ADK Runner for retry; `LlmAgent.runAndHandleError` converts thrown model errors into error responses/events. Fallback retry must re-resolve the same current model/config path used by `GcliRoutingProcessor`, and the parity test must cover a fallback that changes the concrete model so we do not dispatch with stale model-specific config or tool declarations. The TUI's `fallbackModelHandler` callback ([`handler.ts:89`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/fallback/handler.ts#L89)) is what unlocks the non-silent intents (`retry_once`, `stop`, `upgrade`); non-interactive runs without one and `handleFallback` returns `null` after silent transitions. That is by design, not a Phase A limitation.

**Loop detection:** `LoopDetectionAdkPlugin.onEventCallback` feeds each `partial:true` delta to `LoopDetectionService.addAndCheck`, and ignores the `partial:false` consolidated event so the accumulated buffer isn't double-counted. On terminate the plugin calls `protocol.abort()` with reason `LOOP_DETECTED`; the protocol emits a synthetic `error{_meta.code:'LOOP_DETECTED'}` event and the abort propagates through `InvocationContext.abortSignal` into `GcliAgentModel.generateContentAsync`. The plugin does NOT mutate the in-flight ADK event and does NOT abort inside `onEventCallback`, because that would drop the event before yield ([`runner.ts:332-334`](https://github.com/google/adk-js/blob/main/core/src/runner/runner.ts#L332-L334)).
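The partial-versus-consolidated rule can be sketched with a toy detector. `RepeatDetector` and `LoopGuard` are hypothetical and far simpler than `LoopDetectionService`: the detector only flags the same chunk arriving several times in a row. Recording an `abortRequested` flag that the caller acts on after the event has been yielded is one possible reading of "does NOT abort inside `onEventCallback`", assumed here for illustration.

```typescript
interface StreamEvent {
  partial: boolean;
  text: string;
}

// Toy detector: flags a loop when the same chunk repeats `limit` times in a row.
class RepeatDetector {
  chunksSeen = 0;
  private last = '';
  private run = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  addAndCheck(chunk: string): boolean {
    this.chunksSeen += 1;
    this.run = chunk === this.last ? this.run + 1 : 1;
    this.last = chunk;
    return this.run >= this.limit;
  }
}

class LoopGuard {
  abortRequested = false;
  private readonly detector: RepeatDetector;

  constructor(detector: RepeatDetector) {
    this.detector = detector;
  }

  // Returns the event untouched; it only records that an abort should follow.
  onEvent(event: StreamEvent): StreamEvent {
    // Consolidated (non-partial) events repeat text already fed as deltas.
    if (event.partial && this.detector.addAndCheck(event.text)) {
      this.abortRequested = true;
    }
    return event;
  }
}

const detector = new RepeatDetector(3);
const guard = new LoopGuard(detector);
guard.onEvent({partial: true, text: 'ab'});
guard.onEvent({partial: true, text: 'ab'});
guard.onEvent({partial: false, text: 'abab'}); // consolidated: ignored
const abortAfterConsolidated = guard.abortRequested;
guard.onEvent({partial: true, text: 'ab'});
const abortAfterThirdDelta = guard.abortRequested;
```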

**`/rewind`:** slash command → slash-command/runtime-adapter (NOT an `AgentProtocol` method) → `ChatRecordingService.rewindTo(id)` truncates `ConversationRecord.messages` ([`chatRecordingService.ts:743-762`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/services/chatRecordingService.ts#L743-L762)) → `sessionService.deleteSession(...)` drops the ADK session → emits `session_update`. Next `send()` calls `sessionService.createSession(...)` and then calls `appendEvent` for each retained `ConversationRecord` message, converted to an ADK `Event` (`InMemorySessionService.createSession` has no seed-events parameter; see [`in_memory_session_service.ts:54-67`](https://github.com/google/adk-js/blob/main/core/src/sessions/in_memory_session_service.ts#L54-L67)). Conversion fields: ISO timestamp → epoch ms, `type:'user'|'gemini'` → `author`, synthesized deterministic `invocationId` (ConversationRecord has none), `ToolCallRecord` → `FunctionCall` + `FunctionResponse` Parts, thoughts and `TokensSummary` → `actions.customMetadata`. Rewind-injected events are non-partial by construction. Wired in Phase C alongside the rest of the interactive surface.
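A sketch of the text-only part of that conversion follows. `RecordedMessage` and `ReplayEvent` are simplified, hypothetical stand-ins for the `ConversationRecord` message and the ADK `Event`. The `rewind-<sessionId>-<index>` scheme is only one way to make `invocationId` deterministic, and mapping `gemini` to an agent name is an assumption. Tool calls, thoughts, and token summaries are omitted.

```typescript
// Simplified stand-in for a retained ConversationRecord message.
interface RecordedMessage {
  timestamp: string; // ISO 8601
  type: 'user' | 'gemini';
  text: string;
}

// Simplified stand-in for the ADK event that gets appended on replay.
interface ReplayEvent {
  invocationId: string;
  author: string;
  timestamp: number; // epoch milliseconds
  partial: false; // rewind-injected events are non-partial by construction
  content: {role: 'user' | 'model'; parts: {text: string}[]};
}

function toReplayEvent(
  message: RecordedMessage,
  index: number,
  sessionId: string,
  agentName: string,
): ReplayEvent {
  const isUser = message.type === 'user';
  return {
    // Derived only from stable inputs, so re-seeding yields the same id.
    invocationId: `rewind-${sessionId}-${index}`,
    author: isUser ? 'user' : agentName,
    timestamp: Date.parse(message.timestamp),
    partial: false,
    content: {
      role: isUser ? 'user' : 'model',
      parts: [{text: message.text}],
    },
  };
}

const retained: RecordedMessage[] = [
  {timestamp: '1970-01-01T00:00:01.000Z', type: 'user', text: 'list files'},
  {timestamp: '1970-01-01T00:00:02.000Z', type: 'gemini', text: 'Here they are.'},
];
const replayEvents = retained.map((message, index) =>
  toReplayEvent(message, index, 's1', 'main_agent'),
);
```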

**Subagent invocation:** Parent ADK `LlmAgent` dispatches an M4 subagent tool → [`subagent-tool-wrapper.ts:97`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/subagent-tool-wrapper.ts#L97) calls the session factory (today it constructs `LocalSubagentInvocation` directly; see [`local-invocation.ts:48`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-invocation.ts#L48)) → factory returns an ADK-backed `AgentProtocol` when the flag is on, legacy `LocalSubagentInvocation` otherwise → child constructs its own independent `Runner` with its own `LlmAgent`, plugins, tools, and `GcliSessionService` → child events stream back through the M4 wrapper to the parent as `tool_update`s and a terminal `tool_response`. Both implementations satisfy `AgentProtocol`; the factory selects between them. Existing parent-side telemetry on [`subagent-tool.ts:224`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/subagent-tool.ts#L224) / [`subagent-tool-wrapper.ts:75`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/subagent-tool-wrapper.ts#L75) is the correlation point, so M5 needs no aggregation changes.
|
||||
|
||||
## Model routing
|
||||
|
||||
ADK ships a `RoutedLlm` that can sit in front of a `BaseLlm` and pick a model
|
||||
per call. M5 does not use it. Gemini CLI already owns the routing stack
|
||||
(`ModelRouterService`, sequence-sticky model, `applyModelSelection`,
|
||||
`ModelConfigService`, `ModelAvailabilityService`, fallback config), and that
|
||||
behavior has to stay byte-for-byte the same through the deprecation window.
|
||||
Bolting a second routing layer on top would not preserve behavior; it would
|
||||
give us two places to keep in sync.
|
||||
|
||||
The integration is a custom ADK **request processor**, not a step inside
|
||||
`BaseLlm`. ADK runs request processors at [`llm_agent.ts:731-743`](https://github.com/google/adk-js/blob/main/core/src/agents/llm_agent.ts#L731-L743) before
|
||||
tool preprocessing ([`llm_agent.ts:746`](https://github.com/google/adk-js/blob/main/core/src/agents/llm_agent.ts#L746)) and before
|
||||
`BaseLlm.generateContentAsync` ([`llm_agent.ts:1024`](https://github.com/google/adk-js/blob/main/core/src/agents/llm_agent.ts#L1024)). Selecting the model
|
||||
inside `generateContentAsync` is too late — tool declarations and
|
||||
generation config have already been materialized from whatever model was
|
||||
on the `LlmRequest` when it arrived.
|
||||
|
||||
Per-call flow inside `GcliRoutingProcessor.runAsync(invocationContext, llmRequest)`:
|
||||
|
||||
1. Build a `RoutingContext` from the masked `llmRequest.contents` + `invocationContext.session` + `Config` + `abortSignal`. Same shape `GeminiClient.processTurn` constructs today.
|
||||
2. If the current sequence model is set, keep using it for this tool-call sequence. Otherwise call `config.getModelRouterService().route(routingContext)`. Sequence-sticky state moves to `Config` (or its successor) in PR #14 with named reset points — today it lives on `GeminiClient` ([`client.ts:101`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L101), [`:339`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L339)).
|
||||
3. Run the selection through the existing path: `applyModelSelection(...)`, `ModelConfigService`, `ModelAvailabilityService`. Active model, availability state, fallback config, and resolved generation config stay consistent with the legacy run loop. Set the current sequence model.
|
||||
4. Rewrite `llmRequest.model`, `llmRequest.config` (system instructions, generation config), and the tool subset in place. ADK's tool preprocessing then sees the resolved model.
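The sequence-sticky part of the flow above can be sketched as follows. This is a minimal illustration, not the actual `GcliRoutingProcessor`: every type and name ending in `Sketch`, the `resetSequence` method, and the synchronous router are assumptions made for brevity, and the real `route(...)` call and ADK request types are richer.

```typescript
// Hypothetical, simplified shapes; the real ADK and Gemini CLI types are richer.
interface LlmRequestSketch {
  model?: string;
  contents: string[];
  config: { systemInstruction?: string };
}

interface RoutingDecisionSketch {
  model: string;
  systemInstruction: string;
}

// Stand-in for route(...) plus the applyModelSelection / availability path.
// Assumed synchronous here for brevity.
type RouterSketch = (contents: readonly string[]) => RoutingDecisionSketch;

class RoutingProcessorSketch {
  // Sequence-sticky state: set by the first call of a tool-call sequence.
  private sequenceDecision: RoutingDecisionSketch | undefined;

  constructor(private readonly route: RouterSketch) {}

  // One of the named reset points, e.g. the start of a new user turn.
  resetSequence(): void {
    this.sequenceDecision = undefined;
  }

  // Rewrites the request in place so later processors see the resolved model.
  run(request: LlmRequestSketch): void {
    const decision = this.sequenceDecision ?? this.route(request.contents);
    this.sequenceDecision = decision;
    request.model = decision.model;
    request.config.systemInstruction = decision.systemInstruction;
  }
}

// Usage: the router is consulted once per sequence, not once per model call.
let routeCalls = 0;
const routingProcessor = new RoutingProcessorSketch((contents) => {
  routeCalls += 1;
  return {
    model: contents.length > 2 ? 'large-model' : 'small-model',
    systemInstruction: 'be concise',
  };
});

const firstRequest: LlmRequestSketch = { contents: ['hi'], config: {} };
routingProcessor.run(firstRequest);

const secondRequest: LlmRequestSketch = {
  contents: ['hi', 'tool call', 'tool result'],
  config: {},
};
routingProcessor.run(secondRequest); // sticky: reuses the first decision

routingProcessor.resetSequence();
const thirdRequest: LlmRequestSketch = { contents: ['a', 'b', 'c'], config: {} };
routingProcessor.run(thirdRequest); // routed again after the reset
```

Keeping the sticky decision in one object with an explicit reset method mirrors the "named reset points" idea: callers decide when a sequence ends, and the processor never guesses.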
|
||||
|
||||
`GcliAgentModel.generateContentAsync` is the content-generator dispatcher plus the legacy-compatible fallback retry loop:
|
||||
|
||||
5. Check abort signal.
|
||||
6. Map ADK `LlmRequest` → `GenerateContentParameters` and call `config.getContentGenerator().generateContentStream(...)`. Auth (login-with-Google, Gemini API key, Vertex, ADC/compute, gateway, fake responses, logging/recording wrappers) lives inside the content generator implementation — no header injection in this layer.
|
||||
7. Stream the result back to ADK, propagating abort. On 429 / persistent quota errors, stay inside the local `retryWithBackoff` loop: call `handleFallback(...)`, let it apply the existing availability/fallback state transition, reset attempts when it returns a retry intent, and re-enter the content-generator call path. A thrown model error is terminal for this ADK iteration unless an ADK `onModelError` callback returns a replacement response; Runner does not automatically re-run request processors.
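The retry behavior in step 7 can be sketched as a loop that re-polls the active model before each attempt and resets its counter when the fallback handler asks for a retry. This is only an illustration under stated assumptions: `RetryDepsSketch`, `statusOf`, `failure`, and the model names are hypothetical, streaming and backoff delays are omitted, and the real `handleFallback` takes more arguments.

```typescript
// Hypothetical stand-ins for Config, the content generator, and handleFallback.
type FallbackIntentSketch = 'retry' | 'stop';

interface RetryDepsSketch {
  getActiveModel: () => string; // like config.getActiveModel()
  callModel: (model: string) => Promise<string>; // content-generator call
  handleFallback: (failedModel: string) => Promise<FallbackIntentSketch>;
  maxAttempts: number;
}

// Errors are assumed to carry an HTTP-like numeric `status` field.
function statusOf(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const value = (error as { status?: unknown }).status;
    return typeof value === 'number' ? value : undefined;
  }
  return undefined;
}

function failure(status: number): Error & { status: number } {
  return Object.assign(new Error(`HTTP ${status}`), { status });
}

async function generateWithFallback(deps: RetryDepsSketch): Promise<string> {
  let attempts = 0;
  for (;;) {
    // Re-poll the active model before every attempt so a fallback
    // mutation made by handleFallback is picked up.
    const model = deps.getActiveModel();
    try {
      return await deps.callModel(model);
    } catch (error) {
      if (statusOf(error) !== 429) throw error; // terminal for this iteration
      attempts += 1;
      if (attempts < deps.maxAttempts) continue; // backoff delay omitted
      // Persistent quota error: let the fallback handler switch models.
      const intent = await deps.handleFallback(model);
      if (intent !== 'retry') throw error;
      attempts = 0; // a retry intent resets the attempt counter
    }
  }
}

// Usage: the primary model always returns 429; the handler swaps models.
async function fallbackDemo(): Promise<{ text: string; tried: string[] }> {
  let activeModel = 'primary-model';
  const tried: string[] = [];
  const text = await generateWithFallback({
    getActiveModel: () => activeModel,
    callModel: async (model) => {
      tried.push(model);
      if (model === 'primary-model') throw failure(429);
      return `ok from ${model}`;
    },
    handleFallback: async () => {
      activeModel = 'fallback-model';
      return 'retry';
    },
    maxAttempts: 2,
  });
  return { text, tried };
}
```

Because the loop never leaves the model call path, request processors are not re-run on fallback, which matches the note that Runner does not re-run them automatically.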
|
||||
|
||||
## Availability service integration
|
||||
|
||||
`ModelAvailabilityService` is a self-contained state machine — no Config in its constructor, no Config reads in its methods, just a `Map<ModelId, HealthState>` and the methods that mutate it ([`modelAvailabilityService.ts:41-137`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/availability/modelAvailabilityService.ts#L41-L137)). The class is reusable as-is under ADK. The integration *around* it is Config-driven, and that's what has to be ported.
|
||||
|
||||
What feeds the service: `handleFallback` reads the policy chain from Config (`resolvePolicyChain(config)`), takes the failed model as a parameter, and calls `availability.selectFirstAvailable(candidates)`. The failed model comes from [`client.ts:1080`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L1080): `const active = this.config.getActiveModel()`. The retry loop re-polls Config between attempts so any mid-loop fallback mutation is picked up cleanly.
|
||||
|
||||
What resets the service: two triggers, both outside the service. `reset()` fires from `config.setModel()` ([`config.ts:1813`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/config/config.ts#L1813)), itself triggered by `config.activateFallbackMode()` after a `retry_always` decision. `resetTurn()` fires at the turn boundary; its call site has not yet been verified and is probably in `client.ts` or `GeminiChat`.
|
||||
|
||||
What `GcliAgentModel` has to do: replicate the [`client.ts:1072-1110`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L1072-L1110) retry pattern faithfully — pre-call read `config.getActiveModel()`, call `handleFallback(config, currentAttemptModel, authType, error)` on persistent 429, re-poll Config before the next attempt. The `resetTurn` trigger lands at whatever ADK boundary corresponds to "new turn started" — ADK doesn't know about it, so it's an explicit wire.
|
||||
|
||||
Subagent note: `AgentLoopContext.config` is shared by reference between parent and child, so parent and subagent share the same `ModelAvailabilityService.health` map and the same `_activeModel`. A subagent fallback therefore mutates parent state. This is legacy behavior, not an ADK regression, so it is recorded here as a documented property rather than a Phase A task.
|
||||
|
||||
Open question: locate the `resetTurn` call site in the legacy code and map it to an ADK trigger.
|
||||
|
||||
## Tool output masking
|
||||
|
||||
There are two mechanisms; only one is new under ADK:
|
||||
|
||||
**Per-call truncation (already done by the scheduler).** When a tool returns, [`tool-executor.ts:196-292`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/tool-executor.ts#L196-L292) truncates content > 40k chars for shell and single-text-part MCP tools, writes the full output to `<projectTempDir>/.../<toolName>_<callId>_<random>.txt`, and returns a snippet + file path pointer. This rides for free in M5 because `toAdkTool` calls the scheduler.
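As a rough illustration of per-call truncation, the sketch below returns small outputs unchanged and replaces large ones with a snippet plus a pointer to the saved full text. The snippet size, the pointer wording, the file path, and the injected `saveFullOutput` helper are all assumptions; only the 40k-character threshold comes from the description above.

```typescript
// Threshold taken from the text above; the snippet size is an assumption.
const TRUNCATION_THRESHOLD_CHARS = 40_000;
const SNIPPET_CHARS = 1_000;

interface TruncatedOutputSketch {
  content: string; // what is returned to the model
  fullOutputPath?: string; // set only when the output was truncated
}

// `saveFullOutput` persists the complete text and returns the file path.
// It is injected so the sketch stays free of filesystem code.
function truncateToolOutput(
  output: string,
  saveFullOutput: (fullText: string) => string,
): TruncatedOutputSketch {
  if (output.length <= TRUNCATION_THRESHOLD_CHARS) {
    return { content: output };
  }
  const fullOutputPath = saveFullOutput(output);
  const snippet = output.slice(0, SNIPPET_CHARS);
  return {
    content: `${snippet}\n[truncated; full output saved to ${fullOutputPath}]`,
    fullOutputPath,
  };
}

// Usage with an in-memory "disk" and a hypothetical path layout.
const savedOutputs = new Map<string, string>();
const saveToMemory = (fullText: string): string => {
  const filePath = `/tmp/sketch/shell_call-${savedOutputs.size + 1}.txt`;
  savedOutputs.set(filePath, fullText);
  return filePath;
};

const smallResult = truncateToolOutput('ok', saveToMemory);
const largeResult = truncateToolOutput('x'.repeat(50_000), saveToMemory);
```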
|
||||
|
||||
**Per-turn batch masking (this is what's new).** Today [`client.ts:637`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L637) calls `tryMaskToolOutputs(getHistory())` inside `processTurn`, after compression and before the model call. `ToolOutputMaskingService.mask` walks the whole history, applies a 50k-protection + 30k-prunable-gate sliding window, writes full content to disk, replaces `functionResponse.response` with a `<tool_output_masked>...</tool_output_masked>` preview + pointer ([`toolOutputMaskingService.ts:70-272`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/context/toolOutputMaskingService.ts#L70-L272)). Then `setHistory` syncs to `ChatRecordingService.updateMessagesFromHistory` ([`chatRecordingService.ts:768`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/services/chatRecordingService.ts#L768)).
|
||||
|
||||
Under ADK this becomes a custom `RequestProcessor` (`ToolOutputMaskingProcessor`) registered with `LlmAgent` after ADK has built `llmRequest.contents` from `session.events` and before `GcliRoutingProcessor` reads those contents. The processor:
|
||||
|
||||
1. Reads `llmRequest.contents` (produced by ADK's `ContentRequestProcessor` from `session.events`).
|
||||
2. Calls `ToolOutputMaskingService.mask(llmRequest.contents, config)`. The algorithm is unchanged.
|
||||
3. Applies each masked `functionResponse` back to the matching live ADK `Session.events` part by stable event/part identity. Do not use `session.events[i]` ⇔ `llmRequest.contents[i]`; ADK filters, rearranges, and clone-deeps content while building the request ([`content_processor_utils.ts:72-84`](https://github.com/google/adk-js/blob/main/core/src/agents/processors/content_processor_utils.ts#L72-L84)).
|
||||
4. Writes the masked `llmRequest.contents` back so the model call and downstream routing see the masked view.
|
||||
5. Calls `ChatRecordingService.updateMessagesFromHistory(masked)` so the persisted transcript matches.
|
||||
|
||||
`isAlreadyMasked` keeps it idempotent across turns. The disk file path stays the same across runs once we drop the `Math.random()` suffix in the filename (also part of PR #21).
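Step 3 is the part most easily gotten wrong, so here is a small sketch of writing masked responses back to live events by identity instead of by index. It assumes each `functionResponse` carries a stable `id` that survives ADK's cloning; that field, the simplified event shapes, and `applyMaskedResponses` itself are illustrative assumptions, not the real ADK types.

```typescript
interface FunctionResponseSketch {
  id: string; // assumed stable identity shared by the live part and its request copy
  response: { output: string };
}
interface PartSketch {
  text?: string;
  functionResponse?: FunctionResponseSketch;
}
interface ContentSketch {
  role: string;
  parts: PartSketch[];
}
interface SessionEventSketch {
  partial?: boolean;
  content?: ContentSketch;
}

// Copies masked responses onto the matching live parts; returns how many changed.
function applyMaskedResponses(
  liveEvents: SessionEventSketch[],
  maskedContents: readonly ContentSketch[],
): number {
  const maskedById = new Map<string, FunctionResponseSketch>();
  for (const content of maskedContents) {
    for (const part of content.parts) {
      if (part.functionResponse) {
        maskedById.set(part.functionResponse.id, part.functionResponse);
      }
    }
  }

  let updated = 0;
  for (const liveEvent of liveEvents) {
    for (const part of liveEvent.content?.parts ?? []) {
      const live = part.functionResponse;
      if (!live) continue;
      const masked = maskedById.get(live.id);
      if (masked && masked.response.output !== live.response.output) {
        live.response = { output: masked.response.output };
        updated += 1;
      }
    }
  }
  return updated;
}

// Usage: the request view drops the partial event, so indexes do not line up.
const liveEvents: SessionEventSketch[] = [
  { content: { role: 'user', parts: [{ text: 'run the build' }] } },
  { partial: true, content: { role: 'model', parts: [{ text: 'thinking' }] } },
  { content: { role: 'user', parts: [{ functionResponse: { id: 'call-1', response: { output: 'very long log' } } }] } },
  { content: { role: 'user', parts: [{ functionResponse: { id: 'call-2', response: { output: 'short' } } }] } },
];

// Deep-cloned, filtered request contents with call-1 already masked.
const maskedContents: ContentSketch[] = [
  { role: 'user', parts: [{ text: 'run the build' }] },
  { role: 'user', parts: [{ functionResponse: { id: 'call-1', response: { output: '<tool_output_masked>preview</tool_output_masked>' } } }] },
  { role: 'user', parts: [{ functionResponse: { id: 'call-2', response: { output: 'short' } } }] },
];

const updatedCount = applyMaskedResponses(liveEvents, maskedContents);
```

In this example `maskedContents[1]` corresponds to `liveEvents[2]`, which is exactly the mismatch an index-based write-back would get wrong.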
|
||||
|
||||
## GcliSessionService
|
||||
|
||||
`InMemorySessionService.getSession` does `cloneDeep(session)` ([`in_memory_session_service.ts:88-131`](https://github.com/google/adk-js/blob/main/core/src/sessions/in_memory_session_service.ts#L88-L131)), so the runner operates on a clone of the stored session. Any mutation to `invocationContext.session.events[i]` only persists for the duration of `runAsync`; the next turn fetches a fresh clone. Subclassing it is not enough: its `sessions` store is private ([`in_memory_session_service.ts:40`](https://github.com/google/adk-js/blob/main/core/src/sessions/in_memory_session_service.ts#L40)), and returning the stored reference from `getSession` would make inherited `appendEvent` push the same event twice ([`in_memory_session_service.ts:173-227`](https://github.com/google/adk-js/blob/main/core/src/sessions/in_memory_session_service.ts#L173-L227)).
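The practical effect of clone-on-read can be shown with two toy stores. This is a sketch under simplifying assumptions: `CloningStoreSketch` and `LiveStoreSketch` are hypothetical stand-ins, events are plain strings, and a JSON round trip stands in for `cloneDeep`.

```typescript
interface StoredSessionSketch {
  id: string;
  events: string[];
}

// JSON round trip as a stand-in for cloneDeep; fine for plain data only.
function deepCloneSketch<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Mimics clone-on-read: callers never see the stored object itself.
class CloningStoreSketch {
  private readonly sessions = new Map<string, StoredSessionSketch>();
  put(session: StoredSessionSketch): void {
    this.sessions.set(session.id, session);
  }
  get(id: string): StoredSessionSketch | undefined {
    const stored = this.sessions.get(id);
    return stored ? deepCloneSketch(stored) : undefined;
  }
}

// Mimics a live-session map: callers receive the authoritative object.
class LiveStoreSketch {
  private readonly sessions = new Map<string, StoredSessionSketch>();
  put(session: StoredSessionSketch): void {
    this.sessions.set(session.id, session);
  }
  get(id: string): StoredSessionSketch | undefined {
    return this.sessions.get(id);
  }
}

// Mutates the first event of one fetch, then reads it back with a second fetch.
function maskThenRefetch(store: {
  get(id: string): StoredSessionSketch | undefined;
}): string | undefined {
  const session = store.get('s1');
  if (session) session.events[0] = '<masked>';
  return store.get('s1')?.events[0];
}

const cloningStore = new CloningStoreSketch();
cloningStore.put({ id: 's1', events: ['raw output'] });
const liveStore = new LiveStoreSketch();
liveStore.put({ id: 's1', events: ['raw output'] });

const seenAfterCloneMutation = maskThenRefetch(cloningStore); // 'raw output'
const seenAfterLiveMutation = maskThenRefetch(liveStore); // '<masked>'
```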
|
||||
|
||||
`GcliSessionService` extends `BaseSessionService`, not `InMemorySessionService`. It owns an authoritative live-session map keyed by `appName/userId/sessionId`. Plugins and request processors that mutate session events (masking is the primary use case) get persistence across turns.
|
||||
|
||||
```ts
|
||||
class GcliSessionService extends BaseSessionService {
|
||||
private readonly liveSessions = new Map<string, Session>();
|
||||
|
||||
private key(appName: string, userId: string, sessionId: string): string {
|
||||
return `${appName}\0${userId}\0${sessionId}`;
|
||||
}
|
||||
|
||||
override async createSession(req: CreateSessionRequest): Promise<Session> {
|
||||
const session = createSession({
|
||||
id: req.sessionId ?? randomUUID(),
|
||||
appName: req.appName,
|
||||
userId: req.userId,
|
||||
state: req.state ?? {},
|
||||
events: [],
|
||||
lastUpdateTime: Date.now(),
|
||||
});
|
||||
this.liveSessions.set(this.key(req.appName, req.userId, session.id), session);
|
||||
return session;
|
||||
}
|
||||
|
||||
override async getSession(req: GetSessionRequest): Promise<Session | undefined> {
|
||||
const session = this.liveSessions.get(
|
||||
this.key(req.appName, req.userId, req.sessionId),
|
||||
);
|
||||
if (!session || !req.config) return session;
|
||||
// If a caller asks for a filtered view, return a copy so the caller cannot
|
||||
// mutate a truncated live session.
|
||||
return copyWithFilteredEvents(session, req.config);
|
||||
}
|
||||
|
||||
override async appendEvent(req: AppendEventRequest): Promise<Event> {
|
||||
const event = await super.appendEvent(req); // pushes once, skips partials
|
||||
req.session.lastUpdateTime = event.timestamp;
|
||||
return event;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`listSessions` and `deleteSession` operate on `liveSessions`; they do not delegate to an inner `InMemorySessionService` whose state would drift once `appendEvent` stops delegating. The runner calls `getSession` without `GetSessionConfig`, but preserving filtered-copy behavior for `numRecentEvents` / `afterTimestamp` keeps the service compatible with other ADK callers. If any M5 plugin uses ADK app/user state prefixes, PR #20 ports `InMemorySessionService`'s app-state/user-state handling too; otherwise the service intentionally supports only session-local state.
|
||||
|
||||
## PR seam summary
|
||||
|
||||
Seam names and acceptance criteria for the foundation PRs. Aside from the `GcliSessionService` code shape above, pseudo-code stubs are intentionally omitted because they drift from real code; the per-PR description owns the implementation specifics.
|
||||
|
||||
**PR #1 — Scaffold.** Add `packages/core/src/adk-agent/`. Export `AdkAgentProtocol` (skeleton implementing `AgentProtocol`), `GcliSessionService` (`BaseSessionService` implementation with live session storage), and named seam files for `GcliAgentModel`, `GcliRoutingProcessor`, `ToolOutputMaskingProcessor`, `HookBridgePlugin`, `LoopDetectionAdkPlugin`, `toAdkTool`. All bodies are TODO; compile green; no behavior change. Acceptance: module compiles, file structure matches the architectural sketch, all symbols importable.
|
||||
|
||||
**PR #2 — Flag + factory.** Add `experimental.adk.runtimeEnabled` to the settings schema. Implement the session factory: builds an `AdkAgentProtocol` when the runtime flag is on, otherwise the existing legacy implementation. Define precedence with `experimental.adk.agentSessionNoninteractiveEnabled` (the new flag is the runtime selector; the old one continues to gate the non-interactive code path). Acceptance: flag visible in settings, factory unit-tested with both flag states, no production caller wired yet.
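A factory along these lines might look like the sketch below. The `SettingsSketch`, `AgentProtocolSketch`, and `ProtocolBuildersSketch` shapes and the `createAgentProtocol` name are assumptions for illustration; only the two flag names come from the description above, and the sketch assumes the runtime selector ignores the older non-interactive flag.

```typescript
// Hypothetical slice of the settings schema; only the flag names are from the doc.
interface SettingsSketch {
  experimental?: {
    adk?: {
      runtimeEnabled?: boolean;
      agentSessionNoninteractiveEnabled?: boolean;
    };
  };
}

interface AgentProtocolSketch {
  readonly runtime: 'adk' | 'legacy';
}

interface ProtocolBuildersSketch {
  adk: () => AgentProtocolSketch;
  legacy: () => AgentProtocolSketch;
}

// The runtime flag alone selects the implementation; it defaults to off.
function createAgentProtocol(
  settings: SettingsSketch,
  builders: ProtocolBuildersSketch,
): AgentProtocolSketch {
  const adkEnabled = settings.experimental?.adk?.runtimeEnabled === true;
  return adkEnabled ? builders.adk() : builders.legacy();
}

// Usage: exercise both flag states plus the "old flag only" case.
const buildersSketch: ProtocolBuildersSketch = {
  adk: () => ({ runtime: 'adk' }),
  legacy: () => ({ runtime: 'legacy' }),
};

const defaultRuntime = createAgentProtocol({}, buildersSketch).runtime;
const flaggedRuntime = createAgentProtocol(
  { experimental: { adk: { runtimeEnabled: true } } },
  buildersSketch,
).runtime;
const oldFlagOnlyRuntime = createAgentProtocol(
  { experimental: { adk: { agentSessionNoninteractiveEnabled: true } } },
  buildersSketch,
).runtime;
```

Passing the builders in as functions keeps the factory unit-testable with both flag states before any production caller is wired.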
|
||||
|
||||
**PR #3 — Event-type cleanup.** Delete the `elicitation_request`, `elicitation_response`, and `elicitations` references that the protocol surface no longer uses.
|
||||
|
||||
(`/rewind` is intentionally NOT a foundation PR. It stays in the slash-command/runtime-adapter layer and is wired during Phase C — see Surface Phases above.)
|
||||
@@ -0,0 +1,472 @@
|
||||
# Gemini CLI Replatforming Design
|
||||
|
||||
## Goals
|
||||
|
||||
1. **Fast ADK Replatforming:** Transition the default Gemini CLI agent to use
|
||||
the ADK execution engine to reduce maintenance overhead on the legacy run
|
||||
loop.
|
||||
2. **Reuse Existing Frameworks:** Minimize UI and internal API churn by keeping
|
||||
existing callback mechanisms and frameworks. Specifically, we will _not_
|
||||
rewrite the existing tool approval and model fallback mechanisms to use ADK's
|
||||
`elicitation_request`/`elicitation_response` events. We will reuse the
|
||||
existing message bus and scheduler integrations.
|
||||
3. **Unified Core Engine:** Ensure the main interactive chat, non-interactive
|
||||
mode, and subagents execute via the exact same core ADK engine.
|
||||
|
||||
## Non-Goals
|
||||
|
||||
1. **No Public SDK Readiness:** We are no longer refactoring the core codebase
|
||||
to expose a polished, public-facing SDK.
|
||||
2. **No BYOA / Multi-Agent Orchestration:** We are explicitly not building
|
||||
extensibility for third-party agents or external runtime orchestration.
|
||||
3. **No UI Modernization for Elicitations:** We will not refactor the React TUI
|
||||
or underlying message bus to adopt standard ADK elicitation events. Tool
|
||||
approvals and system interrupts will continue to use the legacy callback
|
||||
bridges.
|
||||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
|
||||
Where we want feedback. Each links to the section with more context — please weigh in there.
|
||||
|
||||
**Milestone 3 — TUI behind `AgentSession`:** scope to be fleshed out by @Jacob Richman; tracked in [#22702](https://github.com/google-gemini/gemini-cli/issues/22702). Exemplar areas listed in [Transition Plan](#transition-plan).
|
||||
|
||||
**Milestone 5 — ADK Execution Engine:**
|
||||
|
||||
- Loop detection: which event stream feeds the detector, how does soft-recovery get into the next request, and which terminate mechanism (abort / throw / replacement response)? See [Loop detection](#loop-detection).
|
||||
- MCP: upstream the missing surface into `MCPToolset`, or keep `McpClient` and pass its outputs as functional tools? See [MCP](#mcp).
|
||||
- Slash commands: live outside the protocol layer — scope of the slash-command/runtime-adapter layer TBD. See [Slash commands](#slash-commands).
|
||||
|
||||
---
|
||||
|
||||
## Milestones
|
||||
|
||||
1. **AgentSession Interface:** Create an `AgentSession` abstraction flexible
   enough to support the legacy runtime and ADK.
2. **Adapting the Non-Interactive Runtime:** Adapt the existing non-interactive
   CLI to conform to the new `AgentSession` API.
3. **Initial TUI Adaptation & Session Creation:** Move the main agent behind
   the `AgentSession` interface, adapting the TUI via `useAgentStream` and
   `LegacyAgentProtocol` while reusing the message bus. **Includes implementing
   the Agent Creation Factory/Function.**
4. **Subagent Orchestration:** Decouple subagents from the legacy runtime by
   wrapping them in the `AgentSession` interface.
5. **ADK Execution Engine:** Build the unified ADK agent conforming to
   `AgentSession`, handling all modes (Main, Non-Interactive, Subagents).
   Adopts ADK's loop and request processors; retains gemini-cli's scheduler,
   routing, fallback, masking, hooks, compression, recording, and loop
   detection. See [Detailed Design §5](#5-milestone-5-adk-execution-engine).
|
||||
|
||||
## Progress
|
||||
|
||||
| Milestone | Owner | Status | Relevant PRs / Branches | Time Estimate |
|
||||
| :------------------------------------------- | :------------------------------- | :---------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------ |
|
||||
| 1. AgentSession Interface | Adam Weidman, Michael Bleigh | ✅ Complete | [PR #22270](https://github.com/google-gemini/gemini-cli/pull/22270), [PR #23159](https://github.com/google-gemini/gemini-cli/pull/23159), [PR #23548](https://github.com/google-gemini/gemini-cli/pull/23548) | - |
|
||||
| 2. Adapting the Non-Interactive Runtime | Adam Weidman | ✅ Complete | [PR #22984](https://github.com/google-gemini/gemini-cli/pull/22984), [PR #22985](https://github.com/google-gemini/gemini-cli/pull/22985), [PR #22986](https://github.com/google-gemini/gemini-cli/pull/22986), [PR #24439](https://github.com/google-gemini/gemini-cli/pull/24439), [PR #22987](https://github.com/google-gemini/gemini-cli/pull/22987) | - |
|
||||
| 3. Initial TUI Adaptation & Session Creation | Michael Bleigh | 🚧 WIP | [PR #24275](https://github.com/google-gemini/gemini-cli/pull/24275), [PR #24287](https://github.com/google-gemini/gemini-cli/pull/24287), [PR #24292](https://github.com/google-gemini/gemini-cli/pull/24292), [PR #24297](https://github.com/google-gemini/gemini-cli/pull/24297), [Issue #25046](https://github.com/google-gemini/gemini-cli/issues/25046) | 2-3 weeks |
|
||||
| 4. Subagent Orchestration | Adam Weidman | 🚧 WIP | [PR #25302](https://github.com/google-gemini/gemini-cli/pull/25302), [PR #25303](https://github.com/google-gemini/gemini-cli/pull/25303)<br>Branches: `agent-session/local-invocation`, `agent-session/remote-invocation`, `agent-session/agent-tool` | 1 week |
|
||||
| 5. ADK Execution Engine | Adam Weidman, Alexey Kalenkevich | 🚧 WIP | Alexey Prototype: `eeb9301`. See [M5 Execution Plan](#m5-execution-plan) below for the PR breakdown. | 4 weeks |
|
||||
| 6. Testing / Validation / Bug fixing | Adam Weidman, Alexey Kalenkevich | ⏳ Upcoming | Folded into each phase below: non-interactive first, then subagents, then interactive. See [M5 Execution Plan](#m5-execution-plan). | 2 weeks |
|
||||
|
||||
_(Note: Previous milestones related to "Unified Elicitations" and "Adopt ADK
|
||||
Primitives" have been removed as per the updated Non-Goals)._
|
||||
|
||||
---
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### 1. Milestone 1: AgentSession Interface
|
||||
|
||||
The `AgentSession` interface is the core abstraction that decouples the TUI from
|
||||
the specific agent implementation by proposing a purely event-driven loop
|
||||
boundary (`AgentEvent` streams). _For full rationale and design, see the
|
||||
[Gemini CLI Agents design document](https://docs.google.com/document/d/1Zv2_VuVNc-PtsFIU5HYApdmaC3EyK5_fNeivYUtL1fs/edit?tab=t.0#heading=h.ok5bx3z7fmr8)._
|
||||
|
||||
### 2. Milestone 2: Adapting the Non-Interactive Runtime
|
||||
|
||||
The non-interactive runtime conforms to the `AgentSession` API via a legacy
|
||||
protocol adapter. This adapter translates internal message bus and scheduler
|
||||
events into standard agent event streams. Because this mode does not require
|
||||
user interaction, it bypasses complex UI features like tool approvals.
|
||||
|
||||
This capability is enabled via the following experimental setting in
|
||||
`.gemini/settings.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"experimental": {
|
||||
"adk": {
|
||||
"agentSessionNoninteractiveEnabled": true
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Milestone 3: Initial TUI Adaptation & Session Creation
|
||||
|
||||
Adapting the interactive TUI to consume the `AgentSession` interface is a
|
||||
complex, multi-layered effort. The core mechanism is an event-streaming hook
|
||||
that subscribes to the session.
|
||||
|
||||
#### Current Event Handling State (Legacy Mechanism)
|
||||
|
||||
The legacy system relies on a central `Scheduler` and a `MessageBus` to
|
||||
orchestrate tool execution and approvals. This is the mechanism we are retaining
|
||||
to avoid a full UI refactor:
|
||||
|
||||
- **Tool Approvals**: When a tool requires user confirmation, the scheduler
|
||||
calls `resolveConfirmation`
|
||||
([`scheduler.ts:L663`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/scheduler.ts#L663))
|
||||
and updates the tool's status to `AwaitingApproval` in `confirmation.ts`
|
||||
([`confirmation.ts:L162`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/confirmation.ts#L162)).
|
||||
- **UI Notification**: The `SchedulerStateManager` publishes a
|
||||
`TOOL_CALLS_UPDATE` event on the message bus
|
||||
([`state-manager.ts:L254`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/state-manager.ts#L254)).
|
||||
The UI hook `useToolScheduler.ts` subscribes to this event to update the React
|
||||
state
|
||||
([`useToolScheduler.ts:L178`](https://github.com/google-gemini/gemini-cli/blob/main/packages/cli/src/ui/hooks/useToolScheduler.ts#L178)).
|
||||
- **User Interaction**: If any tool requires approval, the TUI transitions to a
|
||||
waiting state
|
||||
([`useGeminiStream.ts:L174`](https://github.com/google-gemini/gemini-cli/blob/main/packages/cli/src/ui/hooks/useGeminiStream.ts#L174))
|
||||
and renders the `ToolConfirmationQueue.tsx` component.
|
||||
- **Response Loop**: Once the user makes a decision, `ToolActionsContext.tsx`
|
||||
publishes a `TOOL_CONFIRMATION_RESPONSE`
|
||||
([`ToolActionsContext.tsx:L150`](https://github.com/google-gemini/gemini-cli/blob/main/packages/cli/src/ui/contexts/ToolActionsContext.tsx#L150)).
|
||||
The scheduler, which is blocked in `waitForConfirmation`
|
||||
([`confirmation.ts:L168`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/scheduler/confirmation.ts#L168)),
|
||||
receives the response via the message bus and resumes execution.
|
||||
|
||||
#### Transition Plan
|
||||
|
||||
The interactive `AgentSession` adapter continues to leverage the legacy message bus and scheduler rather than migrating tool approvals to ADK elicitations — minimizing UI churn. The adapter subscribes to the message bus and translates read-only observational events (`tool_update`, `message`, `usage`) into `AgentEvent` streams; interactive/blocking events (tool approvals) stay on the legacy callback path.
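The split between observational and blocking traffic can be sketched as a translation function that returns nothing for blocking messages. The message and event shapes below, including the `tool_confirmation_request` kind, are hypothetical; only the three observational event names come from the paragraph above.

```typescript
// Hypothetical message-bus traffic; the real bus types differ.
type BusMessageSketch =
  | { kind: 'tool_update'; callId: string; state: string }
  | { kind: 'message'; text: string }
  | { kind: 'usage'; totalTokens: number }
  | { kind: 'tool_confirmation_request'; callId: string };

interface AgentEventSketch {
  type: 'tool_update' | 'message' | 'usage';
  payload: Record<string, unknown>;
}

// Observational messages become AgentEvents; blocking ones are not translated
// and stay on the legacy callback path.
function toAgentEvent(busMessage: BusMessageSketch): AgentEventSketch | undefined {
  switch (busMessage.kind) {
    case 'tool_update':
      return {
        type: 'tool_update',
        payload: { callId: busMessage.callId, state: busMessage.state },
      };
    case 'message':
      return { type: 'message', payload: { text: busMessage.text } };
    case 'usage':
      return { type: 'usage', payload: { totalTokens: busMessage.totalTokens } };
    case 'tool_confirmation_request':
      return undefined;
  }
}

// Usage: four bus messages yield three AgentEvents.
const busTraffic: BusMessageSketch[] = [
  { kind: 'message', text: 'Running tests' },
  { kind: 'tool_confirmation_request', callId: 'call-1' },
  { kind: 'tool_update', callId: 'call-1', state: 'executing' },
  { kind: 'usage', totalTokens: 1234 },
];

const translatedEvents = busTraffic
  .map(toAgentEvent)
  .filter((agentEvent): agentEvent is AgentEventSketch => agentEvent !== undefined);
```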
|
||||
|
||||
Work areas (scope is settled; implementation within each owned by @Jacob Richman in [#22702](https://github.com/google-gemini/gemini-cli/issues/22702)):
|
||||
|
||||
- **Event Parity:** standard tool calls, MCP tool calls, and subagent display.
|
||||
- **UI Coalescence:** using `agent_start` / `agent_end` for robust UI state management.
|
||||
- **Tool-Controlled Display:** per-tool render variants (`FileDiff`, `TodoList`, `AnsiOutput`, etc.) passing through `AgentEvent`, plus tool-triggered out-of-band display state (e.g., IDE diff overlay, plan-mode toggle).
|
||||
- **Command Routing:** client-initiated commands (slash commands) routed down to the session; ensuring `abort` works properly.
|
||||
- **System Notices:** generic notice events for system notifications.
|
||||
|
||||
### 4. Milestone 4: Subagent Orchestration
|
||||
|
||||
Subagents, both local and remote, are invoked through a unified `AgentTool`
that wraps each one in an `AgentSession` instance. This wrapper translates
low-level execution events into standard `AgentEvent`s for the parent session.
|
||||
|
||||
### 5. Milestone 5: ADK Execution Engine
|
||||
|
||||
Replaces the legacy `GeminiClient` + `Turn` loop with an ADK `Runner` + `LlmAgent`
|
||||
behind the existing `AgentSession` interface. ADK contributes the agent loop;
|
||||
gemini-cli contributes everything else.
|
||||
|
||||
Reference implementation:
|
||||
[commit `eeb9301a`](https://github.com/google-gemini/gemini-cli/commit/eeb9301a489a1609f7d74e6c61569d82c5742821).
|
||||
|
||||
#### Approach: progressive integration
|
||||
|
||||
Start by landing the initial scaffolding: the `adk-agent/` module, the runtime
|
||||
flag, the top-level session factory, and the event-type cleanup. These changes
|
||||
should compile green, keep the flag off by default, and avoid behavior changes.
|
||||
|
||||
Once the initial scaffolding lands, M5 proceeds by **runnable surface**, not
|
||||
by isolated subsystem. Each phase wires one call site end-to-end before
|
||||
moving to the next:
|
||||
|
||||
1. **Non-interactive first.** Run the ADK loop end-to-end without UI. This phase brings up the translator, model dispatch through `GcliAgentModel`, routing, scheduler-backed tool execution, output masking, abort propagation, loop detection, and native MCP. The 429 retry path lands here too: `handleFallback` silently swaps to an available model when the policy allows. The TUI-side prompt that lets a user pick `retry_once` or `stop` is a separate callback registered in Phase C; non-interactive runs without it.
|
||||
|
||||
2. **Local subagents second.** Same ADK runtime — no second implementation. The new work is the subagent-only behavior that doesn't exist on the top-level loop: `complete_task` as a mandatory terminator, grace-period retry, scoped workspace/memory wrappers, and the confirmation-waiting activity signal. Event propagation back to the parent already works through the M4 wrapper.
|
||||
|
||||
3. **Interactive last.** Wire the TUI once the runtime is proven. Stream rendering, `session_update` behavior, hooks and notifications, plan mode, between-turn steering, the TUI fallback prompt callback, and `/rewind`. All slash commands live outside the protocol layer — `/rewind` is owned by the slash-command/runtime-adapter layer, not `AgentProtocol`.
|
||||
|
||||
#### Architectural sketch
|
||||
|
||||
```
|
||||
User input
|
||||
│
|
||||
▼
|
||||
┌──────────────────────┐
|
||||
│ AgentSession │ existing wrapper (agent-session.ts:18)
|
||||
└──────────┬───────────┘ SessionStart / SessionEnd hooks
|
||||
▼
|
||||
┌──────────────────────┐
|
||||
│ AdkAgentProtocol │ send/subscribe/abort/events
|
||||
│ │ agent_start/end → Before/AfterAgent hooks
|
||||
└──────────┬───────────┘
|
||||
▼
|
||||
┌──────────────────────┐
|
||||
│ GeminiCliAgent │ extends BaseAgent
|
||||
│ .runAsyncImpl(ctx) │ yields ADK Events
|
||||
└──────────┬───────────┘
|
||||
▼
|
||||
┌─────────────────────────────────────────────┐
|
||||
│ ADK Runner with: │
|
||||
│ • GcliSessionService (live event state) │
|
||||
│ • RequestProcessors (ordered): │
|
||||
│ ADK defaults through content build │
|
||||
│ ToolOutputMaskingProcessor │
|
||||
│ GcliRoutingProcessor │
|
||||
│ remaining ADK defaults │
|
||||
│ • Plugins: │
|
||||
│ MaxTurns / TokenLimit / MaxTime │
|
||||
│ HookBridgePlugin │
|
||||
│ LoopDetectionAdkPlugin │
|
||||
│ Steering injection │
|
||||
│ │
|
||||
│ LlmAgent loop: call → parse → dispatch → │
|
||||
│ feed back → iterate │
|
||||
└────────┬──────────────────────┬──────────────┘
|
||||
│ │
|
||||
┌────────▼─────────┐ ┌─────────▼──────────────┐
|
||||
│ GcliAgentModel │ │ toAdkTool │
|
||||
│ • dispatch │ │ scheduler.schedule({ │
|
||||
│ • fallback retry │ │ callId, name, args │
|
||||
│ • abort │ │ }, signal) │
|
||||
│ │ │ │ → BeforeTool hook │
|
||||
│ ▼ │ │ → MessageBus policy │
|
||||
│ ContentGenerator │ │ → tool exec │
|
||||
│ (auth lives here)│ │ → AfterTool hook │
|
||||
└──────────────────┘ │ → CompletedToolCall[] │
|
||||
└─────────────────────────┘
|
||||
|
||||
Runner yields Events → adk-event-translator → AgentEvent[]
|
||||
→ AdkAgentProtocol._emit → subscribers (TUI, non-interactive, etc.)
|
||||
└─ ChatRecordingService.recordMessage on final (non-partial) events:
|
||||
user messages, consolidated model responses, usage, tool calls
|
||||
```
|
||||
|
||||
Per-flow walkthroughs (typical message, 429 fallback, loop detection,
|
||||
`/rewind`, subagent invocation) and scaffold sketches live in
|
||||
[`implementation_details.md`](./implementation_details.md).
|
||||
|
||||
#### Session entry point
|
||||
|
||||
`AgentProtocol` is the interface ([`types.ts:11`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agent/types.ts#L11)). `AgentSession` is the
|
||||
existing wrapper class that implements `AgentProtocol`
|
||||
([`agent-session.ts:18`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agent/agent-session.ts#L18)). M5 adds `AdkAgentProtocol` as a second
|
||||
implementation; the existing `AgentSession` wrapper composes either one.
|
||||
The whole ADK runtime sits inside one constructor:
|
||||
|
||||
```ts
|
||||
declare class AdkAgentProtocol implements AgentProtocol {
|
||||
constructor(opts: {
|
||||
config: Config; // model + workspace + services
|
||||
// (auth lives inside the ContentGenerator
|
||||
// returned by config.getContentGenerator())
|
||||
instructionProvider: InstructionProvider; // system prompt (mode-aware)
|
||||
tools: BaseTool[]; // gemini-cli tools wrapped via toAdkTool;
|
||||
// MCP tools come from MCPToolset
|
||||
plugins: BasePlugin[]; // HookBridge, LoopDetection, MaxTurns, ...
|
||||
requestProcessors: RequestProcessor[]; // full ordered ADK + GCLI processor list
|
||||
sessionService: BaseSessionService; // GcliSessionService (live in-memory events)
|
||||
parentSessionId?: string; // present iff this is a subagent
|
||||
});
|
||||
send(input: AgentSend): Promise<{ streamId: string | null }>;
|
||||
subscribe(cb: (e: AgentEvent) => void): Unsubscribe;
|
||||
abort(): Promise<void>;
|
||||
get events(): readonly AgentEvent[];
|
||||
}
|
||||
```
|
||||
|
||||
#### Hooks
|
||||
|
||||
Gemini CLI's `HookSystem` (`packages/core/src/hooks/`) ships 11 events
|
||||
that user-defined hooks integrate against. We keep it as the user-facing
|
||||
surface; exposing ADK plugins to users would force every existing hook
|
||||
to rewrite against different signatures.
|
||||
|
||||
Eight of the 11 fire at boundaries our own code controls. Three fire
|
||||
mid-loop where only ADK knows the timing. We fire the eight directly;
|
||||
the three go through a small ADK plugin.
|
||||
|
||||
Owned — fire from our own code:
|
||||
|
||||
| Hook | Fires from |
|
||||
| --- | --- |
|
||||
| `BeforeTool` / `AfterTool` | `scheduler.schedule()` |
|
||||
| `BeforeAgent` / `AfterAgent` | `AdkAgentProtocol` on `agent_start` / `agent_end` |
|
||||
| `SessionStart` / `SessionEnd` | `AdkAgentProtocol` constructor / `.dispose()` |
|
||||
| `PreCompress` | `ChatCompressionService` |
|
||||
| `Notification` | `AdkAgentProtocol._emit` on tool-notification events |
|
||||
|
||||
Bridged — fire from an ADK plugin callback:
|
||||
|
||||
| Hook | Fires from |
|
||||
| --- | --- |
|
||||
| `BeforeToolSelection` | `HookBridgePlugin.beforeModelCallback` |
|
||||
| `BeforeModel` | `HookBridgePlugin.beforeModelCallback` |
|
||||
| `AfterModel` | `HookBridgePlugin.afterModelCallback` |
|
||||
|
||||
`HookBridgePlugin` is a `BasePlugin` registered with the runner. ADK
|
||||
invokes its model callbacks; the body fires the matching gemini-cli
|
||||
hook events. One-way, no state.
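A stateless bridge of this kind could look roughly like the sketch below. The callback signatures are simplified assumptions (the real ADK plugin callbacks take richer, asynchronous arguments), `HookSystemSketch` is a hypothetical stand-in for `HookSystem`, and the order of the two hooks fired from `beforeModelCallback` is arbitrary here.

```typescript
type BridgedHookSketch = 'BeforeToolSelection' | 'BeforeModel' | 'AfterModel';

// Hypothetical minimal view of the hook system.
interface HookSystemSketch {
  fire(hook: BridgedHookSketch, payload: unknown): void;
}

class HookBridgePluginSketch {
  constructor(private readonly hooks: HookSystemSketch) {}

  // Returning undefined is assumed to mean "no replacement": the model call proceeds.
  beforeModelCallback(llmRequest: unknown): undefined {
    this.hooks.fire('BeforeToolSelection', llmRequest);
    this.hooks.fire('BeforeModel', llmRequest);
    return undefined;
  }

  afterModelCallback(llmResponse: unknown): undefined {
    this.hooks.fire('AfterModel', llmResponse);
    return undefined;
  }
}

// Usage: record which hooks fire around one simulated model call.
const firedHooks: BridgedHookSketch[] = [];
const hookBridge = new HookBridgePluginSketch({
  fire: (hook) => {
    firedHooks.push(hook);
  },
});

hookBridge.beforeModelCallback({ model: 'some-model' });
hookBridge.afterModelCallback({ text: 'done' });
```

The plugin holds no state of its own, so the same instance can serve every model call in a session.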
|
||||
|
||||
#### MCP

Today. `mcp-client.ts` is a full server-lifecycle layer: tool discovery, prompt and resource registries, OAuth refresh and 401 recovery, list-change notifications, stdio crash restart, and progress routing.

What ADK provides. `MCPToolset` is tool-scoped — tools/list and tools/call only, for stdio and streamable HTTP ([`mcp_toolset.ts:58`](https://github.com/google/adk-js/blob/main/core/src/tools/mcp/mcp_toolset.ts#L58), [`mcp_tool.ts:65`](https://github.com/google/adk-js/blob/main/core/src/tools/mcp/mcp_tool.ts#L65)), with HTTP auth material set at connection construction ([`mcp_session_manager.ts:38-50`](https://github.com/google/adk-js/blob/main/core/src/tools/mcp/mcp_session_manager.ts#L38-L50)). It builds its own session manager rather than reusing ours, and exposes no path to prompts or resources even though the underlying `@modelcontextprotocol/sdk` client supports them. Stored-token refresh, 401 recovery, list-change handlers, progress notifications, and stdio crash restart are absent.

Open Questions:

- Upstream the missing surface into `MCPToolset`, or keep `McpClient` and pass its outputs (tools, prompts, resources) into the ADK runtime as functional tools?

#### Subagents

A subagent is an `AgentProtocol` with a different system prompt, tool subset, and termination policy. Local subagents run through the same `AdkAgentProtocol` constructor as the top-level session; the session factory selects the runtime and picks which plugins ride along per session. The factory omits `LoopDetectionAdkPlugin` for subagents — they have no loop detection today and terminate via `complete_task` plus the executor's turn/time guards. `Config` and `MessageBus` pass through.

Remote subagents (`RemoteAgentInvocation`, [`remote-invocation.ts:41`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/remote-invocation.ts#L41)) are unaffected — they run on our own A2A integration, and M4's `AgentTool` routes remote vs. local on `definition.kind`. The wiring change is at [`subagent-tool-wrapper.ts:97`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/subagent-tool-wrapper.ts#L97), which today constructs `LocalSubagentInvocation` ([`local-invocation.ts:48`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-invocation.ts#L48)) directly and will instead delegate to the session factory.

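
The factory's two decisions (which runtime, which plugins) can be sketched as below. This is an illustration under stated assumptions: `planSession`, `SessionRole`, and `SessionPlan` are hypothetical names, and the exact set of limit plugins attached to a subagent is an assumption; the text above only confirms that `LoopDetectionAdkPlugin` is omitted for subagents and that remote agents stay on A2A.

```typescript
// Hypothetical names throughout; only the routing rule and the omission of
// LoopDetectionAdkPlugin for subagents come from the design text.
type AgentKind = 'local' | 'remote';
type SessionRole = 'top-level' | 'subagent';

interface SessionPlan {
  runtime: 'adk' | 'a2a';
  plugins: string[];
}

function planSession(kind: AgentKind, role: SessionRole): SessionPlan {
  if (kind === 'remote') {
    // Remote subagents keep running on the existing A2A integration.
    return { runtime: 'a2a', plugins: [] };
  }
  // Assumption: turn and time guards apply to every local session.
  const plugins = ['MaxTurnsAdkPlugin', 'MaxTimeAdkPlugin'];
  if (role === 'top-level') {
    // Subagents have no loop detection today, so only the top level gets it.
    plugins.push('LoopDetectionAdkPlugin');
  }
  return { runtime: 'adk', plugins };
}

// Usage: compare the plans for a top-level session and a local subagent.
const topLevelPlan = planSession('local', 'top-level');
const subagentPlan = planSession('local', 'subagent');
const remotePlan = planSession('remote', 'subagent');
console.log(topLevelPlan.plugins, subagentPlan.plugins, remotePlan.runtime);
```
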
Legacy reference points for the behaviors Phase B reimplements as plugins / wrappers:

- `complete_task` mandatory terminator ([`local-executor.ts:355`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L355)) — subagent must call this tool or it errors with `ERROR_NO_COMPLETE_TASK_CALL`.
- Grace-period recovery turn ([`local-executor.ts:460`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L460), [`:724`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L724)) — single final retry with an injected "you must call `complete_task` now" message on recoverable terminate reasons.
- Scoped execution wrappers ([`local-executor.ts:530-545`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L530-L545), memory injection at [`:624`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L624) and [`:1339`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/agents/local-executor.ts#L1339)) — workspace, memory-inbox, auto-memory-extraction wrap the run.
- `onWaitingForConfirmation` UI signal — propagated through the executor's activity callback today; maps to `BeforeTool` under ADK.

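
The first two behaviors in the list above combine into a small decision made whenever a subagent run stops. The sketch below is one way to express it, not the legacy executor's code: `decideOnStop`, `StopSummary`, and the `recoverable` flag are hypothetical, and which terminate reasons count as recoverable is left to the caller.

```typescript
// Hypothetical decision helper; only `complete_task`, the single grace retry,
// and ERROR_NO_COMPLETE_TASK_CALL come from the design text.
type StopDecision =
  | { kind: 'done' }
  | { kind: 'grace-retry'; injectedMessage: string }
  | { kind: 'error'; code: 'ERROR_NO_COMPLETE_TASK_CALL' };

interface StopSummary {
  calledCompleteTask: boolean; // did the model call `complete_task`?
  recoverable: boolean; // assumption: caller classifies the terminate reason
}

function decideOnStop(stop: StopSummary, graceUsed: boolean): StopDecision {
  if (stop.calledCompleteTask) {
    return { kind: 'done' };
  }
  if (stop.recoverable && !graceUsed) {
    // Exactly one final retry, with an injected reminder.
    return {
      kind: 'grace-retry',
      injectedMessage: 'You must call complete_task now.',
    };
  }
  return { kind: 'error', code: 'ERROR_NO_COMPLETE_TASK_CALL' };
}

// Usage: a recoverable stop gets one grace turn, then errors.
const firstStop = decideOnStop({ calledCompleteTask: false, recoverable: true }, false);
const secondStop = decideOnStop({ calledCompleteTask: false, recoverable: true }, true);
const cleanStop = decideOnStop({ calledCompleteTask: true, recoverable: false }, false);
console.log(firstStop.kind, secondStop.kind, cleanStop.kind);
```

Keeping the decision pure makes it straightforward to host in either an `afterModelCallback` plugin or a wrapper around the run, whichever Phase B settles on.
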
#### Loop detection

Today. `LoopDetectionService` runs two detectors against one strike counter: a sliding-window hash over text/tool-call chunks as they stream ([`loopDetectionService.ts:339-437`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/services/loopDetectionService.ts#L339-L437)), and a per-N-turn semantic check via a side LLM call ([`loopDetectionService.ts:261-311`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/services/loopDetectionService.ts#L261-L311)). Below threshold, `GeminiClient` prepends a "System: Potential loop detected..." nudge to the next request ([`client.ts:1246-1279`](https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/core/client.ts#L1246-L1279)); at threshold, it throws and the turn unwinds.

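
The shared strike counter and its two outcomes can be sketched as below. This is an illustration only: `StrikeCounterSketch` and `LoopAction` are hypothetical names and the threshold value in the usage is arbitrary; the real service's counting rules are more involved.

```typescript
// Hypothetical sketch of "two detectors, one strike counter".
type LoopAction =
  | { kind: 'nudge'; prefix: string }
  | { kind: 'terminate' };

class StrikeCounterSketch {
  private strikes = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Called whenever either detector (hash or semantic) reports a loop.
  recordDetection(): LoopAction {
    this.strikes += 1;
    if (this.strikes >= this.threshold) {
      // At threshold the turn is unwound.
      return { kind: 'terminate' };
    }
    // Below threshold the next request gets a soft-recovery nudge.
    return { kind: 'nudge', prefix: 'System: Potential loop detected...' };
  }
}

// Usage with an arbitrary threshold of 2.
const strikeCounter = new StrikeCounterSketch(2);
const firstAction = strikeCounter.recordDetection();
const secondAction = strikeCounter.recordDetection();
console.log(firstAction.kind, secondAction.kind);
```
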
What ADK provides. Plugin callbacks at every event, around every model call, and on model error (the last of these can return a replacement response to keep the loop going). Model output is double-emitted — partial deltas per chunk and a single consolidated event for the same response — so anything reading the stream has to pick one to avoid double-counting. Exit mechanisms ([`llm_agent.ts:634-686`](https://github.com/google/adk-js/blob/main/core/src/agents/llm_agent.ts#L634-L686)): a tripped abort signal returns the generator silently; a thrown Error from inside the model call yields a synthetic error event and ends the turn; a replacement response from `onModelErrorCallback` keeps the loop going. ADK has no direct equivalent of "prepend a system message to the next request"; the closest path is mutating the request contents inside `beforeModelCallback`.

Open Questions:

- Which event stream feeds the detector — partial deltas vs consolidated events?
- How does the soft-recovery nudge get into the next request?
- Which terminate mechanism — abort, throw, or replacement response ([`llm_agent.ts:634-686`](https://github.com/google/adk-js/blob/main/core/src/agents/llm_agent.ts#L634-L686))?

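
To make the double-counting risk concrete, the sketch below feeds only partial deltas to a detector and skips the consolidated event. It shows one of the candidate answers to the first open question, not a decision; `ModelEventSketch` and `detectorChunks` are hypothetical, and the real ADK `Event` carries far more than a flag and a string.

```typescript
// Hypothetical, simplified event shape for illustration.
interface ModelEventSketch {
  partial: boolean; // true for streamed deltas, false for the consolidated event
  text: string;
}

// Select the chunks a loop detector would see if it reads partial deltas only.
function detectorChunks(events: ModelEventSketch[]): string[] {
  const chunks: string[] = [];
  for (const event of events) {
    if (event.partial) {
      chunks.push(event.text);
    }
  }
  return chunks;
}

// One response, emitted twice: two deltas plus one consolidated event.
const streamed: ModelEventSketch[] = [
  { partial: true, text: 'Hel' },
  { partial: true, text: 'lo' },
  { partial: false, text: 'Hello' },
];
const filteredText = detectorChunks(streamed).join('');
const naiveText = streamed.map((event) => event.text).join('');
console.log(filteredText, naiveText);
```

Reading every event yields the response text twice, which a sliding-window hash could mistake for repetition; filtering on the partial flag yields it once.
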
#### Slash commands

Slash commands live outside the core `AgentProtocol` layer, in a slash-command/runtime-adapter layer wired during Phase C. Specific commands (`/clear`, `/help`, `/memory`, `/compress`, `/rewind`, `/agents`, ...) route differently — some are pure TUI commands, some need to talk to the session, some restart the engine. Implementation details for the adapter layer (per-command routing, abort propagation, and the `ConversationRecord → ADK Event` conversion that `/rewind` requires) are still being expanded.

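
The `/rewind` flow this design describes elsewhere (truncate the persisted record, drop the live ADK session, rebuild on the next send) can be sketched as below. Everything here is a stand-in: `RewindAdapterSketch` keeps both the record and the session as in-memory arrays, and the rule that rewinding to a message drops that message and everything after it is an assumption of this sketch, not a statement about `ChatRecordingService.rewindTo`.

```typescript
// Hypothetical in-memory stand-ins for the conversation record and ADK session.
interface RecordedMessage {
  id: string;
  text: string;
}

class RewindAdapterSketch {
  private record: RecordedMessage[] = [];
  // `undefined` means the live session has been dropped.
  private sessionEvents: string[] | undefined = [];

  send(id: string, text: string): void {
    let events = this.sessionEvents;
    if (events === undefined) {
      // Rebuild runtime state from the authoritative record.
      events = this.record.map((message) => message.text);
      this.sessionEvents = events;
    }
    this.record.push({ id, text });
    events.push(text);
  }

  rewindTo(id: string): void {
    const index = this.record.findIndex((message) => message.id === id);
    if (index === -1) {
      throw new Error('unknown message id: ' + id);
    }
    // Assumption: the target message and everything after it are removed.
    this.record = this.record.slice(0, index);
    this.sessionEvents = undefined;
  }

  liveEvents(): string[] {
    return this.sessionEvents ?? [];
  }
}

// Usage: rewind past the second message, then continue the conversation.
const adapter = new RewindAdapterSketch();
adapter.send('m1', 'first');
adapter.send('m2', 'second');
adapter.send('m3', 'third');
adapter.rewindTo('m2');
const eventsAfterRewind = adapter.liveEvents().length;
adapter.send('m4', 'retry');
const rebuiltEvents = adapter.liveEvents().join(',');
console.log(eventsAfterRewind, rebuiltEvents);
```

The point of the shape is that the session is never edited in place: the record is the only thing truncated, and runtime state is always derived from it.
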
#### What we reuse from gemini-cli

These existing services remain the source of truth under the ADK runtime.
M5 adapts to them rather than replacing them:

- `scheduler.schedule()` — tool execution, approval, telemetry, truncation/distillation, `BeforeTool`/`AfterTool` hooks all run inside it (masking is separate — see Tool output masking row)
- `MessageBus` + `PolicyEngine` — tool approval correlation; legacy approval UI continues to consume bus events
- `handleFallback` + `ModelAvailabilityService` — 429 fallback + model pool
- `ModelRouterService` — per-turn model selection (sequence-sticky), authoritative for routing
- `ModelConfigService` — model resolution per request
- `LoopDetectionService` — strike-tracked detector
- `ChatRecordingService` — product-facing conversation persistence + `rewindTo`
- `ChatCompressionService` — outgoing-history projection only; ADK `Session.events` stays as in-memory loop state, `ConversationRecord` stays authoritative for persistence and `/rewind`
- `InjectionService` — user steering queue
- `ToolOutputMaskingService` — masking
- `HookSystem` + 11 hook events — see Hooks section
- Existing tool catalog and slash commands

#### What old logic gets dropped under ADK

These are the legacy bits the ADK runtime actually replaces:

- `GeminiClient.processTurn` — replaced by ADK `Runner` + request processors.
- `GeminiChat` — replaced by ADK `Session` events.
- `Turn` loop — replaced by ADK `LlmAgent.runAsyncImpl`.
- `ServerGeminiStreamEvent` — replaced by ADK `Event` → translator → `AgentEvent`.

#### What we don't reuse from ADK

ADK is a framework; we adopt the loop and a few extension points. Each
item below would be easy to adopt by mistake, so we reject it
explicitly:

- ADK's built-in `Gemini` (`BaseLlm`) — we run our own `BaseLlm` subclass (`GcliAgentModel`) that dispatches through `config.getContentGenerator()`. Auth (OAuth / Gemini API key / Vertex), fallback retry, recording/logging wrappers, and telemetry all live in our content generator; using ADK's bare client would bypass all of that.
- `BasePolicyEngine` / `SecurityPlugin` — legacy `MessageBus` + `PolicyEngine` stays. ADK's security plugin would force the approval UI to re-bind against ADK elicitations.
- `BaseContextCompactor` — `ChatCompressionService` stays. ADK's compactor doesn't know about our token-budget projection rules.
- ADK `RoutedLlm` — not the routing source of truth. Gemini CLI routing (`ModelRouterService` + `ModelConfigService` + `ModelAvailabilityService`, sequence-sticky model, model-specific tool/config updates) remains authoritative. Routing lives in a custom ADK `RequestProcessor` — see Model routing in implementation_details.md.
- `BaseSessionService` (file-backed) — `ChatRecordingService` stays. M5 uses a small `GcliSessionService` implementation for in-memory ADK event state.
- Most ADK plugin callbacks — see Hooks section.

#### Supported under ADK runtime

| Feature | Seam |
| --- | --- |
| Auth (OAuth / Gemini API key / Vertex) | `GcliAgentModel` maps ADK `LlmRequest` to `GenerateContentParameters` and dispatches through `config.getContentGenerator()`; auth lives in the content generator implementation |
| Model fallback (429) | `GcliAgentModel` ports the existing `retryWithBackoff` + `handleFallback` + `ModelAvailabilityService` path; retry stays inside the model wrapper, not ADK Runner |
| Dynamic routing & availability | `GcliRoutingProcessor` (a custom ADK `RequestProcessor`) consults `ModelRouterService`, `ModelConfigService`, `ModelAvailabilityService`, and the sequence-sticky model, then rewrites `LlmRequest.model` + tool/config before tool preprocessing. No ADK `RoutedLlm` — see Model routing in implementation_details.md |
| Tool execution | Existing `scheduler.schedule()` |
| Tool approval | Existing `MessageBus` + `ToolConfirmationQueue.tsx` |
| Tool output masking | `ToolOutputMaskingProcessor` (a custom ADK `RequestProcessor`) runs `ToolOutputMaskingService.mask`, writes the masked view back to `LlmRequest.contents`, and applies the same masked function responses back to the live ADK session events by event/part identity rather than request-content index; then calls `ChatRecordingService.updateMessagesFromHistory` to sync the transcript. Per-call truncation rides for free inside `scheduler.schedule`. |
| Chat compression | Existing `ChatCompressionService` (outgoing-history projection only) |
| Loop detection | `LoopDetectionAdkPlugin` + `LoopDetectionService` — see Loop detection section |
| Plan mode | `InstructionProvider` + mode-aware tools (mode-change side effects fire from `Config`, runtime-independent) |
| User steering (between turns) | `beforeModelCallback` consumes `InjectionService` queue |
| MCP `tools/list` + `tools/call` | Native `MCPToolset` — see MCP section |
| MCP OAuth | Scoping open — see MCP section |
| Subagents | M4 `AgentTool` + session factory — see Subagents section |
| `/rewind` and all `/slash commands` | Live outside the protocol layer. `/rewind` is a slash-command/runtime-adapter concern: the adapter truncates `ConversationRecord` via `ChatRecordingService.rewindTo` and drops the ADK session; next `send()` rebuilds via `appendEvent`. Wired in Phase C (interactive). |
| User hooks (11 events) | See Hooks section |
| Max turns / tokens / time | Custom plugins (`MaxTurnsAdkPlugin`, `TokenLimitAdkPlugin`, `MaxTimeAdkPlugin`) |
| Telemetry / Clearcut | Translator + plugin instrumentation; no new telemetry architecture |
| Session persistence | `ChatRecordingService` remains the persisted conversation record; ADK `Session.events` (held in `GcliSessionService`) is runtime state only. `streamId` parity only; `eventId` resume is preserved on the protocol interface, see Rewind PR |
| Skills, slash commands, ACP | Unchanged surfaces (ACP picks runtime by feature flag) |
| VSCode IDE companion | Unaffected — side-channel to CLI process |

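
The routing row above describes a processor that rewrites the request's model before the call, keeping one model "sticky" for a sequence. A minimal sketch of that behavior follows. It is not the planned `GcliRoutingProcessor`: the class name, constructor parameters, `resetSequence`, and the fallback handling are all hypothetical, and the three gemini-cli services are collapsed into two plain functions.

```typescript
// Hypothetical stand-in for ADK's `LlmRequest`.
interface LlmRequestSketch {
  model: string;
  contents: string[];
}

class RoutingProcessorSketch {
  private stickyModel: string | undefined;
  private readonly route: (request: LlmRequestSketch) => string;
  private readonly isAvailable: (model: string) => boolean;
  private readonly fallbackModel: string;

  constructor(
    route: (request: LlmRequestSketch) => string,
    isAvailable: (model: string) => boolean,
    fallbackModel: string,
  ) {
    this.route = route;
    this.isAvailable = isAvailable;
    this.fallbackModel = fallbackModel;
  }

  // Rewrites `request.model` in place before the model call.
  process(request: LlmRequestSketch): void {
    // Consult the router only when no model is sticky for this sequence.
    const chosen = this.stickyModel ?? this.route(request);
    this.stickyModel = chosen;
    request.model = this.isAvailable(chosen) ? chosen : this.fallbackModel;
  }

  // A named reset point, e.g. at the start of a new user prompt (assumption).
  resetSequence(): void {
    this.stickyModel = undefined;
  }
}

// Usage: the router answers differently on each call, yet the sequence sticks.
let routerCalls = 0;
const processor = new RoutingProcessorSketch(
  () => {
    routerCalls += 1;
    return routerCalls === 1 ? 'model-a' : 'model-b';
  },
  (model) => model !== 'model-b',
  'model-fallback',
);
const firstRequest: LlmRequestSketch = { model: '', contents: ['hi'] };
const secondRequest: LlmRequestSketch = { model: '', contents: ['again'] };
const thirdRequest: LlmRequestSketch = { model: '', contents: ['new prompt'] };
processor.process(firstRequest);
processor.process(secondRequest);
processor.resetSequence();
processor.process(thirdRequest);
console.log(firstRequest.model, secondRequest.model, thirdRequest.model);
```
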
#### Not supported under ADK runtime

| Feature | Status | Rationale |
| --- | --- | --- |
| MCP `/mcp prompts` (`prompts/list`, `prompts/get`) | Scoping open — see MCP section | `MCPToolset` exposes no prompts API today. |
| MCP `/mcp resources` (`resources/list`, `resources/read`) | Scoping open — see MCP section | Same gap as prompts. |

---

## M5 Execution Plan

**Foundation** — sequenced first:

| # | Title | Type |
| --- | --- | --- |
| 1 | `[AdkAgent] Scaffold adk-agent/ module: AdkAgentProtocol skeleton, GcliSessionService, seam names` | Feature |
| 2 | `[AdkAgent] Add experimental.adk.runtimeEnabled flag + session factory + define precedence vs existing agentSessionNoninteractiveEnabled` | Feature |
| 3 | `[AdkAgent] Delete unused elicitation_request/response from AgentEvent types` | Bug |

**Phase A — Non-interactive.** Wires the non-interactive AgentSession entry
point through the ADK runtime. Proves the core loop without TUI complexity.

| # | Title | Type |
| --- | --- | --- |
| A1 | `[AdkAgent] Translator: text + thought + functionCall + functionResponse + usage + error (with _meta.code) + partial/consolidation + agent_start/end; matches event-translator.test.ts shape` | Feature |
| A2 | `[AdkAgent] GcliAgentModel: dispatch ADK LlmRequest through config.getContentGenerator() + AbortSignal propagation` | Feature |
| A3 | `[AdkAgent] GcliAgentModel: 429 retry via retryWithBackoff + handleFallback + ModelAvailabilityService (retry inside the model wrapper); silent-policy branch covers non-interactive without a UI handler; covers concrete-model changes` | Feature |
| A4 | `[AdkAgent] GcliRoutingProcessor (custom ADK RequestProcessor) + move sequence-sticky model owner from GeminiClient to Config with named reset points` | Feature |
| A5 | `[AdkAgent] Invalid-stream retry + next-speaker continuation (client.ts:818, :845) — port or explicitly mark Not Supported` | Feature |
| A6 | `[AdkAgent] toAdkTool: route execution through existing scheduler.schedule()` | Feature |
| A7 | `[AdkAgent] GcliSessionService: BaseSessionService implementation with authoritative live session map and single-append semantics` | Feature |
| A8 | `[AdkAgent] ToolOutputMaskingProcessor: ToolOutputMaskingService.mask + apply masks back to live session events by event/part identity + sync ChatRecordingService.updateMessagesFromHistory` | Feature |
| A8a | `[Cleanup] Migrate toolDistillationService off GeminiClient onto BaseLlmClient` | Bug |
| A9 | `[AdkAgent] Plugins: MaxTurnsAdkPlugin + TokenLimitAdkPlugin + MaxTimeAdkPlugin (beforeModelCallback / before+after)` | Feature |
| A10 | `[AdkAgent] LoopDetectionAdkPlugin: feed partial:true deltas to detector; on terminate call protocol.abort() with LOOP_DETECTED; protocol emits synthetic error event` | Feature |
| A11 | `[AdkAgent] MCP: adopt native MCPToolset for stdio + HTTP (tools/list + tools/call only)` | Feature |
| A11a–d | `[adk-js upstream] MCPToolset capability gaps: prompts/resources, auth lifecycle (token reuse/refresh/401), server lifecycle (list-change, stdio restart, progress, close), and wire upstream prompts/resources into PromptRegistry + ResourceRegistry` | Feature |
| A12 | `[AdkAgent] Wire non-interactive entry point through session factory to select ADK runtime` | Feature |

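
As an illustration of the limit plugins in item A9, the sketch below counts model calls in a `beforeModelCallback` and returns a stop signal once the budget is spent. It is a sketch under assumptions: `MaxTurnsSketch`, `StopResponseSketch`, and the `MAX_TURNS` code are hypothetical, the callback is synchronous for brevity, and how the real plugin signals termination (replacement response, abort, or throw) is not settled by this document.

```typescript
// Hypothetical stop signal; the real plugin's return type is not specified here.
interface StopResponseSketch {
  errorCode: string;
  errorMessage: string;
}

class MaxTurnsSketch {
  private turns = 0;
  private readonly maxTurns: number;

  constructor(maxTurns: number) {
    this.maxTurns = maxTurns;
  }

  // Returns undefined to let the model call proceed, or a stop signal
  // once the turn budget has been used up.
  beforeModelCallback(): StopResponseSketch | undefined {
    if (this.turns >= this.maxTurns) {
      return {
        errorCode: 'MAX_TURNS', // hypothetical code name
        errorMessage: 'Turn limit of ' + this.maxTurns + ' reached.',
      };
    }
    this.turns += 1;
    return undefined;
  }
}

// Usage: a budget of two turns allows two calls and blocks the third.
const turnGuard = new MaxTurnsSketch(2);
const guardResults = [
  turnGuard.beforeModelCallback(),
  turnGuard.beforeModelCallback(),
  turnGuard.beforeModelCallback(),
];
console.log(guardResults.map((result) => (result === undefined ? 'go' : result.errorCode)));
```
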
**Phase B — Local subagents.** Reuses the Phase-A runtime for child sessions
once Phase A is wired and runs cleanly.

| # | Title | Type |
| --- | --- | --- |
| B1 | `[AdkAgent] Route subagent-tool-wrapper.ts:97 constructor through session factory; return ADK-backed AgentProtocol when the parent runtime is ADK` | Feature |
| B2 | `[AdkAgent] Subagent: complete_task mandatory terminator plugin (afterModelCallback)` | Feature |
| B3 | `[AdkAgent] Subagent: grace-period recovery turn (inject "you must call complete_task now" + final retry)` | Feature |
| B4 | `[AdkAgent] Subagent: scoped execution wrappers (workspace, memory-inbox, auto-memory-extraction)` | Feature |
| B5 | `[AdkAgent] Subagent: onWaitingForConfirmation activity signal via BeforeTool` | Feature |

**Phase C — Interactive.** Wires the TUI last. By this point the runtime
loop, model path, tools, fallback, masking, and subagents are already proven
outside the TUI. **All `/slash commands` live outside the protocol layer.**

| # | Title | Type |
| --- | --- | --- |
| C1 | `[AdkAgent] Wire interactive AgentSession through session factory; TUI stream rendering parity` | Feature |
| C2 | `[AdkAgent] User steering injection via beforeModelCallback` | Feature |
| C3 | `[AdkAgent] Plan mode: InstructionProvider for system-prompt swapping + mode-aware tool filtering via BaseToolset` | Feature |
| C4 | `[AdkAgent] HookBridgePlugin: fire BeforeModel / AfterModel / BeforeToolSelection from beforeModelCallback / afterModelCallback` | Feature |
| C5 | `[AdkAgent] Wire SessionStart / SessionEnd / BeforeAgent / AfterAgent / Notification hooks (Notification fires from AdkAgentProtocol._emit on tool-notification events)` | Feature |
| C6 | `[AdkAgent] Register TUI fallbackModelHandler callback so non-silent intents (retry_once / stop / upgrade / retry_with_credits) work; handleFallback machinery is already wired in Phase A` | Feature |
| C7 | `[AdkAgent] /rewind: slash-command/runtime-adapter truncates ConversationRecord via ChatRecordingService.rewindTo, drops ADK session, next send() rebuilds via appendEvent (NOT an AgentProtocol method)` | Feature |
| C8 | `[AdkAgent] Make ADK runtime the default for all wired surfaces` | Feature |