Agent harness architecture
This document provides a detailed walkthrough of the architectural shift from linear turn-based execution to the unified hierarchical loop model used by the Agent Harness.
Note: This is a preview feature currently under active development.
Overview
The Agent Harness represents a fundamental evolution in how Gemini CLI manages interactions with Large Language Models (LLMs) and tools. It unifies the execution logic for both the main CLI agent and subagents, providing parity in features like model routing, history management, and tool execution.
Legacy architecture: Linear turns
The legacy system operates on a "Stop-and-Go" model where the UI manages the execution turn-by-turn.
In this model, when you send a prompt, the system follows these steps:
- Orchestration: The GeminiClient and the useGeminiStream hook manage the flow.
- Execution: Gemini returns a single response containing text or tool calls.
- UI Interruption: The execution stops at the UI layer. If Gemini calls tools, the UI schedules them, waits for results, and then re-submits the entire history as a brand-new turn.
- Subagents: Subagents are treated as "Black Box" tools. The main agent calls a subagent (for example, codebase_investigator), waits for it to complete its private loop using LocalAgentExecutor, and receives a single string result.
This model results in duplicated logic for subagents and prevents them from using advanced features available to the main agent.
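The stop-and-go flow described above can be illustrated with a minimal sketch. Everything here is hypothetical: `callModel`, `runTool`, and `legacyUiLoop` are stand-ins written for this example, not the actual GeminiClient or useGeminiStream code, and the stubs are synchronous where the real calls would be asynchronous. The point it shows is that the UI layer owns the loop and re-sends the full history on every turn.

```typescript
// Hypothetical types for illustration; not the real Gemini CLI API.
type Message = { role: "user" | "model" | "tool"; text: string };
type ModelResponse = { text: string; toolCalls: string[] };

// Stub model: asks for one tool call, then answers once a tool result is in the history.
function callModel(history: Message[]): ModelResponse {
  const hasToolResult = history.some((m) => m.role === "tool");
  return hasToolResult
    ? { text: "done", toolCalls: [] }
    : { text: "", toolCalls: ["read_file"] };
}

// Stub tool runner standing in for real tool scheduling.
function runTool(name: string): string {
  return `result of ${name}`;
}

// The UI layer drives execution: each iteration is a separate turn.
function legacyUiLoop(prompt: string): { answer: string; turns: number } {
  const history: Message[] = [{ role: "user", text: prompt }];
  let turns = 0;
  while (true) {
    turns += 1;
    // The entire history is submitted again as a brand-new turn.
    const response = callModel(history);
    if (response.toolCalls.length === 0) {
      return { answer: response.text, turns };
    }
    // Execution has stopped at the UI, which schedules the tools and waits.
    for (const name of response.toolCalls) {
      history.push({ role: "tool", text: runTool(name) });
    }
  }
}

const legacyResult = legacyUiLoop("summarize the repo");
console.log(legacyResult);
```

In this sketch one prompt costs two UI-managed turns, and a subagent would need its own private copy of the same loop.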
New architecture: Unified agent harness
The Agent Harness treats the ReAct (Reasoning and Action) loop as a first-class, autonomous process.
The new model introduces several key improvements:
- Continuous Loop: The AgentHarness manages the entire lifecycle internally. It handles LLM calls, tool execution, and reasoning without relinquishing control to the UI until it reaches the final goal.
- Event Stream: The harness yields a continuous stream of events (GeminiEvent) that the UI listens to and renders in real-time.
- Hierarchical Delegation: Because the harness is unified, a subagent is simply another instance of AgentHarness running inside a tool call of the parent harness.
- Feature Parity: Subagents can now use the same features as the main agent, including dynamic model routing, history compression, and complex interactive tools.
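To make the continuous loop and hierarchical delegation concrete, here is a rough sketch that uses a generator as the event stream. The `HarnessEvent`, `Step`, and `AgentSpec` types and the `runHarness` function are assumptions made for this example; the real AgentHarness and GeminiEvent shapes may differ, a scripted list of steps replaces actual LLM calls, and a real harness would most likely use an async stream.

```typescript
// Hypothetical event shapes; the actual GeminiEvent type may look different.
type HarnessEvent =
  | { type: "tool_call"; agent: string; tool: string }
  | { type: "turn_finished"; agent: string }
  | { type: "done"; agent: string; result: string };

// A scripted step replaces a real model decision in this sketch.
type Step =
  | { kind: "tool"; tool: string }
  | { kind: "delegate"; child: AgentSpec }
  | { kind: "answer"; text: string };

type AgentSpec = { name: string; steps: Step[] };

// The harness owns the whole loop and only yields events to its caller.
function* runHarness(agent: AgentSpec): Generator<HarnessEvent, string> {
  for (const step of agent.steps) {
    switch (step.kind) {
      case "tool":
        yield { type: "tool_call", agent: agent.name, tool: step.tool };
        yield { type: "turn_finished", agent: agent.name };
        break;
      case "delegate":
        yield { type: "tool_call", agent: agent.name, tool: step.child.name };
        // The subagent is the same harness, run inside the parent's tool call;
        // its events flow through the parent's stream.
        yield* runHarness(step.child);
        yield { type: "turn_finished", agent: agent.name };
        break;
      case "answer":
        yield { type: "done", agent: agent.name, result: step.text };
        return step.text;
    }
  }
  return "";
}

const investigator: AgentSpec = {
  name: "codebase_investigator",
  steps: [{ kind: "tool", tool: "read_file" }, { kind: "answer", text: "found 3 modules" }],
};
const mainAgent: AgentSpec = {
  name: "main",
  steps: [{ kind: "delegate", child: investigator }, { kind: "answer", text: "summary ready" }],
};

// The consumer (the UI in the real system) only listens and renders.
const harnessEvents = Array.from(runHarness(mainAgent));
for (const event of harnessEvents) console.log(event.agent, event.type);
```

Because parent and child share one function, any capability added to `runHarness` in this sketch is available at every level of the hierarchy, which is the parity argument made above.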
UI synchronization challenges
Moving to a hierarchical model introduces complexity in how the UI maintains a consistent history.
The HistoryManager expects a flat list of messages, but the harness provides a
nested, multi-turn stream. This creates two primary challenges:
- History Persistence: Legacy code may clear the "active" turn state prematurely when a turn boundary is crossed. The harness uses a TurnFinished event to signal when to "lock in" reasoning without ending the overall session.
- Hierarchical Boxes: In a hierarchical model, internal subagent tool calls (for example, reading a file) shouldn't clutter the main history. The UI uses SubagentActivity events to update a single, persistent subagent box rather than rendering every internal step as a top-level item.
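One way to picture how a nested stream can be folded into a flat history is a small reducer. This is only a sketch under stated assumptions: the event names TurnFinished and SubagentActivity come from the text above, but their payloads, the `Content` event, the `HistoryItem` and `UiState` types, and `reduceEvent` are invented here and do not describe the real HistoryManager.

```typescript
// Hypothetical payloads; only the event names come from the document.
type UiEvent =
  | { type: "Content"; text: string }
  | { type: "TurnFinished" }
  | { type: "SubagentActivity"; subagent: string; detail: string };

type HistoryItem =
  | { kind: "message"; text: string }
  | { kind: "subagent_box"; subagent: string; latest: string; steps: number };

type UiState = { history: HistoryItem[]; pending: string };

function reduceEvent(state: UiState, event: UiEvent): UiState {
  switch (event.type) {
    case "Content":
      // Streamed text accumulates in the active turn.
      return { ...state, pending: state.pending + event.text };
    case "TurnFinished": {
      // Lock the active text into history; the session itself keeps running.
      if (state.pending === "") return state;
      const locked: HistoryItem = { kind: "message", text: state.pending };
      return { history: [...state.history, locked], pending: "" };
    }
    case "SubagentActivity": {
      const { subagent, detail } = event;
      let found = false;
      // Update the existing box in place instead of adding a top-level item.
      const history = state.history.map((item): HistoryItem => {
        if (item.kind === "subagent_box" && item.subagent === subagent) {
          found = true;
          return { ...item, latest: detail, steps: item.steps + 1 };
        }
        return item;
      });
      if (!found) {
        history.push({ kind: "subagent_box", subagent, latest: detail, steps: 1 });
      }
      return { ...state, history };
    }
  }
}

const uiEvents: UiEvent[] = [
  { type: "Content", text: "Plan: " },
  { type: "Content", text: "inspect code." },
  { type: "TurnFinished" },
  { type: "SubagentActivity", subagent: "codebase_investigator", detail: "read_file a.ts" },
  { type: "SubagentActivity", subagent: "codebase_investigator", detail: "read_file b.ts" },
  { type: "Content", text: "Done." },
  { type: "TurnFinished" },
];

const initialUiState: UiState = { history: [], pending: "" };
const finalUiState = uiEvents.reduce(reduceEvent, initialUiState);
console.log(JSON.stringify(finalUiState.history, null, 2));
```

In this example the two internal file reads collapse into one subagent box, so the flat history ends with three items rather than four.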
Isolation strategy
To ensure stability during this transition, the project uses a "Dual Implementation" strategy.
This strategy isolates the experimental logic from the stable codebase:
- Hook Isolation: useAgentHarness.ts provides a dedicated hook for the new event model, leaving the stable useGeminiStream untouched.
- Logic Isolation: HarnessSubagentInvocation.ts manages subagent execution specifically for the harness, while LocalSubagentInvocation.ts continues to serve the legacy path.
- Conditional Forking: The system switches between these paths based on the experimental-agent-harness configuration flag.
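The conditional fork can be sketched as a single selection function. The flag name and the module names come from the text above; the `Settings` shape, the `ExecutionPath` type, and `selectExecutionPath` are assumptions for illustration, and where the flag actually lives in the configuration is not specified here.

```typescript
// Hypothetical settings shape; the real configuration structure may differ.
type Settings = { [key: string]: boolean | undefined };

type ExecutionPath = { hook: string; subagentInvocation: string };

const HARNESS_FLAG = "experimental-agent-harness";

// Pick one complete path so experimental and stable code never mix.
function selectExecutionPath(settings: Settings): ExecutionPath {
  if (settings[HARNESS_FLAG] === true) {
    return { hook: "useAgentHarness", subagentInvocation: "HarnessSubagentInvocation" };
  }
  // Default: the stable legacy path stays untouched.
  return { hook: "useGeminiStream", subagentInvocation: "LocalSubagentInvocation" };
}

const stablePath = selectExecutionPath({});
const previewPath = selectExecutionPath({ [HARNESS_FLAG]: true });
console.log(stablePath.hook, previewPath.hook);
```

Treating an unset flag the same as `false` keeps the legacy path as the default, which matches the goal of isolating the preview feature.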