MediaMetz/gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-06-12 12:26:57 -07:00

Adam Weidman 9f3a154014 Add validated architectural notes

2026-04-05 09:13:57 -07:00

Gemini CLI Architecture Notes

Project Structure

Monorepo packages:

packages/core/ - Main execution engine (the big one)
packages/cli/ - CLI frontend
packages/sdk/ - SDK for extensions
packages/a2a-server/ - Agent-to-agent server
packages/devtools/ - Dev utilities
packages/vscode-ide-companion/ - VS Code extension

Core Execution Loop

GeminiClient (`core/src/core/client.ts` ~38KB)

Primary orchestrator for user interactions
Manages session lifecycle, message routing, model selection
Coordinates hooks, context management, error recovery
Enforces MAX_TURNS = 100 per session
Tracks currentSequenceModel for multi-turn stickiness
Handles history compression when context grows

GeminiChat (`core/src/core/geminiChat.ts` ~34KB)

Bidirectional LLM communication
Maintains history[] alternating user/model turns
Retry logic: max 2 attempts, 500ms delay for invalid responses
Fires BeforeModel and AfterModel hooks
Integrates ChatRecordingService for persistence
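
The retry rule above (max 2 attempts, 500ms delay) can be sketched as follows. This is a simplified, synchronous illustration, not the code in geminiChat.ts: `retryOnInvalid`, `RetryOptions`, and the injected `pause` callback are hypothetical names, the real implementation is presumably asynchronous, and "2 attempts" is read here as two attempts in total.

```typescript
// Hypothetical sketch of "max 2 attempts, 500ms delay for invalid responses".
// All names are illustrative; none are taken from geminiChat.ts.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one (assumption)
  delayMs: number;     // pause between attempts
}

const INVALID_RESPONSE_RETRY: RetryOptions = { maxAttempts: 2, delayMs: 500 };

function retryOnInvalid<T>(
  attempt: (attemptNumber: number) => T,
  isValid: (response: T) => boolean,
  pause: (ms: number) => void,
  options: RetryOptions = INVALID_RESPONSE_RETRY,
): T {
  let last: T = attempt(1);
  // Retry only while the response is invalid and attempts remain.
  for (let n = 2; n <= options.maxAttempts && !isValid(last); n++) {
    pause(options.delayMs);
    last = attempt(n);
  }
  if (!isValid(last)) {
    throw new Error(`Invalid response after ${options.maxAttempts} attempts`);
  }
  return last;
}
```

Injecting `pause` keeps the delay testable without real timers; an asynchronous version would await a timer at the same point.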

Scheduler (`core/src/scheduler/scheduler.ts` ~23KB)

Three-phase, event-driven flow: Ingestion → Processing → Completion
Tool call state machine: Validating → AwaitingApproval → Scheduled → Executing → Terminal
Terminal states: Success, Error, Cancelled
Parallel execution for read-only and agent-type tools
Yields to event loop for user approval
Publishes state changes via MessageBus
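
The tool call state machine above can be sketched as a transition table. The state names come from the notes; the exact set of allowed transitions (for example, skipping AwaitingApproval when no approval is needed) is an assumption, and `transition` and `isTerminal` are hypothetical helpers.

```typescript
// Hypothetical sketch: state names follow the notes, the transition table is assumed.
type ToolCallState =
  | 'Validating'
  | 'AwaitingApproval'
  | 'Scheduled'
  | 'Executing'
  | 'Success'
  | 'Error'
  | 'Cancelled';

const TERMINAL_STATES: ToolCallState[] = ['Success', 'Error', 'Cancelled'];

// Assumed rule: each phase may advance, or end early in Error or Cancelled.
const ALLOWED_TRANSITIONS: Record<ToolCallState, ToolCallState[]> = {
  Validating: ['AwaitingApproval', 'Scheduled', 'Error', 'Cancelled'],
  AwaitingApproval: ['Scheduled', 'Error', 'Cancelled'],
  Scheduled: ['Executing', 'Error', 'Cancelled'],
  Executing: ['Success', 'Error', 'Cancelled'],
  Success: [],
  Error: [],
  Cancelled: [],
};

function isTerminal(state: ToolCallState): boolean {
  return TERMINAL_STATES.indexOf(state) !== -1;
}

// Returns the new state, or throws if the move is not in the table.
function transition(from: ToolCallState, to: ToolCallState): ToolCallState {
  if (ALLOWED_TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Illegal tool call transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping terminal states with empty transition lists makes "no way out of Success, Error, or Cancelled" a property of the table rather than of scattered checks.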

CoreToolScheduler (`core/src/core/coreToolScheduler.ts` ~38KB)

Sequential, queue-based tool processing
Validates policy via PolicyEngine
Confirmation handling via ToolModificationHandler (editor integration)
Uses MessageBus for async confirmation responses

Tool System

DeclarativeTool Pattern

Separation of concerns: build() → validate → createInvocation() → execute()
ToolBuilder defines metadata (name, displayName, description, kind) + schema via getSchema()
ToolInvocation has: getDescription(), toolLocations(), shouldConfirmExecute(), execute()
ToolResult contains: llmContent (for LLM), returnDisplay (for UI), error details, tail calls
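
A minimal sketch of the build() → validate → createInvocation() → execute() separation described above. It is synchronous for brevity, whereas the real execute() is presumably asynchronous. `SketchDeclarativeTool`, `SketchToolInvocation`, `SketchToolResult`, and `EchoTool` are hypothetical names, and the signatures are assumptions rather than the actual interfaces in packages/core.

```typescript
// Hypothetical sketch of the declarative tool pattern; signatures are assumed.
interface SketchToolResult {
  llmContent: string;    // content returned to the model
  returnDisplay: string; // content shown in the UI
  error?: { message: string };
}

interface SketchToolInvocation {
  getDescription(): string;
  execute(): SketchToolResult;
}

abstract class SketchDeclarativeTool<P> {
  readonly toolName: string;

  constructor(toolName: string) {
    this.toolName = toolName;
  }

  // Returns an error message, or null when the params are acceptable.
  protected abstract validate(params: P): string | null;
  protected abstract createInvocation(params: P): SketchToolInvocation;

  // build() validates first, so an invocation only exists for valid params.
  build(params: P): SketchToolInvocation {
    const problem = this.validate(params);
    if (problem !== null) {
      throw new Error(`${this.toolName}: ${problem}`);
    }
    return this.createInvocation(params);
  }
}

// Illustrative tool that echoes its input.
class EchoTool extends SketchDeclarativeTool<{ text: string }> {
  constructor() {
    super('echo');
  }
  protected validate(params: { text: string }): string | null {
    return params.text.length > 0 ? null : 'text must not be empty';
  }
  protected createInvocation(params: { text: string }): SketchToolInvocation {
    return {
      getDescription: () => `echo "${params.text}"`,
      execute: () => ({ llmContent: params.text, returnDisplay: params.text }),
    };
  }
}
```

The point of the split is that the description and confirmation prompt can be produced from a validated invocation before anything is executed.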

BaseToolInvocation

Abstract base with MessageBus integration for policy/confirmation
Three decision paths: ALLOW, DENY, ASK_USER via getMessageBusDecision()

ToolRegistry (`core/src/tools/tool-registry.ts`)

Registers tools via registerTool()
MCP tools with fully qualified names: mcp_serverName_toolName
Priority sorting: built-in → discovered → MCP (by server name)
Filters by active status based on configuration
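
The naming and ordering rules above can be illustrated with a short sketch. The `mcp_serverName_toolName` format and the built-in → discovered → MCP order come from the notes; `RegisteredTool`, `mcpQualifiedName`, and `sortTools` are hypothetical names, and the real registry may break ties differently.

```typescript
// Hypothetical sketch of MCP naming and registry ordering.
type ToolSource = 'builtin' | 'discovered' | 'mcp';

interface RegisteredTool {
  toolName: string;
  source: ToolSource;
  serverName?: string; // set for MCP tools only
}

// Fully qualified MCP name as given in the notes: mcp_serverName_toolName.
function mcpQualifiedName(serverName: string, toolName: string): string {
  return `mcp_${serverName}_${toolName}`;
}

const SOURCE_RANK: Record<ToolSource, number> = { builtin: 0, discovered: 1, mcp: 2 };

// Built-in first, then discovered, then MCP tools ordered by server name.
function sortTools(tools: RegisteredTool[]): RegisteredTool[] {
  return tools.slice().sort((a, b) => {
    const bySource = SOURCE_RANK[a.source] - SOURCE_RANK[b.source];
    if (bySource !== 0) return bySource;
    return (a.serverName ?? '').localeCompare(b.serverName ?? '');
  });
}
```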

Confirmation System

ToolCallConfirmationDetails union: edit, execute, MCP, info, ask_user, exit_plan_mode
ToolConfirmationOutcome enum: ProceedOnce, ProceedAlways, etc.
Async confirmation via MessageBus pub/sub
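
A discriminated union is a natural way to model the confirmation detail types listed above. In this sketch the tag values mirror the notes, while the payload fields, `SketchConfirmationDetails`, and `confirmationPrompt` are assumptions made for illustration.

```typescript
// Hypothetical sketch: tags follow the notes, payload fields are assumed.
type SketchConfirmationDetails =
  | { type: 'edit'; filePath: string }
  | { type: 'execute'; command: string }
  | { type: 'mcp'; serverName: string; toolName: string }
  | { type: 'info'; prompt: string }
  | { type: 'ask_user'; question: string }
  | { type: 'exit_plan_mode' };

// Builds the text a UI might show; the compiler checks that every tag is handled.
function confirmationPrompt(details: SketchConfirmationDetails): string {
  switch (details.type) {
    case 'edit':
      return `Apply edit to ${details.filePath}?`;
    case 'execute':
      return `Run command: ${details.command}?`;
    case 'mcp':
      return `Call ${details.toolName} on MCP server ${details.serverName}?`;
    case 'info':
      return `Proceed with: ${details.prompt}?`;
    case 'ask_user':
      return details.question;
    case 'exit_plan_mode':
      return 'Exit plan mode?';
    default: {
      // Exhaustiveness check: adding a new tag without a case fails to compile.
      const unreachable: never = details;
      throw new Error(`Unhandled confirmation type: ${JSON.stringify(unreachable)}`);
    }
  }
}
```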

Hooks System

Hook Types (11 hook points)

| Hook | Trigger | Key Capability |
| --- | --- | --- |
| `BeforeTool` | Before tool execution | Modify tool_input |
| `AfterTool` | After tool completion | Context injection, tail calls |
| `BeforeAgent` | Before agent prompt | Additional context |
| `AfterAgent` | After agent response | Clear context flag |
| `BeforeModel` | Before LLM request | Modify request or inject response |
| `AfterModel` | After LLM response | Modify response |
| `BeforeToolSelection` | Before tool selection | Modify toolConfig |
| `Notification` | When notifications fire | Suppress/modify message |
| `SessionStart` | Session begins | Additional context |
| `SessionEnd` | Session terminates | Cleanup |
| `PreCompress` | Before compression | Suppress/modify |

Hook Output Fields (common to all hooks)

continue - Whether execution proceeds
stopReason - Reason to halt
suppressOutput - Hide from user
systemMessage - Add to system context
decision - ask/block/deny/approve/allow
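
The common output fields can be written down as a type, with one possible way of turning them into a verdict. The field names come from the list above; `SketchHookOutput`, `HookVerdict`, `interpretHookOutput`, and the precedence between `continue` and `decision` are assumptions, not the behaviour of the real hook system.

```typescript
// Hypothetical sketch: field names follow the notes, the interpretation is assumed.
type HookDecision = 'ask' | 'block' | 'deny' | 'approve' | 'allow';

interface SketchHookOutput {
  continue?: boolean;
  stopReason?: string;
  suppressOutput?: boolean;
  systemMessage?: string;
  decision?: HookDecision;
}

type HookVerdict =
  | { kind: 'stop'; reason: string }
  | { kind: 'blocked'; reason: string }
  | { kind: 'ask' }
  | { kind: 'proceed' };

function interpretHookOutput(output: SketchHookOutput): HookVerdict {
  // Assumption: an explicit continue=false takes precedence over any decision.
  if (output.continue === false) {
    return { kind: 'stop', reason: output.stopReason ?? 'stopped by hook' };
  }
  if (output.decision === 'block' || output.decision === 'deny') {
    return { kind: 'blocked', reason: output.stopReason ?? 'blocked by hook' };
  }
  if (output.decision === 'ask') {
    return { kind: 'ask' };
  }
  // approve, allow, or no decision at all.
  return { kind: 'proceed' };
}
```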

Hook System Components

HookSystem - Main coordinator
HookRegistry - Stores/manages configurations
HookRunner - Executes registered hooks
HookAggregator - Combines multiple hook results
HookPlanner - Determines execution order
HookEventHandler - Orchestrates event firing
HookTranslator - Converts between formats

Policy Engine

Rule Structure

interface PolicyRule {
  toolName: string;        // wildcards supported
  decision: PolicyDecision; // ALLOW | DENY | ASK_USER
  priority: number;
  argsPattern?: RegExp;    // conditional on args
  mcpName?: string;
  source: string;
}
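
One way rules of this shape might be evaluated is sketched below. Only the `PolicyRule` fields come from the notes. The `*` wildcard semantics, matching `argsPattern` against JSON-serialized arguments, "highest priority wins", and the ASK_USER fallback are all assumptions, and `toolNameMatches` and `evaluatePolicy` are hypothetical helpers. The tool names in the usage example are illustrative.

```typescript
// Hypothetical sketch of rule evaluation; only the PolicyRule fields come from the notes.
type PolicyDecision = 'ALLOW' | 'DENY' | 'ASK_USER';

interface PolicyRule {
  toolName: string;
  decision: PolicyDecision;
  priority: number;
  argsPattern?: RegExp;
  mcpName?: string;
  source: string;
}

// Assumption: '*' in toolName matches any run of characters.
function toolNameMatches(pattern: string, toolName: string): boolean {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except '*'
    .replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`).test(toolName);
}

// Assumption: the highest-priority matching rule wins; ASK_USER is the fallback.
function evaluatePolicy(rules: PolicyRule[], toolName: string, args: unknown): PolicyDecision {
  const serializedArgs = JSON.stringify(args);
  let best: PolicyRule | undefined;
  for (const rule of rules) {
    if (!toolNameMatches(rule.toolName, toolName)) continue;
    if (rule.argsPattern && !rule.argsPattern.test(serializedArgs)) continue;
    if (!best || rule.priority > best.priority) best = rule;
  }
  return best ? best.decision : 'ASK_USER';
}
```

For example, a rule `{ toolName: 'read_*', decision: 'ALLOW', priority: 1.5, source: 'default' }` would let a tool named `read_file` through, while a higher-priority DENY rule with `argsPattern: /rm -rf/` would block only the shell invocations whose arguments match.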

Tier Hierarchy (lowest → highest priority)

Default (1) - Core built-in policies
Extension (2) - Extension contributions
Workspace (3) - Project-scoped (.gemini/)
User (4) - User-provided (~/.gemini/)
Admin (5) - System-level policies

Dynamic Rule Priorities (within User Tier)

4.9 - MCP_EXCLUDED (persistent server blocks)
4.4 - EXCLUDE_TOOLS_FLAG (CLI exclusions)
4.3 - ALLOWED_TOOLS_FLAG (CLI allows)
4.2 - TRUSTED_MCP_SERVER
4.1 - ALLOWED_MCP_SERVER
3.95 - ALWAYS_ALLOW (interactive selections)
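
The numbers above read naturally as "tier plus a fractional offset". The sketch below copies the values from the list; `precedenceWinner` and `tierBand` are hypothetical helpers, and reading the integer part as the tier band is an assumption.

```typescript
// Values copied from the notes; the helpers are hypothetical illustrations.
const DYNAMIC_PRIORITIES = {
  MCP_EXCLUDED: 4.9,
  EXCLUDE_TOOLS_FLAG: 4.4,
  ALLOWED_TOOLS_FLAG: 4.3,
  TRUSTED_MCP_SERVER: 4.2,
  ALLOWED_MCP_SERVER: 4.1,
  ALWAYS_ALLOW: 3.95,
};

type DynamicRuleKind = keyof typeof DYNAMIC_PRIORITIES;

// Returns whichever rule kind takes precedence (the higher number wins).
function precedenceWinner(a: DynamicRuleKind, b: DynamicRuleKind): DynamicRuleKind {
  return DYNAMIC_PRIORITIES[a] >= DYNAMIC_PRIORITIES[b] ? a : b;
}

// Assumption: the integer part of a priority identifies its tier band.
function tierBand(priority: number): number {
  return Math.floor(priority);
}
```

Under this reading, a CLI exclusion (4.4) overrides a CLI allow (4.3) for the same tool. ALWAYS_ALLOW at 3.95 falls numerically in the 3.x band, just below every 4.x rule, so interactive "always allow" selections would lose to any of the other dynamic rules listed here.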

Security Constraint

Extensions CANNOT contribute ALLOW rules or enable YOLO mode
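
A sketch of how the ALLOW part of this constraint could be enforced when loading extension rules. `ContributedRule` and `filterExtensionRules` are hypothetical; the notes do not say whether offending rules are dropped, downgraded, or cause the extension to fail to load, so dropping them is an assumption. YOLO mode is not modelled here.

```typescript
// Hypothetical sketch: reject ALLOW rules that come from extensions.
type ContributedDecision = 'ALLOW' | 'DENY' | 'ASK_USER';

interface ContributedRule {
  toolName: string;
  decision: ContributedDecision;
  source: string; // e.g. the contributing extension's identifier
}

function filterExtensionRules(
  rules: ContributedRule[],
): { accepted: ContributedRule[]; rejected: ContributedRule[] } {
  const accepted: ContributedRule[] = [];
  const rejected: ContributedRule[] = [];
  for (const rule of rules) {
    // Extensions may tighten policy (DENY, ASK_USER) but never loosen it (ALLOW).
    (rule.decision === 'ALLOW' ? rejected : accepted).push(rule);
  }
  return { accepted, rejected };
}
```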

Agent System

Agent Registry (`core/src/agents/registry.ts`)

Discovery sources:

Built-in: CodebaseInvestigator, CliHelp, Generalist, Browser
User-level: ~/.gemini/agents/
Project-level: .gemini/agents/ (requires folder trust)
Extension-based: From active extensions

LocalAgentExecutor (`core/src/agents/local-executor.ts`)

Prompt processing: input augmentation → template expansion → system prompt construction
Uses GeminiChat to accumulate the conversation
ChatCompressionService for history management
Turn loop: invoke model → extract function calls → check auth → append results
Termination: complete_task tool, max turns, timeout
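
The turn loop and its three termination conditions can be sketched as below. This is a synchronous simplification with a stubbed model. `runAgentLoop`, `AgentFunctionCall`, `LoopLimits`, and the reason labels are hypothetical names; the authorization check from the notes is omitted, and only the `complete_task` tool name is taken from the source.

```typescript
// Hypothetical, synchronous sketch of the agent turn loop; the model is a stub.
interface AgentFunctionCall {
  toolName: string;
  args: Record<string, unknown>;
}

type ModelStep = (transcript: string[]) => AgentFunctionCall[];
type TerminateReason = 'GOAL' | 'MAX_TURNS' | 'TIMEOUT';

interface LoopLimits {
  maxTurns: number;
  timeoutMs: number;
}

function runAgentLoop(
  model: ModelStep,
  runTool: (call: AgentFunctionCall) => string,
  limits: LoopLimits,
  now: () => number = Date.now,
): { reason: TerminateReason; turns: number; transcript: string[] } {
  const transcript: string[] = [];
  const startedAt = now();
  for (let turn = 1; turn <= limits.maxTurns; turn++) {
    if (now() - startedAt > limits.timeoutMs) {
      return { reason: 'TIMEOUT', turns: turn - 1, transcript };
    }
    // Invoke the model and extract its function calls.
    const calls = model(transcript);
    for (const call of calls) {
      if (call.toolName === 'complete_task') {
        return { reason: 'GOAL', turns: turn, transcript };
      }
      // Append each tool result so the next turn can see it.
      transcript.push(`${call.toolName}: ${runTool(call)}`);
    }
  }
  return { reason: 'MAX_TURNS', turns: limits.maxTurns, transcript };
}
```

Passing `now` as a parameter makes the timeout path deterministic in tests; production code would rely on the default clock.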

SubagentTool (`core/src/agents/subagent-tool.ts`)

Extends BaseDeclarativeTool - agents invoked like standard tools
Read-only status checking, user hint propagation
Execution: validate → optional confirmation → parameter enrichment → SubagentToolWrapper

Remote Agents

A2A client manager for agent-to-agent protocol
Remote invocation for external agents
Agent acknowledgement system (security for project agents)

Model System

ModelConfigService

Hierarchical alias system: children override parents
Resolution: alias chain → level assignment → apply overrides
Deep merging with array override capability
Fallback to chat-base alias for unknown models
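
A minimal sketch of alias-chain resolution with deep merging, under stated assumptions: `AliasEntry`, `extendsAlias`, `deepMerge`, and `resolveAlias` are hypothetical names, the config fields in the usage example are illustrative, and "array override" is read as "a child's array replaces the parent's array". Only the `chat-base` fallback name comes from the notes.

```typescript
// Hypothetical sketch of hierarchical alias resolution; names and shapes are assumed.
type ConfigValue = string | number | boolean | ConfigValue[] | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

interface AliasEntry {
  extendsAlias?: string; // parent alias, if any
  config: ConfigObject;
}

function isConfigObject(value: ConfigValue | undefined): value is ConfigObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Child keys win; nested objects merge recursively; arrays are replaced wholesale.
function deepMerge(parent: ConfigObject, child: ConfigObject): ConfigObject {
  const out: ConfigObject = { ...parent };
  for (const key of Object.keys(child)) {
    const p = out[key];
    const c = child[key];
    out[key] = isConfigObject(p) && isConfigObject(c) ? deepMerge(p, c) : c;
  }
  return out;
}

function resolveAlias(table: Record<string, AliasEntry>, requested: string): ConfigObject {
  const known = Object.prototype.hasOwnProperty.call(table, requested);
  const chain: AliasEntry[] = [];
  const visited: string[] = [];
  let cursor: string | undefined = known ? requested : 'chat-base'; // fallback for unknown models
  while (cursor !== undefined) {
    const entry: AliasEntry | undefined = table[cursor];
    if (entry === undefined || visited.indexOf(cursor) !== -1) {
      throw new Error(`Unknown or cyclic alias: ${cursor}`);
    }
    visited.push(cursor);
    chain.unshift(entry); // parents end up first, so children are applied last
    cursor = entry.extendsAlias;
  }
  return chain.reduce((acc, entry) => deepMerge(acc, entry.config), {} as ConfigObject);
}
```

With a `fast` alias that extends `chat-base` and overrides only a nested array, the resolved config keeps the parent's other nested fields and takes the child's array as-is.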

ModelRouterService

Sequential strategy pattern:

Fallback & Override
Approval Mode Strategy
Gemma Classifier (if enabled)
Generic Classifier
Numerical Classifier
Default Strategy
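
The sequential strategy pattern can be sketched as a chain in which the first strategy to return a decision wins and later strategies are never consulted. `RoutingStrategy`, `RoutingRequest`, `RoutingDecision`, `routeSequentially`, and the placeholder model name are hypothetical; the real strategies presumably call classifiers asynchronously.

```typescript
// Hypothetical sketch of a sequential strategy chain; all names are illustrative.
interface RoutingRequest {
  prompt: string;
  overrideModel?: string;
}

interface RoutingDecision {
  model: string;
  decidedBy: string;
}

interface RoutingStrategy {
  strategyName: string;
  // Returns null to defer to the next strategy in the chain.
  route(request: RoutingRequest): RoutingDecision | null;
}

function routeSequentially(
  strategies: RoutingStrategy[],
  request: RoutingRequest,
): RoutingDecision {
  for (const strategy of strategies) {
    const decision = strategy.route(request);
    if (decision !== null) return decision;
  }
  throw new Error('No strategy produced a decision; the chain needs a terminal default');
}

// Two illustrative strategies: an explicit override, then a catch-all default.
const overrideStrategy: RoutingStrategy = {
  strategyName: 'override',
  route: (request) =>
    request.overrideModel ? { model: request.overrideModel, decidedBy: 'override' } : null,
};

const defaultStrategy: RoutingStrategy = {
  strategyName: 'default',
  route: () => ({ model: 'example-default-model', decidedBy: 'default' }),
};
```

Placing the default strategy last guarantees the chain always terminates with a decision, which is why the order in the list above matters.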

ModelAvailabilityService

Health states:

Terminal - permanently unavailable
Sticky Retry - failed once, can retry once per turn
Healthy - no issues
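
The three health states can be sketched as a small tracker. The state meanings follow the list above; `SketchAvailabilityTracker`, its method names, and the per-turn bookkeeping are assumptions about how "can retry once per turn" might be implemented.

```typescript
// Hypothetical sketch; state meanings follow the notes, the bookkeeping is assumed.
type ModelHealth =
  | { state: 'healthy' }
  | { state: 'sticky_retry'; retriedThisTurn: boolean }
  | { state: 'terminal' };

class SketchAvailabilityTracker {
  private readonly health: Record<string, ModelHealth> = {};

  markHealthy(model: string): void {
    this.health[model] = { state: 'healthy' };
  }

  markTerminal(model: string): void {
    this.health[model] = { state: 'terminal' };
  }

  markFailedOnce(model: string): void {
    this.health[model] = { state: 'sticky_retry', retriedThisTurn: false };
  }

  // Call at the start of each turn so sticky models get one fresh retry.
  startTurn(): void {
    for (const model of Object.keys(this.health)) {
      const h = this.health[model];
      if (h.state === 'sticky_retry') h.retriedThisTurn = false;
    }
  }

  // True if the model may be attempted now; consumes the per-turn retry if needed.
  tryAcquire(model: string): boolean {
    const h = this.health[model];
    if (h === undefined || h.state === 'healthy') return true; // unknown models count as healthy
    if (h.state === 'terminal') return false;
    if (h.retriedThisTurn) return false;
    h.retriedThisTurn = true;
    return true;
  }
}
```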

Services

| Service | Purpose |
| --- | --- |
| ChatRecordingService | Session persistence (JSON files) |
| ChatCompressionService | History summarization for token budgets |
| ModelConfigService | Hierarchical model config with aliases |
| ModelAvailabilityService | Model health tracking |
| ModelRouterService | Model selection via strategies |
| FolderTrustDiscoveryService | Workspace security scanning |
| KeychainService | Credential storage |
| LoopDetectionService | Detect repetitive agent loops |

UI + Core Separation

IDE Client (`core/src/ide/ide-client.ts`)

Singleton managing CLI ↔ IDE communication via MCP
Outbound (CLI → IDE): openDiff, closeDiff
Inbound (IDE → CLI): ide/contextUpdate, ide/diffAccepted, ide/diffRejected

Event Contract

interface IdeContextNotification {
  method: 'ide/contextUpdate';
  params: { workspaceState: { openFiles: string[]; isTrusted: boolean } };
}
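
Because these notifications arrive from another process, a runtime guard is a reasonable companion to the interface. The sketch below repeats the interface from the notes so it is self-contained; `isPlainRecord` and `isIdeContextNotification` are hypothetical helpers, not functions from ide-client.ts.

```typescript
// Hypothetical runtime guard for the notification shape shown in the notes.
interface IdeContextNotification {
  method: 'ide/contextUpdate';
  params: { workspaceState: { openFiles: string[]; isTrusted: boolean } };
}

function isPlainRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

function isIdeContextNotification(message: unknown): message is IdeContextNotification {
  if (!isPlainRecord(message) || message.method !== 'ide/contextUpdate') return false;
  const params = message.params;
  if (!isPlainRecord(params)) return false;
  const workspaceState = params.workspaceState;
  if (!isPlainRecord(workspaceState)) return false;
  const openFiles = workspaceState.openFiles;
  return (
    Array.isArray(openFiles) &&
    openFiles.every((file) => typeof file === 'string') &&
    typeof workspaceState.isTrusted === 'boolean'
  );
}
```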

Confirmation Bus

TOOL_CONFIRMATION_REQUEST / TOOL_CONFIRMATION_RESPONSE
Detail types: edit, execute, MCP, info, ask_user, exit_plan_mode
Async pub/sub via MessageBus
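
The request/response exchange over the bus can be sketched as follows. The two message type names come from the notes; `SketchMessageBus`, `requestConfirmation`, the `correlationId` field, and the payload shapes are assumptions. The sketch is synchronous and never unsubscribes, which a real implementation would need to handle.

```typescript
// Hypothetical sketch of confirmation over pub/sub, matched by a correlation id.
type BusMessage =
  | { type: 'TOOL_CONFIRMATION_REQUEST'; correlationId: string; toolName: string }
  | { type: 'TOOL_CONFIRMATION_RESPONSE'; correlationId: string; confirmed: boolean };

type BusListener = (message: BusMessage) => void;

class SketchMessageBus {
  private readonly listeners: { [type: string]: BusListener[] } = {};

  subscribe(type: BusMessage['type'], listener: BusListener): void {
    const existing = this.listeners[type] || [];
    existing.push(listener);
    this.listeners[type] = existing;
  }

  publish(message: BusMessage): void {
    for (const listener of this.listeners[message.type] || []) {
      listener(message);
    }
  }
}

// Publishes a request and calls onAnswer when the matching response arrives.
function requestConfirmation(
  bus: SketchMessageBus,
  correlationId: string,
  toolName: string,
  onAnswer: (confirmed: boolean) => void,
): void {
  bus.subscribe('TOOL_CONFIRMATION_RESPONSE', (message) => {
    if (
      message.type === 'TOOL_CONFIRMATION_RESPONSE' &&
      message.correlationId === correlationId
    ) {
      onAnswer(message.confirmed);
    }
  });
  bus.publish({ type: 'TOOL_CONFIRMATION_REQUEST', correlationId, toolName });
}
```

The correlation id lets several confirmations be in flight at once without a response being delivered to the wrong requester.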

Configuration (`core/src/config/config.ts` ~95KB!)

Tool config: core tools, allowed/excluded, MCP servers
File filtering: git ignore, fuzzy search, max counts, timeouts
Approval modes: policy engine config
Experiments: feature flags (GEMINI_3_1_PRO_LAUNCHED, ENABLE_ADMIN_CONTROLS, etc.)
FolderTrust: discovery scans for commands, skills, settings, MCP, hooks
