Commit Graph

3727 Commits

Author SHA1 Message Date
Gaurav Ghosh 7d32f2bf88 fix: suppress MCP server stderr from corrupting alternate buffer UI
Pipe stderr from npx chrome-devtools-mcp instead of inheriting it.
The server's banner warnings were leaking into the terminal and
corrupting the Ink-based UI in alternate buffer mode. Piped output
is forwarded to debugLogger so it remains visible with --debug.
2026-02-24 02:05:53 -08:00
Gaurav Ghosh c991e5b3dc fix: address PR #19284 review comments
- Remove redundant Promise.race in McpToolInvocation.execute (event listener leak)
- Propagate AbortSignal to all press_key calls (submitKey + typeCharByChar)
- Call this.close() on connectMcp failure (zombie process leak)
- Set showInDialog: false for all browser settings
- Remove debug log truncation in analyzeScreenshot
- Fix misleading --experimental-vision error message
- Replace any casts with typed TestableConfirmation interface in tests
- Update license year to 2026 in all browser agent files
- Merge duplicate imports in mcpToolWrapper
- Add sync comment to BrowserAgentCustomConfig
- Update subagents.md Chrome requirement wording
- Regenerate settings docs
2026-02-24 02:05:53 -08:00
Gaurav Ghosh c1560d99fd fix: update browser agent description to encourage full task delegation
Updated the browser_agent description from a primitive-focused listing
(navigating, filling, clicking) to a goal-oriented description that
emphasizes autonomy, multi-step reasoning, and dynamic feedback
interpretation. This encourages the parent agent to delegate entire
tasks in a single call rather than micromanaging individual browser
actions.
2026-02-23 13:29:39 -08:00
Gaurav Ghosh 377186d831 feat(browser): default persistent profile to ~/.gemini/cli-browser-profile 2026-02-23 12:25:46 -08:00
Gaurav Ghosh 64853dbfde refactor: Introduce dedicated browser agent configuration with session mode, headless, profile path, and visual model settings. 2026-02-23 12:06:23 -08:00
Gaurav Ghosh 6732115859 fix: Add LlmRole.UTILITY_TOOL to analyzeScreenshot function calls. 2026-02-23 11:52:50 -08:00
Gaurav Ghosh 7718709f01 fix(browser): exclude visual prompt section when vision is disabled
The system prompt always included the VISUAL IDENTIFICATION section
telling the model about analyze_screenshot, even when visualModel was
not configured. This caused the model to attempt calling the tool
despite it not being registered.

- Convert BROWSER_SYSTEM_PROMPT to buildBrowserSystemPrompt(visionEnabled)
- Pass vision state from factory to definition builder
- Remove analyze_screenshot reference from click_at tool description
- Add tests for conditional prompt inclusion/exclusion
- Fix misleading test comment about tool count
2026-02-23 11:52:49 -08:00
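
A minimal sketch of the conditional prompt described in the commit above, assuming the prompt is assembled from string sections. The section texts are hypothetical placeholders; only the names buildBrowserSystemPrompt, visionEnabled, and analyze_screenshot come from the commit message.

```typescript
// Hypothetical prompt fragments; the real prompt text is not shown in this log.
const BASE_PROMPT =
  'You are a browser agent. Use the accessibility tree to act on pages.';
const VISUAL_SECTION =
  'VISUAL IDENTIFICATION: call analyze_screenshot when an element cannot be found in the accessibility tree.';

// Build the system prompt, including the visual section only when vision is
// enabled, so the model is never told about a tool that is not registered.
function buildBrowserSystemPrompt(visionEnabled: boolean): string {
  const sections = [BASE_PROMPT];
  if (visionEnabled) {
    sections.push(VISUAL_SECTION);
  }
  return sections.join('\n\n');
}
```

With this shape, the factory only needs to pass a single boolean derived from whether visualModel is configured.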
Gaurav Ghosh 4e2856c4dd feat(browser): add submitKey param to type_text and improve connection errors
- Add submitKey parameter to type_text tool for pressing Enter/Tab/etc
  after typing, eliminating a separate model round-trip per value entry
- Update system prompt and tool hints to guide model toward type_text
  with submitKey instead of per-character press_key calls
- Refactor connection error handling into createConnectionError() with
  session-mode-aware remediation messages for profile locks, timeouts,
  and generic failures
- Update terminal failure prompts to pass through error remediation
  verbatim instead of hardcoding instructions
- Add tests for profile-lock, timeout, and generic connection errors
2026-02-23 11:52:49 -08:00
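
The type_text behavior with submitKey could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: the PressKey callback type and the typeText signature are hypothetical stand-ins for the wrapped press_key MCP call.

```typescript
// Hypothetical signature for the underlying single-key tool call.
type PressKey = (key: string) => Promise<void>;

// Type a whole string with one press_key call per character, then optionally
// press a submit key (e.g. "Enter" or "Tab"). The model issues one tool call
// instead of one call per character plus one for the submit key.
async function typeText(
  pressKey: PressKey,
  text: string,
  submitKey?: string,
): Promise<void> {
  for (const char of text) {
    await pressKey(char);
  }
  if (submitKey !== undefined) {
    await pressKey(submitKey);
  }
}
```

Keys are awaited one at a time so that the page receives them in order.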
Gaurav Ghosh 067d0ecab3 fix: update chrome-devtools-mcp dependency, and add transport error handling. 2026-02-23 11:52:49 -08:00
Gaurav Ghosh fb1b2891cc feat(browser): gate vision on visualModel setting
Vision (screenshot analysis + coordinate-based interactions) is now
disabled by default. Set visualModel in browser_agent customConfig
to enable it, e.g. visualModel: 'gemini-2.5-computer-use-preview-10-2025'.
2026-02-23 11:52:48 -08:00
Gaurav Ghosh 2bc2945d14 feat(browser-agent): add type_text composite tool and improve prompt
- Add custom type_text tool that types a full string by internally
  calling press_key for each character, turning N model round-trips
  into 1. Dramatically speeds up text input in complex web apps.

- Move tool-specific usage rules from system prompt to individual
  tool descriptions via augmentToolDescription() for better
  organization and token efficiency.

- Add terminal failure handling instructions to system prompt
  (Chrome connection errors, browser crashes, repeated errors)
  with specific remediation steps.

- Add complex web app guidance (spreadsheets, rich editors) to
  system prompt, recommending type_text + keyboard navigation.

- Fix augmentToolDescription key ordering so more-specific keys
  (fill_form, click_at) match before shorter keys (fill, click).

- Remove non-existent tool references (scroll, type_text as MCP tool)
  and add click_at hint for vision tool.
2026-02-23 11:52:48 -08:00
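
The key-ordering fix for augmentToolDescription in the commit above can be illustrated with a small sketch. It assumes hints are looked up by substring match on the tool name, which this log does not confirm; the hint table and its texts are hypothetical.

```typescript
// Hypothetical hint table; the real hints are not shown in this log.
const TOOL_HINTS: Record<string, string> = {
  fill: 'Use fill for a single input.',
  fill_form: 'Use fill_form to fill several inputs at once.',
  click: 'Use click with an element uid.',
  click_at: 'Use click_at with coordinates.',
};

// Try longer (more specific) keys first so that "fill_form" is not
// captured by the shorter key "fill".
const ORDERED_KEYS = Object.keys(TOOL_HINTS).sort(
  (a, b) => b.length - a.length,
);

// Append the first matching hint to the description, or return the
// description unchanged when no key matches.
function augmentToolDescription(toolName: string, description: string): string {
  const key = ORDERED_KEYS.find((k) => toolName.includes(k));
  return key === undefined ? description : `${description} ${TOOL_HINTS[key]}`;
}
```

Sorting by key length is one simple way to get specific-before-general matching; an explicitly ordered list of keys would work equally well.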
Gaurav Ghosh 1c8a37379b fix(browser): correct session mode CLI flags and add connection validation
Fix chrome-devtools-mcp CLI flags:
- --existing (invalid) → --autoConnect for existing session mode
- --profile-path (invalid) → --userDataDir for custom profile path
- Default session mode changed from 'isolated' to 'persistent'

Add 'persistent' session mode (new default) which uses a persistent
Chrome profile at ~/.cache/chrome-devtools-mcp/chrome-profile.

Add connection timeout and actionable error for 'existing' mode when
Chrome remote debugging is not enabled.
2026-02-23 11:52:47 -08:00
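
A hedged sketch of the flag mapping described in the commit above. Only --autoConnect and --userDataDir are taken from the commit message; the option names, the function name buildMcpArgs, and passing the profile path as a separate argument are assumptions. The flag for 'isolated' mode is not named in this log, so it is left out.

```typescript
type SessionMode = 'persistent' | 'isolated' | 'existing';

// Hypothetical options shape for illustration.
interface BrowserLaunchOptions {
  sessionMode?: SessionMode;
  profilePath?: string;
}

// Translate session settings into chrome-devtools-mcp CLI arguments.
function buildMcpArgs(options: BrowserLaunchOptions = {}): string[] {
  // 'persistent' is the default session mode per the commit message.
  const mode: SessionMode = options.sessionMode ?? 'persistent';
  const args: string[] = [];
  if (mode === 'existing') {
    // Attach to an already-running Chrome with remote debugging enabled.
    args.push('--autoConnect');
  } else if (options.profilePath !== undefined) {
    // Custom profile location (assumed to be passed as a separate argument).
    args.push('--userDataDir', options.profilePath);
  }
  // 'isolated' handling is intentionally omitted: this log does not name its flag.
  return args;
}
```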
Gaurav Ghosh 1620c7d82f feat(browser): implement visual agent for coordinate-based interactions
Implement the visual agent using the LocalAgentDefinition pattern:
- VisualAgentDefinition: Agent metadata for coordinate-based visual tasks
- delegateToVisualAgent.ts: Tool for semantic agent to delegate visual tasks
- Uses gemini-2.5-computer-use-preview-10-2025 model for Computer Use capability

The visual agent handles tasks requiring visual identification or precise
coordinate-based actions that cannot be done via the accessibility tree.
2026-02-23 11:52:47 -08:00
Gaurav Ghosh f4100baf6b feat(browser): implement browser agent as LocalAgentDefinition
Implement the browser agent using the LocalAgentDefinition pattern:
- BrowserAgentDefinition: Agent metadata and prompt configuration
- BrowserAgentInvocation: Handles individual browser agent invocations
- BrowserAgentFactory: Creates agent definitions with dynamic MCP tools
- BrowserManager: Manages chrome-devtools-mcp connection lifecycle

Uses getBrowserAgentConfig() to read settings from agents.overrides.browser_agent
2026-02-23 11:52:47 -08:00
Gaurav Ghosh 0b93c868e9 feat(browser): add browser agent settings schema
Add extensible browser agent configuration using the agents.overrides pattern:
- Extended AgentOverride interface with customConfig field for agent-specific settings
- Added BrowserAgentCustomConfig type for browser-specific configuration
- Added getAgentOverride() and getBrowserAgentConfig() methods to Config class
- Settings configured via agents.overrides.browser_agent.customConfig
- Updated settings schema with customConfig in AgentOverride definition

This establishes the foundational pattern for configuring the browser agent
through the standard agents.overrides infrastructure.
2026-02-23 11:52:46 -08:00
Aishanee Shah 7cfbb6fb71 feat(core): optimize tool descriptions and schemas for Gemini 3 (#19643) 2026-02-23 19:27:35 +00:00
Jerop Kipruto 347f3fe7e4 feat(policy): Support MCP Server Wildcards in Policy Engine (#20024) 2026-02-23 19:07:06 +00:00
Himanshu Soni 774ae220be fix(core): prevent state corruption in McpClientManager during collis (#19782) 2026-02-23 18:35:31 +00:00
Tommaso Sciortino 813e0c18ac Allow ask headers longer than 16 chars (#20041) 2026-02-23 18:26:59 +00:00
Sri Pasumarthi 3966f3c053 feat: Map tool kinds to explicit ACP.ToolKind values and update test … (#19547) 2026-02-23 18:22:05 +00:00
sinisterchill 2e3cbd6363 fix(core): prevent OAuth server crash on unexpected requests (#19668) 2026-02-23 18:03:31 +00:00
Adib234 8b1dc15182 fix(plan): allow plan mode writes on Windows and fix prompt paths (#19658) 2026-02-23 17:48:50 +00:00
owenofbrien fa9aee2bf0 Fix for silent failures in non-interactive mode (#19905) 2026-02-23 17:35:13 +00:00
Sehoon Shon aa9163da60 feat(core): add policy chain support for Gemini 3.1 (#19991) 2026-02-23 15:13:48 +00:00
Sehoon Shon ec0f23ae03 fix(core): increase default retry attempts and add quota error backoff (#19949) 2026-02-23 15:13:34 +00:00
nityam ac04c388e0 Fix: Persist manual model selection on restart #19864 (#19891) 2026-02-23 03:44:00 +00:00
Abhi 621ddbe744 refactor(core): move session conversion logic to core (#19972) 2026-02-23 01:18:07 +00:00
Sehoon Shon c537fd5aec refactor(config): remove enablePromptCompletion from settings (#19974) 2026-02-22 19:10:20 -05:00
Shivangi Sharma a91bc60e18 fix(core): add uniqueness guard to edit tool (#19890)
Co-authored-by: Bryan Morgan <bryanmorgan@google.com>
2026-02-22 20:24:58 +00:00
Nick Salerni faa1ec3044 fix(core): prevent omission placeholder deletions in replace/write_file (#19870)
Co-authored-by: Bryan Morgan <bryanmorgan@google.com>
2026-02-22 19:58:31 +00:00
Bryan Morgan d96bd05d36 fix(core): allow any preview model in quota access check (#19867) 2026-02-22 12:53:24 +00:00
Adib234 84666e1bbc fix(plan): time share by approval mode dashboard reporting negative time shares (#19847) 2026-02-22 00:32:57 +00:00
N. Taylor Mullen a7d851146a feat(core): remove unnecessary login verbiage from Code Assist auth (#19861) 2026-02-21 21:55:11 +00:00
Abhi acb7f577de chore(lint): fix lint errors seen when running npm run lint (#19844) 2026-02-21 18:33:25 +00:00
Abhi d2d345f41a fix(cli): filter subagent sessions from resume history (#19698) 2026-02-21 17:41:27 +00:00
Christian Gunderman dfd7721e69 Disallow unsafe returns. (#19767) 2026-02-21 01:12:56 +00:00
matt korwel 09218572d0 refactor(core): remove unsafe type assertions in error utils (Phase 1.1) (#19750) 2026-02-21 01:00:57 +00:00
Christian Gunderman 5d98ed5820 Utilize pipelining of grep_search -> read_file to eliminate turns (#19574) 2026-02-21 00:36:10 +00:00
Jarrod Whelan 727f9b67b1 feat(cli): improve CTRL+O experience for both standard and alternate screen buffer (ASB) modes (#19010)
Co-authored-by: jacob314 <jacob314@gmail.com>
2026-02-21 00:26:11 +00:00
Adam Weidman 547f5d45f5 feat(core): migrate read_file to 1-based start_line/end_line parameters (#19526) 2026-02-20 22:59:18 +00:00
Christian Gunderman 58d637f919 Disallow and suppress unsafe assignment (#19736) 2026-02-20 22:28:55 +00:00
Sehoon Shon b746524a1b fix(cli): re-enable CLI banner (#19741) 2026-02-20 22:21:26 +00:00
Abhijit Balaji c5baf39dbd feat(policy): repurpose "Always Allow" persistence to workspace level (#19707) 2026-02-20 22:07:20 +00:00
Sehoon Shon b48970da15 fix(cli): use getDisplayString for manual model selection in dialog (#19726) 2026-02-20 22:03:32 +00:00
Jacob Richman 9a8e5d3940 fix(cli): extensions dialog UX polish (#19685) 2026-02-20 21:08:24 +00:00
Jacob Richman 089aec8b8d feat(cli): make JetBrains warning more specific (#19687) 2026-02-20 21:06:35 +00:00
Christian Gunderman b7555ab1e1 Fix unsafe assertions in code_assist folder. (#19706)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-20 20:44:23 +00:00
Emily Hedlund c04602f209 fix(core): restore auth consent in headless mode and add unit tests (#19689) 2026-02-20 20:31:43 +00:00
Emily Hedlund a01d7e9a05 security: implement deceptive URL detection and disclosure in tool confirmations (#19288) 2026-02-20 20:21:31 +00:00
Emily Hedlund 49b2e76ee1 Revert "feat(ui): add source indicators to slash commands" (#19695) 2026-02-20 20:08:49 +00:00