diff --git a/docs/core/subagents.md b/docs/core/subagents.md index a789e0f741..bfd107071e 100644 --- a/docs/core/subagents.md +++ b/docs/core/subagents.md @@ -120,10 +120,12 @@ Gemini CLI comes with the following built-in subagents: The browser agent requires: -- **Chrome** version 144 or later (any recent stable release will work). -- **Node.js** with `npx` available (used to launch the - [`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp) - server). +- **Chrome** version 144 or later (any recent stable release works). + +The underlying +[`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp) +server is bundled with Gemini CLI and launched automatically — no separate +installation is needed. #### Enabling the browser agent @@ -169,26 +171,58 @@ The available modes are: | `isolated` | Launches Chrome with a temporary profile that is deleted after each session. Use this for clean-state automation. | | `existing` | Attaches to an already-running Chrome instance. You must enable remote debugging first by navigating to `chrome://inspect/#remote-debugging` in Chrome. No new browser process is launched. | +#### First-run consent + +The first time the browser agent is invoked, Gemini CLI displays a consent +dialog. You must accept before the browser session starts. This dialog only +appears once. + #### Configuration reference All browser-specific settings go under `agents.browser` in your `settings.json`. +For full details, see the +[`agents.browser` configuration reference](../reference/configuration.md#agents). -| Setting | Type | Default | Description | -| :------------ | :-------- | :------------- | :---------------------------------------------------------------------------------------------- | -| `sessionMode` | `string` | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. | -| `headless` | `boolean` | `false` | Run Chrome in headless mode (no visible window). | -| `profilePath` | `string` | — | Custom path to a browser profile directory. | -| `visualModel` | `string` | — | Model override for the visual agent (for example, `"gemini-2.5-computer-use-preview-10-2025"`). | +| Setting | Type | Default | Description | +| :------------------------ | :--------- | :------------- | :------------------------------------------------------------------------------ | +| `sessionMode` | `string` | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. | +| `headless` | `boolean` | `false` | Run Chrome in headless mode (no visible window). | +| `profilePath` | `string` | — | Custom path to a browser profile directory. | +| `visualModel` | `string` | — | Model override for the visual agent. | +| `allowedDomains` | `string[]` | — | Restrict navigation to specific domains (for example, `["github.com"]`). | +| `disableUserInput` | `boolean` | `true` | Disable user input on the browser window during automation (non-headless only). | +| `maxActionsPerTask` | `number` | `100` | Maximum tool calls per task. The agent is terminated when the limit is reached. | +| `confirmSensitiveActions` | `boolean` | `false` | Require manual confirmation for `upload_file` and `evaluate_script`. | +| `blockFileUploads` | `boolean` | `false` | Hard-block all file upload requests from the agent. | + +#### Automation overlay and input blocking + +In non-headless mode, the browser agent injects a visual overlay into the +browser window to indicate that automation is in progress. By default, user +input (keyboard and mouse) is also blocked to prevent accidental interference. +You can disable this by setting `disableUserInput` to `false`. #### Security -The browser agent enforces the following security restrictions: +The browser agent enforces several layers of security: -- **Blocked URL patterns:** `file://`, `javascript:`, `data:text/html`, - `chrome://extensions`, and `chrome://settings/passwords` are always blocked. -- **Sensitive action confirmation:** Actions like form filling, file uploads, - and form submissions require user confirmation through the standard policy - engine. +- **Domain restrictions:** When `allowedDomains` is set, the agent can only + navigate to the listed domains (and their subdomains when using `*.` prefix). + Attempting to visit a disallowed domain throws a fatal error that immediately + terminates the agent. The agent also attempts to detect and block the use of + allowed domains as proxies (e.g., via query parameters or fragments) to access + restricted content. +- **Blocked URL patterns:** The underlying MCP server blocks dangerous URL + schemes including `file://`, `javascript:`, `data:text/html`, + `chrome://extensions`, and `chrome://settings/passwords`. +- **Sensitive action confirmation:** Form filling (`fill`, `fill_form`) always + requires user confirmation through the policy engine, regardless of approval + mode. When `confirmSensitiveActions` is `true`, `upload_file` and + `evaluate_script` also require confirmation. +- **File upload blocking:** Set `blockFileUploads` to `true` to hard-block all + file upload requests, preventing the agent from uploading any files. +- **Action rate limiting:** The `maxActionsPerTask` setting (default: 100) + limits the total number of tool calls per task to prevent runaway execution. #### Visual agent