mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-04-02 17:31:05 -07:00
docs(browser-agent): update stale browser agent documentation (#24463)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@@ -120,10 +120,12 @@ Gemini CLI comes with the following built-in subagents:
 The browser agent requires:

-- **Chrome** version 144 or later (any recent stable release will work).
-- **Node.js** with `npx` available (used to launch the
-  [`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp)
-  server).
+- **Chrome** version 144 or later (any recent stable release works).
+
+The underlying
+[`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp)
+server is bundled with Gemini CLI and launched automatically — no separate
+installation is needed.

 #### Enabling the browser agent

@@ -169,26 +171,58 @@ The available modes are:
 | `isolated` | Launches Chrome with a temporary profile that is deleted after each session. Use this for clean-state automation. |
 | `existing` | Attaches to an already-running Chrome instance. You must enable remote debugging first by navigating to `chrome://inspect/#remote-debugging` in Chrome. No new browser process is launched. |

+#### First-run consent
+
+The first time the browser agent is invoked, Gemini CLI displays a consent
+dialog. You must accept before the browser session starts. This dialog only
+appears once.
+
 #### Configuration reference

 All browser-specific settings go under `agents.browser` in your `settings.json`.
+For full details, see the
+[`agents.browser` configuration reference](../reference/configuration.md#agents).

-| Setting | Type | Default | Description |
-| :------------ | :-------- | :------------- | :---------------------------------------------------------------------------------------------- |
-| `sessionMode` | `string` | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. |
-| `headless` | `boolean` | `false` | Run Chrome in headless mode (no visible window). |
-| `profilePath` | `string` | — | Custom path to a browser profile directory. |
-| `visualModel` | `string` | — | Model override for the visual agent (for example, `"gemini-2.5-computer-use-preview-10-2025"`). |
+| Setting | Type | Default | Description |
+| :------------------------ | :--------- | :------------- | :------------------------------------------------------------------------------ |
+| `sessionMode` | `string` | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. |
+| `headless` | `boolean` | `false` | Run Chrome in headless mode (no visible window). |
+| `profilePath` | `string` | — | Custom path to a browser profile directory. |
+| `visualModel` | `string` | — | Model override for the visual agent. |
+| `allowedDomains` | `string[]` | — | Restrict navigation to specific domains (for example, `["github.com"]`). |
+| `disableUserInput` | `boolean` | `true` | Disable user input on the browser window during automation (non-headless only). |
+| `maxActionsPerTask` | `number` | `100` | Maximum tool calls per task. The agent is terminated when the limit is reached. |
+| `confirmSensitiveActions` | `boolean` | `false` | Require manual confirmation for `upload_file` and `evaluate_script`. |
+| `blockFileUploads` | `boolean` | `false` | Hard-block all file upload requests from the agent. |

+#### Automation overlay and input blocking
+
+In non-headless mode, the browser agent injects a visual overlay into the
+browser window to indicate that automation is in progress. By default, user
+input (keyboard and mouse) is also blocked to prevent accidental interference.
+You can disable this by setting `disableUserInput` to `false`.
+
 #### Security

-The browser agent enforces the following security restrictions:
+The browser agent enforces several layers of security:

-- **Blocked URL patterns:** `file://`, `javascript:`, `data:text/html`,
-  `chrome://extensions`, and `chrome://settings/passwords` are always blocked.
-- **Sensitive action confirmation:** Actions like form filling, file uploads,
-  and form submissions require user confirmation through the standard policy
-  engine.
+- **Domain restrictions:** When `allowedDomains` is set, the agent can only
+  navigate to the listed domains (and their subdomains when using `*.` prefix).
+  Attempting to visit a disallowed domain throws a fatal error that immediately
+  terminates the agent. The agent also attempts to detect and block the use of
+  allowed domains as proxies (e.g., via query parameters or fragments) to access
+  restricted content.
+- **Blocked URL patterns:** The underlying MCP server blocks dangerous URL
+  schemes including `file://`, `javascript:`, `data:text/html`,
+  `chrome://extensions`, and `chrome://settings/passwords`.
+- **Sensitive action confirmation:** Form filling (`fill`, `fill_form`) always
+  requires user confirmation through the policy engine, regardless of approval
+  mode. When `confirmSensitiveActions` is `true`, `upload_file` and
+  `evaluate_script` also require confirmation.
+- **File upload blocking:** Set `blockFileUploads` to `true` to hard-block all
+  file upload requests, preventing the agent from uploading any files.
+- **Action rate limiting:** The `maxActionsPerTask` setting (default: 100)
+  limits the total number of tool calls per task to prevent runaway execution.

 #### Visual agent

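Reviewer note on the new `agents.browser` settings table: the sketch below restates the table as a TypeScript type and prints a `settings.json` fragment with the documented defaults filled in. The setting names, types, and defaults come from the table. The names `BrowserAgentSettings`, `DOCUMENTED_DEFAULTS`, and `withDocumentedDefaults` are hypothetical, and the merge behavior is an illustration only, not a description of how Gemini CLI resolves settings.

```typescript
// Hypothetical type mirroring the settings table; not taken from the Gemini CLI source.
type SessionMode = "persistent" | "isolated" | "existing";

interface BrowserAgentSettings {
  sessionMode?: SessionMode;
  headless?: boolean;
  profilePath?: string;
  visualModel?: string;
  allowedDomains?: string[];
  disableUserInput?: boolean;
  maxActionsPerTask?: number;
  confirmSensitiveActions?: boolean;
  blockFileUploads?: boolean;
}

// Defaults as listed in the table. Settings without a listed default are left out.
const DOCUMENTED_DEFAULTS: BrowserAgentSettings = {
  sessionMode: "persistent",
  headless: false,
  disableUserInput: true,
  maxActionsPerTask: 100,
  confirmSensitiveActions: false,
  blockFileUploads: false,
};

// Illustrative merge: user-supplied values override the documented defaults.
function withDocumentedDefaults(user: BrowserAgentSettings): BrowserAgentSettings {
  return { ...DOCUMENTED_DEFAULTS, ...user };
}

// Example: a clean-state, headless session restricted to one domain.
const exampleSettings = {
  agents: {
    browser: withDocumentedDefaults({
      sessionMode: "isolated",
      headless: true,
      allowedDomains: ["github.com"],
    }),
  },
};

console.log(JSON.stringify(exampleSettings, null, 2));
```

In a real `settings.json` only the overridden keys would normally be written; the defaults are spelled out here so the effective configuration is visible in one place.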
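Reviewer note on the rewritten Security list: the two URL checks it describes (blocked URL patterns and `allowedDomains` matching) can be sketched roughly as follows. This is a minimal illustration, not the agent's or the MCP server's actual code. The function names are hypothetical. It assumes that a `*.example.com` entry matches `example.com` and any subdomain, which the documentation does not spell out. It does not attempt the proxy detection via query parameters or fragments that the list mentions.

```typescript
// Blocked patterns as listed in the Security section of the diff.
const BLOCKED_URL_PREFIXES: string[] = [
  "file://",
  "javascript:",
  "data:text/html",
  "chrome://extensions",
  "chrome://settings/passwords",
];

// Hypothetical helper: true when the URL starts with one of the blocked patterns.
function isBlockedUrl(rawUrl: string): boolean {
  const normalized = rawUrl.trim().toLowerCase();
  return BLOCKED_URL_PREFIXES.some((prefix) => normalized.startsWith(prefix));
}

// Hypothetical helper: true when the URL's hostname is covered by the allow-list.
// Assumption: "*.example.com" covers example.com and all of its subdomains.
function isDomainAllowed(rawUrl: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // Unparseable URLs are treated as disallowed.
  }
  return allowedDomains.some((entry) => {
    const pattern = entry.toLowerCase();
    if (pattern.startsWith("*.")) {
      const base = pattern.slice(2);
      return host === base || host.endsWith("." + base);
    }
    return host === pattern;
  });
}

console.log(isBlockedUrl("file:///etc/passwd"));
console.log(isDomainAllowed("https://api.github.com/repos", ["*.github.com"]));
console.log(isDomainAllowed("https://evil.example/?next=github.com", ["github.com"]));
```

Matching on the parsed hostname rather than on the raw string is what keeps an allowed domain that appears only in a query parameter from passing the check.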