mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-06-10 11:12:35 -07:00
feat: document the experimental browser agent, its configuration, session modes, and security.
@@ -126,6 +126,17 @@ they appear in the UI.
| --------------------------------- | ------------------------------ | --------------------------------------------- | ------- |
| Auto Configure Max Old Space Size | `advanced.autoConfigureMemory` | Automatically configure Node.js memory limits | `false` |

### Agents

| UI Label | Setting | Description | Default |
| -------------------- | --------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | -------------- |
| Enable Browser Agent | `agents.overrides.browser_agent.enabled` | Enable the browser automation sub-agent. | `false` |
| Session Mode | `agents.overrides.browser_agent.customConfig.sessionMode` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. | `"persistent"` |
| Headless | `agents.overrides.browser_agent.customConfig.headless` | Run Chrome in headless mode (no visible window). | `false` |
| Chrome Profile Path | `agents.overrides.browser_agent.customConfig.chromeProfilePath` | Custom path to a Chrome profile directory. | `undefined` |
| Visual Model | `agents.overrides.browser_agent.customConfig.visualModel` | Model override for visual analysis (for example, `"gemini-2.5-computer-use-preview-10-2025"`). | `undefined` |
| Allowed Domains | `agents.overrides.browser_agent.customConfig.allowedDomains` | Restrict navigation to these domain patterns. Supports `*` wildcards. If empty, all non-blocked URLs are allowed. | `[]` |
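
As a rough illustration of how the setting paths in the Agents table might look
in `settings.json`, the sketch below writes them as nested objects. It assumes
that each dotted path maps directly to nested keys. The domain patterns are
placeholder values, and the exact wildcard matching rules are not specified
here.

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true,
        "customConfig": {
          "sessionMode": "isolated",
          "headless": true,
          "allowedDomains": ["example.com", "*.example.com"]
        }
      }
    }
  }
}
```

With a non-empty `allowedDomains` list such as this one, navigation would be
restricted to matching domains. Leaving the list empty allows all URLs that are
not otherwise blocked.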

### Experimental

| UI Label | Setting | Description | Default |

@@ -80,6 +80,119 @@ Gemini CLI comes with the following built-in subagents:
  invoked by the user.
- **Configuration:** Enabled by default. No specific configuration options.

### Browser Agent (experimental)

- **Name:** `browser_agent`
- **Purpose:** Automate web browser tasks such as navigating websites, filling
  forms, clicking buttons, and extracting information from web pages, using the
  accessibility tree.
- **When to use:** "Go to example.com and fill out the contact form," "Extract
  the pricing table from this page," "Click the login button and enter my
  credentials."

> **Note:** This is a preview feature currently under active development.

#### Prerequisites

The browser agent requires:

- **Chrome** version 144 or later installed on your system.
- **Node.js** with `npx` available (used to launch the
  [`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp)
  server).

#### Enabling the browser agent

The browser agent is disabled by default. Enable it in your `settings.json`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    }
  }
}
```

#### Session modes

The `sessionMode` setting controls how Chrome is launched and managed. Set it
under `agents.browser`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "sessionMode": "persistent"
    }
  }
}
```

The available modes are:

| Mode | Description |
| :----------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `persistent` | **(Default)** Launches Chrome with a persistent profile stored at `~/.cache/chrome-devtools-mcp/`. Cookies, history, and settings are preserved between sessions. |
| `isolated` | Launches Chrome with a temporary profile that is deleted after each session. Use this for clean-state automation. |
| `existing` | Attaches to an already-running Chrome instance. You must enable remote debugging first by navigating to `chrome://inspect/#remote-debugging` in Chrome. No new browser process is launched. |

#### Configuration reference

All browser-specific settings go under `agents.browser` in your `settings.json`.

| Setting | Type | Default | Description |
| :------------ | :-------- | :------------- | :---------------------------------------------------------------------------------------------- |
| `sessionMode` | `string` | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. |
| `headless` | `boolean` | `false` | Run Chrome in headless mode (no visible window). |
| `profilePath` | `string` | — | Custom path to a browser profile directory. |
| `visualModel` | `string` | — | Model override for the visual agent (for example, `"gemini-2.5-computer-use-preview-10-2025"`). |
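
As a minimal sketch of how these settings could be combined, the fragment below
configures a clean-state run with no visible window. It follows this section in
placing the keys under `agents.browser`, and it assumes that `isolated` and
`headless` can be used together, which this section does not state explicitly.

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "sessionMode": "isolated",
      "headless": true
    }
  }
}
```

Because `profilePath` and `visualModel` are omitted here, they would keep their
unset defaults.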

#### Security

The browser agent enforces the following security restrictions:

- **Blocked URL patterns:** `file://`, `javascript:`, `data:text/html`,
  `chrome://extensions`, and `chrome://settings/passwords` are always blocked.
- **Sensitive action confirmation:** Actions like form filling, file uploads,
  and form submissions require user confirmation through the standard policy
  engine.

#### Visual agent

By default, the browser agent interacts with pages through the accessibility
tree using element `uid` values. For tasks that require visual identification
(for example, "click the yellow button" or "find the red error message"), you
can enable the visual agent by setting a `visualModel`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "visualModel": "gemini-2.5-computer-use-preview-10-2025"
    }
  }
}
```

When enabled, the agent gains access to the `analyze_screenshot` tool, which
captures a screenshot and sends it to the vision model for analysis. The model
returns coordinates and element descriptions that the browser agent uses with
the `click_at` tool for precise, coordinate-based interactions.

## Creating custom subagents

You can create your own subagents to automate specific workflows or enforce

@@ -196,6 +196,7 @@
{
"label": "resources_tab",
"items": [
{
{
"label": "Resources",
"items": [
@@ -215,6 +216,7 @@
{ "label": "Uninstall", "slug": "docs/resources/uninstall" }
]
}

]
},
{

@@ -52,6 +52,9 @@ These tools help the model manage its plan and interact with you.
  complex plans.
- **[Agent Skills](../cli/skills.md) (`activate_skill`):** Loads specialized
  procedural expertise when needed.
- **[Browser agent](../core/subagents.md#browser-agent-experimental)
  (`browser_agent`):** Automates web browser tasks through the accessibility
  tree.
- **Internal docs (`get_internal_docs`):** Accesses Gemini CLI's own
  documentation to help answer your questions.