mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-03-10 22:21:22 -07:00
308 lines
12 KiB
Markdown
# Subagents (experimental)

Subagents are specialized agents that operate within your main Gemini CLI
session. They are designed to handle specific, complex tasks (such as deep
codebase analysis, documentation lookup, or domain-specific reasoning) without
cluttering the main agent's context or toolset.

> **Note: Subagents are currently an experimental feature.**
>
> To use custom subagents, you must explicitly enable them in your
> `settings.json`:
>
> ```json
> {
>   "experimental": { "enableAgents": true }
> }
> ```
>
> **Warning:** Subagents currently operate in
> ["YOLO mode"](../reference/configuration.md#command-line-arguments), meaning
> they may execute tools without individual user confirmation for each step.
> Proceed with caution when defining agents with powerful tools like
> `run_shell_command` or `write_file`.

## What are subagents?

Subagents are "specialists" that the main Gemini agent can hire for a specific
job.

- **Focused context:** Each subagent has its own system prompt and persona.
- **Specialized tools:** Subagents can have a restricted or specialized set of
  tools.
- **Independent context window:** Interactions with a subagent happen in a
  separate context loop, which saves tokens in your main conversation history.

Subagents are exposed to the main agent as a tool of the same name. When the
main agent calls the tool, it delegates the task to the subagent. Once the
subagent completes its task, it reports back to the main agent with its
findings.

## Built-in subagents

Gemini CLI comes with the following built-in subagents:

### Codebase Investigator

- **Name:** `codebase_investigator`
- **Purpose:** Analyze the codebase, reverse engineer, and understand complex
  dependencies.
- **When to use:** "How does the authentication system work?", "Map out the
  dependencies of the `AgentRegistry` class."
- **Configuration:** Enabled by default. You can configure it in
  `settings.json`. Example (forcing a specific model):

  ```json
  {
    "experimental": {
      "codebaseInvestigatorSettings": {
        "enabled": true,
        "maxNumTurns": 20,
        "model": "gemini-2.5-pro"
      }
    }
  }
  ```

### CLI Help Agent

- **Name:** `cli_help`
- **Purpose:** Get expert knowledge about Gemini CLI itself, its commands,
  configuration, and documentation.
- **When to use:** "How do I configure a proxy?", "What does the `/rewind`
  command do?"
- **Configuration:** Enabled by default.

### Generalist Agent

- **Name:** `generalist_agent`
- **Purpose:** Route tasks to the appropriate specialized subagent.
- **When to use:** Implicitly used by the main agent for routing. Not directly
  invoked by the user.
- **Configuration:** Enabled by default. No specific configuration options.

### Browser Agent (experimental)

- **Name:** `browser_agent`
- **Purpose:** Automate web browser tasks (navigating websites, filling forms,
  clicking buttons, and extracting information from web pages) using the
  accessibility tree.
- **When to use:** "Go to example.com and fill out the contact form," "Extract
  the pricing table from this page," "Click the login button and enter my
  credentials."

> **Note:** This is a preview feature currently under active development.

#### Prerequisites

The browser agent requires:

- **Chrome** version 144 or later (any recent stable release will work).
- **Node.js** with `npx` available (used to launch the
  [`chrome-devtools-mcp`](https://www.npmjs.com/package/chrome-devtools-mcp)
  server).

#### Enabling the browser agent

The browser agent is disabled by default. Enable it in your `settings.json`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    }
  }
}
```

#### Session modes

The `sessionMode` setting controls how Chrome is launched and managed. Set it
under `agents.browser`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "sessionMode": "persistent"
    }
  }
}
```

The available modes are:

| Mode         | Description |
| :----------- | :---------- |
| `persistent` | **(Default)** Launches Chrome with a persistent profile stored at `~/.gemini/cli-browser-profile/`. Cookies, history, and settings are preserved between sessions. |
| `isolated`   | Launches Chrome with a temporary profile that is deleted after each session. Use this for clean-state automation. |
| `existing`   | Attaches to an already-running Chrome instance. You must enable remote debugging first by navigating to `chrome://inspect/#remote-debugging` in Chrome. No new browser process is launched. |

#### Configuration reference

All browser-specific settings go under `agents.browser` in your `settings.json`.

| Setting       | Type      | Default        | Description |
| :------------ | :-------- | :------------- | :---------- |
| `sessionMode` | `string`  | `"persistent"` | How Chrome is managed: `"persistent"`, `"isolated"`, or `"existing"`. |
| `headless`    | `boolean` | `false`        | Run Chrome in headless mode (no visible window). |
| `profilePath` | `string`  | (none)         | Custom path to a browser profile directory. |
| `visualModel` | `string`  | (none)         | Model override for the visual agent (for example, `"gemini-2.5-computer-use-preview-10-2025"`). |

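The table above can be turned into a small pre-flight check for an
`agents.browser` object before you save it. The following is a minimal Python
sketch, not part of Gemini CLI: the function name `check_browser_settings` and
the wording of the messages are hypothetical, and only the setting names,
types, and `sessionMode` values come from the table.

```python
from typing import Any

# Setting names and types transcribed from the configuration reference table.
BROWSER_SETTING_TYPES: dict[str, type] = {
    "sessionMode": str,
    "headless": bool,
    "profilePath": str,
    "visualModel": str,
}
SESSION_MODES = {"persistent", "isolated", "existing"}


def check_browser_settings(browser: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an `agents.browser` object.

    Hypothetical helper for illustration; Gemini CLI performs its own
    validation, which may differ from this sketch.
    """
    problems = []
    for key, value in browser.items():
        expected = BROWSER_SETTING_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown setting: {key}")
        elif not isinstance(value, expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    # "persistent" is the documented default when sessionMode is omitted.
    mode = browser.get("sessionMode", "persistent")
    if isinstance(mode, str) and mode not in SESSION_MODES:
        problems.append(f"unsupported sessionMode: {mode}")
    return problems
```

An empty list means the object only uses documented settings with the
documented types, for example
`check_browser_settings({"sessionMode": "isolated", "headless": True})`.
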
#### Security

The browser agent enforces the following security restrictions:

- **Blocked URL patterns:** `file://`, `javascript:`, `data:text/html`,
  `chrome://extensions`, and `chrome://settings/passwords` are always blocked.
- **Sensitive action confirmation:** Actions like form filling, file uploads,
  and form submissions require user confirmation through the standard policy
  engine.

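To get a feel for which URLs the blocked patterns cover, you can model them as
simple prefixes. This is only a rough Python sketch: it assumes
case-insensitive prefix matching, which this page does not confirm, and the
helper name `is_blocked_url` is hypothetical. The browser agent's real matching
rules may be stricter or looser.

```python
# Patterns copied from the list above; prefix matching is an assumption.
BLOCKED_URL_PREFIXES = (
    "file://",
    "javascript:",
    "data:text/html",
    "chrome://extensions",
    "chrome://settings/passwords",
)


def is_blocked_url(url: str) -> bool:
    """Return True if the URL starts with one of the blocked patterns."""
    normalized = url.strip().lower()
    # str.startswith accepts a tuple and checks each prefix in turn.
    return normalized.startswith(BLOCKED_URL_PREFIXES)
```

Under these assumptions, `is_blocked_url("file:///etc/passwd")` is `True` and
`is_blocked_url("https://example.com")` is `False`.
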
#### Visual agent

By default, the browser agent interacts with pages through the accessibility
tree using element `uid` values. For tasks that require visual identification
(for example, "click the yellow button" or "find the red error message"), you
can enable the visual agent by setting a `visualModel`:

```json
{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "visualModel": "gemini-2.5-computer-use-preview-10-2025"
    }
  }
}
```

When enabled, the agent gains access to the `analyze_screenshot` tool, which
captures a screenshot and sends it to the vision model for analysis. The model
returns coordinates and element descriptions that the browser agent uses with
the `click_at` tool for precise, coordinate-based interactions.

> **Note:** The visual agent requires API key or Vertex AI authentication. It is
> not available when using "Sign in with Google".

## Creating custom subagents

You can create your own subagents to automate specific workflows or enforce
specific personas. To use custom subagents, you must enable them in your
`settings.json`:

```json
{
  "experimental": {
    "enableAgents": true
  }
}
```

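If your `settings.json` already holds other settings, the flag should be merged
in, not written over them. The following is a minimal Python sketch of that
merge. It assumes the file is plain JSON without comments, and the function
name `enable_custom_agents` is hypothetical. The path is passed in by the
caller because this page does not state where `settings.json` lives.

```python
import json
from pathlib import Path


def enable_custom_agents(settings_path: Path) -> dict:
    """Set experimental.enableAgents to true, keeping all other settings.

    Hypothetical helper; assumes settings_path points at a plain-JSON file
    (or at a file that does not exist yet).
    """
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # Create the "experimental" object only if it is missing.
    settings.setdefault("experimental", {})["enableAgents"] = True
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Existing keys such as `experimental.codebaseInvestigatorSettings` are left
untouched, because only the `enableAgents` entry is assigned.
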
### Agent definition files

Custom agents are defined as Markdown files (`.md`) with YAML frontmatter. You
can place them in:

1. **Project-level:** `.gemini/agents/*.md` (Shared with your team)
2. **User-level:** `~/.gemini/agents/*.md` (Personal agents)

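To see which definition files exist at the two locations above, a short script
is enough. This Python sketch only lists files. The function name
`find_agent_files` is hypothetical, and the sketch makes no claim about which
location takes precedence when two agents share a name, since this page does
not specify that.

```python
from pathlib import Path


def find_agent_files(project_root: Path, home: Path) -> dict[str, list[Path]]:
    """List agent definition files at the two documented locations."""
    return {
        # Project-level agents: <project>/.gemini/agents/*.md
        "project": sorted((project_root / ".gemini" / "agents").glob("*.md")),
        # User-level agents: ~/.gemini/agents/*.md
        "user": sorted((home / ".gemini" / "agents").glob("*.md")),
    }
```

A typical call is `find_agent_files(Path.cwd(), Path.home())`. A missing
directory simply produces an empty list.
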
### File format

The file **MUST** start with YAML frontmatter enclosed in triple-dashes `---`.
The body of the markdown file becomes the agent's **System Prompt**.

**Example: `.gemini/agents/security-auditor.md`**

```markdown
---
name: security-auditor
description: Specialized in finding security vulnerabilities in code.
kind: local
tools:
  - read_file
  - grep_search
model: gemini-2.5-pro
temperature: 0.2
max_turns: 10
---

You are a ruthless Security Auditor. Your job is to analyze code for potential
vulnerabilities.

Focus on:

1. SQL Injection
2. XSS (Cross-Site Scripting)
3. Hardcoded credentials
4. Unsafe file operations

When you find a vulnerability, explain it clearly and suggest a fix. Do not fix
it yourself; just report it.
```

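The two-part layout (frontmatter first, system prompt after) can be checked
with a few lines of code. The following Python sketch separates the two parts
using only the `---` delimiters. It is an illustration, not the loader Gemini
CLI uses, and the name `split_agent_file` is hypothetical. It returns the
frontmatter as raw text, which you would still need to parse with a YAML
library.

```python
def split_agent_file(text: str) -> tuple[str, str]:
    """Split an agent definition into (frontmatter, system_prompt).

    Raises ValueError if the file does not start with '---' or if the
    closing '---' is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("agent file must start with '---' frontmatter")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            # Everything after the closing delimiter is the system prompt.
            system_prompt = "\n".join(lines[index + 1:]).strip()
            return frontmatter, system_prompt
    raise ValueError("frontmatter is missing its closing '---'")
```

Running it on a file before committing it to `.gemini/agents/` catches the most
common mistake, a missing or misplaced delimiter.
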
### Configuration schema

| Field          | Type   | Required | Description |
| :------------- | :----- | :------- | :---------- |
| `name`         | string | Yes      | Unique identifier (slug) used as the tool name for the agent. Only lowercase letters, numbers, hyphens, and underscores. |
| `description`  | string | Yes      | Short description of what the agent does. This is visible to the main agent to help it decide when to call this subagent. |
| `kind`         | string | No       | `local` (default) or `remote`. |
| `tools`        | array  | No       | List of tool names this agent can use. If omitted, it may have access to a default set. |
| `model`        | string | No       | Specific model to use (e.g., `gemini-2.5-pro`). Defaults to `inherit` (uses the main session model). |
| `temperature`  | number | No       | Model temperature (0.0 - 2.0). |
| `max_turns`    | number | No       | Maximum number of conversation turns allowed for this agent before it must return. Defaults to `15`. |
| `timeout_mins` | number | No       | Maximum execution time in minutes. Defaults to `5`. |

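The rules in this table are easy to express as a small check over already
parsed frontmatter. The following Python sketch is a hypothetical helper
(`normalize_agent_config` is not a Gemini CLI function). It enforces the two
required fields, the `name` character set, the `kind` values, and the
temperature range, and it fills in the defaults listed above. How Gemini CLI
itself reports invalid files is not covered here.

```python
import re
from typing import Any

# Lowercase letters, numbers, hyphens, and underscores, per the table.
NAME_PATTERN = re.compile(r"[a-z0-9_-]+")
# Defaults taken from the table above.
DEFAULTS = {"kind": "local", "model": "inherit", "max_turns": 15, "timeout_mins": 5}


def normalize_agent_config(frontmatter: dict[str, Any]) -> dict[str, Any]:
    """Check required fields and fill in the documented defaults."""
    for field in ("name", "description"):
        if not isinstance(frontmatter.get(field), str) or not frontmatter[field]:
            raise ValueError(f"missing required field: {field}")
    if not NAME_PATTERN.fullmatch(frontmatter["name"]):
        raise ValueError(
            "name may only contain lowercase letters, numbers, hyphens, and underscores"
        )
    # Values from the file override the defaults.
    config = {**DEFAULTS, **frontmatter}
    if config["kind"] not in ("local", "remote"):
        raise ValueError("kind must be 'local' or 'remote'")
    temperature = config.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    return config
```

For a definition that sets only `name` and `description`, the returned
dictionary also contains `kind`, `model`, `max_turns`, and `timeout_mins` with
their default values.
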
### Optimizing your subagent

The main agent's system prompt encourages it to use an expert subagent when one
is available. It decides whether an agent is a relevant expert based on the
agent's description. You can make the main agent call your subagent more
reliably by updating the description to indicate more clearly:

- Its area of expertise.
- When it should be used.
- Some example scenarios.

For example, a subagent with the following description should be called fairly
consistently for Git operations.

> Git expert agent which should be used for all local and remote git operations.
> For example:
>
> - Making commits
> - Searching for regressions with bisect
> - Interacting with source control and issues providers such as GitHub.

If you need to tune your subagent further, select the model you want to
optimize for with `/model`, then ask the model why it thinks your subagent was
not called for a specific prompt with the given description.

## Remote subagents (Agent2Agent) (experimental)

Gemini CLI can also delegate tasks to remote subagents using the Agent-to-Agent
(A2A) protocol.

> **Note: Remote subagents are currently an experimental feature.**

See the [Remote Subagents documentation](remote-agents) for detailed
configuration and usage instructions.

## Extension subagents

Extensions can bundle and distribute subagents. See the
[Extensions documentation](../extensions/index.md#subagents) for details on how
to package agents within an extension.