mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-03-30 07:51:07 -07:00
Updated ToC on docs intro; updated title casing to match Google style (#13717)
This commit is contained in:
committed by
GitHub
parent
bde8b78a88
commit
26f050ff10
@@ -1,11 +1,11 @@
|
||||
# Gemini CLI Core: Tools API
|
||||
# Gemini CLI core: Tools API
|
||||
|
||||
The Gemini CLI core (`packages/core`) features a robust system for defining,
|
||||
registering, and executing tools. These tools extend the capabilities of the
|
||||
Gemini model, allowing it to interact with the local environment, fetch web
|
||||
content, and perform various actions beyond simple text generation.
|
||||
|
||||
## Core Concepts
|
||||
## Core concepts
|
||||
|
||||
- **Tool (`tools.ts`):** An interface and base class (`BaseTool`) that defines
|
||||
the contract for all tools. Each tool must have:
|
||||
@@ -32,35 +32,35 @@ content, and perform various actions beyond simple text generation.
|
||||
- `returnDisplay`: A user-friendly string (often Markdown) or a special object
|
||||
(like `FileDiff`) for display in the CLI.
|
||||
|
||||
- **Returning Rich Content:** Tools are not limited to returning simple text.
|
||||
- **Returning rich content:** Tools are not limited to returning simple text.
|
||||
The `llmContent` can be a `PartListUnion`, which is an array that can contain
|
||||
a mix of `Part` objects (for images, audio, etc.) and `string`s. This allows a
|
||||
single tool execution to return multiple pieces of rich content.
|
||||
|
||||
- **Tool Registry (`tool-registry.ts`):** A class (`ToolRegistry`) responsible
|
||||
- **Tool registry (`tool-registry.ts`):** A class (`ToolRegistry`) responsible
|
||||
for:
|
||||
- **Registering Tools:** Holding a collection of all available built-in tools
|
||||
- **Registering tools:** Holding a collection of all available built-in tools
|
||||
(e.g., `ReadFileTool`, `ShellTool`).
|
||||
- **Discovering Tools:** It can also discover tools dynamically:
|
||||
- **Command-based Discovery:** If `tools.discoveryCommand` is configured in
|
||||
- **Discovering tools:** It can also discover tools dynamically:
|
||||
- **Command-based discovery:** If `tools.discoveryCommand` is configured in
|
||||
settings, this command is executed. It's expected to output JSON
|
||||
describing custom tools, which are then registered as `DiscoveredTool`
|
||||
instances.
|
||||
- **MCP-based Discovery:** If `mcp.serverCommand` is configured, the
|
||||
- **MCP-based discovery:** If `mcp.serverCommand` is configured, the
|
||||
registry can connect to a Model Context Protocol (MCP) server to list and
|
||||
register tools (`DiscoveredMCPTool`).
|
||||
- **Providing Schemas:** Exposing the `FunctionDeclaration` schemas of all
|
||||
- **Providing schemas:** Exposing the `FunctionDeclaration` schemas of all
|
||||
registered tools to the Gemini model, so it knows what tools are available
|
||||
and how to use them.
|
||||
- **Retrieving Tools:** Allowing the core to get a specific tool by name for
|
||||
- **Retrieving tools:** Allowing the core to get a specific tool by name for
|
||||
execution.
|
||||
|
||||
## Built-in Tools
|
||||
## Built-in tools
|
||||
|
||||
The core comes with a suite of pre-defined tools, typically found in
|
||||
`packages/core/src/tools/`. These include:
|
||||
|
||||
- **File System Tools:**
|
||||
- **File system tools:**
|
||||
- `LSTool` (`ls.ts`): Lists directory contents.
|
||||
- `ReadFileTool` (`read-file.ts`): Reads the content of a single file.
|
||||
- `WriteFileTool` (`write-file.ts`): Writes content to a file.
|
||||
@@ -70,26 +70,26 @@ The core comes with a suite of pre-defined tools, typically found in
|
||||
requiring confirmation).
|
||||
- `ReadManyFilesTool` (`read-many-files.ts`): Reads and concatenates content
|
||||
from multiple files or glob patterns (used by the `@` command in CLI).
|
||||
- **Execution Tools:**
|
||||
- **Execution tools:**
|
||||
- `ShellTool` (`shell.ts`): Executes arbitrary shell commands (requires
|
||||
careful sandboxing and user confirmation).
|
||||
- **Web Tools:**
|
||||
- **Web tools:**
|
||||
- `WebFetchTool` (`web-fetch.ts`): Fetches content from a URL.
|
||||
- `WebSearchTool` (`web-search.ts`): Performs a web search.
|
||||
- **Memory Tools:**
|
||||
- **Memory tools:**
|
||||
- `MemoryTool` (`memoryTool.ts`): Interacts with the AI's memory.
|
||||
|
||||
Each of these tools extends `BaseTool` and implements the required methods for
|
||||
its specific functionality.
|
||||
|
||||
## Tool Execution Flow
|
||||
## Tool execution flow
|
||||
|
||||
1. **Model Request:** The Gemini model, based on the user's prompt and the
|
||||
1. **Model request:** The Gemini model, based on the user's prompt and the
|
||||
provided tool schemas, decides to use a tool and returns a `FunctionCall`
|
||||
part in its response, specifying the tool name and arguments.
|
||||
2. **Core Receives Request:** The core parses this `FunctionCall`.
|
||||
3. **Tool Retrieval:** It looks up the requested tool in the `ToolRegistry`.
|
||||
4. **Parameter Validation:** The tool's `validateToolParams()` method is
|
||||
2. **Core receives request:** The core parses this `FunctionCall`.
|
||||
3. **Tool retrieval:** It looks up the requested tool in the `ToolRegistry`.
|
||||
4. **Parameter validation:** The tool's `validateToolParams()` method is
|
||||
called.
|
||||
5. **Confirmation (if needed):**
|
||||
- The tool's `shouldConfirmExecute()` method is called.
|
||||
@@ -99,27 +99,27 @@ its specific functionality.
|
||||
6. **Execution:** If validated and confirmed (or if no confirmation is needed),
|
||||
the core calls the tool's `execute()` method with the provided arguments and
|
||||
an `AbortSignal` (for potential cancellation).
|
||||
7. **Result Processing:** The `ToolResult` from `execute()` is received by the
|
||||
7. **Result processing:** The `ToolResult` from `execute()` is received by the
|
||||
core.
|
||||
8. **Response to Model:** The `llmContent` from the `ToolResult` is packaged as
|
||||
8. **Response to model:** The `llmContent` from the `ToolResult` is packaged as
|
||||
a `FunctionResponse` and sent back to the Gemini model so it can continue
|
||||
generating a user-facing response.
|
||||
9. **Display to User:** The `returnDisplay` from the `ToolResult` is sent to
|
||||
9. **Display to user:** The `returnDisplay` from the `ToolResult` is sent to
|
||||
the CLI to show the user what the tool did.
|
||||
|
||||
## Extending with Custom Tools
|
||||
## Extending with custom tools
|
||||
|
||||
While direct programmatic registration of new tools by users isn't explicitly
|
||||
detailed as a primary workflow in the provided files for typical end-users, the
|
||||
architecture supports extension through:
|
||||
|
||||
- **Command-based Discovery:** Advanced users or project administrators can
|
||||
- **Command-based discovery:** Advanced users or project administrators can
|
||||
define a `tools.discoveryCommand` in `settings.json`. This command, when run
|
||||
by the Gemini CLI core, should output a JSON array of `FunctionDeclaration`
|
||||
objects. The core will then make these available as `DiscoveredTool`
|
||||
instances. The corresponding `tools.callCommand` would then be responsible for
|
||||
actually executing these custom tools.
|
||||
- **MCP Server(s):** For more complex scenarios, one or more MCP servers can be
|
||||
- **MCP server(s):** For more complex scenarios, one or more MCP servers can be
|
||||
set up and configured via the `mcpServers` setting in `settings.json`. The
|
||||
Gemini CLI core can then discover and use tools exposed by these servers. As
|
||||
mentioned, if you have multiple MCP servers, the tool names will be prefixed
|
||||
|
||||
Reference in New Issue
Block a user