2025-09-05 15:31:42 -04:00
|
|
|
|
# RFC: Gemini CLI A2A Development-Tool Extension
|
|
|
|
|
|
|
|
|
|
|
|
## 1. Introduction
|
|
|
|
|
|
|
|
|
|
|
|
### 1.1 Overview
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
To standardize client integrations with the Gemini CLI agent, this document
|
|
|
|
|
|
proposes the `development-tool` extension for the A2A protocol.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
Rather than creating a new protocol, this specification builds upon the existing
|
|
|
|
|
|
A2A protocol. As an open-source standard recently adopted by the Linux
|
|
|
|
|
|
Foundation, A2A provides a robust foundation for core concepts like tasks,
|
|
|
|
|
|
messages, and streaming events. This extension-based approach allows us to
|
|
|
|
|
|
leverage A2A's proven architecture while defining the specific capabilities
|
|
|
|
|
|
required for rich, interactive workflows with the Gemini CLI agent.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
### 1.2 Motivation
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
Recent work integrating Gemini CLI with clients like Zed and Gemini Code
|
|
|
|
|
|
Assist’s agent mode has highlighted the need for a robust, standard
|
|
|
|
|
|
communication protocol. Standardizing on A2A provides several key advantages:
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **Solid Foundation**: Provides a robust, open standard that ensures a stable,
|
|
|
|
|
|
predictable, and consistent integration experience across different IDEs and
|
|
|
|
|
|
client surfaces.
|
|
|
|
|
|
- **Extensibility**: Creates a flexible foundation to support new tools and
|
|
|
|
|
|
workflows as they emerge.
|
|
|
|
|
|
- **Ecosystem Alignment**: Aligns Gemini CLI with a growing industry standard,
|
|
|
|
|
|
fostering broader interoperability.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
## 2. Communication Flow
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The interaction follows A2A’s task-based, streaming pattern. The client sends a
|
|
|
|
|
|
`message/stream` request and the agent responds with a `contextId` / `taskId`
|
|
|
|
|
|
and a stream of events. `TaskStatusUpdateEvent` events are used to convey the
|
|
|
|
|
|
overall state of the task. The task is complete when the agent sends a final
|
|
|
|
|
|
`TaskStatusUpdateEvent` with `final: true` and a terminal status like
|
|
|
|
|
|
`completed` or `failed`.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
### 2.1 Asynchronous Responses and Notifications
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
Clients that may disconnect from the agent should supply a
|
|
|
|
|
|
`PushNotificationConfig` to the agent with the initial `message/stream` method
|
|
|
|
|
|
or subsequently with the `tasks/pushNotificationConfig/set` method so that the
|
|
|
|
|
|
agent can call back when updates are ready.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
## 3. The `development-tool` extension
|
|
|
|
|
|
|
|
|
|
|
|
### 3.1 Overview
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The `development-tool` extension establishes a communication contract for
|
|
|
|
|
|
workflows between a client and the Gemini CLI agent. It consists of a
|
|
|
|
|
|
specialized set of schemas, embedded within core A2A data structures, that
|
|
|
|
|
|
enable the agent to stream real-time updates on its state and thought process.
|
|
|
|
|
|
These schemas also provide the mechanism for the agent to request user
|
|
|
|
|
|
permission before executing tools.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Sample Agent Card**
|
|
|
|
|
|
|
|
|
|
|
|
```json
|
|
|
|
|
|
{
|
|
|
|
|
|
"name": "Gemini CLI Agent",
|
|
|
|
|
|
"description": "An agent that generates code based on natural language instructions.",
|
|
|
|
|
|
"capabilities": {
|
|
|
|
|
|
"streaming": true,
|
|
|
|
|
|
"extensions": [
|
|
|
|
|
|
{
|
|
|
|
|
|
"uri": "https://github.com/google-gemini/gemini-cli/blob/main/docs/a2a/developer-profile/v0/spec.md",
|
|
|
|
|
|
"description": "An extension for interactive development tasks, enabling features like code generation, tool usage, and real-time status updates.",
|
|
|
|
|
|
"required": true
|
|
|
|
|
|
}
|
|
|
|
|
|
]
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Versioning**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The agent card `uri` field contains an embedded semantic version. The client
|
|
|
|
|
|
must extract this version to determine compatibility with the agent extension
|
|
|
|
|
|
using the compatibility logic defined in Semantic Versioning 2.0.0 spec.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
### 3.2 Schema Definitions
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This section defines the schemas for the `development-tool` A2A extension,
|
|
|
|
|
|
organized by their function within the communication flow. Note that all custom
|
|
|
|
|
|
objects included in the `metadata` field (e.g. `Message.metadata`) must be keyed
|
|
|
|
|
|
by the unique URI that points to that extension’s spec to prevent naming
|
|
|
|
|
|
collisions with other extensions.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Initialization & Configuration**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The first message in a session must contain an `AgentSettings` object in its
|
|
|
|
|
|
metadata. This object provides the agent with the necessary configuration
|
|
|
|
|
|
information for proper initialization. Additional configuration settings (ex.
|
|
|
|
|
|
MCP servers, allowed tools, etc.) can be added to this message.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Schema**
|
|
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
syntax = "proto3";
|
|
|
|
|
|
|
|
|
|
|
|
// Configuration settings for the Gemini CLI agent.
|
|
|
|
|
|
message AgentSettings {
|
|
|
|
|
|
// The absolute path to the workspace directory where the agent will execute.
|
|
|
|
|
|
string workspace_path = 1;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Agent-to-Client Messages**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
All real-time updates from the agent (including its thoughts, tool calls, and
|
|
|
|
|
|
simple text replies) are streamed to the client as `TaskStatusUpdateEvents`.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
Each Event contains a `Message` object, which holds the content in one of two
|
|
|
|
|
|
formats:
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **TextPart**: Used for standard text messages. This part requires no custom
|
|
|
|
|
|
schema.
|
|
|
|
|
|
- **DataPart**: Used for complex, structured objects. Tool Calls and Thoughts
|
|
|
|
|
|
are sent this way, each using their respective schemas defined below.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Tool Calls**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The `ToolCall` schema is designed to provide a structured representation of a
|
|
|
|
|
|
tool’s execution lifecycle. This protocol defines a clear state machine and
|
|
|
|
|
|
provides detailed schemas for common development tasks (file edits, shell
|
|
|
|
|
|
commands, MCP Tool), ensuring clients can build reliable UIs without being tied
|
|
|
|
|
|
to a specific agent implementation.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The core principle is that the agent sends a `ToolCall` object on every update.
|
|
|
|
|
|
This makes client-side logic stateless and simple.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Tool Call Lifecycle**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
1. **Creation**: The agent sends a `ToolCall` object with `status: PENDING`. If
|
|
|
|
|
|
user permission is required, the `confirmation_request` field will be
|
|
|
|
|
|
populated.
|
|
|
|
|
|
2. **Confirmation**: If the client needs to confirm the message, the client
|
|
|
|
|
|
will send a `ToolCallConfirmation`. If the client responds with a
|
|
|
|
|
|
cancellation, execution will be skipped.
|
|
|
|
|
|
3. **Execution**: Once approved (or if no approval is required), the agent
|
|
|
|
|
|
sends an update with `status: EXECUTING`. It can stream real-time progress
|
|
|
|
|
|
by updating the `live_content` field.
|
|
|
|
|
|
4. **Completion**: The agent sends a final update with the status set to
|
|
|
|
|
|
`SUCCEEDED`, `FAILED`, or `CANCELLED` and populates the appropriate result
|
|
|
|
|
|
field.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Schema**
|
|
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
syntax = "proto3";
|
|
|
|
|
|
|
|
|
|
|
|
import "google/protobuf/struct.proto";
|
|
|
|
|
|
|
2025-11-21 22:59:42 +08:00
|
|
|
|
// ToolCall is the central message representing a tool's execution lifecycle.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
// The entire object is sent from the agent to client on every update.
|
|
|
|
|
|
message ToolCall {
|
|
|
|
|
|
// A unique identifier, assigned by the agent
|
|
|
|
|
|
string tool_call_id = 1;
|
|
|
|
|
|
|
|
|
|
|
|
// The current state of the tool call in its lifecycle
|
|
|
|
|
|
ToolCallStatus status = 2;
|
|
|
|
|
|
|
|
|
|
|
|
// Name of the tool being called (e.g. 'Edit', 'ShellTool')
|
|
|
|
|
|
string tool_name = 3;
|
|
|
|
|
|
|
|
|
|
|
|
// An optional description of the tool call's purpose to show the user
|
|
|
|
|
|
optional string description = 4;
|
|
|
|
|
|
|
|
|
|
|
|
// The structured input params provided by the LLM for tool invocation.
|
|
|
|
|
|
google.protobuf.Struct input_parameters = 5;
|
|
|
|
|
|
|
|
|
|
|
|
// String containing the real-time output from the tool as it executes (primarily designed for shell output).
|
|
|
|
|
|
// During streaming the entire string is replaced on each update
|
|
|
|
|
|
optional string live_content = 6;
|
|
|
|
|
|
|
|
|
|
|
|
// The final result of the tool (used to replace live_content when applicable)
|
|
|
|
|
|
oneof result {
|
|
|
|
|
|
// The output on tool success
|
|
|
|
|
|
ToolOutput output = 7;
|
|
|
|
|
|
// The error details if the tool failed
|
|
|
|
|
|
ErrorDetails error = 8;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// If the tool requires user confirmation, this field will be populated while status is PENDING
|
|
|
|
|
|
optional ConfirmationRequest confirmation_request = 9;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Possible execution status of a ToolCall
|
|
|
|
|
|
enum ToolCallStatus {
|
|
|
|
|
|
STATUS_UNSPECIFIED = 0;
|
|
|
|
|
|
PENDING = 1;
|
|
|
|
|
|
EXECUTING = 2;
|
|
|
|
|
|
SUCCEEDED = 3;
|
|
|
|
|
|
FAILED = 4;
|
|
|
|
|
|
CANCELLED = 5;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2025-11-21 22:59:42 +08:00
|
|
|
|
// ToolOutput represents the final, successful, output of a tool
|
2025-09-05 15:31:42 -04:00
|
|
|
|
message ToolOutput {
|
|
|
|
|
|
oneof result {
|
|
|
|
|
|
string text = 1;
|
|
|
|
|
|
// For ToolCalls which resulted in a file modification
|
|
|
|
|
|
FileDiff diff = 2;
|
|
|
|
|
|
// A generic fallback for any other structured JSON data
|
|
|
|
|
|
google.protobuf.Struct structured_data = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// A structured representation of an error
|
|
|
|
|
|
message ErrorDetails {
|
|
|
|
|
|
// User facing error message
|
|
|
|
|
|
string message = 1;
|
|
|
|
|
|
// Optional agent-specific error type or category (e.g. read_content_failure, grep_execution_error, mcp_tool_error)
|
|
|
|
|
|
optional string type = 2;
|
|
|
|
|
|
// Optional status code
|
|
|
|
|
|
optional int32 status_code = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// ConfirmationRequest is sent from the agent to client to request user permission for a ToolCall
|
|
|
|
|
|
message ConfirmationRequest {
|
|
|
|
|
|
// A list of choices for the user to select from
|
|
|
|
|
|
repeated ConfirmationOption options = 1;
|
|
|
|
|
|
// Specific details of the action requiring user confirmation
|
|
|
|
|
|
oneof details {
|
|
|
|
|
|
ExecuteDetails execute_details = 2;
|
|
|
|
|
|
FileDiff file_edit_details = 3;
|
|
|
|
|
|
McpDetails mcp_details = 4;
|
|
|
|
|
|
GenericDetails generic_details = 5;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// A single choice presented to the user during a confirmation request
|
|
|
|
|
|
message ConfirmationOption {
|
|
|
|
|
|
// Unique ID for the choice (e.g. proceed_once, cancel)
|
|
|
|
|
|
string id = 1;
|
|
|
|
|
|
// Human-readable choice (e.g. Allow Once, Reject).
|
|
|
|
|
|
string name = 2;
|
|
|
|
|
|
// An optional longer description for a tooltip
|
|
|
|
|
|
optional string description = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Details for a request to execute a shell command
|
|
|
|
|
|
message ExecuteDetails {
|
|
|
|
|
|
// The shell command to be executed
|
|
|
|
|
|
string command = 1;
|
|
|
|
|
|
// An optional directory in which the command will be run
|
|
|
|
|
|
optional string working_directory = 2;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
message FileDiff {
|
|
|
|
|
|
string file_name = 1;
|
|
|
|
|
|
// The absolute path to the file to modify
|
|
|
|
|
|
string file_path = 2;
|
|
|
|
|
|
// The original content, if the file exists
|
|
|
|
|
|
optional string old_content = 3;
|
|
|
|
|
|
string new_content = 4;
|
|
|
|
|
|
// Pre-formatted diff string for display
|
|
|
|
|
|
optional string formatted_diff = 5;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Details for an MCP (Model Context Protocol) tool confirmation
|
|
|
|
|
|
message McpDetails {
|
|
|
|
|
|
// The name of the MCP server that provides the tool
|
|
|
|
|
|
string server_name = 1;
|
|
|
|
|
|
// THe name of the tool being called from the MCP Server
|
|
|
|
|
|
string tool_name = 2;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Generic catch-all for ToolCall requests that don't fit other types
|
|
|
|
|
|
message GenericDetails {
|
|
|
|
|
|
// Description of the action requiring confirmation
|
|
|
|
|
|
string description = 1;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Agent Thoughts**
|
|
|
|
|
|
|
|
|
|
|
|
**Schema**
|
|
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
syntax = "proto3";
|
|
|
|
|
|
|
|
|
|
|
|
// Represents a thought with a subject and a detailed description.
|
|
|
|
|
|
message AgentThought {
|
|
|
|
|
|
// A concise subject line or title for the thought.
|
|
|
|
|
|
string subject = 1;
|
|
|
|
|
|
|
|
|
|
|
|
// The description or elaboration of the thought itself.
|
|
|
|
|
|
string description = 2;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Event Metadata**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
The `metadata` object in `TaskStatusUpdateEvent` is used by the A2A client to
|
|
|
|
|
|
deserialize the `TaskStatusUpdateEvents` into their appropriate objects.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Schema**
|
|
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
syntax = "proto3";
|
|
|
|
|
|
|
|
|
|
|
|
// A DevelopmentToolEvent event.
|
|
|
|
|
|
message DevelopmentToolEvent {
|
|
|
|
|
|
// Enum representing the specific type of development tool event.
|
|
|
|
|
|
enum DevelopmentToolEventKind {
|
|
|
|
|
|
// The default, unspecified value.
|
|
|
|
|
|
DEVELOPMENT_TOOL_EVENT_KIND_UNSPECIFIED = 0;
|
|
|
|
|
|
TOOL_CALL_CONFIRMATION = 1;
|
|
|
|
|
|
TOOL_CALL_UPDATE = 2;
|
|
|
|
|
|
TEXT_CONTENT = 3;
|
|
|
|
|
|
STATE_CHANGE = 4;
|
|
|
|
|
|
THOUGHT = 5;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// The specific kind of event that occurred.
|
|
|
|
|
|
DevelopmentToolEventKind kind = 1;
|
|
|
|
|
|
|
|
|
|
|
|
// The model used for this event.
|
|
|
|
|
|
string model = 2;
|
|
|
|
|
|
|
|
|
|
|
|
// The tier of the user (optional).
|
|
|
|
|
|
string user_tier = 3;
|
|
|
|
|
|
|
2025-09-19 14:48:31 -07:00
|
|
|
|
// An unexpected error occurred in the agent execution (optional).
|
2025-09-05 15:31:42 -04:00
|
|
|
|
string error = 4;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Client-to-Agent Messages**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
When the agent sends a `TaskStatusUpdateEvent` with `status.state` set to
|
|
|
|
|
|
`input-required` and its message contains a `ConfirmationRequest`, the client
|
|
|
|
|
|
must respond by sending a new `message/stream` request.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This new request must include the `contextId` and the `taskId` from the ongoing
|
|
|
|
|
|
task and contain a `ToolCallConfirmation` object. This object conveys the user's
|
|
|
|
|
|
decision regarding the tool call that was awaiting approval.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Schema**
|
|
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
syntax = "proto3";
|
|
|
|
|
|
|
|
|
|
|
|
// The client's response to a ConfirmationRequest.
|
|
|
|
|
|
message ToolCallConfirmation {
|
|
|
|
|
|
// A unique identifier, assigned by the agent
|
|
|
|
|
|
string tool_call_id = 1;
|
|
|
|
|
|
// The 'id' of the ConfirmationOption chosen by the user.
|
|
|
|
|
|
string selected_option_id = 2;
|
|
|
|
|
|
// Included if the user modifies the proposed change.
|
|
|
|
|
|
// The type should correspond to the original ConfirmationRequest details.
|
|
|
|
|
|
oneof modified_details {
|
|
|
|
|
|
// Corresponds to a FileDiff confirmation
|
|
|
|
|
|
ModifiedFileDetails file_details = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
message ModifiedFileDetails {
|
|
|
|
|
|
// The new content after user edits.
|
|
|
|
|
|
string new_content = 1;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
### 3.3 Method Definitions
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This section defines the new methods introduced by the `development-tool`
|
|
|
|
|
|
extension.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
**Method: `commands/get`**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This method allows the client to discover slash commands supported by Gemini
|
|
|
|
|
|
CLI. The client should call this method during startup to dynamically populate
|
|
|
|
|
|
its command list.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
// Response message containing the list of all top-level slash commands.
|
|
|
|
|
|
message GetAllSlashCommandsResponse {
|
|
|
|
|
|
// A list of the top-level slash commands.
|
|
|
|
|
|
repeated SlashCommand commands = 1;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Represents a single slash command, which can contain subcommands.
|
|
|
|
|
|
message SlashCommand {
|
|
|
|
|
|
// The primary name of the command.
|
|
|
|
|
|
string name = 1;
|
|
|
|
|
|
// A detailed description of what the command does.
|
|
|
|
|
|
string description = 2;
|
|
|
|
|
|
// A list of arguments that the command accepts.
|
|
|
|
|
|
repeated SlashCommandArgument arguments = 3;
|
|
|
|
|
|
// A list of nested subcommands.
|
|
|
|
|
|
repeated SlashCommand sub_commands = 4;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Defines the structure for a single slash command argument.
|
|
|
|
|
|
message SlashCommandArgument {
|
|
|
|
|
|
// The name of the argument.
|
|
|
|
|
|
string name = 1;
|
|
|
|
|
|
// A brief description of what the argument is for.
|
|
|
|
|
|
string description = 2;
|
|
|
|
|
|
// Whether the argument is required or optional.
|
|
|
|
|
|
bool is_required = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Method: `command/execute`**
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This method allows the client to execute a slash command. Following the initial
|
|
|
|
|
|
`ExecuteSlashCommandResponse`, the agent will use the standard streaming
|
|
|
|
|
|
mechanism to communicate the command's progress and output. All subsequent
|
|
|
|
|
|
updates, including textual output, agent thoughts, and any required user
|
|
|
|
|
|
confirmations for tool calls (like executing a shell command), will be sent as
|
|
|
|
|
|
`TaskStatusUpdateEvent` messages, re-using the schemas defined above.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
```proto
|
|
|
|
|
|
// Request to execute a specific slash command.
|
|
|
|
|
|
message ExecuteSlashCommandRequest {
|
|
|
|
|
|
// The path to the command, e.g., ["memory", "add"] for /memory add
|
|
|
|
|
|
repeated string command_path = 1;
|
|
|
|
|
|
// The arguments for the command as a single string.
|
|
|
|
|
|
string args = 2;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Enum for the initial status of a command execution request.
|
|
|
|
|
|
enum CommandExecutionStatus {
|
|
|
|
|
|
// Default unspecified status.
|
|
|
|
|
|
COMMAND_EXECUTION_STATUS_UNSPECIFIED = 0;
|
|
|
|
|
|
// The command was successfully received and its execution has started.
|
|
|
|
|
|
STARTED = 1;
|
|
|
|
|
|
// The command failed to start (e.g., command not found, invalid format).
|
|
|
|
|
|
FAILED_TO_START = 2;
|
|
|
|
|
|
// The command has been paused and is waiting for the user to confirm
|
|
|
|
|
|
// a set of shell commands.
|
|
|
|
|
|
AWAITING_SHELL_CONFIRMATION = 3;
|
|
|
|
|
|
// The command has been paused and is waiting for the user to confirm
|
|
|
|
|
|
// a specific action.
|
|
|
|
|
|
AWAITING_ACTION_CONFIRMATION = 4;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// The immediate, async response after requesting a command execution.
|
|
|
|
|
|
message ExecuteSlashCommandResponse {
|
|
|
|
|
|
// A unique taskID for this specific command execution.
|
|
|
|
|
|
string execution_id = 1;
|
|
|
|
|
|
// The initial status of the command execution.
|
|
|
|
|
|
CommandExecutionStatus status = 2;
|
|
|
|
|
|
// An optional message, particularly useful for explaining why a command
|
|
|
|
|
|
// failed to start.
|
|
|
|
|
|
string message = 3;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
## 4. Separation of Concerns
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
We believe that all client-side context (ex., workspace state) and client-side
|
|
|
|
|
|
tool execution (ex. read active buffers) should be routed through MCP.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
This approach enforces a strict separation of concerns: the A2A
|
|
|
|
|
|
`development-tool` extension standardizes communication to the agent, while MCP
|
|
|
|
|
|
serves as the single, authoritative interface for client-side capabilities.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
|
|
|
|
|
|
## Appendix
|
|
|
|
|
|
|
|
|
|
|
|
### A. Example Interaction Flow
|
|
|
|
|
|
|
2025-10-09 08:17:37 -04:00
|
|
|
|
1. **Client -> Server**: The client sends a `message/stream` request containing
|
|
|
|
|
|
the initial prompt and configuration in an `AgentSettings` object.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
2. **Server -> Client**: SSE stream begins.
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **Event 1**: The server sends a `Task` object with
|
|
|
|
|
|
`status.state: 'submitted'` and the new `taskId`.
|
|
|
|
|
|
- **Event 2**: The server sends a `TaskStatusUpdateEvent` with the metadata
|
|
|
|
|
|
`kind` set to `'STATE_CHANGE'` and `status.state` set to `'working'`.
|
|
|
|
|
|
3. **Agent Logic**: The agent processes the prompt and decides to call the
|
|
|
|
|
|
`write_file` tool, which requires user confirmation.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
4. **Server -> Client**:
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **Event 3**: The server sends a `TaskStatusUpdateEvent`. The metadata
|
|
|
|
|
|
`kind` is `'TOOL_CALL_UPDATE'`, and the `DataPart` contains a `ToolCall`
|
|
|
|
|
|
object with its `status` as `'PENDING'` and a populated
|
|
|
|
|
|
`confirmation_request`.
|
|
|
|
|
|
- **Event 4**: The server sends a final `TaskStatusUpdateEvent` for this
|
|
|
|
|
|
exchange. The metadata `kind` is `'STATE_CHANGE'`, the `status.state` is
|
|
|
|
|
|
`'input-required'`, and `final` is `true`. The stream for this request
|
|
|
|
|
|
ends.
|
|
|
|
|
|
5. **Client**: The client UI renders the confirmation prompt based on the
|
|
|
|
|
|
`ToolCall` object from Event 3. The user clicks "Approve."
|
|
|
|
|
|
6. **Client -> Server**: The client sends a new `message/stream` request. It
|
|
|
|
|
|
includes the `taskId` from the ongoing task and a `DataPart` containing a
|
|
|
|
|
|
`ToolCallConfirmation` object (e.g.,
|
|
|
|
|
|
`{"tool_call_id": "...", "selected_option_id": "proceed_once"}`).
|
2025-09-05 15:31:42 -04:00
|
|
|
|
7. **Server -> Client**: A new SSE stream begins for the second request.
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **Event 1**: The server sends a `TaskStatusUpdateEvent` with
|
|
|
|
|
|
`kind: 'TOOL_CALL_UPDATE'`, containing the `ToolCall` object with its
|
|
|
|
|
|
`status` now set to `'EXECUTING'`.
|
|
|
|
|
|
- **Event 2**: After the tool runs, the server sends another
|
|
|
|
|
|
`TaskStatusUpdateEvent` with `kind: 'TOOL_CALL_UPDATE'`, containing the
|
|
|
|
|
|
`ToolCall` with its `status` as `'SUCCEEDED'`.
|
|
|
|
|
|
8. **Agent Logic**: The agent receives the successful tool result and generates
|
|
|
|
|
|
a final textual response.
|
2025-09-05 15:31:42 -04:00
|
|
|
|
9. **Server -> Client**:
|
2025-10-09 08:17:37 -04:00
|
|
|
|
- **Event 3**: The server sends a `TaskStatusUpdateEvent` with
|
|
|
|
|
|
`kind: 'TEXT_CONTENT'` and a `TextPart` containing the agent's final
|
|
|
|
|
|
answer.
|
|
|
|
|
|
- **Event 4**: The server sends the final `TaskStatusUpdateEvent`. The
|
|
|
|
|
|
`kind` is `'STATE_CHANGE'`, the `status.state` is `'completed'`, and
|
|
|
|
|
|
`final` is `true`. The stream ends.
|
|
|
|
|
|
10. **Client**: The client displays the final answer. The task is now complete
|
|
|
|
|
|
but can be continued by sending another message with the same `taskId`.
|