feat(plan): support automatic model switching for Plan Mode (#20240)

2026-06-17 23:07:22 -07:00 · 2026-02-24 19:15:14 -05:00
parent 1f9da6723f
commit bf278ef2b0
19 changed files with 422 additions and 31 deletions
@@ -27,6 +27,7 @@ implementation. It allows you to:
    - [Example: Allow git commands in Plan Mode](#example-allow-git-commands-in-plan-mode)
    - [Example: Enable research subagents in Plan Mode](#example-enable-research-subagents-in-plan-mode)
  - [Custom Plan Directory and Policies](#custom-plan-directory-and-policies)
+- [Automatic Model Routing](#automatic-model-routing)

 ## Enabling Plan Mode

@@ -242,6 +243,32 @@ modes = ["plan"]
 argsPattern = "\"file_path\":\"[^\"]+[\\\\/]+\\.gemini[\\\\/]+plans[\\\\/]+[\\w-]+\\.md\""
 ```

+## Automatic Model Routing
+
+When using an **[auto model]**, Gemini CLI automatically optimizes **[model
+routing]** based on the current phase of your task:
+
+1.  **Planning Phase:** While in Plan Mode, the CLI routes requests to a
+    high-reasoning **Pro** model to ensure robust architectural decisions and
+    high-quality plans.
+2.  **Implementation Phase:** Once a plan is approved and you exit Plan Mode,
+    the CLI detects the approved plan and automatically switches to a
+    high-speed **Flash** model, which responds faster while you implement the
+    plan.
+
+This behavior is enabled by default to balance plan quality against speed. To
+disable it, set `general.plan.modelRouting` to `false` in your settings:
+
+```json
+{
+  "general": {
+    "plan": {
+      "modelRouting": false
+    }
+  }
+}
+```
+
 [`list_directory`]: /docs/tools/file-system.md#1-list_directory-readfolder
 [`read_file`]: /docs/tools/file-system.md#2-read_file-readfile
 [`grep_search`]: /docs/tools/file-system.md#5-grep_search-searchtext
@@ -259,3 +286,5 @@ argsPattern = "\"file_path\":\"[^\"]+[\\\\/]+\\.gemini[\\\\/]+plans[\\\\/]+[\\w-
 [YOLO mode]: /docs/reference/configuration.md#command-line-arguments
 [`plan.toml`]:
  https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/policy/policies/plan.toml
+[auto model]: /docs/reference/configuration.md#model-settings
+[model routing]: /docs/cli/telemetry.md#model-routing
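
Reviewer note: the routing rule described in the new "Automatic Model Routing" section can be sketched roughly as follows. This is a minimal illustration, not the code in this commit. The type and function names (`PlanRoutingInput`, `pickPlanTier`) are hypothetical, and returning `undefined` for "fall back to normal routing" is an assumption about cases the docs do not spell out.

```typescript
// Hypothetical sketch of the documented behavior; none of these names are
// taken from the Gemini CLI source.
type ModelTier = "pro" | "flash";

interface PlanRoutingInput {
  usingAutoModel: boolean;  // the docs say routing applies when an auto model is in use
  modelRouting: boolean;    // mirrors `general.plan.modelRouting` (default: true)
  inPlanMode: boolean;      // currently in Plan Mode
  hasApprovedPlan: boolean; // an approved plan exists after exiting Plan Mode
}

// Returns the tier to route to, or undefined when plan-based routing does
// not apply (assumption: the caller then uses its normal routing).
function pickPlanTier(input: PlanRoutingInput): ModelTier | undefined {
  if (!input.usingAutoModel || !input.modelRouting) {
    return undefined; // setting disabled or no auto model: no plan-based switch
  }
  if (input.inPlanMode) {
    return "pro"; // planning phase: high-reasoning model
  }
  if (input.hasApprovedPlan) {
    return "flash"; // implementation phase: high-speed model
  }
  return undefined; // no plan context: behavior not specified by the docs
}
```

With `modelRouting` set to `false`, the sketch never forces a tier, which matches the documented way to opt out.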
@@ -29,6 +29,7 @@ they appear in the UI.
 | Enable Auto Update      | `general.enableAutoUpdate`         | Enable automatic updates.                                                                                                                                                      | `true`      |
 | Enable Notifications    | `general.enableNotifications`      | Enable run-event notifications for action-required prompts and session completion. Currently macOS only.                                                                       | `false`     |
 | Plan Directory          | `general.plan.directory`           | The directory where planning artifacts are stored. If not specified, defaults to the system temporary directory.                                                               | `undefined` |
+| Plan Model Routing      | `general.plan.modelRouting`        | Automatically switch between Pro and Flash models based on Plan Mode status. Uses Pro for the planning phase and Flash for the implementation phase.                           | `true`      |
 | Max Chat Model Attempts | `general.maxAttempts`              | Maximum number of attempts for requests to the main chat model. Cannot exceed 10.                                                                                              | `10`        |
 | Debug Keystroke Logging | `general.debugKeystrokeLogging`    | Enable debug logging of keystrokes to the console.                                                                                                                             | `false`     |
 | Enable Session Cleanup  | `general.sessionRetention.enabled` | Enable automatic session cleanup                                                                                                                                               | `false`     |
@@ -487,6 +487,7 @@ Captures Gemini API requests, responses, and errors.
    - `reasoning` (string, optional)
    - `failed` (boolean)
    - `error_message` (string, optional)
+    - `approval_mode` (string)

 #### Chat and streaming

@@ -711,12 +712,14 @@ Routing latency/failures and slash-command selections.
  - **Attributes**:
    - `routing.decision_model` (string)
    - `routing.decision_source` (string)
+    - `routing.approval_mode` (string)

 - `gemini_cli.model_routing.failure.count` (Counter, Int): Counts model routing
  failures.
  - **Attributes**:
    - `routing.decision_source` (string)
    - `routing.error_message` (string)
+    - `routing.approval_mode` (string)

 ##### Agent runs

@@ -137,6 +137,12 @@ their corresponding top-level category object in your `settings.json` file.
  - **Default:** `undefined`
  - **Requires restart:** Yes

+- **`general.plan.modelRouting`** (boolean):
+  - **Description:** Automatically switch between Pro and Flash models based on
+    Plan Mode status. Uses Pro for the planning phase and Flash for the
+    implementation phase.
+  - **Default:** `true`
+
 - **`general.retryFetchErrors`** (boolean):
  - **Description:** Retry on "exception TypeError: fetch failed sending
    request" errors.