feat(cli): add streamlined gemini gemma local model setup (#25498)

Co-authored-by: Abhijit Balaji <abhijitbalaji@google.com>
Co-authored-by: Samee Zahid <sameez@google.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-07-22 15:51:18 -07:00 · 2026-04-20 16:57:56 -07:00
parent 6afc47f81c
commit 1d383a4a8e
31 changed files with 2509 additions and 12 deletions
@@ -161,17 +161,19 @@ they appear in the UI.

 ### Experimental

-| UI Label                                             | Setting                          | Description                                                                                                                                               | Default |
-| ---------------------------------------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
-| Enable Git Worktrees                                 | `experimental.worktrees`         | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
-| Use OSC 52 Paste                                     | `experimental.useOSC52Paste`     | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Use OSC 52 Copy                                      | `experimental.useOSC52Copy`      | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Model Steering                                       | `experimental.modelSteering`     | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
-| Direct Web Fetch                                     | `experimental.directWebFetch`    | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
-| Memory Manager Agent                                 | `experimental.memoryManager`     | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
-| Auto Memory                                          | `experimental.autoMemory`        | Automatically extract reusable skills from past sessions in the background. Review results with /memory inbox.                                            | `false` |
-| Use the generalist profile to manage agent contexts. | `experimental.generalistProfile` | Suitable for general coding and software development tasks.                                                                                               | `false` |
-| Enable Context Management                            | `experimental.contextManagement` | Enable logic for context management.                                                                                                                      | `false` |
+| UI Label                                             | Setting                                         | Description                                                                                                                                               | Default |
+| ---------------------------------------------------- | ----------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| Enable Git Worktrees                                 | `experimental.worktrees`                        | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
+| Use OSC 52 Paste                                     | `experimental.useOSC52Paste`                    | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Use OSC 52 Copy                                      | `experimental.useOSC52Copy`                     | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Model Steering                                       | `experimental.modelSteering`                    | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
+| Direct Web Fetch                                     | `experimental.directWebFetch`                   | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
+| Enable Gemma Model Router                            | `experimental.gemmaModelRouter.enabled`         | Enable the Gemma Model Router (experimental). Requires a local endpoint serving Gemma via the Gemini API using the LiteRT-LM shim.                        | `false` |
+| Auto-start LiteRT Server                             | `experimental.gemmaModelRouter.autoStartServer` | Automatically start the LiteRT-LM server when Gemini CLI starts and the Gemma router is enabled.                                                          | `false` |
+| Memory Manager Agent                                 | `experimental.memoryManager`                    | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
+| Auto Memory                                          | `experimental.autoMemory`                       | Automatically extract reusable skills from past sessions in the background. Review results with /memory inbox.                                            | `false` |
+| Use the generalist profile to manage agent contexts. | `experimental.generalistProfile`                | Suitable for general coding and software development tasks.                                                                                               | `false` |
+| Enable Context Management                            | `experimental.contextManagement`                | Enable logic for context management.                                                                                                                      | `false` |

 ### Skills

@@ -1711,6 +1711,18 @@ their corresponding top-level category object in your `settings.json` file.
  - **Default:** `false`
  - **Requires restart:** Yes

+- **`experimental.gemmaModelRouter.autoStartServer`** (boolean):
+  - **Description:** Automatically start the LiteRT-LM server when Gemini CLI
+    starts and the Gemma router is enabled.
+  - **Default:** `false`
+  - **Requires restart:** Yes
+
+- **`experimental.gemmaModelRouter.binaryPath`** (string):
+  - **Description:** Custom path to the LiteRT-LM binary. Leave empty to use the
+    default location (`~/.gemini/bin/litert/`).
+  - **Default:** `""`
+  - **Requires restart:** Yes
+
 - **`experimental.gemmaModelRouter.classifier.host`** (string):
  - **Description:** The host of the classifier.
  - **Default:** `"http://localhost:9379"`
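
For reference, the settings touched by this hunk might be combined in `settings.json` roughly as follows. This is a minimal sketch: it assumes the dotted setting names map to nested objects under the top-level `experimental` category, and the `true` values are illustrative rather than recommended defaults.

```json
{
  "experimental": {
    "gemmaModelRouter": {
      "enabled": true,
      "autoStartServer": true,
      "binaryPath": "",
      "classifier": {
        "host": "http://localhost:9379"
      }
    }
  }
}
```

Leaving `binaryPath` empty keeps the documented default location, and both `autoStartServer` and `binaryPath` are documented as requiring a restart to take effect.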