mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-03-15 08:31:14 -07:00
feat: implement adaptive thinking budget
3
conductor/tracks.md
Normal file
@@ -0,0 +1,3 @@
# Tracks

- [Dynamic Thinking Budget](tracks/dynamic-thinking-budget/plan.md)
101
conductor/tracks/dynamic-thinking-budget/plan.md
Normal file
@@ -0,0 +1,101 @@
# Dynamic Thinking Budget Plan

## Context

The current Gemini CLI implementation uses static thinking configurations
defined in `settings.json` (or defaults).

- **Gemini 2.x**: Uses a static `thinkingBudget` (e.g., 8192 tokens).
- **Gemini 3**: Uses a static `thinkingLevel` (e.g., "HIGH").

This "one-size-fits-all" approach is inefficient. Simple queries waste compute,
while complex queries might not get enough reasoning depth. The goal is to
implement an "Adaptive Budget Manager" that dynamically adjusts the
`thinkingBudget` (for v2) or `thinkingLevel` (for v3) based on the complexity of
the user's request.

## Goals

- Implement a **Complexity Classifier** using a lightweight model (e.g., Gemini
  Flash) to analyze the user's prompt and history.
- **Map complexity levels** to:
  - `thinkingBudget` token counts for Gemini 2.x models.
  - `thinkingLevel` enums for Gemini 3 models.
- **Dynamically update** the `GenerateContentConfig` in `GeminiClient` before
  the main model call.
- Ensure **fallback mechanisms** if the classification fails.
- (Optional) **Visual feedback** to the user regarding the determined
  complexity.

## Strategy

### 1. Adaptive Budget Manager Service

Create a new service `AdaptiveBudgetService` in
`packages/core/src/services/adaptiveBudgetService.ts`.

- **Functionality**:
  - Takes `userPrompt` and `recentHistory` as input.
  - Calls Gemini Flash (using `config.getBaseLlmClient()`) with a specialized
    system prompt.
  - Returns a `ComplexityLevel` (1-4).

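The classifier replies with free text, so the reply has to be validated before it can be treated as a `ComplexityLevel`. The following is a minimal sketch of that step, assuming the reply should contain exactly one integer between 1 and 4. `parseComplexity` and the `Level` type are hypothetical names used for illustration only, and the check is deliberately stricter than a plain `parseInt`.

```typescript
// Hypothetical helper (not part of the codebase): turn the classifier's raw
// text reply into a complexity level, or undefined when the reply is unusable.
type Level = 1 | 2 | 3 | 4;

function parseComplexity(reply: string | undefined): Level | undefined {
  if (!reply) return undefined;
  // Accept replies such as "3", " 3\n" or "Level: 3", but reject replies that
  // contain more than one run of digits (for example "3 or 4" or "3.5").
  const digitRuns = reply.match(/\d+/g);
  if (!digitRuns || digitRuns.length !== 1) return undefined;
  const value = Number(digitRuns[0]);
  // Only 1 through 4 are valid levels; "0" and "12" are rejected here.
  return value >= 1 && value <= 4 ? (value as Level) : undefined;
}
```

Returning `undefined` for anything ambiguous lets the caller fall back to the static thinking configuration instead of guessing a level.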
### 2. Budget/Level Mapping

| Complexity Level | Gemini 2.x (`thinkingBudget`) | Gemini 3 (`thinkingLevel`) | Description                    |
| :--------------- | :---------------------------- | :------------------------- | :----------------------------- |
| **1 (Simple)**   | 1,024 tokens                  | `LOW`                      | Quick fixes, syntax questions. |
| **2 (Moderate)** | 4,096 tokens                  | `MEDIUM` (or `LOW`)        | Function-level logic.          |
| **3 (High)**     | 16,384 tokens                 | `HIGH`                     | Module-level refactoring.      |
| **4 (Extreme)**  | 32,768+ tokens                | `HIGH`                     | Architecture, deep debugging.  |

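The table can be read as two lookup tables keyed by complexity level, one per model family. The sketch below assumes local stand-in types instead of the `ThinkingLevel` enum from `@google/genai`, picks `LOW` for level 2 and exactly 32,768 tokens for level 4 (the table leaves both open), and uses the hypothetical name `toThinkingOverride`.

```typescript
// Illustrative stand-ins; the real code would use ThinkingLevel from
// '@google/genai' rather than string literals.
type Level = 1 | 2 | 3 | 4;
type ModelFamily = 'v2' | 'v3';

interface ThinkingOverride {
  thinkingBudget?: number;
  thinkingLevel?: 'LOW' | 'HIGH';
}

// Token budgets for Gemini 2.x, taken from the table above.
const BUDGET_V2: Record<Level, number> = { 1: 1024, 2: 4096, 3: 16384, 4: 32768 };

// Thinking levels for Gemini 3. Level 2 is mapped to LOW here (assumption).
const LEVEL_V3: Record<Level, 'LOW' | 'HIGH'> = {
  1: 'LOW',
  2: 'LOW',
  3: 'HIGH',
  4: 'HIGH',
};

// Hypothetical helper: only the field that applies to the family is set.
function toThinkingOverride(family: ModelFamily, level: Level): ThinkingOverride {
  return family === 'v2'
    ? { thinkingBudget: BUDGET_V2[level] }
    : { thinkingLevel: LEVEL_V3[level] };
}
```

Setting only one of the two fields keeps a budget from being sent to a model family that expects a level, and the other way around.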
### 3. Integration Point

Modify `packages/core/src/core/client.ts` to invoke the `AdaptiveBudgetService`
before `sendMessageStream`.

- **Flow**:
  1. User sends message.
  2. `GeminiClient` identifies the target model family (v2 or v3).
  3. Call `AdaptiveBudgetService.determineAdaptiveConfig()`.
  4. If **v2**: Calculate `thinkingBudget` based on complexity. Update config.
  5. If **v3**: Calculate `thinkingLevel` based on complexity. Update config.
  6. Proceed with `sendMessageStream`.

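The flow above, including the fallback when classification fails, could be sketched roughly as follows. This is an illustration under stated assumptions, not the client code: `withAdaptiveThinking`, `Classify`, and the local `ThinkingConfig` shape are hypothetical, and the classifier is injected so the sketch stays self-contained.

```typescript
type Level = 1 | 2 | 3 | 4;
type Family = 'v2' | 'v3';

interface ThinkingConfig {
  thinkingBudget?: number;
  thinkingLevel?: 'LOW' | 'HIGH';
}

// Hypothetical classifier signature: resolves to a level, rejects on failure.
type Classify = (prompt: string) => Promise<Level>;

async function withAdaptiveThinking(
  staticConfig: ThinkingConfig,
  family: Family,
  prompt: string,
  classify: Classify,
): Promise<ThinkingConfig> {
  // Step 3: classify the prompt. A rejected promise becomes undefined.
  const level = await classify(prompt).catch(() => undefined);
  // Fallback: keep the static configuration when classification fails.
  if (level === undefined) return staticConfig;

  if (family === 'v2') {
    // Step 4: Gemini 2.x takes a token budget.
    const budgets: Record<Level, number> = { 1: 1024, 2: 4096, 3: 16384, 4: 32768 };
    return { ...staticConfig, thinkingBudget: budgets[level] };
  }
  // Step 5: Gemini 3 takes a thinking level.
  return { ...staticConfig, thinkingLevel: level >= 3 ? 'HIGH' : 'LOW' };
}
```

The caller would then pass the returned configuration to the main model call (step 6). Because a failed classification returns the static configuration unchanged, the main request is never blocked by the classifier.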
### 4. Configuration

Add settings to `packages/core/src/config/config.ts` and `settings.schema.json`:

- `adaptiveThinking.enabled`: boolean (default false)
- `adaptiveThinking.classifierModel`: string (default "classifier")

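As a rough illustration of how these two settings resolve, the sketch below applies the defaults used by the schema and `Config` constructor changes in this commit (`false` and `'classifier'`). The function name `resolveAdaptiveThinking` and the example `settings` object are hypothetical; only the two setting names and the `experimental` nesting come from the diff.

```typescript
interface AdaptiveThinkingSettings {
  enabled?: boolean;
  classifierModel?: string;
}

// Hypothetical helper: fill in defaults for any setting the user left out.
function resolveAdaptiveThinking(
  settings?: AdaptiveThinkingSettings,
): Required<AdaptiveThinkingSettings> {
  return {
    enabled: settings?.enabled ?? false,
    classifierModel: settings?.classifierModel ?? 'classifier',
  };
}

// Example: a user enables the feature but does not choose a classifier model.
const settings = { experimental: { adaptiveThinking: { enabled: true } } };
const resolved = resolveAdaptiveThinking(settings.experimental.adaptiveThinking);
```

With these defaults the feature stays off unless a user opts in explicitly.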
## Insights from "J1: Exploring Simple Test-Time Scaling (STTS)"

The paper (arXiv:2505.xxxx / 2512.19585) highlights that models trained with
Reinforcement Learning (like Gemini 3) exhibit strong scaling trends when
allocated more inference-time compute.

- **Budget Forcing**: The "Adaptive Budget Manager" implements this by forcing
  higher `thinkingLevel` or `thinkingBudget` for harder tasks, maximizing the
  "verifiable reward" (correct code) for complex problems while saving latency
  on simple ones.
- **Best-of-N**: The paper suggests that generating N solutions and selecting
  the best one is a powerful STTS method. While out of scope for _this_ specific
  track, the "Complexity Classifier" we build here is the _prerequisite_ for
  that future feature. We should only trigger expensive "Best-of-N" flows when
  the Complexity Level is 3 or 4.

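To make the gating rule concrete, here is a small sketch of a Best-of-N selection that would only run at complexity 3 or 4. It is out of scope for this track and purely illustrative: `shouldUseBestOfN`, `bestOfN`, and the scoring callback are hypothetical, and the score could come from a judge model or a test run.

```typescript
type Level = 1 | 2 | 3 | 4;

// Only HIGH (3) and EXTREME (4) requests justify the extra generations.
function shouldUseBestOfN(level: Level): boolean {
  return level >= 3;
}

// Pick the highest-scoring candidate; returns undefined for an empty list.
function bestOfN<T>(candidates: T[], score: (candidate: T) => number): T | undefined {
  let best: T | undefined;
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    const current = score(candidate);
    if (current > bestScore) {
      best = candidate;
      bestScore = current;
    }
  }
  return best;
}
```

On ties the first candidate with the top score is kept, which makes the selection deterministic for a fixed candidate order.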
## Files to Modify

- `packages/core/src/services/adaptiveBudgetService.ts` (New)
- `packages/core/src/core/client.ts`
- `packages/core/src/config/config.ts`

## Verification Plan

1. **Unit Tests**: Verify `AdaptiveBudgetService` returns correct mappings for
   both model families.
2. **Integration Tests**: Mock API calls to ensure `thinkingLevel` is sent for
   v3 and `thinkingBudget` for v2.
3. **Manual Verification**: Use debug logs to verify the correct parameters are
   being sent to the API.
@@ -716,6 +716,7 @@ export async function loadCliConfig(
      settings.experimental?.codebaseInvestigatorSettings,
    introspectionAgentSettings:
      settings.experimental?.introspectionAgentSettings,
    adaptiveThinking: settings.experimental?.adaptiveThinking,
    fakeResponses: argv.fakeResponses,
    recordResponses: argv.recordResponses,
    retryFetchErrors: settings.general?.retryFetchErrors,
@@ -1473,6 +1473,37 @@ const SETTINGS_SCHEMA = {
          },
        },
      },
      adaptiveThinking: {
        type: 'object',
        label: 'Adaptive Thinking Settings',
        category: 'Experimental',
        requiresRestart: false,
        default: {},
        description: 'Configuration for Adaptive Thinking Budget.',
        showInDialog: false,
        properties: {
          enabled: {
            type: 'boolean',
            label: 'Enable Adaptive Thinking',
            category: 'Experimental',
            requiresRestart: false,
            default: false,
            description:
              'Enable adaptive thinking budget based on task complexity.',
            showInDialog: true,
          },
          classifierModel: {
            type: 'string',
            label: 'Classifier Model',
            category: 'Experimental',
            requiresRestart: false,
            default: 'classifier',
            description:
              'The model (or alias) to use for complexity classification.',
            showInDialog: false,
          },
        },
      },
    },
  },

@@ -73,6 +73,7 @@ import type { ModelConfigServiceConfig } from '../services/modelConfigService.js
import { ModelConfigService } from '../services/modelConfigService.js';
import { DEFAULT_MODEL_CONFIGS } from './defaultModelConfigs.js';
import { ContextManager } from '../services/contextManager.js';
import { AdaptiveBudgetService } from '../services/adaptiveBudgetService.js';

// Re-export OAuth config type
export type { MCPOAuthConfig, AnyToolInvocation };
@@ -335,6 +336,10 @@ export interface ConfigParameters {
  disableModelRouterForAuth?: AuthType[];
  codebaseInvestigatorSettings?: CodebaseInvestigatorSettings;
  introspectionAgentSettings?: IntrospectionAgentSettings;
  adaptiveThinking?: {
    enabled?: boolean;
    classifierModel?: string;
  };
  continueOnFailedApiCall?: boolean;
  retryFetchErrors?: boolean;
  enableShellOutputEfficiency?: boolean;
@@ -460,6 +465,10 @@ export class Config {
  private readonly outputSettings: OutputSettings;
  private readonly codebaseInvestigatorSettings: CodebaseInvestigatorSettings;
  private readonly introspectionAgentSettings: IntrospectionAgentSettings;
  private readonly adaptiveThinking: {
    enabled: boolean;
    classifierModel: string;
  };
  private readonly continueOnFailedApiCall: boolean;
  private readonly retryFetchErrors: boolean;
  private readonly enableShellOutputEfficiency: boolean;
@@ -491,6 +500,7 @@ export class Config {
  private readonly experimentalJitContext: boolean;
  private contextManager?: ContextManager;
  private terminalBackground: string | undefined = undefined;
  private adaptiveBudgetService!: AdaptiveBudgetService;

  constructor(params: ConfigParameters) {
    this.sessionId = params.sessionId;
@@ -618,6 +628,10 @@ export class Config {
    this.introspectionAgentSettings = {
      enabled: params.introspectionAgentSettings?.enabled ?? false,
    };
    this.adaptiveThinking = {
      enabled: params.adaptiveThinking?.enabled ?? false,
      classifierModel: params.adaptiveThinking?.classifierModel ?? 'classifier',
    };
    this.continueOnFailedApiCall = params.continueOnFailedApiCall ?? true;
    this.enableShellOutputEfficiency =
      params.enableShellOutputEfficiency ?? true;
@@ -763,6 +777,13 @@ export class Config {
      await this.contextManager.refresh();
    }

    this.adaptiveBudgetService = new AdaptiveBudgetService(this);
    if (this.adaptiveThinking.enabled) {
      debugLogger.debug(
        `Adaptive Thinking Budget enabled (classifier: ${this.adaptiveThinking.classifierModel})`,
      );
    }

    await this.geminiClient.initialize();
  }

@@ -770,6 +791,10 @@ export class Config {
    return this.contentGenerator;
  }

  getAdaptiveBudgetService(): AdaptiveBudgetService {
    return this.adaptiveBudgetService;
  }

  async refreshAuth(authMethod: AuthType) {
    // Reset availability service when switching auth
    this.modelAvailabilityService.reset();
@@ -1664,6 +1689,10 @@ export class Config {
    return this.introspectionAgentSettings;
  }

  getAdaptiveThinkingConfig(): { enabled: boolean; classifierModel: string } {
    return this.adaptiveThinking;
  }

  async createToolRegistry(): Promise<ToolRegistry> {
    const registry = new ToolRegistry(this, this.messageBus);

@@ -28,6 +28,7 @@ import { GeminiChat } from './geminiChat.js';
import { retryWithBackoff } from '../utils/retry.js';
import { getErrorMessage } from '../utils/errors.js';
import { tokenLimit } from './tokenLimits.js';
import { partListUnionToString } from './geminiRequest.js';
import type {
  ChatRecordingService,
  ResumedSessionData,
@@ -620,6 +621,25 @@ export class GeminiClient {
    // availability logic
    const modelConfigKey: ModelConfigKey = { model: modelToUse };

    // Adaptive Thinking Budget Integration
    if (
      !isInvalidStreamRetry &&
      this.config.getAdaptiveThinkingConfig().enabled
    ) {
      const userMessage = partListUnionToString(request);
      if (userMessage) {
        const adaptiveConfig = await this.config
          .getAdaptiveBudgetService()
          .determineAdaptiveConfig(userMessage, modelToUse);

        if (adaptiveConfig) {
          modelConfigKey.thinkingBudget = adaptiveConfig.thinkingBudget;
          modelConfigKey.thinkingLevel = adaptiveConfig.thinkingLevel;
        }
      }
    }

    const { model: finalModel } = applyModelSelection(
      this.config,
      modelConfigKey,
88
packages/core/src/services/adaptiveBudgetService.test.ts
Normal file
@@ -0,0 +1,88 @@
/**
 * @license
 * Copyright 2026 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */
import { describe, it, expect, vi } from 'vitest';
import {
  AdaptiveBudgetService,
  ComplexityLevel,
} from './adaptiveBudgetService.js';
import type { Config } from '../config/config.js';
import { ThinkingLevel } from '@google/genai';

describe('AdaptiveBudgetService', () => {
  it('should map complexity levels to correct V2 budgets', () => {
    const service = new AdaptiveBudgetService({} as Config);
    expect(service.getThinkingBudgetV2(ComplexityLevel.SIMPLE)).toBe(1024);
    expect(service.getThinkingBudgetV2(ComplexityLevel.MODERATE)).toBe(4096);
    expect(service.getThinkingBudgetV2(ComplexityLevel.HIGH)).toBe(16384);
    expect(service.getThinkingBudgetV2(ComplexityLevel.EXTREME)).toBe(32768);
  });

  it('should map complexity levels to correct V3 levels', () => {
    const service = new AdaptiveBudgetService({} as Config);
    expect(service.getThinkingLevelV3(ComplexityLevel.SIMPLE)).toBe(
      ThinkingLevel.LOW,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.MODERATE)).toBe(
      ThinkingLevel.LOW,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.HIGH)).toBe(
      ThinkingLevel.HIGH,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.EXTREME)).toBe(
      ThinkingLevel.HIGH,
    );
  });

  it('should determine adaptive config based on LLM response', async () => {
    const mockGenerateContent = vi.fn().mockResolvedValue({
      candidates: [{ content: { parts: [{ text: '3' }] } }],
    });

    const mockConfig = {
      getBaseLlmClient: () => ({
        generateContent: mockGenerateContent,
      }),
      getAdaptiveThinkingConfig: () => ({
        enabled: true,
        classifierModel: 'gemini-2.0-flash',
      }),
    } as unknown as Config;

    const service = new AdaptiveBudgetService(mockConfig);
    const result = await service.determineAdaptiveConfig(
      'Complex task',
      'gemini-2.5-pro',
    );

    expect(result?.complexity).toBe(ComplexityLevel.HIGH);
    expect(result?.thinkingBudget).toBe(16384);
    expect(mockGenerateContent).toHaveBeenCalled();
  });

  it('should handle Gemini 3 models with thinkingLevel', async () => {
    const mockConfig = {
      getBaseLlmClient: () => ({
        generateContent: vi.fn().mockResolvedValue({
          candidates: [{ content: { parts: [{ text: '1' }] } }],
        }),
      }),
      getAdaptiveThinkingConfig: () => ({
        enabled: true,
        classifierModel: 'gemini-2.0-flash',
      }),
    } as unknown as Config;

    const service = new AdaptiveBudgetService(mockConfig);
    const result = await service.determineAdaptiveConfig(
      'Hi',
      'gemini-3-pro-preview',
    );

    expect(result?.complexity).toBe(ComplexityLevel.SIMPLE);
    expect(result?.thinkingLevel).toBe(ThinkingLevel.LOW);
    expect(result?.thinkingBudget).toBeUndefined();
  });
});
132
packages/core/src/services/adaptiveBudgetService.ts
Normal file
@@ -0,0 +1,132 @@
/**
 * @license
 * Copyright 2026 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */
import type { Config } from '../config/config.js';
import { debugLogger } from '../utils/debugLogger.js';
import { isGemini2Model, isPreviewModel } from '../config/models.js';
import { ThinkingLevel } from '@google/genai';

export enum ComplexityLevel {
  SIMPLE = 1,
  MODERATE = 2,
  HIGH = 3,
  EXTREME = 4,
}

export const BUDGET_MAPPING_V2: Record<ComplexityLevel, number> = {
  [ComplexityLevel.SIMPLE]: 1024,
  [ComplexityLevel.MODERATE]: 4096,
  [ComplexityLevel.HIGH]: 16384,
  [ComplexityLevel.EXTREME]: 32768,
};

export const LEVEL_MAPPING_V3: Record<ComplexityLevel, ThinkingLevel> = {
  [ComplexityLevel.SIMPLE]: ThinkingLevel.LOW,
  [ComplexityLevel.MODERATE]: ThinkingLevel.LOW,
  [ComplexityLevel.HIGH]: ThinkingLevel.HIGH,
  [ComplexityLevel.EXTREME]: ThinkingLevel.HIGH,
};

export interface AdaptiveBudgetResult {
  complexity: ComplexityLevel;
  thinkingBudget?: number;
  thinkingLevel?: ThinkingLevel;
  strategyNote?: string;
}

export class AdaptiveBudgetService {
  constructor(private config: Config) {}

  /**
   * Analyzes the user prompt and determines the optimal thinking configuration.
   *
   * Note on future scaling (per arXiv:2512.19585):
   * At Complexity 4 (Extreme), we should consider:
   * 1. Best-of-N: Generate multiple solutions.
   * 2. LLM-as-a-Judge: Use a strong model to evaluate candidates.
   * 3. Compiler Verification: Check code correctness via environment tools.
   */
  async determineAdaptiveConfig(
    userPrompt: string,
    model: string,
  ): Promise<AdaptiveBudgetResult | undefined> {
    const { classifierModel } = this.config.getAdaptiveThinkingConfig();

    try {
      const llm = this.config.getBaseLlmClient();
      debugLogger.debug(
        `AdaptiveBudgetService: Classifying prompt complexity using ${classifierModel}...`,
      );
      const systemPrompt = `You are a complexity classifier for a coding assistant.
Analyze the user's request and determine the complexity of the task.
Output ONLY a single integer from 1 to 4 based on the following scale:

1 (Simple): Quick fixes, syntax questions, simple explanations, greetings.
2 (Moderate): Function-level logic, writing small scripts, standard debugging.
3 (High): Module-level refactoring, complex feature implementation, multi-file changes.
4 (Extreme): Architecture design, deep root-cause analysis of obscure bugs, large-scale migrations.

Request: ${userPrompt}
Complexity Level:`;

      const response = await llm.generateContent({
        modelConfigKey: { model: classifierModel },
        contents: [{ role: 'user', parts: [{ text: systemPrompt }] }],
        promptId: 'adaptive-budget-classifier',
        abortSignal: new AbortController().signal,
      });

      const text = response.candidates?.[0]?.content?.parts?.[0]?.text?.trim();
      if (!text) {
        debugLogger.debug(
          'AdaptiveBudgetService: No response from classifier.',
        );
        return undefined;
      }

      const level = parseInt(text, 10) as ComplexityLevel;
      if (isNaN(level) || level < 1 || level > 4) {
        debugLogger.debug(
          `AdaptiveBudgetService: Invalid complexity level returned: ${text}`,
        );
        return undefined;
      }

      const result: AdaptiveBudgetResult = { complexity: level };

      // Determine mapping based on model version
      // Gemini 3 uses ThinkingLevel, Gemini 2.x uses thinkingBudget
      if (isPreviewModel(model)) {
        result.thinkingLevel = LEVEL_MAPPING_V3[level] ?? ThinkingLevel.HIGH;
      } else if (isGemini2Model(model)) {
        result.thinkingBudget = BUDGET_MAPPING_V2[level];
      }

      if (level === ComplexityLevel.EXTREME) {
        result.strategyNote =
          'EXTREME complexity detected. Future implementations should use Best-of-N + Verification.';
      }

      debugLogger.debug(
        `AdaptiveBudgetService: Complexity ${level} -> Thinking Param: ${result.thinkingLevel || result.thinkingBudget}`,
      );
      return result;
    } catch (error) {
      debugLogger.error(
        'AdaptiveBudgetService: Error classifying complexity',
        error,
      );
      return undefined;
    }
  }

  getThinkingBudgetV2(level: ComplexityLevel): number {
    return BUDGET_MAPPING_V2[level];
  }

  getThinkingLevelV3(level: ComplexityLevel): ThinkingLevel {
    return LEVEL_MAPPING_V3[level] ?? ThinkingLevel.HIGH;
  }
}
@@ -4,7 +4,7 @@
 * SPDX-License-Identifier: Apache-2.0
 */

-import type { GenerateContentConfig } from '@google/genai';
+import type { GenerateContentConfig, ThinkingLevel } from '@google/genai';

// The primary key for the ModelConfig is the model string. However, we also
// support a secondary key to limit the override scope, typically an agent name.
@@ -26,6 +26,10 @@ export interface ModelConfigKey {
  // This allows overrides to specify different settings (e.g., higher temperature)
  // specifically for retry scenarios.
  isRetry?: boolean;

  // Dynamic thinking configuration determined at runtime (e.g. via complexity classification)
  thinkingBudget?: number;
  thinkingLevel?: ThinkingLevel;
}

export interface ModelConfig {
@@ -205,6 +209,22 @@ export class ModelConfigService {
      }
    }

    // Apply dynamic thinking parameters from context if present
    if (
      context.thinkingBudget !== undefined ||
      context.thinkingLevel !== undefined
    ) {
      resolvedConfig.thinkingConfig = {
        ...(resolvedConfig.thinkingConfig as object),
        ...(context.thinkingBudget !== undefined
          ? { thinkingBudget: context.thinkingBudget }
          : {}),
        ...(context.thinkingLevel !== undefined
          ? { thinkingLevel: context.thinkingLevel }
          : {}),
      };
    }

    return {
      model: baseModel,
      generateContentConfig: resolvedConfig,
@@ -1441,6 +1441,30 @@
            }
          },
          "additionalProperties": false
        },
        "adaptiveThinking": {
          "title": "Adaptive Thinking Settings",
          "description": "Configuration for Adaptive Thinking Budget.",
          "markdownDescription": "Configuration for Adaptive Thinking Budget.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `{}`",
          "default": {},
          "type": "object",
          "properties": {
            "enabled": {
              "title": "Enable Adaptive Thinking",
              "description": "Enable adaptive thinking budget based on task complexity.",
              "markdownDescription": "Enable adaptive thinking budget based on task complexity.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `false`",
              "default": false,
              "type": "boolean"
            },
            "classifierModel": {
              "title": "Classifier Model",
              "description": "The model (or alias) to use for complexity classification.",
              "markdownDescription": "The model (or alias) to use for complexity classification.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `classifier`",
              "default": "classifier",
              "type": "string"
            }
          },
          "additionalProperties": false
        }
      },
      "additionalProperties": false