mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-05-12 12:54:07 -07:00
feat: implement adaptive thinking budget
@@ -0,0 +1,3 @@
# Tracks

- [Dynamic Thinking Budget](tracks/dynamic-thinking-budget/plan.md)
@@ -0,0 +1,101 @@
# Dynamic Thinking Budget Plan

## Context

The current Gemini CLI implementation uses static thinking configurations
defined in `settings.json` (or defaults).

- **Gemini 2.x**: Uses a static `thinkingBudget` (e.g., 8192 tokens).
- **Gemini 3**: Uses a static `thinkingLevel` (e.g., "HIGH").

This "one-size-fits-all" approach is inefficient. Simple queries waste compute,
while complex queries might not get enough reasoning depth. The goal is to
implement an "Adaptive Budget Manager" that dynamically adjusts the
`thinkingBudget` (for v2) or `thinkingLevel` (for v3) based on the complexity of
the user's request.

## Goals

- Implement a **Complexity Classifier** using a lightweight model (e.g., Gemini
  Flash) to analyze the user's prompt and history.
- **Map complexity levels** to:
  - `thinkingBudget` token counts for Gemini 2.x models.
  - `thinkingLevel` enums for Gemini 3 models.
- **Dynamically update** the `GenerateContentConfig` in `GeminiClient` before
  the main model call.
- Ensure **fallback mechanisms**: if the classification fails, keep the
  statically configured thinking settings.
- (Optional) **Visual feedback** to the user regarding the determined
  complexity.

## Strategy

### 1. Adaptive Budget Manager Service

Create a new service `AdaptiveBudgetService` in
`packages/core/src/services/adaptiveBudgetService.ts`.

- **Functionality**:
  - Takes `userPrompt` and `recentHistory` as input.
  - Calls Gemini Flash (using `config.getBaseLlmClient()`) with a specialized
    system prompt.
  - Returns a `ComplexityLevel` (1-4).

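The classifier returns free text, so the service has to turn that text into a level before anything else can happen. A minimal sketch of one way to do this follows. `parseComplexityLevel` is a hypothetical helper and is not part of this commit, which uses `parseInt` followed by a range check instead.

```typescript
// Sketch only: `parseComplexityLevel` is a hypothetical helper, not code from this commit.
type ComplexityLevel = 1 | 2 | 3 | 4;

/**
 * Parses the classifier's raw text into a complexity level.
 * Accepts a leading digit from 1 to 4, optionally followed by a label such as
 * "3 (High)", and returns undefined for anything else.
 */
function parseComplexityLevel(
  raw: string | undefined,
): ComplexityLevel | undefined {
  if (!raw) return undefined;
  // Anchored at the start; \b keeps multi-digit numbers such as "10" from matching.
  const match = /^\s*([1-4])\b/.exec(raw);
  return match ? (Number(match[1]) as ComplexityLevel) : undefined;
}

console.log(parseComplexityLevel('3')); // 3
console.log(parseComplexityLevel('10')); // undefined
```

Returning `undefined` instead of throwing lets the caller treat an unparseable answer the same way as a failed classifier call.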
### 2. Budget/Level Mapping

| Complexity Level | Gemini 2.x (`thinkingBudget`) | Gemini 3 (`thinkingLevel`) | Description                    |
| :--------------- | :---------------------------- | :------------------------- | :----------------------------- |
| **1 (Simple)**   | 1,024 tokens                  | `LOW`                      | Quick fixes, syntax questions. |
| **2 (Moderate)** | 4,096 tokens                  | `MEDIUM` (or `LOW`)        | Function-level logic.          |
| **3 (High)**     | 16,384 tokens                 | `HIGH`                     | Module-level refactoring.      |
| **4 (Extreme)**  | 32,768+ tokens                | `HIGH`                     | Architecture, deep debugging.  |

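The table can be read as two lookup tables keyed by complexity level. The sketch below is illustrative only: `thinkingParamsFor`, `ModelFamily`, and `ThinkingLevelName` are made-up names, the string union stands in for the SDK's `ThinkingLevel` enum, and level 2 is assumed to map to `LOW` (the table leaves `MEDIUM` open).

```typescript
// Sketch only: all names here are hypothetical stand-ins for illustration.
type ComplexityLevel = 1 | 2 | 3 | 4;
type ModelFamily = 'v2' | 'v3';
type ThinkingLevelName = 'LOW' | 'HIGH'; // stand-in for the SDK enum

// Token budgets for Gemini 2.x, taken from the table.
const BUDGET_BY_COMPLEXITY: Record<ComplexityLevel, number> = {
  1: 1024,
  2: 4096,
  3: 16384,
  4: 32768,
};

// Thinking levels for Gemini 3; level 2 -> LOW is an assumption.
const LEVEL_BY_COMPLEXITY: Record<ComplexityLevel, ThinkingLevelName> = {
  1: 'LOW',
  2: 'LOW',
  3: 'HIGH',
  4: 'HIGH',
};

interface ThinkingParams {
  thinkingBudget?: number;
  thinkingLevel?: ThinkingLevelName;
}

/** Returns exactly one thinking parameter, chosen by model family. */
function thinkingParamsFor(
  level: ComplexityLevel,
  family: ModelFamily,
): ThinkingParams {
  return family === 'v2'
    ? { thinkingBudget: BUDGET_BY_COMPLEXITY[level] }
    : { thinkingLevel: LEVEL_BY_COMPLEXITY[level] };
}

console.log(thinkingParamsFor(3, 'v2')); // { thinkingBudget: 16384 }
```

Typing both tables as `Record<ComplexityLevel, ...>` makes the compiler reject a table that forgets a level.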
### 3. Integration Point

Modify `packages/core/src/core/client.ts` to invoke the `AdaptiveBudgetService`
before `sendMessageStream`.

- **Flow**:
  1. User sends message.
  2. `GeminiClient` identifies the target model family (v2 or v3).
  3. Call `AdaptiveBudgetService.determineAdaptiveConfig()`.
  4. If **v2**: Calculate `thinkingBudget` based on complexity. Update config.
  5. If **v3**: Calculate `thinkingLevel` based on complexity. Update config.
  6. Proceed with `sendMessageStream`.

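The flow above, including the fallback to static settings, can be sketched as a pure function. Everything here is an assumption made for illustration: `modelFamilyOf` uses a simple name-prefix check rather than the repository's own model helpers, and `resolveThinkingConfig` is not a function in this commit.

```typescript
// Sketch only: hypothetical helpers that mirror the numbered flow.
type ComplexityLevel = 1 | 2 | 3 | 4;
type ModelFamily = 'v2' | 'v3' | 'unknown';

interface ThinkingConfig {
  thinkingBudget?: number;
  thinkingLevel?: 'LOW' | 'HIGH';
}

const BUDGETS: Record<ComplexityLevel, number> = {
  1: 1024,
  2: 4096,
  3: 16384,
  4: 32768,
};

// Assumption: the family can be read from the model name prefix.
function modelFamilyOf(model: string): ModelFamily {
  if (model.startsWith('gemini-3')) return 'v3';
  if (model.startsWith('gemini-2')) return 'v2';
  return 'unknown';
}

/**
 * Steps 2 to 5 of the flow: choose the parameter for the model family, or
 * keep the static configuration when no level is available.
 */
function resolveThinkingConfig(
  model: string,
  staticConfig: ThinkingConfig,
  level: ComplexityLevel | undefined,
): ThinkingConfig {
  // Classification failed or was skipped: fall back to the static settings.
  if (level === undefined) return staticConfig;
  const family = modelFamilyOf(model);
  if (family === 'v2') {
    return { ...staticConfig, thinkingBudget: BUDGETS[level] };
  }
  if (family === 'v3') {
    return { ...staticConfig, thinkingLevel: level >= 3 ? 'HIGH' : 'LOW' };
  }
  return staticConfig; // unknown family: leave the config untouched
}

console.log(resolveThinkingConfig('gemini-2.5-pro', { thinkingBudget: 8192 }, 3));
```

Keeping the decision in a pure function means the fallback path can be unit tested without mocking the classifier call.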
### 4. Configuration

Add settings to `packages/core/src/config/config.ts` and `settings.schema.json`
(read from `settings.experimental.adaptiveThinking`):

- `adaptiveThinking.enabled`: boolean (default `false`)
- `adaptiveThinking.classifierModel`: string (default `"classifier"`, a model
  name or alias)

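A short sketch of how the optional settings could be resolved into a fully populated object. The defaults used here (`false` and `'classifier'`) follow the `Config` constructor later in this commit. `resolveAdaptiveThinking` and the interface names are hypothetical.

```typescript
// Sketch only: hypothetical helper mirroring the `??` defaults in the Config constructor.
interface AdaptiveThinkingSettings {
  enabled?: boolean;
  classifierModel?: string;
}

interface ResolvedAdaptiveThinking {
  enabled: boolean;
  classifierModel: string;
}

/** Fills in defaults for any setting the user left out. */
function resolveAdaptiveThinking(
  settings?: AdaptiveThinkingSettings,
): ResolvedAdaptiveThinking {
  return {
    enabled: settings?.enabled ?? false,
    classifierModel: settings?.classifierModel ?? 'classifier',
  };
}

// Example user settings, shaped like the `experimental` block of settings.json.
const userSettings = { experimental: { adaptiveThinking: { enabled: true } } };
console.log(resolveAdaptiveThinking(userSettings.experimental.adaptiveThinking));
// { enabled: true, classifierModel: 'classifier' }
```

Using `??` rather than `||` keeps an explicit `enabled: false` from being overridden by a default.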
## Insights from "J1: Exploring Simple Test-Time Scaling (STTS)"

The paper (arXiv:2505.xxxx / 2512.19585) highlights that models trained with
Reinforcement Learning (like Gemini 3) exhibit strong scaling trends when
allocated more inference-time compute.

- **Budget Forcing**: The "Adaptive Budget Manager" implements this by forcing
  higher `thinkingLevel` or `thinkingBudget` for harder tasks, maximizing the
  "verifiable reward" (correct code) for complex problems while saving latency
  on simple ones.
- **Best-of-N**: The paper suggests that generating N solutions and selecting
  the best one is a powerful STTS method. While out of scope for _this_ specific
  track, the "Complexity Classifier" we build here is the _prerequisite_ for
  that future feature. We should only trigger expensive "Best-of-N" flows when
  the Complexity Level is 3 or 4.

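Best-of-N is out of scope for this track, but the gating rule in the last bullet is easy to illustrate. The sketch below assumes a caller-supplied scoring function. `shouldUseBestOfN` and `selectBest` are hypothetical names, and nothing like them exists in this commit.

```typescript
// Sketch only: illustrates gating Best-of-N on complexity; not part of this commit.
type ComplexityLevel = 1 | 2 | 3 | 4;

/** Only levels 3 and 4 justify the extra cost of generating N candidates. */
function shouldUseBestOfN(level: ComplexityLevel): boolean {
  return level >= 3;
}

/** Returns the highest-scoring candidate, or undefined for an empty list. */
function selectBest<T>(
  candidates: readonly T[],
  score: (candidate: T) => number,
): T | undefined {
  let best: T | undefined;
  let bestScore = Number.NEGATIVE_INFINITY;
  for (const candidate of candidates) {
    const s = score(candidate);
    // Strict comparison keeps the earliest candidate on ties.
    if (best === undefined || s > bestScore) {
      best = candidate;
      bestScore = s;
    }
  }
  return best;
}

// Toy scorer: prefer the longest candidate string.
console.log(selectBest(['a', 'bbb', 'cc'], (c) => c.length)); // bbb
```

In a real flow the scorer would be a judge model or a compiler check, which is why it is passed in rather than hard-coded.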
## Files to Modify

- `packages/core/src/services/adaptiveBudgetService.ts` (New)
- `packages/core/src/core/client.ts`
- `packages/core/src/config/config.ts`

## Verification Plan

1. **Unit Tests**: Verify `AdaptiveBudgetService` returns correct mappings for
   both model families.
2. **Integration Tests**: Mock API calls to ensure `thinkingLevel` is sent for
   v3 and `thinkingBudget` for v2.
3. **Manual Verification**: Use debug logs to verify the correct parameters are
   being sent to the API.
@@ -716,6 +716,7 @@ export async function loadCliConfig(
       settings.experimental?.codebaseInvestigatorSettings,
     introspectionAgentSettings:
       settings.experimental?.introspectionAgentSettings,
+    adaptiveThinking: settings.experimental?.adaptiveThinking,
     fakeResponses: argv.fakeResponses,
     recordResponses: argv.recordResponses,
     retryFetchErrors: settings.general?.retryFetchErrors,
@@ -1473,6 +1473,37 @@ const SETTINGS_SCHEMA = {
           },
         },
       },
+      adaptiveThinking: {
+        type: 'object',
+        label: 'Adaptive Thinking Settings',
+        category: 'Experimental',
+        requiresRestart: false,
+        default: {},
+        description: 'Configuration for Adaptive Thinking Budget.',
+        showInDialog: false,
+        properties: {
+          enabled: {
+            type: 'boolean',
+            label: 'Enable Adaptive Thinking',
+            category: 'Experimental',
+            requiresRestart: false,
+            default: false,
+            description:
+              'Enable adaptive thinking budget based on task complexity.',
+            showInDialog: true,
+          },
+          classifierModel: {
+            type: 'string',
+            label: 'Classifier Model',
+            category: 'Experimental',
+            requiresRestart: false,
+            default: 'classifier',
+            description:
+              'The model (or alias) to use for complexity classification.',
+            showInDialog: false,
+          },
+        },
+      },
     },
   },
 
@@ -73,6 +73,7 @@ import type { ModelConfigServiceConfig } from '../services/modelConfigService.js
 import { ModelConfigService } from '../services/modelConfigService.js';
 import { DEFAULT_MODEL_CONFIGS } from './defaultModelConfigs.js';
 import { ContextManager } from '../services/contextManager.js';
+import { AdaptiveBudgetService } from '../services/adaptiveBudgetService.js';
 
 // Re-export OAuth config type
 export type { MCPOAuthConfig, AnyToolInvocation };
@@ -335,6 +336,10 @@ export interface ConfigParameters {
   disableModelRouterForAuth?: AuthType[];
   codebaseInvestigatorSettings?: CodebaseInvestigatorSettings;
   introspectionAgentSettings?: IntrospectionAgentSettings;
+  adaptiveThinking?: {
+    enabled?: boolean;
+    classifierModel?: string;
+  };
   continueOnFailedApiCall?: boolean;
   retryFetchErrors?: boolean;
   enableShellOutputEfficiency?: boolean;
@@ -460,6 +465,10 @@ export class Config {
   private readonly outputSettings: OutputSettings;
   private readonly codebaseInvestigatorSettings: CodebaseInvestigatorSettings;
   private readonly introspectionAgentSettings: IntrospectionAgentSettings;
+  private readonly adaptiveThinking: {
+    enabled: boolean;
+    classifierModel: string;
+  };
   private readonly continueOnFailedApiCall: boolean;
   private readonly retryFetchErrors: boolean;
   private readonly enableShellOutputEfficiency: boolean;
@@ -491,6 +500,7 @@ export class Config {
   private readonly experimentalJitContext: boolean;
   private contextManager?: ContextManager;
   private terminalBackground: string | undefined = undefined;
+  private adaptiveBudgetService!: AdaptiveBudgetService;
 
   constructor(params: ConfigParameters) {
     this.sessionId = params.sessionId;
@@ -618,6 +628,10 @@ export class Config {
     this.introspectionAgentSettings = {
       enabled: params.introspectionAgentSettings?.enabled ?? false,
     };
+    this.adaptiveThinking = {
+      enabled: params.adaptiveThinking?.enabled ?? false,
+      classifierModel: params.adaptiveThinking?.classifierModel ?? 'classifier',
+    };
     this.continueOnFailedApiCall = params.continueOnFailedApiCall ?? true;
     this.enableShellOutputEfficiency =
       params.enableShellOutputEfficiency ?? true;
@@ -763,6 +777,13 @@ export class Config {
       await this.contextManager.refresh();
     }
 
+    this.adaptiveBudgetService = new AdaptiveBudgetService(this);
+    if (this.adaptiveThinking.enabled) {
+      debugLogger.debug(
+        `Adaptive Thinking Budget enabled (classifier: ${this.adaptiveThinking.classifierModel})`,
+      );
+    }
+
     await this.geminiClient.initialize();
   }
 
@@ -770,6 +791,10 @@ export class Config {
     return this.contentGenerator;
   }
 
+  getAdaptiveBudgetService(): AdaptiveBudgetService {
+    return this.adaptiveBudgetService;
+  }
+
   async refreshAuth(authMethod: AuthType) {
     // Reset availability service when switching auth
     this.modelAvailabilityService.reset();
@@ -1664,6 +1689,10 @@ export class Config {
     return this.introspectionAgentSettings;
   }
 
+  getAdaptiveThinkingConfig(): { enabled: boolean; classifierModel: string } {
+    return this.adaptiveThinking;
+  }
+
   async createToolRegistry(): Promise<ToolRegistry> {
     const registry = new ToolRegistry(this, this.messageBus);
 
@@ -28,6 +28,7 @@ import { GeminiChat } from './geminiChat.js';
 import { retryWithBackoff } from '../utils/retry.js';
 import { getErrorMessage } from '../utils/errors.js';
 import { tokenLimit } from './tokenLimits.js';
+import { partListUnionToString } from './geminiRequest.js';
 import type {
   ChatRecordingService,
   ResumedSessionData,
@@ -620,6 +621,25 @@ export class GeminiClient {
 
     // availability logic
     const modelConfigKey: ModelConfigKey = { model: modelToUse };
+
+    // Adaptive Thinking Budget Integration
+    if (
+      !isInvalidStreamRetry &&
+      this.config.getAdaptiveThinkingConfig().enabled
+    ) {
+      const userMessage = partListUnionToString(request);
+      if (userMessage) {
+        const adaptiveConfig = await this.config
+          .getAdaptiveBudgetService()
+          .determineAdaptiveConfig(userMessage, modelToUse);
+
+        if (adaptiveConfig) {
+          modelConfigKey.thinkingBudget = adaptiveConfig.thinkingBudget;
+          modelConfigKey.thinkingLevel = adaptiveConfig.thinkingLevel;
+        }
+      }
+    }
+
     const { model: finalModel } = applyModelSelection(
       this.config,
       modelConfigKey,
@@ -0,0 +1,88 @@
/**
 * @license
 * Copyright 2026 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */
import { describe, it, expect, vi } from 'vitest';
import {
  AdaptiveBudgetService,
  ComplexityLevel,
} from './adaptiveBudgetService.js';
import type { Config } from '../config/config.js';
import { ThinkingLevel } from '@google/genai';

describe('AdaptiveBudgetService', () => {
  it('should map complexity levels to correct V2 budgets', () => {
    const service = new AdaptiveBudgetService({} as Config);
    expect(service.getThinkingBudgetV2(ComplexityLevel.SIMPLE)).toBe(1024);
    expect(service.getThinkingBudgetV2(ComplexityLevel.MODERATE)).toBe(4096);
    expect(service.getThinkingBudgetV2(ComplexityLevel.HIGH)).toBe(16384);
    expect(service.getThinkingBudgetV2(ComplexityLevel.EXTREME)).toBe(32768);
  });

  it('should map complexity levels to correct V3 levels', () => {
    const service = new AdaptiveBudgetService({} as Config);
    expect(service.getThinkingLevelV3(ComplexityLevel.SIMPLE)).toBe(
      ThinkingLevel.LOW,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.MODERATE)).toBe(
      ThinkingLevel.LOW,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.HIGH)).toBe(
      ThinkingLevel.HIGH,
    );
    expect(service.getThinkingLevelV3(ComplexityLevel.EXTREME)).toBe(
      ThinkingLevel.HIGH,
    );
  });

  it('should determine adaptive config based on LLM response', async () => {
    const mockGenerateContent = vi.fn().mockResolvedValue({
      candidates: [{ content: { parts: [{ text: '3' }] } }],
    });

    const mockConfig = {
      getBaseLlmClient: () => ({
        generateContent: mockGenerateContent,
      }),
      getAdaptiveThinkingConfig: () => ({
        enabled: true,
        classifierModel: 'gemini-2.0-flash',
      }),
    } as unknown as Config;

    const service = new AdaptiveBudgetService(mockConfig);
    const result = await service.determineAdaptiveConfig(
      'Complex task',
      'gemini-2.5-pro',
    );

    expect(result?.complexity).toBe(ComplexityLevel.HIGH);
    expect(result?.thinkingBudget).toBe(16384);
    expect(mockGenerateContent).toHaveBeenCalled();
  });

  it('should handle Gemini 3 models with thinkingLevel', async () => {
    const mockConfig = {
      getBaseLlmClient: () => ({
        generateContent: vi.fn().mockResolvedValue({
          candidates: [{ content: { parts: [{ text: '1' }] } }],
        }),
      }),
      getAdaptiveThinkingConfig: () => ({
        enabled: true,
        classifierModel: 'gemini-2.0-flash',
      }),
    } as unknown as Config;

    const service = new AdaptiveBudgetService(mockConfig);
    const result = await service.determineAdaptiveConfig(
      'Hi',
      'gemini-3-pro-preview',
    );

    expect(result?.complexity).toBe(ComplexityLevel.SIMPLE);
    expect(result?.thinkingLevel).toBe(ThinkingLevel.LOW);
    expect(result?.thinkingBudget).toBeUndefined();
  });
});
@@ -0,0 +1,132 @@
/**
 * @license
 * Copyright 2026 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */
import type { Config } from '../config/config.js';
import { debugLogger } from '../utils/debugLogger.js';
import { isGemini2Model, isPreviewModel } from '../config/models.js';
import { ThinkingLevel } from '@google/genai';

export enum ComplexityLevel {
  SIMPLE = 1,
  MODERATE = 2,
  HIGH = 3,
  EXTREME = 4,
}

export const BUDGET_MAPPING_V2: Record<ComplexityLevel, number> = {
  [ComplexityLevel.SIMPLE]: 1024,
  [ComplexityLevel.MODERATE]: 4096,
  [ComplexityLevel.HIGH]: 16384,
  [ComplexityLevel.EXTREME]: 32768,
};

export const LEVEL_MAPPING_V3: Record<ComplexityLevel, ThinkingLevel> = {
  [ComplexityLevel.SIMPLE]: ThinkingLevel.LOW,
  [ComplexityLevel.MODERATE]: ThinkingLevel.LOW,
  [ComplexityLevel.HIGH]: ThinkingLevel.HIGH,
  [ComplexityLevel.EXTREME]: ThinkingLevel.HIGH,
};

export interface AdaptiveBudgetResult {
  complexity: ComplexityLevel;
  thinkingBudget?: number;
  thinkingLevel?: ThinkingLevel;
  strategyNote?: string;
}

export class AdaptiveBudgetService {
  constructor(private config: Config) {}

  /**
   * Analyzes the user prompt and determines the optimal thinking configuration.
   *
   * Note on future scaling (per arXiv:2512.19585):
   * At Complexity 4 (Extreme), we should consider:
   * 1. Best-of-N: Generate multiple solutions.
   * 2. LLM-as-a-Judge: Use a strong model to evaluate candidates.
   * 3. Compiler Verification: Check code correctness via environment tools.
   */
  async determineAdaptiveConfig(
    userPrompt: string,
    model: string,
  ): Promise<AdaptiveBudgetResult | undefined> {
    const { classifierModel } = this.config.getAdaptiveThinkingConfig();

    try {
      const llm = this.config.getBaseLlmClient();
      debugLogger.debug(
        `AdaptiveBudgetService: Classifying prompt complexity using ${classifierModel}...`,
      );
      const systemPrompt = `You are a complexity classifier for a coding assistant.
Analyze the user's request and determine the complexity of the task.
Output ONLY a single integer from 1 to 4 based on the following scale:

1 (Simple): Quick fixes, syntax questions, simple explanations, greetings.
2 (Moderate): Function-level logic, writing small scripts, standard debugging.
3 (High): Module-level refactoring, complex feature implementation, multi-file changes.
4 (Extreme): Architecture design, deep root-cause analysis of obscure bugs, large-scale migrations.

Request: ${userPrompt}
Complexity Level:`;

      const response = await llm.generateContent({
        modelConfigKey: { model: classifierModel },
        contents: [{ role: 'user', parts: [{ text: systemPrompt }] }],
        promptId: 'adaptive-budget-classifier',
        abortSignal: new AbortController().signal,
      });

      const text = response.candidates?.[0]?.content?.parts?.[0]?.text?.trim();
      if (!text) {
        debugLogger.debug(
          'AdaptiveBudgetService: No response from classifier.',
        );
        return undefined;
      }

      const level = parseInt(text, 10) as ComplexityLevel;
      if (isNaN(level) || level < 1 || level > 4) {
        debugLogger.debug(
          `AdaptiveBudgetService: Invalid complexity level returned: ${text}`,
        );
        return undefined;
      }

      const result: AdaptiveBudgetResult = { complexity: level };

      // Determine mapping based on model version
      // Gemini 3 uses ThinkingLevel, Gemini 2.x uses thinkingBudget
      if (isPreviewModel(model)) {
        result.thinkingLevel = LEVEL_MAPPING_V3[level] ?? ThinkingLevel.HIGH;
      } else if (isGemini2Model(model)) {
        result.thinkingBudget = BUDGET_MAPPING_V2[level];
      }

      if (level === ComplexityLevel.EXTREME) {
        result.strategyNote =
          'EXTREME complexity detected. Future implementations should use Best-of-N + Verification.';
      }

      debugLogger.debug(
        `AdaptiveBudgetService: Complexity ${level} -> Thinking Param: ${result.thinkingLevel || result.thinkingBudget}`,
      );
      return result;
    } catch (error) {
      debugLogger.error(
        'AdaptiveBudgetService: Error classifying complexity',
        error,
      );
      return undefined;
    }
  }

  getThinkingBudgetV2(level: ComplexityLevel): number {
    return BUDGET_MAPPING_V2[level];
  }

  getThinkingLevelV3(level: ComplexityLevel): ThinkingLevel {
    return LEVEL_MAPPING_V3[level] ?? ThinkingLevel.HIGH;
  }
}
@@ -4,7 +4,7 @@
  * SPDX-License-Identifier: Apache-2.0
  */
 
-import type { GenerateContentConfig } from '@google/genai';
+import type { GenerateContentConfig, ThinkingLevel } from '@google/genai';
 
 // The primary key for the ModelConfig is the model string. However, we also
 // support a secondary key to limit the override scope, typically an agent name.
@@ -26,6 +26,10 @@ export interface ModelConfigKey {
   // This allows overrides to specify different settings (e.g., higher temperature)
   // specifically for retry scenarios.
   isRetry?: boolean;
+
+  // Dynamic thinking configuration determined at runtime (e.g. via complexity classification)
+  thinkingBudget?: number;
+  thinkingLevel?: ThinkingLevel;
 }
 
 export interface ModelConfig {
@@ -205,6 +209,22 @@ export class ModelConfigService {
       }
     }
 
+    // Apply dynamic thinking parameters from context if present
+    if (
+      context.thinkingBudget !== undefined ||
+      context.thinkingLevel !== undefined
+    ) {
+      resolvedConfig.thinkingConfig = {
+        ...(resolvedConfig.thinkingConfig as object),
+        ...(context.thinkingBudget !== undefined
+          ? { thinkingBudget: context.thinkingBudget }
+          : {}),
+        ...(context.thinkingLevel !== undefined
+          ? { thinkingLevel: context.thinkingLevel }
+          : {}),
+      };
+    }
+
     return {
       model: baseModel,
       generateContentConfig: resolvedConfig,
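The merge in the hunk above relies on object spread, so fields that are already present in `thinkingConfig` survive unless the runtime value replaces them. A small standalone sketch of that behavior follows. `mergeThinkingConfig` and the local interfaces are hypothetical stand-ins and are not the service's actual types.

```typescript
// Sketch only: a standalone illustration of the spread-merge semantics.
interface ThinkingConfig {
  includeThoughts?: boolean;
  thinkingBudget?: number;
  thinkingLevel?: string;
}

interface ThinkingOverrides {
  thinkingBudget?: number;
  thinkingLevel?: string;
}

/** Applies only the overrides that are defined; everything else is kept. */
function mergeThinkingConfig(
  base: ThinkingConfig | undefined,
  overrides: ThinkingOverrides,
): ThinkingConfig | undefined {
  if (
    overrides.thinkingBudget === undefined &&
    overrides.thinkingLevel === undefined
  ) {
    return base; // nothing to apply: leave the resolved config untouched
  }
  return {
    ...base, // spreading undefined is a no-op
    ...(overrides.thinkingBudget !== undefined
      ? { thinkingBudget: overrides.thinkingBudget }
      : {}),
    ...(overrides.thinkingLevel !== undefined
      ? { thinkingLevel: overrides.thinkingLevel }
      : {}),
  };
}

const merged = mergeThinkingConfig(
  { includeThoughts: true, thinkingBudget: 8192 },
  { thinkingBudget: 1024 },
);
console.log(merged); // { includeThoughts: true, thinkingBudget: 1024 }
```

The conditional spreads matter: writing `thinkingLevel: overrides.thinkingLevel` directly would add an explicit `undefined` key to the result.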
@@ -1441,6 +1441,30 @@
             }
           },
           "additionalProperties": false
+        },
+        "adaptiveThinking": {
+          "title": "Adaptive Thinking Settings",
+          "description": "Configuration for Adaptive Thinking Budget.",
+          "markdownDescription": "Configuration for Adaptive Thinking Budget.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `{}`",
+          "default": {},
+          "type": "object",
+          "properties": {
+            "enabled": {
+              "title": "Enable Adaptive Thinking",
+              "description": "Enable adaptive thinking budget based on task complexity.",
+              "markdownDescription": "Enable adaptive thinking budget based on task complexity.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `false`",
+              "default": false,
+              "type": "boolean"
+            },
+            "classifierModel": {
+              "title": "Classifier Model",
+              "description": "The model (or alias) to use for complexity classification.",
+              "markdownDescription": "The model (or alias) to use for complexity classification.\n\n- Category: `Experimental`\n- Requires restart: `no`\n- Default: `classifier`",
+              "default": "classifier",
+              "type": "string"
+            }
+          },
+          "additionalProperties": false
         }
       },
       "additionalProperties": false