feat(prompt-optimization): implement multi-objective evaluation metrics


Established a Pareto-ready evaluation foundation for the Genetic-Pareto (GEPA)
optimizer, supporting simultaneous optimization of accuracy and density.
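
As an illustration of what "Pareto-ready" could mean in practice, the sketch below compares candidates on several metrics at once without collapsing them into one number. The field names on `MetricResult`, the string values of `OptimizationDirection`, and the helpers `dominates` and `paretoFront` are assumptions made for this example; the actual definitions live in `packages/core/src/evals/types.ts` and may differ.

```typescript
// Hypothetical shapes for illustration only; not the committed definitions.
type OptimizationDirection = 'maximize' | 'minimize';

interface MetricResult {
  name: string;
  score: number;
  direction: OptimizationDirection;
}

// True when `a` is no worse than `b` on every metric and strictly better on
// at least one (the usual definition of Pareto dominance).
function dominates(a: MetricResult[], b: MetricResult[]): boolean {
  let strictlyBetter = false;
  for (const ma of a) {
    const mb = b.find((m) => m.name === ma.name);
    if (!mb) {
      throw new Error(`missing metric: ${ma.name}`);
    }
    // Flip the sign for minimized metrics so "larger is better" throughout.
    const sign = ma.direction === 'maximize' ? 1 : -1;
    const diff = sign * (ma.score - mb.score);
    if (diff < 0) return false;
    if (diff > 0) strictlyBetter = true;
  }
  return strictlyBetter;
}

// Keeps only the candidates that no other candidate dominates.
function paretoFront(population: MetricResult[][]): MetricResult[][] {
  return population.filter(
    (candidate) =>
      !population.some((other) => other !== candidate && dominates(other, candidate)),
  );
}
```

With two maximized metrics, a candidate that scores higher on one and lower on the other is never dominated, so both stay on the front. That is the property a multi-objective optimizer relies on when trading accuracy against density.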

Key improvements:
- Core Architecture: Defined standardized `MetricResult` and `OptimizationDirection`
  types in `packages/core/src/evals/types.ts` to support multi-objective fitness.
- Centralized Config: Implemented `packages/core/src/evals/config.ts` with tunable
  weights and detailed documentation for scoring gradients.
- Tool Alignment Metric: Created `metrics/toolAlignment.ts` to measure functional
  accuracy, argument precision, and explicit shell avoidance.
- Token Frugality Metric: Created `metrics/tokenFrugality.ts` to measure and
  penalize conversational noise ("chatter") using a configurable threshold.
- Verification Suite: Added comprehensive unit tests for all metrics, achieving
  100% coverage of scoring logic and gradient steps.
- Project Integration: Relocated `schema.ts` to the core package for build safety,
  updated the data validator, and extended project-wide lint/format scripts.
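
The threshold-based chatter penalty described for the Token Frugality metric can be sketched roughly as follows. This is a minimal illustration, not the contents of `metrics/tokenFrugality.ts`: the config fields `chatterThreshold` and `maxChatter`, the whitespace-based word count used as a token proxy, and the linear decay are all assumptions chosen for the example.

```typescript
// Hypothetical config; the real tunables live in packages/core/src/evals/config.ts.
interface FrugalityConfig {
  chatterThreshold: number; // words allowed before any penalty applies (assumed)
  maxChatter: number; // word count at which the score reaches 0 (assumed)
}

// Crude token proxy: counts whitespace-separated words.
function countWords(text: string): number {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Returns a score in [0, 1]: 1 at or below the threshold, then a linear
// decay down to 0 at `maxChatter` and beyond.
function tokenFrugalityScore(text: string, cfg: FrugalityConfig): number {
  const words = countWords(text);
  if (words <= cfg.chatterThreshold) return 1;
  if (words >= cfg.maxChatter) return 0;
  return 1 - (words - cfg.chatterThreshold) / (cfg.maxChatter - cfg.chatterThreshold);
}
```

A graded decay rather than a hard cutoff gives the optimizer a gradient to follow: a prompt that trims a few words of chatter is rewarded slightly even if it has not yet reached the threshold.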


Abhijit Balaji

2026-03-02 14:10:45 -08:00

parent c0b463dbcf

commit 6c94c4d9ca

9 changed files with 458 additions and 3 deletions

									
scripts/validate-data.ts (+1 -1)
@@ -6,7 +6,7 @@
 import * as fs from 'node:fs';
 import * as path from 'node:path';
-import type { Scenario } from '../data/schema.ts';
+import type { Scenario } from '../packages/core/src/evals/schema.ts';
 const MANIFEST_FILE = 'data/manifest.json';
 const DEFAULT_DATA_DIR = 'data';