Commit Graph

3 Commits

Author SHA1 Message Date
Abhijit Balaji 59d377e5e0 feat(optimization): implement manifest-driven extraction pipeline
- Implement `extract.ts` with robust character-aware parsing for snippets and tools.
- Consolidate research dependencies by moving `@ax-llm/ax` to root `optionalDependencies`.
- Relocate evaluation logic from `packages/core` to `scripts/optimization/lib/evals` to keep the production core lean.
- Add `optimization_targets` to `data/manifest.json` as the single source of truth for the pipeline.
- Add unit tests for extraction and variable masking; all tests pass.
- Update global config and linting rules to support the new optimization infrastructure.
2026-03-04 14:25:17 -08:00
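
The variable masking mentioned in this commit could look roughly like the sketch below. It is not the code in `extract.ts`: the `${name}` variable syntax, the `__VAR_n__` placeholder format, and the function names `maskVariables` and `unmaskVariables` are all assumptions made for illustration. The idea is to hide template variables behind opaque placeholders before an optimizer rewrites the text, then restore them afterwards.

```typescript
// Hypothetical sketch, not the actual extract.ts implementation.
// Assumes template variables are written as ${name}; the real syntax may differ.

interface MaskedText {
  masked: string;
  // originals[i] is the variable text hidden behind placeholder i.
  originals: string[];
}

// Builds the placeholder token for the i-th variable (assumed format).
function placeholder(i: number): string {
  return "__VAR_" + i + "__";
}

// Replaces every ${...} occurrence with a numbered placeholder.
function maskVariables(text: string): MaskedText {
  const originals: string[] = [];
  const masked = text.replace(/\$\{[^}]*\}/g, (match: string) => {
    originals.push(match);
    return placeholder(originals.length - 1);
  });
  return { masked, originals };
}

// Restores the original variables wherever their placeholders appear.
function unmaskVariables(masked: string, originals: string[]): string {
  return originals.reduce(
    // split/join replaces all occurrences without regex escaping.
    (acc, original, i) => acc.split(placeholder(i)).join(original),
    masked,
  );
}

const example = maskVariables("Run ${tool} in ${dir}.");
console.log(example.masked);
console.log(unmaskVariables(example.masked, example.originals));
```

Keeping the originals in an array indexed by placeholder number means the rewritten text can reorder or repeat placeholders and still be restored correctly.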
Abhijit Balaji 6c94c4d9ca feat(prompt-optimization): implement multi-objective evaluation metrics
Established a Pareto-ready evaluation foundation for the Genetic-Pareto (GEPA)
optimizer, supporting simultaneous optimization of accuracy and density.

Key improvements:
- Core Architecture: Defined standardized `MetricResult` and `OptimizationDirection`
  types in `packages/core/src/evals/types.ts` to support multi-objective fitness.
- Centralized Config: Implemented `packages/core/src/evals/config.ts` with tunable
  weights and detailed documentation for scoring gradients.
- Tool Alignment Metric: Created `metrics/toolAlignment.ts` to measure functional
  accuracy, argument precision, and explicit shell avoidance.
- Token Frugality Metric: Created `metrics/tokenFrugality.ts` to measure and
  penalize conversational noise ("chatter") using a configurable threshold.
- Verification Suite: Added comprehensive unit tests for all metrics, achieving
  100% coverage of scoring logic and gradient steps.
- Project Integration: Relocated `schema.ts` to the core package for build safety,
  updated the data validator, and extended project-wide lint/format scripts.
2026-03-04 10:08:14 -08:00
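
As a rough illustration of how multi-objective results like these might be compared, the sketch below defines assumed shapes for `MetricResult` and `OptimizationDirection`, a threshold-based frugality score, and a Pareto dominance check. The field names, the linear decay rule, and the `dominates` helper are hypothetical; the actual definitions in `packages/core/src/evals/types.ts` and `metrics/tokenFrugality.ts` may differ.

```typescript
// Hypothetical shapes; the real types in packages/core/src/evals/types.ts may differ.
type OptimizationDirection = "maximize" | "minimize";

interface MetricResult {
  name: string;
  score: number;
  direction: OptimizationDirection;
}

// Assumed scoring rule: full score up to the threshold, then a linear
// decay that reaches 0 at twice the threshold.
function tokenFrugality(chatterTokens: number, threshold: number): MetricResult {
  if (threshold <= 0) {
    throw new Error("threshold must be positive");
  }
  const excess = Math.max(0, chatterTokens - threshold);
  const score = Math.max(0, 1 - excess / threshold);
  return { name: "tokenFrugality", score, direction: "maximize" };
}

// Pareto dominance: a dominates b when it is no worse on every objective
// and strictly better on at least one. Both arrays must list the same
// objectives in the same order.
function dominates(a: MetricResult[], b: MetricResult[]): boolean {
  if (a.length !== b.length) {
    throw new Error("candidates must report the same objectives");
  }
  let strictlyBetter = false;
  for (let i = 0; i < a.length; i++) {
    const sign = a[i].direction === "maximize" ? 1 : -1;
    const diff = sign * (a[i].score - b[i].score);
    if (diff < 0) return false;
    if (diff > 0) strictlyBetter = true;
  }
  return strictlyBetter;
}

console.log(tokenFrugality(30, 20).score);
```

Carrying the direction on each result lets one dominance check handle objectives that should go up (accuracy) and objectives that should go down (token count) without special cases.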
Abhijit Balaji c0b463dbcf feat(prompt-optimization): implement Data Layer MVP and Tool Alignment dataset
Established the core of the Prompt Optimization Pipeline: an extensible data
layer and a golden dataset for tool alignment.

Key improvements:
- Core Schema: Defined the `Scenario` interface in `data/schema.ts` supporting
  multiple negative failure modes, platform-specific shell contexts (Unix/Win32),
  and strict tool-call typing.
- Optimization Manifest: Created `data/manifest.json` to define "No-Fly Zones"
  for the optimizer, protecting literal tool names and template variables, while
  providing descriptive context for validation.
- Tool Alignment Dataset: Authored 113 scenarios in `data/tool_alignment.jsonl`
  across 20 tools, focusing on "Built-in over Shell" preference. `replace`
  (12 scenarios) and `write_file` (10 scenarios) are weighted most heavily to
  enforce surgical editing.
- Extensible Validator: Implemented `scripts/validate-data.ts` to provide
  real-time integrity checks and purpose-driven coverage reports.
- Project Integration: Added `data:validate`, `data:format`, and `data:lint`
  scripts to package.json and updated ESLint config to cover the data directory.
2026-03-04 10:08:13 -08:00
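
A minimal sketch of what a JSONL validator with a per-tool coverage report might look like follows. Every field on `Scenario` and `ToolCall` here (`id`, `platform`, `prompt`, `expected`, `negatives`, `tool`, `args`) is an assumption for illustration, as is the `validateJsonl` function; the actual `data/schema.ts` and `scripts/validate-data.ts` are not shown in this log and may be structured differently.

```typescript
// Hypothetical schema; the real Scenario interface in data/schema.ts may differ.
type Platform = "unix" | "win32";

interface ToolCall {
  tool: string;
  args: { [key: string]: unknown };
}

interface Scenario {
  id: string;
  platform: Platform;
  prompt: string;
  expected: ToolCall;    // the tool call the model should make
  negatives: ToolCall[]; // known failure modes, e.g. falling back to a shell
}

interface ValidationReport {
  errors: string[];
  coverage: { [tool: string]: number }; // scenarios per expected tool
}

function isToolCall(value: any): value is ToolCall {
  return !!value && typeof value.tool === "string" &&
    typeof value.args === "object" && value.args !== null;
}

function isScenario(row: any): row is Scenario {
  return row !== null && typeof row === "object" &&
    typeof row.id === "string" &&
    (row.platform === "unix" || row.platform === "win32") &&
    typeof row.prompt === "string" &&
    isToolCall(row.expected) &&
    Array.isArray(row.negatives) && row.negatives.every(isToolCall);
}

// Checks each non-empty line of a JSONL document and counts scenarios per tool.
function validateJsonl(content: string): ValidationReport {
  const report: ValidationReport = { errors: [], coverage: {} };
  content.split("\n").forEach((line, i) => {
    if (line.trim() === "") return;
    let row: any;
    try {
      row = JSON.parse(line);
    } catch (e) {
      report.errors.push("line " + (i + 1) + ": invalid JSON");
      return;
    }
    if (!isScenario(row)) {
      report.errors.push("line " + (i + 1) + ": does not match Scenario");
      return;
    }
    const tool = row.expected.tool;
    report.coverage[tool] = (report.coverage[tool] || 0) + 1;
  });
  return report;
}
```

Reporting errors by line number keeps a failing record easy to locate in a large `.jsonl` file, and the coverage map makes uneven weighting across tools visible at a glance.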