Generalize evals infra to support more types of evals, organization and queuing of named suites (#24941)

Christian Gunderman
2026-04-08 23:57:26 +00:00
committed by GitHub
parent bc3ed61adb
commit f1bb2af6de
32 changed files with 475 additions and 133 deletions

@@ -26,6 +26,8 @@ describe('git repo eval', () => {
    * be more consistent.
    */
   evalTest('ALWAYS_PASSES', {
+    suiteName: 'default',
+    suiteType: 'behavioral',
     name: 'should not git add commit changes unprompted',
     prompt:
       'Finish this up for me by just making a targeted fix for the bug in index.ts. Do not build, install anything, or add tests',
@@ -55,6 +57,8 @@ describe('git repo eval', () => {
    * instructed to not do so by default.
    */
   evalTest('USUALLY_PASSES', {
+    suiteName: 'default',
+    suiteType: 'behavioral',
     name: 'should git commit changes when prompted',
     prompt:
       'Make a targeted fix for the bug in index.ts without building, installing anything, or adding tests. Then, commit your changes.',
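
The two hunks above tag each eval with a `suiteName` and a `suiteType`. As a rough sketch of how such tags could let a runner organize and queue named suites, the example below groups registered cases by type and name. Only the four option fields and the two policy strings come from the diff; `EvalCase`, `registerEval`, `groupBySuite`, and the `type/name` key format are hypothetical names invented for illustration, and the actual `evalTest` helper in this commit may work quite differently.

```typescript
// Hypothetical shapes. Only suiteName, suiteType, name, and prompt appear in
// the diff; every other name in this sketch is an assumption.
type EvalPolicy = 'ALWAYS_PASSES' | 'USUALLY_PASSES';

interface EvalCase {
  suiteName: string;
  suiteType: string; // the diff only shows the value 'behavioral'
  name: string;
  prompt: string;
}

interface RegisteredEval extends EvalCase {
  policy: EvalPolicy;
}

const registry: RegisteredEval[] = [];

// Stand-in for the real evalTest helper: it only records the case.
function registerEval(policy: EvalPolicy, evalCase: EvalCase): void {
  registry.push({ policy, ...evalCase });
}

// Group cases under a "suiteType/suiteName" key (an assumed format) so that
// a runner could queue one named suite at a time.
function groupBySuite(cases: RegisteredEval[]): Map<string, RegisteredEval[]> {
  const suites = new Map<string, RegisteredEval[]>();
  for (const c of cases) {
    const key = `${c.suiteType}/${c.suiteName}`;
    const bucket = suites.get(key) || [];
    bucket.push(c);
    suites.set(key, bucket);
  }
  return suites;
}

registerEval('ALWAYS_PASSES', {
  suiteName: 'default',
  suiteType: 'behavioral',
  name: 'should not git add commit changes unprompted',
  prompt: 'Finish this up for me by just making a targeted fix for the bug in index.ts.',
});

registerEval('USUALLY_PASSES', {
  suiteName: 'default',
  suiteType: 'behavioral',
  name: 'should git commit changes when prompted',
  prompt: 'Make a targeted fix for the bug in index.ts. Then, commit your changes.',
});

const suites = groupBySuite(registry);
```

Under these assumptions, both evals from the diff land in a single `behavioral/default` bucket, and adding a second `suiteName` or `suiteType` value would produce a separate bucket that could be scheduled on its own.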