Christian Gunderman | d7f6d21c10 | Generalize evals infra to support more types of evals, organization and queuing of named suites (#24941) | 2026-04-08 23:57:26 +00:00
Alisa | aa178f547c | feat(evals): add reliability harvester and 500/503 retry support (#23626) | 2026-03-26 01:48:45 +00:00
Christian Gunderman | 7929516f1a | Retry evals on API error. (#23322) | 2026-03-21 02:52:19 +00:00
joshualitt | e713e7d288 | feat(core): experimental in-progress steering hints (1 of 3) (#19008) | 2026-02-17 22:59:33 +00:00
Christian Gunderman | f60b442d39 | Aggregate test results. (#16581) | 2026-01-14 07:08:05 +00:00
Christian Gunderman | b82c66b2d8 | Behavioral evals framework. (#16047) | 2026-01-14 04:49:17 +00:00