Commit Graph

3 Commits

Author | SHA1 | Message | Date
Christian Gunderman | d7f6d21c10 | Generalize evals infra to support more types of evals, organization and queuing of named suites (#24941) | 2026-04-08 23:57:26 +00:00
Christian Gunderman | 355bd0bfbc | Demote unreliable test. (#20571) | 2026-02-27 16:48:46 +00:00
N. Taylor Mullen | e55b07fcc0 | chore: strengthen validation guidance in system prompt (#18544) | 2026-02-09 05:32:46 +00:00