Commit Graph

7 Commits

Author SHA1 Message Date
Christian Gunderman
f1bb2af6de Generalize evals infra to support more types of evals, organization and queuing of named suites (#24941) 2026-04-08 23:57:26 +00:00
Abhi
d9d2ce36f2 test(evals): add comprehensive subagent delegation evaluations (#24132) 2026-03-29 23:13:50 +00:00
Samee Zahid
84f40768a1 feat(evals): centralize test agents into test-utils for reuse (#23616) 2026-03-24 19:50:48 +00:00
Co-authored-by: Samee Zahid <sameez@google.com>
Samee Zahid
57a66f5f0d feat(evals): add behavioral evaluations for subagent routing (#23272) 2026-03-24 01:19:21 +00:00
Co-authored-by: Samee Zahid <sameez@google.com>
Christian Gunderman
2c6781d134 Refactor subagent delegation to be one tool per agent (#17346) 2026-01-23 02:18:31 +00:00
Christian Gunderman
12b0fe1cc2 Demote the subagent test to nightly (#17105) 2026-01-20 18:18:16 +00:00
Christian Gunderman
a15978593a Steer outer agent to use expert subagents when present (#16763) 2026-01-16 16:51:10 +00:00