gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-07-14 20:10:36 -07:00

Author	SHA1	Message	Date
Taylor Mullen	8d27a29053	fix(core): enforce non-interactive flags for scaffolding commands - Updated `newApplicationSteps` system prompt instructions to strongly require non-interactive flags (e.g. `--yes`, `-y`, or `--template`) when executing application scaffolding CLI tools. - Warned the model that omitting these flags for interactive tools will cause the environment to hang. - Added `ALWAYS_PASSES` evaluation case to `evals/interactive-hang.eval.ts` to assert that non-interactive flags are successfully provided for `npm create` and similar commands.	2026-02-15 11:49:41 -08:00
Christian Gunderman	6c1773170e	More grep prompt tweaks (#18846)	2026-02-11 21:55:27 +00:00
Christian Gunderman	2a08456ed0	Update prompt and grep tool definition to limit context size (#18780)	2026-02-11 19:20:51 +00:00
Jerop Kipruto	9c11ff2d58	test(evals): mark all `save_memory` evals as `USUALLY_PASSES` due to unreliability (#18786)	2026-02-11 02:16:52 +00:00
Abhijit Balaji	b3ecac7086	fix(evals): prevent false positive in hierarchical memory test (#18777)	2026-02-11 01:51:05 +00:00
Christian Gunderman	8b762111a8	Fix issue where Gemini CLI creates tests in a new file (#18409)	2026-02-10 20:53:29 +00:00
Keith Guerin	5920750c24	ui: update & subdue footer colors and animate progress indicator (#18570)	2026-02-10 17:36:20 +00:00
N. Taylor Mullen	67d9b76e81	test(core): remove hardcoded model from TestRig (#18710)	2026-02-10 07:54:23 +00:00
joshualitt	89d4556c45	feat(core): Render memory hierarchically in context. (#18350)	2026-02-10 02:01:59 +00:00
N. Taylor Mullen	aebc107d2c	feat: move shell efficiency guidelines to tool description (#18614)	2026-02-09 18:51:13 +00:00
N. Taylor Mullen	da66c7c0d1	chore(evals): update validation_fidelity_pre_existing_errors to USUALLY_PASSES (#18617)	2026-02-09 01:31:22 -08:00
N. Taylor Mullen	fe70052baf	fix(evals): update save_memory evals and simplify tool description (#18610)	2026-02-09 01:06:03 -08:00
N. Taylor Mullen	d45a45d565	chore: strengthen validation guidance in system prompt (#18544)	2026-02-09 05:32:46 +00:00
Sandy Tao	7409ce5df6	feat(cli): add WebSocket-based network logging and streaming chunk support (#18383)	2026-02-07 00:20:22 +00:00
Jerop Kipruto	601f0606da	feat(plan): add positive test case and update eval stability policy (#18457)	2026-02-06 19:45:22 +00:00
Jerop Kipruto	1d70aa5c1b	feat(plan): add behavioral evals for plan mode (#18437)	2026-02-06 16:51:12 +00:00
Alisa	5b9ea35b63	Improving memory tool instructions and eval testing (#18091)	2026-02-05 18:07:47 +00:00
Christian Gunderman	a0b6602d09	Fix issue where agent gets stuck at interactive commands. (#18272)	2026-02-04 07:02:09 +00:00
Christian Gunderman	ed02b94570	Encourage agent to utilize ecosystem tools to perform work (#17881)	2026-02-04 02:02:25 +00:00
Coco Sheng	3183e4137a	fix(test): improve test isolation and enable subagent evaluations (#18138)	2026-02-03 19:05:26 +00:00
Christian Gunderman	bc258eba4c	Cleanup post delegate_to_agent removal (#17875)	2026-01-29 18:24:35 +00:00
Sandy Tao	0b169e9867	fix(evals): use absolute path for activity log directory (#17830)	2026-01-28 15:20:21 -08:00
Sandy Tao	9e09db1ddb	feat(cli): enable activity logging for non-interactive mode and evals (#17703)	2026-01-28 17:02:41 +00:00
Christian Gunderman	b6cf189ab2	Fix issue where Gemini CLI can make changes when simply asked a question (#17608)	2026-01-27 19:47:33 +00:00
Christian Gunderman	5cf06503c8	Slash command for helping in debugging (#17609)	2026-01-27 02:47:04 +00:00
Sehoon Shon	5c649d8db1	feat(ui): display user tier in about command (#17400)	2026-01-23 21:03:53 +00:00
Christian Gunderman	2c6781d134	Refactor subagent delegation to be one tool per agent (#17346)	2026-01-23 02:18:31 +00:00
Jerop Kipruto	c21c297133	feat(plan): refactor TestRig and eval helper to support configurable approval modes (#17171)	2026-01-21 15:43:48 +00:00
joshualitt	f42b4c80ac	feat(core): Add initial eval for generalist agent. (#16856)	2026-01-20 20:32:48 +00:00
Christian Gunderman	12b0fe1cc2	Demote the subagent test to nightly (#17105)	2026-01-20 18:18:16 +00:00
Christian Gunderman	49769152d6	Demote git evals to nightly run. (#17030)	2026-01-19 19:00:41 +00:00
Christian Gunderman	d87a3acdef	Fix inverted logic. (#17007)	2026-01-19 07:16:05 +00:00
Christian Gunderman	203f5209ba	Stabilize the git evals (#16989)	2026-01-19 06:18:06 +00:00
Christian Gunderman	e03042657b	Don't commit unless user asks us to. (#16902)	2026-01-17 01:00:46 +00:00
Christian Gunderman	a15978593a	Steer outer agent to use expert subagents when present (#16763)	2026-01-16 16:51:10 +00:00
Christian Gunderman	66e7b479ae	Aggregate test results. (#16581)	2026-01-14 07:08:05 +00:00
Christian Gunderman	8030404b08	Behavioral evals framework. (#16047)	2026-01-14 04:49:17 +00:00

37 Commits