gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-04-28 22:14:52 -07:00

Author	SHA1	Message	Date
Alisa Novikova	901e94cba8	chore(core): simplify agent mandates to improve efficiency and reduce turn count	2026-03-03 01:00:26 -08:00
Alisa Novikova	0b06a9ae04	feat(core): implement circular behavior detection mandate Adds a self-awareness mandate to the agent's planning phase: 'Before attempting a fix for a validation error, review your recent tool calls. If you are repeatedly applying similar regex replacements or edits to the same block of code without the validation error changing, you are in a loop. Stop, revert your changes to a known good state, and rethink your approach.' This helps the agent identify and break out of unproductive loops during debugging and implementation.	2026-03-03 00:50:59 -08:00
Alisa Novikova	5ede571439	feat(core): implement validation back-off mechanism Adds a strict retry threshold to the agent's validation loop: 'If validation fails 3 times on the exact same test or error, DO NOT attempt another minor code tweak. You must immediately step back, use search tools to gather wider context, and formulate a completely new strategy.' This prevents the agent from getting stuck in repetitive, unsuccessful minor tweaks and encourages a more strategic approach when initial fixes fail.	2026-03-03 00:50:59 -08:00
Alisa Novikova	1df5178800	feat(core): prioritize existing test infrastructure and timebox test setup Introduces three critical mandates to the agent's testing and validation workflow: 1. Prioritize Existing Infrastructure: Strictly prefer running the project's existing test suite over writing custom reproduction scripts to avoid environment/import difficulties. 2. Timebox Test Setup: Abandon custom reproduction scripts if they fail to set up within 3-5 turns due to environment or import errors, falling back to static analysis and built-in tests. 3. Mandate Exhaustive Validation: Explicitly requires running relevant existing project tests to prevent regressions, ensuring a passing custom reproduction script is treated as a necessary but not sufficient condition for completion. These changes prevent 'Early Exhaustion' by reducing the complexity of standalone test setup in frameworks like Django.	2026-03-03 00:50:59 -08:00
Alisa Novikova	c3215aed93	feat(core): implement self-validation workflow with prompt-verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% semantic and verbatim coverage of the original system prompt. By moving to an additive model, we preserve the original reasoning anchors (lead-ins, heuristics, and formatting) while injecting critical autonomous engineering mandates. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (combining manifests/logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation. Technical Verification: - Verbatim restoration verified against 66 core tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' validation successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	61b35ff745	feat(core): comprehensive agent self-validation and engineering mandates Major upgrade to the agent's self-validation, safety, and project integrity capabilities through five iterations of system prompt enhancements: Workflow & Quality Mandates: 1. Incremental Validation: Mandates building, linting, and testing after every significant file change to maintain a "green" state. 2. Mandatory Reproduction: Requires creating a failing test case to confirm a bug before fixing, and explicitly verifying the failure (Negative Verification). 3. Test Persistence & Locality: Requires integrating repro cases into the permanent test suite, preferably by amending existing related test files. 4. Script Discovery: Mandates identifying project-specific validation commands from configuration files (package.json, Makefile, etc.). 5. Self-Review: Mandates running `git diff` after every edit, using `--name-only` for large changes to preserve context window tokens. 6. Fast-Path Validation: Prioritizes lightweight checks (e.g., `tsc --noEmit`) for frequent feedback, reserving heavy builds for final verification. 7. Output Verification: Requires checking command output (not just exit codes) to prevent false-positives from empty test runs or hidden warnings. Semantic Integrity & Dependency Safety: 8. Global Usage Discovery: Mandates searching the entire workspace for all usages (via `grep_search`) before modifying exported symbols or APIs. 9. Dependency Integrity: Requires verifying that new imports are explicitly declared in the project's dependency manifest (e.g., package.json). 10. Configuration Sync: Mandates updating build/environment configs (tsconfig, Dockerfile, etc.) to support new file types or entry points. 11. Documentation Sync: Requires searching for and updating documentation references when public APIs or CLI interfaces change. 12. Anti-Silencing Mandate: Prohibits using `any`, `@ts-ignore`, or lint suppressions to resolve validation errors. Diagnostics, Safety & Runtime Verification: 13. Error Grounding: Mandates reading full error logs and stack traces upon failure. Includes Smart Log Navigation to prioritize the tail of large files. 14. Scope Isolation: Instructs the agent to focus only on errors introduced by its changes and ignore unrelated legacy technical debt. 15. Destructive Safety: Mandates a `git status` check before deleting files or modifying critical project configurations. 16. Non-Blocking Smoke Tests: Requires briefly running applications to verify boot stability, using background/timeout strategies for servers. Includes 15 new behavioral evaluations verifying these mandates and updated snapshots in packages/core/src/core/prompts.test.ts.	2026-03-03 00:50:59 -08:00
Sandy Tao	a153ff587b	refactor(core): Extract tool parameter names as constants (#20460)	2026-02-28 21:27:54 +00:00
Adib234	23905bcd77	fix(plan): prevent agent from using ask_user for shell command confirmation (#20504)	2026-02-27 17:51:47 +00:00
Adib234	25ade7bcb7	feat(plan): update planning workflow to encourage multi-select with descriptions of options (#20491)	2026-02-27 15:42:37 +00:00
Jerop Kipruto	aa98cafca7	feat(plan): adapt planning workflow based on complexity of task (#20465) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-26 22:58:19 +00:00
joshualitt	611d934829	feat(core): Enable generalist agent (#19665)	2026-02-26 16:38:49 +00:00
Sandy Tao	39938000a9	feat(core): rename grep_search include parameter to include_pattern (#20328)	2026-02-26 04:16:21 +00:00
Jerop Kipruto	baccda969d	feat(plan): summarize work after executing a plan (#19432)	2026-02-24 17:35:32 +00:00
Jerop Kipruto	182c858e67	feat(policy): centralize plan mode tool visibility in policy engine (#20178) Co-authored-by: Mahima Shanware <mshanware@google.com>	2026-02-24 17:17:43 +00:00
Adam Weidman	547f5d45f5	feat(core): migrate read_file to 1-based start_line/end_line parameters (#19526)	2026-02-20 22:59:18 +00:00
matt korwel	6cfd29ef9b	feat(plan): enforce read-only constraints in Plan Mode (#19433) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Jerop Kipruto <jerop@google.com>	2026-02-20 19:33:04 +00:00
joshualitt	87f5dd15d6	feat(core): experimental in-progress steering hints (2 of 2) (#19307)	2026-02-18 22:05:50 +00:00
Jerop Kipruto	8f6a711a3a	fix(core): clarify plan mode constraints and exit mechanism (#19438)	2026-02-18 20:09:59 +00:00
Christian Gunderman	ce84b3cb5f	Use ranged reads and limited searches and fuzzy editing improvements (#19240)	2026-02-17 23:54:08 +00:00
Adib234	14aabbbe8b	feat(plan): support project exploration without planning when in plan mode (#18992)	2026-02-17 16:52:59 +00:00
N. Taylor Mullen	39d36108d7	feat(core): support custom reasoning models by default (#19227)	2026-02-16 20:47:58 +00:00
N. Taylor Mullen	6eec9f3350	fix(core): Encourage non-interactive flags for scaffolding commands (#18804)	2026-02-15 20:26:59 +00:00
N. Taylor Mullen	27a1bae03b	feat(core): refine Plan Mode system prompt for agentic execution (#18799)	2026-02-12 17:37:47 +00:00
Christian Gunderman	6c1773170e	More grep prompt tweaks (#18846)	2026-02-11 21:55:27 +00:00
Christian Gunderman	2a08456ed0	Update prompt and grep tool definition to limit context size (#18780)	2026-02-11 19:20:51 +00:00
Jerop Kipruto	49d55d972e	feat(core): formalize 5-phase sequential planning workflow (#18759)	2026-02-11 03:02:20 +00:00
N. Taylor Mullen	cb4e1e684d	chore(core): update activate_skill prompt verbiage to be more direct (#18605)	2026-02-10 22:17:42 +00:00
Christian Gunderman	8b762111a8	Fix issue where Gemini CLI creates tests in a new file (#18409)	2026-02-10 20:53:29 +00:00
N. Taylor Mullen	55571de066	feat: redact disabled tools from system prompt (#13597) (#18613)	2026-02-10 19:00:36 +00:00
Jack Wotherspoon	740f0e4c3d	fix: allow `ask_user` tool in yolo mode (#18541)	2026-02-10 18:56:51 +00:00
N. Taylor Mullen	41bbe6ca0a	fix(core): standardize tool formatting in system prompts (#18615)	2026-02-10 15:30:08 +00:00
N. Taylor Mullen	2ae5e1ae20	feat(core): optimize sub-agents system prompt intro (#18608)	2026-02-10 08:25:42 +00:00
N. Taylor Mullen	92a5f725a1	refactor(core): refine Security & System Integrity section in system prompt (#18601)	2026-02-10 04:32:36 +00:00
joshualitt	89d4556c45	feat(core): Render memory hierarchically in context. (#18350)	2026-02-10 02:01:59 +00:00
N. Taylor Mullen	cc2798018b	feat: handle multiple dynamic context filenames in system prompt (#18598)	2026-02-10 00:37:08 +00:00
N. Taylor Mullen	aebc107d2c	feat: move shell efficiency guidelines to tool description (#18614)	2026-02-09 18:51:13 +00:00
N. Taylor Mullen	d45a45d565	chore: strengthen validation guidance in system prompt (#18544)	2026-02-09 05:32:46 +00:00
N. Taylor Mullen	cb73fbf384	feat(core): transition sub-agents to XML format and improve definitions (#18555)	2026-02-09 02:25:04 +00:00
N. Taylor Mullen	97a4e62dfa	feat(core): conditionally include ctrl+f prompt based on interactive shell setting (#18561)	2026-02-09 00:23:22 +00:00
N. Taylor Mullen	92012365ca	fix(core): correct escaped interpolation in system prompt (#18557)	2026-02-08 21:08:17 +00:00
N. Taylor Mullen	86bd7dbd4f	chore: remove feedback instruction from system prompt (#18560)	2026-02-08 02:22:50 +00:00
N. Taylor Mullen	eee95c509d	refactor(core): remove memory tool instructions from Gemini 3 prompt (#18559)	2026-02-08 01:57:53 +00:00
Jerop Kipruto	be6723ebcc	chore: remove redundant planning prompt from final shell (#18528)	2026-02-07 19:45:09 +00:00
N. Taylor Mullen	9178b31629	feat(core): overhaul system prompt for rigor, integrity, and intent alignment (#17263)	2026-02-07 03:13:07 +00:00
Jerop Kipruto	dc09b4988d	feat(plan): integrate planning artifacts and tools into primary workflows (#18375)	2026-02-05 20:07:33 +00:00
Jerop Kipruto	6860556afe	feat(plan): add guidance on iterating on approved plans vs creating new plans (#18346)	2026-02-05 19:11:45 +00:00
Jerop Kipruto	4a6e3eb646	feat(plan): support `replace` tool in plan mode to edit plans (#18379)	2026-02-05 17:51:35 +00:00
Tommaso Sciortino	e4c80e6382	fix: Windows Specific Agent Quality & System Prompt (#18351)	2026-02-05 17:50:12 +00:00
Christian Gunderman	a0b6602d09	Fix issue where agent gets stuck at interactive commands. (#18272)	2026-02-04 07:02:09 +00:00
Jerop Kipruto	d866e7e6e7	feat(plan): unify workflow location in system prompt to optimize caching (#18258)	2026-02-04 03:11:28 +00:00

56 Commits