gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-04-25 12:34:38 -07:00

Author	SHA1	Message	Date
Alisa Novikova	901e94cba8	chore(core): simplify agent mandates to improve efficiency and reduce turn count	2026-03-03 01:00:26 -08:00
Alisa Novikova	adc62a76e0	test(core): update snapshots for new agent behavior mandates Updates core prompt snapshots to include: - Priority for existing test infrastructure - Timeboxed test setup (3-5 turn limit) - Mandate for exhaustive validation against regressions - Validation back-off mechanism (retry threshold) - Detection of circular/looping behavior	2026-03-03 00:50:59 -08:00
Alisa Novikova	0b06a9ae04	feat(core): implement circular behavior detection mandate Adds a self-awareness mandate to the agent's planning phase: 'Before attempting a fix for a validation error, review your recent tool calls. If you are repeatedly applying similar regex replacements or edits to the same block of code without the validation error changing, you are in a loop. Stop, revert your changes to a known good state, and rethink your approach.' This helps the agent identify and break out of unproductive loops during debugging and implementation.	2026-03-03 00:50:59 -08:00
Alisa Novikova	5ede571439	feat(core): implement validation back-off mechanism Adds a strict retry threshold to the agent's validation loop: 'If validation fails 3 times on the exact same test or error, DO NOT attempt another minor code tweak. You must immediately step back, use search tools to gather wider context, and formulate a completely new strategy.' This prevents the agent from getting stuck in repetitive, unsuccessful minor tweaks and encourages a more strategic approach when initial fixes fail.	2026-03-03 00:50:59 -08:00
Alisa Novikova	1df5178800	feat(core): prioritize existing test infrastructure and timebox test setup Introduces three critical mandates to the agent's testing and validation workflow: 1. Prioritize Existing Infrastructure: Strictly prefer running the project's existing test suite over writing custom reproduction scripts to avoid environment/import difficulties. 2. Timebox Test Setup: Abandon custom reproduction scripts if they fail to set up within 3-5 turns due to environment or import errors, falling back to static analysis and built-in tests. 3. Mandate Exhaustive Validation: Explicitly requires running relevant existing project tests to prevent regressions, ensuring a passing custom reproduction script is treated as a necessary but not sufficient condition for completion. These changes prevent 'Early Exhaustion' by reducing the complexity of standalone test setup in frameworks like Django.	2026-03-03 00:50:59 -08:00
Alisa Novikova	616062bdf6	feat(core): implement self-validation workflow with exact verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% verbatim coverage of the original system prompt text. By moving to an additive model, we preserve all original reasoning anchors, instructional lead-ins, and senior engineering heuristics while injecting critical autonomous mandates. Verbatim Restoration: - All 'Context Efficiency' guidelines, lead-ins, and scenarios (Search/Understand/Navigate). - All 'Engineering Standards' regarding style mimicry, abstractions, and debt isolation. - Full 'Primary Workflows' sequence and formatting. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (manifests + logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation (Tail-First). Technical Verification: - Verified against 67 core prompt tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	c3215aed93	feat(core): implement self-validation workflow with prompt-verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% semantic and verbatim coverage of the original system prompt. By moving to an additive model, we preserve the original reasoning anchors (lead-ins, heuristics, and formatting) while injecting critical autonomous engineering mandates. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (combining manifests/logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation. Technical Verification: - Verbatim restoration verified against 66 core tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' validation successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	61b35ff745	feat(core): comprehensive agent self-validation and engineering mandates Major upgrade to the agent's self-validation, safety, and project integrity capabilities through five iterations of system prompt enhancements: Workflow & Quality Mandates: 1. Incremental Validation: Mandates building, linting, and testing after every significant file change to maintain a "green" state. 2. Mandatory Reproduction: Requires creating a failing test case to confirm a bug before fixing, and explicitly verifying the failure (Negative Verification). 3. Test Persistence & Locality: Requires integrating repro cases into the permanent test suite, preferably by amending existing related test files. 4. Script Discovery: Mandates identifying project-specific validation commands from configuration files (package.json, Makefile, etc.). 5. Self-Review: Mandates running `git diff` after every edit, using `--name-only` for large changes to preserve context window tokens. 6. Fast-Path Validation: Prioritizes lightweight checks (e.g., `tsc --noEmit`) for frequent feedback, reserving heavy builds for final verification. 7. Output Verification: Requires checking command output (not just exit codes) to prevent false-positives from empty test runs or hidden warnings. Semantic Integrity & Dependency Safety: 8. Global Usage Discovery: Mandates searching the entire workspace for all usages (via `grep_search`) before modifying exported symbols or APIs. 9. Dependency Integrity: Requires verifying that new imports are explicitly declared in the project's dependency manifest (e.g., package.json). 10. Configuration Sync: Mandates updating build/environment configs (tsconfig, Dockerfile, etc.) to support new file types or entry points. 11. Documentation Sync: Requires searching for and updating documentation references when public APIs or CLI interfaces change. 12. Anti-Silencing Mandate: Prohibits using `any`, `@ts-ignore`, or lint suppressions to resolve validation errors. Diagnostics, Safety & Runtime Verification: 13. Error Grounding: Mandates reading full error logs and stack traces upon failure. Includes Smart Log Navigation to prioritize the tail of large files. 14. Scope Isolation: Instructs the agent to focus only on errors introduced by its changes and ignore unrelated legacy technical debt. 15. Destructive Safety: Mandates a `git status` check before deleting files or modifying critical project configurations. 16. Non-Blocking Smoke Tests: Requires briefly running applications to verify boot stability, using background/timeout strategies for servers. Includes 15 new behavioral evaluations verifying these mandates and updated snapshots in packages/core/src/core/prompts.test.ts.	2026-03-03 00:50:59 -08:00
Bryan Morgan	208291f391	fix(ci): handle empty APP_ID in stale PR closer (#20919)	2026-03-03 00:14:36 -05:00
Jacob Richman	8303edbb54	Code review fixes as a pr (#20612)	2026-03-03 04:32:50 +00:00
Aswin Ashok	0d69f9f7fa	Build binary (#18933) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-03-03 01:02:19 +00:00
Christian Gunderman	46231a1755	ci(evals): only run evals in CI if prompts or tools changed (#20898) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-03 00:29:31 +00:00
Sandy Tao	2e7722d6a3	fix(core): restrict "System: Please continue" invalid stream retry to Gemini 2 models (#20897)	2026-03-02 23:21:13 +00:00
Yuna Seol	69e15a50d1	fix(core): skip telemetry logging for AbortError exceptions (#19477) Co-authored-by: Yuna Seol <yunaseol@google.com>	2026-03-02 23:14:31 +00:00
Christian Gunderman	25f59a0099	Add some dos and don'ts to behavioral evals README. (#20629)	2026-03-02 23:14:00 +00:00
Adib234	01927a36d1	feat(plan): support annotating plans with feedback for iteration (#20876)	2026-03-02 23:03:59 +00:00
Shreya Keshive	06ddfa5c4c	feat(admin): enable 30 day default retention for chat history & remove warning (#20853)	2026-03-02 22:44:49 +00:00
Christian Gunderman	3f7ef816f1	fix(core): increase default headers timeout to 5 minutes (#20890)	2026-03-02 22:36:58 +00:00
Jerop Kipruto	d05ba11a31	refactor(core): replace manual syncPlanModeTools with declarative policy rules (#20596)	2026-03-02 22:30:50 +00:00
Hamdanbinhashim	e43b1cff58	docs: fix broken markdown links in main README.md (#20300)	2026-03-02 21:51:52 +00:00
Allen Hutchison	bb6d1a2775	feat(core): add tool name validation in TOML policy files (#19281) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-02 21:47:21 +00:00
Nayana Parameswarappa	dd9ccc9807	Adding MCPOAuthProvider implementing the MCPSDK OAuthClientProvider (#20121)	2026-03-02 21:37:44 +00:00
Pyush Sinha	8133d63ac6	refactor(cli): fully remove React anti patterns, improve type safety and fix UX oversights in SettingsDialog.tsx (#18963) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-03-02 21:30:58 +00:00
Sandy Tao	18d0375a7f	feat(core): support authenticated A2A agent card discovery (#20622) Co-authored-by: Adam Weidman <adamfweidman@google.com> Co-authored-by: Adam Weidman <65992621+adamfweidman@users.noreply.github.com>	2026-03-02 21:29:31 +00:00
Keith Guerin	31ca57ec94	feat: redesign header to be compact with ASCII icon (#18713) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-03-02 21:12:17 +00:00
Abhi	b7a8f0d1f9	fix(core): ensure subagents use qualified MCP tool names (#20801)	2026-03-02 21:12:13 +00:00
Abdul Tawab	1502e5cbc3	style(cli): Dialog pattern for /hooks Command (#17930)	2026-03-02 21:12:05 +00:00
Christian Gunderman	7ca3a33f8b	Subagent activity UX. (#17570)	2026-03-02 21:04:31 +00:00
Sandy Tao	ce5a2d0760	feat(core): truncate large MCP tool output (#19365)	2026-03-02 21:01:49 +00:00
Sam Roberts	aa321b3d8c	Update CODEOWNERS for README.md reviewers (#20860)	2026-03-02 20:54:05 +00:00
David Pierce	3a7a6e1540	Add install as an option when extension is selected. (#20358)	2026-03-02 20:41:16 +00:00
Tommaso Sciortino	66530e44c8	document node limitation for shift+tab (#20877)	2026-03-02 20:31:52 +00:00
Christian Gunderman	b034dcd412	Do not block CI on evals (#20870)	2026-03-02 20:31:02 +00:00
Aishanee Shah	659301ff83	feat(core): centralize read_file limits and update gemini-3 description (#20619)	2026-03-02 20:11:58 +00:00
Sandy Tao	446a4316c4	feat(core): implement HTTP authentication support for A2A remote agents (#20510) Co-authored-by: Adam Weidman <adamfweidman@google.com>	2026-03-02 19:59:48 +00:00
Tommaso Sciortino	48412a068e	Add /unassign support (#20864) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-03-02 19:54:26 +00:00
Adib234	2e1efaebe4	fix(plan): deflake plan mode integration tests (#20477)	2026-03-02 19:51:44 +00:00
Sandy Tao	7c9fceba7f	fix(core): reduce LLM-based loop detection false positives (#20701)	2026-03-02 19:08:15 +00:00
Adam Weidman	740efa2ac2	Merge User and Agent Card Descriptions #20849 (#20850)	2026-03-02 17:59:29 +00:00
Abhi	703759cfae	fix(cli): allow sub-agent confirmation requests in UI while preventing background flicker (#20722)	2026-03-01 02:39:25 +00:00
Sehoon Shon	0063581e47	feat(skills): add github-issue-creator skill (#20709)	2026-02-28 23:22:22 +00:00
Sehoon Shon	6757d4b5c5	fix(cli): resolve autoThemeSwitching when background hasn't changed but theme mismatches (#20706)	2026-02-28 23:22:10 +00:00
Sandy Tao	a153ff587b	refactor(core): Extract tool parameter names as constants (#20460)	2026-02-28 21:27:54 +00:00
N. Taylor Mullen	cd3a8c3f07	fix(cli): reset themeManager between tests to ensure isolation (#20598)	2026-02-28 19:45:31 +00:00
kartik	b2214a6676	fix: acp/zed race condition between MCP initialisation and prompt (#20205) Signed-off-by: Kartik Angiras <angiraskartik@gmail.com>	2026-02-28 17:33:08 +00:00
gemini-cli-robot	6c65a2d813	Changelog for v0.32.0-preview.0 (#20627) Co-authored-by: gemini-cli-robot <224641728+gemini-cli-robot@users.noreply.github.com>	2026-02-28 16:03:50 +00:00
Jagjeevan Kashid	fae0639ba2	fix: use full paths for ACP diff payloads (#19539) Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>	2026-02-28 15:54:44 +00:00
gemini-cli-robot	76f70d65ff	Changelog for v0.31.0 (#20634) Co-authored-by: gemini-cli-robot <224641728+gemini-cli-robot@users.noreply.github.com>	2026-02-28 03:45:07 +00:00
gemini-cli-robot	fb6ff847dd	chore/release: bump version to 0.33.0-nightly.20260228.1ca5c05d0 (#20644)	2026-02-28 02:13:48 +00:00
Gal Zahavi	1ca5c05d0d	fix(github): use robot PAT for automated PRs to pass CLA check (#20641)	2026-02-28 01:13:58 +00:00
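
Note on commit 5ede571439 (validation back-off): in the commit, the retry threshold is a prompt instruction to the agent, not code. Purely as an illustration, the same rule could be sketched as a small counter. Everything below (the ValidationBackoff class, its method names, and the "retry"/"rethink" labels) is hypothetical and is not taken from the gemini-cli codebase; only the threshold of 3 failures on the same test or error comes from the commit message.

```typescript
// Hypothetical sketch of the "3 failures on the same test or error" rule.
// None of these names exist in gemini-cli; they are for illustration only.
type NextStep = "retry" | "rethink";

class ValidationBackoff {
  private failures = new Map<string, number>();
  private threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record one failed validation, keyed by the exact test name or error text.
  // Returns "rethink" once the same key has failed `threshold` times.
  recordFailure(key: string): NextStep {
    const count = (this.failures.get(key) ?? 0) + 1;
    this.failures.set(key, count);
    return count >= this.threshold ? "rethink" : "retry";
  }

  // Forget a key once its test passes or the error message changes.
  reset(key: string): void {
    this.failures.delete(key);
  }
}

const backoff = new ValidationBackoff();
console.log(backoff.recordFailure("test_login")); // "retry"
console.log(backoff.recordFailure("test_login")); // "retry"
console.log(backoff.recordFailure("test_login")); // "rethink"
```

Keying the counter on the exact test or error string mirrors the commit's wording ("the exact same test or error"): a different failure starts its own count, so only repeated identical failures trigger the step back.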

4982 Commits