gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-05-14 13:53:02 -07:00

Author	SHA1	Message	Date
Alisa Novikova	5ede571439	feat(core): implement validation back-off mechanism Adds a strict retry threshold to the agent's validation loop: 'If validation fails 3 times on the exact same test or error, DO NOT attempt another minor code tweak. You must immediately step back, use search tools to gather wider context, and formulate a completely new strategy.' This prevents the agent from getting stuck in repetitive, unsuccessful minor tweaks and encourages a more strategic approach when initial fixes fail.	2026-03-03 00:50:59 -08:00
Alisa Novikova	1df5178800	feat(core): prioritize existing test infrastructure and timebox test setup Introduces three critical mandates to the agent's testing and validation workflow: 1. Prioritize Existing Infrastructure: Strictly prefer running the project's existing test suite over writing custom reproduction scripts to avoid environment/import difficulties. 2. Timebox Test Setup: Abandon custom reproduction scripts if they fail to set up within 3-5 turns due to environment or import errors, falling back to static analysis and built-in tests. 3. Mandate Exhaustive Validation: Explicitly requires running relevant existing project tests to prevent regressions, ensuring a passing custom reproduction script is treated as a necessary but not sufficient condition for completion. These changes prevent 'Early Exhaustion' by reducing the complexity of standalone test setup in frameworks like Django.	2026-03-03 00:50:59 -08:00
Alisa Novikova	616062bdf6	feat(core): implement self-validation workflow with exact verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% verbatim coverage of the original system prompt text. By moving to an additive model, we preserve all original reasoning anchors, instructional lead-ins, and senior engineering heuristics while injecting critical autonomous mandates. Verbatim Restoration: - All 'Context Efficiency' guidelines, lead-ins, and scenarios (Search/Understand/Navigate). - All 'Engineering Standards' regarding style mimicry, abstractions, and debt isolation. - Full 'Primary Workflows' sequence and formatting. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (manifests + logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation (Tail-First). Technical Verification: - Verified against 67 core prompt tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	c3215aed93	feat(core): implement self-validation workflow with prompt-verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% semantic and verbatim coverage of the original system prompt. By moving to an additive model, we preserve the original reasoning anchors (lead-ins, heuristics, and formatting) while injecting critical autonomous engineering mandates. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (combining manifests/logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation. Technical Verification: - Verbatim restoration verified against 66 core tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' validation successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	61b35ff745	feat(core): comprehensive agent self-validation and engineering mandates Major upgrade to the agent's self-validation, safety, and project integrity capabilities through five iterations of system prompt enhancements: Workflow & Quality Mandates: 1. Incremental Validation: Mandates building, linting, and testing after every significant file change to maintain a "green" state. 2. Mandatory Reproduction: Requires creating a failing test case to confirm a bug before fixing, and explicitly verifying the failure (Negative Verification). 3. Test Persistence & Locality: Requires integrating repro cases into the permanent test suite, preferably by amending existing related test files. 4. Script Discovery: Mandates identifying project-specific validation commands from configuration files (package.json, Makefile, etc.). 5. Self-Review: Mandates running `git diff` after every edit, using `--name-only` for large changes to preserve context window tokens. 6. Fast-Path Validation: Prioritizes lightweight checks (e.g., `tsc --noEmit`) for frequent feedback, reserving heavy builds for final verification. 7. Output Verification: Requires checking command output (not just exit codes) to prevent false-positives from empty test runs or hidden warnings. Semantic Integrity & Dependency Safety: 8. Global Usage Discovery: Mandates searching the entire workspace for all usages (via `grep_search`) before modifying exported symbols or APIs. 9. Dependency Integrity: Requires verifying that new imports are explicitly declared in the project's dependency manifest (e.g., package.json). 10. Configuration Sync: Mandates updating build/environment configs (tsconfig, Dockerfile, etc.) to support new file types or entry points. 11. Documentation Sync: Requires searching for and updating documentation references when public APIs or CLI interfaces change. 12. Anti-Silencing Mandate: Prohibits using `any`, `@ts-ignore`, or lint suppressions to resolve validation errors. Diagnostics, Safety & Runtime Verification: 13. Error Grounding: Mandates reading full error logs and stack traces upon failure. Includes Smart Log Navigation to prioritize the tail of large files. 14. Scope Isolation: Instructs the agent to focus only on errors introduced by its changes and ignore unrelated legacy technical debt. 15. Destructive Safety: Mandates a `git status` check before deleting files or modifying critical project configurations. 16. Non-Blocking Smoke Tests: Requires briefly running applications to verify boot stability, using background/timeout strategies for servers. Includes 15 new behavioral evaluations verifying these mandates and updated snapshots in packages/core/src/core/prompts.test.ts.	2026-03-03 00:50:59 -08:00
Jacob Richman	8303edbb54	Code review fixes as a pr (#20612)	2026-03-03 04:32:50 +00:00
Aswin Ashok	0d69f9f7fa	Build binary (#18933) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-03-03 01:02:19 +00:00
Sandy Tao	2e7722d6a3	fix(core): restrict "System: Please continue" invalid stream retry to Gemini 2 models (#20897)	2026-03-02 23:21:13 +00:00
Yuna Seol	69e15a50d1	fix(core): skip telemetry logging for AbortError exceptions (#19477) Co-authored-by: Yuna Seol <yunaseol@google.com>	2026-03-02 23:14:31 +00:00
Adib234	01927a36d1	feat(plan): support annotating plans with feedback for iteration (#20876)	2026-03-02 23:03:59 +00:00
Shreya Keshive	06ddfa5c4c	feat(admin): enable 30 day default retention for chat history & remove warning (#20853)	2026-03-02 22:44:49 +00:00
Christian Gunderman	3f7ef816f1	fix(core): increase default headers timeout to 5 minutes (#20890)	2026-03-02 22:36:58 +00:00
Jerop Kipruto	d05ba11a31	refactor(core): replace manual syncPlanModeTools with declarative policy rules (#20596)	2026-03-02 22:30:50 +00:00
Allen Hutchison	bb6d1a2775	feat(core): add tool name validation in TOML policy files (#19281) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-02 21:47:21 +00:00
Nayana Parameswarappa	dd9ccc9807	Adding MCPOAuthProvider implementing the MCPSDK OAuthClientProvider (#20121)	2026-03-02 21:37:44 +00:00
Pyush Sinha	8133d63ac6	refactor(cli): fully remove React anti patterns, improve type safety and fix UX oversights in SettingsDialog.tsx (#18963) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-03-02 21:30:58 +00:00
Sandy Tao	18d0375a7f	feat(core): support authenticated A2A agent card discovery (#20622) Co-authored-by: Adam Weidman <adamfweidman@google.com> Co-authored-by: Adam Weidman <65992621+adamfweidman@users.noreply.github.com>	2026-03-02 21:29:31 +00:00
Keith Guerin	31ca57ec94	feat: redesign header to be compact with ASCII icon (#18713) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-03-02 21:12:17 +00:00
Abhi	b7a8f0d1f9	fix(core): ensure subagents use qualified MCP tool names (#20801)	2026-03-02 21:12:13 +00:00
Abdul Tawab	1502e5cbc3	style(cli) : Dialog pattern for /hooks Command (#17930)	2026-03-02 21:12:05 +00:00
Christian Gunderman	7ca3a33f8b	Subagent activity UX. (#17570)	2026-03-02 21:04:31 +00:00
Sandy Tao	ce5a2d0760	feat(core): truncate large MCP tool output (#19365)	2026-03-02 21:01:49 +00:00
David Pierce	3a7a6e1540	Add install as an option when extension is selected. (#20358)	2026-03-02 20:41:16 +00:00
Aishanee Shah	659301ff83	feat(core): centralize read_file limits and update gemini-3 description (#20619)	2026-03-02 20:11:58 +00:00
Sandy Tao	446a4316c4	feat(core): implement HTTP authentication support for A2A remote agents (#20510) Co-authored-by: Adam Weidman <adamfweidman@google.com>	2026-03-02 19:59:48 +00:00
Adib234	2e1efaebe4	fix(plan): deflake plan mode integration tests (#20477)	2026-03-02 19:51:44 +00:00
Sandy Tao	7c9fceba7f	fix(core): reduce LLM-based loop detection false positives (#20701)	2026-03-02 19:08:15 +00:00
Adam Weidman	740efa2ac2	Merge User and Agent Card Descriptions #20849 (#20850)	2026-03-02 17:59:29 +00:00
Abhi	703759cfae	fix(cli): allow sub-agent confirmation requests in UI while preventing background flicker (#20722)	2026-03-01 02:39:25 +00:00
Sehoon Shon	6757d4b5c5	fix(cli): resolve autoThemeSwitching when background hasn't changed but theme mismatches (#20706)	2026-02-28 23:22:10 +00:00
Sandy Tao	a153ff587b	refactor(core): Extract tool parameter names as constants (#20460)	2026-02-28 21:27:54 +00:00
N. Taylor Mullen	cd3a8c3f07	fix(cli): reset themeManager between tests to ensure isolation (#20598)	2026-02-28 19:45:31 +00:00
kartik	b2214a6676	fix: acp/zed race condition between MCP initialisation and prompt (#20205) Signed-off-by: Kartik Angiras <angiraskartik@gmail.com>	2026-02-28 17:33:08 +00:00
Jagjeevan Kashid	fae0639ba2	fix: use full paths for ACP diff payloads (#19539) Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>	2026-02-28 15:54:44 +00:00
gemini-cli-robot	fb6ff847dd	chore/release: bump version to 0.33.0-nightly.20260228.1ca5c05d0 (#20644)	2026-02-28 02:13:48 +00:00
Gal Zahavi	0c6c9c6a62	chore(release): bump version to 0.33.0-nightly.20260227.ba149afa0 (#20637)	2026-02-28 00:51:22 +00:00
Sehoon Shon	a1367e9cdd	fix(core): parse raw ASCII buffer strings in Gaxios errors (#20626)	2026-02-27 23:57:32 +00:00
nityam	ba149afa0b	fix: merge duplicate imports in a2a-server package (2/4) (#19781)	2026-02-27 21:13:30 +00:00
nityam	f6533c0250	fix: merge duplicate imports in sdk and test-utils packages (1/4) (#19777)	2026-02-27 21:13:15 +00:00
Abhi	966b9059d0	feat(core): enable contiguous parallel admission for Kind.Agent tools (#20583)	2026-02-27 21:08:10 +00:00
Spencer	20d884da2f	fix(core): reduce intrusive MCP errors and deduplicate diagnostics (#20232)	2026-02-27 20:04:36 +00:00
Dmitry Lyalin	7f8ce8657c	Add low/full CLI error verbosity mode for cleaner UI (#20399)	2026-02-27 19:15:10 +00:00
Jacob Richman	e00e8f4728	fix(cli): Shell autocomplete polish (#20411)	2026-02-27 19:03:37 +00:00
Abhi	c914fd0700	fix(cli): prevent sub-agent tool calls from leaking into UI (#20580)	2026-02-27 19:00:19 +00:00
Jerop Kipruto	5d24d6a9e1	fix(ui): persist expansion in AskUser dialog when navigating options (#20559)	2026-02-27 18:30:16 +00:00
Gaurav	ea48bd9414	feat: better error messages (#20577) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-27 18:18:16 +00:00
Gaurav	b2d6844f9b	feat(billing): implement G1 AI credits overage flow with billing telemetry (#18590)	2026-02-27 18:15:06 +00:00
Sehoon Shon	fdd844b405	fix(core): disable retries for code assist streaming requests (#20561)	2026-02-27 18:04:43 +00:00
Adib234	23905bcd77	fix(plan): prevent agent from using ask_user for shell command confirmation (#20504)	2026-02-27 17:51:47 +00:00
Dev Randalpura	ec39aa17c2	Moved markdown parsing logic to a separate util file (#20526)	2026-02-27 17:43:18 +00:00

3840 Commits