gemini-cli

mirror of https://github.com/google-gemini/gemini-cli.git synced 2026-06-20 00:06:49 -07:00

Author	SHA1	Message	Date
Alisa Novikova	901e94cba8	chore(core): simplify agent mandates to improve efficiency and reduce turn count	2026-03-03 01:00:26 -08:00
Alisa Novikova	adc62a76e0	test(core): update snapshots for new agent behavior mandates Updates core prompt snapshots to include: - Priority for existing test infrastructure - Timeboxed test setup (3-5 turn limit) - Mandate for exhaustive validation against regressions - Validation back-off mechanism (retry threshold) - Detection of circular/looping behavior	2026-03-03 00:50:59 -08:00
Alisa Novikova	0b06a9ae04	feat(core): implement circular behavior detection mandate Adds a self-awareness mandate to the agent's planning phase: 'Before attempting a fix for a validation error, review your recent tool calls. If you are repeatedly applying similar regex replacements or edits to the same block of code without the validation error changing, you are in a loop. Stop, revert your changes to a known good state, and rethink your approach.' This helps the agent identify and break out of unproductive loops during debugging and implementation.	2026-03-03 00:50:59 -08:00
Alisa Novikova	5ede571439	feat(core): implement validation back-off mechanism Adds a strict retry threshold to the agent's validation loop: 'If validation fails 3 times on the exact same test or error, DO NOT attempt another minor code tweak. You must immediately step back, use search tools to gather wider context, and formulate a completely new strategy.' This prevents the agent from getting stuck in repetitive, unsuccessful minor tweaks and encourages a more strategic approach when initial fixes fail.	2026-03-03 00:50:59 -08:00
Alisa Novikova	1df5178800	feat(core): prioritize existing test infrastructure and timebox test setup Introduces three critical mandates to the agent's testing and validation workflow: 1. Prioritize Existing Infrastructure: Strictly prefer running the project's existing test suite over writing custom reproduction scripts to avoid environment/import difficulties. 2. Timebox Test Setup: Abandon custom reproduction scripts if they fail to set up within 3-5 turns due to environment or import errors, falling back to static analysis and built-in tests. 3. Mandate Exhaustive Validation: Explicitly requires running relevant existing project tests to prevent regressions, ensuring a passing custom reproduction script is treated as a necessary but not sufficient condition for completion. These changes prevent 'Early Exhaustion' by reducing the complexity of standalone test setup in frameworks like Django.	2026-03-03 00:50:59 -08:00
Alisa Novikova	616062bdf6	feat(core): implement self-validation workflow with exact verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% verbatim coverage of the original system prompt text. By moving to an additive model, we preserve all original reasoning anchors, instructional lead-ins, and senior engineering heuristics while injecting critical autonomous mandates. Verbatim Restoration: - All 'Context Efficiency' guidelines, lead-ins, and scenarios (Search/Understand/Navigate). - All 'Engineering Standards' regarding style mimicry, abstractions, and debt isolation. - Full 'Primary Workflows' sequence and formatting. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (manifests + logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation (Tail-First). Technical Verification: - Verified against 67 core prompt tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	c3215aed93	feat(core): implement self-validation workflow with prompt-verbatim restoration This commit upgrades the agent with a robust self-validation workflow while ensuring 100% semantic and verbatim coverage of the original system prompt. By moving to an additive model, we preserve the original reasoning anchors (lead-ins, heuristics, and formatting) while injecting critical autonomous engineering mandates. Self-Validation Workflow Injections: - Research Phase: Parallel Discovery (combining manifests/logic) and High-Signal Grep. - Bug Fixing: Negative Verification (confirming repro failure) and Coverage Expansion. - Implementation: Transactional Edits (logical batching of module changes). - Validation Loop: Tiered Validation (Fixers -> Fast-Path -> Related Tests) and Smart Log Navigation. Technical Verification: - Verbatim restoration verified against 66 core tests and 14 snapshots. - New behavioral eval suite passed (evals/self_validation_workflow.eval.ts). - Full 'npm run preflight' validation successful.	2026-03-03 00:50:59 -08:00
Alisa Novikova	61b35ff745	feat(core): comprehensive agent self-validation and engineering mandates Major upgrade to the agent's self-validation, safety, and project integrity capabilities through five iterations of system prompt enhancements: Workflow & Quality Mandates: 1. Incremental Validation: Mandates building, linting, and testing after every significant file change to maintain a "green" state. 2. Mandatory Reproduction: Requires creating a failing test case to confirm a bug before fixing, and explicitly verifying the failure (Negative Verification). 3. Test Persistence & Locality: Requires integrating repro cases into the permanent test suite, preferably by amending existing related test files. 4. Script Discovery: Mandates identifying project-specific validation commands from configuration files (package.json, Makefile, etc.). 5. Self-Review: Mandates running `git diff` after every edit, using `--name-only` for large changes to preserve context window tokens. 6. Fast-Path Validation: Prioritizes lightweight checks (e.g., `tsc --noEmit`) for frequent feedback, reserving heavy builds for final verification. 7. Output Verification: Requires checking command output (not just exit codes) to prevent false-positives from empty test runs or hidden warnings. Semantic Integrity & Dependency Safety: 8. Global Usage Discovery: Mandates searching the entire workspace for all usages (via `grep_search`) before modifying exported symbols or APIs. 9. Dependency Integrity: Requires verifying that new imports are explicitly declared in the project's dependency manifest (e.g., package.json). 10. Configuration Sync: Mandates updating build/environment configs (tsconfig, Dockerfile, etc.) to support new file types or entry points. 11. Documentation Sync: Requires searching for and updating documentation references when public APIs or CLI interfaces change. 12. Anti-Silencing Mandate: Prohibits using `any`, `@ts-ignore`, or lint suppressions to resolve validation errors. Diagnostics, Safety & Runtime Verification: 13. Error Grounding: Mandates reading full error logs and stack traces upon failure. Includes Smart Log Navigation to prioritize the tail of large files. 14. Scope Isolation: Instructs the agent to focus only on errors introduced by its changes and ignore unrelated legacy technical debt. 15. Destructive Safety: Mandates a `git status` check before deleting files or modifying critical project configurations. 16. Non-Blocking Smoke Tests: Requires briefly running applications to verify boot stability, using background/timeout strategies for servers. Includes 15 new behavioral evaluations verifying these mandates and updated snapshots in packages/core/src/core/prompts.test.ts.	2026-03-03 00:50:59 -08:00
Aswin Ashok	0d69f9f7fa	Build binary (#18933) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-03-03 01:02:19 +00:00
Sandy Tao	2e7722d6a3	fix(core): restrict "System: Please continue" invalid stream retry to Gemini 2 models (#20897)	2026-03-02 23:21:13 +00:00
Yuna Seol	69e15a50d1	fix(core): skip telemetry logging for AbortError exceptions (#19477) Co-authored-by: Yuna Seol <yunaseol@google.com>	2026-03-02 23:14:31 +00:00
Christian Gunderman	3f7ef816f1	fix(core): increase default headers timeout to 5 minutes (#20890)	2026-03-02 22:36:58 +00:00
Jerop Kipruto	d05ba11a31	refactor(core): replace manual syncPlanModeTools with declarative policy rules (#20596)	2026-03-02 22:30:50 +00:00
Allen Hutchison	bb6d1a2775	feat(core): add tool name validation in TOML policy files (#19281) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-02 21:47:21 +00:00
Nayana Parameswarappa	dd9ccc9807	Adding MCPOAuthProvider implementing the MCPSDK OAuthClientProvider (#20121)	2026-03-02 21:37:44 +00:00
Sandy Tao	18d0375a7f	feat(core): support authenticated A2A agent card discovery (#20622) Co-authored-by: Adam Weidman <adamfweidman@google.com> Co-authored-by: Adam Weidman <65992621+adamfweidman@users.noreply.github.com>	2026-03-02 21:29:31 +00:00
Abhi	b7a8f0d1f9	fix(core): ensure subagents use qualified MCP tool names (#20801)	2026-03-02 21:12:13 +00:00
Christian Gunderman	7ca3a33f8b	Subagent activity UX. (#17570)	2026-03-02 21:04:31 +00:00
Sandy Tao	ce5a2d0760	feat(core): truncate large MCP tool output (#19365)	2026-03-02 21:01:49 +00:00
Aishanee Shah	659301ff83	feat(core): centralize read_file limits and update gemini-3 description (#20619)	2026-03-02 20:11:58 +00:00
Sandy Tao	446a4316c4	feat(core): implement HTTP authentication support for A2A remote agents (#20510) Co-authored-by: Adam Weidman <adamfweidman@google.com>	2026-03-02 19:59:48 +00:00
Adib234	2e1efaebe4	fix(plan): deflake plan mode integration tests (#20477)	2026-03-02 19:51:44 +00:00
Sandy Tao	7c9fceba7f	fix(core): reduce LLM-based loop detection false positives (#20701)	2026-03-02 19:08:15 +00:00
Adam Weidman	740efa2ac2	Merge User and Agent Card Descriptions #20849 (#20850)	2026-03-02 17:59:29 +00:00
Sandy Tao	a153ff587b	refactor(core): Extract tool parameter names as constants (#20460)	2026-02-28 21:27:54 +00:00
N. Taylor Mullen	cd3a8c3f07	fix(cli): reset themeManager between tests to ensure isolation (#20598)	2026-02-28 19:45:31 +00:00
kartik	b2214a6676	fix: acp/zed race condition between MCP initialisation and prompt (#20205) Signed-off-by: Kartik Angiras <angiraskartik@gmail.com>	2026-02-28 17:33:08 +00:00
Sehoon Shon	a1367e9cdd	fix(core): parse raw ASCII buffer strings in Gaxios errors (#20626)	2026-02-27 23:57:32 +00:00
nityam	ba149afa0b	fix: merge duplicate imports in a2a-server package (2/4) (#19781)	2026-02-27 21:13:30 +00:00
Abhi	966b9059d0	feat(core): enable contiguous parallel admission for Kind.Agent tools (#20583)	2026-02-27 21:08:10 +00:00
Spencer	20d884da2f	fix(core): reduce intrusive MCP errors and deduplicate diagnostics (#20232)	2026-02-27 20:04:36 +00:00
Gaurav	ea48bd9414	feat: better error messages (#20577) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-27 18:18:16 +00:00
Gaurav	b2d6844f9b	feat(billing): implement G1 AI credits overage flow with billing telemetry (#18590)	2026-02-27 18:15:06 +00:00
Sehoon Shon	fdd844b405	fix(core): disable retries for code assist streaming requests (#20561)	2026-02-27 18:04:43 +00:00
Adib234	23905bcd77	fix(plan): prevent agent from using ask_user for shell command confirmation (#20504)	2026-02-27 17:51:47 +00:00
Sehoon Shon	e709789067	fix(core): handle optional response fields from code assist API (#20345)	2026-02-27 16:52:37 +00:00
Abhijit Balaji	32e777f838	fix(core): revert auto-save of policies to user space (#20531)	2026-02-27 16:03:36 +00:00
Pyush Sinha	d7320f5425	refactor(core,cli): useAlternateBuffer read from config (#20346) Co-authored-by: Jacob Richman <jacob314@gmail.com>	2026-02-27 15:55:02 +00:00
Adib234	25ade7bcb7	feat(plan): update planning workflow to encourage multi-select with descriptions of options (#20491)	2026-02-27 15:42:37 +00:00
christine betts	58df1c6237	Fix extension MCP server env var loading (#20374)	2026-02-27 14:49:10 +00:00
Bryan Morgan	522e95439c	fix(core): apply retry logic to CodeAssistServer for all users (#20507)	2026-02-27 09:26:53 -05:00
christine betts	e17f927a69	Add support for policy engine in extensions (#20049) Co-authored-by: Jerop Kipruto <jerop@google.com>	2026-02-27 03:29:33 +00:00
heaventourist	b1befee8fb	feat(telemetry) Instrument traces with more attributes and make them available to OTEL users (#20237) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Jerop Kipruto <jerop@google.com> Co-authored-by: MD. MOHIBUR RAHMAN <35300157+mrpmohiburrahman@users.noreply.github.com> Co-authored-by: Jeffrey Ying <jeffrey.ying86@live.com> Co-authored-by: Bryan Morgan <bryanmorgan@google.com> Co-authored-by: joshualitt <joshualitt@google.com> Co-authored-by: Dev Randalpura <devrandalpura@google.com> Co-authored-by: Google Admin <github-admin@google.com> Co-authored-by: Ben Knutson <benknutson@google.com>	2026-02-27 02:26:16 +00:00
Tommaso Sciortino	4b7ce1fe67	Avoid overaggressive unescaping (#20520)	2026-02-27 01:50:21 +00:00
Siddharth Diwan	9b7852f11c	[Gemma x Gemini CLI] Add an Experimental Gemma Router that uses a LiteRT-LM shim into the Composite Model Classifier Strategy (#17231) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Allen Hutchison <adh@google.com>	2026-02-26 23:43:43 +00:00
Bryan Morgan	6dc9d5ff11	feat(core): increase fetch timeout and fix [object Object] error stringification (#20441) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-26 23:41:09 +00:00
Jerop Kipruto	aa98cafca7	feat(plan): adapt planning workflow based on complexity of task (#20465) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-26 22:58:19 +00:00
krishdef7	f700c923d9	fix(core): flush transcript for pure tool-call responses to ensure BeforeTool hooks see complete state (#20419) Co-authored-by: Bryan Morgan <bryanmorgan@google.com>	2026-02-26 22:39:36 +00:00
Sehoon Shon	edb1fdea30	fix(cli): support quota error fallbacks for all authentication types (#20475) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-26 22:39:25 +00:00
Adam Weidman	10c5bd8ce9	feat(core): improve A2A content extraction (#20487) Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>	2026-02-26 22:38:30 +00:00

1855 Commits