From 3336c42913e9af2d7ca457cf4d3a112902c1b3dd Mon Sep 17 00:00:00 2001
From: Sehoon Shon
Date: Sun, 15 Feb 2026 17:39:17 -0500
Subject: [PATCH] refactor(core): only enable todo tool for Gemini 2 models

---
 .gemini/skills/regression-finder/SKILL.md   | 55 ++++++++++++++++++
 .../references/regression-patterns.md       | 41 +++++++++++++
 .../regression-finder/scripts/bisect_run.sh | 26 +++++++++
 packages/core/src/config/config.ts          |  7 ++-
 regression-finder.skill                     | Bin 0 -> 3363 bytes
 5 files changed, 126 insertions(+), 3 deletions(-)
 create mode 100644 .gemini/skills/regression-finder/SKILL.md
 create mode 100644 .gemini/skills/regression-finder/references/regression-patterns.md
 create mode 100755 .gemini/skills/regression-finder/scripts/bisect_run.sh
 create mode 100644 regression-finder.skill

diff --git a/.gemini/skills/regression-finder/SKILL.md b/.gemini/skills/regression-finder/SKILL.md
new file mode 100644
index 0000000000..0f9e1b8946
--- /dev/null
+++ b/.gemini/skills/regression-finder/SKILL.md
@@ -0,0 +1,55 @@
+---
+name: regression-finder
+description: Identify the root cause of a regression by scanning recent PRs and commits or using automated bisecting. Use when a user reports a bug that "used to work before", "broke", "stopped working", or asks "which PR caused this regression".
+---
+
+# Regression Finder
+
+## Overview
+
+This skill helps you find the exact change (PR or commit) that introduced a bug. It uses a combination of metadata analysis (searching PR titles and file changes) and behavioral analysis (automated `git bisect`).
+
+## Workflow
+
+### 1. Reproduction Assessment
+First, determine if the bug is reliably reproducible.
+- Ask the user for a reproduction command or script.
+- Attempt to write a minimal unit test or shell script.
+- **CRITICAL**: If you cannot create a **reliable** test (e.g., due to complexity, external deps, or flakiness), **DO NOT** create an unreliable one. Skip `git bisect` and proceed to Step 2 to rely solely on metadata and code analysis.
+
+### 2. Metadata Culprit Search (PR Focus)
+Scan recent history to find the PR that likely caused the issue.
+- **Prioritize finding the Pull Request (PR)**. PRs provide context (why a change was made) that commits lack.
+- **Targeted File Search**: If you have high confidence in which files are involved (e.g., a UI bug in `packages/cli`), search for PRs touching those files first:
+  - Use `git log -n 30 --pretty=format:"%h %s" -- ` to see the most recent commits to those files.
+  - Look for commit messages that mention PR numbers (e.g., "Merge pull request #123" or "feat: ... (#456)").
+- **Broad Search**: If the location is unknown, use `gh pr list --limit 30 --state merged` to get a general list of recent PRs and filter by title/description.
+- Always try to link a suspicious commit back to its originating PR for full context.
+
+### 3. Candidate Selection & Verification
+Your goal is to identify the **PR** that caused the regression.
+
+**Scenario A: You HAVE a reliable test**
+- **Verify**: Checkout the merge commit of a candidate PR and run the test.
+- **Bisect**: If candidates are unclear, use `git bisect` with the test script (see `scripts/bisect_run.sh`).
+
+**Scenario B: You DO NOT have a reliable test**
+- **Manual Analysis**: Read the code diffs of potential candidates. Look for logic changes that match the bug description.
+- **Diff Check**: `git show ` or view the PR diff.
+- **Selection**:
+  - If one PR is a **Strong Match** (obvious logic error matching the bug), select it as the result.
+  - If ambiguous, select the **Top 3 Candidates** based on file relevance and recency.
+- **Constraint**: Do **NOT** run `git bisect` without a reliable test.
+
+### 4. Reporting
+- Present your findings:
+  - **Single Strong Candidate**: If identified.
+  - **Top 3 Candidates**: If the exact cause is uncertain.
+- **Do NOT** automatically attempt to fix, revert, or run further verification unless explicitly asked. The user will decide the next step (e.g., revert locally, investigate further).
+
+## Tips for Efficiency
+- **Limit File Scope**: When scanning metadata, always provide file paths to `git log` to ignore unrelated changes.
+- **Sanity Check**: Always verify the "Good" and "Bad" commits manually before starting a long `bisect run`.
+
+## Common Regression Patterns
+See [references/patterns.md](references/regression-patterns.md) for a guide on interpreting "breaking" changes like state synchronization issues or dependency mismatches.
diff --git a/.gemini/skills/regression-finder/references/regression-patterns.md b/.gemini/skills/regression-finder/references/regression-patterns.md
new file mode 100644
index 0000000000..1c740986bc
--- /dev/null
+++ b/.gemini/skills/regression-finder/references/regression-patterns.md
@@ -0,0 +1,41 @@
+# Common Regression Patterns
+
+When analyzing a "bad" commit, look for these common patterns that often cause
+regressions in this codebase.
+
+## 1. State Synchronization Issues
+
+**Symptoms**: UI doesn't update, stale data, "disappearing" history.
+**Pattern**: State is derived from multiple sources that are updated at
+different times, or state is updated via side-effects (e.g., `useEffect` or
+callbacks) instead of being purely derived. **Fix**: Use `useMemo` to derive
+state or ensure atomic updates.
+
+## 2. Missing Hook Dependencies
+
+**Symptoms**: Logic runs with old state, variables seem stuck. **Pattern**: A
+`useCallback`, `useMemo`, or `useEffect` has an incomplete dependency array.
+**Check**: Look for ESLint suppression comments like
+`// eslint-disable-next-line react-hooks/exhaustive-deps`.
+
+## 3. Bypassed Logic
+
+**Symptoms**: Feature works sometimes but fails in specific paths (e.g., after
+cancellation). **Pattern**: A refactor introduced a new submission or update
+path that bypasses existing validation or cleanup logic. **Check**: Search for
+direct calls to `onSubmit` or `setState` that should have gone through a wrapper
+function (like `handleSubmit`).
+
+## 4. Environment/Platform Specifics
+
+**Symptoms**: Works on MacOS but fails on Windows or Linux. **Pattern**: Usage
+of path delimiters (`/` vs
+`\`), terminal escape sequences, or platform-specific CLI flags. **Fix**: Use `node:path`
+and verify terminal capability detection.
+
+## 5. Mock Mismatch in Tests
+
+**Symptoms**: Tests pass but application fails (or vice versa). **Pattern**: A
+mock in a unit test was not updated to reflect a change in the real
+implementation's interface or behavioral expectations. **Check**: Verify that
+mock return values and implementations match the current code.
diff --git a/.gemini/skills/regression-finder/scripts/bisect_run.sh b/.gemini/skills/regression-finder/scripts/bisect_run.sh
new file mode 100755
index 0000000000..e055f6b0d7
--- /dev/null
+++ b/.gemini/skills/regression-finder/scripts/bisect_run.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+# bisect_run.sh
+# Automated test script for git bisect run.
+# Expects the test command as the first argument.
+
+TEST_COMMAND=$1
+
+if [ -z "$TEST_COMMAND" ]; then
+  echo "Error: No test command provided."
+  exit 125 # Skip this commit
+fi
+
+echo "Running test: $TEST_COMMAND"
+
+# Execute the test command
+eval "$TEST_COMMAND"
+EXIT_CODE=$?
+
+if [ $EXIT_CODE -eq 0 ]; then
+  echo ">>> COMMIT IS GOOD"
+  exit 0
+else
+  echo ">>> COMMIT IS BAD (Exit Code: $EXIT_CODE)"
+  exit 1
+fi
diff --git a/packages/core/src/config/config.ts b/packages/core/src/config/config.ts
index 6dfc62f322..84cb96a78f 100644
--- a/packages/core/src/config/config.ts
+++ b/packages/core/src/config/config.ts
@@ -56,6 +56,7 @@ import {
   DEFAULT_GEMINI_MODEL,
   DEFAULT_GEMINI_MODEL_AUTO,
   isAutoModel,
+  isGemini2Model,
   isPreviewModel,
   PREVIEW_GEMINI_FLASH_MODEL,
   PREVIEW_GEMINI_MODEL,
@@ -809,9 +810,9 @@ export class Config {
       params.truncateToolOutputThreshold ??
       DEFAULT_TRUNCATE_TOOL_OUTPUT_THRESHOLD;
     // // TODO(joshualitt): Re-evaluate the todo tool for 3 family.
-    this.useWriteTodos = isPreviewModel(this.model)
-      ? false
-      : (params.useWriteTodos ?? true);
+    this.useWriteTodos = isGemini2Model(this.model)
+      ? (params.useWriteTodos ?? true)
+      : false;
     this.enableHooksUI = params.enableHooksUI ?? true;
     this.enableHooks = params.enableHooks ?? true;
     this.disabledHooks = params.disabledHooks ?? [];
diff --git a/regression-finder.skill b/regression-finder.skill
new file mode 100644
index 0000000000000000000000000000000000000000..5ac78500db2bc03ec77b86f7c86afdd6b8bea264
GIT binary patch
literal 3363
[base85-encoded binary payload omitted]

literal 0
HcmV?d00001