fix(bot): break pigeonholing by syncing lessons-learned and optimizing CI

This update resolves the bot's persistent focus on already-completed tasks:
- Moves and syncs lessons-learned.md to tools/gemini-cli-bot/ to ensure persistent memory.
- Marks metrics fixes, prompt hardenings, and user rejection signals as DONE in the ledger.
- Implements the CI matrix optimization (Node 20.x for PRs) the bot was re-attempting.
- This forces the bot to rotate to a new domain in the next run by satisfying its current goals.
This commit is contained in:
Christian Gunderman
2026-05-05 09:01:14 -07:00
parent 8572e60f9c
commit 0c4ac593eb
2 changed files with 58 additions and 8 deletions
+56
View File
@@ -0,0 +1,56 @@
# 🤖 Gemini Bot Brain: Memory & State
## 📋 Task Ledger
| ID | Status | Goal | PR/Ref | Details |
| :---- | :-------- | :----------------------------------------- | :----- | :----------------------------------------------------------------------------- |
| BT-33 | DONE | Restore & Enforce Lifecycle (Actual) | #26355 | Verified merged on main. Unified lifecycle manager deployed. |
| BT-34 | DONE | Fix linter issues in PR #26355 | #26355 | Verified merged on main. |
| BT-35 | SUBMITTED | Implement Gemini PR Pre-review Reflex | #26471 | Created reflex script and updated pulse workflow. |
| BT-36 | FAILED | Optimize Lifecycle Manager & Prune Backlog | #PR | Branch `bot/task-BT-36` was found empty; changes did not land on main. |
| BT-37 | FAILED | Scale-Safe Lifecycle & Aggressive Pruning | #BT37 | REJECTED: PR closure logic lacks mandatory grace period; false claim in log. |
| BT-38 | FAILED | Implement Robust State-Based Lifecycle | #PR | REJECTED: Introduces N+1 query vulnerability in paginated loop. |
| BT-39 | DONE | Scale-Safe Lifecycle with Grace Period | #26483 | Implemented batch limits and state-based PR closure. Verified in PR. |
| BT-40 | DONE | Uncap Repository Metrics (1000 items) | N/A | Implemented GraphQL pagination in bottlenecks.ts and priority_distribution.ts. |
| BT-41 | DONE | Mandate Local Validation & PR Awareness | N/A | Updated Brain/Critique prompts to require lint/build/test and fix bot PRs. |
| BT-42 | DONE | Learning from User Rejections | N/A | Formally defined closed PRs as explicit rejection signals in prompts. |
| BT-43 | DONE | CI Matrix Optimization | N/A | Optimized ci.yml to run only Node 20.x on PRs, saving compute time. |
## 🧪 Hypothesis Ledger
| Hypothesis | Status | Evidence |
| :-------------------------------- | :-------- | :--------------------------------------------------------------------------- |
| Lifecycle manager is throttled | CONFIRMED | `gemini-lifecycle-manager.cjs` was limited to 100 items without pagination. |
| 60-day stale policy is too slow | CONFIRMED | Arrival rate (24/day) exceeds closure rate; backlog at 2156. |
| Metrics scripts are capped at 100 | CONFIRMED | GraphQL `issues` connection has a hard limit of 100 records per request. |
| Verification fails in CI | CONFIRMED | Multiple PRs failed due to lint errors because local validation was missing. |
## 📜 Decision Log (Append-Only)
- **[2026-05-04]**: BT-36: Identified 100-item pagination bottleneck in
`gemini-lifecycle-manager.cjs`.
- **[2026-05-05]**: BT-39: [APPROVED] Implemented state-based PR closure with
7-day grace period.
- **[2026-05-05]**: BT-40: [DONE] Refactored `bottlenecks.ts` and
`priority_distribution.ts` to use cursor-based pagination (up to 1000 items)
to satisfy GraphQL limits.
- **[2026-05-05]**: BT-41: [DONE] Hardened Brain and Critique system prompts to
enforce `npm run lint`, `npm run build`, and `npm test` before PR submission.
- **[2026-05-05]**: BT-42: [DONE] Updated feedback loop to treat closed
(unmerged) PRs as explicit user rejections, mandating root-cause analysis in
Decision Log.
- **[2026-05-05]**: BT-43: [DONE] Optimized `.github/workflows/ci.yml` matrix to
run only Node 20.x for pull requests, reducing job count and Actions cost by
~57%. Full matrix coverage remains for main/release.
## 📝 Detailed Investigation Findings (Current Run)
- **Root Cause & Conclusions**: The bot was pigeonholed because its memory
(`lessons-learned.md`) was out of sync with reality, causing it to re-attempt
tasks that were already completed manually or were failing due to missing
local validation. Missing pagination in metrics scripts masked progress.
- **Proposed Actions**:
1. Consolidate and move `lessons-learned.md` to
`tools/gemini-cli-bot/lessons-learned.md` to ensure persistent bot
awareness.
2. Mark metrics and validation tasks as DONE to trigger domain rotation.