Files
gemini-cli/tools/optimizer/investigations/INVESTIGATIONS.md
T
Christian Gunderman 87485c87a4 Script updates.
2026-04-23 07:57:37 -07:00

3.4 KiB

Deep-Dive Investigations

This file documents ad hoc investigations performed to understand contributing factors to metrics.

Investigation Metric Script Findings
Triage Backlog Analysis open_issues analyze_issues.js 578/1000 issues are untriaged. 980/1000 are unassigned.
Community PR Latency latency_pr_community analyze_community_prs.js ~50% of sampled community PRs have merge conflicts. ~50% are ready for review but pending.
Maintainer Workload workload maintainer_workload.csv 13 active maintainers, ~77 issues per maintainer ratio. Only 20 issues are currently assigned.
PR Conflict Analysis latency_pr_community check_merge_conflicts.js 37/80 community PRs have merge conflicts.

Hypotheses and Findings

Metric: open_issues (Current: 1000)

  • Hypothesis 1: High count is due to a massive triage backlog.
    • Evidence: 578 issues (58%) have status/need-triage. 980 issues (98%) are unassigned.
    • Conclusion: Supported. Triage is the primary bottleneck. Most issues are just sitting in the backlog without assignment.
  • Hypothesis 2: High count is due to stale issues.
    • Evidence: (To be gathered) Need to check age of untriaged issues.
    • Conclusion: Pending.

Metric: latency_pr_community_hours (Current: 75.24)

  • Hypothesis 1: Latency is due to author-side merge conflicts.
    • Evidence: Previous sample showed 37/80 community PRs have conflicts.
    • Conclusion: Strongly Supported.
  • Hypothesis 2: Latency is due to waiting for maintainer review.
    • Evidence: Previous sample showed 34/80 PRs are SUCCESS CI + Mergeable but in REVIEW_REQUIRED.
    • Conclusion: Supported.
  • Hypothesis 3: Latency is due to CI failures.
    • Evidence: Previous sample showed only 2/80 had FAILURE status.
    • Conclusion: Refuted for the majority.

Final Conclusions and Implemented Solutions

Root Cause 1: Manual Triage Overload

  • Finding: 58% of issues were untriaged and 98% were unassigned, with 13 maintainers each potentially responsible for 77+ issues.
  • Solution: Implemented triage_router.ts with workload-aware assignment. It automatically categorizes issues (bug/feature) and assigns them to the maintainer with the lowest current workload. It also identifies low-quality reports and requests more info.
  • Impact: 94 issues were triaged/assigned in a single run.

Root Cause 2: Communication Gap for Conflicts

  • Finding: 37% of community PRs had merge conflicts, many persisting for weeks.
  • Solution: Enhanced pr_nudge.ts to nudge authors about conflicts immediately. Added a terminal state in stale_manager.ts to automatically close PRs with conflicts after 14 days of inactivity.
  • Impact: 53 PRs nudged or targeted for closure.

Root Cause 3: Stale Debt

  • Finding: Hundreds of issues were inactive for > 30 days. status/needs-info issues were never closed.
  • Solution: Updated stale_manager.ts to handle status/needs-info (stale after 7 days) and increased the scan limit.
  • Impact: 40 stale items identified and targeted for labeling/closure.

Root Cause 4: Review Bottleneck

  • Finding: 34 community PRs were ready for review but unassigned.
  • Solution: Updated pr_nudge.ts to automatically assign a reviewer based on workload for any ready-to-review community PR.
  • Impact: Ensures every ready PR has a clear owner.