2026-05-06 15:50:06 -04:00
# Backlog Analysis Toolkit
This directory contains a suite of AI-powered tools for analyzing GitHub issues
and determining implementation effort levels for the Gemini CLI project.
## 📁 Directory Structure
- `data/`: Contains the issue data in JSON and CSV formats.
  - `bugs.json`: The primary source of truth for bug analysis.
- `utils/`: Auxiliary scripts for manual overrides, debugging, and post-analysis
  validation (e.g., `validate_effort.py`, `inject_manual_fixes.py`).
- `*.py`: Core analysis and export scripts (e.g., `bug_analyzer_final.py`,
  `generate_bugs_csv.py`).
- `loop_analyzer.sh`: A shell script for running iterative analysis until all
  issues are processed.
## 📥 Prerequisites: Data Generation
Before running the analyzers, you must fetch the issue data from GitHub. The
scripts expect the data in JSON format.
The easiest way to generate this file is to copy the URL from your browser
while viewing a filtered list of issues on GitHub and pass it to the fetcher
script.
_(Note: You must have the [GitHub CLI (`gh`)](https://cli.github.com/) installed
and authenticated.)_
``` bash
# Fetch any filtered list of issues directly from a GitHub URL
python3 fetch_from_url.py "https://github.com/google-gemini/gemini-cli/issues/?q=type%3ABug+is%3Aopen" --output data/bugs.json
# Fetch features to a different file
python3 fetch_from_url.py "https://github.com/google-gemini/gemini-cli/issues/?q=type%3AFeature+is%3Aopen" --output data/issues.json
```
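To make the URL-based fetch more concrete, here is a minimal sketch of how a browser URL could be split into a repository name and a search query. It is not the actual `fetch_from_url.py` implementation; the function name `parse_issues_url` is hypothetical, and the sketch assumes the URL follows the `github.com/<owner>/<repo>/issues/?q=...` shape used in the commands above.

```python
from urllib.parse import urlparse, parse_qs


def parse_issues_url(url: str) -> tuple[str, str]:
    """Split a GitHub issues URL into ("owner/repo", search query).

    Hypothetical helper for illustration; the real fetcher may work differently.
    """
    parsed = urlparse(url)
    # Path looks like /<owner>/<repo>/issues/ ; drop empty segments.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 3 or parts[2] != "issues":
        raise ValueError(f"Not a GitHub issues URL: {url}")
    repo = f"{parts[0]}/{parts[1]}"
    # parse_qs decodes percent-escapes and turns '+' into spaces.
    query = parse_qs(parsed.query).get("q", [""])[0]
    return repo, query


if __name__ == "__main__":
    repo, query = parse_issues_url(
        "https://github.com/google-gemini/gemini-cli/issues/?q=type%3ABug+is%3Aopen"
    )
    print(repo, "|", query)
```

The decoded query string is the same text you would type into the GitHub issue search box, so it can be inspected before any data is fetched.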
## 🚀 Workflows
### 1. Auto-Categorizing Issues with Gemini CLI
If you have a list of uncategorized issues fetched from GitHub, your first step
should be to classify them. You can use the Gemini CLI directly in your terminal
to label them.
**Example command:**
``` bash
gemini "Read data/uncategorized.json. For each issue, determine if it is a bug or a feature request. Then, use the gh CLI tool to add either the 'type/bug' or 'type/feature' label to the issue on GitHub, AND update the JSON object in the file to include a 'type' field with the chosen value."
```
_Note: Make sure your `gemini-cli` has permission to execute shell commands if
you want it to apply the labels automatically via `gh`._
### 2. Initial Triage (Static)
Use this for a quick, first-pass estimation.
``` bash
python3 analyze_bugs.py --api-key "YOUR_KEY"
```
### 3. Deep Agentic Analysis
Uses Gemini as an agent with access to the codebase.
``` bash
python3 bug_analyzer_final.py --api-key "YOUR_KEY"
```
### 4. Iterative Analysis
Runs the single-turn analyzer in a loop until all issues have a valid analysis.
``` bash
GEMINI_API_KEY="YOUR_KEY" ./loop_analyzer.sh
```
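The "loop until every issue has a valid analysis" idea can be sketched in Python as follows. This is only an illustration of the control flow, not the contents of `loop_analyzer.sh`. The field names `analysis` and `effort_level`, the `analyze_one` callback, and the `max_passes` safety limit are all assumptions made for the example.

```python
from typing import Callable, Optional


def has_valid_analysis(issue: dict) -> bool:
    """Assumed validity rule: an 'analysis' dict containing 'effort_level'."""
    analysis = issue.get("analysis")
    return isinstance(analysis, dict) and "effort_level" in analysis


def run_until_complete(
    issues: list,
    analyze_one: Callable[[dict], Optional[dict]],
    max_passes: int = 5,
) -> int:
    """Re-run the analyzer on unfinished issues; return the passes that did work."""
    for pass_number in range(1, max_passes + 1):
        pending = [issue for issue in issues if not has_valid_analysis(issue)]
        if not pending:
            return pass_number - 1
        for issue in pending:
            result = analyze_one(issue)  # may return None on failure
            if result is not None:
                issue["analysis"] = result
    if any(not has_valid_analysis(issue) for issue in issues):
        raise RuntimeError(f"Issues still unanalyzed after {max_passes} passes")
    return max_passes
```

Bounding the number of passes is a design choice for the sketch: it keeps a persistently failing issue from looping forever.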
### 5. Validation & Export
Run validation from the utils folder to ensure consistency, then generate a
readable report.
``` bash
python3 utils/validate_effort.py
python3 generate_bugs_csv.py
```
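As a rough picture of the export step, the sketch below flattens analyzed issues into CSV text using only the standard library. It does not reproduce `generate_bugs_csv.py`; the column names (`number`, `title`, `effort_level`) and the function name `issues_to_csv` are hypothetical.

```python
import csv
import io

# Hypothetical column set; the real report may use different fields.
COLUMNS = ["number", "title", "effort_level"]


def issues_to_csv(issues: list) -> str:
    """Render a list of issue dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # silently skip fields outside COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    for issue in issues:
        # Missing fields become empty cells instead of raising.
        writer.writerow({column: issue.get(column, "") for column in COLUMNS})
    return buffer.getvalue()
```

Using `csv.DictWriter` rather than manual string joins means titles that contain commas or quotes are escaped correctly.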
### 6. Generic Issue Processing
For any other backlog task (e.g., categorizing features, updating labels, or
custom analysis), use `generic_processor.py`. This script lets you provide a
custom system prompt and a project root for codebase context.
``` bash
python3 generic_processor.py \
--api-key "YOUR_KEY" \
--input data/features.json \
--output data/features_analyzed.json \
--project ../../packages \
--prompt "Analyze these features and suggest which package they belong in. Output JSON: {\"package\": \"name\"}"
```
## 🧠 Effort Level Criteria
Ratings are based on technical complexity and reproduction difficulty:
- **Small (1 day):** Trivial logic changes, localized fixes (1-2 files), easy to
reproduce.
- **Medium (2-3 days):** Requires tracing across multiple components, UI state
management (React/Ink), or harder reproduction.
- **Large (3+ days):** Architectural issues, platform-specific (Windows, PTY,
Signals), performance bottlenecks, or core protocol changes.
_Note: Any bug that is difficult to reproduce or platform-specific must not be
rated as Small._
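The rule in the note above lends itself to a mechanical check. The following is a minimal sketch of such a check, not the logic of `utils/validate_effort.py`. The field names `effort_level`, `hard_to_reproduce`, and `platform_specific` are assumptions for illustration.

```python
VALID_LEVELS = {"small", "medium", "large"}


def check_effort(issue: dict) -> list:
    """Return a list of problems found in one issue's effort rating."""
    problems = []
    effort = str(issue.get("effort_level", "")).lower()
    if effort not in VALID_LEVELS:
        problems.append(f"unknown effort level: {effort!r}")
    elif effort == "small" and (
        issue.get("hard_to_reproduce") or issue.get("platform_specific")
    ):
        # Hard-to-reproduce or platform-specific bugs must not be Small.
        problems.append("rated Small but hard to reproduce or platform-specific")
    return problems


if __name__ == "__main__":
    sample = {"effort_level": "Small", "platform_specific": True}
    print(check_effort(sample))
```

An empty list means the rating passes both checks; any returned message marks the issue for re-analysis or a manual override.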
## 🛠 Usage Notes
- **API Key:** Ensure you have a valid Gemini API key. Pass it with `--api-key`
  or, for `loop_analyzer.sh`, through the `GEMINI_API_KEY` environment variable.
- **Paths:** Scripts are configured to look for data in the `data/` subdirectory
  and the codebase in `../../packages`.
- **Requirements:** Python 3, `jq` (for the shell script), and the GitHub CLI
  (`gh`) for fetching issue data.