mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-04-28 22:14:52 -07:00
Initial Version
@@ -124,6 +124,8 @@ npm install -g @google/gemini-cli@nightly
|
|||||||
|
|
||||||
### Advanced Capabilities
|
### Advanced Capabilities
|
||||||
|
|
||||||
|
- **Automated Iterative Loops**: Use [Ralph Wiggum mode](./docs/ralph-wiggum.md)
|
||||||
|
to repeatedly execute prompts until a goal is met (e.g., fixing tests).
|
||||||
- Ground your queries with built-in
|
- Ground your queries with built-in
|
||||||
[Google Search](https://ai.google.dev/gemini-api/docs/grounding) for real-time
|
[Google Search](https://ai.google.dev/gemini-api/docs/grounding) for real-time
|
||||||
information
|
information
|
||||||
@@ -256,6 +258,33 @@ use `--output-format stream-json` to get newline-delimited JSON events:
|
|||||||
gemini -p "Run tests and deploy" --output-format stream-json
|
gemini -p "Run tests and deploy" --output-format stream-json
|
||||||
```
|
```
|
||||||
### Ralph Wiggum Mode (Iterative Automation)

Ralph Wiggum mode is an advanced automation feature that lets Gemini CLI
repeatedly execute a prompt in a loop until a specific goal is achieved. It is
well suited to tasks like fixing failing tests or complex refactoring.

To use Ralph Wiggum mode, provide a prompt and a **completion promise** (a
string to look for in the output). The CLI will:

1. Enter **YOLO mode** to auto-approve all tool calls.
2. Run the prompt and check the response for your completion string.
3. Repeat the process, up to a specified **max iterations**, if the string is
   not found.
4. Use a **memory file** (`memories.md` by default) to pass notes between
   iterations. **Note:** Use a unique `--memory-file` for different tasks in
   the same directory to keep their context isolated.

```bash
gemini -p "Fix the failing tests in this repo" \
  --ralph-wiggum \
  --completion-promise "ALL TESTS PASSED" \
  --max-iterations 5 \
  --memory-file "task-fix-tests.md"
```

At the end of the run, a summary table displays the result of each iteration and
any test statistics extracted from the output.

### Quick Examples
|
### Quick Examples
|
||||||
|
|
||||||
#### Start a new project
|
#### Start a new project
|
||||||
|
|||||||
@@ -35,6 +35,9 @@ and parameters.
|
|||||||
| `--model` | `-m` | string | `auto` | Model to use. See [Model Selection](#model-selection) for available values. |
|
| `--model` | `-m` | string | `auto` | Model to use. See [Model Selection](#model-selection) for available values. |
|
||||||
| `--prompt` | `-p` | string | - | Prompt text. Appended to stdin input if provided. **Deprecated:** Use positional arguments instead. |
|
| `--prompt` | `-p` | string | - | Prompt text. Appended to stdin input if provided. **Deprecated:** Use positional arguments instead. |
|
||||||
| `--prompt-interactive` | `-i` | string | - | Execute prompt and continue in interactive mode |
|
| `--prompt-interactive` | `-i` | string | - | Execute prompt and continue in interactive mode |
|
||||||
| `--ralph-wiggum` | - | boolean | `false` | Enable Ralph Wiggum iterative loop mode |
| `--completion-promise` | - | string | - | String to look for to signal completion in Ralph Wiggum mode |
| `--max-iterations` | - | number | `10` | Maximum loop iterations for Ralph Wiggum mode |
| `--memory-file` | - | string | `memories.md` | Task-specific memory file for Ralph Wiggum mode |
| `--sandbox` | `-s` | boolean | `false` | Run in a sandboxed environment for safer execution |
|
| `--sandbox` | `-s` | boolean | `false` | Run in a sandboxed environment for safer execution |
|
||||||
| `--approval-mode` | - | string | `default` | Approval mode for tool execution. Choices: `default`, `auto_edit`, `yolo` |
|
| `--approval-mode` | - | string | `default` | Approval mode for tool execution. Choices: `default`, `auto_edit`, `yolo` |
|
||||||
| `--yolo` | `-y` | boolean | `false` | **Deprecated.** Auto-approve all actions. Use `--approval-mode=yolo` instead. |
|
| `--yolo` | `-y` | boolean | `false` | **Deprecated.** Auto-approve all actions. Use `--approval-mode=yolo` instead. |
|
||||||
|
|||||||
@@ -0,0 +1,141 @@
# Ralph Wiggum mode

Ralph Wiggum mode is an iterative automation technique that lets Gemini CLI
repeatedly execute a prompt until a specific goal is met. This mode is designed
for tasks that benefit from persistent refinement, such as fixing failing tests
or performing complex refactoring.

> **Note:** This is a preview feature currently under active development.

## Overview

Inspired by the "Ralph Wiggum" technique, this mode treats failures as data and
uses a feedback loop to reach a successful state. When you enable Ralph Wiggum
mode, Gemini CLI enters YOLO (auto-approval) mode and keeps processing the
provided prompt until it detects your specified completion string in the model's
output or reaches the maximum number of iterations.

## Usage

To use Ralph Wiggum mode, you must provide a prompt using the `-p` or `--prompt`
flag. You then configure the loop behavior using the following flags:

| Flag                   | Description                                                |
| :--------------------- | :--------------------------------------------------------- |
| `--ralph-wiggum`       | Enables the Ralph Wiggum iterative loop mode.              |
| `--completion-promise` | The string to look for in the output to signal completion. |
| `--max-iterations`     | The maximum number of times to run the loop (default: 10). |
| `--memory-file`        | Task-specific memory file (default: `memories.md`).        |

### Example

The following command attempts to fix tests by running the loop up to 5 times
until the string "TESTS PASSED" appears in the output, using a specific memory
file for this task:

```bash
gemini -p "Fix the tests in packages/core" \
  --ralph-wiggum \
  --completion-promise "TESTS PASSED" \
  --max-iterations 5 \
  --memory-file "fix-core-tests.md"
```

## How it works

When you run Gemini CLI with the `--ralph-wiggum` flag, the following process
occurs:

1. **YOLO mode enforcement:** The tool automatically sets the approval mode to
   `yolo`. This ensures that tool calls (like writing files or running shell
   commands) are approved automatically, so the automation can proceed
   without human intervention.
2. **Iterative execution:** The CLI executes the provided prompt in a loop.
3. **Completion check:** After each iteration, the CLI scans the full text of
   the assistant's response for the string provided in `--completion-promise`.
4. **Loop termination:**
   - If the completion string is found, the loop exits successfully.
   - If the completion string is not found, the CLI starts a new iteration
     using the same initial prompt.
   - If the number of iterations reaches the `--max-iterations` limit without
     the completion string being found, the loop stops.
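
The steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the CLI's actual implementation: `runLoop`, `LoopOptions`, `LoopResult`, and `runOnce` are hypothetical names, and `runOnce` stands in for a single non-interactive run that returns the response text.

```typescript
// Hypothetical names throughout: none of these are part of the Gemini CLI API.
interface LoopOptions {
  prompt: string;
  completionPromise: string;
  maxIterations: number;
  // Stand-in for one non-interactive run; resolves to the response text.
  runOnce: (prompt: string) => Promise<string>;
}

interface LoopResult {
  success: boolean;
  iterations: number;
}

async function runLoop(options: LoopOptions): Promise<LoopResult> {
  const { prompt, completionPromise, maxIterations, runOnce } = options;
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // Every iteration reuses the same initial prompt.
    const output = await runOnce(prompt);
    // Plain substring check, which is why the promise string should be unique.
    if (output.includes(completionPromise)) {
      return { success: true, iterations: iteration };
    }
  }
  // The iteration limit was reached without seeing the completion string.
  return { success: false, iterations: maxIterations };
}
```

Because the check is a plain substring match, a completion string that can also appear in ordinary output (for example inside an error message) would end the loop early.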

## Persistent context (Memories)

To help the agent learn from previous attempts, Ralph Wiggum mode uses a memory
file in your current working directory: `memories.md` by default, or the file
you pass with `--memory-file`.

- **Automatic creation:** If the file doesn't exist, the CLI creates it with a
  default header.
- **Context injection:** At the start of each iteration, the CLI reads the
  content of the memory file and prepends it to your prompt.
- **Usage:** You (or the agent, via tool use) can write notes, error logs, or
  successful patterns into this file. This allows the agent to "remember" what
  failed in iteration 1 and avoid repeating the same mistake in iteration 2.
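
Context injection can be pictured as a small pure function. This is a sketch only: `buildIterationPrompt` is a hypothetical helper name, and it assumes the memory file's content has already been read into a string.

```typescript
// Hypothetical helper, not a CLI API: combines memory notes with the task.
function buildIterationPrompt(
  task: string,
  memories: string,
  memoryFile: string = 'memories.md',
): string {
  // An empty or whitespace-only memory file adds nothing to the prompt.
  if (memories.trim() === '') {
    return task;
  }
  // Notes from earlier iterations come first, followed by the original task.
  return `Context from previous iterations (${memoryFile}):\n${memories}\n\nTask:\n${task}`;
}
```

Since the whole file is prepended on every iteration, keeping the notes short helps keep the prompt small.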

## Summary statistics

At the end of the execution, Ralph Wiggum mode prints a summary table in the
terminal. This table reports the outcome of each iteration, including:

- **Iteration number:** The position of the iteration in the run, starting
  at 1.
- **Status:** Whether the iteration met the completion promise ("Success") or
  failed to do so ("Failed").
- **Tests Passed/Failed:** If the output contains recognizable test runner
  patterns (such as those from Vitest, Jest, or Mocha), the CLI extracts and
  displays the number of passing and failing tests. Otherwise, these columns
  show `-`.
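
How such counts might be pulled out of a runner's output can be sketched as follows. This is only an illustration under stated assumptions: `parseSummaryLine` and `countBefore` are hypothetical helpers, and the input is assumed to contain a Jest-style `Tests:` summary line. The counts are read one label at a time, so the order in which a runner prints them does not matter.

```typescript
interface TestStats {
  passed?: number;
  failed?: number;
  total?: number;
}

// Hypothetical helper: reads the number that precedes a label, e.g. "3 passed".
function countBefore(line: string, label: string): number | undefined {
  const match = line.match(new RegExp(`(\\d+)\\s+${label}\\b`, 'i'));
  return match ? parseInt(match[1], 10) : undefined;
}

// Hypothetical helper: parses the first line that starts with "Tests:".
function parseSummaryLine(output: string): TestStats {
  const line = output.split('\n').find((l) => /^\s*Tests:/i.test(l));
  if (line === undefined) {
    return {};
  }
  return {
    passed: countBefore(line, 'passed'),
    failed: countBefore(line, 'failed'),
    total: countBefore(line, 'total'),
  };
}
```

A count that is absent from the line stays `undefined` rather than becoming `0`, so "not reported" and "zero" remain distinguishable.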

### Example summary table

```text
--- Ralph Wiggum Mode Summary ---
| Iteration | Status  | Tests Passed | Tests Failed |
|-----------|---------|--------------|--------------|
| 1         | Failed  | 2            | 10           |
| 2         | Failed  | 8            | 4            |
| 3         | Success | 12           | 0            |
---------------------------------
```
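
A table like the one above can be produced by padding each cell to a fixed width. The following is a minimal sketch, not the CLI's own code: `formatSummary` and `IterationRow` are hypothetical names, and the column widths are simply chosen to match the header text.

```typescript
// Hypothetical row shape for illustration.
interface IterationRow {
  iteration: number;
  status: 'Success' | 'Failed';
  testsPassed?: number;
  testsFailed?: number;
}

function formatSummary(rows: IterationRow[]): string {
  const lines = [
    '| Iteration | Status  | Tests Passed | Tests Failed |',
    '|-----------|---------|--------------|--------------|',
  ];
  for (const row of rows) {
    // Show "-" when no test statistics were extracted for the iteration.
    const passed = String(row.testsPassed ?? '-');
    const failed = String(row.testsFailed ?? '-');
    lines.push(
      `| ${String(row.iteration).padEnd(9)} | ${row.status.padEnd(7)} | ${passed.padEnd(12)} | ${failed.padEnd(12)} |`,
    );
  }
  return lines.join('\n');
}
```

Padding every cell to the width of its header keeps the columns aligned as long as no value is wider than its column.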

## Best practices

To get the most out of Ralph Wiggum mode, we recommend the following:

- **Clear completion criteria:** Ensure your prompt instructs the model to emit
  a specific, unique string (like "ALL TESTS PASSED") only when the task is
  truly complete.
- **Incremental goals:** Use prompts that encourage the model to make small,
  verifiable changes in each iteration.
- **Safety nets:** Always set a reasonable `--max-iterations` limit to prevent
  unintended long-running processes.

## Development and rebuilding

If you're modifying Ralph Wiggum mode or enabling it in a development
environment, you must recompile the TypeScript source code.

### Full rebuild

To build all packages in the monorepo, run the following command from the root
directory:

```bash
npm run build
```

### Fast CLI rebuild

If you've already performed a full build and are only making changes to the CLI
package, you can run a targeted build:

```bash
npm run build -w @google/gemini-cli
```

### Running in development

After rebuilding, test your changes using the `npm run start` script:

```bash
npm run start -- -p "Your task" --ralph-wiggum --completion-promise "SUCCESS"
```
@@ -45,6 +45,7 @@
|
|||||||
{ "label": "Custom commands", "slug": "docs/cli/custom-commands" },
|
{ "label": "Custom commands", "slug": "docs/cli/custom-commands" },
|
||||||
{ "label": "Enterprise features", "slug": "docs/cli/enterprise" },
|
{ "label": "Enterprise features", "slug": "docs/cli/enterprise" },
|
||||||
{ "label": "Headless mode & scripting", "slug": "docs/cli/headless" },
|
{ "label": "Headless mode & scripting", "slug": "docs/cli/headless" },
|
||||||
|
{ "label": "Ralph Wiggum mode", "slug": "docs/ralph-wiggum" },
|
||||||
{ "label": "Sandboxing", "slug": "docs/cli/sandbox" },
|
{ "label": "Sandboxing", "slug": "docs/cli/sandbox" },
|
||||||
{ "label": "System prompt override", "slug": "docs/cli/system-prompt" },
|
{ "label": "System prompt override", "slug": "docs/cli/system-prompt" },
|
||||||
{ "label": "Telemetry", "slug": "docs/cli/telemetry" }
|
{ "label": "Telemetry", "slug": "docs/cli/telemetry" }
|
||||||
|
|||||||
@@ -70,6 +70,11 @@ export interface CliArgs {
|
|||||||
prompt: string | undefined;
|
prompt: string | undefined;
|
||||||
promptInteractive: string | undefined;
|
promptInteractive: string | undefined;
|
||||||
|
|
||||||
|
ralphWiggum: boolean | undefined;
|
||||||
|
completionPromise: string | undefined;
|
||||||
|
maxIterations: number | undefined;
|
||||||
|
memoryFile: string | undefined;
|
||||||
|
|
||||||
yolo: boolean | undefined;
|
yolo: boolean | undefined;
|
||||||
approvalMode: string | undefined;
|
approvalMode: string | undefined;
|
||||||
allowedMcpServerNames: string[] | undefined;
|
allowedMcpServerNames: string[] | undefined;
|
||||||
@@ -141,6 +146,31 @@ export async function parseArguments(
|
|||||||
description: 'Run in sandbox?',
|
description: 'Run in sandbox?',
|
||||||
})
|
})
|
||||||
|
|
||||||
|
.option('ralph-wiggum', {
|
||||||
|
alias: 'ralphWiggum',
|
||||||
|
type: 'boolean',
|
||||||
|
description:
|
||||||
|
'Enable Ralph Wiggum mode (iterative loop with YOLO mode).',
|
||||||
|
})
|
||||||
|
.option('completion-promise', {
|
||||||
|
alias: 'completionPromise',
|
||||||
|
type: 'string',
|
||||||
|
description:
|
||||||
|
'The string to look for in the output to signal completion in Ralph Wiggum mode.',
|
||||||
|
})
|
||||||
|
.option('max-iterations', {
|
||||||
|
alias: 'maxIterations',
|
||||||
|
type: 'number',
|
||||||
|
description: 'Maximum number of iterations for Ralph Wiggum mode.',
|
||||||
|
})
|
||||||
|
.option('memory-file', {
|
||||||
|
alias: 'memoryFile',
|
||||||
|
type: 'string',
|
||||||
|
description:
|
||||||
|
'Task-specific memory file for Ralph Wiggum mode (defaults to memories.md).',
|
||||||
|
default: 'memories.md',
|
||||||
|
})
|
||||||
|
|
||||||
.option('yolo', {
|
.option('yolo', {
|
||||||
alias: 'y',
|
alias: 'y',
|
||||||
type: 'boolean',
|
type: 'boolean',
|
||||||
|
|||||||
@@ -476,6 +476,10 @@ describe('gemini.tsx main function kitty protocol', () => {
|
|||||||
prompt: undefined,
|
prompt: undefined,
|
||||||
promptInteractive: undefined,
|
promptInteractive: undefined,
|
||||||
query: undefined,
|
query: undefined,
|
||||||
|
ralphWiggum: undefined,
|
||||||
|
completionPromise: undefined,
|
||||||
|
maxIterations: undefined,
|
||||||
|
memoryFile: undefined,
|
||||||
yolo: undefined,
|
yolo: undefined,
|
||||||
approvalMode: undefined,
|
approvalMode: undefined,
|
||||||
allowedMcpServerNames: undefined,
|
allowedMcpServerNames: undefined,
|
||||||
|
|||||||
@@ -24,7 +24,7 @@ import { loadSettings, SettingScope } from './config/settings.js';
|
|||||||
import { getStartupWarnings } from './utils/startupWarnings.js';
|
import { getStartupWarnings } from './utils/startupWarnings.js';
|
||||||
import { getUserStartupWarnings } from './utils/userStartupWarnings.js';
|
import { getUserStartupWarnings } from './utils/userStartupWarnings.js';
|
||||||
import { ConsolePatcher } from './ui/utils/ConsolePatcher.js';
|
import { ConsolePatcher } from './ui/utils/ConsolePatcher.js';
|
||||||
import { runNonInteractive } from './nonInteractiveCli.js';
|
import { runNonInteractive, runRalphWiggum } from './nonInteractiveCli.js';
|
||||||
import {
|
import {
|
||||||
cleanupCheckpoints,
|
cleanupCheckpoints,
|
||||||
registerCleanup,
|
registerCleanup,
|
||||||
@@ -740,13 +740,25 @@ export async function main() {
|
|||||||
|
|
||||||
initializeOutputListenersAndFlush();
|
initializeOutputListenersAndFlush();
|
||||||
|
|
||||||
await runNonInteractive({
|
if (argv.ralphWiggum) {
|
||||||
config,
|
await runRalphWiggum({
|
||||||
settings,
|
config,
|
||||||
input,
|
settings,
|
||||||
prompt_id,
|
input,
|
||||||
resumedSessionData,
|
prompt_id,
|
||||||
});
|
resumedSessionData,
|
||||||
|
completionPromise: argv.completionPromise,
|
||||||
|
maxIterations: argv.maxIterations,
memoryFile: argv.memoryFile,
|
||||||
|
});
|
||||||
|
} else {
|
||||||
|
await runNonInteractive({
|
||||||
|
config,
|
||||||
|
settings,
|
||||||
|
input,
|
||||||
|
prompt_id,
|
||||||
|
resumedSessionData,
|
||||||
|
});
|
||||||
|
}
|
||||||
// Call cleanup before process.exit, which causes cleanup to not run
|
// Call cleanup before process.exit, which causes cleanup to not run
|
||||||
await runExitCleanup();
|
await runExitCleanup();
|
||||||
process.exit(ExitCodes.SUCCESS);
|
process.exit(ExitCodes.SUCCESS);
|
||||||
|
|||||||
@@ -55,13 +55,187 @@ interface RunNonInteractiveParams {
|
|||||||
resumedSessionData?: ResumedSessionData;
|
resumedSessionData?: ResumedSessionData;
|
||||||
}
|
}
|
||||||
|
|
||||||
interface IterationResult {
  iteration: number;
  status: 'Success' | 'Failed';
  testsPassed?: number;
  testsFailed?: number;
  testsTotal?: number;
}

function extractTestStats(output: string): {
  passed?: number;
  failed?: number;
  total?: number;
} {
  // Common patterns for test runners (Vitest, Jest, Mocha, etc.)

  // Jest-style summary line, e.g. "Tests: 1 failed, 3 passed, 4 total".
  // Each count is read on its own so the order on the line does not matter.
  const summaryLine = output.match(/Tests:[^\n]*/i)?.[0];
  if (summaryLine) {
    const passedMatch = summaryLine.match(/(\d+)\s+passed/i);
    const failedMatch = summaryLine.match(/(\d+)\s+failed/i);
    const totalMatch = summaryLine.match(/(\d+)\s+total/i);
    if (passedMatch || failedMatch || totalMatch) {
      return {
        passed: passedMatch ? parseInt(passedMatch[1], 10) : 0,
        failed: failedMatch ? parseInt(failedMatch[1], 10) : 0,
        // Leave the total undefined when the runner does not report it.
        total: totalMatch ? parseInt(totalMatch[1], 10) : undefined,
      };
    }
  }

  // Fallback: Mocha ("3 passing (10ms)", "1 failing") first, then the
  // generic form ("Passed: 3, Failed: 1").
  const passingMatch =
    output.match(/(\d+)\s+passing/i) ?? output.match(/Passed:\s*(\d+)/i);
  const failingMatch =
    output.match(/(\d+)\s+failing/i) ?? output.match(/Failed:\s*(\d+)/i);

  return {
    passed: passingMatch ? parseInt(passingMatch[1], 10) : undefined,
    failed: failingMatch ? parseInt(failingMatch[1], 10) : undefined,
    total: undefined,
  };
}

function printSummary(results: IterationResult[]) {
  process.stderr.write('\n--- Ralph Wiggum Mode Summary ---\n');
  process.stderr.write(
    '| Iteration | Status  | Tests Passed | Tests Failed |\n',
  );
  process.stderr.write(
    '|-----------|---------|--------------|--------------|\n',
  );
  for (const result of results) {
    const passed = result.testsPassed !== undefined ? result.testsPassed : '-';
    const failed = result.testsFailed !== undefined ? result.testsFailed : '-';
    process.stderr.write(
      `| ${result.iteration.toString().padEnd(9)} | ${result.status.padEnd(7)} | ${passed.toString().padEnd(12)} | ${failed.toString().padEnd(12)} |\n`,
    );
  }
  process.stderr.write('---------------------------------\n\n');
}

import fs from 'node:fs';
import path from 'node:path';

// ... (existing imports)

export async function runRalphWiggum({
  config,
  settings,
  input,
  prompt_id,
  resumedSessionData,
  completionPromise,
  maxIterations,
  memoryFile,
}: RunNonInteractiveParams & {
  completionPromise?: string;
  maxIterations?: number;
  memoryFile?: string;
}): Promise<void> {
  const effectiveMaxIterations = maxIterations ?? 10;
  let iterations = 0;
  let currentResumedSessionData = resumedSessionData;
  const results: IterationResult[] = [];
  const effectiveMemoryFile = memoryFile || 'memories.md';
  const memoriesPath = path.join(process.cwd(), effectiveMemoryFile);

  // Create the memory file with a default header on first use.
  if (!fs.existsSync(memoriesPath)) {
    fs.writeFileSync(
      memoriesPath,
      `# Ralph Wiggum Memories\n\nTask: ${input}\n\nUse this file (${effectiveMemoryFile}) to store notes on what worked and what didn't work across iterations. The agent will read this at the start of each run.\n\n`,
    );
  }

  process.stderr.write(
    `[Ralph Wiggum] Starting loop. Max iterations: ${effectiveMaxIterations}\n`,
  );

  while (iterations < effectiveMaxIterations) {
    iterations++;
    process.stderr.write(
      `[Ralph Wiggum] Iteration ${iterations}/${effectiveMaxIterations}\n`,
    );

    // Prepend the current memory file content to the original prompt.
    let currentInput = input;
    try {
      if (fs.existsSync(memoriesPath)) {
        const memories = fs.readFileSync(memoriesPath, 'utf-8');
        if (memories.trim()) {
          currentInput = `Context from previous iterations (${effectiveMemoryFile}):\n${memories}\n\nTask:\n${input}`;
          process.stderr.write(
            `[Ralph Wiggum] Loaded context from ${effectiveMemoryFile}\n`,
          );
        }
      }
    } catch (error) {
      // A read failure is reported but does not stop the loop.
      process.stderr.write(
        `[Ralph Wiggum] Failed to read ${effectiveMemoryFile}: ${error}\n`,
      );
    }

    const output = await runNonInteractive({
      config,
      settings,
      input: currentInput,
      prompt_id,
      resumedSessionData: currentResumedSessionData,
    });

    const stats = extractTestStats(output);
    // Without a completion promise, no iteration can succeed.
    const success =
      completionPromise && output.includes(completionPromise) ? true : false;

    results.push({
      iteration: iterations,
      status: success ? 'Success' : 'Failed',
      testsPassed: stats.passed,
      testsFailed: stats.failed,
      testsTotal: stats.total,
    });

    if (success) {
      process.stderr.write(
        `[Ralph Wiggum] Completion promise "${completionPromise}" met. Exiting.\n`,
      );
      printSummary(results);
      return;
    }

    // Clear resumedSessionData so we don't try to resume partially through
    currentResumedSessionData = undefined;
  }
  process.stderr.write(
    `[Ralph Wiggum] Max iterations reached without meeting completion promise.\n`,
  );
  printSummary(results);
}
|
|
||||||
export async function runNonInteractive({
|
export async function runNonInteractive({
|
||||||
config,
|
config,
|
||||||
settings,
|
settings,
|
||||||
input,
|
input,
|
||||||
prompt_id,
|
prompt_id,
|
||||||
resumedSessionData,
|
resumedSessionData,
|
||||||
}: RunNonInteractiveParams): Promise<void> {
|
}: RunNonInteractiveParams): Promise<string> {
|
||||||
return promptIdContext.run(prompt_id, async () => {
|
return promptIdContext.run(prompt_id, async () => {
|
||||||
const consolePatcher = new ConsolePatcher({
|
const consolePatcher = new ConsolePatcher({
|
||||||
stderr: true,
|
stderr: true,
|
||||||
@@ -181,6 +355,9 @@ export async function runNonInteractive({
|
|||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// Store accumulated response text to return
|
||||||
|
let fullResponseText = '';
|
||||||
|
|
||||||
let errorToHandle: unknown | undefined;
|
let errorToHandle: unknown | undefined;
|
||||||
try {
|
try {
|
||||||
consolePatcher.patch();
|
consolePatcher.patch();
|
||||||
@@ -316,6 +493,13 @@ export async function runNonInteractive({
|
|||||||
const isRaw =
|
const isRaw =
|
||||||
config.getRawOutput() || config.getAcceptRawOutputRisk();
|
config.getRawOutput() || config.getAcceptRawOutputRisk();
|
||||||
const output = isRaw ? event.value : stripAnsi(event.value);
|
const output = isRaw ? event.value : stripAnsi(event.value);
|
||||||
|
|
||||||
|
// Accumulate full response
|
||||||
|
if (event.value) {
|
||||||
|
fullResponseText += event.value;
|
||||||
|
responseText += output;
|
||||||
|
}
|
||||||
|
|
||||||
if (streamFormatter) {
|
if (streamFormatter) {
|
||||||
streamFormatter.emitEvent({
|
streamFormatter.emitEvent({
|
||||||
type: JsonStreamEventType.MESSAGE,
|
type: JsonStreamEventType.MESSAGE,
|
||||||
@@ -325,7 +509,7 @@ export async function runNonInteractive({
|
|||||||
delta: true,
|
delta: true,
|
||||||
});
|
});
|
||||||
} else if (config.getOutputFormat() === OutputFormat.JSON) {
|
} else if (config.getOutputFormat() === OutputFormat.JSON) {
|
||||||
responseText += output;
|
// responseText is already updated
|
||||||
} else {
|
} else {
|
||||||
if (event.value) {
|
if (event.value) {
|
||||||
textOutput.write(output);
|
textOutput.write(output);
|
||||||
@@ -381,7 +565,7 @@ export async function runNonInteractive({
|
|||||||
),
|
),
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
return;
|
return fullResponseText;
|
||||||
} else if (event.type === GeminiEventType.AgentExecutionBlocked) {
|
} else if (event.type === GeminiEventType.AgentExecutionBlocked) {
|
||||||
const blockMessage = `Agent execution blocked: ${event.value.systemMessage?.trim() || event.value.reason}`;
|
const blockMessage = `Agent execution blocked: ${event.value.systemMessage?.trim() || event.value.reason}`;
|
||||||
if (config.getOutputFormat() === OutputFormat.TEXT) {
|
if (config.getOutputFormat() === OutputFormat.TEXT) {
|
||||||
@@ -488,7 +672,7 @@ export async function runNonInteractive({
|
|||||||
} else {
|
} else {
|
||||||
textOutput.ensureTrailingNewline(); // Ensure a final newline
|
textOutput.ensureTrailingNewline(); // Ensure a final newline
|
||||||
}
|
}
|
||||||
return;
|
return fullResponseText;
|
||||||
}
|
}
|
||||||
|
|
||||||
currentMessages = [{ role: 'user', parts: toolResponseParts }];
|
currentMessages = [{ role: 'user', parts: toolResponseParts }];
|
||||||
@@ -512,7 +696,7 @@ export async function runNonInteractive({
|
|||||||
} else {
|
} else {
|
||||||
textOutput.ensureTrailingNewline(); // Ensure a final newline
|
textOutput.ensureTrailingNewline(); // Ensure a final newline
|
||||||
}
|
}
|
||||||
return;
|
return fullResponseText;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
@@ -528,5 +712,6 @@ export async function runNonInteractive({
|
|||||||
if (errorToHandle) {
|
if (errorToHandle) {
|
||||||
handleError(errorToHandle, config);
|
handleError(errorToHandle, config);
|
||||||
}
|
}
|
||||||
|
return fullResponseText;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|||||||